enables arbitrary operations on databases: add, search, modify and delete entries.
Furthermore, SQL also makes it possible to perform administrative operations on the databases themselves: creating databases and tables, managing users with fine-grained authorizations, etc.
For example, designing a database becomes difficult when the number of tables grows;
the Unified Modeling Language (UML) is then used to provide a graphical overview of the relations between tables.
SQL databases may be fast to retrieve data despite complicated operations, but they become slow when multiple sequential operations are required, because of all the back-and-forth with the application;
thus, SQL databases can be scripted to automate operations and provide a massive speed-up.
Writing SQL requests requires a lot of boilerplate since there is no integration into programming languages, leading to multiple function calls for any operation on the database.
Because of the difficulties mentioned above, and because the original objectives of SQL are not universal\*[*], other database designs were created\*[*].
.FOOTNOTE1
To say the least!
Not everyone needs to let users access the database without going through the application.
For instance, writing a \f[I]blog\f[] for a small event or to share small stories about your life doesn't require manual operations on the database, fortunately.
.FOOTNOTE2
.FOOTNOTE1
A lot of designs won't be mentioned here.
The actual history of databases is often quite unclear since the categories of databases are sometimes vague and underspecified.
The important part is that, as mentioned, SQL is not a silver bullet and many developers have shifted towards other solutions.
.FOOTNOTE2
In this way, DODB transforms any application into a database manager.
DODB doesn't provide an interactive shell, there is no request language to perform arbitrary operations on the database, no statistical optimizations of the requests based on query frequencies, etc.
Instead, DODB reduces the complexity of the infrastructure, stores data in plain files, and enables simple manual scripting with widespread Unix tools.
The presented code is in Crystal, as is the DODB library for now, but keep in mind that this document is about the method more than the actual implementation; anyone could implement the exact same library in almost any other language.
However, their implementation via the creation of symlinks is the result of a certain vision of how a database should behave in order to provide users with a practical way to sort the entries.
The file-system representation (of data and indexes) is convenient for the administrator, but input-output operations on a file-system are slow.
Storing the data on a storage device is required to protect it from crashes and application restarts.
But data can be kept in memory for faster processing of requests.
The DODB library has an API close to a hash table.
Having a data cache is as simple as keeping a hash table in memory alongside the file-system storage; retrieval becomes incredibly fast\*[*].
.FOOTNOTE1
Several hundred times faster, see the experiment section.
.FOOTNOTE2
The same goes for cached indexes.
Indexes can easily be cached thanks to simple hash tables.
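As a rough sketch of this caching idea (an illustration, not the actual DODB code), a cache can be as simple as a hash table consulted before the file-system, assuming values are serialized to JSON files named after their key:
.EX
# Illustration only: a hash table in front of file-based storage.
require "json"

class CachedStore(V)
  def initialize(@dir : String)
    Dir.mkdir_p @dir
    @cache = Hash(String, V).new
  end

  # Write the value to a file and keep it in memory as well.
  def []=(key : String, value : V)
    File.write "#{@dir}/#{key}.json", value.to_json
    @cache[key] = value
  end

  # Read from the cache; only deserialize from disk on a cache miss.
  def [](key : String) : V
    @cache[key] ||= V.from_json File.read("#{@dir}/#{key}.json")
  end
end
.EE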
Databases are built around the objective to actually
.I store
data.
But sometimes the data only has the same lifetime as the application.
Stop the application and the data itself becomes irrelevant, which happens on several occasions, for instance when the application keeps track of the connected users.
This case is not covered by traditional databases; it is out of scope, and such short-lived data is only handled within the application.
Yet, since DODB is a library and not a separate application (read: incredibly faster), this usage of the database can be relevant.
Having the same API to handle both long and short-lived data can be useful.
Moreover, the previously mentioned indexes (basic indexes, partitions and tags) would also work the same way for these short-lived data.
Of course, in this case, the file-system representation may be completely irrelevant.
And for all these reasons, the
.I RAM-only
DODB database and
.I RAM-only
indexes were created.
Let's recap the advantages of the RAM-only DODB database.
The DODB API is the same for short-lived (read: temporary) and long-lived data.
This includes the same indexes too, so a file-system representation of the current state of the application is possible.
RAM-only also means incredible performance since DODB is then only a
.I very
small layer over a hash table.
.SS RAM-only database
Instantiating a RAM-only database is as simple as the other options.
Moreover, this database has exactly the same API as the others, thus changing from one to another is painless.
A path is still required despite the database being only in memory, because indexes can still be instantiated for the database, and those indexes will require this directory.
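For illustration purposes, instantiation could look as follows; the class names below are assumptions, the point being that every flavor takes the same parameters and exposes the same API:
.EX
# Sketch only: the class names are assumptions, not the authoritative API.
require "dodb"

cars_on_disk = DODB::DataBase(Car).new        "storage/cars"
cars_cached  = DODB::CachedDataBase(Car).new  "storage/cars"
cars_in_ram  = DODB::RAMOnlyDataBase(Car).new "storage/cars" # path only used by the indexes
.EE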
As for the database API itself, changing from a version of an index to another is painless.
This way, one can opt for a cached index and, after some time not using the file-system representation, decide to change for its RAM-only version; a 4-character modification and nothing else.
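As a sketch of that 4-character change (the method names are assumptions, assuming a \f[CW]cars\f[] database instantiated as in the earlier sketch), switching an index to its RAM-only version only changes the constructor call:
.EX
# Hypothetical method names; "RAM_" is the 4-character difference.
cars_by_color = cars.new_partition     "color", &.color # cached index, with a file-system representation
# becomes:
cars_by_color = cars.new_RAM_partition "color", &.color # RAM-only index, no disk representation
.EE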
Caching would be inefficient for databases where the distribution of requests is homogeneous between the different entries, for example.
If the requests are random, without a small portion of the data receiving most requests (such as a Pareto distribution), caching becomes mostly irrelevant.
DODB provides basic database operations such as storing, searching, modifying and removing data.
Though, SQL databases have a few
.I properties
enabling a more standardized behavior, which may create some expectations about databases from a general-public standpoint.
These properties are called "ACID": atomicity, consistency, isolation and durability.
DODB doesn't fully handle ACID properties.
DODB doesn't provide
.I atomicity .
Instructions cannot be chained and rolled back if one of them fails.
DODB doesn't handle
.I consistency .
There is currently no mechanism to prevent adding invalid values.
.I Isolation
is partially taken into account with a locking mechanism preventing race conditions.
Though, parallelism is mostly required to respond to a large number of clients at the same time.
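As an illustration of the kind of locking involved (a simple sketch, not DODB's actual code), a mutex can serialize concurrent accesses to the underlying hash table:
.EX
# Illustration of a coarse-grained lock preventing race conditions.
class LockedStore(V)
  def initialize
    @lock = Mutex.new
    @data = Hash(String, V).new
  end

  def []=(key : String, value : V)
    @lock.synchronize { @data[key] = value }
  end

  def []?(key : String) : V?
    @lock.synchronize { @data[key]? }
  end
end
.EE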
Also, SQL databases require communication between the application and the database, with an inherent latency that slows down the requests despite the fast algorithms used to search for a value within the database.
Parallelism is required for SQL databases because of this latency (at least partially), which doesn't exist with DODB\*[*].
.FOOTNOTE1
FYI, the service
.I netlib.re
uses DODB and, since the database is fast enough, parallelism isn't required even while handling more than a thousand requests per second.
.FOOTNOTE2
With a cache, data is retrieved five hundred times quicker than with a SQL database.
(a) basic indexes, representing 1 to 1 relations: the document's attribute is related to a value and each value of this attribute is unique,
(b) partitions, representing 1 to n relations: the attribute has a value and this value can be shared by other documents,
(c) tags, representing n to n relations: the attribute can have multiple values, which can be shared by other documents.
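The following sketch illustrates these three kinds of indexes on a hypothetical \f[CW]Car\f[] type; the attribute names and the index-creation methods shown here are assumptions, only the 1-1, 1-n and n-n distinction matters:
.EX
require "json"
require "dodb"

# Hypothetical data type used for the illustration.
class Car
  include JSON::Serializable
  property name     : String        # unique value             -> basic index (1 to 1)
  property color    : String        # value shared by cars     -> partition   (1 to n)
  property keywords : Array(String) # several shareable values -> tags        (n to n)

  def initialize(@name, @color, @keywords)
  end
end

cars = DODB::DataBase(Car).new "storage/cars"  # hypothetical constructor, as sketched earlier

cars_by_name    = cars.new_index     "name",     &.name      # (a) basic index
cars_by_color   = cars.new_partition "color",    &.color     # (b) partition
cars_by_keyword = cars.new_tags      "keywords", &.keywords  # (c) tags
.EE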
The scenario is simple: adding values to a database with indexes (basic, partitions and tags), then querying a value 100 times based on the different indexes.
.BULLET \fIRAM only\f[], the database doesn't have a representation on disk (no data is written on it).
The \fIRAM only\f[] instance shows a possible way to use DODB: to keep a consistent API to store data, including in-memory data with a lifetime related to the application's.
.ENDBULLET
.FOOTNOTE1
Having a cached database will probably be the most widespread use of DODB.
When memory isn't scarce, there is no point not using it to achieve better performance.
Moreover, the "common database" enables to configure the cache size, so this database is relevant even when the data-set is bigger than the available memory.
The computer on which this test is performed\*[*] is an AMD PRO A10-8770E R7 (4 cores), 2.8 GHz. When mentioned, the
.I disk
is actually a
.I "temporary file-system (tmpfs)"
to enable maximum efficiency.
.FOOTNOTE1
A very simple $50 PC, bought online.
Nothing fancy.
.FOOTNOTE2
The library is written in Crystal and so is the benchmark (\f[CW]spec/benchmark-cars.cr\f[]).
Nonetheless, despite a few technicalities, the objective of this document is to provide insight into the approach used in DODB rather than into this particular implementation.
The manipulated data type can be found in \f[CW]spec/db-cars.cr\f[].
This figure shows the request durations to retrieve data based on a basic index with a database containing up to 250k entries, both with linear and logarithmic scales.
When the value and the index are kept in memory (see \f[CW]RAM only\f[], \f[CW]Cached db\f[] and \f[CW]Common db\f[]), the retrieval is almost instantaneous\*[*].
.FOOTNOTE1
About 110 to 120 ns for RAM-only and cached database.
This is slightly more (about 200 ns) for the common database since there are a few more steps due to the inner structure to maintain.
.FOOTNOTE2
In case the value is on the disk, deserialization takes about 15 µs (see \f[CW]Uncached db\f[]).
The request is a little longer when the index isn't cached (see \f[CW]Uncached db and index\f[]); in this case DODB walks the file-system to find the right symlink to follow, thus slowing the process even more, by up to 20%.
The logarithmic scale version of this figure shows that \fIRAM-only\f[] and \fIcached\f[] databases have exactly the same performance.
The \fIcommon\f[] database is somewhat slower than these two due to its caching policy: when a value is requested, the \fIcommon\f[] database puts its key at the start of a list to keep track of the most recently used entries.
Thus, the \fIcommon\f[] database takes 80 ns for its caching policy, which makes this database about 67% slower than the previous ones to retrieve a value.
This difference comes, at least partially, from the additional work of putting all the results in an array (which may also include some memory management) and the accumulated random delays in the retrieval of each value (due to process scheduling on the machine, for example).
Further analysis of the results may be interesting but this is far beyond the scope of this document.
The objective of this experiment is to give an idea of the performance that can be expected from DODB.
Basically, uncached databases are between 70 and 600 times slower than cached ones.
The eviction policy in
.I common
database slows down the retrievals, which makes it between 70% and 6 times slower than
.I cached
and
.I RAM-only
databases, and the more data there is to retrieve, the worse it gets.
However, retrieving thousands and thousands of entries in a single request may not be a typical usage of databases, anyway.
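To make the caching policy more concrete, here is a sketch of an LRU-style cache in the spirit of what is described above (an illustration, not the actual data structure used by DODB):
.EX
# The key of every accessed value is moved to the front of a deque;
# when the cache is full, the least recently used key is evicted.
class LRUCache(V)
  def initialize(@max_size : Int32)
    @data  = Hash(String, V).new
    @order = Deque(String).new      # most recently used keys first
  end

  def []=(key : String, value : V)
    touch key
    @data[key] = value
    evict
  end

  def []?(key : String) : V?
    value = @data[key]?
    touch key if value              # this bookkeeping is the small overhead measured above
    value
  end

  private def touch(key)
    @order.delete key
    @order.unshift key
  end

  private def evict
    while @order.size > @max_size
      @data.delete @order.pop      # drop the least recently used entry
    end
  end
end
.EE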
The results are similar to the retrieval of partition indexes, because this is fundamentally the same thing:
.ENUM both tag and partition indexes enable retrieving a list of entries;
.ENUM the keys of the database entries come from listing the content of a directory (uncached indexes) or are directly available from a hash (cached indexes);
.ENUM data is retrieved irrespective of the index, it is either read from the storage device or retrieved from a data cache, which depends on the type of database.
.ENDENUM
Retrieving data from a partition or a tag involves exactly the same actions, which leads to the same results.
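As a conceptual illustration of the second point of the list above (the directory layout and names here are assumptions, not DODB's actual internals), the keys of the matching entries either come from walking the file-system or from a hash table:
.EX
# Uncached partition: list a directory of symlinks named after the keys.
keys = Dir.children "storage/partitions/by_color/red"

# Cached partition: the same keys are directly available from a hash table.
partition_cache = Hash(String, Array(String)).new
keys = partition_cache["red"]? || [] of String
.EE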
A particularity of the tag index compared to partitions is that it enables multiple values for the same attribute, thus a database entry can be referenced in multiple directories.
For example, a car can be both
.I elegant
and
.I fast .
The retrieval of entries corresponding to a single
.I tag
is then exactly the same as retrieving a partition\*[*].
.FOOTNOTE1
It would be different in case of a retrieval of entries corresponding to
.I several
tags, such as selecting cars that are
.UL "both elegant and fast" .
This test may be done in a future version of this document.
.FOOTNOTE2
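A minimal sketch of how such a multi-tag search could work, by intersecting the keys found for each tag (an illustration, not the DODB implementation):
.EX
# Keep only the keys present in every requested tag.
def matching_keys(keys_by_tag : Hash(String, Array(String)), tags : Array(String)) : Array(String)
  lists = tags.map { |tag| keys_by_tag[tag]? || [] of String }
  return [] of String if lists.empty?
  lists.reduce { |acc, list| acc & list }  # Array#& keeps the common elements
end

keys_by_tag = {
  "elegant" => ["car-1", "car-3"],
  "fast"    => ["car-3", "car-7"],
}
matching_keys(keys_by_tag, ["elegant", "fast"]) # => ["car-3"]
.EE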
.
.
.SS Summary of the different databases and their use
.LP
.B "RAM-only database"
is the fastest database but has a limited use since data isn't saved.
Right now, security isn't managed in DODB, at all.
Sure, DODB isn't vulnerable to SQL injections, but an internet-facing application may encounter a few other problems including, but not limited to, code injection and buffer overflows.
Of course, DODB isn't a mechanism to protect applications from any possible attack, so most of the vulnerabilities cannot be countered by the library.
However, a few security mechanisms exist to prevent data leak or data modification from an outsider and the DODB library may implement some of them in the future.
The
.FUNCTION_CALL mlock
syscall prevents a region of memory from being written to the swap partition, where it could lead to a data leak.
The
.FUNCTION_CALL madvise
syscall can prevent parts of the application's memory from being included in a core dump\*[*], which is a debug file created when an application crashes, containing (part of) the process's memory at the time of the crash.
.FOOTNOTE1
.FUNCTION_CALL madvise
has the interesting option
.I MADV_DONTDUMP
to prevent a data leak through a core dump, but it is Linux-specific.
Since this implementation of DODB is related to the Crystal language (which isn't fully ported to the OpenBSD platform at the moment), this is a problem.
.FOOTNOTE2
The service is split into three components: a user interface (the website, written in PureScript), an authentication daemon (\fIauthd\f[]) and a daemon handling all the server operations related to the actual service (\fIdnsmanagerd\f[]).
Several "databases" are maintained:
.ENUM \fIusers\f[],
.ENUM \fIdomains\f[],
.ENUM \fItokens\f[],
.ENUM \fIconnected users\f[] for collaborative work.
.ENDENUM
Performance-wise, netlibre handles between 2k and 3k requests per second with a single core, without any optimization.
Code is written in an almost naive\*[*] way and still performs fine.
.FOOTNOTE1
Keep in mind that netlibre uses poll(2), a very old syscall (from the 80s!) to handle its event loop, rather than newer and much faster event facilities such as epoll(2) and the like.
.FOOTNOTE2
Indexes with a file-system representation enable quick debugging sessions and a few basic tasks, such as listing all the domains of a user, which in practice is great to have at our fingertips with simple Unix tools.
This figure shows a value being requested; since there is only a single value requested in the test, it is immediately put in the cache and never evicted.
As we see in the figure, the duration for data retrieval grows almost linearly for databases with a sufficient cache size (starting with 10k entries).
When the cache size is not sufficient, the requests are a hundred times slower, which explains why the database with a cache size of one thousand entries isn't even represented on the graph, and why the database with a 5k-entry cache has great results up to 5k partitions.
function returns an empty list in case the search failed.
.br
The implementation was designed to be simple (7 lines of code), not efficient.
However, with data and index caches, the search is expected to meet almost everyone's requirements, speed-wise, given that the tags are small enough (a few thousand entries).