Paper improved. Slowly reaching a first readable version.

This commit is contained in:
Philippe Pittoli 2025-01-27 05:15:18 +01:00
parent b08c8d43d8
commit 78df769a29


@ -45,14 +45,15 @@
.TITLE Document Oriented DataBase (DODB)
.AUTHOR Philippe PITTOLI
.ABSTRACT1
DODB is a document-oriented database library, enabling a very simple way to store applications' data without external dependencies by storing serialized
.I documents
(basically any data type) in plain files.
The objective is to avoid complex traditional relational databases and to explore a more straightforward way to handle data, to have a tool anyone can
.B "read and understand entirely" .
To speed-up searches, attributes of these documents can be used as indexes.
DODB can provide a file-system representation of those indexes through symbolic links
.I symlinks ) (
to enable off-application data manipulation with the most basic tools, such as
.I ls
or even a file explorer.
This document briefly presents DODB and its main differences from other database engines.
The limits of such an approach are discussed.
@ -60,20 +61,24 @@ An experiment is described and analyzed to understand the performance that can b
.ABSTRACT2
.SINGLE_COLUMN
.SECTION Introduction to DODB
A database manages data, enabling queries to add, to retrieve, to modify and to delete pieces of information.
These actions are grouped under the acronym CRUD: creation, retrieval, update and deletion.
CRUD operations are the foundation for the most basic databases.
Yet, almost every single database engine goes far beyond this minimalistic set of features.
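The CRUD set of operations mentioned above can be sketched minimally as follows (a toy in-memory sketch with hypothetical helper names, not DODB's actual API):

```python
# A minimal CRUD sketch: the database is just a key-value mapping.
store = {}

def create(key, doc):   # C: add a new entry
    store[key] = doc

def retrieve(key):      # R: get an entry back (None if absent)
    return store.get(key)

def update(key, doc):   # U: replace an existing entry
    store[key] = doc

def delete(key):        # D: remove an entry
    store.pop(key, None)

create("doc-1", {"name": "bob"})
update("doc-1", {"name": "bob", "age": 42})
print(retrieve("doc-1"))   # {'name': 'bob', 'age': 42}
delete("doc-1")
print(retrieve("doc-1"))   # None
```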
Although everyone uses the file-system of their computer as some sort of database (based on the previous definition) by storing raw data (files) in a hierarchical manner (directories), computer science classes introduce a particularly convoluted way of managing data.
Universities all around the world teach about Structured Query Language (SQL) and relational databases.
These two concepts are closely interlinked and require a brief explanation.
.
.UL "Relational databases"
are built around the idea of describing data to a database engine so it can optimize operations and storage.
Data is put into
.I tables ,
with each column being an attribute of the stored data and each line being a new entry.
A database is a list of tables with relations between them.
As an example, let's imagine a database of a movie theater.
The database will have a
.I table
for the list of movies they have
@ -92,14 +97,14 @@ Tables have relations, for example the table "scheduling" has a column
which points to entries in the "movie" table.
.UL "The SQL language"
enables CRUD operations on databases: creation, retrieval, update and deletion of entries.
SQL also enables administrative operations on the databases themselves: creating databases and tables, managing users with fine-grained authorizations, etc.
SQL is used between the application and the database, to perform operations and to provide results when due.
SQL is also used
.UL outside
the application, by admins for managing databases and potentially by
.I non-developer
users to retrieve data without a dedicated interface\*[*].
.FOOTNOTE1
One of the first objectives of SQL was to enable a class of
.I non-developer
@ -115,6 +120,7 @@ thus, SQL databases can be scripted to automate operations and provide a massive
.I "stored procedures" , (
see
.I "PL/SQL" ).
Moreover, the latency between the database and the application makes internet-facing applications require parallelism to handle a high number of clients (or even a moderate number by today's standards), via multiple threads or concurrent applications.
Furthermore, writing SQL requests requires a lot of boilerplate since there is no integration in the programming languages, leading to multiple function calls for any operation on the database;
thus, object-relational mapping (ORM) libraries were created to reduce the massive code duplication.
And so on.
@ -147,8 +153,14 @@ Since homogeneity is not necessary anymore, databases have fewer (or different)
Document-oriented databases are a sub-class of key-value stores, where metadata can be extracted from the entries for further optimizations.
And that's exactly what is being done in Document Oriented DataBase (DODB).
.UL "The stated goal of DODB"
is to provide a simple library for developers to handle data for basic projects.
Traditional SQL relational databases have a snowballing effect on code complexity, even for applications with basic requirements.
However, DODB may be a great starting point to implement more sophisticated features for creative minds.
Code simplicity implies hackability.
.UL "Contrary to SQL" ,
DODB has a very narrow scope: to provide a library to store, to retrieve, to modify and to delete data.
In this way, DODB transforms any application into a database manager.
DODB doesn't provide an interactive shell; there is no request language to perform arbitrary operations on the database, no statistical optimization of requests based on query frequencies, etc.
Instead, DODB reduces the complexity of the infrastructure, stores data in plain files and enables simple manual scripting with widespread unix tools.
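The plain-file storage and symlink-based indexing can be sketched as follows (in Python for illustration; the directory layout and file names here are hypothetical and may differ from DODB's actual Crystal implementation):

```python
import json
import os
import tempfile

# Hypothetical layout (illustration only, not DODB's exact paths):
#   db/data/0000000001.json                    <- the serialized document
#   db/indexes/by_name/bob.json                <- a symlink to that document
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "db", "data"))
os.makedirs(os.path.join(root, "db", "indexes", "by_name"))

# Store a serialized document in a plain file.
doc = {"name": "bob", "age": 42}
data_path = os.path.join(root, "db", "data", "0000000001.json")
with open(data_path, "w") as f:
    json.dump(doc, f)

# The index is just a symlink named after the attribute's value.
link_path = os.path.join(root, "db", "indexes", "by_name", "bob.json")
os.symlink(data_path, link_path)

# Off-application search: listing the index directory (what "ls" would show).
print(os.listdir(os.path.join(root, "db", "indexes", "by_name")))  # ['bob.json']
with open(link_path) as f:        # following the symlink reads the document
    print(json.load(f)["age"])    # 42
```

Any tool that follows symlinks (`ls`, `cat`, a file explorer) can thus query the index without going through the application.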
@ -497,12 +509,16 @@ Though, the implementation involving a heavy use of the file-system via the cre
Other kinds of triggers could
.B easily
be implemented in addition to those presented.
These new triggers may have completely different objectives\*[*], methods and performance\*[*].
.FOOTNOTE1
Providing a file-system representation of the data is a fun experiment;
sysadmins can have a playful relationship with the database thanks to an unconventional representation of the data.
.FOOTNOTE2
.FOOTNOTE1
New triggers could seek to improve performance by any means necessary, including the gazillion ways which already exist.
.FOOTNOTE2
For example, a new kind of trigger could implement a way to accelerate searches for an attribute.
.TBD
The following sections will precisely cover this aspect.
.
.
@ -723,10 +739,10 @@ The following experiment shows the performance of DODB based on querying duratio
Data can be searched via
.I indexes ,
as for SQL databases.
As a reminder, three possible indexes exist in DODB:
(a) basic indexes for 1-to-1 relations: the document's attribute is related to a value and each value of this attribute is unique;
(b) partitions for 1-to-n relations: the attribute has a value and this value can be shared by other documents;
(c) tags for n-to-n relations: the attribute can have multiple values, which can be shared by other documents.
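The three index shapes can be sketched with in-memory structures (Python dicts standing in for the on-disk directories; document keys and attribute names are hypothetical):

```python
# (a) basic index: 1-to-1, each attribute value maps to a single document.
index_by_email = {"bob@example.com": "doc-1", "eve@example.com": "doc-2"}

# (b) partition: 1-to-n, one value can be shared by several documents.
partition_by_color = {"red": ["doc-1", "doc-3"], "blue": ["doc-2"]}

# (c) tag: n-to-n, a document carries several values, each shared by others.
tags = {"fast": {"doc-1", "doc-2"}, "elegant": {"doc-1", "doc-3"}}

print(index_by_email["bob@example.com"])   # a single document key
print(partition_by_color["red"])           # a list of document keys
print(tags["fast"] & tags["elegant"])      # documents matching both tags
```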
The scenario is simple: adding values to a database with indexes (basic, partitions and tags), then querying a value 100 times based on the different indexes.
Loop and repeat.
@ -738,12 +754,9 @@ Five instances of DODB are tested:
.BULLET \fIcommon database\f[] shows the most basic use of DODB, with a limited cache (100k entries)\*[*];
.BULLET \fIcached database\f[] represents a database with all the entries in cache (no eviction mechanism);
.BULLET \fIRAM only\f[], the database doesn't have a representation on disk (no data is written on it).
The \fIRAM only\f[] instance shows a possible way to use DODB: keeping a consistent API to store data, including in-memory data whose lifetime is tied to the application's.
.ENDBULLET
.FOOTNOTE1
Having a cached database will probably be the most widespread use of DODB.
When memory isn't scarce, there is no point not using it to achieve better performance.
The data cache can be fine-tuned with the "common database", enabling the use of DODB in environments with low memory.
.FOOTNOTE2
The computer on which this test is performed\*[*] is an AMD PRO A10-8770E R7 (4 cores), 2.8 GHz. When mentioned, the
@ -791,24 +804,29 @@ About 110 to 120 ns for RAM-only and cached database.
This is slightly more (about 200 ns) for the common database since there are a few more steps due to the inner structure to maintain.
.FOOTNOTE2
In case the value is on the disk, deserialization takes about 15 µs (see \f[CW]Uncached db\f[]).
The request is a little longer when the index isn't cached (see \f[CW]Uncached db and index\f[]); in this case DODB walks the file-system to find the right symlink to follow, thus slowing the process even more, by up to 20%.
The logarithmic scale version of this figure shows that \fIRAM-only\f[] and \fIcached\f[] databases have exactly the same performance.
The \fIcommon\f[] database is somewhat slower than these two due to the caching policy: when a value is asked, the \fIcommon\f[] database puts its key at the start of a list to represent a
.I recent
use of this data (respectively, the last values in this list are the least recently used entries).
The \fIcommon\f[] database spends 80 ns on its LRU cache eviction policy\*[*], making this database about 67% slower than the previous ones to retrieve a value.
.FOOTNOTE1
The LRU policy in DODB is implemented with a doubly-linked list and a hash table.
When a value is retrieved or modified, its key is put at the start of a list so the list order represents values from the most to the least recently used.
Also, a hash table is maintained to quickly jump to the right list entry.
Both these operations take time.
.FOOTNOTE2
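The mechanism described in this footnote can be sketched as follows (a simplified model, not DODB's actual Crystal implementation; Python's OrderedDict provides the hash table and the linked ordering in one structure):

```python
from collections import OrderedDict

class LRUCache:
    """Simplified LRU cache: the most recently used key sits at the
    front of the ordering; eviction drops the key at the back."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()   # front = most recently used

    def get(self, key):
        value = self.entries[key]
        self.entries.move_to_end(key, last=False)  # promote to front
        return value

    def put(self, key, value):
        self.entries[key] = value
        self.entries.move_to_end(key, last=False)  # newest goes to front
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=True)        # evict least recently used

cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")              # "a" becomes most recently used
cache.put("c", 3)           # evicts "b", the least recently used
print(list(cache.entries))  # ['c', 'a']
```

The bookkeeping on every `get` and `put` is the small constant cost the footnote mentions.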
Uncached databases are far away from these results, as shown by the logarithmically scaled figure.
The data cache makes requests at least 170 times faster.
The results depend on the data size; the bigger the data, the slower the serialization (and deserialization).
In this example, the database entries are almost empty; they have very few attributes and not much content (a few dozen characters max).
Thus, performance of non-cached databases will be even more severely impacted with real-world data.
Alternative encodings, such as CBOR,
.[
CBOR
.]
should be considered for databases with non-trivial documents.
.
.
.SS Partitions (1 to n relations)
@ -834,7 +852,7 @@ which are flattened in the linear scale since they are between one to five hundr
The duration of a retrieval grows linearly with the number of matched entries.
On both figures, a dashed line is drawn representing a linear growth based on the quickest retrieval observed from basic indexes for each database.
This dashed line and the observed results differ slightly; observed results grow more than what has been calculated.
This difference comes, at least partially, from the additional process of putting all the results in an array (which may also include some memory management) and the accumulated random delays for the retrieval of each value (due to the cache policy processing, to process scheduling on the machine, etc.).
Further analysis of the results may be interesting but this is far beyond the scope of this document.
The objective of this experiment is to give an idea of the performance that can be expected from DODB.
@ -867,34 +885,28 @@ The number of cars retrieved scales from 1000 to 5000.
.QE
.
.
Tag and partition index request durations are similar because both are fundamentally the same thing:
.ENUM both tag and partition indexes enable retrieving a list of entries;
.ENUM the keys of the database entries come from listing the content of a directory (uncached indexes) or are directly available from a hash (cached indexes);
.ENUM data is retrieved irrespective of the index, it is either read from the storage device or retrieved from a data cache, which depends on the type of database.
.ENDENUM
Retrieving data from a partition or a tag involves exactly the same actions, which leads to the same results.
Contrary to partitions, the tag index enables multiple values for the same attribute.
For example, a car can be both
.I elegant
and
.I fast .
The DODB API enables retrieving data matching several tags\*[*].
.FOOTNOTE1
The current DODB implementation performs the request for both tags, then produces a list intersection.
The duration of the request is then the sum of both tag requests plus the duration of the intersection operation (and an additional time span for memory management, depending on the list sizes).
.FOOTNOTE2
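The two-step process described in the footnote, running each tag request then intersecting the results, can be sketched as (document keys and tag names are hypothetical, not DODB's API):

```python
# Hypothetical tag index: tag value -> keys of matching documents.
tag_index = {
    "elegant": ["car-1", "car-4", "car-7"],
    "fast":    ["car-2", "car-4", "car-7", "car-9"],
}

def matching(*tags):
    """Request each tag, then intersect the resulting key lists."""
    keys = set(tag_index.get(tags[0], []))
    for tag in tags[1:]:
        keys &= set(tag_index.get(tag, []))
    return sorted(keys)

print(matching("elegant", "fast"))  # ['car-4', 'car-7']
```

The total cost is roughly the sum of the per-tag lookups plus the intersection itself, as the footnote states.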
.
.
.SS Summary of the different databases and their use
.LP
.B "RAM-only database"
is the fastest database but is dedicated to short-lived data (data is not saved on disk).
.B "Cached database"
enables the same performance on data retrieval as RAM-only while actually storing data on a storage device.
@ -902,12 +914,13 @@ This database is to be considered to achieve maximum speed for data-sets fitting
.B "Common database"
enables lowering the memory requirements as much as desired.
The eviction policy implies some operations leading to poorer performance, which remains however widely acceptable in most cases.
.B "Uncached database"
is essentially a debug mode and is not expected to run in most real-life scenarios.
The purpose is to produce a control sample (involving only raw IO operations) to compare to other (more realistic) implementations.
Cached indexes should be considered for most applications, and even more so their RAM-only version in case the file-system representation isn't necessary.
.
.\" .ps -2
.\" .TS
@ -945,16 +958,14 @@ Caching the value enables a massive performance gain, data can be retrieved seve
The more entries requested, the slower it gets; but more importantly, the poorer the performance gets
.UL "per entry" .
The eviction policy implies poorer performance since it requires a few list and hash table operations, even if the current implementation (based on the LRU algorithm) is fairly simple and efficient.
To put things into perspective, requesting several thousand entries in DODB based on an index (partition or tag) is as slow as getting
.B "a single entry"
with a traditional SQL database\*[*].
.FOOTNOTE1
With Postgres, retrieving a single value takes 0.1 to 2 ms on my machine, without a search, just the first available entry.
.FOOTNOTE2
.
.
.
@ -1007,17 +1018,18 @@ Again, this is basic but
.SHINE "good enough"
for most applications.
.ENDBULLET
A future improvement could be to write a checksum alongside every piece of written data, to easily detect and remove corrupt entries from a database.
.B "Discussion on ACID properties" .
First and foremost, both atomicity and isolation properties are inherently related to parallelism, whether through concurrent threads or applications.
Traditional SQL databases require both atomicity and isolation properties because they cannot afford not to have parallelism.
Since DODB is a library (and not a separate application) and is kept simple (no intermediary language to interpret, no complicated algorithm), it doesn't suffer from any communication latency or long processing delaying requests.
As the experiment has shown, retrieving a value in DODB only takes about 20 µs, or 200 ns with a data cache.
Therefore, DODB could theoretically serve millions of requests per second from a single thread\*[*].
.FOOTNOTE1
FYI, the service
.I netlib.re
uses DODB and since the database is fast enough, parallelism isn't required despite serving several thousand requests per second.
.FOOTNOTE2
Considering this swiftness, parallelism may seem optional.
@ -1031,12 +1043,24 @@ As a side note, consistency is already taken care of within the application anyw
Database verifications are just the last bastion against inserting junk data.
.FOOTNOTE2
Moreover, the consistency property in traditional SQL databases is often used for simple tasks but quickly becomes difficult to deal with.
Some companies and organizations (such as Doctors Without Borders, for instance) cannot afford to implement all the preventive measures in their DBMSs due to the sheer complexity of it.
Instead, these organizations adopt curative measures that they may call a "data-fix".
Thus, having some verifications in the database is not a silver bullet; it is complementary to other measures.
DODB may provide some form of atomicity and consistency at some point, but nothing fancy nor too advanced.
The whole point of the DODB project is to keep the code simple, hackable, enjoyable even.
Not handling these properties isn't a limitation of the DODB approach but a choice for this project\*[*].
.FOOTNOTE1
Which also results from a lack of time.
.FOOTNOTE2
.B "Beyond ACID properties" .
Most current databases (traditional relational databases, some key-value databases and so on) provide additional features.
These features may include, for example, high-availability toolsets (replication, clustering, etc.), some forms of modularity (several storage backends, specific interfaces with other tools, etc.), interactive command lines or shells, user and authorization management, administration of databases, and so on.
Because DODB is a library and doesn't support the SQL language, because DODB
.TBD
.
.
.
@ -1048,15 +1072,23 @@ This section briefly presents some of them and their difference from DODB.
.BULLET
.B "Traditional DBMS" .
This category includes all SQL database management systems with a dedicated application handling databases and the operations upon them.
The best-known DBMSs are MSSQL, Postgres, Oracle and MariaDB.
These applications are inherently complex for different reasons.
.STARTBULLET
.BULLET They require a description of the data;
.BULLET They require queries written in a dedicated language (SQL);
.BULLET They implement many sophisticated algorithms for performance reasons;
.BULLET Data is written in unconventional formats that may change (slightly or completely) at any moment;
.BULLET Their codebase is gargantuan\*[*], between one and several million lines of code.
.ENDBULLET
.FOOTNOTE1
MariaDB has 2.3 million lines of code (MLOC), Postgres 1.7 MLOC.
Other mentioned DBMSs aren't open-source software, but it seems reasonable to consider their number of LOC to be in the same ballpark.
.br
Just to put things into perspective, DODB is less than 1300 lines of code.
Sure, DODB doesn't have the same features, but are they worth multiplying the codebase by 1700?
.FOOTNOTE2
.BULLET
.B "Key-value stores."
@ -1235,6 +1267,11 @@ database should be an acceptable choice for most applications.
.BULLET ramdb is a great tool, with the same API as the rest so indexes can be attached to it
.ENDBULLET
DODB won't power the next
.I "AI thing" ,
it will never handle databases with a petabyte of data nor revolutionize cryptocurrency.
However, DODB may be a better fit than traditional databases for your next blog, your gaming forum, the software forge of your dreams and maybe your future MMORPG.
.TBD
.APPENDIX LRU vs Efficient LRU