Limits of DODB++.
parent
5dbc282027
commit
9c98dd33ce
1 changed files with 63 additions and 27 deletions
|
@ -1088,40 +1088,40 @@ Which also results from a lack of time.
|
||||||
.FOOTNOTE2
|
.FOOTNOTE2
|
||||||
|
|
||||||
.SS "Beyond ACID properties \[en] modern databases' features"
|
.SS "Beyond ACID properties \[en] modern databases' features"
|
||||||
Most current databases (traditional relational databases, some key-value databases and so on) provide additional features.
|
Most current databases (traditional relational databases, some key-value databases and so on) provide additional features that need to be addressed.
|
||||||
|
|
||||||
.STARTBULLET
|
.STARTBULLET
|
||||||
.KS
|
.KS
|
||||||
.BULLET
|
.BULLET
|
||||||
.B "High availability toolsets"
|
.B "High availability toolsets"
|
||||||
(replication, clustering, etc.).
|
(replication, clustering, etc.).
|
||||||
Well, this simply doesn't match with DODB goals to provide a database for small projects.
|
This is out of scope.
|
||||||
These tools imply an unreasonable amount of code compared to the current DODB library.
|
They don't match DODB's goal to begin with, which is to provide a database for small projects.
|
||||||
|
The author of this document did not explore this idea and probably never will.
|
||||||
.KE
|
.KE
|
||||||
|
|
||||||
However, some of these features could be provided by the filesystem itself.
|
A (maybe limited) version of these features could be provided by the filesystem itself.
|
||||||
|
For example, CephFS is a filesystem designed for replication, fault tolerance, large-scale deployment and so on.
|
||||||
|
|
||||||
.KS
|
.KS
|
||||||
.BULLET
|
.BULLET
|
||||||
.B Modularity
|
.B Modularity .
|
||||||
(several storage backends, specific interfaces with other tools, etc.).
|
Traditional DBMSs often have several storage backends to meet the needs in different contexts.
|
||||||
|
Technically, DODB already implements several storage backends, since the DODB RAMOnly database doesn't record data on a storage device, unlike the other implementations.
|
||||||
|
More importantly, the definition of a database in DODB is simple enough to consider developing a specialized backend for any specific need.
|
||||||
|
The RAMOnly database only has 33 lines of code and is a great starting point for more complex implementations.
|
||||||
.KE
|
.KE
|
||||||
|
|
||||||
|
Also, traditional DBMSs may have specific interfaces with other tools, for example to delegate a feature to external software such as ElasticSearch for complex requests on strings (which may require some sophisticated text analysis).
|
||||||
|
There is no facility in DODB to provide this; however, providing data to an external tool could be as simple as implementing a new trigger, which could be achieved in a few dozen lines.
|
||||||
|
|
||||||
.KS
|
.KS
|
||||||
.BULLET
|
.BULLET
|
||||||
.B "Interactive management"
|
.B "Database administration" .
|
||||||
(through command lines or a dedicated shell).
|
Traditional databases can be managed through command lines or a dedicated shell, enabling interactive CRUD on databases (and tables) themselves, user and authorization management, etc.
|
||||||
.KE
|
DODB cannot, for the very same reason it came into existence: enabling this kind of tooling implies an enormous amount of code and complexity, obfuscating core database operations that should be both understandable and customizable.
|
||||||
|
|
||||||
.KS
|
|
||||||
.BULLET
|
|
||||||
.B "Database administration"
|
|
||||||
(CRUD on databases themselves, user and authorization management, etc.).
|
|
||||||
.KE
|
.KE
|
||||||
.ENDBULLET
|
.ENDBULLET
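To make the modularity point above more concrete, here is a minimal sketch of a RAM-only storage backend with a trigger that forwards data to an external tool. It is written in Python for illustration only and does not reflect DODB's actual API: the names RAMOnlyStore, Trigger and ExternalIndexTrigger, and all of their methods, are hypothetical, and the "external tool" is simulated by an in-memory list.

```python
from typing import Any, Dict, List


class Trigger:
    """Hypothetical hook interface: notified when a value is stored or removed."""

    def on_write(self, key: int, value: Any) -> None:
        """Called after a value has been stored."""

    def on_delete(self, key: int, value: Any) -> None:
        """Called after a value has been removed."""


class ExternalIndexTrigger(Trigger):
    """Stands in for a trigger feeding an external search tool.

    Assumption: messages are only collected in a list; a real trigger
    would send them to the external software instead.
    """

    def __init__(self) -> None:
        self.outbox: List[str] = []

    def on_write(self, key: int, value: Any) -> None:
        self.outbox.append(f"index {key}: {value!r}")

    def on_delete(self, key: int, value: Any) -> None:
        self.outbox.append(f"unindex {key}")


class RAMOnlyStore:
    """Hypothetical backend keeping every value in memory, never on disk."""

    def __init__(self) -> None:
        self._data: Dict[int, Any] = {}
        self._next_key = 0
        self._triggers: List[Trigger] = []

    def add_trigger(self, trigger: Trigger) -> None:
        self._triggers.append(trigger)

    def append(self, value: Any) -> int:
        """Store a value under a fresh key, then notify every trigger."""
        key = self._next_key
        self._next_key += 1
        self._data[key] = value
        for trigger in self._triggers:
            trigger.on_write(key, value)
        return key

    def get(self, key: int) -> Any:
        return self._data[key]

    def delete(self, key: int) -> None:
        """Remove a value, then notify every trigger."""
        value = self._data.pop(key)
        for trigger in self._triggers:
            trigger.on_delete(key, value)
```

A file-backed variant would only need to replace the dictionary accesses with reads and writes on a storage device; the trigger mechanism would stay the same.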
|
||||||
|
|
||||||
Because DODB is a library and doesn't support an intermediary language for generic requests,
|
|
||||||
.TBD
|
|
||||||
.
|
.
|
||||||
.SS "The state of file systems, their limitations and useful features for DODB instances"
|
.SS "The state of file systems, their limitations and useful features for DODB instances"
|
||||||
A
|
A
|
||||||
|
@ -1132,7 +1132,8 @@ The next paragraphs will give an idea of how filesystems work, the implied limit
|
||||||
.FOOTNOTE1
|
.FOOTNOTE1
|
||||||
Explaining the way filesystems work and their design is out of the scope of this document, so this part will be kept short for readability reasons.
|
Explaining the way filesystems work and their design is out of the scope of this document, so this part will be kept short for readability reasons.
|
||||||
.FOOTNOTE2
|
.FOOTNOTE2
|
||||||
|
.
|
||||||
|
.SSS "How a filesystem works"
|
||||||
Filesystems designed for specific constraints, such as writing data on a compact disk\*[*] or providing a network filesystem, are outside the scope of this document.
|
Filesystems designed for specific constraints, such as writing data on a compact disk\*[*] or providing a network filesystem, are outside the scope of this document.
|
||||||
.FOOTNOTE1
|
.FOOTNOTE1
|
||||||
A compact disk has specific constraints since the device will then only provide read-only access to the data, obviating the need for most of the complexity revolving around fragmentation, inode management and so on.
|
A compact disk has specific constraints since the device will then only provide read-only access to the data, obviating the need for most of the complexity revolving around fragmentation, inode management and so on.
|
||||||
|
@ -1168,7 +1169,8 @@ Since all files cannot be reasonably expected to be written in a continuous segm
|
||||||
Filesystems may allow tweaking the block size (related to the
|
Filesystems may allow tweaking the block size (related to the
|
||||||
.I "sector"
|
.I "sector"
|
||||||
size of the storage device) either to reduce fragmentation and metadata (bigger block sizes for partitions with big files) or to avoid wasting space (smaller block sizes for partitions with a huge number of files smaller than a block).
|
size of the storage device) either to reduce fragmentation and metadata (bigger block sizes for partitions with big files) or to avoid wasting space (smaller block sizes for partitions with a huge number of files smaller than a block).
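The space trade-off described above can be illustrated with a short calculation. This is a simplified sketch in Python: it assumes every file occupies a whole number of blocks and ignores metadata as well as features such as block suballocation, so the figures are only indicative. The function names and the sample file sizes are made up for the example.

```python
def allocated_bytes(file_size: int, block_size: int) -> int:
    """Space consumed by a file when data is allocated in whole blocks."""
    blocks = -(-file_size // block_size)  # integer ceiling division
    return blocks * block_size


def wasted_bytes(file_sizes, block_size: int) -> int:
    """Total space allocated but not used by actual file content."""
    return sum(allocated_bytes(size, block_size) - size for size in file_sizes)


# Assumption: a partition holding 1000 small files of 200 bytes each.
small_files = [200] * 1000
for block_size in (512, 4096):
    print(block_size, wasted_bytes(small_files, block_size))
```

With many files smaller than a block, the wasted space grows with the block size, which is the reason smaller blocks suit this kind of partition.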
|
||||||
|
.
|
||||||
|
.SSS "Objectives of a filesystem"
|
||||||
Filesystems share a (loosely) common set of objectives.
|
Filesystems share a (loosely) common set of objectives.
|
||||||
|
|
||||||
.STARTBULLET
|
.STARTBULLET
|
||||||
|
@ -1225,7 +1227,9 @@ A few other features need to be mentionned, such as block suballocation, file co
|
||||||
More than a decade ago, some filesystems added then under-explored features such as snapshots, compression and transactions.
|
More than a decade ago, some filesystems added then under-explored features such as snapshots, compression and transactions.
|
||||||
.KE
|
.KE
|
||||||
.ENDBULLET
|
.ENDBULLET
|
||||||
|
.
|
||||||
|
.KS
|
||||||
|
.SSS "Quick comparison between DBMSs and filesystems"
|
||||||
.ds OK \[OK]
|
.ds OK \[OK]
|
||||||
.ds NOK \[tmu]
|
.ds NOK \[tmu]
|
||||||
.nr total 16.0c
|
.nr total 16.0c
|
||||||
|
@ -1248,10 +1252,13 @@ T}
|
||||||
Consistency : \*[OK] : \*[NOK]
|
Consistency : \*[OK] : \*[NOK]
|
||||||
Isolation : \*[OK] :T{
|
Isolation : \*[OK] :T{
|
||||||
.dq "new file then mv"
|
.dq "new file then mv"
|
||||||
technique
|
technique\*[*]
|
||||||
T}
|
T}
|
||||||
Durability : \*[OK] :limited (checksums)
|
Durability : \*[OK] :limited (checksums)
|
||||||
Access Time : 0.1 to 2ms :a few µs (cache) to a few ms (first access with a hard disk)
|
Access Time : 0.1 to 2ms :a few µs (cache) to a few ms (first access with a hard disk)
|
||||||
|
High avail. : \*[OK] :T{
|
||||||
|
RAID & variants
|
||||||
|
T}
|
||||||
Transactions : \*[OK] :T{
|
Transactions : \*[OK] :T{
|
||||||
implemented in a few filesystems (BTRFS, ZFS)
|
implemented in a few filesystems (BTRFS, ZFS)
|
||||||
T}
|
T}
|
||||||
|
@ -1265,26 +1272,55 @@ T}:T{
|
||||||
depends on many factors, but generally important
|
depends on many factors, but generally important
|
||||||
T}
|
T}
|
||||||
.TE
|
.TE
|
||||||
|
.FOOTNOTE1
|
||||||
|
In a desktop environment this technique isn't viable: users usually just rewrite data in place.
|
||||||
|
For a data management library, however, this method of ensuring data integrity is a no-brainer.
|
||||||
|
.FOOTNOTE2
|
||||||
|
.KE
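The "new file then mv" technique mentioned in the table can be sketched as follows. This is a minimal Python illustration, not code taken from DODB: the function name atomic_write is hypothetical, and the sketch assumes a POSIX filesystem where renaming a file within the same directory replaces the destination atomically.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Write data to a temporary file, then rename it over the destination.

    Readers see either the old content or the new one, never a partial write.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the same directory, hence on the same
    # filesystem, so the final rename does not need to copy any data.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the content to the storage device
        os.replace(tmp_path, path)  # the "mv" step
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

Rewriting the file in place would be shorter, but an interruption in the middle of the write could leave a file that is neither the old version nor the new one.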
|
||||||
|
.
|
||||||
|
.KS
|
||||||
|
.SSS "Exotic filesystems"
|
||||||
|
Filesystems have been developed over the years for various requirements.
|
||||||
|
Let's browse a few of them to get an overview of what is possible.
|
||||||
|
|
||||||
.B "Conclusion" .
|
.B Kernel-related .
|
||||||
|
A whole class of filesystems is dedicated to providing an interface to the kernel, such as
|
||||||
|
.I procfs
|
||||||
|
(information about running processes),
|
||||||
|
.I sysfs
|
||||||
|
(to tweak a few device parameters) or even
|
||||||
|
.I debugfs
|
||||||
|
(to provide debug info from the kernel to user-space).
|
||||||
|
|
||||||
|
.B "Cluster, network, high-availability, distributed" …
|
||||||
|
.\" Many filesystems aim to provide a network-accessible cluster of storage,
|
||||||
|
.\" Beside
|
||||||
|
.\" Research on filesystems Beside well-known
|
||||||
|
.KE
|
||||||
|
.
|
||||||
|
.KS
|
||||||
|
.SSS "Conclusion on filesystems"
|
||||||
The difference between the feature set of traditional databases and filesystems has narrowed slightly over time.
|
The difference between the feature set of traditional databases and filesystems has narrowed slightly over time.
|
||||||
The discrepancy will always be there since they do not share the same goal, yet some features overlap.
|
The discrepancy will always be there since they do not share the same goal, yet some features overlap.
|
||||||
Even though no current filesystem has been designed to be used the way DODB uses them, this kind of database system can profit from some
|
Even though no current filesystem has been designed to be used the way DODB uses them, this kind of database system can profit from some
|
||||||
.dq recent
|
.dq recent
|
||||||
developments in the filesystem world (such as transactions).
|
developments in the filesystem world (such as transactions).
|
||||||
The codebase size (and complexity) necessary to create a database system that provides acceptable performances for a small project \*[*] shrunk drastically thanks to hardware and filesystem developments.
|
.KE
|
||||||
|
|
||||||
|
Also, the codebase size (and complexity) necessary to create a database system that provides acceptable performance for a small project \*[*] has shrunk drastically thanks to hardware and filesystem developments.
|
||||||
.FOOTNOTE1
|
.FOOTNOTE1
|
||||||
Besides CRUD operations, a small project could imply basic relations between data, some simple transactions, a few databases (or
|
Besides CRUD operations, a small project could imply basic relations between data, some simple transactions, a few databases (or
|
||||||
.I tables
|
.I tables
|
||||||
in DBMS jargon) and a few thousand operations per second.
|
in DBMS jargon) and a few thousand operations per second.
|
||||||
Both relations and transactions could be handled by the application, not necessarily by the database system itself.
|
Both relations and transactions could be handled by the application, not necessarily by the database system itself.
|
||||||
.FOOTNOTE2
|
.FOOTNOTE2
|
||||||
|
Performance is simply not a problem for most uses nowadays.
|
||||||
Performance is simply not a problem for most use.
|
|
||||||
Having a directory with a few million entries is fine on modern filesystems.
|
Having a directory with a few million entries is fine on modern filesystems.
|
||||||
The access time is slow (a few ms) only on the first access, the kernel
|
The first file access is slow (a few ms), then the kernel
|
||||||
.B automatically
|
.B automatically
|
||||||
caches accessed files, then we are talking about a few dozen µs which is virtually nothing.
|
caches the file, making it reachable in a few dozen µs, which is virtually nothing.
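Readers who want to observe access times on their own machine could start from a sketch like the one below. It is a rough Python illustration with hypothetical function names, not a benchmark: a file that has just been written is usually already in the kernel cache, so every read measured here is likely a cached one, and observing a genuinely cold first access requires the cache to be emptied beforehand (for instance after a reboot).

```python
import os
import tempfile
import time


def timed_read(path: str):
    """Read a whole file and return its content with the elapsed time in seconds."""
    start = time.perf_counter()
    with open(path, "rb") as f:
        data = f.read()
    return data, time.perf_counter() - start


def measure_reads(repeats: int = 3, size: int = 4096):
    """Write a small temporary file, then time several consecutive reads of it."""
    payload = os.urandom(size)
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name
    try:
        timings = []
        for _ in range(repeats):
            data, elapsed = timed_read(path)
            assert data == payload  # sanity check: the content is intact
            timings.append(elapsed)
        return timings
    finally:
        os.unlink(path)


if __name__ == "__main__":
    for i, seconds in enumerate(measure_reads()):
        print(f"read {i}: {seconds * 1e6:.1f} µs")
```

The measured values include the overhead of the Python interpreter and depend heavily on the hardware, so they should only be read as orders of magnitude.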
|
||||||
|
|
||||||
|
TODO: dedicated filesystems
|
||||||
.
|
.
|
||||||
.
|
.
|
||||||
.SECTION Alternatives
|
.SECTION Alternatives
|
||||||
|
|