From cdfaf3a006070d6d4825efd094840adf81274b02 Mon Sep 17 00:00:00 2001
From: Philippe Pittoli
Date: Thu, 6 Mar 2025 16:35:10 +0100
Subject: [PATCH] Alternatives++.

---
 paper/paper.ms | 71 +++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 53 insertions(+), 18 deletions(-)

diff --git a/paper/paper.ms b/paper/paper.ms
index 9caa88f..f5ad955 100644
--- a/paper/paper.ms
+++ b/paper/paper.ms
@@ -1361,7 +1361,10 @@ Isolation : \*[OK] :T{
 technique\*[*]
 T}
 Durability : \*[OK] :\*[OK] checksums
-Access Time : 0.1 to 2ms :a few µs (cache) to a few ms (first access with a hard disk)
+Access Time : 0.1 to 2ms :T{
+a few µs (cache), a few dozen µs (SSD+NVMe), a few hundred µs (SSD)
+and up to a dozen ms (hard disk)
+T}
 High avail. : \*[OK] :T{
 \*[OK] RAID & variants plus many distributed or cluster filesystems
 T}
@@ -1414,11 +1417,15 @@ Beside CRUD operations, a small project could imply basic relations between data
 in DBMS jargon) and a few thousand operations per second.
 Both relations and transactions could be handled by the application, not necessarily by the database system itself.
 .FOOTNOTE2
-Performance is simply not a problem for most use nowadays.
-Having a directory with a few million entries is fine on modern filesystems.
-The first file access is slow (a few ms) then the kernel
+Performance simply isn't a problem for most uses nowadays.
+Having a directory with a few million entries is fine on modern filesystems since reaching a file by name (with its full path) doesn't trigger a linear search.
+The first file access is slow\*[*] then the kernel
 .B automatically
 caches the file, making it reachable in about a few dozen µs which is virtually nothing.
+.FOOTNOTE1
+The first access to a file can take a few milliseconds on a hard drive and about a hundred microseconds on an SSD.
+But today, with the NVMe protocol, the latency of the first file access can be as low as a dozen microseconds.
+.FOOTNOTE2
 .
 .
 .SECTION Alternatives
@@ -1451,7 +1458,18 @@ Sure, DODB doesn't have the same features, but are they worth multiplying the co
 .BULLET
 .B "Embedded SQL database"
 .
-Examples: SQLite and DuckDB.
+Example:
+.B SQLite .
+This is a library (for
+.dq "serverless"
+applications) implementing SQL for database operations, making it far more complex and slower than DODB.
+
+Like SQLite,
+.B DuckDB
+is also a library, with a slightly different objective.
+Instead of being written and optimized to answer real-time requests, its goal is to perform operations on large data sets with a focus on analytical processing.
+And again, like SQLite, DuckDB implements SQL and sophisticated operations, making it far more complex than DODB.
+However, its stated goal is fairly different from the subject of this paper, which may explain its complexity.
 .BULLET
 .B "New types of SQL"
 .
@@ -1460,20 +1478,39 @@ Example:
 (modern SQL-like with easy-of-use complex relations),
 .B RethinkDB
 (modern SQL-like, JSON exchanges, distributed).
+These new applications try to improve the SQL language with the benefit of hindsight gained from experience with current SQL technologies.
+Besides a few operations simplified compared to their current SQL equivalents, and some performance improvements thanks to fine-grained typing, none of them tackles the complexity problem of the database itself.
+A new SQL-like language would still require an enormous piece of code to run.
 
 .BULLET
 .B "Key-value stores."
-Examples:
-.B RocksDB
-(embedded),
+Example:
 .B Memcached
-(application to store data cache, not to be used as a primary database system),
-.B CockroachDB
-(proprietary, distributed, ACID transactions)
+(application to store data cache, not to be used as a primary database system).
+KV stores are often used as a cache for traditional DBMSs.
+KV stores have the advantage of being simpler than SQL databases: Memcached
+.dq only
+has 61 kloc, for example.
+However, most KV stores implement features beyond the core functionality.
+.B Redis ,
+and its open-source fork
+.B Valkey ,
+are complex KV stores with a lot of features, including support for many data types, a message broker, clustering, distributed caching, optional durability, server-side scripting, etc.
+
+Many other KV stores can be mentioned, such as
+.B LevelDB
+(embedded),
+and
+.B RocksDB
+(fork of LevelDB with added features, such as transactions, snapshots, bloom filters, optimizations for multiple CPUs, etc.),
+.B CockroachDB
+(proprietary, distributed, ACID transactions), etc.
+
+.KS
 .BULLET
 .B "Document databases"
 .
-Such as DODB, many other document-oriented databases exist.
+Many other document-oriented databases exist besides DODB.
 For example,
 .B CouchDB
 (distributed, fault-tolerant, RESTful HTTP and JSON API…),
@@ -1481,14 +1518,12 @@ For example,
 (proprietary, ACID transactions, replication…),
 .B UnQlite
 (embedded, ACID transactions, embedded scripting language…).
-
-.BULLET
-.B "Redis"
-
-.BULLET
-.B "duckdb"
+As far as the author knows, none of them is as simple as DODB.
+.KE
 .ENDBULLET
 
+.B Cassandra
+
 .TBD
 .
 .SECTION Future work