From afa96d8ae77ee4839e2b0d854264fa0bcd5b4f46 Mon Sep 17 00:00:00 2001
From: Philippe PITTOLI
Date: Wed, 15 May 2024 03:10:59 +0200
Subject: [PATCH] A few new explanations.

---
 graphs/Makefile.in       |   1 +
 graphs/bin/utf8-to-ms.sh |   9 +++-
 graphs/graphs.ms         | 108 ++++++++++++++++++++++++++++++++++++---
 graphs/macros.roff       |   6 ++-
 4 files changed, 113 insertions(+), 11 deletions(-)

diff --git a/graphs/Makefile.in b/graphs/Makefile.in
index 8d23982..6b6a8bd 100644
--- a/graphs/Makefile.in
+++ b/graphs/Makefile.in
@@ -63,6 +63,7 @@ GROFF = groff $(GROFF_OPTS)

 $(SRC).pdf:
 	$(SOELIM) < $(SRC).ms |\
+	./bin/utf8-to-ms.sh |\
 	$(PRECONV) |\
 	$(EQN) |\
 	$(GHIGHLIGHT) |\
diff --git a/graphs/bin/utf8-to-ms.sh b/graphs/bin/utf8-to-ms.sh
index 70337db..64b0363 100755
--- a/graphs/bin/utf8-to-ms.sh
+++ b/graphs/bin/utf8-to-ms.sh
@@ -140,8 +140,15 @@ legal_symbols() sed \
 	-e "s/c2 ae/5c 5b 72 67 5d/g"\
 	-e "s/e2 84 a2/5c 5b 74 6d 5d/g"

+# TODO: ├─│└
+misc() sed \
+	-e "s/e2 94 9c/+/g"\
+	-e "s/e2 94 80/-/g"\
+	-e "s/e2 94 82/|/g"\
+	-e 's/e2 94 94/+/g'
+
 hexutf8_to_hexms() {
-	text_markers | accents | ligatures | legal_symbols
+	text_markers | accents | ligatures | legal_symbols | misc
 }

 to_hex_one_column | regroup_lines | hexutf8_to_hexms | from_hex
diff --git a/graphs/graphs.ms b/graphs/graphs.ms
index d2020c7..61b2b38 100644
--- a/graphs/graphs.ms
+++ b/graphs/graphs.ms
@@ -53,14 +53,17 @@ which points to entries in the "movie" table.
 .UL "The SQL language"
 enables arbitrary operations on databases: add, search, modify and delete entries.
 Furthermore, SQL also enables to manage administrative operations of the databases themselves: creating databases and tables, managing users with fine-grained authorizations, etc.
-This language is used in applications to perform operations on the database, binding the code with the database.
+SQL is used between the application and the database, to perform operations and to provide results when due.
 SQL is also used
 .UL outside
-the application, by admins for managing databases and potentially by some technical users to retrieve some data without a dedicated interface\*[*].
+the application, by admins for managing databases and potentially by some
+.I non-developer
+users to retrieve some data without a dedicated interface\*[*].
 .FOOTNOTE1
 One of the first objectives of SQL was to enable a class of
 .I non-developer
 users to talk directly to the database so they can access the data without bothering the developers.
+This has value for many companies and organizations.
 .FOOTNOTE2

 Many tools were used or even developed over the years specifically to aleviate the inherent complexity and limitations of SQL.
@@ -71,7 +74,7 @@ thus, SQL databases can be scripted to automate operations and provide a massive
 .I "stored procedures" ,
 ( see
 .I "PL/SQL" ).
-Writing SQL requests requires a lot of boiletplate since there is no integration in the programming languages, leading to multiple function calls for any operation on the database;
+Writing SQL requests requires a lot of boilerplate since there is no integration in the programming languages, leading to multiple function calls for any operation on the database;
 thus, object-relational mapping (ORM) libraries were created to reduce the massive code duplication.
 And so on.

@@ -103,18 +106,102 @@ Since homogeneity is not necessary anymore, databases have fewer (or different)
 Document-oriented databases are a sub-class of key-value stores, where metadata can be extracted from the entries for further optimizations.
 And that's exactly what is being done in Document Oriented DataBase (DODB).

-Contrary to SQL, DODB has a very narrow scope: to provide a library enabling to store, retrieve, modify and delete data.
+.UL "Contrary to SQL" ,
+DODB has a very narrow scope: to provide a library enabling to store, retrieve, modify and delete data.
 In this way, DODB transforms any application in a database manager.
 DODB doesn't provide an interactive shell, there is no request language to perform arbitrary operations on the database, no statistical optimizations of the requests based on query frequencies, etc.
 Instead, DODB reduces the complexity of the infrastructure, stores data in plain files and enables simple manual scripting with widespread unix tools.
 Simplicity is key.
+
+.UL "Contrary to other NoSQL databases" ,
+DODB doesn't provide an application but a library, nothing else.
+The idea is to help developers to store their data themselves, not depending on
+. I yet-another-all-in-one
+massive tool.
+The library writes (and removes) data on a storage device, has a few retrieval and update mechanisms and that's it\*[*].
+.FOOTNOTE1
+The lack of features
+.I is
+the feature.
+Even with that motto, the tool still is expected to be convenient for most applications.
+.FOOTNOTE2
+
+This document will provide an extensive documentation on how DODB works and how to use it.
+Limitations are also clearly stated in a dedicated section.
+A few experiments are described to provide an overview of the performance you can expect from this approach.
+Finally, a conclusion is drawn based on a real-world usage of this library.
 .
-.SECTION Basic usage
-.
+.SECTION How DODB works
+DODB is a hash table.
+The key of the hash is an auto-incremented number, the value is the stored data.
+The following section will explain the file-system representation of the data and the very few added mechanisms to speed-up searches.
+.SS Storing data
+When a value is added, it is serialized\*[*] and written in a dedicated file.
+.FOOTNOTE1
+Serialization is currently in JSON.
+CBOR is a work-in-progress.
+Nothing binds DODB to a particular format.
+.FOOTNOTE2
+The key of the hash is a number, auto-incremented, used as the name of the stored file.
+The following example shows the content of the file system after adding three values.
+.de TREE1
+.QP
+.KS
+.ft CW
+.nf
+..
+.de TREE2
+.ft
+.fi
+.KE
+.QE
+..
+.TREE1
+$ tree storage/
+storage
+`-- data
+   +-- 0000000000
+   +-- 0000000001
+   `-- 0000000002
+.TREE2
+In this example, the directory
+.I storage/data
+contains all three serialized values, with a formated number as their file name.
+.SS Indexes
+Database entries can be
+.I indexed
+based on their attributes.
+There are currently three main ways to search a value by its attributes: basic indexes, partitions and tags.
+.SSS Basic indexes (1 to 1 relation)
+Basic indexes represent one-to-one relations, such as an index in SQL.
+For example, in a database of
+.I cars ,
+each car can have a dedicted (unique) name.
+This
+.I name
+attribute can be used to speed-up searches.
+On the file-system, this will be translated as this:
+.TREE1
+storage
++-- data
+|  `-- 0000000000
+`-- indexes
+    `-- by_name
+    `-- Ford C-MAX -> ../../data/0000000000
+.TREE2
+As shown, the file "Ford C-MAX" is a symbolic link to a data file.
+The name of the symlink file has been extracted from the value itself, enabling to list all the cars and their names with a simple
+.UL ls
+in the
+.I storage/indexes/by_name/
+directory.
+.TBD
+.SECTION Basic usage of the DODB library
+.TBD
 .SECTION A few more options
-.
+.TBD
 .SECTION Limits of DODB
-.
+.TBD
 .SECTION Experimental scenario
 .LP
 The following experiment shows the performance of DODB based on quering durations.
@@ -211,3 +298,8 @@ Caching the value enables a massive performance gain, data can be retrieved seve
 .SS Tags (n to n relations)
 .LP
 .so graph_query_tag.grap
+.
+.SECTION Future work
+.TBD
+.SECTION Conclusion
+.TBD
diff --git a/graphs/macros.roff b/graphs/macros.roff
index cb43d07..e6abfb4 100644
--- a/graphs/macros.roff
+++ b/graphs/macros.roff
@@ -4,6 +4,7 @@
 .nr FM 0.3i \" page foot margin default 1i
 .nr DI 0
 .nr FF 3 \" footnotes' type: numbered, with point, indented
+.nr PS 12
 .
 .nr LIST_NUMBER 0 +1
 .
@@ -398,8 +399,9 @@ Compilé pour la dernière fois le \\$*
 .ds LH \\$*
 .de HD
 .XX
-.sp -2.8
-\l'7.5i'
+.sp -2.3
+.nr LINEWIDTH (\n[LL]/1.0i)
+\l'\\\\n[LINEWIDTH]i'
 .sp +1.5
 .br
 ..XX