From 2b24fbc8a0fd3e0d459a312d4107e0304fe8e734 Mon Sep 17 00:00:00 2001 From: Philippe PITTOLI Date: Wed, 22 May 2024 03:58:15 +0200 Subject: [PATCH] Documentation PDF: smaller trees and source code, some more API doc. --- graphs/graphs.ms | 142 ++++++++++++++++++++++++++++------------------- 1 file changed, 84 insertions(+), 58 deletions(-) diff --git a/graphs/graphs.ms b/graphs/graphs.ms index 2e13098..450c1ab 100644 --- a/graphs/graphs.ms +++ b/graphs/graphs.ms @@ -1,7 +1,8 @@ .so macros.roff .de TREE1 .QP -.ps -2 +.ps -3 +.vs -3 .KS .ft CW .b1 @@ -12,18 +13,24 @@ .fi .b2 .ps +.vs .KE .QE .. -.de FUNCTION_CALL +.de CLASS .I \\$* .. +.de FUNCTION_CALL +.I "\\$*" +.. . .de COMMAND .I "\\$*" .. .de DIRECTORY +.ps -2 .I "\\$*" +.ps +2 .. .de PRIMARY_KEY .I \\$1 \\$2 \\$3 @@ -40,12 +47,16 @@ DODB is a database-as-library, enabling a very simple way to store applications' data: storing serialized .I documents (basically any data type) in plain files. -To speed-up searches, attributes of these documents can be used as indexes which leads to create a few symbolic links +To speed up searches, attributes of these documents can be used as indexes. +DODB can provide a file-system representation of those indexes through a few symbolic links .I symlinks ) ( on the disk. +This enables administrators to search for data outside the application with the most basic tools, like +.I ls . This document briefly presents DODB and its main differences with other database engines. -An experiment is described and analysed to understand the performance that can be expected from this approach. +Limits of such an approach are discussed. +An experiment is described and analyzed to understand the performance that can be expected. .ABSTRACT2 .SINGLE_COLUMN .SECTION Introduction to DODB @@ -173,7 +184,7 @@ First things first, the following code is the structure used in the rest of the This is a simple object .I Car , with a name, a color and a list of associated keywords (fast, elegant, etc.). 
-.SOURCE Ruby ps=10 +.SOURCE Ruby ps=9 vs=10 class Car property name : String property color : String @@ -184,7 +195,7 @@ end . .SS DODB basic usage Let's create a DODB database for our cars. -.SOURCE Ruby ps=10 +.SOURCE Ruby ps=9 vs=10 # Database creation database = DODB::DataBase(Car).new "path/to/db-cars" @@ -216,7 +227,7 @@ In this example, the directory contains the serialized value, with a formated number as file name. The file "0000000000" contains the following: .QP -.SOURCE JSON ps=10 +.SOURCE JSON ps=9 vs=10 { "name": "Corvet", "color": "red", @@ -234,7 +245,7 @@ The car is serialized as expected in the file Next step, to retrieve, to modify or to delete a value, its key will be required. . .QP -.SOURCE Ruby ps=10 +.SOURCE Ruby ps=9 vs=10 # Get a value based on its key. database[key] @@ -251,7 +262,7 @@ The function lists the entries with their keys. . .QP -.SOURCE Ruby ps=10 +.SOURCE Ruby ps=9 vs=10 database.each_with_key do |value, key| puts "#{key}: #{value}" end @@ -274,7 +285,7 @@ This .I name attribute can be used to speed-up searches. .QP -.SOURCE Ruby ps=10 +.SOURCE Ruby ps=9 vs=10 # Create an index based on the "name" attribute of the cars. cars_by_name = cars.new_index "name", do |car| car.name @@ -290,7 +301,7 @@ The .I "index object" has several useful functions. .QP -.SOURCE Ruby ps=10 +.SOURCE Ruby ps=9 vs=10 # Retrieve the car named "Corvet". corvet = cars_by_name.get? "Corvet" @@ -337,12 +348,14 @@ An attribute can have a value that is shared by other entries in the database, s .I color attribute of our cars. -.SOURCE Ruby ps=10 +.QP +.SOURCE Ruby ps=9 vs=10 # Create a partition based on the "color" attribute of the cars. cars_by_color = database.new_partition "color", do |car| car.color end .SOURCE +.QE As with basic indexes, once the partition is asked to the database, every new or modified entry will be indexed. 
.KS @@ -364,7 +377,7 @@ db-cars   `-- 0000000002 -> 0000000002 .TREE2 .QP -Listing all the blue cars is simple as a +Listing all the blue cars is simple as running .COMMAND ls in the .DIRECTORY db-cars/partitions/by_color/blue @@ -375,15 +388,17 @@ directory! . . .SSS Tags (n to n relations) -Tags are basically partitions but the attribute can have multiple values. +Tags are basically partitions but the indexed attribute can have multiple values. -.SOURCE Ruby ps=10 +.QP +.SOURCE Ruby ps=9 vs=10 # Create a tag based on the "keywords" attribute of the cars. cars_by_keywords = database.new_tags "keywords", do |car| car.keywords end .SOURCE As with other indexes, once the tag is requested to the database, every new or modified entry will be indexed. +.QE . . .KS @@ -394,16 +409,18 @@ db-cars +-- data |  +-- 0000000000 <- this car is fast and cheap |  `-- 0000000001 <- this car is fast and elegant -`-- partitions -    `-- by_color +`-- tags +    `-- by_keywords +-- cheap `-- 0000000000 -> 0000000000 + +-- elegant + `-- 0000000001 -> 0000000001 `-- fast +-- 0000000000 -> 0000000000 `-- 0000000001 -> 0000000001 .TREE2 .QP -Listing all the fast cars is simple as a +Listing all the fast cars is simple as running .COMMAND ls in the .DIRECTORY db-cars/tags/by_keywords/fast @@ -442,14 +459,14 @@ Indexes can easily be cached, thanks to simple hash tables. .SS Cached database A cached database has the same API as the other DODB databases. .QP -.SOURCE Ruby ps=10 +.SOURCE Ruby ps=9 vs=10 # Create a cached database database = DODB::CachedDataBase(Car).new "path/to/db-cars" .SOURCE All operations of the -.I DODB::DataBase +.CLASS DODB::DataBase class are available for -.I DODB::CachedDataBase . +.CLASS DODB::CachedDataBase . .QE . .SS Cached indexes @@ -483,7 +500,7 @@ small layer over a hash table. Instanciate a RAM-only database is as simple as the other options. Moreover, this database has exactly the same API as the others, thus changing from one to another is painless. 
.QP -.SOURCE Ruby ps=10 +.SOURCE Ruby ps=9 vs=10 # RAM-only database creation database = DODB::RAMOnlyDataBase(Car).new "path/to/db-cars" .SOURCE @@ -496,7 +513,7 @@ Also, I worked enough already, leave me alone. .SS RAM-only indexes Indexes have their RAM-only version. .QP -.SOURCE Ruby ps=10 +.SOURCE Ruby ps=9 vs=10 # RAM-only basic indexes. cars_by_name = cars.new_RAM_index "name", &.name @@ -527,7 +544,7 @@ See the "Future work" section. . .SS Uncached database By default, the database (provided by -.I "DODB::DataBase" ) +.CLASS "DODB::DataBase" ) isn't cached. . .SS Uncached indexes @@ -537,7 +554,7 @@ of the data). For that reason, indexes are cached by default. But for highly memory-constrained environments, the cache can be removed. .QP -.SOURCE Ruby ps=10 +.SOURCE Ruby ps=9 vs=10 # Uncached basic indexes. cars_by_name = cars.new_uncached_index "name", &.name @@ -565,7 +582,7 @@ command enables to browse the full documentation with a web browser. . .SS Database creation .QP -.SOURCE Ruby ps=10 +.SOURCE Ruby ps=9 vs=10 # Uncached, cached and RAM-only database creation. database = DODB::DataBase(Car).new "path/to/db-cars" database = DODB::CachedDataBase(Car).new "path/to/db-cars" @@ -575,7 +592,7 @@ database = DODB::RAMOnlyDataBase(Car).new "path/to/db-cars" . .SS Browsing the database .QP -.SOURCE Ruby ps=10 +.SOURCE Ruby ps=9 vs=10 # List all the values in the database database.each do |value| # ... @@ -584,7 +601,7 @@ end .QE .QP -.SOURCE Ruby ps=10 +.SOURCE Ruby ps=9 vs=10 # List all the values in the database with their key database.each_with_key do |value, key| # ... @@ -595,16 +612,16 @@ end .SS Database search, update and deletion with the key (integer associated to the value) .KS .QP -.SOURCE Ruby ps=10 +.SOURCE Ruby ps=9 vs=10 value = database[key] # May throw a MissingEntry exception -value = database[key]? # Return nil if the value doesn't exist +value = database[key]? 
# Returns nil if the value doesn't exist database[key] = value database.delete key .SOURCE Side note for the .I [] function: in case the value isn't in the database, the function throws an exception named -.I DODB::MissingEntry . +.CLASS DODB::MissingEntry . To avoid this exception and get a .I nil value instead, use the @@ -616,7 +633,7 @@ function. . .SS Indexes creation .QP -.SOURCE Ruby ps=10 +.SOURCE Ruby ps=9 vs=10 # Uncached, cached and RAM-only basic indexes. cars_by_name = cars.new_uncached_index "name", &.name cars_by_name = cars.new_index "name", &.name @@ -636,40 +653,49 @@ cars_by_keywords = cars.new_RAM_tags "keywords", &.keywords . . .SS Database retrieval, update and deletion with an index +. .QP -.SOURCE Ruby ps=10 -# Get a value from an index. +.SOURCE Ruby ps=9 vs=10 +# Get a value from a 1-1 index. car = cars_by_name.get "Corvet" # May throw a MissingEntry exception -car = cars_by_name.get? "Corvet" # Return nil if the value doesn't exist -.SOURCE -Works the same for partitions and tags, the API is consistent. -.QE -.QP -.SOURCE Ruby ps=10 -# Update a value. -car = cars_by_name.update updated_car -# In case the indexed attribute changed. -car = cars_by_name.update "Corvet", updated_car +car = cars_by_name.get? "Corvet" # Returns nil if the value doesn't exist .SOURCE .QE - +. .QP -.SOURCE Ruby ps=10 -# In case the value may not exist. -car = cars_by_name.update_or_create updated_car -# In case the indexed attribute has changed. -car = cars_by_name.update_or_create "Corvet", updated_car +.SOURCE Ruby ps=9 vs=10 +# Get a value from a partition (1-n relations) or a tag (n-n relations) index. +red_cars = cars_by_color.get "red" # empty array if no such cars exist +fast_cars = cars_by_keywords.get "fast" # empty array if no such cars exist + +# Several tags can be selected at the same time, to narrow the search. +cars_both_fast_and_expensive = cars_by_keywords.get ["fast", "expensive"] .SOURCE .QE - +. 
+The basic 1-1 +.I "index object" +can update a value by selecting a unique entry in the database. .QP -.SOURCE Ruby ps=10 -cars_by_name.delete "Corvet" +.SOURCE Ruby ps=9 vs=10 +car = cars_by_name.update updated_car # If the `name` hasn't changed. +car = cars_by_name.update "Corvet", updated_car # If the `name` has changed. -# Delete all red cars. -cars_by_color.delete "red" +car = cars_by_name.update_or_create updated_car # Updates or creates the value. +car = cars_by_name.update_or_create "Corvet", updated_car # Same. +.SOURCE +.QE +For deletion, database entries can be selected based on any index. +Partitions and tags can take a block of code to narrow the selection. +.QP +.SOURCE Ruby ps=9 vs=10 +cars_by_name.delete "Corvet" # Deletes the car named "Corvet". +cars_by_color.delete "red" # Deletes all red cars. -# Delete all cars that are both blue and slow. +# Deletes cars that are both slow and expensive. +cars_by_keywords.delete ["slow", "expensive"] + +# Deletes all cars that are both blue and slow. cars_by_color.delete "blue", do |car| car.keywords.includes? "slow" end @@ -686,7 +712,7 @@ end The Tag index enables to search for a value based on multiple keys. For example, searching for all cars that are both fast and elegant can be written this way: .QP -.SOURCE Ruby ps=10 +.SOURCE Ruby ps=9 vs=10 fast_elegant_cars = cars_by_keywords.get ["fast", "elegant"] .SOURCE Used with a list of keys, the @@ -801,7 +827,7 @@ The library is written in Crystal and so is the benchmark (\f[CW]spec/benchmark- Nonetheless, despite a few technicalities, the objective of this document is to provide an insight on the approach used in DODB more than this particular implementation. The manipulated data type can be found in \f[CW]spec/db-cars.cr\f[]. -.SOURCE Ruby ps=9 vs=9p vs=10 class Car property name : String # 1-1 relation property color : String # 1-n relation