Documentation PDF: smaller trees and source code, some more API doc.

Philippe PITTOLI 2024-05-22 03:58:15 +02:00
parent ce50f6f334
commit 2b24fbc8a0

@ -1,7 +1,8 @@
.so macros.roff .so macros.roff
.de TREE1 .de TREE1
.QP .QP
.ps -2 .ps -3
.vs -3
.KS .KS
.ft CW .ft CW
.b1 .b1
@ -12,18 +13,24 @@
.fi .fi
.b2 .b2
.ps .ps
.vs
.KE .KE
.QE .QE
.. ..
.de FUNCTION_CALL .de CLASS
.I \\$* .I \\$*
.. ..
.de FUNCTION_CALL
.I "\\$*"
..
. .
.de COMMAND .de COMMAND
.I "\\$*" .I "\\$*"
.. ..
.de DIRECTORY .de DIRECTORY
.ps -2
.I "\\$*" .I "\\$*"
.ps +2
.. ..
.de PRIMARY_KEY .de PRIMARY_KEY
.I \\$1 \\$2 \\$3 .I \\$1 \\$2 \\$3
@ -40,12 +47,16 @@
DODB is a database-as-library, enabling a very simple way to store applications' data: storing serialized DODB is a database-as-library, enabling a very simple way to store applications' data: storing serialized
.I documents .I documents
(basically any data type) in plain files. (basically any data type) in plain files.
To speed-up searches, attributes of these documents can be used as indexes which leads to create a few symbolic links To speed-up searches, attributes of these documents can be used as indexes.
DODB can provide a file-system representation of those indexes through a few symbolic links
.I symlinks ) ( .I symlinks ) (
on the disk. on the disk.
This enables administrators to search for data outside the application with the most basic tools, like
.I ls .
This document briefly presents DODB and its main differences with other database engines. This document briefly presents DODB and its main differences with other database engines.
An experiment is described and analysed to understand the performance that can be expected from this approach. Limits of such approach are discussed.
An experiment is described and analyzed to understand the performance that can be expected.
.ABSTRACT2 .ABSTRACT2
.SINGLE_COLUMN .SINGLE_COLUMN
.SECTION Introduction to DODB .SECTION Introduction to DODB
@ -173,7 +184,7 @@ First things first, the following code is the structure used in the rest of the
This is a simple object This is a simple object
.I Car , .I Car ,
with a name, a color and a list of associated keywords (fast, elegant, etc.). with a name, a color and a list of associated keywords (fast, elegant, etc.).
.SOURCE Ruby ps=10 .SOURCE Ruby ps=9 vs=10
class Car class Car
property name : String property name : String
property color : String property color : String
@ -184,7 +195,7 @@ end
. .
.SS DODB basic usage .SS DODB basic usage
Let's create a DODB database for our cars. Let's create a DODB database for our cars.
.SOURCE Ruby ps=10 .SOURCE Ruby ps=9 vs=10
# Database creation # Database creation
database = DODB::DataBase(Car).new "path/to/db-cars" database = DODB::DataBase(Car).new "path/to/db-cars"
@ -216,7 +227,7 @@ In this example, the directory
contains the serialized value, with a formated number as file name. contains the serialized value, with a formated number as file name.
The file "0000000000" contains the following: The file "0000000000" contains the following:
.QP .QP
.SOURCE JSON ps=10 .SOURCE JSON ps=9 vs=10
{ {
"name": "Corvet", "name": "Corvet",
"color": "red", "color": "red",
@ -234,7 +245,7 @@ The car is serialized as expected in the file
Next step, to retrieve, to modify or to delete a value, its key will be required. Next step, to retrieve, to modify or to delete a value, its key will be required.
. .
.QP .QP
.SOURCE Ruby ps=10 .SOURCE Ruby ps=9 vs=10
# Get a value based on its key. # Get a value based on its key.
database[key] database[key]
@ -251,7 +262,7 @@ The function
lists the entries with their keys. lists the entries with their keys.
. .
.QP .QP
.SOURCE Ruby ps=10 .SOURCE Ruby ps=9 vs=10
database.each_with_key do |value, key| database.each_with_key do |value, key|
puts "#{key}: #{value}" puts "#{key}: #{value}"
end end
@ -274,7 +285,7 @@ This
.I name .I name
attribute can be used to speed-up searches. attribute can be used to speed-up searches.
.QP .QP
.SOURCE Ruby ps=10 .SOURCE Ruby ps=9 vs=10
# Create an index based on the "name" attribute of the cars. # Create an index based on the "name" attribute of the cars.
cars_by_name = cars.new_index "name", do |car| cars_by_name = cars.new_index "name", do |car|
car.name car.name
@ -290,7 +301,7 @@ The
.I "index object" .I "index object"
has several useful functions. has several useful functions.
.QP .QP
.SOURCE Ruby ps=10 .SOURCE Ruby ps=9 vs=10
# Retrieve the car named "Corvet". # Retrieve the car named "Corvet".
corvet = cars_by_name.get? "Corvet" corvet = cars_by_name.get? "Corvet"
@ -337,12 +348,14 @@ An attribute can have a value that is shared by other entries in the database, s
.I color .I color
attribute of our cars. attribute of our cars.
.SOURCE Ruby ps=10 .QP
.SOURCE Ruby ps=9 vs=10
# Create a partition based on the "color" attribute of the cars. # Create a partition based on the "color" attribute of the cars.
cars_by_color = database.new_partition "color", do |car| cars_by_color = database.new_partition "color", do |car|
car.color car.color
end end
.SOURCE .SOURCE
.QE
As with basic indexes, once the partition is asked to the database, every new or modified entry will be indexed. As with basic indexes, once the partition is asked to the database, every new or modified entry will be indexed.
.KS .KS
@ -364,7 +377,7 @@ db-cars
  `-- 0000000002 -> 0000000002   `-- 0000000002 -> 0000000002
.TREE2 .TREE2
.QP .QP
Listing all the blue cars is simple as a Listing all the blue cars is simple as running
.COMMAND ls .COMMAND ls
in the in the
.DIRECTORY db-cars/partitions/by_color/blue .DIRECTORY db-cars/partitions/by_color/blue
@ -375,15 +388,17 @@ directory!
. .
. .
.SSS Tags (n to n relations) .SSS Tags (n to n relations)
Tags are basically partitions but the attribute can have multiple values. Tags are basically partitions but the indexed attribute can have multiple values.
.SOURCE Ruby ps=10 .QP
.SOURCE Ruby ps=9 vs=10
# Create a tag based on the "keywords" attribute of the cars. # Create a tag based on the "keywords" attribute of the cars.
cars_by_keywords = database.new_tags "keywords", do |car| cars_by_keywords = database.new_tags "keywords", do |car|
car.keywords car.keywords
end end
.SOURCE .SOURCE
As with other indexes, once the tag is requested to the database, every new or modified entry will be indexed. As with other indexes, once the tag is requested to the database, every new or modified entry will be indexed.
.QE
. .
. .
.KS .KS
@ -394,16 +409,18 @@ db-cars
+-- data +-- data
|  +-- 0000000000 <- this car is fast and cheap |  +-- 0000000000 <- this car is fast and cheap
|  `-- 0000000001 <- this car is fast and elegant |  `-- 0000000001 <- this car is fast and elegant
`-- partitions `-- tags
   `-- by_color    `-- by_keywords
+-- cheap +-- cheap
`-- 0000000000 -> 0000000000 `-- 0000000000 -> 0000000000
+-- elegant
`-- 0000000001 -> 0000000001
`-- fast `-- fast
+-- 0000000000 -> 0000000000 +-- 0000000000 -> 0000000000
`-- 0000000001 -> 0000000001 `-- 0000000001 -> 0000000001
.TREE2 .TREE2
.QP .QP
Listing all the fast cars is simple as a Listing all the fast cars is simple as running
.COMMAND ls .COMMAND ls
in the in the
.DIRECTORY db-cars/tags/by_keywords/fast .DIRECTORY db-cars/tags/by_keywords/fast
@ -442,14 +459,14 @@ Indexes can easily be cached, thanks to simple hash tables.
.SS Cached database .SS Cached database
A cached database has the same API as the other DODB databases. A cached database has the same API as the other DODB databases.
.QP .QP
.SOURCE Ruby ps=10 .SOURCE Ruby ps=9 vs=10
# Create a cached database # Create a cached database
database = DODB::CachedDataBase(Car).new "path/to/db-cars" database = DODB::CachedDataBase(Car).new "path/to/db-cars"
.SOURCE .SOURCE
All operations of the All operations of the
.I DODB::DataBase .CLASS DODB::DataBase
class are available for class are available for
.I DODB::CachedDataBase . .CLASS DODB::CachedDataBase .
.QE .QE
. .
.SS Cached indexes .SS Cached indexes
@ -483,7 +500,7 @@ small layer over a hash table.
Instanciate a RAM-only database is as simple as the other options. Instanciate a RAM-only database is as simple as the other options.
Moreover, this database has exactly the same API as the others, thus changing from one to another is painless. Moreover, this database has exactly the same API as the others, thus changing from one to another is painless.
.QP .QP
.SOURCE Ruby ps=10 .SOURCE Ruby ps=9 vs=10
# RAM-only database creation # RAM-only database creation
database = DODB::RAMOnlyDataBase(Car).new "path/to/db-cars" database = DODB::RAMOnlyDataBase(Car).new "path/to/db-cars"
.SOURCE .SOURCE
@ -496,7 +513,7 @@ Also, I worked enough already, leave me alone.
.SS RAM-only indexes .SS RAM-only indexes
Indexes have their RAM-only version. Indexes have their RAM-only version.
.QP .QP
.SOURCE Ruby ps=10 .SOURCE Ruby ps=9 vs=10
# RAM-only basic indexes. # RAM-only basic indexes.
cars_by_name = cars.new_RAM_index "name", &.name cars_by_name = cars.new_RAM_index "name", &.name
@ -527,7 +544,7 @@ See the "Future work" section.
. .
.SS Uncached database .SS Uncached database
By default, the database (provided by By default, the database (provided by
.I "DODB::DataBase" ) .CLASS "DODB::DataBase" )
isn't cached. isn't cached.
. .
.SS Uncached indexes .SS Uncached indexes
@ -537,7 +554,7 @@ of the data).
For that reason, indexes are cached by default. For that reason, indexes are cached by default.
But for highly memory-constrained environments, the cache can be removed. But for highly memory-constrained environments, the cache can be removed.
.QP .QP
.SOURCE Ruby ps=10 .SOURCE Ruby ps=9 vs=10
# Uncached basic indexes. # Uncached basic indexes.
cars_by_name = cars.new_uncached_index "name", &.name cars_by_name = cars.new_uncached_index "name", &.name
@ -565,7 +582,7 @@ command enables to browse the full documentation with a web browser.
. .
.SS Database creation .SS Database creation
.QP .QP
.SOURCE Ruby ps=10 .SOURCE Ruby ps=9 vs=10
# Uncached, cached and RAM-only database creation. # Uncached, cached and RAM-only database creation.
database = DODB::DataBase(Car).new "path/to/db-cars" database = DODB::DataBase(Car).new "path/to/db-cars"
database = DODB::CachedDataBase(Car).new "path/to/db-cars" database = DODB::CachedDataBase(Car).new "path/to/db-cars"
@ -575,7 +592,7 @@ database = DODB::RAMOnlyDataBase(Car).new "path/to/db-cars"
. .
.SS Browsing the database .SS Browsing the database
.QP .QP
.SOURCE Ruby ps=10 .SOURCE Ruby ps=9 vs=10
# List all the values in the database # List all the values in the database
database.each do |value| database.each do |value|
# ... # ...
@ -584,7 +601,7 @@ end
.QE .QE
.QP .QP
.SOURCE Ruby ps=10 .SOURCE Ruby ps=9 vs=10
# List all the values in the database with their key # List all the values in the database with their key
database.each_with_key do |value, key| database.each_with_key do |value, key|
# ... # ...
@ -595,16 +612,16 @@ end
.SS Database search, update and deletion with the key (integer associated to the value) .SS Database search, update and deletion with the key (integer associated to the value)
.KS .KS
.QP .QP
.SOURCE Ruby ps=10 .SOURCE Ruby ps=9 vs=10
value = database[key] # May throw a MissingEntry exception value = database[key] # May throw a MissingEntry exception
value = database[key]? # Return nil if the value doesn't exist value = database[key]? # Returns nil if the value doesn't exist
database[key] = value database[key] = value
database.delete key database.delete key
.SOURCE .SOURCE
Side note for the Side note for the
.I [] .I []
function: in case the value isn't in the database, the function throws an exception named function: in case the value isn't in the database, the function throws an exception named
.I DODB::MissingEntry . .CLASS DODB::MissingEntry .
To avoid this exception and get a To avoid this exception and get a
.I nil .I nil
value instead, use the value instead, use the
@ -616,7 +633,7 @@ function.
. .
.SS Indexes creation .SS Indexes creation
.QP .QP
.SOURCE Ruby ps=10 .SOURCE Ruby ps=9 vs=10
# Uncached, cached and RAM-only basic indexes. # Uncached, cached and RAM-only basic indexes.
cars_by_name = cars.new_uncached_index "name", &.name cars_by_name = cars.new_uncached_index "name", &.name
cars_by_name = cars.new_index "name", &.name cars_by_name = cars.new_index "name", &.name
@ -636,40 +653,49 @@ cars_by_keywords = cars.new_RAM_tags "keywords", &.keywords
. .
. .
.SS Database retrieval, update and deletion with an index .SS Database retrieval, update and deletion with an index
.
.QP .QP
.SOURCE Ruby ps=10 .SOURCE Ruby ps=9 vs=10
# Get a value from an index. # Get a value from a 1-1 index.
car = cars_by_name.get "Corvet" # May throw a MissingEntry exception car = cars_by_name.get "Corvet" # May throw a MissingEntry exception
car = cars_by_name.get? "Corvet" # Return nil if the value doesn't exist car = cars_by_name.get? "Corvet" # Returns nil if the value doesn't exist
.SOURCE
Works the same for partitions and tags, the API is consistent.
.QE
.QP
.SOURCE Ruby ps=10
# Update a value.
car = cars_by_name.update updated_car
# In case the indexed attribute changed.
car = cars_by_name.update "Corvet", updated_car
.SOURCE .SOURCE
.QE .QE
.
.QP .QP
.SOURCE Ruby ps=10 .SOURCE Ruby ps=9 vs=10
# In case the value may not exist. # Get a value from a partition (1-n relations) or a tag (n-n relations) index.
car = cars_by_name.update_or_create updated_car red_cars = cars_by_color.get "red" # empty array if no such cars exist
# In case the indexed attribute has changed. fast_cars = cars_by_keywords.get "fast" # empty array if no such cars exist
car = cars_by_name.update_or_create "Corvet", updated_car
# Several tags can be selected at the same time, to narrow the search.
cars_both_fast_and_expensive = cars_by_keywords.get ["fast", "expensive"]
.SOURCE .SOURCE
.QE .QE
.
The basic 1-1
.I "index object"
can update a value by selecting an unique entry in the database.
.QP .QP
.SOURCE Ruby ps=10 .SOURCE Ruby ps=9 vs=10
cars_by_name.delete "Corvet" car = cars_by_name.update updated_car # If the `name` hasn't changed.
car = cars_by_name.update "Corvet", updated_car # If the `name` has changed.
# Delete all red cars. car = cars_by_name.update_or_create updated_car # Updates or creates the value.
cars_by_color.delete "red" car = cars_by_name.update_or_create "Corvet", updated_car # Same.
.SOURCE
.QE
For deletion, database entries can be selected based on any index.
Partitions and tags can take a block of code to narrow the selection.
.QP
.SOURCE Ruby ps=9 vs=10
cars_by_name.delete "Corvet" # Deletes the car named "Corvet".
cars_by_color.delete "red" # Deletes all red cars.
# Delete all cars that are both blue and slow. # Deletes cars that are both slow and expensive.
cars_by_keywords.delete ["slow", "expensive"]
# Deletes all cars that are both blue and slow.
cars_by_color.delete "blue", do |car| cars_by_color.delete "blue", do |car|
car.keywords.includes? "slow" car.keywords.includes? "slow"
end end
@ -686,7 +712,7 @@ end
The Tag index enables to search for a value based on multiple keys. The Tag index enables to search for a value based on multiple keys.
For example, searching for all cars that are both fast and elegant can be written this way: For example, searching for all cars that are both fast and elegant can be written this way:
.QP .QP
.SOURCE Ruby ps=10 .SOURCE Ruby ps=9 vs=10
fast_elegant_cars = cars_by_keywords.get ["fast", "elegant"] fast_elegant_cars = cars_by_keywords.get ["fast", "elegant"]
.SOURCE .SOURCE
Used with a list of keys, the Used with a list of keys, the
@ -801,7 +827,7 @@ The library is written in Crystal and so is the benchmark (\f[CW]spec/benchmark-
Nonetheless, despite a few technicalities, the objective of this document is to provide an insight on the approach used in DODB more than this particular implementation. Nonetheless, despite a few technicalities, the objective of this document is to provide an insight on the approach used in DODB more than this particular implementation.
The manipulated data type can be found in \f[CW]spec/db-cars.cr\f[]. The manipulated data type can be found in \f[CW]spec/db-cars.cr\f[].
.SOURCE Ruby ps=9 vs=9p .SOURCE Ruby ps=9 vs=9p vs=10
class Car class Car
property name : String # 1-1 relation property name : String # 1-1 relation
property color : String # 1-n relation property color : String # 1-n relation