|
|
|
@ -272,14 +272,21 @@ Of course, browsing the entire database to find a value (or its key) is a waste
|
|
|
|
|
That is when indexes come into play.
|
|
|
|
|
.
|
|
|
|
|
.
|
|
|
|
|
.SS Indexes
|
|
|
|
|
Entries can be
|
|
|
|
|
.I indexed
|
|
|
|
|
based on their attributes.
|
|
|
|
|
There are currently three main ways to search for a value by its attributes: basic indexes, partitions and tags.
|
|
|
|
|
.SS Triggers
|
|
|
|
|
A simple way to quickly retrieve a piece of data is to create
|
|
|
|
|
.I indexes
|
|
|
|
|
based on its attributes.
|
|
|
|
|
When a value is added to the database, or when it is modified, a
|
|
|
|
|
.I trigger
|
|
|
|
|
can be called to index it.
|
|
|
|
|
There are currently three main triggers in
|
|
|
|
|
.CLASS DODB
|
|
|
|
|
to index values: basic indexes, partitions and tags.
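The trigger mechanism described above can be illustrated with a small, self-contained sketch. It is written in plain Ruby and is not DODB's code: TinyStore, IndexTrigger, key_for and the Car structure are hypothetical names chosen for this example, and the index lives in a simple in-memory hash rather than on the file-system.

```ruby
# Hypothetical sketch: TinyStore, IndexTrigger and their methods are
# illustrative names, not DODB's API.
class IndexTrigger
  def initialize(&attribute)
    @attribute = attribute   # block extracting the indexed attribute
    @keys = {}               # attribute value => key of the entry
  end

  # Called by the store each time a value is added or modified.
  def index(key, value)
    @keys.delete_if { |_, k| k == key }   # drop the previous index of this entry
    @keys[@attribute.call(value)] = key
  end

  def key_for(attribute_value)
    @keys[attribute_value]
  end
end

class TinyStore
  def initialize
    @values = {}
    @triggers = []
    @next_key = 0
  end

  # Register a trigger; values added or modified afterwards are indexed.
  def new_index(&attribute)
    trigger = IndexTrigger.new(&attribute)
    @triggers << trigger
    trigger
  end

  # Add a value under a fresh key and return that key.
  def <<(value)
    key = @next_key
    @next_key += 1
    self[key] = value
    key
  end

  # Add or modify a value, then call every registered trigger.
  def []=(key, value)
    @values[key] = value
    @triggers.each { |t| t.index(key, value) }
  end

  def [](key)
    @values[key]
  end
end

Car = Struct.new(:name, :color)
```

With this sketch, `by_name = store.new_index { |car| car.name }` returns the object used for lookups, and `store[by_name.key_for("some name")]` retrieves the matching entry without browsing every value.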
|
|
|
|
|
.
|
|
|
|
|
.SSS Basic indexes (1 to 1 relations)
|
|
|
|
|
Basic indexes represent one-to-one relations, such as an index in SQL.
|
|
|
|
|
Basic indexes
|
|
|
|
|
.CLASS DODB::Trigger::Index ) (
|
|
|
|
|
represent one-to-one relations, such as an index in SQL.
|
|
|
|
|
In the Car database, each car has a dedicated (unique) name.
|
|
|
|
|
This
|
|
|
|
|
.I name
|
|
|
|
@ -296,9 +303,9 @@ cars_by_name = cars.new_index "name", { |car| car.name }
|
|
|
|
|
cars_by_name = cars.new_index "name", &.name
|
|
|
|
|
.SOURCE
|
|
|
|
|
Once the index has been created, every added or modified entry in the database will be indexed.
|
|
|
|
|
Adding an index (basic index, partition or tag) provides an
|
|
|
|
|
Adding a trigger provides an
|
|
|
|
|
.I object
|
|
|
|
|
used to manipulate the database based on this index.
|
|
|
|
|
used to manipulate the database based on the related attribute.
|
|
|
|
|
Let's call it an
|
|
|
|
|
.I "index object" .
|
|
|
|
|
In the code above, the
|
|
|
|
@ -349,7 +356,7 @@ directory.
|
|
|
|
|
.QE
|
|
|
|
|
.
|
|
|
|
|
The basic indexes shown in this section already give a taste of what is possible with DODB.
|
|
|
|
|
The following indexes will cover some other usual cases.
|
|
|
|
|
The following triggers will cover some other usual cases.
|
|
|
|
|
.
|
|
|
|
|
.
|
|
|
|
|
.SSS Partitions (1 to n relations)
|
|
|
|
@ -401,7 +408,7 @@ directory!
|
|
|
|
|
.
|
|
|
|
|
.SSS Tags (n to n relations)
|
|
|
|
|
Tags are basically partitions but the indexed attribute can have multiple values.
|
|
|
|
|
|
|
|
|
|
.
|
|
|
|
|
.QP
|
|
|
|
|
.SOURCE Ruby ps=9 vs=10
|
|
|
|
|
# Create a tag based on the "keywords" attribute of the cars.
|
|
|
|
@ -445,15 +452,15 @@ directory!
|
|
|
|
|
.
|
|
|
|
|
.
|
|
|
|
|
.
|
|
|
|
|
.SSS Side note about indexes
|
|
|
|
|
DODB presents a few possible indexes (basic indexes, partitions and tags) which respond to an obvious need for fast searches.
|
|
|
|
|
.SSS Side note about triggers
|
|
|
|
|
DODB presents a few possible triggers (basic indexes, partitions and tags) which respond to an obvious need for fast searches.
|
|
|
|
|
However, their implementation through symlinks reflects a particular vision of how a database should behave: it gives users a practical way to sort the entries.
|
|
|
|
|
The implementation can be completely changed.
|
|
|
|
|
|
|
|
|
|
Also, other kinds of indexes could
|
|
|
|
|
Also, other kinds of triggers could
|
|
|
|
|
.B easily
|
|
|
|
|
be implemented in addition to those presented.
|
|
|
|
|
The new indexes may have completely different objectives than providing a file-system representation of the data.
|
|
|
|
|
The new triggers may have completely different objectives than providing a file-system representation of the data.
|
|
|
|
|
The following sections will precisely cover this aspect.
|
|
|
|
|
.
|
|
|
|
|
.
|
|
|
|
@ -469,10 +476,9 @@ Several hundred times faster, see the experiment section.
|
|
|
|
|
.FOOTNOTE2
|
|
|
|
|
Same thing for cached indexes.
|
|
|
|
|
Indexes can easily be cached, thanks to simple hash tables.
|
|
|
|
|
.
|
|
|
|
|
.
|
|
|
|
|
.SS Cached database
|
|
|
|
|
A cached database has the same API as the other DODB databases.
|
|
|
|
|
|
|
|
|
|
.B "Cached database" .
|
|
|
|
|
A cached database has the same API as the other DODB databases and keeps a copy of the entire database in memory for fast retrieval.
|
|
|
|
|
.QP
|
|
|
|
|
.SOURCE Ruby ps=9 vs=10
|
|
|
|
|
# Create a cached database
|
|
|
|
@ -484,12 +490,12 @@ class are available for
|
|
|
|
|
.CLASS Storage::Cached .
|
|
|
|
|
.QE
|
|
|
|
|
.
|
|
|
|
|
.SS Cached indexes
|
|
|
|
|
.B "Cached indexes" .
|
|
|
|
|
Since indexes do not require nearly as much memory as caching the entire database, they are cached by default.
|
|
|
|
|
.
|
|
|
|
|
.
|
|
|
|
|
.
|
|
|
|
|
.SECTION Common database: caching only recently used data
|
|
|
|
|
.SECTION Common database
|
|
|
|
|
Storing the entire data-set in memory is an effective way to make the requests fast, as does
|
|
|
|
|
the
|
|
|
|
|
.I "cached database"
|
|
|
|
@ -497,11 +503,14 @@ presented in the previous section.
|
|
|
|
|
Not all data-sets are compatible with this approach, for obvious reasons.
|
|
|
|
|
Thus, a tradeoff could be found to enable fast retrieval of data without requiring much memory.
|
|
|
|
|
Caching only a part of the data-set could already enable a massive speed-up even in memory-constrained environments.
|
|
|
|
|
The most effective strategy could differ from one application to another; providing a generic algorithm that works for all possible constraints is a hazardous endeavor.
|
|
|
|
|
The most effective strategy could differ from one application to another\*[*].
|
|
|
|
|
.FOOTNOTE1
|
|
|
|
|
Providing a generic algorithm that works for all possible constraints is a hazardous endeavor.
|
|
|
|
|
.FOOTNOTE2
|
|
|
|
|
However, caching only the most recently requested values is a simple policy which may be efficient in many cases.
|
|
|
|
|
This strategy is implemented in
|
|
|
|
|
.I "common database"
|
|
|
|
|
and this section will explain how it works.
|
|
|
|
|
This strategy is implemented in the
|
|
|
|
|
.CLASS DODB::Storage::Common
|
|
|
|
|
database and this section will explain how it works.
|
|
|
|
|
|
|
|
|
|
Common database implements a simple strategy to keep only relevant values in memory:
|
|
|
|
|
caching
|
|
|
|
@ -514,7 +523,8 @@ Any value that is requested or added to the database is considered
|
|
|
|
|
Each time a value is added to the database, its key is placed at the front of a list.
|
|
|
|
|
In this list,
|
|
|
|
|
.B "values are unique" .
|
|
|
|
|
Adding a value that is already present in the list is considered "using the value",
|
|
|
|
|
Adding a value that is already present in the list is considered
|
|
|
|
|
.I "using the value" ,
|
|
|
|
|
thus it is moved to the start of the list.
|
|
|
|
|
In case the number of entries exceeds what is allowed,
|
|
|
|
|
the least recently used value (the last list entry) is removed,
|
|
|
|
@ -526,7 +536,9 @@ the duration of adding a value is constant, it doesn't change with the number of
|
|
|
|
|
This efficiency is a memory tradeoff.
|
|
|
|
|
All the entries are added to a
|
|
|
|
|
.B "doubly linked list"
|
|
|
|
|
(to keep track of the order of the added keys) and to a
|
|
|
|
|
(to keep track of the order of the added keys)
|
|
|
|
|
.UL and
|
|
|
|
|
to a
|
|
|
|
|
.B hash
|
|
|
|
|
to perform efficient searches of the keys in the list.
|
|
|
|
|
Thus, all the nodes are added twice, once in the list, once in the hash.
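The list-plus-hash bookkeeping described above can be sketched as follows. This is a minimal illustration in plain Ruby under stated assumptions, not DODB's actual implementation: RecentKeys, Node, touch and max_entries are hypothetical names, and only the keys are tracked (the cached values themselves are left out).

```ruby
# Hypothetical sketch of the "recently used keys" bookkeeping.
# Names are illustrative and not taken from DODB.
class RecentKeys
  # newer points toward the front of the list, older toward the back.
  Node = Struct.new(:key, :newer, :older)

  def initialize(max_entries)
    @max_entries = max_entries
    @nodes = {}   # key => node, to find a key in the list in constant time
    @head = nil   # most recently used key
    @tail = nil   # least recently used key
  end

  # Record that `key` was just added or requested.
  # Returns the evicted key when the limit is exceeded, nil otherwise.
  def touch(key)
    node = @nodes[key]
    if node
      unlink(node)            # already present: "using the value"
    else
      node = Node.new(key)
      @nodes[key] = node      # the node is stored in the hash and in the list
    end
    push_front(node)
    evict_last if @nodes.size > @max_entries
  end

  # Keys from most to least recently used.
  def keys
    result = []
    node = @head
    while node
      result << node.key
      node = node.older
    end
    result
  end

  private

  # Detach a node from the list by relinking its two neighbors.
  def unlink(node)
    if node.newer then node.newer.older = node.older else @head = node.older end
    if node.older then node.older.newer = node.newer else @tail = node.newer end
    node.newer = nil
    node.older = nil
  end

  def push_front(node)
    node.older = @head
    @head.newer = node if @head
    @head = node
    @tail ||= node
  end

  # Remove the least recently used key (the last list entry).
  def evict_last
    last = @tail
    unlink(last)
    @nodes.delete(last.key)
    last.key
  end
end
```

Every operation in `touch` is a hash lookup or a fixed number of pointer updates, which is why its cost does not depend on the number of tracked entries; the price is that each key is referenced from both structures.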
|
|
|
|
@ -606,13 +618,17 @@ It is perfectly reasonable to have a cached database with a policy of keeping ju
|
|
|
|
|
But for now, the cached version keeps everything.
|
|
|
|
|
See the "Future work" section.
|
|
|
|
|
.FOOTNOTE2
|
|
|
|
|
.
|
|
|
|
|
.SS Uncached database
|
|
|
|
|
By default, the database (provided by
|
|
|
|
|
.CLASS "DODB::Storage::Basic" )
|
|
|
|
|
isn't cached.
|
|
|
|
|
.
|
|
|
|
|
.SS Uncached indexes
|
|
|
|
|
|
|
|
|
|
.B "Uncached database" .
|
|
|
|
|
The
|
|
|
|
|
.CLASS "DODB::Storage::Uncached"
|
|
|
|
|
database has no data cache at all and can be used in very constrained environments.
|
|
|
|
|
However, the
|
|
|
|
|
.CLASS DODB::Storage::Common
|
|
|
|
|
should (probably) be considered instead, even if the configured number of entries is low.
|
|
|
|
|
A small data cache is still better than no cache.
|
|
|
|
|
|
|
|
|
|
.B "Uncached indexes" .
|
|
|
|
|
Cached indexes do not require a large amount of memory since the only stored data is an integer (the
|
|
|
|
|
.I key
|
|
|
|
|
of the data).
|
|
|
|
@ -696,7 +712,7 @@ function.
|
|
|
|
|
.KE
|
|
|
|
|
.
|
|
|
|
|
.
|
|
|
|
|
.SS Indexes creation
|
|
|
|
|
.SS Triggers creation
|
|
|
|
|
.QP
|
|
|
|
|
.SOURCE Ruby ps=9 vs=10
|
|
|
|
|
# Uncached, cached and RAM-only basic indexes.
|
|
|
|
|