Cached and uncached database and indexes.

This commit is contained in:
Philippe PITTOLI 2024-05-17 03:28:50 +02:00
parent 336b19bbde
commit 789b8163e6

@ -426,22 +426,36 @@ The following sections will precisely cover this aspect.
.
.
.SECTION DODB, slow? Nope. Let's talk about caches
DODB acts like a hash table.
Internally, it literally has one
.I "by default"
to cache data.
This means data is being stored in memory as well as on the file-system, so the retrieval is incredibly fast;
same thing for the indexes.
Sure, having a file-system representation of the data (including the indexes) is convenient for the administrator, but input-output operations on a file-system are slow.
Indexes can easily be cached instead, as simple hash tables.
.SS Cached and uncached database
.TBD
.SS Cached and uncached indexes
.TBD
.SS RAM-only indexes
In case the file-system representation isn't required, indexes can be stored in memory,
.I only .
.TBD
The file-system representation (of data and indexes) is convenient for the administrator, but input-output operations on a file-system are slow.
Storing the data on a storage device is required to protect it from crashes and application restarts.
But data can also be kept in memory to process requests faster.
The DODB library has an API close to that of a hash table.
Having a data cache is as simple as keeping a hash table in memory alongside the file-system storage; retrieval then becomes incredibly fast\*[*].
.FOOTNOTE1
Several hundred times faster, see the experiment section.
.FOOTNOTE2
The same applies to indexes, which can easily be cached using simple hash tables.
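The idea of a hash table kept alongside the file-system storage can be sketched as follows.
This is a minimal illustration in plain Ruby, not DODB's actual implementation: the class name WriteThroughStore, the JSON file format and the one-file-per-key layout are all assumptions made for the example.

```ruby
require "json"
require "tmpdir"
require "fileutils"

# Hypothetical illustration, not DODB's code: each value is written to
# its own file and mirrored in an in-memory Hash.
class WriteThroughStore
  def initialize(directory)
    @directory = directory
    @cache = {} # key (Integer) => value
    FileUtils.mkdir_p(@directory)
  end

  # Write to the file-system first, then update the in-memory copy.
  def []=(key, value)
    File.write(path_for(key), JSON.generate(value))
    @cache[key] = value
  end

  # Serve from memory when possible; fall back to the file otherwise.
  def [](key)
    return @cache[key] if @cache.key?(key)
    file = path_for(key)
    return nil unless File.exist?(file)
    @cache[key] = JSON.parse(File.read(file))
  end

  private

  def path_for(key)
    File.join(@directory, "#{key}.json")
  end
end

dir = Dir.mktmpdir("store-sketch")
store = WriteThroughStore.new(dir)
store[0] = { "name" => "Corvet", "color" => "red" }

# A second instance on the same directory starts with an empty cache
# and recovers the value from the file, as it would after a restart.
restarted = WriteThroughStore.new(dir)
recovered = restarted[0]
```

Reads that hit the Hash never touch the file-system, which is where the speed-up comes from; the files remain the durable copy.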
.
.
.SS Cached database
A cached database has the same API as the other DODB databases.
.QP
.SOURCE Ruby ps=10
# Create a cached database
database = DODB::CachedDataBase(Car).new "path/to/db-cars"
.SOURCE
All operations of the
.I DODB::DataBase
class are available for
.I DODB::CachedDataBase .
.QE
.
.SS Cached indexes
Since caching indexes requires far less memory than caching the entire database, indexes are cached by default.
.
.
.SECTION RAM-only database for short-lived data
Databases are built around the objective of actually
.I store
@ -509,9 +523,37 @@ For example, in case the database is larger than the available memory, it won't
Keep in mind that for the moment "cached database" means "all data in memory".
It would be perfectly reasonable for a cached database to keep only a limited number of values in memory, selecting the most relevant ones (the most recently used, for example) in order to bound memory usage.
But for now, the cached version keeps everything.
See the "Future work" section.
.FOOTNOTE2
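The "most recently used" policy mentioned in the footnote is not implemented in DODB for now.
As a rough sketch of what such a bounded cache could look like, assuming a hypothetical LRUCache class that relies on Ruby's Hash preserving insertion order:

```ruby
# Hypothetical sketch of a bounded cache, not part of DODB.
# Ruby's Hash keeps insertion order, so the first entry is always the
# least recently used one.
class LRUCache
  def initialize(capacity)
    @capacity = capacity
    @entries = {}
  end

  # Reading a key moves it to the end (most recently used position).
  def [](key)
    return nil unless @entries.key?(key)
    value = @entries.delete(key)
    @entries[key] = value
  end

  # Writing a key also marks it as most recently used, then evicts the
  # oldest entry if the capacity is exceeded.
  def []=(key, value)
    @entries.delete(key)
    @entries[key] = value
    @entries.shift if @entries.size > @capacity
    value
  end

  def keys
    @entries.keys
  end
end

cache = LRUCache.new(2)
cache[1] = "car-1"
cache[2] = "car-2"
cache[1]            # key 1 becomes the most recently used
cache[3] = "car-3"  # evicts key 2, the least recently used
remaining = cache.keys
```

On a cache miss, a database using such a policy would read the value back from the file-system, so evicting an entry costs speed but never data.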
.
.SS Uncached database
By default, the database (provided by
.I "DODB::DataBase" )
isn't cached.
.
.SS Uncached indexes
Cached indexes do not require a large amount of memory since the only stored data is an integer (the
.I key
of the data).
For that reason, indexes are cached by default.
In highly memory-constrained environments, however, the index cache can be disabled.
.QP
.SOURCE Ruby ps=10
# Uncached basic indexes.
cars_by_name = cars.new_uncached_index "name", &.name
# Uncached partitions.
cars_by_colors = cars.new_uncached_partition "color", &.color
# Uncached tags.
cars_by_keywords = cars.new_uncached_tags "keywords", &.keywords
.SOURCE
The API of the
.I "uncached index objects"
is exactly the same as that of the cached ones.
.QE
.
.
.
.SECTION Recap of the DODB API
.TBD
@ -520,6 +562,9 @@ But for now, the cached version keeps everything.
.SS Indexes creation
.SS Database update and deletion with an index
.SSS Tags: specific functions
.
.
.
.SECTION Limits of DODB
DODB provides basic database operations such as storing, searching, modifying and removing data.
Though, SQL databases have a few