Blah.
parent
e903b9df43
commit
9da39972d3
116
graphs/graphs.ms
@ -411,6 +411,8 @@ directory!
.QE
.KE
.
.
.
.SSS Side note about indexes
DODB provides a few kinds of indexes (basic indexes, partitions and tags), which answer an obvious need for fast searches.
However, their implementation through symlinks reflects a particular vision of how a database should behave: it gives users a practical way to sort the entries.
@ -419,13 +421,105 @@ The implementation can be completely changed.
Also, other kinds of indexes could
.B easily
be implemented in addition to those presented.
.TBD
The new indexes may have completely different objectives than providing a file-system representation of the data.
The following sections cover precisely this aspect.
.
.
.SSS Cached indexes and RAM-only indexes
.SECTION DODB, slow? Nope. Let's talk about caches
DODB acts like a hash table.
Internally, it literally has one
.I "by default"
to cache data.
This means data is stored in memory as well as on the file-system, so retrieval is incredibly fast;
the same goes for the indexes.
Sure, having a file-system representation of the data (including the indexes) is convenient for the administrator, but input-output operations on a file-system are slow.
Indexes can easily be cached instead, as simple hash tables.
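The idea of a cached index can be sketched in a few lines.
The example below is plain Ruby and does not use DODB: the CachedIndex class, its set and get methods and the one-file-per-value layout are assumptions made for illustration only.
Writes go to both a hash table and the file-system; reads are served from the hash table and only fall back to the file-system on a miss.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical write-through index (not the DODB implementation):
# every entry lives in a Hash (fast lookups) and in one small file
# per indexed value (convenient for the administrator).
# Assumption: `value` is a valid file name.
class CachedIndex
  def initialize(directory)
    @directory = directory
    @cache = {}
    FileUtils.mkdir_p(@directory)
  end

  # Record that `value` (a car name, for example) points to the entry `key`.
  def set(value, key)
    File.write(File.join(@directory, value), key.to_s)
    @cache[value] = key.to_s
  end

  # Served from memory; the file-system is read only on a cache miss
  # (after a restart, for example). Returns nil for an unknown value.
  def get(value)
    @cache.fetch(value) do
      path = File.join(@directory, value)
      @cache[value] = File.read(path) if File.exist?(path)
    end
  end
end

Dir.mktmpdir do |dir|
  index = CachedIndex.new(File.join(dir, "by_name"))
  index.set("Corvet", 42)
  puts index.get("Corvet")
end
```

A second CachedIndex object opened on the same directory would still find the entry, since the file-system copy acts as the persistent version of the hash table.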
.SS Cached and uncached database
.TBD
.SECTION DODB library: all the important parts
.SS Cached and uncached indexes
.TBD
.SS RAM-only indexes
In case the file-system representation isn't required, indexes can be stored in memory,
.I only .
.TBD
.SECTION RAM-only database for short-lived data
Databases are built around the objective to actually
.I store
data.
But sometimes the data only lives as long as the application.
Stop the application and the data itself becomes irrelevant, which happens on several occasions, for instance when the application keeps track of the connected users.
This case is not covered by traditional databases; it is considered out of scope, so short-lived data is handled only within the application.
Yet, since DODB is a library and not a separate application (read: DODB is much faster), this usage of the database can be relevant.
Having the same API to handle both long-lived and short-lived data can be useful.
Moreover, the previously mentioned indexes (basic indexes, partitions and tags) would also work the same way for this short-lived data.
Of course, in this case, the file-system representation may be completely irrelevant.
For all these reasons, the
.I RAM-only
DODB database and
.I RAM-only
indexes were created.

Let's recap the advantages of the RAM-only DODB database.
The DODB API is the same for short-lived (read: temporary) and long-lived data.
This includes the same indexes too, so a file-system representation of the current state of the application is possible.
RAM-only also means incredible performance since DODB is only a
.I very
small layer over a hash table.
.SS RAM-only database
Instantiating a RAM-only database is as simple as the other options.
Moreover, this database has exactly the same API as the others, thus switching from one to another is painless.
.QP
.SOURCE Ruby ps=10
# RAM-only database creation
database = DODB::RAMOnlyDataBase(Car).new "path/to/db-cars"
.SOURCE
Yes, the path is still required, which may be seen as a quirk, but the rationale\*[*] is sound.
.QE
.FOOTNOTE1
A path is still required despite the database being only in memory for two reasons.
First, indexes can still be instantiated for the database, and those indexes can provide a file-system representation of the data.
Second, I worked enough already, leave me alone.
.FOOTNOTE2
.SS RAM-only indexes
All indexes have their RAM-only counterpart.
.QP
.SOURCE Ruby ps=10
# RAM-only basic indexes.
cars_by_name = cars.new_RAM_index "name", &.name

# RAM-only partitions.
cars_by_colors = cars.new_RAM_partition "color", &.color

# RAM-only tags.
cars_by_keywords = cars.new_RAM_tags "keywords", &.keywords
.SOURCE
The API of the
.I "RAM-only index objects"
is exactly the same as the others.
.QE
As with the database API itself, switching from one version of an index to another is painless.
This way, one can opt for a cached index and, after some time not using the file-system representation, decide to switch to its RAM-only version; a 4-character modification and nothing else.
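To give an intuition of what a RAM-only partition boils down to, here is a sketch in plain Ruby.
It is not the DODB implementation: the RAMPartition class, its add and get methods, the Car structure and the car names are all hypothetical, and the Ruby block &:color only mimics the role of &.color in the listing above.

```ruby
require "set"

# Made-up entry type, for illustration only.
Car = Struct.new(:name, :color, :keywords)

# Hypothetical RAM-only partition: a Hash from an attribute value
# to the set of keys of the entries sharing that value.
class RAMPartition
  def initialize(&extractor)
    @extractor = extractor
    @keys_by_value = Hash.new { |hash, value| hash[value] = Set.new }
  end

  # Register the entry stored under `key` in the partition.
  def add(key, entry)
    @keys_by_value[@extractor.call(entry)] << key
  end

  # Keys of all entries whose attribute equals `value` (empty if none).
  def get(value)
    @keys_by_value.fetch(value, Set.new).to_a
  end
end

cars_by_color = RAMPartition.new(&:color)
cars_by_color.add(0, Car.new("Corvet", "red", ["elegant"]))
cars_by_color.add(1, Car.new("Bullet-GT", "red", ["fast"]))
cars_by_color.add(2, Car.new("Deudeuche", "beige", ["cheap"]))
p cars_by_color.get("red")
```

Nothing here touches the file-system, which is the whole point: the partition disappears with the application.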
.
.
.
.SECTION DODB and memory constraint
In contrast with the previous section, some environments are memory-constrained.
For example, if the database is larger than the available memory, it won't be possible to use a data cache\*[*].
.FOOTNOTE1
Keep in mind that for the moment "cached database" means "all data in memory".
It would be perfectly reasonable to have a cached database that keeps only a certain number of values in memory, limiting the memory required by selecting the relevant values to keep in cache (the most recently used, for example).
But for now, the cached version keeps everything.
.FOOTNOTE2
.SS Uncached database
.SS Uncached indexes
.
.SECTION Recap of the DODB API
.TBD
.SS Database creation
.SS Database update and deletion with the key
.SS Index creation
.SS Database update and deletion with an index
.SSS Tags: specific functions
.SECTION Limits of DODB
DODB provides basic database operations such as storing, searching, modifying and removing data.
Though, SQL databases have a few
@ -453,6 +547,9 @@ FYI, the service
uses DODB and since the database is fast enough, parallelism isn't required despite enabling more than a thousand requests per second.
.FOOTNOTE2
With a cache, data is retrieved five hundred times quicker than with a SQL database.
Thus, parallelism is probably not needed but a locking mechanism is provided anyway, just in case; this may be overly simplistic but
.SHINE "good enough"
for most applications.

.I Durability
is taken into account.
@ -585,6 +682,17 @@ Caching the value enables a massive performance gain, data can be retrieved seve
.so graph_query_tag.grap
.
.SECTION Future work
.TBD
This section presents all the features I want to see in a future version of the DODB library.
.SS Cached database and indexes with selective memory
Right now, both the cached database and the cached indexes keep every cached value indefinitely.
Giving the cache the ability to select the values to keep in memory would enable a massive speed-up even in memory-constrained environments.
The policy could be as simple as keeping in memory only the most recently requested values.
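Such a policy is usually called LRU (least recently used) eviction.
Here is a minimal sketch in plain Ruby, not tied to DODB: the LRUCache class, its fetch and cached_keys methods and the capacity parameter are hypothetical names, and the capacity is assumed to be at least 1.

```ruby
# Hypothetical cache keeping only the `capacity` most recently requested values.
# Ruby's Hash preserves insertion order, so the first key is the oldest one.
class LRUCache
  def initialize(capacity)
    @capacity = capacity
    @values = {}
  end

  # Returns the cached value, or computes it with the block on a miss
  # (in a database, the block would read the entry from the file-system).
  def fetch(key)
    if @values.key?(key)
      # Hit: remove the value so it is re-inserted below as the most recent.
      value = @values.delete(key)
    else
      value = yield(key)
      # Miss on a full cache: evict the least recently used entry.
      @values.shift if @values.size >= @capacity
    end
    @values[key] = value
  end

  # Keys currently in memory, from least to most recently used.
  def cached_keys
    @values.keys
  end
end

cache = LRUCache.new(2)
cache.fetch("a") { |k| "value of #{k}" }
cache.fetch("b") { |k| "value of #{k}" }
cache.fetch("a") { |k| "value of #{k}" } # hit: "a" becomes the most recent
cache.fetch("c") { |k| "value of #{k}" } # miss: "b" is evicted
p cache.cached_keys
```

The memory used is bounded by the capacity, while the values requested most recently are still served without touching the file-system.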
.SS Pagination via the indexes: offset and limit
Right now, browsing the entire database by requesting a limited list at a time is possible, thanks to some functions accepting an
.I offset
and a
.I size .
However, this is not possible with the indexes: when querying a partition, for example, the API returns the entire list of matching values.
This is not acceptable for databases with large partitions and tags: memory will be over-used and requests will be slow.
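What such a paginated query could look like is sketched below in plain Ruby.
This is not an existing DODB function: the page method and its offset and limit keyword parameters are assumptions, and the list of keys stands in for the content of a large partition.

```ruby
# Hypothetical partition content: the keys of every entry matching one value.
matching_keys = (0...10_000).to_a

# Return at most `limit` keys starting at `offset`.
# Array#[] with (start, length) gives nil when `offset` is past the end,
# hence the fallback to an empty list.
def page(keys, offset:, limit:)
  keys[offset, limit] || []
end

p page(matching_keys, offset: 20, limit: 5)
```

The caller only receives the requested slice, so only the corresponding values have to be loaded, instead of the whole partition.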
.SECTION Conclusion
.TBD