Some more explanations.
parent
25bfab34e0
commit
1d03f906e6
@ -10,3 +10,9 @@
|
||||
%T RFC 8259, The JavaScript Object Notation (JSON) Data Interchange Format
|
||||
%D 2017
|
||||
%I Internet Engineering Task Force (IETF)
|
||||
|
||||
%K darkhttpd
|
||||
%A Emil Mikulic
|
||||
%T DarkHTTPd, when you need a webserver in a hurry.
|
||||
%D 2017
|
||||
%I https://unix4lyfe.org/darkhttpd/
|
||||
|
402
paper/paper.ms
@ -671,161 +671,6 @@ is exactly the same as the others.
|
||||
.QE
|
||||
.
|
||||
.
|
||||
.
|
||||
.SECTION Recap of the DODB API
|
||||
This section provides a quick shorthand manual for the most important parts of the DODB API.
|
||||
For an exhaustive API documentation, please generate the development documentation for the library.
|
||||
The command
|
||||
.COMMAND "make doc"
|
||||
generates the documentation, then the
|
||||
.COMMAND "make serve-doc"
|
||||
command makes it possible to browse the full documentation with a web browser.
|
||||
.
|
||||
.SS Database creation
|
||||
.QP
|
||||
.SOURCE Ruby ps=9 vs=10
|
||||
# Uncached, cached, common and RAM-only database creation.
|
||||
database = DODB::Storage::Uncached(Car).new "path/to/db"
|
||||
database = DODB::Storage::Cached(Car).new "path/to/db"
|
||||
database = DODB::Storage::Common(Car).new "path/to/db", 50000 # nb cache entries
|
||||
database = DODB::Storage::RAMOnly(Car).new "path/to/db"
|
||||
.SOURCE
|
||||
.QE
|
||||
.
|
||||
.SS Browsing the database
|
||||
.QP
|
||||
.SOURCE Ruby ps=9 vs=10
|
||||
# List all the values in the database
|
||||
database.each do |value|
|
||||
# ...
|
||||
end
|
||||
.SOURCE
|
||||
.QE
|
||||
|
||||
.QP
|
||||
.SOURCE Ruby ps=9 vs=10
|
||||
# List all the values in the database with their key
|
||||
database.each_with_key do |value, key|
|
||||
# ...
|
||||
end
|
||||
.SOURCE
|
||||
.QE
|
||||
.
|
||||
.SS Database search, update and deletion with the key (integer associated to the value)
|
||||
.KS
|
||||
.QP
|
||||
.SOURCE Ruby ps=9 vs=10
|
||||
value = database[key] # May throw a MissingEntry exception
|
||||
value = database[key]? # Returns nil if the value doesn't exist
|
||||
database[key] = value
|
||||
database.delete key
|
||||
.SOURCE
|
||||
Side note for the
|
||||
.I []
|
||||
function: in case the value isn't in the database, the function throws an exception named
|
||||
.CLASS DODB::MissingEntry .
|
||||
To avoid this exception and get a
|
||||
.I nil
|
||||
value instead, use the
|
||||
.I []?
|
||||
function.
|
||||
.QE
|
||||
.KE
|
||||
.
|
||||
.
|
||||
.SS Trigger creation
|
||||
.QP
|
||||
.SOURCE Ruby ps=9 vs=10
|
||||
# Uncached, cached and RAM-only basic indexes.
|
||||
cars_by_name = cars.new_uncached_index "name", &.name
|
||||
cars_by_name = cars.new_index "name", &.name
|
||||
cars_by_name = cars.new_RAM_index "name", &.name
|
||||
|
||||
# Uncached, cached and RAM-only partitions.
|
||||
cars_by_color = cars.new_uncached_partition "color", &.color
|
||||
cars_by_color = cars.new_partition "color", &.color
|
||||
cars_by_color = cars.new_RAM_partition "color", &.color
|
||||
|
||||
# Uncached, cached and RAM-only tags.
|
||||
cars_by_keywords = cars.new_uncached_tags "keywords", &.keywords
|
||||
cars_by_keywords = cars.new_tags "keywords", &.keywords
|
||||
cars_by_keywords = cars.new_RAM_tags "keywords", &.keywords
|
||||
.SOURCE
|
||||
.QE
|
||||
.
|
||||
.
|
||||
.SS Database retrieval, update and deletion with an index
|
||||
.
|
||||
.QP
|
||||
.SOURCE Ruby ps=9 vs=10
|
||||
# Get a value from a 1-1 index.
|
||||
car = cars_by_name.get "Corvet" # May throw a MissingEntry exception
|
||||
car = cars_by_name.get? "Corvet" # Returns nil if the value doesn't exist
|
||||
.SOURCE
|
||||
.QE
|
||||
.
|
||||
.QP
|
||||
.SOURCE Ruby ps=9 vs=10
|
||||
# Get a value from a partition (1-n relations) or a tag (n-n relations) index.
|
||||
red_cars = cars_by_color.get "red" # empty array if no such cars exist
|
||||
fast_cars = cars_by_keywords.get "fast" # empty array if no such cars exist
|
||||
|
||||
# Several tags can be selected at the same time, to narrow the search.
|
||||
cars_both_fast_and_expensive = cars_by_keywords.get ["fast", "expensive"]
|
||||
.SOURCE
|
||||
.QE
|
||||
.
|
||||
The basic 1-1
|
||||
.I "index object"
|
||||
can update a value by selecting a unique entry in the database.
|
||||
.QP
|
||||
.SOURCE Ruby ps=9 vs=10
|
||||
car = cars_by_name.update updated_car # If the `name` hasn't changed.
|
||||
car = cars_by_name.update "Corvet", updated_car # If the `name` has changed.
|
||||
|
||||
car = cars_by_name.update_or_create updated_car # Updates or creates the value.
|
||||
car = cars_by_name.update_or_create "Corvet", updated_car # Same.
|
||||
.SOURCE
|
||||
.QE
|
||||
For deletion, database entries can be selected based on any index.
|
||||
Partitions and tags can take a block of code to narrow the selection.
|
||||
.QP
|
||||
.SOURCE Ruby ps=9 vs=10
|
||||
cars_by_name.delete "Corvet" # Deletes the car named "Corvet".
|
||||
cars_by_color.delete "red" # Deletes all red cars.
|
||||
|
||||
# Deletes cars that are both slow and expensive.
|
||||
cars_by_keywords.delete ["slow", "expensive"]
|
||||
|
||||
# Deletes all cars that are both blue and slow.
|
||||
cars_by_color.delete "blue", do |car|
|
||||
car.keywords.includes? "slow"
|
||||
end
|
||||
|
||||
# Same.
|
||||
cars_by_keywords.delete "slow", do |car|
|
||||
car.color == "blue"
|
||||
end
|
||||
.SOURCE
|
||||
.QE
|
||||
.
|
||||
.
|
||||
.SSS Tags: search on multiple keys
|
||||
The tag index makes it possible to search for values based on multiple keys.
|
||||
For example, searching for all cars that are both fast and elegant can be written this way:
|
||||
.QP
|
||||
.SOURCE Ruby ps=9 vs=10
|
||||
fast_elegant_cars = cars_by_keywords.get ["fast", "elegant"]
|
||||
.SOURCE
|
||||
Used with a list of keys, the
|
||||
.FUNCTION_CALL get
|
||||
function returns an empty list if no value matches.
|
||||
.br
|
||||
The implementation was designed to be simple (7 lines of code), not efficient.
|
||||
However, with data and index caches, the search is expected to be fast enough for most uses, provided that the tags are small enough (a few thousand entries).
|
||||
.QE
|
||||
.
|
||||
.
|
||||
.SECTION Limits of DODB
|
||||
DODB provides basic database operations such as storing, searching, modifying and removing data.
|
||||
Though, SQL databases have a few
|
||||
@ -980,6 +825,8 @@ That is why alternative encodings, such as CBOR,
|
||||
CBOR
|
||||
.]
|
||||
should be considered for large databases.
|
||||
.
|
||||
.
|
||||
.SS Partitions (1 to n relations)
|
||||
The previous example showed the retrieval of a single value from the database.
|
||||
The following will show what happens when thousands of entries are retrieved.
|
||||
@ -1017,6 +864,7 @@ and
|
||||
databases, and the more data there is to retrieve, the worse it gets.
|
||||
However, retrieving thousands and thousands of entries in a single request may not be a typical usage of databases, anyway.
|
||||
.
|
||||
.
|
||||
.SS Tags (n to n relations)
|
||||
A tag index makes it possible to match a list of entries based on an attribute with potentially multiple values (such as an array).
|
||||
In the experiment, a database of cars is created along with a tag index on a list of
|
||||
@ -1071,40 +919,58 @@ This database is to be considered to achieve maximum speed for data-sets fitting
|
||||
.B "Common database"
|
||||
makes it possible to lower the memory requirements as much as desired.
|
||||
The eviction policy requires some extra operations, which lead to poorer, though still acceptable, performance.
|
||||
.
|
||||
.ps -2
|
||||
.TS
|
||||
allbox tab(:);
|
||||
c | lw(3.6i) | cew(1.4i).
|
||||
DODB instance:Comment and database usage:T{
|
||||
compared to RAM-only
|
||||
T}
|
||||
RAM only:T{
|
||||
Worst memory footprint, best performance.
|
||||
T}:-
|
||||
Cached db and index:T{
|
||||
Performance for retrieving a value is the same as RAM only while
|
||||
enabling the admin to manually search for data on-disk.
|
||||
T}:about the same perfs
|
||||
Common db, cached index:T{
|
||||
Performance is still excellent while requiring a
|
||||
.UL configurable
|
||||
amount of RAM.
|
||||
Should be used by default.
|
||||
T}:T{
|
||||
67% slower (about 200 ns) which still is great
|
||||
T}
|
||||
Uncached db, cached index:Very slow. Common database should be considered instead.:170 to 180x slower
|
||||
Uncached db and index:T{
|
||||
Best memory footprint, worst performance.
|
||||
T}:200 to 210x slower
|
||||
.TE
|
||||
.ps \n[PS]
|
||||
|
||||
.B "Uncached database"
|
||||
is included in this experiment mostly as a control sample, to show the worst performance DODB can have.
|
||||
|
||||
Cached indexes should be considered for most applications, or even their RAM-only version in case the file-system representation isn't necessary.
|
||||
.
|
||||
.\" .ps -2
|
||||
.\" .TS
|
||||
.\" allbox tab(:);
|
||||
.\" c | lw(3.6i) | cew(1.4i).
|
||||
.\" DODB instance:Comment and database usage:T{
|
||||
.\" compared to RAM-only
|
||||
.\" T}
|
||||
.\" RAM only:T{
|
||||
.\" Worst memory footprint, best performance.
|
||||
.\" T}:-
|
||||
.\" Cached db and index:T{
|
||||
.\" Performance for retrieving a value is the same as RAM only while
|
||||
.\" enabling the admin to manually search for data on-disk.
|
||||
.\" T}:about the same perfs
|
||||
.\" Common db, cached index:T{
|
||||
.\" Performance is still excellent while requiring a
|
||||
.\" .UL configurable
|
||||
.\" amount of RAM.
|
||||
.\" Should be used by default.
|
||||
.\" T}:T{
|
||||
.\" 67% slower (about 200 ns) which still is great
|
||||
.\" T}
|
||||
.\" Uncached db, cached index:Very slow. Common database should be considered instead.:170 to 180x slower
|
||||
.\" Uncached db and index:T{
|
||||
.\" Best memory footprint, worst performance.
|
||||
.\" T}:200 to 210x slower
|
||||
.\" .TE
|
||||
.\" .ps \n[PS]
|
||||
.
|
||||
.SS Conclusion on performance
|
||||
As expected, retrieving a single value is fast and the size of the database doesn't matter much.
|
||||
Each deserialization and, more importantly, each disk access is a pain point.
|
||||
Caching the value enables a massive performance gain: data can be retrieved several hundred times quicker.
|
||||
The more entries are requested, the slower the request gets and, more importantly, the poorer the performance gets
|
||||
.UL "per entry" .
|
||||
|
||||
The eviction policy also degrades performance, since it requires extra operations to select the data to keep in the cache.
|
||||
However, the implementation is as simple as it gets, and some approaches could be considered to make it faster.
|
||||
Notably, specific data-sets or database uses could justify adapting the eviction policy.
|
||||
The same goes for the entire caching mechanism.
|
||||
The current implementation offers a simple and generic way to store data based on typical database uses.
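To make the cost of an eviction policy concrete, here is a minimal sketch of a bounded cache in plain Ruby. It assumes a first-in, first-out policy (the appendix title mentions FIFO, but the actual DODB code is not shown here), and every name in it (ToyFifoCache, the loader block, misses) is hypothetical and only serves the illustration.

```ruby
# Toy bounded cache with FIFO eviction: when the cache is full, the entry
# that was inserted first is dropped. Hypothetical illustration only,
# this is not the DODB implementation.
class ToyFifoCache
  attr_reader :misses

  def initialize(max_entries, &loader)
    @max_entries = max_entries
    @loader = loader  # called on a cache miss, e.g. to read a file from disk
    @values = {}      # Ruby hashes preserve insertion order
    @misses = 0
  end

  # Returns the cached value, loading (and possibly evicting) on a miss.
  def [](key)
    return @values[key] if @values.key?(key)

    @misses += 1
    value = @loader.call(key)
    # Hash#shift removes the oldest inserted pair.
    @values.shift if @values.size >= @max_entries
    @values[key] = value
  end

  def cached_keys
    @values.keys
  end
end

cache = ToyFifoCache.new(2) { |key| "value-#{key}" }
cache[1]
cache[2]
cache[1]             # hit: nothing is loaded
cache[3]             # miss: key 1, the oldest insertion, is evicted
p cache.cached_keys  # keys still in the cache
p cache.misses       # number of loads performed
```

Each miss pays for the load plus the bookkeeping of the eviction, which is the kind of overhead the paragraph above refers to.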
|
||||
|
||||
As a side note, let's keep in mind that requesting several thousand entries in DODB, with the common database for instance, is as slow as getting
|
||||
.B "a single entry"
|
||||
with SQL (varies from 0.1 to 2 ms on my machine for a single value without a search, just the first available entry).
|
||||
This should help put things into perspective.
|
||||
.
|
||||
.SECTION Future work
|
||||
This section presents all the features I want to see in a future version of the DODB library.
|
||||
@ -1157,6 +1023,9 @@ Since this implementation of DODB is related to the Crystal language (which isn'
|
||||
.
|
||||
.
|
||||
.SECTION Conclusion
|
||||
The
|
||||
.I common
|
||||
database should be an acceptable choice for most applications.
|
||||
.TBD
|
||||
|
||||
.APPENDIX FIFO vs Efficient FIFO
|
||||
@ -1203,3 +1072,168 @@ When the cache size is not sufficient, the requests are hundred times slower, wh
|
||||
This figure shows the request durations to retrieve data based on a tag containing up to 5k entries.
|
||||
.QE
|
||||
As for partitions, the response time depends on the number of entries to retrieve and the duration increases linearly with the number of elements.
|
||||
.
|
||||
.
|
||||
.APPENDIX Recap of the DODB API
|
||||
This section provides a quick shorthand manual for the most important parts of the DODB API.
|
||||
For an exhaustive API documentation, please generate the development documentation for the library.
|
||||
The command
|
||||
.COMMAND "make doc"
|
||||
generates the documentation, then the
|
||||
.COMMAND "make serve-doc"
|
||||
command makes it possible to browse the full documentation with a web browser\*[*].
|
||||
.FOOTNOTE1
|
||||
The
|
||||
.COMMAND "make serve-doc"
|
||||
command requires darkhttpd
|
||||
.[
|
||||
darkhttpd
|
||||
.]
|
||||
but this can be adapted to any other web server.
|
||||
.FOOTNOTE2
|
||||
.
|
||||
.SS Database creation
|
||||
.QP
|
||||
.SOURCE Ruby ps=9 vs=10
|
||||
# Uncached, cached, common and RAM-only database creation.
|
||||
database = DODB::Storage::Uncached(Car).new "path/to/db"
|
||||
database = DODB::Storage::Cached(Car).new "path/to/db"
|
||||
database = DODB::Storage::Common(Car).new "path/to/db", 50000 # nb cache entries
|
||||
database = DODB::Storage::RAMOnly(Car).new "path/to/db"
|
||||
.SOURCE
|
||||
.QE
|
||||
.
|
||||
.SS Browsing the database
|
||||
.QP
|
||||
.SOURCE Ruby ps=9 vs=10
|
||||
# List all the values in the database
|
||||
database.each do |value|
|
||||
# ...
|
||||
end
|
||||
.SOURCE
|
||||
.QE
|
||||
|
||||
.QP
|
||||
.SOURCE Ruby ps=9 vs=10
|
||||
# List all the values in the database with their key
|
||||
database.each_with_key do |value, key|
|
||||
# ...
|
||||
end
|
||||
.SOURCE
|
||||
.QE
|
||||
.
|
||||
.SS Database search, update and deletion with the key (integer associated to the value)
|
||||
.KS
|
||||
.QP
|
||||
.SOURCE Ruby ps=9 vs=10
|
||||
value = database[key] # May throw a MissingEntry exception
|
||||
value = database[key]? # Returns nil if the value doesn't exist
|
||||
database[key] = value
|
||||
database.delete key
|
||||
.SOURCE
|
||||
Side note for the
|
||||
.I []
|
||||
function: in case the value isn't in the database, the function throws an exception named
|
||||
.CLASS DODB::MissingEntry .
|
||||
To avoid this exception and get a
|
||||
.I nil
|
||||
value instead, use the
|
||||
.I []?
|
||||
function.
|
||||
.QE
|
||||
.KE
|
||||
.
|
||||
.
|
||||
.SS Trigger creation
|
||||
.QP
|
||||
.SOURCE Ruby ps=9 vs=10
|
||||
# Uncached, cached and RAM-only basic indexes.
|
||||
cars_by_name = cars.new_uncached_index "name", &.name
|
||||
cars_by_name = cars.new_index "name", &.name
|
||||
cars_by_name = cars.new_RAM_index "name", &.name
|
||||
|
||||
# Uncached, cached and RAM-only partitions.
|
||||
cars_by_color = cars.new_uncached_partition "color", &.color
|
||||
cars_by_color = cars.new_partition "color", &.color
|
||||
cars_by_color = cars.new_RAM_partition "color", &.color
|
||||
|
||||
# Uncached, cached and RAM-only tags.
|
||||
cars_by_keywords = cars.new_uncached_tags "keywords", &.keywords
|
||||
cars_by_keywords = cars.new_tags "keywords", &.keywords
|
||||
cars_by_keywords = cars.new_RAM_tags "keywords", &.keywords
|
||||
.SOURCE
|
||||
.QE
|
||||
.
|
||||
.
|
||||
.SS Database retrieval, update and deletion with an index
|
||||
.
|
||||
.QP
|
||||
.SOURCE Ruby ps=9 vs=10
|
||||
# Get a value from a 1-1 index.
|
||||
car = cars_by_name.get "Corvet" # May throw a MissingEntry exception
|
||||
car = cars_by_name.get? "Corvet" # Returns nil if the value doesn't exist
|
||||
.SOURCE
|
||||
.QE
|
||||
.
|
||||
.QP
|
||||
.SOURCE Ruby ps=9 vs=10
|
||||
# Get a value from a partition (1-n relations) or a tag (n-n relations) index.
|
||||
red_cars = cars_by_color.get "red" # empty array if no such cars exist
|
||||
fast_cars = cars_by_keywords.get "fast" # empty array if no such cars exist
|
||||
|
||||
# Several tags can be selected at the same time, to narrow the search.
|
||||
cars_both_fast_and_expensive = cars_by_keywords.get ["fast", "expensive"]
|
||||
.SOURCE
|
||||
.QE
|
||||
.
|
||||
The basic 1-1
|
||||
.I "index object"
|
||||
can update a value by selecting a unique entry in the database.
|
||||
.QP
|
||||
.SOURCE Ruby ps=9 vs=10
|
||||
car = cars_by_name.update updated_car # If the `name` hasn't changed.
|
||||
car = cars_by_name.update "Corvet", updated_car # If the `name` has changed.
|
||||
|
||||
car = cars_by_name.update_or_create updated_car # Updates or creates the value.
|
||||
car = cars_by_name.update_or_create "Corvet", updated_car # Same.
|
||||
.SOURCE
|
||||
.QE
|
||||
For deletion, database entries can be selected based on any index.
|
||||
Partitions and tags can take a block of code to narrow the selection.
|
||||
.QP
|
||||
.SOURCE Ruby ps=9 vs=10
|
||||
cars_by_name.delete "Corvet" # Deletes the car named "Corvet".
|
||||
cars_by_color.delete "red" # Deletes all red cars.
|
||||
|
||||
# Deletes cars that are both slow and expensive.
|
||||
cars_by_keywords.delete ["slow", "expensive"]
|
||||
|
||||
# Deletes all cars that are both blue and slow.
|
||||
cars_by_color.delete "blue", do |car|
|
||||
car.keywords.includes? "slow"
|
||||
end
|
||||
|
||||
# Same.
|
||||
cars_by_keywords.delete "slow", do |car|
|
||||
car.color == "blue"
|
||||
end
|
||||
.SOURCE
|
||||
.QE
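The semantics of a deletion narrowed by a block can be modelled in a few lines of plain Ruby. This is only a sketch under assumptions: ToyPartition, its in-memory hash and the sample cars are made up for the illustration and say nothing about how DODB stores partitions on disk.

```ruby
# Toy partition index: groups values by one attribute (1-n relation).
# Hypothetical illustration only, this is not the DODB implementation.
Car = Struct.new(:name, :color, :keywords)

class ToyPartition
  def initialize(&attribute)
    @attribute = attribute
    @values_by_key = Hash.new { |hash, key| hash[key] = [] }
  end

  def add(value)
    @values_by_key[@attribute.call(value)] << value
  end

  # Returns the values of the partition, or an empty array if there is none.
  def get(key)
    @values_by_key.fetch(key, []).dup
  end

  # Deletes the whole partition, or only the values selected by the block.
  # Returns the deleted values.
  def delete(key, &selector)
    return [] unless @values_by_key.key?(key)

    selector ||= ->(_value) { true }
    deleted, kept = @values_by_key[key].partition { |value| selector.call(value) }
    if kept.empty?
      @values_by_key.delete(key)
    else
      @values_by_key[key] = kept
    end
    deleted
  end
end

cars_by_color = ToyPartition.new(&:color)
cars_by_color.add Car.new("Corvet", "red", ["fast"])
cars_by_color.add Car.new("Blue-fast", "blue", ["fast"])
cars_by_color.add Car.new("Blue-slow", "blue", ["slow"])

# Deletes the cars that are both blue and slow.
cars_by_color.delete("blue") { |car| car.keywords.include?("slow") }
p cars_by_color.get("blue").map(&:name)  # the fast blue car remains
p cars_by_color.get("green")             # no such partition: empty array
```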
|
||||
.
|
||||
.
|
||||
.SSS Tags: search on multiple keys
|
||||
The tag index makes it possible to search for values based on multiple keys.
|
||||
For example, searching for all cars that are both fast and elegant can be written this way:
|
||||
.QP
|
||||
.SOURCE Ruby ps=9 vs=10
|
||||
fast_elegant_cars = cars_by_keywords.get ["fast", "elegant"]
|
||||
.SOURCE
|
||||
Used with a list of keys, the
|
||||
.FUNCTION_CALL get
|
||||
function returns an empty list if no value matches.
|
||||
.br
|
||||
The implementation was designed to be simple (7 lines of code), not efficient.
|
||||
However, with data and index caches, the search is expected to be fast enough for most uses, provided that the tags are small enough (a few thousand entries).
|
||||
.QE
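A search on several tags can be understood as an intersection of the entries found for each tag. The following plain Ruby sketch shows that idea under stated assumptions: ToyTagIndex and its integer keys are hypothetical, and the real DODB code may proceed differently.

```ruby
require "set"

# Toy tag index: maps each tag to the set of keys of the entries carrying it.
# Hypothetical illustration only, this is not the DODB implementation.
class ToyTagIndex
  def initialize
    @keys_by_tag = Hash.new { |hash, tag| hash[tag] = Set.new }
  end

  # Registers an entry key under each of its tags.
  def add(key, tags)
    tags.each { |tag| @keys_by_tag[tag] << key }
  end

  # Returns the sorted keys of the entries carrying every requested tag,
  # or an empty array when nothing matches.
  def get(tags)
    sets = Array(tags).map { |tag| @keys_by_tag.fetch(tag, Set.new) }
    return [] if sets.empty?

    sets.reduce { |common, set| common & set }.to_a.sort
  end
end

index = ToyTagIndex.new
index.add(0, ["fast", "elegant"])
index.add(1, ["fast", "expensive"])
index.add(2, ["elegant"])

p index.get(["fast", "elegant"])  # entries tagged both fast and elegant
p index.get("fast")               # a single tag is accepted as well
p index.get(["fast", "cheap"])    # no match: empty array
```

The intersection touches every key of every requested tag, which is why the cost grows with the size of the tags rather than with the size of the database.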
|
||||
.
|
||||
.
|
||||
|