Talk a bit more about the results (index).
parent
948f995ef4
commit
751baef391
@ -0,0 +1,6 @@
%K CBOR
%A C. Bormann
%A P. Hoffman
%T RFC 8949, Concise Binary Object Representation (CBOR)
%D 2020
%I Internet Engineering Task Force (IETF)
@ -44,7 +44,7 @@ define legend {
cy = cy - hdiff
legend_line(cy,lstartx,lendx,tstartx,black,"Cached db and index")
cy = cy - hdiff
legend_line(cy,lstartx,lendx,tstartx,pink,"FIFO db and cached index")
legend_line(cy,lstartx,lendx,tstartx,pink,"Common db, cached index")
cy = cy - hdiff
legend_line(cy,lstartx,lendx,tstartx,blue,"Uncached db, cached index")
cy = cy - hdiff
@ -943,14 +943,47 @@ The experiment starts with a database containing 1,000 cars and goes up to 250,0
.so graphs/query_index.grap
.ps \n[PS]
.QP
This figure shows the request durations to retrieve data based on a basic index with a database containing up to 250k entries.
This figure shows the request durations to retrieve data based on a basic index with a database containing up to 250k entries, both with linear and logarithmic scales.
.QE
Since there is only one value to retrieve, the request is quick and time is almost constant.
When the value and the index are kept in memory (see \f[CW]RAM only\f[] and \f[CW]Cached db\f[]), the retrieval is almost instantaneous (about 50 to 120 ns).
In case the value is on the disk, deserialization takes about 15 µs (see \f[CW]Uncached db, cached index\f[]).
When the value and the index are kept in memory (see \f[CW]RAM only\f[], \f[CW]Cached db\f[] and \f[CW]Common db\f[]), the retrieval is almost instantaneous\*[*].
.FOOTNOTE1
About 110 to 120 ns for RAM-only and cached database.
This is slightly more (about 200 ns) for the Common database since there are a few more steps due to the inner structure to maintain.
.FOOTNOTE2
In case the value is on the disk, deserialization takes about 15 µs (see \f[CW]Uncached db\f[]).
The request is a little longer when the index isn't cached (see \f[CW]Uncached db and index\f[]); in this case DODB walks the file-system to find the right symlink to follow, thus slowing the process even more, by up to 20%.
The logarithmic scale version of this figure shows that RAM-only and Cached databases have exactly the same performance.
The Common database is somewhat slower than these two due to its caching policy: when a value is requested, the Common database puts its key at the start of a list to represent a
.I recent
use of this data (conversely, the last keys in this list are the least recently used entries).
Thus, the Common database spends about 80 ns on its caching policy, which makes it about 67% slower than the previous ones at retrieving a value.
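The caching policy described above can be sketched as follows. This is only an illustration written in Python, not DODB's actual code: the class name, its methods and the use of an ordered dictionary in place of the list are all assumptions made for the example.

```python
from collections import OrderedDict


class RecentlyUsedCache:
    """Hypothetical sketch of a 'most recent key first' cache (not DODB's API)."""

    def __init__(self, max_entries):
        if max_entries < 1:
            raise ValueError("max_entries must be at least 1")
        self.max_entries = max_entries
        # Keys are kept in recency order: first = most recently used.
        self._entries = OrderedDict()

    def get(self, key):
        """Return the cached value, or None when the key is not cached."""
        if key not in self._entries:
            return None
        # A read counts as a recent use: move the key to the start.
        self._entries.move_to_end(key, last=False)
        return self._entries[key]

    def put(self, key, value):
        """Store a value and drop the least recently used entry if needed."""
        self._entries[key] = value
        self._entries.move_to_end(key, last=False)
        if len(self._entries) > self.max_entries:
            # The last key is the least recently used one.
            self._entries.popitem(last=True)

    def keys_by_recency(self):
        """List the cached keys, most recently used first."""
        return list(self._entries)


cache = RecentlyUsedCache(max_entries=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")       # "a" becomes the most recently used key
cache.put("c", 3)    # the cache is full, so "b" is evicted
```

Every read has to update the recency order in addition to fetching the value, and that extra bookkeeping is the kind of per-request cost the measurements attribute to the Common database.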
Uncached databases are far away from these results, as shown by the logarithmically scaled figure.
The data cache shortens the requests, making them at least a hundred times faster.
The results depend on the data size; the bigger the data, the slower the serialization (and deserialization).
That is why alternative encodings, such as CBOR,
.[
CBOR
.]
should be considered for large databases.
.SS Partitions (1 to n relations)
.LP
.ps -2
.so graphs/query_partition.grap
.ps \n[PS]
.SS Tags (n to n relations)
.LP
.ps -2
.so graphs/query_tag.grap
.ps \n[PS]
.
.SS Summary
.ps -2
.TS
allbox tab(:);
@ -965,31 +998,25 @@ Cached db and index:T{
Performance for retrieving a value is the same as RAM only while
enabling the admin to manually search for data on-disk.
T}:about the same performance
Uncached db, cached index::300 to 400x slower
Common db, cached index:T{
Performance is still excellent while requiring a
.UL configurable
amount of RAM.
Should be used by default.
T}:T{
67% slower (about 200 ns), which is still great
T}
Uncached db, cached index:Very slow. Common database should be considered instead.:170 to 180x slower
Uncached db and index:T{
Best memory footprint, worst performance.
T}:400 to 500x slower
T}:200 to 210x slower
.TE
.ps \n[PS]
.B Conclusion :
as expected, retrieving a single value is fast and the size of the database doesn't matter much.
.SS Conclusion on performance
As expected, retrieving a single value is fast and the size of the database doesn't matter much.
Each deserialization and, more importantly, each disk access is a pain point.
Caching the value enables a massive performance gain: data can be retrieved several hundred times faster.
.SS Partitions (1 to n relations)
.LP
.ps -2
.so graphs/query_partition.grap
.ps \n[PS]
.SS Tags (n to n relations)
.LP
.ps -2
.so graphs/query_tag.grap
.ps \n[PS]
.
.
.
.SECTION Future work
This section presents all the features I want to see in a future version of the DODB library.