.so macros.roff
.TITLE Brief performance analysis of Document Oriented DataBase (DODB)
.AUTHOR Philippe P.
.ABSTRACT1
DODB is a database-as-library providing a very simple way to store an application's data: it stores serialized
.I documents
(basically any data type) in plain files.
To speed up searches, attributes of these documents can be used as indexes, which results in the creation of a few symbolic links
.I symlinks ) (
on the disk.
.br
See the \f[CW]README\f[] for a longer explanation.

This document briefly presents an experiment to understand the performance we can get with this approach.
.br
.UL Status :
WIP
.ABSTRACT2
.SECTION Experimental scenario
.LP
The following experiment shows the performance of DODB based on querying durations.
Data can be searched via
.I indexes ,
as in SQL databases.
DODB provides three kinds of indexes:
(a) basic indexes, representing 1 to 1 relations: the document's attribute has a value, and each value of this attribute is unique;
(b) partitions, representing 1 to n relations: the attribute has a value, and this value can be shared by other documents;
(c) tags, representing n to n relations: the attribute can have multiple values, each of which can be shared by other documents.
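.LP
The three kinds of indexes can be pictured with a minimal sketch, written in plain Ruby rather than Crystal and without using DODB's API.
The directory layout, the file names, the sample cars and the \f[CW]add_car\f[], \f[CW]link\f[] and \f[CW]read_all\f[] helpers are assumptions made for illustration; only the idea of symlinks pointing at data files comes from the description above.

```ruby
require "fileutils"
require "json"
require "tmpdir"

# Hypothetical layout: one JSON file per document under data/, and one
# directory of symlinks per index kind. This is NOT DODB's real layout.
def add_car(root, id, car)
  data_file = File.join(root, "data", "#{id}.json")
  FileUtils.mkdir_p(File.dirname(data_file))
  File.write(data_file, JSON.generate(car))

  # (a) basic index: one symlink per unique name
  link(data_file, File.join(root, "by_name", "#{car[:name]}.json"))
  # (b) partition: one directory per color, holding many symlinks
  link(data_file, File.join(root, "by_color", car[:color], "#{id}.json"))
  # (c) tags: the document appears under each of its keywords
  car[:keywords].each do |kw|
    link(data_file, File.join(root, "by_keyword", kw, "#{id}.json"))
  end
end

# Creates the symlink `path` pointing at the data file `target`.
def link(target, path)
  FileUtils.mkdir_p(File.dirname(path))
  File.symlink(target, path)
end

# Reading through a symlink returns the content of the data file.
def read_all(dir)
  Dir.children(dir).sort.map { |f| JSON.parse(File.read(File.join(dir, f))) }
end

ROOT = Dir.mktmpdir("dodb-sketch")
# Made-up sample documents.
add_car(ROOT, 0, { name: "Alpha", color: "red", keywords: ["shiny", "fast"] })
add_car(ROOT, 1, { name: "Beta", color: "red", keywords: ["fast"] })
add_car(ROOT, 2, { name: "Gamma", color: "beige", keywords: ["slow"] })

BY_NAME = JSON.parse(File.read(File.join(ROOT, "by_name", "Beta.json")))
RED_CARS = read_all(File.join(ROOT, "by_color", "red"))
FAST_CARS = read_all(File.join(ROOT, "by_keyword", "fast"))
FileUtils.remove_entry(ROOT)

puts "Beta is #{BY_NAME["color"]}"
puts "red cars: #{RED_CARS.map { |c| c["name"] }.join(", ")}"
puts "fast cars: #{FAST_CARS.map { |c| c["name"] }.join(", ")}"
```

In this sketch a basic index lookup follows a single symlink, while partition and tag lookups list a directory and follow every symlink in it, so their cost grows with the number of matching documents.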

The scenario is simple: add values to a database with indexes (basic, partitions and tags), then query a value 100 times through each kind of index.
Loop and repeat.
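.LP
The shape of this loop can be sketched as follows, in plain Ruby.
This is not the code of \f[CW]spec/benchmark-cars.cr\f[]: the in-memory Hash standing in for the database, the database sizes and the \f[CW]median\f[] and \f[CW]durations_ns\f[] helpers are all assumptions made for illustration.

```ruby
# Returns the median of a list of numbers (mean of the two middle values
# when the list has an even length).
def median(values)
  sorted = values.sort
  mid = sorted.size / 2
  sorted.size.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
end

# Runs the block `repetitions` times and returns each duration in nanoseconds.
def durations_ns(repetitions)
  Array.new(repetitions) do
    start = Process.clock_gettime(Process::CLOCK_MONOTONIC, :nanosecond)
    yield
    Process.clock_gettime(Process::CLOCK_MONOTONIC, :nanosecond) - start
  end
end

# Stand-in for a database: a Hash from car name to car (hypothetical data).
db = {}
REPORT = []
[1_000, 2_000, 3_000].each do |size|
  # Grow the database up to `size` entries...
  (db.size...size).each { |i| db["car-#{i}"] = { color: "red" } }
  # ...then query the same value 100 times and keep the median duration.
  samples = durations_ns(100) { db["car-#{size - 1}"] }
  REPORT << [size, median(samples)]
end

REPORT.each { |size, ns| puts "#{size} cars: median #{ns} ns" }
```

Reporting the median rather than the mean keeps a few slow outliers (for instance a garbage collection pause) from dominating the result.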

Four instances of DODB are tested:
.BULLET \fIuncached database\f[] shows the achievable performance under a strong memory constraint (nothing can be kept in memory);
.BULLET \fIuncached data but cached index\f[] shows the improvement you can expect from caching the indexes;
.BULLET \fIcached database\f[] shows the most basic use of DODB\*[*];
.BULLET \fIRAM only\f[], where the database has no representation on disk (no data is written to it).
The \fIRAM only\f[] instance shows another possible use of DODB: keeping a consistent API to store data, including in-memory data whose lifetime is tied to the application's.
.ENDBULLET
.FOOTNOTE1
Having a cached database will probably be the most widespread use of DODB.
When memory isn't scarce, there is no reason not to use it to achieve better performance.
.FOOTNOTE2
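.LP
The difference between an uncached and a cached instance can be illustrated with a minimal sketch in plain Ruby.
The \f[CW]UncachedStore\f[] and \f[CW]CachedStore\f[] classes, their methods and the JSON file layout are hypothetical and do not mirror DODB's classes; the sketch only shows that both variants can share one API while differing in how often they touch the disk.

```ruby
require "json"
require "tmpdir"

# Hypothetical stores illustrating the trade-off; not DODB's classes.
class UncachedStore
  attr_reader :disk_reads

  def initialize(dir)
    @dir = dir
    @disk_reads = 0
  end

  def []=(key, value)
    File.write(path(key), JSON.generate(value))
  end

  # Every read hits the file system and deserializes the document.
  def [](key)
    @disk_reads += 1
    JSON.parse(File.read(path(key)))
  end

  private

  def path(key)
    File.join(@dir, "#{key}.json")
  end
end

# Same API, but documents are also kept in a Hash once written or read.
class CachedStore < UncachedStore
  def initialize(dir)
    super
    @cache = {}
  end

  def []=(key, value)
    super
    @cache[key] = value
  end

  # Falls back to the disk only when the key is not cached yet.
  def [](key)
    @cache[key] ||= super
  end
end

DIR = Dir.mktmpdir("dodb-cache-sketch")
UNCACHED = UncachedStore.new(DIR)
CACHED = CachedStore.new(DIR)
UNCACHED["alpha"] = { "color" => "red" }
CACHED["beta"] = { "color" => "blue" }
3.times { UNCACHED["alpha"]; CACHED["beta"] }
puts "uncached: #{UNCACHED.disk_reads} disk reads, cached: #{CACHED.disk_reads}"
```

A RAM-only variant in the same spirit would keep the Hash and drop the file writes altogether.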

The computer on which this test is performed\*[*] is an AMD PRO A10-8770E R7 (4 cores), 2.8 GHz.
When mentioned, the
.I disk
is actually a
.I "temporary file-system (tmpfs)"
for maximum efficiency.
.FOOTNOTE1
A very simple $50 PC, bought online.
Nothing fancy.
.FOOTNOTE2

The library is written in Crystal and so is the benchmark (\f[CW]spec/benchmark-cars.cr\f[]).
Nonetheless, apart from a few technicalities, this document aims to provide insight into the approach used in DODB rather than into this particular implementation.

The data type manipulated in the benchmark can be found in \f[CW]spec/db-cars.cr\f[].
.SOURCE Ruby ps=9 vs=9p
class Car
  property name : String             # 1-1 relation
  property color : String            # 1-n relation
  property keywords : Array(String)  # n-n relation
end
.SOURCE
.
.SECTION Basic indexes (1 to 1 relations)
.LP
A basic index matches a single value based on a small string.
Since there is only one value to retrieve, the request is quick and its duration is almost constant.
When the value and the index are kept in memory (see \f[CW]RAM only\f[] and \f[CW]Cached db\f[]), the retrieval is almost instantaneous (about 50 to 120 ns).
When the value is on disk, deserialization takes about 15 µs (see \f[CW]Uncached db, cached index\f[]).
The request takes a little longer when the index isn't cached: in this case DODB walks the file system to find the right symlink to follow, which slows the request further, by up to 20%.
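.LP
To put these figures side by side, here is a small worked computation in plain Ruby.
It only uses the numbers quoted above; the constant names are made up, and the sketch assumes that the 20% overhead applies to the 15 µs figure, which the text does not state explicitly.

```ruby
# Figures quoted in the text, in nanoseconds.
IN_MEMORY_NS = 50..120        # value and index both in memory
DESERIALIZATION_NS = 15_000   # value on disk, index cached (about 15 µs)
UNCACHED_INDEX_OVERHEAD = 20  # percent, upper bound quoted in the text

# Upper bound for a fully uncached request, ASSUMING the 20% applies to
# the 15 µs figure.
UNCACHED_NS = DESERIALIZATION_NS * (100 + UNCACHED_INDEX_OVERHEAD) / 100

# How many times slower a read from disk is than an in-memory lookup.
SLOWDOWN = (DESERIALIZATION_NS / IN_MEMORY_NS.max)..(DESERIALIZATION_NS / IN_MEMORY_NS.min)

puts "uncached request: up to about #{UNCACHED_NS / 1000.0} µs"
puts "disk vs memory: #{SLOWDOWN.min}x to #{SLOWDOWN.max}x slower"
```

Under that assumption, an uncached index costs a few microseconds at most, whereas going from memory to disk costs two orders of magnitude.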
.G1
copy "legend.grap"
frame invis ht 3 wid 4 left solid bot solid
coord y 0,50
ticks left out from 0 to 50 by 10
ticks bot out at 50000 "50,000", 100000 "100,000", 150000 "150,000", 200000 "200,000", 250000 "250,000"

label left "Request duration with" unaligned "an index (us)" "(Median)" left 0.8
label bot "Number of cars in the database" down 0.1

obram = obuncache = obcache = obsemi = 0 # old bullets
cbram = cbuncache = cbcache = cbsemi = 0 # current bullets

legendxleft = 100000
legendxright = 250000
legendyup = 15
legendydown = 2

boite(legendxleft,legendxright,legendyup,legendydown)
legend(legendxleft,legendxright,legendyup,legendydown)

copy "../data/index.d" thru X
cx = $1*5

y_scale = 1000

# ram cached semi uncached
line from cx,$2/y_scale to cx,$4/y_scale
line from cx,$5/y_scale to cx,$7/y_scale
line from cx,$8/y_scale to cx,$10/y_scale
line from cx,$11/y_scale to cx,$13/y_scale

#ty = $3

cx = $1*5

cbram = $3/y_scale
cbcache = $6/y_scale
cbsemi = $9/y_scale
cbuncache = $12/y_scale

if (obram > 0) then {line from cx,cbram to ox,obram}
if (obcache > 0) then {line from cx,cbcache to ox,obcache}
.gcolor blue
if (obsemi > 0) then {line from cx,cbsemi to ox,obsemi}
.gcolor
.gcolor green
if (obuncache > 0) then {line from cx,cbuncache to ox,obuncache}
.gcolor

obram = cbram
obcache = cbcache
obsemi = cbsemi
obuncache = cbuncache
ox = cx

# ram cached semi uncached
.gcolor red
bullet at cx,cbram
.gcolor
bullet at cx,cbcache
.gcolor blue
bullet at cx,cbsemi
.gcolor
.gcolor green
bullet at cx,cbuncache
.gcolor
X
.G2
.bp
.SECTION Partitions (1 to n relations)
.LP
.G1
copy "legend.grap"
frame invis ht 3 wid 4 left solid bot solid
coord x 0,5000*2 y 0,350
ticks left out from 0 to 350 by 50

label left "Request duration" unaligned "for a partition (ms)" "(Median)" left 0.8
label bot "Number of cars matching the partition" down 0.1

obram = obuncache = obcache = obsemi = 0
cbram = cbuncache = cbcache = cbsemi = 0

legendxleft = 1000
legendxright = 6500
legendyup = 330
legendydown = 230

boite(legendxleft,legendxright,legendyup,legendydown)
legend(legendxleft,legendxright,legendyup,legendydown)

copy "../data/partitions.d" thru X
cx = $1*2

y_scale = 1000000

# ram cached semi uncached
line from cx,$2/y_scale to cx,$4/y_scale
line from cx,$5/y_scale to cx,$7/y_scale
line from cx,$8/y_scale to cx,$10/y_scale
line from cx,$11/y_scale to cx,$13/y_scale

#ty = $3

cbram = $3/y_scale
cbcache = $6/y_scale
cbsemi = $9/y_scale
cbuncache = $12/y_scale

if (obram > 0) then {line from cx,cbram to ox,obram}
if (obcache > 0) then {line from cx,cbcache to ox,obcache}
.gcolor blue
if (obsemi > 0) then {line from cx,cbsemi to ox,obsemi}
.gcolor
.gcolor green
if (obuncache > 0) then {line from cx,cbuncache to ox,obuncache}
.gcolor

obram = cbram
obcache = cbcache
obsemi = cbsemi
obuncache = cbuncache
ox = cx

# ram cached semi uncached
.gcolor red
bullet at cx,cbram
.gcolor
bullet at cx,cbcache
.gcolor blue
bullet at cx,cbsemi
.gcolor
.gcolor green
bullet at cx,cbuncache
.gcolor
X
.G2
.bp
.SECTION Tags (n to n relations)
.LP
.G1
copy "legend.grap"
frame invis ht 3 wid 4 left solid bot solid
coord x 0,5000 y 0,170
ticks left out from 0 to 170 by 20
label left "Request duration" unaligned "for a tag (ms)" "(Median)" left 0.8
label bot "Number of cars matching the tag" down 0.1

obram = obuncache = obcache = obsemi = 0
cbram = cbuncache = cbcache = cbsemi = 0

legendxleft = 200
legendxright = 3000
legendyup = 170
legendydown = 120

boite(legendxleft,legendxright,legendyup,legendydown)
legend(legendxleft,legendxright,legendyup,legendydown)

copy "../data/tags.d" thru X
cx = $1

y_scale = 1000000

# ram cached semi uncached
line from cx,$2/y_scale to cx,$4/y_scale
line from cx,$5/y_scale to cx,$7/y_scale
line from cx,$8/y_scale to cx,$10/y_scale
line from cx,$11/y_scale to cx,$13/y_scale

#ty = $3

cbram = $3/y_scale
cbcache = $6/y_scale
cbsemi = $9/y_scale
cbuncache = $12/y_scale

if (obram > 0) then {line from cx,cbram to ox,obram}
if (obcache > 0) then {line from cx,cbcache to ox,obcache}
.gcolor blue
if (obsemi > 0) then {line from cx,cbsemi to ox,obsemi}
.gcolor
.gcolor green
if (obuncache > 0) then {line from cx,cbuncache to ox,obuncache}
.gcolor

obram = cbram
obcache = cbcache
obsemi = cbsemi
obuncache = cbuncache
ox = cx

# ram cached semi uncached
.gcolor red
bullet at cx,cbram
.gcolor
bullet at cx,cbcache
.gcolor blue
bullet at cx,cbsemi
.gcolor
.gcolor green
bullet at cx,cbuncache
.gcolor
X
.G2