Triggers, API documentation, benchmarks, DODB::Storage::Common

Philippe PITTOLI 2024-06-01 02:15:11 +02:00
parent 401578c77d
commit 27d4c00cbe
40 changed files with 4442 additions and 2195 deletions

.gitignore vendored Normal file

@ -0,0 +1,2 @@
docs/
bin/


@ -1,12 +1,18 @@
all: build
OPTS ?= --progress
OPTS ?= --progress --no-debug
Q ?= @
SHOULD_UPDATE = ./bin/should-update
DBDIR=/tmp/tests-on-dodb
RESULTS_DIR=results
benchmark-cars:
$(Q)crystal build spec/benchmark-cars.cr $(OPTS)
$(Q)crystal build spec/benchmark-cars.cr $(OPTS) --release
benchmark-cars-run: benchmark-cars
./benchmark-cars search # by default, test search durations
./bin/stats.sh $(RESULTS_DIR)
./bin/extract-data-benchmark-cars.sh $(RESULTS_DIR)
build: benchmark-cars
@ -15,3 +21,13 @@ wipe-db:
release:
make build OPTS="--release --progress"
doc:
crystal docs src/dodb.cr
HTTPD_ACCESS_LOGS ?= /tmp/access-dodb-docs.log
HTTPD_ADDR ?= 127.0.0.1
HTTPD_PORT ?= 9000
DIR ?= docs
serve-doc:
darkhttpd $(DIR) --addr $(HTTPD_ADDR) --port $(HTTPD_PORT) --log $(HTTPD_ACCESS_LOGS)


@ -11,11 +11,12 @@ The objective is to get rid of DBMS when storing simple files directly on the fi
A brief summary:
- no SQL
- objects are serialized (currently in JSON)
- indexes (simple symlinks on the FS) can be created to improve significantly searches in the db
- data is indexed to significantly improve searches in the db
- db is fully integrated in the language (basically a simple array with a few more functions)
Also, data can be `cached`.
The entire base will be kept in memory (if you can), enabling incredible speeds.
- symlinks on the FS can be generated to enable data searches **outside the application, with UNIX tools**
- configurable data cache size
- RAM-only databases for short-lived data
- triggers can be easily implemented to extend indexes beyond your wildest expectations
## Limitations
@ -41,15 +42,8 @@ Since DODB doesn't use SQL and doesn't even try to handle stuff like atomicity o
Reading data from disk takes a few dozen microseconds, and not much more when searching indexed data.
**On my more-than-decade-old, slow-as-fuck machine**, the simplest possible SQL request to Postgres takes about 100 to 900 microseconds.
With DODB, to reach on-disk data: 13 microseconds.
To search then retrieve indexed data: almost the same thing, 16 microseconds on average, since it's just a path to a symlink we have to build.
With the `cached` version of DODB, there is not even deserialization happening, so 7 nanoseconds.
Indexes (indexes, partitions and tags) are also cached **by default**.
The speed up is great compared to the uncached version since you won't walk the file-system.
Searching an index takes about 35 nanoseconds when cached.
To avoid the memory cost of cached indexes, you can explicitly ask for uncached ones.
With DODB, to reach on-disk data: 15 microseconds; and just a few dozen **nanoseconds** for cached data.
Even when searching a specific value with an index.
**NOTE:** of course SQL and DODB cannot be fairly compared based on performance since they don't have the same properties.
But still, this is the kind of speed you can get with the tool.
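For readers who want to check such figures on their own machine, here is a minimal timing sketch. It only uses the Crystal standard library and makes no DODB call: `average_duration` is a hypothetical helper, and the hash lookup merely stands in for a cached read, so the numbers it prints say nothing about DODB itself.

```crystal
# Minimal timing sketch, standard library only (no DODB call involved).
# `average_duration` is a hypothetical helper, not part of DODB.
def average_duration(runs : Int32, &block) : Time::Span
  total = Time::Span.zero
  runs.times do
    start = Time.monotonic
    yield
    total += Time.monotonic - start
  end
  total / runs
end

# Stand-in for a cached lookup: a plain hash access, no deserialization.
cache = {"86a07924" => "Corvet"}
avg = average_duration(1_000) { cache["86a07924"] }
puts "average: #{avg.total_nanoseconds} ns"
```

Averaging over many runs matters here: a single measurement at the nanosecond scale is dominated by clock resolution and noise.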
@ -65,12 +59,11 @@ dependencies:
git: https://git.baguette.netlib.re/Baguette/dodb.cr
```
# Basic usage
```crystal
# Database creation
db = DODB::DataBase(Thing).new "path/to/storage/directory"
# Database creation, with a data cache of 100k entries.
db = DODB::Storage::Common(Thing).new "path/to/storage/directory", 100_000
# Adding an element to the db
db << Thing.new
@ -88,7 +81,7 @@ end
The DB creation is simply creating a few directories on the file-system.
```crystal
db = DODB::DataBase(Thing).new "path/to/storage/directory"
db = DODB::Storage::Common(Thing).new "path/to/storage/directory", 100_000
```
## Adding a new object
@ -101,8 +94,8 @@ db << Thing.new
To speed up searches in the DB, we can sort the objects, based on their attributes for example.
There are 3 sorting methods:
- index, 1-1 relations, an attribute value is bound to a single object (an identifier)
- partition, 1-n relations, an attribute value may be related to several objects (the color of a car, for instance)
- basic indexes, 1-1 relations, an attribute value is bound to a single object (an identifier)
- partitions, 1-n relations, an attribute value may be related to several objects (the color of a car, for instance)
- tags, n-n relations, each object may have several tags, each tag may be related to several objects
Let's take an example.
@ -123,7 +116,7 @@ end
We want to store `cars` in a database and index them on their `id` attribute:
```Crystal
cars = DODB::DataBase(Car).new "path/to/storage/directory"
cars = DODB::Storage::Common(Car).new "path/to/storage/directory", 100_000
# We give a name to the index, then the code to extract the id from a Car instance
cars_by_id = cars.new_index "id", &.id
@ -214,8 +207,7 @@ car = cars_by_id.get "86a07924-ab3a-4f46-a975-e9803acba22d"
# we modify it
car.color = "Blue"
# update
# simple case: no change in the index
# update, simple case: no change in the index
cars_by_id.update car
# otherwise
car.id = "something-else-than-before"
@ -250,6 +242,7 @@ end
# Remove a value based on a tag.
cars_by_keyword.delete "shiny"
cars_by_keyword.delete ["slow", "expensive"] # Remove cars that are both slow and expensive.
cars_by_keyword.delete "elegant", do |car|
car.name == "GTI"
end
@ -282,7 +275,7 @@ end
# Database creation #
#####################
cars = DODB::DataBase(Car).new "./bin/storage"
cars = DODB::Storage::Common(Car).new "./db-storage", 100_000
##########################
@ -334,6 +327,8 @@ pp! cars_by_color.get "red"
# based on a tag (print all fast cars)
pp! cars_by_keyword.get "fast"
# based on several tags (print all cars that are both slow and expensive)
pp! cars_by_keyword.get ["slow", "expensive"]
############
# Updating #
@ -355,7 +350,7 @@ cars_by_name.update_or_create car.name, car
# We all know it, elegant cars are also expensive.
cars_by_keyword.get("elegant").each do |car|
car.keywords << "expensive"
cars_by_name.update car.name, car
cars_by_name.update car
end
###############
@ -372,6 +367,6 @@ cars_by_color.delete "blue", &.name.==("GTI")
# based on a keyword
cars_by_keyword.delete "solid"
# based on a keyword (but not only)
cars_by_keyword.delete "fast", &.name.==("Corvet")
# based on a few keywords (but not only)
cars_by_keyword.delete ["slow", "expensive"], &.name.==("Corvet")
```
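Searching with several tags, as in `cars_by_keyword.get ["slow", "expensive"]`, amounts to an intersection: only the cars carrying every requested tag are kept. The sketch below illustrates that idea with plain hashes and arrays. The `tags` hash and the `with_all_tags` function are hypothetical stand-ins, not DODB's internal representation or API.

```crystal
# Hypothetical in-memory stand-in for a tag index: tag => names of tagged cars.
# This is not DODB's internal representation, only an illustration.
tags = {
  "slow"      => ["Deudeuch", "Ford-5"],
  "expensive" => ["Bullet-GT", "Deudeuch"],
  "fast"      => ["Corvet", "Bullet-GT"],
}

# Returns the names carrying every requested tag (array intersection).
# An unknown tag counts as an empty list, so the result is then empty.
def with_all_tags(index : Hash(String, Array(String)), wanted : Array(String)) : Array(String)
  lists = wanted.map { |tag| index.fetch(tag, [] of String) }
  return [] of String if lists.empty?
  lists.reduce { |acc, names| acc & names }
end

# Cars that are both slow and expensive.
pp! with_all_tags(tags, ["slow", "expensive"])
```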

TODO.md

@ -1,8 +1,13 @@
# API
Cached indexes (index, partition, tags) should be used by default.
Uncached indexes should be an option, through a new function `add_uncached_index` or something.
# Performance
Search with some kind of "pagination" system: ask entries with a limit on the number of elements and an offset.
- search functions of *index objects* with a "pagination" system: ask entries with a limit on the number of elements and an offset.
# Memory and file-system management
- When a value is removed, the related partitions (and tags) may be empty, leaving both an empty array
in memory and a directory on the file-system. Should they be removed?
# Documentation
- Finish the PDF to explain *why DODB*.
- Replace *index* with *key* in `DODB::Storage` and inherited classes.
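As a rough idea of what the "pagination" item in the Performance section could look like, here is a sketch on a plain array. `paginate` is a hypothetical helper, not an existing or planned DODB function, and a real implementation would presumably avoid loading every entry first.

```crystal
# Hypothetical helper: returns at most `limit` entries, starting at `offset`.
# An offset beyond the end of the array gives an empty result.
def paginate(entries : Array(T), offset : Int32, limit : Int32) : Array(T) forall T
  entries.skip(offset).first(limit)
end

names = ["Corvet", "SUV", "Mustang", "Bullet-GT", "GTI", "Deudeuch"]

pp! paginate(names, offset: 2, limit: 3) # a full page
pp! paginate(names, offset: 5, limit: 3) # the last, shorter page
```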


@ -1,5 +1,5 @@
name: dodb
version: 0.3.0
version: 0.5.0
authors:
- Luka Vandervelden <lukc@upyum.com>
@ -8,4 +8,4 @@ authors:
description: |
Simple, embeddable Document-Oriented DataBase in Crystal.
license: MIT
license: ISC


@ -1,181 +1,221 @@
require "benchmark"
require "./benchmark-utilities.cr"
require "./utilities.cr"
require "./db-cars.cr"
require "../src/dodb.cr"
require "./test-data.cr"
# List of environment variables and default values:
# ENV["CARNAME"] rescue "Corvet-#{(db_size/2).to_i}"
# ENV["CARCOLOR"] rescue "red"
# ENV["CARKEYWORD"] rescue "spacious"
# ENV["DBSIZE"] rescue 50_000
# ENV["DBSIZE_START"] rescue 1_000
# ENV["DBSIZE_INCREMENT"] rescue 1_000
# ENV["REPORT_DIR"] rescue "results"
# ENV["NBRUN"] rescue 100
# ENV["MAXINDEXES"] rescue 5_000
# ENV["FIFO_SIZE"] rescue 10_000
class DODBCachedCars < DODB::CachedDataBase(Car)
property storage_dir : String
def initialize(storage_ext = "", remove_previous_data = true)
@storage_dir = "test-storage-cars-cached#{storage_ext}"
class Context
class_property report_dir = "results"
class_property max_indexes = 5_000
class_property nb_run = 100
class_property from = 1_000
class_property to = 50_000
class_property incr = 1_000
class_property fifo_size : UInt32 = 10_000
end
if remove_previous_data
::FileUtils.rm_rf storage_dir
# To simplify the creation of graphs, it's better to have fake data for
# partitions and tags that won't be actually covered.
# 0 means the absence of data.
def fake_report(name)
durations = Array(Int32).new Context.nb_run, 0
File.open("#{Context.report_dir}/#{name}.raw", "w") do |file|
durations.each do |d|
file.puts d
end
super storage_dir
end
def rm_storage_dir
::FileUtils.rm_rf @storage_dir
end
puts "#{name}: no report"
end
class DODBUnCachedCars < DODB::DataBase(Car)
property storage_dir : String
def initialize(storage_ext = "", remove_previous_data = true)
@storage_dir = "test-storage-cars-uncached#{storage_ext}"
if remove_previous_data
::FileUtils.rm_rf storage_dir
def report(storage, name, &block)
durations = run_n_times Context.nb_run, &block
File.open("#{Context.report_dir}/#{name}.raw", "w") do |file|
durations.each do |d|
file.puts d
end
super storage_dir
end
avr = durations.reduce { |a, b| a + b } / Context.nb_run
puts "#{name}: #{avr}"
avr
end
def rm_storage_dir
::FileUtils.rm_rf @storage_dir
def verbose_add_cars(storage, nbcars, name, max_indexes)
long_operation "add #{nbcars} values to #{name}" do
add_cars storage, nbcars, max_indexes: max_indexes
end
end
class DODBSemiCachedCars < DODB::DataBase(Car)
property storage_dir : String
def initialize(storage_ext = "", remove_previous_data = true)
@storage_dir = "test-storage-cars-semi#{storage_ext}"
# Add first entries, then loop: speed tests, add entries.
def prepare_env(storage, name, s_index, s_partition, s_tags, &)
verbose_add_cars storage, Context.from, name, max_indexes: Context.max_indexes
if remove_previous_data
::FileUtils.rm_rf storage_dir
current = Context.from
to = Context.to
incr = Context.incr
while current < to
yield storage, current, name, s_index, s_partition, s_tags
break if current + incr >= to
verbose_add_cars storage, incr, name, max_indexes: Context.max_indexes
current += incr
end
long_operation "removing #{name} data" { storage.rm_storage_dir }
end
def search_benchmark(storage : DODB::Storage(Car),
current_db_size : Int32,
name : String,
search_name : DODB::Trigger::Index(Car),
search_color : DODB::Trigger::Partition(Car),
search_keywords : DODB::Trigger::Tags(Car))
name_to_search = ENV["CARNAME"] rescue "Corvet-#{(current_db_size/2).to_i}"
color_to_search = ENV["CARCOLOR"] rescue "red"
keyword_to_search = ENV["CARKEYWORD"] rescue "spacious"
puts "NEW BATCH: db-size #{current_db_size}, name: '#{name_to_search}', color: '#{color_to_search}', tag: '#{keyword_to_search}'"
report(storage, "#{name}_#{current_db_size}_index") do
corvet = search_name.get name_to_search
end
if current_db_size <= Context.max_indexes
report(storage, "#{name}_#{current_db_size}_partitions") do
corvet = search_color.get? color_to_search
end
super storage_dir
end
def rm_storage_dir
::FileUtils.rm_rf @storage_dir
report(storage, "#{name}_#{current_db_size}_tags") do
corvet = search_keywords.get? keyword_to_search
end
else
fake_report("#{name}_#{current_db_size}_partitions")
fake_report("#{name}_#{current_db_size}_tags")
end
end
def init_indexes(storage : DODB::Storage)
n = storage.new_index "name", &.name
c = storage.new_partition "color", &.color
k = storage.new_tags "keyword", &.keywords
return n, c, k
def bench_searches()
cars_ram = SPECDB::RAMOnly(Car).new
cars_cached = SPECDB::Cached(Car).new
cars_fifo = SPECDB::Common(Car).new "-#{Context.fifo_size}", Context.fifo_size
cars_semi = SPECDB::Uncached(Car).new "-semi"
cars_uncached = SPECDB::Uncached(Car).new
ram_Sby_name, ram_Sby_color, ram_Sby_keywords = ram_indexes cars_ram
cached_Sby_name, cached_Sby_color, cached_Sby_keywords = cached_indexes cars_cached
fifo_Sby_name, fifo_Sby_color, fifo_Sby_keywords = cached_indexes cars_fifo
semi_Sby_name, semi_Sby_color, semi_Sby_keywords = cached_indexes cars_semi
uncached_Sby_name, uncached_Sby_color, uncached_Sby_keywords = uncached_indexes cars_uncached
fn = ->search_benchmark(DODB::Storage(Car), Int32, String, DODB::Trigger::Index(Car), DODB::Trigger::Partition(Car), DODB::Trigger::Tags(Car))
prepare_env cars_ram, "ram", ram_Sby_name, ram_Sby_color, ram_Sby_keywords, &fn
prepare_env cars_cached, "cached", cached_Sby_name, cached_Sby_color, cached_Sby_keywords, &fn
prepare_env cars_fifo, "fifo", fifo_Sby_name, fifo_Sby_color, fifo_Sby_keywords, &fn
prepare_env cars_semi, "semi", semi_Sby_name, semi_Sby_color, semi_Sby_keywords, &fn
prepare_env cars_uncached, "uncached", uncached_Sby_name, uncached_Sby_color, uncached_Sby_keywords, &fn
end
def init_uncached_indexes(storage : DODB::Storage)
n = storage.new_uncached_index "name", &.name
c = storage.new_uncached_partition "color", &.color
k = storage.new_uncached_tags "keyword", &.keywords
return n, c, k
end
def add_cars(storage : DODB::Storage, nb_iterations : Int32)
def perform_add(storage : DODB::Storage(Car))
corvet0 = Car.new "Corvet", "red", [ "shiny", "impressive", "fast", "elegant" ]
i = 0
car1 = Car.new "Corvet", "red", [ "shiny", "impressive", "fast", "elegant" ]
car2 = Car.new "Bullet-GT", "blue", [ "shiny", "fast", "expensive" ]
car3 = Car.new "Deudeuche", "beige", [ "curvy", "sublime" ]
car4 = Car.new "Ford-5", "red", [ "unknown" ]
car5 = Car.new "C-MAX", "gray", [ "spacious", "affordable" ]
while i < nb_iterations
car1.name = "Corvet-#{i}"
car2.name = "Bullet-GT-#{i}"
car3.name = "Deudeuche-#{i}"
car4.name = "Ford-5-#{i}"
car5.name = "C-MAX-#{i}"
storage << car1
storage << car2
storage << car3
storage << car4
storage << car5
perform_benchmark_average Context.nb_run, do
corvet = corvet0.clone
corvet.name = "Corvet-#{i}"
storage.unsafe_add corvet
i += 1
STDOUT.write "\radding value #{i}".to_slice
end
puts ""
end
cars_cached = DODBCachedCars.new
cars_uncached = DODBUnCachedCars.new
cars_semi = DODBSemiCachedCars.new
cached_searchby_name, cached_searchby_color, cached_searchby_keywords = init_indexes cars_cached
uncached_searchby_name, uncached_searchby_color, uncached_searchby_keywords = init_uncached_indexes cars_uncached
semi_searchby_name, semi_searchby_color, semi_searchby_keywords = init_indexes cars_semi
add_cars cars_cached, 1_000
add_cars cars_uncached, 1_000
add_cars cars_semi, 1_000
# Searching for data with an index.
Benchmark.ips do |x|
x.report("(cars db) searching a data with an index (with a cache)") do
corvet = cached_searchby_name.get "Corvet-500"
end
x.report("(cars db) searching a data with an index (semi: cache is only on index)") do
corvet = semi_searchby_name.get "Corvet-500"
end
x.report("(cars db) searching a data with an index (without a cache)") do
corvet = uncached_searchby_name.get "Corvet-500"
end
end
# Searching for data with a partition.
Benchmark.ips do |x|
x.report("(cars db) searching a data with a partition (with a cache)") do
red_cars = cached_searchby_color.get "red"
end
def bench_add()
cars_ram = SPECDB::RAMOnly(Car).new
cars_cached = SPECDB::Cached(Car).new
cars_fifo = SPECDB::Common(Car).new "-#{Context.fifo_size}", Context.fifo_size
cars_semi = SPECDB::Uncached(Car).new "-semi"
cars_uncached = SPECDB::Uncached(Car).new
x.report("(cars db) searching a data with a partition (semi: cache is only on partition)") do
red_cars = semi_searchby_color.get "red"
end
ram_indexes cars_ram
cached_indexes cars_cached
cached_indexes cars_fifo
cached_indexes cars_semi
uncached_indexes cars_uncached
x.report("(cars db) searching a data with a partition (without a cache)") do
red_cars = uncached_searchby_color.get "red"
end
avr = perform_add(cars_ram)
puts "(ram db and indexes) add a value (average on #{Context.nb_run} tries): #{avr}"
avr = perform_add(cars_cached)
puts "(cached db and indexes) add a value (average on #{Context.nb_run} tries): #{avr}"
avr = perform_add(cars_fifo)
puts "(fifo db and cached indexes) add a value (average on #{Context.nb_run} tries): #{avr}"
avr = perform_add(cars_semi)
puts "(uncached db but cached indexes) add a value (average on #{Context.nb_run} tries): #{avr}"
avr = perform_add(cars_uncached)
puts "(uncached db and indexes) add a value (average on #{Context.nb_run} tries): #{avr}"
cars_ram.rm_storage_dir
cars_cached.rm_storage_dir
cars_semi.rm_storage_dir
cars_uncached.rm_storage_dir
end
# Searching for data with a tag.
Benchmark.ips do |x|
x.report("(cars db) searching a data with a tag (with a cache)") do
red_cars = cached_searchby_keywords.get "spacious"
end
def bench_50_shades_of_fifo()
cars_fifo1 = SPECDB::Common(Car).new "-1k", 1_000
cars_fifo5 = SPECDB::Common(Car).new "-5k", 5_000
cars_fifo10 = SPECDB::Common(Car).new "-10k", 10_000
cars_fifo20 = SPECDB::Common(Car).new "-20k", 20_000
x.report("(cars db) searching a data with a tag (semi: cache is only on tags)") do
red_cars = semi_searchby_keywords.get "spacious"
end
fifo_Sby_name1, fifo_Sby_color1, fifo_Sby_keywords1 = cached_indexes cars_fifo1
fifo_Sby_name5, fifo_Sby_color5, fifo_Sby_keywords5 = cached_indexes cars_fifo5
fifo_Sby_name10, fifo_Sby_color10, fifo_Sby_keywords10 = cached_indexes cars_fifo10
fifo_Sby_name20, fifo_Sby_color20, fifo_Sby_keywords20 = cached_indexes cars_fifo20
x.report("(cars db) searching a data with a tag (without a cache)") do
red_cars = uncached_searchby_keywords.get "spacious"
end
fn = ->search_benchmark(DODB::Storage(Car), Int32, String, DODB::Trigger::Index(Car), DODB::Trigger::Partition(Car), DODB::Trigger::Tags(Car))
prepare_env cars_fifo1, "fifo1", fifo_Sby_name1, fifo_Sby_color1, fifo_Sby_keywords1, &fn
prepare_env cars_fifo5, "fifo5", fifo_Sby_name5, fifo_Sby_color5, fifo_Sby_keywords5, &fn
prepare_env cars_fifo10, "fifo10", fifo_Sby_name10, fifo_Sby_color10, fifo_Sby_keywords10, &fn
prepare_env cars_fifo20, "fifo20", fifo_Sby_name20, fifo_Sby_color20, fifo_Sby_keywords20, &fn
end
cars_cached.rm_storage_dir
cars_uncached.rm_storage_dir
ENV["REPORT_DIR"]?.try { |report_dir| Context.report_dir = report_dir }
Dir.mkdir_p Context.report_dir
cars_cached = DODBCachedCars.new
cars_uncached = DODBUnCachedCars.new
ENV["MAXINDEXES"]?.try { |it| Context.max_indexes = it.to_i }
ENV["NBRUN"]?.try { |it| Context.nb_run = it.to_i }
ENV["DBSIZE"]?.try { |it| Context.to = it.to_i }
ENV["DBSIZE_START"]?.try { |it| Context.from = it.to_i }
ENV["DBSIZE_INCREMENT"]?.try { |it| Context.incr = it.to_i }
ENV["FIFO_SIZE"]?.try { |it| Context.fifo_size = it.to_u32 }
#init_indexes cars_cached
#init_indexes cars_uncached
cached_searchby_name, cached_searchby_color, cached_searchby_keywords = init_indexes cars_cached
uncached_searchby_name, uncached_searchby_color, uncached_searchby_keywords = init_uncached_indexes cars_uncached
puts "REPORT_DIR: #{Context.report_dir}"
puts "MAXINDEXES: #{Context.max_indexes}"
puts "NBRUN: #{Context.nb_run}"
puts "DBSIZE: #{Context.to}"
puts "DBSIZE_START: #{Context.from}"
puts "DBSIZE_INCREMENT: #{Context.incr}"
puts "FIFO_SIZE: #{Context.fifo_size}"
add_cars cars_cached, 1_000
add_cars cars_uncached, 1_000
nb_run = 1000
perform_benchmark_average_verbose "(cached) search db with an index", nb_run, do
cached_searchby_name.get "Corvet-500"
if ARGV.size == 0
puts "Usage: benchmark-cars (fifo|search|add)"
exit 0
end
perform_benchmark_average_verbose "(uncached) search db with an index", nb_run, do
uncached_searchby_name.get "Corvet-500"
case ARGV[0]
when /fifo/
bench_50_shades_of_fifo
when /search/
bench_searches
when /add/
bench_add
else
puts "Usage: benchmark-cars (fifo|search|add)"
end
cars_cached.rm_storage_dir
cars_uncached.rm_storage_dir
cars_semi.rm_storage_dir

spec/benchmark-fifo.cr Normal file

@ -0,0 +1,70 @@
require "benchmark"
require "./utilities.cr"
require "../src/fifo.cr"
def add(fifo : FIFO(Int32) | EfficientFIFO(Int32), nb : UInt32)
i = 0
while i < nb
fifo << i
i += 1
end
end
def report_add(fifo : FIFO(Int32) | EfficientFIFO(Int32), nb : UInt32, fname : String)
File.open("#{Context.report_dir}/#{fname}.raw", "w") do |file|
i = 0
while i < nb
elapsed_time = perform_something { fifo << i }
i += 1
file.puts "#{i} #{elapsed_time.total_nanoseconds}"
end
end
end
class Context
class_property nb_values : UInt32 = 100_000
class_property fifo_size : UInt32 = 10_000
class_property report_dir = "results"
end
if nb_values = ENV["NBVAL"]?
Context.nb_values = nb_values.to_u32
end
if fifo_size = ENV["FIFOSIZE"]?
Context.fifo_size = fifo_size.to_u32
end
if ARGV.size > 0
puts "Usage: benchmark-fifo"
puts ""
puts "envvar: REPORT_DIR=<directory> where to put the results"
puts "envvar: REPORT_EACH_ADD=<any> to report the duration of each addition of a value in the structure"
puts "envvar: NBVAL=<nb> (default: 100_000) nb of values to add to the structure"
puts "envvar: FIFOSIZE=<nb> (default: 10_000) max number of values in the structure"
exit 0
end
ENV["REPORT_DIR"]?.try { |report_dir| Context.report_dir = report_dir }
Dir.mkdir_p Context.report_dir
if ENV["REPORT_EACH_ADD"]?
FIFO(Int32).new(Context.fifo_size).tap do |fifo|
report_add fifo, Context.nb_values, "fifo_#{Context.fifo_size}_#{Context.nb_values}"
end
EfficientFIFO(Int32).new(Context.fifo_size).tap do |fifo|
report_add fifo, Context.nb_values, "efficientfifo_#{Context.fifo_size}_#{Context.nb_values}"
end
else
Benchmark.ips do |x|
x.report("adding #{Context.nb_values} values, FIFO limited to #{Context.fifo_size}") do
fifo = FIFO(Int32).new Context.fifo_size
add fifo, Context.nb_values
end
x.report("adding #{Context.nb_values} values, EfficientFIFO limited to #{Context.fifo_size}") do
fifo = EfficientFIFO(Int32).new Context.fifo_size
add fifo, Context.nb_values
end
end
end


@ -1,9 +1,7 @@
require "benchmark"
require "./db-ships.cr"
require "../src/dodb.cr"
require "./test-data.cr"
class DODBCached < DODB::CachedDataBase(Ship)
class DODBCached < DODB::Storage::Cached(Ship)
def initialize(storage_ext = "", remove_previous_data = true)
storage_dir = "test-storage#{storage_ext}"
@ -15,7 +13,7 @@ class DODBCached < DODB::CachedDataBase(Ship)
end
end
class DODBUnCached < DODB::DataBase(Ship)
class DODBUnCached < DODB::Storage::Uncached(Ship)
def initialize(storage_ext = "", remove_previous_data = true)
storage_dir = "test-storage#{storage_ext}"


@ -1,32 +0,0 @@
def perform_something(&block)
start = Time.monotonic
yield
Time.monotonic - start
end
def perform_benchmark_average(ntimes : Int32, &block)
i = 1
sum = Time::Span.zero
while i <= ntimes
elapsed_time = perform_something &block
sum += elapsed_time
i += 1
end
sum / ntimes
end
def perform_benchmark_average_verbose(title : String, ntimes : Int32, &block)
i = 1
sum = Time::Span.zero
puts "Execute '#{title}' × #{ntimes}"
while i <= ntimes
elapsed_time = perform_something &block
sum += elapsed_time
STDOUT.write "\relapsed_time: #{elapsed_time}, average: #{sum/i}".to_slice
i += 1
end
puts ""
puts "Average: #{sum/ntimes}"
end


@ -1,402 +0,0 @@
require "spec"
require "file_utils"
require "../src/dodb.cr"
require "./test-data.cr"
class DODB::SpecDataBase < DODB::CachedDataBase(Ship)
def initialize(storage_ext = "", remove_previous_data = true)
storage_dir = "test-storage#{storage_ext}"
if remove_previous_data
::FileUtils.rm_rf storage_dir
end
super storage_dir
end
end
describe "DODB::DataBase::Cached" do
describe "basics" do
it "store and get data" do
db = DODB::SpecDataBase.new
Ship.all_ships.each do |ship|
db << ship
end
db.to_a.sort.should eq(Ship.all_ships.sort)
end
it "rewrite already stored data" do
db = DODB::SpecDataBase.new
ship = Ship.all_ships[0]
key = db << ship
db[key] = Ship.new "broken"
db[key] = ship
db[key].should eq(ship)
end
it "properly remove data" do
db = DODB::SpecDataBase.new
Ship.all_ships.each do |ship|
db << ship
end
Ship.all_ships.each do |ship|
db.pop
end
Ship.all_ships.each_with_index do |ship, i|
# FIXME: Should it raise a particular exception?
expect_raises DODB::MissingEntry do
db[i]
end
db[i]?.should be_nil
end
end
it "preserves data on reopening" do
db1 = DODB::SpecDataBase.new
db1 << Ship.kisaragi
db1.to_a.size.should eq(1)
db2 = DODB::SpecDataBase.new remove_previous_data: false
db2 << Ship.mutsuki
# Only difference with DODB::DataBase: for now, concurrent DB cannot coexists.
db2.to_a.size.should eq(2)
end
it "iterates in normal and reversed order" do
db = DODB::SpecDataBase.new
Ship.all_ships.each do |ship|
db << ship
end
# The two #each test iteration.
db.each_with_index do |item, index|
item.should eq Ship.all_ships[index]
end
db.each_with_index(reversed: true) do |item, index|
item.should eq Ship.all_ships[index]
end
# Actual reversal is tested here.
db.to_a(reversed: true).should eq db.to_a.reverse
end
it "respects the provided offsets if any" do
db = DODB::SpecDataBase.new
Ship.all_ships.each do |ship|
db << ship
end
db.to_a(start_offset: 0, end_offset: 0)[0]?.should eq Ship.mutsuki
db.to_a(start_offset: 1, end_offset: 1)[0]?.should eq Ship.kisaragi
db.to_a(start_offset: 2, end_offset: 2)[0]?.should eq Ship.yayoi
db.to_a(start_offset: 0, end_offset: 2).should eq [
Ship.mutsuki, Ship.kisaragi, Ship.yayoi
]
end
end
describe "indices" do
it "do basic indexing" do
db = DODB::SpecDataBase.new
db_ships_by_name = db.new_index "name", &.name
Ship.all_ships.each do |ship|
db << ship
end
Ship.all_ships.each_with_index do |ship|
db_ships_by_name.get?(ship.name).should eq(ship)
end
end
it "raise on index overload" do
db = DODB::SpecDataBase.new
db_ships_by_name = db.new_index "name", &.name
db << Ship.kisaragi
# Should not be allowed to store an entry whose “name” field
# already exists.
expect_raises(DODB::IndexOverload) do
db << Ship.kisaragi
end
end
it "properly deindex" do
db = DODB::SpecDataBase.new
db_ships_by_name = db.new_index "name", &.name
Ship.all_ships.each do |ship|
db << ship
end
Ship.all_ships.each_with_index do |ship, i|
db.delete i
end
Ship.all_ships.each do |ship|
db_ships_by_name.get?(ship.name).should be_nil
end
end
it "properly reindex" do
db = DODB::SpecDataBase.new
db_ships_by_name = db.new_index "name", &.name
key = db << Ship.kisaragi
# We give the old id to the new ship, to get it replaced in
# the database.
some_new_ship = Ship.all_ships[2].clone
db[key] = some_new_ship
db[key].should eq(some_new_ship)
db_ships_by_name.get?(some_new_ship.name).should eq(some_new_ship)
end
it "properly updates" do
db = DODB::SpecDataBase.new
db_ships_by_name = db.new_index "name", &.name
Ship.all_ships.each do |ship|
db << ship
end
new_kisaragi = Ship.kisaragi.clone.tap do |s|
s.name = "Kisaragi Kai" # Don't think about it too much.
end
# We're changing an indexed value on purpose.
db_ships_by_name.update "Kisaragi", new_kisaragi
db_ships_by_name.get?("Kisaragi").should be_nil
db_ships_by_name.get?(new_kisaragi.name).should eq new_kisaragi
end
end
describe "partitions" do
it "do basic partitioning" do
db = DODB::SpecDataBase.new
db_ships_by_class = db.new_partition "class", &.klass
Ship.all_ships.each do |ship|
db << ship
end
Ship.all_ships.each do |ship|
db_ships_by_class.get(ship.klass).should contain(ship)
end
# We extract the possible classes to do test on them.
ship_classes = Ship.all_ships.map(&.klass).uniq
ship_classes.each do |klass|
partition = db_ships_by_class.get klass
# A partition on “class” should contain entries that all
# share the same value of “class”.
partition.map(&.klass.==(klass)).reduce { |a, b|
a && b
}.should be_true
end
db_ships_by_class.get("does-not-exist").should eq [] of Ship
end
it "removes select elements from partitions" do
db = DODB::SpecDataBase.new
db_ships_by_class = db.new_partition "class", &.klass
Ship.all_ships.each do |ship|
db << ship
end
db_ships_by_class.delete "Mutsuki", &.name.==("Kisaragi")
Ship.all_ships.map(&.klass).uniq.each do |klass|
partition = db_ships_by_class.get klass
partition.any?(&.name.==("Kisaragi")).should be_false
end
end
end
describe "tags" do
it "do basic tagging" do
db = DODB::SpecDataBase.new
db_ships_by_tags = db.new_tags "tags", &.tags
Ship.all_ships.each do |ship|
db << ship
end
db_ships_by_tags.get("flagship").should eq([Ship.flagship])
# All returned entries should have the requested tag.
db_ships_by_tags.get("name ship")
.map(&.tags.includes?("name ship"))
.reduce { |a, e| a && e }
.should be_true
# There shouldn't be one in our data about WWII Japanese warships…
db_ships_by_tags.get("starship").should eq([] of Ship)
end
it "properly removes tags" do
db = DODB::SpecDataBase.new
db_ships_by_tags = db.new_tags "tags", &.tags
Ship.all_ships.each do |ship|
db << ship
end
# Removing the “flagship” tag, brace for impact.
flagship, index = db_ships_by_tags.get_with_indices("flagship")[0]
flagship.tags = [] of String
db[index] = flagship
# ship, index = db_ships_by_tags.update(tag: "flagship") do |ship, index|
# ship.tags = [] of String
# db[index] = ship
# end
db_ships_by_tags.get("flagship").should eq([] of Ship)
end
it "gets items that have multiple tags" do
db = DODB::SpecDataBase.new
db_ships_by_tags = db.new_tags "tags", &.tags
Ship.all_ships.each do |ship|
db << ship
end
results = db_ships_by_tags.get(["flagship", "name ship"])
results.should eq([Ship.yamato])
results = db_ships_by_tags.get(["name ship", "flagship"])
results.should eq([Ship.yamato])
results = db_ships_by_tags.get(["flagship"])
results.should eq([Ship.yamato])
end
end
describe "atomic operations" do
it "safe_get and safe_get?" do
db = DODB::SpecDataBase.new
db_ships_by_name = db.new_index "name", &.name
Ship.all_ships.each do |ship|
db << ship
end
Ship.all_ships.each do |ship|
db_ships_by_name.safe_get ship.name do |results|
results.should eq(ship)
end
db_ships_by_name.safe_get? ship.name do |results|
results.should eq(ship)
end
end
end
end
describe "tools" do
it "rebuilds indexes" do
db = DODB::SpecDataBase.new
db_ships_by_name = db.new_index "name", &.name
db_ships_by_class = db.new_partition "class", &.klass
db_ships_by_tags = db.new_tags "tags", &.tags
Ship.all_ships.each do |ship|
db << ship
end
db.reindex_everything!
Ship.all_ships.each do |ship|
db_ships_by_name.get?(ship.name).should eq(ship)
db_ships_by_class.get(ship.klass).should contain(ship)
end
end
it "migrates properly" do
::FileUtils.rm_rf "test-storage-migration-origin"
old_db = DODB::DataBase(PrimitiveShip).new "test-storage-migration-origin"
old_ships_by_name = old_db.new_index "name", &.name
old_ships_by_class = old_db.new_partition "class", &.class_name
PrimitiveShip.all_ships.each do |ship|
old_db << ship
end
# At this point, the “old” DB is filled. Now we need to convert
# to the new DB.
new_db = DODB::SpecDataBase.new "-migration-target"
new_ships_by_name = new_db.new_index "name", &.name
new_ships_by_class = new_db.new_partition "class", &.klass
new_ships_by_tags = new_db.new_tags "tags", &.tags
old_db.each_with_index do |ship, index|
new_ship = Ship.new ship.name,
klass: ship.class_name,
id: ship.id,
tags: Array(String).new.tap { |tags|
tags << "name ship" if ship.name == ship.class_name
}
new_db[index] = new_ship
end
# At this point, the conversion is done, so… we're making a few
# arbitrary tests on the new data.
old_db.each_with_index do |old_ship, old_index|
ship = new_db[old_index]
ship.id.should eq(old_ship.id)
ship.klass.should eq(old_ship.class_name)
ship.tags.any?(&.==("name ship")).should be_true if ship.name == ship.klass
end
end
end
end

spec/db-cars.cr Normal file

@ -0,0 +1,104 @@
# This file contains all the necessary code to perform tests based on the following Car database.
require "json"
require "../src/dodb.cr"
require "./spec-database.cr"
class Car
include JSON::Serializable
property name : String # unique to each instance (1-1 relations)
property color : String | DODB::NoIndex # a simple attribute (1-n relations)
property keywords : Array(String) | DODB::NoIndex # tags about a car, example: "shiny" (n-n relations)
def_clone
def initialize(@name, @color, @keywords)
end
class_getter cars = [
Car.new("Corvet", "red", [ "shiny", "impressive", "fast", "elegant" ]),
Car.new("SUV", "red", [ "solid", "impressive" ]),
Car.new("Mustang", "red", [ "shiny", "impressive", "elegant" ]),
Car.new("Bullet-GT", "red", [ "shiny", "impressive", "fast", "elegant" ]),
Car.new("GTI", "blue", [ "average" ]),
Car.new("Deudeuch", "violet", [ "dirty", "slow", "only French will understand" ])
]
# Equality is true if every property is identical.
def ==(other : Car)
@name == other.name && @color == other.color && @keywords == other.keywords
end
# Ordering (used by `sort`) only compares names.
def <=>(other : Car)
@name <=> other.name
end
end
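# Illustration only, not part of the original file: a minimal sketch of why the
# specs can compare search results with `.sort.should eq`. `SketchCar` is a
# hypothetical stand-in for `Car`: `<=>` gives `Array#sort` an ordering, and `==`
# lets two arrays be compared element by element.
class SketchCar
  getter name : String
  getter color : String

  def initialize(@name, @color)
  end

  # Equality looks at every property.
  def ==(other : SketchCar)
    @name == other.name && @color == other.color
  end

  # Ordering only looks at the name.
  def <=>(other : SketchCar)
    @name <=> other.name
  end
end

a = [SketchCar.new("SUV", "red"), SketchCar.new("GTI", "blue")]
b = [SketchCar.new("GTI", "blue"), SketchCar.new("SUV", "red")]
puts a == b           # false: same cars, different order
puts a.sort == b.sort # true: sorting by name makes the order deterministic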
def ram_indexes(storage : DODB::Storage)
n = storage.new_RAM_index "name", &.name
c = storage.new_RAM_partition "color", &.color
k = storage.new_RAM_tags "keyword", &.keywords
return n, c, k
end
def cached_indexes(storage : DODB::Storage)
n = storage.new_index "name", &.name
c = storage.new_partition "color", &.color
k = storage.new_tags "keyword", &.keywords
return n, c, k
end
def uncached_indexes(storage : DODB::Storage)
n = storage.new_uncached_index "name", &.name
c = storage.new_uncached_partition "color", &.color
k = storage.new_uncached_tags "keyword", &.keywords
return n, c, k
end
# `max_indexes` limits the number of indexes (partitions and tags).
# Once the last index (db last_key/5) is above this value, the following
# cars will be neither tagged nor partitioned.
def add_cars(storage : DODB::Storage, nb_iterations : Int32, max_indexes = 5000)
last_key = ((storage.last_key + 1) / 5).to_i
i = 0
car1 = Car.new "Corvet", "red", [ "shiny", "impressive", "fast", "elegant" ]
car2 = Car.new "Bullet-GT", "blue", [ "shiny", "fast", "expensive" ]
car3 = Car.new "Deudeuche", "beige", [ "curvy", "sublime" ]
car4 = Car.new "Ford-5", "red", [ "unknown" ]
car5 = Car.new "C-MAX", "gray", [ "spacious", "affordable" ]
while i < nb_iterations
car1.name = "Corvet-#{last_key}"
car2.name = "Bullet-GT-#{last_key}"
car3.name = "Deudeuche-#{last_key}"
car4.name = "Ford-5-#{last_key}"
car5.name = "C-MAX-#{last_key}"
last_key += 1
if last_key > max_indexes
car1.color = DODB.no_index
car2.color = DODB.no_index
car3.color = DODB.no_index
car4.color = DODB.no_index
car5.color = DODB.no_index
car1.keywords = DODB.no_index
car2.keywords = DODB.no_index
car3.keywords = DODB.no_index
car4.keywords = DODB.no_index
car5.keywords = DODB.no_index
end
storage.unsafe_add car1.clone
storage.unsafe_add car2.clone
storage.unsafe_add car3.clone
storage.unsafe_add car4.clone
storage.unsafe_add car5.clone
i += 1
#STDOUT.write "\radding value #{i}".to_slice
end
#puts ""
end
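# Illustration only, not part of the original file: a worked example of the
# name suffix computed at the top of `add_cars`. It assumes `last_key` is -1 for
# an empty storage and grows by one per stored car, which is consistent with
# "Corvet-0" being the first car added to an empty database in the specs.
# `name_suffix` is a hypothetical helper.
def name_suffix(last_key : Int32) : Int32
  # `/` on integers yields a Float64 in Crystal; `to_i` truncates it.
  ((last_key + 1) / 5).to_i
end

puts name_suffix(-1) # 0: empty storage, first batch is "Corvet-0", "Bullet-GT-0", ...
puts name_suffix(4)  # 1: five cars stored, second batch is "Corvet-1", ...
puts name_suffix(9)  # 2: ten cars stored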


@ -1,6 +1,9 @@
require "uuid"
require "json"
require "../src/dodb.cr"
require "./spec-database.cr"
# FIXME: Split the test data in separate files. We dont care about those here.
class Ship
@ -85,24 +88,3 @@ class PrimitiveShip
@@asakaze
]
end
class Car
include JSON::Serializable
property name : String # unique to each instance (1-1 relations)
property color : String # a simple attribute (1-n relations)
property keywords : Array(String) # tags about a car, example: "shiny" (n-n relations)
def_clone
def initialize(@name, @color, @keywords)
end
class_getter cars = [
Car.new("Corvet", "red", [ "shiny", "impressive", "fast", "elegant" ]),
Car.new("SUV", "red", [ "solid", "impressive" ]),
Car.new("Mustang", "red", [ "shiny", "impressive", "elegant" ]),
Car.new("Bullet-GT", "red", [ "shiny", "impressive", "fast", "elegant" ]),
Car.new("GTI", "blue", [ "average" ]),
Car.new("Deudeuch", "violet", [ "dirty", "slow", "only French will understand" ])
]
end

51
spec/spec-database.cr Normal file

@ -0,0 +1,51 @@
class SPECDB::Uncached(V) < DODB::Storage::Uncached(V)
property storage_dir : String
def initialize(storage_ext = "", remove_previous_data = true)
@storage_dir = "specdb-storage-uncached#{storage_ext}"
::FileUtils.rm_rf storage_dir if remove_previous_data
super storage_dir
end
def rm_storage_dir
::FileUtils.rm_rf @storage_dir
end
end
class SPECDB::Cached(V) < DODB::Storage::Cached(V)
property storage_dir : String
def initialize(storage_ext = "", remove_previous_data = true)
@storage_dir = "specdb-storage-cached#{storage_ext}"
::FileUtils.rm_rf storage_dir if remove_previous_data
super storage_dir
end
def rm_storage_dir
::FileUtils.rm_rf @storage_dir
end
end
class SPECDB::Common(V) < DODB::Storage::Common(V)
property storage_dir : String
def initialize(storage_ext = "", @max_entries : UInt32 = 5_000, remove_previous_data = true)
@storage_dir = "specdb-storage-common-#{@max_entries}#{storage_ext}"
::FileUtils.rm_rf storage_dir if remove_previous_data
super storage_dir, max_entries
end
def rm_storage_dir
::FileUtils.rm_rf @storage_dir
end
end
class SPECDB::RAMOnly(V) < DODB::Storage::RAMOnly(V)
property storage_dir : String
def initialize(storage_ext = "", remove_previous_data = true)
@storage_dir = "specdb-storage-ram#{storage_ext}"
::FileUtils.rm_rf storage_dir if remove_previous_data
super storage_dir
end
def rm_storage_dir
::FileUtils.rm_rf @storage_dir
end
end

103
spec/test-cars.cr Normal file

@ -0,0 +1,103 @@
require "spec"
require "./db-cars.cr"
corvet0 = Car.new "Corvet-0", "red", [ "shiny", "impressive", "fast", "elegant" ]
describe "uncached, cached and ram indexes" do
it "RAM DB - add items, add indexes, search, reindex, search" do
cars_ram0 = SPECDB::RAMOnly(Car).new "-0"
cars_ram1 = SPECDB::RAMOnly(Car).new "-1"
cars_ram2 = SPECDB::RAMOnly(Car).new "-2"
add_cars cars_ram0, 1
add_cars cars_ram1, 1
add_cars cars_ram2, 1
uncached_searchby_name, uncached_searchby_color, uncached_searchby_keywords = uncached_indexes cars_ram0
cached_searchby_name, cached_searchby_color, cached_searchby_keywords = cached_indexes cars_ram1
ram_searchby_name, ram_searchby_color, ram_searchby_keywords = ram_indexes cars_ram2
uncached_searchby_name.get?("Corvet-0").should be_nil
cached_searchby_name.get?("Corvet-0").should be_nil
ram_searchby_name.get?("Corvet-0").should be_nil
cars_ram0.reindex_everything!
cars_ram1.reindex_everything!
cars_ram2.reindex_everything!
# The value can be retrieved even though it was never written to disk, because the index itself was written to disk.
# The database provides the value; the index only reads the value's key.
uncached_searchby_name.get?("Corvet-0").should eq corvet0
# Both cached and RAM indexes can retrieve the value since they store the key.
cached_searchby_name.get?("Corvet-0").should eq corvet0
ram_searchby_name.get?("Corvet-0").should eq corvet0
cars_ram0.rm_storage_dir
cars_ram1.rm_storage_dir
cars_ram2.rm_storage_dir
end
end
describe "tracking inconsistencies between implementations" do
it "index - partitions - tags" do
cars_ram0 = SPECDB::RAMOnly(Car).new "-0"
cars_ram1 = SPECDB::RAMOnly(Car).new "-1"
cars_ram2 = SPECDB::RAMOnly(Car).new "-2"
cars_fifo = SPECDB::Common(Car).new "-2", 5
uncached_searchby_name, uncached_searchby_color, uncached_searchby_keywords = uncached_indexes cars_ram0
cached_searchby_name, cached_searchby_color, cached_searchby_keywords = cached_indexes cars_ram1
ram_searchby_name, ram_searchby_color, ram_searchby_keywords = ram_indexes cars_ram2
fifo_cached_searchby_name, fifo_cached_searchby_color, fifo_cached_searchby_keywords = cached_indexes cars_fifo
add_cars cars_ram0, 1
add_cars cars_ram1, 1
add_cars cars_ram2, 1
add_cars cars_fifo, 1
# Searches should be consistent between all implementations of basic indexes, partitions and tags.
# Basic index.
uncached_corvet_car = uncached_searchby_name.get? "Corvet-0"
cached_corvet_car = cached_searchby_name.get? "Corvet-0"
ram_corvet_car = ram_searchby_name.get? "Corvet-0"
fifo_cached_corvet_car = fifo_cached_searchby_name.get? "Corvet-0"
uncached_corvet_car.should eq cached_corvet_car
uncached_corvet_car.should eq ram_corvet_car
uncached_corvet_car.should eq fifo_cached_corvet_car
uncached_corvet_car.should eq corvet0
# Partitions.
red_cars = [ Car.new("Corvet-0", "red", [ "shiny", "impressive", "fast", "elegant" ]),
Car.new("Ford-5-0", "red", [ "unknown" ])
]
uncached_red_cars = uncached_searchby_color.get? "red"
cached_red_cars = cached_searchby_color.get? "red"
ram_red_cars = ram_searchby_color.get? "red"
fifo_cached_red_cars = fifo_cached_searchby_color.get? "red"
uncached_red_cars.sort.should eq cached_red_cars.sort
uncached_red_cars.sort.should eq ram_red_cars.sort
uncached_red_cars.sort.should eq fifo_cached_red_cars.sort
uncached_red_cars.sort.should eq red_cars.sort
# Tags.
fast_cars = [ Car.new("Corvet-0", "red", [ "shiny", "impressive", "fast", "elegant" ]),
Car.new("Bullet-GT-0", "blue", [ "shiny", "fast", "expensive" ])
]
uncached_fast_cars = uncached_searchby_keywords.get? "fast"
cached_fast_cars = cached_searchby_keywords.get? "fast"
ram_fast_cars = ram_searchby_keywords.get? "fast"
fifo_cached_fast_cars = fifo_cached_searchby_keywords.get? "fast"
uncached_fast_cars.sort.should eq cached_fast_cars.sort
uncached_fast_cars.sort.should eq ram_fast_cars.sort
uncached_fast_cars.sort.should eq fifo_cached_fast_cars.sort
uncached_fast_cars.sort.should eq fast_cars.sort
cars_ram0.rm_storage_dir
cars_ram1.rm_storage_dir
cars_ram2.rm_storage_dir
cars_fifo.rm_storage_dir
end
end

33
spec/test-common.cr Normal file

@ -0,0 +1,33 @@
require "spec"
require "./db-cars.cr"
describe "SPECDB::Common" do
it "basics, 3 values" do
car0 = Car.new "Corvet-0", "red", [] of String
car1 = Car.new "Corvet-1", "red", [] of String
car2 = Car.new "Corvet-2", "red", [] of String
car3 = Car.new "Corvet-3", "red", [] of String
db = SPECDB::Common(Car).new "", 3
db.data.keys.sort.should eq([] of Int32)
db << car0
db.data.keys.sort.should eq([0] of Int32)
db << car1
db.data.keys.sort.should eq([0, 1] of Int32)