diff --git a/paper/paper.ms b/paper/paper.ms index 878d001..9caa88f 100644 --- a/paper/paper.ms +++ b/paper/paper.ms @@ -1063,17 +1063,14 @@ These new triggers could record user-defined procedures to perform database veri .BULLET .B Isolation is partially taken into account with a locking mechanism preventing race conditions when modifying a value. -This may be seen as simplistic but -.SHINE "good enough" -for most applications. +This may be seen as simplistic but good enough for most applications. .BULLET .B Durability is taken into account. Data is written on disk each time it changes. -Again, this is basic but -.SHINE "good enough" -for most applications. +Data checksums are delegated to the filesystem or external tools. +Again, this is basic but good enough for most applications. A future improvement could be to write a checksum for every file to detect corrupt data, but this overlaps with some filesystems which already provide this feature. .ENDBULLET @@ -1149,6 +1146,8 @@ Traditional databases can be managed through command lines or a dedicated shell, DODB cannot, for the very same reason it came into existence: enabling this kind of tooling implies an enormous amount of code and complexity, obfuscating core database operations that should be both understandable and customizable. .KE .ENDBULLET + +In conclusion, the "missing" features are either irrelevant in the context of DODB or simple enough to implement and customize to one's needs. . .SS "The state of file systems, their limitations and useful features for DODB instances" A @@ -1184,7 +1183,7 @@ called the files are on the disk, their size, the last time they were modified or accessed and other metadata. A filesystem is split into a list of .I blocks -of a certain size\*[*] (4 kilobytes by default). +of a certain size\*[*] (4 kilobytes by default on ext4). .FOOTNOTE1 Working only with blocks (from 0 to x) is called .dq "Logical block addressing" . 
@@ -1243,7 +1242,7 @@ So, worst case scenario, data rate is .FRAC 1 4000 (huge waste) meaning that a 1GB of data would require an entire 4TB hard drive\*[*] (without even taking the inodes' size into account). .FOOTNOTE1 -Ext4 can integrate up to 60 bytes of data into an inode. +To slightly mitigate this, ext4 can integrate up to 60 bytes of data into an inode. .FOOTNOTE2 .KS @@ -1286,7 +1285,7 @@ Filesystems can also be distributed with some replication in order to provide a .KS .B "UnionFS" . -UnionFS (and its variants) is a filesystem enabling several filesystems to be mounted on the same mount-point and to show superposed contents, enabling a read-only base image to be used together with persistent data for a specific instance. +UnionFS (and its variants) is a filesystem enabling several filesystems to be mounted on the same mount-point and to show overlapping contents, enabling a read-only base image to be used together with persistent data for a specific instance. This way, a .dq "live-cd image" for an operating system can become persistent by storing modifications on an usb stick. @@ -1294,8 +1293,8 @@ for an operating system can become persistent by storing modifications on an usb UnionFS is a copy-on-write snapshotting filesystem on top of other filesystems. Docker uses it to save space. -Docker provides different ready-to-run software as small virtual machines. -To preserve storage space, a base OS image is shared amongst all instances and each instance only stores its own specific files (binaries, configuration and dependencies) written in a separate storage volume. +Since Docker provides software as ready-to-run containers, a base OS image is shared amongst all instances, so each instance only stores its own specific files (binaries, configuration and dependencies), written in a separate storage volume.
+Thus, despite each software distribution requiring an entire operating system environment, storage usage stays reasonable. .KS .B "Archivemount" . @@ -1314,8 +1313,8 @@ creates a block file based on a chunk of RAM that needs to be formated then moun .B ramfs mounts directly a RAM-based filesystem, without the need to format a fake partition. Finally, -.B tmpfs -is the more flexible one, it is used as ramfs but can be resized and only uses a necessary amount of RAM at a given point (memory is free'd once a file is removed). +.B tmpfs , +the most flexible option, is used like ramfs but can be resized and only uses the amount of RAM it needs at a given point, since memory is freed once a file is removed. .FOOTNOTE2 .KS @@ -1327,6 +1326,12 @@ As a side effect, searching for a file in this context can be done by computing Well well well… doesn't that sound like the DODB tag triggers? As if databases and filesystems were intertwined somehow… .FOOTNOTE2 + +.KS +.B "And many more" ! +Other specific filesystems may not be as widespread as the ones mentioned above, but they exist and are as exotic as the constraints in which they evolve. +.KE +. .KS .SSS "Quick comparison between DBMSs and filesystems" The following table shows the proximity between famous database systems and ordinary filesystems, both sharing a lot of features despite very different approaches. @@ -1345,19 +1350,20 @@ allbox tab(:); c | c | c cw(\n[col1]u) | lw(\n[col2]u) | lw(\n[col3]u).
Feature : DBMS : Filesystems -CRUD operations : SQL :files & directories +CRUD operations : \*[OK] SQL :\*[OK] files & directories Atomicity : \*[OK] :T{ -locking mechanism based on files +\*[OK] locking mechanism based on files T} -Consistency : \*[OK] : \*[NOK] besides very specific filesystems +Consistency : \*[OK] :\*[OK] in specific filesystems (the kernel-related ones for example) Isolation : \*[OK] :T{ +\*[OK] .dq "new file then mv" technique\*[*] T} -Durability : \*[OK] :limited (checksums) +Durability : \*[OK] :\*[OK] checksums Access Time : 0.1 to 2ms :a few µs (cache) to a few ms (first access with a hard disk) High avail. : \*[OK] :T{ -RAID & variants plus many distributed or cluster filesystems +\*[OK] RAID & variants plus many distributed or cluster filesystems T} Transactions : \*[OK] :T{ \*[OK] in a few filesystems (BTRFS, ZFS) @@ -1366,13 +1372,14 @@ Replication : \*[OK] :T{ \*[OK] in many filesystems (BTRFS, ZFS, ClusterFS, etc.) T} Performance : \*[OK] :T{ -B-trees and variants (used in all modern FS: BTRFS, ext4, Raiserfs4, NTFS, HAMMER…) are used to search data on the storage device but also to get an entry in a huge directory. +\*[OK] B-trees and variants (used in all modern FS: BTRFS, ext4, Reiser4, NTFS, HAMMER…) are used to search data on the storage device but also to get an entry in a huge directory T} Space waste :T{ -almost none +.ps -2 +\*[OK] almost none .ps T}:T{ -depends on many factors, but generally important on small data +\*[NOK] generally significant on small data (there is room for improvement), which is why merely mimicking relational databases doesn't work well with the inner workings of current filesystems, whereas document-oriented databases (keeping a whole set of related data in a single file) make sense T} .TE .FOOTNOTE1 @@ -1383,12 +1390,12 @@ However, considering a data management library, this method to ensure data integ This table shows an overview of some (mostly shared) DBMSs and filesystems features.
Real deployments may involve a whole range of tools, including a mix of both of these solutions. -For example, key-value databases can be used as DBMSs' cache to massively speed data retrieval up. +For example, key-value databases are often used as DBMSs' cache to massively speed up data retrieval. The main difference between DBMSs and filesystems is the .I consistency property. -Filesystems are almost exclusively built to store undefined streams of data with a very wide range of different shapes (plain text, multimedia, documents, etc.) and sizes (from empty to multiple terabytes and more), thus no consistency verification can be reasonably implemented. +Filesystems are almost exclusively built to store undefined streams of data with a very wide range of different shapes (plain text, multimedia, documents, etc.) and sizes (from empty to multiple terabytes and more), so no consistency verification can reasonably be implemented outside very specific contexts (such as kernel-related filesystems). . . .KS