Show HN: Bitemporal, Binary JSON Database System and Event Store https://ift.tt/Kr2uvaM
I posted the project a couple of years ago and it gained some interest, but a lot has been done since then: performance improvements, a completely new JSON store, a REST API, various refactored internals, an improved JSONiq-based query engine that allows updates and implements set-oriented join optimizations, a (by now somewhat dated) web UI, a new Kotlin-based CLI, and Python and TypeScript clients to ease the use of Sirix... The first prototypes of a precursor date back to 2005.

So, what is it all about?

The system borrows ideas from ZFS (a keyed index trie, checksums stored in parent pages...) and Git (a persistent index structure that shares unchanged pages between revisions), but appends a new tree root on each commit [1][2]; a toy sketch of this idea is appended below, just before the links. It is a JSON DBS that stores fine-granular JSON nodes, so there is almost no limit to the structure and size of an object: objects can be arbitrarily nested, and updates are cheap.

At a high level it supports space-efficient snapshots, tracking of changes by author with optional commit messages, time-travel queries, reverting to previous revisions (while all revisions in between remain available for audits...), and retrieving the changes of whole (sub)trees. It is thus a bitemporal DBS on the one hand, and on the other hand it can be used as a simple event store: it stores the state after an event or change occurs and tracks the changes. An entity, i.e. a node in the JSON structure, can be updated to new values and eventually removed, while the history remains easily retrievable and we can easily revert to a previous state. The system assigns a unique ID to each new node, which never changes and is never reused (even after the node has been deleted). Thus, the system stores both the state after the change/event and the event itself (the change event).

The leaf pages of the index structures are not simply copied during a write; instead, a sliding-window algorithm is applied so that only modified nodes, plus nodes that would fall out of the window, have to be written (the window length is configurable; see the second sketch below). This avoids the write peaks that full snapshots would cause, as well as having to read long chains of incremental changes in between. The system is therefore best suited for fast flash drives with fast random reads and sequential writes. Data is never overwritten, so audit trails come for free.

Another aspect is that the system does not need a WAL (which is basically a second data store), thanks to atomic switches of the root index page and a single permitted read/write transaction (txn) running concurrently and in parallel with N read-only txns, each bound to a specific revision at its start. Reads do not involve any locks.[2]

A path summary, an unordered set of all paths to leaf nodes in the tree, is built and enables various optimizations. Furthermore, a rolling hash is optionally built, whereby all ancestor node hashes are adapted during inserts (see the third sketch below).

A dated Jupyter notebook with some examples can be found in [3], and overall documentation in [4]. The query engine Brackit[5] is retargetable (a couple of interfaces and rewrite rules have to be implemented for a DB system); in particular, it finds implicit joins and applies well-known algorithms from the relational DB world to optimize joins and aggregate functions, thanks to the set-oriented processing of its operators.[6]

I gave an interview in [7], but I'm usually very nervous, so don't judge too harshly. Give it a try, and happy coding!
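First sketch: a minimal, purely illustrative Python picture of the Git/ZFS-style idea mentioned above, in which a persistent keyed index trie gets a new root appended per commit while all unchanged pages are shared with older revisions. The fan-out, trie height, key width, and names are assumptions for illustration only, not SirixDB internals.

    # Toy persistent keyed index trie with path copying: each "commit" produces
    # a new root, copies only the pages on the path to the changed leaf, and
    # shares every other page with older revisions.

    FANOUT_BITS = 4                      # 16 slots per page (assumed)
    LEVELS = 4                           # trie height -> 16-bit keys (assumed)
    MASK = (1 << FANOUT_BITS) - 1
    EMPTY_PAGE = (None,) * (1 << FANOUT_BITS)

    def get(root, key):
        page = root
        for level in reversed(range(LEVELS)):
            if page is None:
                return None
            page = page[(key >> (level * FANOUT_BITS)) & MASK]
        return page                      # the leaf slot holds the record itself

    def put(root, key, record):
        """Return a new root; only LEVELS pages are copied, the rest is shared."""
        def rec(page, level):
            page = page if page is not None else EMPTY_PAGE
            idx = (key >> (level * FANOUT_BITS)) & MASK
            child = record if level == 0 else rec(page[idx], level - 1)
            return page[:idx] + (child,) + page[idx + 1:]
        return rec(root, LEVELS - 1)

    # Every commit appends a new root; old roots stay readable (time travel).
    revisions = [EMPTY_PAGE]
    revisions.append(put(revisions[-1], 42, {"msg": "hello"}))    # revision 1
    revisions.append(put(revisions[-1], 42, {"msg": "updated"}))  # revision 2
    assert get(revisions[1], 42) == {"msg": "hello"}
    assert get(revisions[2], 42) == {"msg": "updated"}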
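Second sketch: a simplified picture of the sliding-window versioning of leaf pages. Each revision writes a page fragment holding the nodes changed in that revision plus unchanged nodes whose newest copy would otherwise fall out of the window, so a full page can always be rebuilt from at most WINDOW_SIZE fragments. The WINDOW_SIZE value, the dict-based "fragments", and the lack of deletion tombstones are simplifying assumptions, not the actual on-disk format.

    # Sliding-window page versioning: no periodic full snapshots, no
    # arbitrarily long chains of increments to read.

    WINDOW_SIZE = 4                      # assumed; configurable in the real system

    def write_fragment(history, changed, full_page):
        """history: list of fragments (node_id -> value), oldest first."""
        fragment = dict(changed)
        recent = history[-(WINDOW_SIZE - 1):] if WINDOW_SIZE > 1 else []
        recently_written = {nid for frag in recent for nid in frag}
        for node_id, value in full_page.items():
            # Re-write untouched nodes that would otherwise only exist in
            # fragments older than the window.
            if node_id not in fragment and node_id not in recently_written:
                fragment[node_id] = value
        history.append(fragment)
        return fragment

    def read_page(history):
        """Rebuild the current page from at most WINDOW_SIZE fragments."""
        page = {}
        for fragment in history[-WINDOW_SIZE:]:
            page.update(fragment)        # newer fragments win
        return page

    # Node "a" is written once, then left untouched; it gets re-written
    # automatically before it would leave the window.
    history, page = [], {}
    for rev in range(1, 8):
        changed = {"a": 1} if rev == 1 else {f"n{rev}": rev}
        page.update(changed)
        write_fragment(history, changed, page)
    assert read_page(history) == page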
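Third sketch: a toy illustration of the optionally maintained rolling hashes, where each node's hash combines its own content with its descendants' hashes so that an insert only needs to adapt the hashes along the ancestor chain (O(depth)) instead of rehashing whole subtrees. The additive scheme and all names here are assumptions, not SirixDB's exact hashing.

    # Incrementally adapted ancestor hashes on insert (simplified).
    import hashlib

    MOD = 1 << 64

    def content_hash(value) -> int:
        digest = hashlib.sha256(repr(value).encode()).digest()
        return int.from_bytes(digest[:8], "big")

    class Node:
        def __init__(self, value, parent=None):
            self.value = value
            self.parent = parent
            self.children = []
            # rolling hash = own content hash + sum of descendants' content hashes
            self.rolling = content_hash(value)

    def insert(parent, value):
        child = Node(value, parent)
        parent.children.append(child)
        delta = child.rolling
        ancestor = parent
        while ancestor is not None:      # adapt ancestors only, O(depth)
            ancestor.rolling = (ancestor.rolling + delta) % MOD
            ancestor = ancestor.parent
        return child

    root = Node("root")
    child = insert(root, "child")
    insert(child, "grandchild")
    expected = sum(content_hash(v) for v in ("root", "child", "grandchild")) % MOD
    assert root.rolling == expected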
Kind regards
Johannes

[1] https://sirix.io | https://ift.tt/tQTvHjJ
[2] https://ift.tt/kbGmaoJ
[3] https://ift.tt/SJdZEL1
[4] https://sirix.io/docs/
[5] http://brackit.io
[6] https://ift.tt/BnS9xUt
[7] https://youtu.be/Ee-5ruydgqo?si=Ift73d49w84RJWb2