Overview
What is RoStore
RoStore is a simple, Java-native, off-heap key-value data store that uses memory mapping to store and access its data.
Keys and values can be arbitrary binary data, e.g. a string or an image; the client decides how to interpret them.
In a managed environment such as the JVM, all objects are managed by the virtual machine. This makes programming easier, but it adds overhead when handling massive amounts of data. That is why all data in RoStore is kept off-heap and is instantiated as Java objects only when explicitly requested.
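To illustrate the off-heap idea in general terms, here is a minimal sketch that uses only the standard `java.nio` API. It is not RoStore's API: the class `OffHeapSketch` and its methods are hypothetical names chosen for this example. The value is kept in a direct buffer, which typically lives outside the garbage-collected heap, and a Java `String` is created only when the caller asks for one.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration, not part of RoStore.
class OffHeapSketch {
    // Copies the bytes into a direct buffer; no Java object per value is retained.
    static ByteBuffer storeOffHeap(byte[] value) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(value.length);
        buffer.put(value);
        buffer.flip(); // switch the buffer from writing to reading
        return buffer;
    }

    // Materializes a Java String only when it is explicitly requested.
    static String readAsString(ByteBuffer buffer) {
        byte[] bytes = new byte[buffer.remaining()];
        buffer.duplicate().get(bytes); // duplicate() keeps the original position untouched
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ByteBuffer stored = storeOffHeap("hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(stored.isDirect());
        System.out.println(readAsString(stored));
    }
}
```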
RoStore provides a variety of integration patterns that offer different levels of developer engagement with the technology, from simple value-by-key requests to more fine-grained data management.
For example, the RoStore REST service only allows UTF-8 strings as keys to keep its usage clean, whereas an application that integrates RoStore as a Java module can work with keys of any kind.
RoStore is designed for high-frequency data access. The data is split into many independent shards (fragments), which lets concurrent processes read and modify the data while minimizing the blocking effects of parallel writes and reads.
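The general benefit of sharding for concurrency can be sketched as follows. This is an assumption-laden illustration and not RoStore's implementation: the class `ShardedMapSketch`, the hash-based shard selection, and the on-heap `HashMap` per shard are all made up for this example. The point is only that each shard has its own lock, so operations on different shards do not block one another.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration of per-shard locking, not part of RoStore.
class ShardedMapSketch {
    private final List<Map<String, byte[]>> shards = new ArrayList<>();
    private final List<ReadWriteLock> locks = new ArrayList<>();

    ShardedMapSketch(int shardCount) {
        for (int i = 0; i < shardCount; i++) {
            shards.add(new HashMap<>());
            locks.add(new ReentrantReadWriteLock());
        }
    }

    // Assumption: a key is assigned to a shard by its hash code.
    private int shardOf(String key) {
        return Math.floorMod(key.hashCode(), shards.size());
    }

    void put(String key, byte[] value) {
        int i = shardOf(key);
        locks.get(i).writeLock().lock(); // blocks only this shard
        try {
            shards.get(i).put(key, value.clone());
        } finally {
            locks.get(i).writeLock().unlock();
        }
    }

    byte[] get(String key) {
        int i = shardOf(key);
        locks.get(i).readLock().lock(); // readers of one shard can proceed in parallel
        try {
            byte[] value = shards.get(i).get(key);
            return value == null ? null : value.clone();
        } finally {
            locks.get(i).readLock().unlock();
        }
    }
}
```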
Memory Mapping
RoStore is built to exploit the benefits of memory mapping, and its entire internal structure is designed around this approach to accessing data on operating systems that support virtual memory.
Memory mapping is a hardware-supported technique, implemented in modern operating systems, that efficiently maps physical disk blocks directly to memory pages with support from the CPU.
Instead of reading and writing the data through explicit random or sequential I/O operations, RoStore uses the CPU hardware to map the data into memory. See the Wikipedia articles on the topic for more information.
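In Java, memory mapping is available through the standard `FileChannel.map` call. The following minimal sketch shows the mechanism itself, not RoStore's internals; the class name `MappedFileSketch`, the temporary file, and the 4096-byte region size are assumptions made for this example.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration of the standard Java mapping API, not part of RoStore.
class MappedFileSketch {
    // Maps the first `size` bytes of the file into memory for reading and writing.
    static MappedByteBuffer map(Path file, int size) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // The mapping remains valid after the channel is closed.
            return channel.map(FileChannel.MapMode.READ_WRITE, 0, size);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("mapped-sketch", ".bin");
        file.toFile().deleteOnExit();
        MappedByteBuffer buffer = map(file, 4096);
        buffer.putLong(0, 42L); // write through the mapping, without an explicit write() call
        buffer.force();         // ask the OS to flush the changed pages to the file
        System.out.println(buffer.getLong(0));
    }
}
```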
Why is this so good?
Modern CPUs have hardware support for virtual memory and memory mapping, and a modern program uses it anyway during any read-modify-write cycle.
If the application takes no part in managing this memory mapping, the result can be very inefficient, because the CPU knows nothing about the program's intentions.
Consider a classical data modification. It involves the following steps:
- reserve memory;
- read the data to the reserved memory from the original file location;
- modify the data in the reserved memory;
- write the data from the reserved memory block to the original file location;
- free the reserved memory.
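The steps above can be sketched in Java with explicit read and write calls. This is an illustrative example only; the class `ClassicUpdateSketch` and its method are hypothetical names, and "modify" is reduced to incrementing the first byte of the block.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Path;

// Hypothetical illustration of the classical read-modify-write cycle.
class ClassicUpdateSketch {
    // Increments the first byte of the block that starts at blockOffset.
    static void incrementFirstByte(Path file, long blockOffset, int blockSize) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            byte[] block = new byte[blockSize]; // reserve memory
            raf.seek(blockOffset);
            raf.readFully(block);               // read the data from the file location
            block[0]++;                         // modify the data in the reserved memory
            raf.seek(blockOffset);
            raf.write(block);                   // write the data back to the file location
        }
        // The reserved array is now unreachable and is freed by the garbage collector.
    }
}
```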
The "reserve memory" step above eventually maps the newly reserved virtual memory page to the swap file (virtual memory and the CPU support all of this). The read operation then transfers the data from the hard drive to this swap-backed virtual memory page, which stays mapped to its swap-file location until the memory is freed. When the write operation happens afterwards, it transfers the data from that memory page or swap file to the write location.
RoStore essentially eliminates the intermediate relocation of data to a swap-backed page by mapping the data block from the storage itself directly into virtual memory. The OS and CPU therefore do not have to move the data around and can work on it in place.
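For comparison with the classical cycle, an in-place modification through a mapped region can be sketched as below. This uses only the standard `FileChannel.map` API and is not RoStore's code; the class `InPlaceUpdateSketch` and its method are hypothetical names for this example.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration of modifying file data in place through a mapping.
class InPlaceUpdateSketch {
    // Increments the first byte of the block that starts at blockOffset.
    static void incrementFirstByte(Path file, long blockOffset, int blockSize) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Map the block of the file itself; no separate buffer is filled by a read call.
            MappedByteBuffer block =
                    channel.map(FileChannel.MapMode.READ_WRITE, blockOffset, blockSize);
            block.put(0, (byte) (block.get(0) + 1)); // modify the data in the mapped page
            block.force();                           // optionally flush the change to the file now
        }
    }
}
```

No explicit read or write call appears here: the OS pages the block in on first access and writes the modified page back to the same file location.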
By organizing the data in pages and mapping them into memory, RoStore works much closer to the way the hardware natively handles data.
Additionally, RoStore is a Java-native module, which makes it attractive for Java developers and generally portable, because it relies on the memory-mapping support of the CPU.