Add a smarter JournalIndex implementation
Description
is duplicated by
Activity

Robert Varga March 22, 2024 at 6:20 PM
So "JournalIndex" really is "SegmentIndex" and it records locations of entries as they are appended.
That means we can have a "zero-heap" index when we use StorageLevel.MAPPED: in that case the entire file is mapped into memory, so we can use it as normal memory, which raises the question: can we persist the index as well?
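For illustration, a minimal sketch of the "use it as normal memory" part, assuming the 8-byte entry header starts with a 4-byte length field (the class and method names here are hypothetical, not atomix-storage API):

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch: with StorageLevel.MAPPED the whole segment file is a
// MappedByteBuffer, so entry headers can be read without copying onto the heap.
final class MappedSegmentReader {
    private final MappedByteBuffer buffer;

    MappedSegmentReader(final Path segmentFile) throws IOException {
        try (FileChannel channel = FileChannel.open(segmentFile, StandardOpenOption.READ)) {
            buffer = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
        }
    }

    // Read the (assumed) 4-byte length field of the entry header at a file position.
    int entryLength(final int position) {
        return buffer.getInt(position);
    }
}
```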
tracks that. The JournalIndex implementation can assume it is cheap to traverse non-cached entries.
atomix-storage has a SparseJournalIndex implementation of JournalIndex, which retains every 1000th entry (by default) in the index.
While this works reasonably well for small writes, it provides little benefit when entry sizes are relatively large compared to maxSegmentSize.
This trade-off is more or less forced by TreeMap, as a TreeMap.Entry costs 32-64 bytes. On top of that we need 2x16-32 bytes for a boxed Integer and Long (as they will not be efficiently interned). That means we are using 64-128 bytes to hold what amounts to 12 bytes of data.
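For reference, the TreeMap-backed approach has roughly this shape (an illustrative sketch, not the actual SparseJournalIndex code); every retained entry carries a TreeMap.Entry plus a boxed Long key and Integer value:

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch of a TreeMap-backed sparse index: only every Nth entry
// is retained, but each retained mapping costs a TreeMap.Entry plus boxed
// Long/Integer objects on the heap.
final class TreeMapSparseIndex {
    private final TreeMap<Long, Integer> positions = new TreeMap<>();
    private final int sparsenessFactor;

    TreeMapSparseIndex(final int sparsenessFactor) {
        this.sparsenessFactor = sparsenessFactor;
    }

    // Record an appended entry, keeping only every Nth one.
    void index(final long index, final int position) {
        if (index % sparsenessFactor == 0) {
            positions.put(index, position);
        }
    }

    // Floor lookup: the closest retained entry at or below the requested index.
    Map.Entry<Long, Integer> lookup(final long index) {
        return positions.floorEntry(index);
    }

    // Forget everything above the given index.
    void truncate(final long index) {
        positions.tailMap(index, false).clear();
    }
}
```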
There is a bit more efficiency possible: we have a JournalIndex per segment, each segment is limited to 2 GiB, and each entry has at least an 8-byte header and a 1-byte value, therefore there cannot be more than 238,609,294 entries in a file. So instead of storing an 8-byte index, we can store the segment's first index and use relative addressing. A relative index requires 28 bits, i.e. it can be stored in 4 bytes with potentially 4 bits left over for other things.
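A possible encoding of that idea (a sketch only, the class and its methods are hypothetical): the 28-bit relative index and up to 4 spare flag bits packed into a single int:

```java
// Sketch of relative addressing: a segment holds at most 238,609,294 entries,
// so (index - firstIndex) fits into 28 bits, leaving 4 bits of an int spare.
final class RelativeIndex {
    private static final int INDEX_BITS = 28;
    private static final int INDEX_MASK = (1 << INDEX_BITS) - 1;

    private final long firstIndex;

    RelativeIndex(final long firstIndex) {
        this.firstIndex = firstIndex;
    }

    // Pack the 28-bit relative index and up to 4 bits of flags into one int.
    int encode(final long index, final int flags) {
        return (flags << INDEX_BITS) | (int) ((index - firstIndex) & INDEX_MASK);
    }

    long decodeIndex(final int encoded) {
        return firstIndex + (encoded & INDEX_MASK);
    }

    int decodeFlags(final int encoded) {
        return encoded >>> INDEX_BITS;
    }
}
```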
At maximum occupancy, SparseJournalIndex ends up consuming ~15-30 megabytes on the index (~238,609 retained entries at 64-128 bytes each).
At minimum occupancy it is empty.
There are three operations on an index:
append
lookup floor entry for an index
truncate to an index
These are inherently linear operations, so let's use a linear array to store entries, as sketched below.
This brings per-entry overhead down by a factor of 8-16x. Without sparseness, though, the worst case would still cost us 1.9 GiB (238,609,294 entries at 8 bytes each), or at least ~950 MiB (at 4 bytes each).
Given our low cost per entry, we can afford to retain many more of them. If we spend 2 MiB of memory we can store 262,144 entries, which works out to a sparseness factor of ~910, translating to reading ~8 KiB of data to locate the exact entry.
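Putting the pieces together, a minimal sketch of such a flat-array index (a hypothetical class, not a drop-in atomix-storage implementation), storing a 4-byte relative index and a 4-byte position per recorded entry and using binary search for the floor lookup:

```java
import java.util.Arrays;

// Sketch of a flat-array JournalIndex: each recorded entry costs 8 bytes, a
// 4-byte relative index and a 4-byte position within the (<= 2 GiB) segment.
final class ArrayJournalIndex {
    private final long firstIndex;
    private int[] relIndices = new int[16];
    private int[] positions = new int[16];
    private int size;

    ArrayJournalIndex(final long firstIndex) {
        this.firstIndex = firstIndex;
    }

    // Entries are appended in increasing index order, so the arrays stay sorted.
    void index(final long index, final int position) {
        if (size == relIndices.length) {
            relIndices = Arrays.copyOf(relIndices, size * 2);
            positions = Arrays.copyOf(positions, size * 2);
        }
        relIndices[size] = (int) (index - firstIndex);
        positions[size] = position;
        size++;
    }

    // Floor lookup: position of the last recorded entry at or below index, or -1.
    int lookup(final long index) {
        int search = Arrays.binarySearch(relIndices, 0, size, (int) (index - firstIndex));
        if (search < 0) {
            search = -search - 2;   // insertion point minus one, i.e. the floor
        }
        return search < 0 ? -1 : positions[search];
    }

    // Drop all recorded entries above the given index.
    void truncate(final long index) {
        final int search = Arrays.binarySearch(relIndices, 0, size, (int) (index - firstIndex));
        size = search >= 0 ? search + 1 : -search - 1;
    }
}
```

The caller decides how sparse the index is by choosing which appended entries to record; recording roughly every 910th one keeps a full 2 GiB segment within the 2 MiB budget estimated above.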