Show HN: OnPair – String compression with fast random access (Rust, C++) https://ift.tt/1kuYJ2T
I’ve been working on a compression algorithm for fast random access to individual strings in large collections.

The problem came up when working with large in-memory database columns (emails, URLs, product titles, etc.), where low-latency point queries are essential. With short strings, LZ77-based compressors don’t perform well. Block compression helps, but block size forces a trade-off between ratio and access speed.

Some existing options:

- BPE: good ratios, but slow and memory-heavy
- FSST (discussed here: https://ift.tt/67MaOUk ): very fast, but weaker compression

This solution provides an interesting balance (more details in the paper):

- Compression ratio: similar to BPE
- Compression speed: 100–200 MiB/s
- Decompression speed: 6–7 GiB/s

I’d love to hear your thoughts: workloads you think this could help with, ideas for API improvements, or just general discussion. Always happy to chat here on HN or by email.

---

Resources:

- Paper: https://ift.tt/btiZufw
- Rust: https://ift.tt/YjvNLgH
- C++: https://ift.tt/hJEpqld

August 19, 2025 at 10:20PM
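To make the random-access idea concrete, here is a minimal Rust sketch of the general layout that dictionary-based string compressors tend to share: every string is stored as a run of token IDs, and an offsets array records where each string starts. This is not the OnPair algorithm or its API. `TokenStore`, its methods, and the hand-supplied token list are hypothetical names chosen for illustration, and the greedy longest-match encoder is a stand-in for whatever dictionary training the paper describes.

```rust
/// Illustrative only: a token-dictionary string store with per-string offsets.
/// NOT the OnPair algorithm; all names and the layout are assumptions.
pub struct TokenStore {
    dict: Vec<Vec<u8>>,  // token id -> bytes; ids 0..=255 are the single bytes
    tokens: Vec<u16>,    // token ids of all strings, concatenated
    offsets: Vec<usize>, // offsets[i]..offsets[i + 1] is string i's token range
}

impl TokenStore {
    /// Start with the 256 single-byte tokens, then add caller-chosen tokens.
    pub fn new(extra: &[&str]) -> Self {
        let mut dict: Vec<Vec<u8>> = (0..=255u8).map(|b| vec![b]).collect();
        dict.extend(extra.iter().map(|t| t.as_bytes().to_vec()));
        assert!(dict.len() <= u16::MAX as usize + 1, "ids must fit in u16");
        TokenStore { dict, tokens: Vec::new(), offsets: vec![0] }
    }

    /// Append a string, greedily picking the longest matching token each step.
    pub fn push(&mut self, s: &str) {
        let bytes = s.as_bytes();
        let mut pos = 0;
        while pos < bytes.len() {
            // The single-byte token always matches, so any input is encodable.
            let mut best_id = bytes[pos] as usize;
            let mut best_len = 1;
            for (id, tok) in self.dict.iter().enumerate().skip(256) {
                if tok.len() > best_len && bytes[pos..].starts_with(tok) {
                    best_id = id;
                    best_len = tok.len();
                }
            }
            self.tokens.push(best_id as u16);
            pos += best_len;
        }
        self.offsets.push(self.tokens.len());
    }

    /// Decode only string `i`; no neighbouring string is read.
    pub fn get(&self, i: usize) -> String {
        let mut out = Vec::new();
        for &id in &self.tokens[self.offsets[i]..self.offsets[i + 1]] {
            out.extend_from_slice(&self.dict[id as usize]);
        }
        String::from_utf8(out).expect("tokens rebuild the original UTF-8 bytes")
    }

    /// Number of tokens used to store string `i`.
    pub fn token_count(&self, i: usize) -> usize {
        self.offsets[i + 1] - self.offsets[i]
    }
}
```

With a store built as `TokenStore::new(&["@example.com", "https://"])`, pushing `"alice@example.com"` stores it as six tokens instead of seventeen bytes, and `get(0)` rebuilds it by copying dictionary entries. A point query costs one offsets lookup plus a few short copies, which is why this kind of layout avoids the block-size trade-off mentioned above.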