Show HN: hist: An overengineered solution to `sort|uniq -c` with 25x throughput https://ift.tt/EgtoN8p

I was sitting around in meetings yesterday and remembered an old shell script I had for counting the occurrences of each unique line in a file. I gave it a shot in Rust, and with a little bit of (over-engineering)™ I managed to get 25x the throughput of the naive approach using coreutils, as well as improvements over some existing tools.

Some notes on the improvements:

1. Using csv (serde) for writing leads to some big gains.
2. Arena allocation of incoming keys, plus storing references in the hashmap instead of owned values, heavily reduced the number of allocations and improves cache efficiency (I'm guessing; I did not measure).

There are some regex functionalities and some table filtering built in as well.

Happy hacking!

https://ift.tt/NZ7rGfQ

October 23, 2025 at 11:26PM
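To illustrate the second note, here is a minimal sketch of counting lines with borrowed keys. It is not hist's actual code: it assumes the whole input already sits in one buffer, which stands in for the arena, and the function name `count_lines` is made up for this example. The point it shows is that the hashmap holds `&str` slices into that buffer, so no per-line `String` is allocated.

```rust
use std::collections::HashMap;

/// Count how often each line occurs in `input`.
/// Keys are `&str` slices borrowed from `input` (the stand-in "arena"),
/// so the map never allocates an owned String per line.
fn count_lines(input: &str) -> Vec<(&str, usize)> {
    let mut counts: HashMap<&str, usize> = HashMap::new();
    for line in input.lines() {
        *counts.entry(line).or_insert(0) += 1;
    }

    let mut rows: Vec<(&str, usize)> = counts.into_iter().collect();
    // Most frequent first; ties are broken by key so the order is deterministic.
    rows.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(b.0)));
    rows
}
```

For example, `count_lines("b\na\nb\n")` returns `[("b", 2), ("a", 1)]`. A streaming tool cannot assume the input fits in a single buffer, which is presumably where a real arena allocator comes in; this sketch sidesteps that.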
