Show HN: Speakrs Full PyAnnotate pipeline in Rust/ONNX 20-37x times faster macOS https://ift.tt/TVthOw4

Show HN: Speakrs Full PyAnnotate pipeline in Rust/ONNX 20-37x times faster macOS Speakrs implements the full pyannote community-1 style diarization pipeline in Rust: segmentation, powerset decode, overlap-add aggregation, binarization, embedding, PLDA, and VBx clustering. There is no Python runtime in the library path. Inference runs on ONNX Runtime or native CoreML, and the rest of the pipeline stays in Rust. It is 20x-30x faster on macOS, but only 2-3x faster on linux/cuda (depending on CPU). Few reasons its faster: 1. Speakrs is using coreml versions of the models. I exported the models specifically to run on coreml. PyAnnote just runs the same the same PyTorch versions through MPS (Metal) on macOS. 2. PyAnnote is not a single model, its a few different models put together in a pipeline, the readme has some info on the full pipeline. 3. Speakrs optimizes the pipeline so different parts can run on CPU, Neural Engine and GPU. Speakrs has a batch mode, where you can run on multiple files at once, doing this also lets you keep CPU/GPU/ANE all fully utilized. This is why on linux/cuda its not that much faster, PyAnnotate is already optimized to run on cuda, the speed improvements we get on cuda is by running some stuff on cpu while the other stuff runs on the GPU. The speedup on linux will depend on how powerful the CPU is. There is also a fast mode, that sacrifices some speed for accuracy, that can be up to 50x faster, and for some types of audio doesn't sacrifice that much accuracy. The benchmarks have more info on this. https://ift.tt/2FHj7qR May 27, 2026 at 12:05AM

Komentar

Postingan populer dari blog ini

Show HN: Guish – A GUI for constructing and executing Unix pipelines https://ift.tt/HrXz5ub

Launch HN: PillarPlus (YC W20) – Automatically create construction blueprints https://ift.tt/2yet5m3