Show HN: ONNX optimized SigLIP and related foundation models https://ift.tt/lTGJ6PC
Hey there Hacker News! Long-time lurker but first-time poster, so nice to meet you all! I have some time on my hands, so I went down a rabbit hole and created a project to track and unify a bunch of optimized foundation models under one easy-to-use Python package. So far I have the following up and running:

- SigLIP as an FP16 ONNX representation for super fast zero-shot image classification. Quantized model support is on its way.
- Automatic pre- and post-processing switching: choosing CLIP as a model type falls back to cosine similarity with softmax, while SigLIP runs its full graph with a SciPy-based sigmoid output activation.
- Manual mode with exposed image and text encoders for each model. SigLIP also has its pre-pooled hidden output available for analysis, etc.
- An ONNX Segment Anything representation. The plan is to have each CLIP/SigLIP model's salient map feed directly as a multi-point prompt to SAM and its variants. More on this soon!

The plan is to wrap a TensorRT backend behind the same usage down the road, and tie that into ChromaDB for super fast search, but for now check out the example Gradio app. It's still a little clunky, but the results are pretty impressive! I'd imagine this would be great for lightweight RAG too!

https://ift.tt/0vhSTZO

June 21, 2024 at 11:58PM
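The CLIP-vs-SigLIP output-activation difference mentioned above can be sketched in a few lines of NumPy. This is a hedged illustration, not the package's actual code: the function names (`clip_scores`, `siglip_scores`) and the temperature value are my own assumptions; the real package runs these steps inside/around its ONNX graphs.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def sigmoid(x):
    # Elementwise logistic; SciPy's expit does the same job.
    return 1.0 / (1.0 + np.exp(-x))

def clip_scores(image_emb, text_embs, temperature=100.0):
    # CLIP-style scoring (hypothetical helper): cosine similarity between the
    # image embedding and each text embedding, then softmax over the labels,
    # so the scores compete and sum to 1.
    img = image_emb / np.linalg.norm(image_emb)
    txt = text_embs / np.linalg.norm(text_embs, axis=-1, keepdims=True)
    logits = temperature * (txt @ img)
    return softmax(logits)

def siglip_scores(logits):
    # SigLIP-style scoring (hypothetical helper): an independent sigmoid per
    # label, matching SigLIP's pairwise sigmoid training objective; scores
    # need not sum to 1, so multiple labels can score high at once.
    return sigmoid(logits)
```

The practical consequence of the switch: with softmax, adding an irrelevant candidate label redistributes probability mass across all labels, while with sigmoid each label's score stands on its own.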
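The planned salient-map-to-SAM bridge could look something like the sketch below: pick the highest-salience pixels and hand them to SAM as foreground point prompts. This is purely an assumption about how such a bridge might work; the function name `salient_points` and the top-k selection strategy are hypothetical, and SAM's actual prompt encoding is handled by its own predictor.

```python
import numpy as np

def salient_points(salience, k=3):
    # Turn a 2D salience map (H x W) into k point prompts for SAM.
    # SAM-style predictors take points as (x, y) coordinates plus a label
    # array where 1 marks a foreground point.
    h, w = salience.shape
    top = np.argsort(salience.ravel())[::-1][:k]   # k highest-salience pixels
    ys, xs = np.unravel_index(top, (h, w))
    points = np.stack([xs, ys], axis=1)            # (x, y) order
    labels = np.ones(len(points), dtype=np.int64)  # 1 = foreground prompt
    return points, labels
```

A real version would likely also smooth the map and enforce a minimum distance between the chosen points so the prompts don't all land on one blob.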