Show HN: Ragas – Open-source library for evaluating RAG pipelines https://ift.tt/On4WcNq

Ragas is an open-source library for evaluating and testing RAG and other LLM applications. GitHub: https://ift.tt/hdqQlg7 , docs: https://docs.ragas.io/ .

Ragas provides metrics and methods, such as synthetic test data generation, to help you evaluate your RAG applications. It started last year as a way to scratch our own itch: evaluating our RAG chatbots.

Problems Ragas can solve

- How do you choose the best components for your RAG, such as the retriever, reranker, and LLM?
- How do you formulate a test dataset without spending tons of money and time?

We believe there needs to be an open-source standard for evaluating and testing LLM applications, and our vision is to build it for the community. We are tackling this challenge by adapting ideas from the traditional ML lifecycle to LLM applications.

ML Testing Evolved for LLM Applications

We built Ragas on the principles of metrics-driven development. We aim to develop techniques, inspired by state-of-the-art research, that solve the problems of evaluating and testing LLM applications.

We don't believe that the problem of evaluating and testing applications can be solved by building a fancy tracing tool; rather, we want to solve it from a layer lower in the stack. To that end, we are introducing methods like automated synthetic test data curation, metrics, and feedback utilisation, which are inspired by lessons we learned deploying stochastic models in our careers as ML engineers.

While currently focused on RAG pipelines, our goal is to extend Ragas for testing a wide array of compound systems, including those based on RAGs, agentic workflows, and various transformations.

Try out Ragas here https://ift.tt/Uf07oED... in Google Colab. Read our docs at https://docs.ragas.io/ to learn more.

We would love to hear feedback from the HN community :)

https://ift.tt/1Pul3fC
March 21, 2024 at 10:48PM
