Show HN: [OSS] Reduce LLM Hallucination by Streamlining Experimentation

*Problem*

Let's assume you are building a simple Q&A chatbot backed by RAG. To get the best possible answer for a given question, you need to find the right combination of prompt and LLM model. To find the right prompt, you need to identify the correct prompt template (e.g., chain-of-thought, few-shot, ReAct) with the appropriate RAG context. To get the right RAG context, your vector DB needs to hold the correct data with the correct indexing.

Even in this simple, non-conversational, single-agent application, obtaining the best answer means finding the right combination of Model x Prompt Template x Context x RAG Content. There are likely thousands of permutations to try, and it's hard to tell in advance which combination will yield the best result. To find it, you need a process that lets you quickly run experiments to evaluate these different combinations.
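To make the combinatorial search concrete, here is a minimal sketch of a grid-search harness over those axes. All names here (the option lists, `evaluate`, the dummy scoring formula) are hypothetical placeholders, not part of any specific library; in practice `evaluate` would run your RAG pipeline against an eval set and score answer quality.

```python
from itertools import product

# Hypothetical experiment axes -- substitute your own candidates.
models = ["model-a", "model-b"]
prompt_templates = ["chain_of_thought", "few_shot", "react"]
chunk_sizes = [256, 512]   # stand-in for "Context" choices
top_ks = [3, 5]            # stand-in for RAG retrieval settings

def evaluate(model, template, chunk_size, top_k):
    """Placeholder scorer: in a real harness, run the RAG pipeline on
    an eval set and measure answer quality (e.g., faithfulness)."""
    # Deterministic dummy score so the sketch is runnable end to end.
    return (len(model) + len(template)) * top_k / chunk_size

# Enumerate every permutation and keep the best-scoring configuration.
configs = list(product(models, prompt_templates, chunk_sizes, top_ks))
best = max(configs, key=lambda cfg: evaluate(*cfg))

print("configurations tried:", len(configs))  # 2 * 3 * 2 * 2 = 24
print("best configuration:", best)
```

Even this toy grid produces 24 runs from two or three options per axis; realistic option lists multiply out to the thousands of permutations described above, which is why an automated experimentation loop matters.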