Show HN: MyGPT a toy LLM which can be trained on Project Gutenberg and dad jokes https://ift.tt/Ez0NwyA

My puny version of ChatGPT. It is based on the excellent LLM lecture series by Andrej Karpathy: https://www.youtube.com/watch?v=kCc8FmEb1nY

The main points of differentiation are that my version is token-based (tiktoken) and includes code to load multiple text files as a training set. It also has a minimal server that is a drop-in replacement for the OpenAI REST API, so you can train the default tiny 15M-parameter model and use it in your projects instead of ChatGPT.

I trained it on 20 MB of Project Gutenberg encyclopaedias, then fine-tuned it on 120 dad jokes to get a Q: A: prompt format. The model and training set are so small that the results are basically a joke; it's for entertainment purposes only. The code is also very rough, and the server has only the minimum functionality filled in.

I embodied this model in my talking LLM-driven hexapod robot, and it could give very silly answers to spoken questions. https://ift.tt/ayQ0Wvw

September 26, 2023 at 09:51PM
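For anyone wondering what "drop-in replacement for the OpenAI REST API" could look like from the client side, here is a rough sketch using only the Python standard library. The post does not give the server address, the model name, which endpoints are implemented, or the exact spacing of the Q: A: format, so `BASE_URL`, `MODEL_NAME`, the `/completions` path, and the newline between question and answer are all assumptions for illustration.

```python
import json
import urllib.request

# Assumed values: the post does not state the server address or model name.
BASE_URL = "http://localhost:8000/v1"  # hypothetical local server address
MODEL_NAME = "mygpt-tiny"              # hypothetical model identifier


def format_prompt(question: str) -> str:
    """Wrap a question in a Q:/A: layout (exact spacing is an assumption)."""
    return f"Q: {question.strip()}\nA:"


def build_request(question: str, max_tokens: int = 50) -> urllib.request.Request:
    """Build, but do not send, an OpenAI-style completions request."""
    payload = {
        "model": MODEL_NAME,
        "prompt": format_prompt(question),
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        f"{BASE_URL}/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str) -> str:
    """Send the request and return the text of the first completion."""
    with urllib.request.urlopen(build_request(question)) as response:
        body = json.load(response)
    # Response shape follows the OpenAI completions format.
    return body["choices"][0]["text"]
```

With a server running at the assumed address, `ask("Why did the chicken cross the road?")` would return whatever the model generates after `A:`. Since the server only has minimum functionality filled in, this particular endpoint may or may not be the one it supports.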
