Show HN: I indexed 8,643 BSides talks across 227 chapters and 6 continents https://ift.tt/LrDgfE3
Hi HN, I'm Roland, and for the past few weeks I've been building AllBSides, a directory of every BSides conference talk uploaded to YouTube.

As of today, it covers 8,643 talks from 5,927 speakers across 227 chapters in 68 countries. Combined runtime is 280 days. The transcripts come to about 60 million words.

The archive came together in stages:

1. Manually map every BSides chapter's YouTube channel
2. Pull every video and transcript from Supabase
3. Run each transcript through Haiku for tag extraction (tools, topics, difficulty, team, talk style, research method, and much more)
4. Run the results through Sonnet for categorization and dedup
5. Run a final pass through Opus for verification
6. Verify manually. At one point, the pipeline had queued over 16k AI suggestions for manual review. Today, most are resolved.

Total LLM cost so far: about €200. The whole pipeline is rebuildable from scratch. Each t...
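For anyone curious how the staged passes above (cheap extraction, then categorization and dedup, then verification, then a manual-review queue) might fit together, here is a minimal Python sketch. It is not the actual AllBSides code. Every name in it (`run_pipeline`, `TalkResult`, the `fake_*` stages, the tiny vocabulary) is hypothetical, and the model calls are replaced with plain local functions so the example runs without any API key.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical stage signatures: each stage is a plain callable, so a real
# model client could be swapped in for the stand-ins defined further down.
Extractor = Callable[[str], List[str]]         # transcript -> raw tags
Normalizer = Callable[[List[str]], List[str]]  # raw tags -> deduplicated tags
Verifier = Callable[[str, List[str]], bool]    # transcript + tags -> accepted?


@dataclass
class TalkResult:
    video_id: str
    tags: List[str]
    needs_manual_review: bool


def run_pipeline(
    transcripts: Dict[str, str],
    extract: Extractor,
    normalize: Normalizer,
    verify: Verifier,
) -> List[TalkResult]:
    """Run each transcript through the three automated stages in order.

    Anything the verifier rejects is flagged for a human instead of dropped.
    """
    results = []
    for video_id, text in transcripts.items():
        raw_tags = extract(text)
        tags = normalize(raw_tags)
        accepted = verify(text, tags)
        results.append(TalkResult(video_id, tags, needs_manual_review=not accepted))
    return results


# --- Stand-in stages (assumptions, not real model calls) ---

KNOWN_TOOLS = {"nmap", "burp", "wireshark"}  # illustrative vocabulary only


def fake_extract(text: str) -> List[str]:
    # Keep every word that matches the vocabulary, duplicates included.
    words = (w.strip(".,").lower() for w in text.split())
    return [w for w in words if w in KNOWN_TOOLS]


def fake_normalize(tags: List[str]) -> List[str]:
    # Deduplicate while preserving first-seen order.
    return list(dict.fromkeys(tags))


def fake_verify(text: str, tags: List[str]) -> bool:
    # Accept only talks that ended up with at least one tag.
    return len(tags) > 0


transcripts = {
    "a": "We used nmap and Nmap, then Burp.",
    "b": "A talk about career growth.",
}
results = run_pipeline(transcripts, fake_extract, fake_normalize, fake_verify)
```

Passing the stages in as callables keeps the loop independent of any particular model or SDK, which is one way a pipeline like this could stay rebuildable from scratch: rerunning it is just calling `run_pipeline` again with the same inputs.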