(Voiceover) Building on evaluation quicksand

Read the full post here: https://www.interconnects.ai/p/building-on-evaluation-quicksand

Chapters
00:00 Building on evaluation quicksand
01:26 The causes of closed evaluation silos
06:35 The challenge facing open evaluation tools
10:47 Frontiers in evaluation
11:32 New types of synthetic data contamination
13:57 Building harder evaluations

Figures
Fig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/manual/openai-predictions.webp

Get full access to Interconnects at www.interconnects.ai/subscribe

About the Podcast

Audio essays about the latest developments in AI and interviews with leading scientists in the field. Breaking the hype, understanding what's under the hood, and telling stories. www.interconnects.ai