“o3” by Zach Stein-Perlman

See livestream, site, OpenAI thread, Nat McAleese thread.

OpenAI announced (but isn't yet releasing) o3 and o3-mini (skipping o2 because of telecom company O2's trademark). "We plan to deploy these models early next year" (source). "o3 is powered by further scaling up RL beyond o1" (source); I don't know whether it's a new base model.

o3 gets 25% on FrontierMath, smashing the previous SoTA. (These are really hard math problems.) Wow. (The dark blue bar, about 7%, is presumably one-attempt; unfortunately OpenAI didn't say what the light blue bar is, but I think it doesn't really matter and the 25% is for real.[1])

o3 is also easily SoTA on SWE-bench Verified and Codeforces.

It's also easily SoTA on ARC-AGI, after doing RL on the public ARC-AGI problems and when spending $4,000 per task on inference (!).[2]

OpenAI has a "new alignment strategy"; looks like Constitutional AI (and just about [...]

The original text contained 4 footnotes which were omitted from this narration.

The original text contained 6 images which were described by AI.

---

First published: December 20th, 2024

Source: https://forum.effectivealtruism.org/posts/aNdg7ctFP9zFcowNd/o3

---

Narrated by TYPE III AUDIO.

About the Podcast

Audio narrations from the Effective Altruism Forum, including curated posts, posts with 30 karma, and other great writing. If you'd like fewer episodes, subscribe to the "EA Forum (Curated & Popular)" podcast instead.