#207 - GPT-4.1, Gemini 2.5 Flash, Ironwood, Claude Max

Our 207th episode with a summary and discussion of last week's big AI news!
Recorded on 04/14/2025
Hosted by Andrey Kurenkov and Jeremie Harris.
Feel free to email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai
Read our text newsletter and comment on the podcast at https://lastweekin.ai/.
Join our Discord here! https://discord.gg/nTyezGSKwP

In this episode:
OpenAI introduces GPT-4.1 with optimized coding and instruction-following capabilities, featuring variants like GPT-4.1 Mini and Nano, and a million-token context window.
Concerns arise as OpenAI reduces resources for safety testing, sparking internal and external criticism.
xAI's newly launched API for Grok 3 showcases significant capabilities comparable to other leading models.
Meta faces allegations of aiding China in AI development for business advantages, with potential compliance issues and public scrutiny looming.

Timestamps + Links:

Tools & Apps
(00:03:13) OpenAI’s new GPT-4.1 AI models focus on coding
(00:08:12) ChatGPT will now remember your old conversations
(00:11:16) Google’s newest Gemini AI model focuses on efficiency
(00:14:27) Elon Musk’s AI company, xAI, launches an API for Grok 3
(00:18:35) Canva is now in the coding and spreadsheet business
(00:20:31) Meta’s vanilla Maverick AI model ranks below rivals on a popular chat benchmark

Applications & Business
(00:25:46) Ironwood: The first Google TPU for the age of inference
(00:34:15) Anthropic rolls out a $200-per-month Claude subscription
(00:37:17) OpenAI co-founder Ilya Sutskever’s Safe Superintelligence reportedly valued at $32B
(00:40:20) Mira Murati’s AI startup gains prominent ex-OpenAI advisers
(00:42:52) Hugging Face buys a humanoid robotics startup
(00:44:58) Stargate developer Crusoe could spend $3.5 billion on a Texas data center. Most of it will be tax-free.

Projects & Open Source
(00:48:14) OpenAI Open Sources BrowseComp: A New Benchmark for Measuring the Ability for AI Agents to Browse the Web

Research & Advancements
(00:56:09) Sample, Don't Search: Rethinking Test-Time Alignment for Language Models
(01:03:32) Concise Reasoning via Reinforcement Learning
(01:09:37) Going beyond open data – increasing transparency and trust in language models with OLMoTrace
(01:15:34) Independent evaluations of Grok-3 and Grok-3 mini on our suite of benchmarks

Policy & Safety
(01:17:58) OpenAI countersues Elon Musk, calls for enjoinment from ‘further unlawful and unfair action’
(01:24:33) OpenAI slashes AI model safety testing time
(01:27:55) Ex-OpenAI staffers file amicus brief opposing the company’s for-profit transition
(01:32:25) Access to future AI models in OpenAI’s API may require a verified ID
(01:34:53) Meta whistleblower claims tech giant built $18 billion business by aiding China in AI race and undermining U.S. national security

About the Podcast

Weekly summaries and discussion about the most interesting developments in AI, deep learning, robotics, and more!