Prof. Jakob Foerster - ImageNet Moment for Reinforcement Learning?

Prof. Jakob Foerster, a leading AI researcher at Oxford University and Meta, and Chris Lu, a researcher at OpenAI -- they explain how AI is moving beyond just mimicking human behaviour to creating truly intelligent agents that can learn and solve problems on their own. Foerster champions open-source AI for responsible, decentralised development. He addresses AI scaling, goal misalignment (Goodhart's Law), and the need for holistic alignment, offering a quick look at the future of AI and how to guide it.SPONSOR MESSAGES:***CentML offers competitive pricing for GenAI model deployment, with flexible options to suit a wide range of models, from small to large-scale deployments. Check out their super fast DeepSeek R1 hosting!https://centml.ai/pricing/Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on o-series style reasoning and AGI. They are hiring a Chief Engineer and ML engineers. Events in Zurich. Goto https://tufalabs.ai/***TRANSCRIPT/REFS:https://www.dropbox.com/scl/fi/yqjszhntfr00bhjh6t565/JAKOB.pdf?rlkey=scvny4bnwj8th42fjv8zsfu2y&dl=0 Prof. Jakob Foersterhttps://x.com/j_foersthttps://www.jakobfoerster.com/University of Oxford Profile: https://eng.ox.ac.uk/people/jakob-foerster/Chris Lu:https://chrislu.page/TOC1. GPU Acceleration and Training Infrastructure [00:00:00] 1.1 ARC Challenge Criticism and FLAIR Lab Overview [00:01:25] 1.2 GPU Acceleration and Hardware Lottery in RL [00:05:50] 1.3 Data Wall Challenges and Simulation-Based Solutions [00:08:40] 1.4 JAX Implementation and Technical Acceleration2. Learning Frameworks and Policy Optimization [00:14:18] 2.1 Evolution of RL Algorithms and Mirror Learning Framework [00:15:25] 2.2 Meta-Learning and Policy Optimization Algorithms [00:21:47] 2.3 Language Models and Benchmark Challenges [00:28:15] 2.4 Creativity and Meta-Learning in AI Systems3. Multi-Agent Systems and Decentralization [00:31:24] 3.1 Multi-Agent Systems and Emergent Intelligence [00:38:35] 3.2 Swarm Intelligence vs Monolithic AGI Systems [00:42:44] 3.3 Democratic Control and Decentralization of AI Development [00:46:14] 3.4 Open Source AI and Alignment Challenges [00:49:31] 3.5 Collaborative Models for AI DevelopmentREFS[[00:00:05] ARC Benchmark, Chollethttps://github.com/fchollet/ARC-AGI[00:03:05] DRL Doesn't Work, Irpanhttps://www.alexirpan.com/2018/02/14/rl-hard.html[00:05:55] AI Training Data, Data Provenance Initiativehttps://www.nytimes.com/2024/07/19/technology/ai-data-restrictions.html[00:06:10] JaxMARL, Foerster et al.https://arxiv.org/html/2311.10090v5[00:08:50] M-FOS, Lu et al.https://arxiv.org/abs/2205.01447[00:09:45] JAX Library, Google Researchhttps://github.com/jax-ml/jax[00:12:10] Kinetix, Mike and Michaelhttps://arxiv.org/abs/2410.23208[00:12:45] Genie 2, DeepMindhttps://deepmind.google/discover/blog/genie-2-a-large-scale-foundation-world-model/[00:14:42] Mirror Learning, Grudzien, Kuba et al.https://arxiv.org/abs/2208.01682[00:16:30] Discovered Policy Optimisation, Lu et al.https://arxiv.org/abs/2210.05639[00:24:10] Goodhart's Law, Goodharthttps://en.wikipedia.org/wiki/Goodhart%27s_law[00:25:15] LLM ARChitect, Franzen et al.https://github.com/da-fr/arc-prize-2024/blob/main/the_architects.pdf[00:28:55] AlphaGo, Silver et al.https://arxiv.org/pdf/1712.01815.pdf[00:30:10] Meta-learning, Lu, Towers, Foersterhttps://direct.mit.edu/isal/proceedings-pdf/isal2023/35/67/2354943/isal_a_00674.pdf[00:31:30] Emergence of Pragmatics, Yuan et al.https://arxiv.org/abs/2001.07752[00:34:30] AI Safety, Amodei et al.https://arxiv.org/abs/1606.06565[00:35:45] Intentional Stance, Dennetthttps://plato.stanford.edu/entries/ethics-ai/[00:39:25] Multi-Agent RL, Zhou et al.https://arxiv.org/pdf/2305.10091[00:41:00] Open Source Generative AI, Foerster et al.https://arxiv.org/abs/2405.08597<trunc, see PDF/YT>

Om Podcasten

Welcome! We engage in fascinating discussions with pre-eminent figures in the AI field. Our flagship show covers current affairs in AI, cognitive science, neuroscience and philosophy of mind with in-depth analysis. Our approach is unrivalled in terms of scope and rigour – we believe in intellectual diversity in AI, and we touch on all of the main ideas in the field with the hype surgically removed. MLST is run by Tim Scarfe, Ph.D (https://www.linkedin.com/in/ecsquizor/) and features regular appearances from MIT Doctor of Philosophy Keith Duggar (https://www.linkedin.com/in/dr-keith-duggar/).