#54 - OpenAI on publication norms, malicious uses of AI, and general-purpose learning algorithms

OpenAI’s Dactyl is an AI system that can manipulate objects with a human-like robot hand. OpenAI Five is an AI system that can defeat humans at the video game Dota 2. The strange thing is that they were both developed using the same general-purpose reinforcement learning algorithm.

How is this possible and what does it show?

In today's interview Jack Clark, Policy Director at OpenAI, explains that from a computational perspective using a hand and playing Dota 2 are remarkably similar problems.

A robot hand needs to hold an object, move its fingers, and rotate it to the desired position. In Dota 2 you control a team of several different people, moving them around a map to attack an enemy.

Your hand has 20 or 30 different joints to move. The number of main actions in Dota 2 is 10 to 20, as you move your characters around a map.

When you’re rotating an object in your hand, you sense its friction, but you don’t directly perceive the entire shape of the object. In Dota 2, you're unable to see the entire map and perceive what's there by moving around – metaphorically 'touching' the space.

This is true of many apparently distinct problems in life. Compressing different sensory inputs down to a fundamental computational problem that we know how to solve only requires the right general-purpose software.

The creation of such increasingly 'broad-spectrum' learning algorithms has been a key story of the last few years, and this development is likely to have unpredictable consequences, heightening the huge challenges that already exist in AI policy.

Today’s interview is a mega-AI-policy-quad episode; Jack is joined by his colleagues Amanda Askell and Miles Brundage, on the day they released their fascinating and controversial large general language model GPT-2.

Read our new in-depth article on becoming an AI policy specialist: The case for building expertise to work on US AI policy, and how to do it

Links to learn more, summary and full transcript

We discuss:

• What are the most significant changes in the AI policy world over the last year or two?
• What capabilities are likely to develop over the next five, 10, 15, 20 years?
• How much should we focus on the next couple of years, versus the next couple of decades?
• How should we approach possible malicious uses of AI?
• What are some of the potential ways OpenAI could make things worse, and how can they be avoided?
• Publication norms for AI research
• Where do we stand in terms of arms races between countries or different AI labs?
• The case for creating newsletters
• Should the AI community have a closer relationship to the military?
• Working at OpenAI vs. working in the US government
• How valuable is Twitter in the AI policy world?

Rob is then joined by two of his colleagues – Niel Bowerman & Michelle Hutchinson – to quickly discuss:

• The reaction to OpenAI's release of GPT-2
• Jack’s critique of our US AI policy article
• How valuable are roles in government?
• Where do you start if you want to write content for a specific audience?

Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type 80,000 Hours into your podcasting app. Or read the transcript below.

The 80,000 Hours Podcast is produced by Keiran Harris.

About the Podcast