MLCommons’ David Kanter, NVIDIA’s Daniel Galvez on Publicly Accessible Datasets - Ep. 167

In deep learning and machine learning, having a large enough dataset is key to training a system and getting it to produce results. So what does a ML researcher do when there just isn’t enough publicly accessible data? Enter the MLCommons Association, a global engineering consortium with the aim of making ML better for everyone. MLCommons recently announced the general availability of the People’s Speech Dataset, a 30,000 hour English-language conversational speech dataset, and the Multilingual Spoken Words Corpus, an audio speech dataset with over 340,000 keywords in 50 languages, to help advance ML research. On this episode of NVIDIA’s AI Podcast, host Noah Kravitz spoke with David Kanter, founder and executive director of MLCommons, and NVIDIA senior AI developer technology engineer Daniel Galvez, about the democratization of access to speech technology and how ML Commons is helping advance the research and development of machine learning for everyone. https://blogs.nvidia.com/blog/2022/04/13/mlcommons/

Om Podcasten

One person, one interview, one story. Join us as we explore the impact of AI on our world, one amazing person at a time -- from the wildlife biologist tracking endangered rhinos across the savannah here on Earth to astrophysicists analyzing 10 billion-year-old starlight in distant galaxies to the Walmart data scientist grappling with the hundreds of millions of parameters lurking in the retailer’s supply chain. Every two weeks, we’ll bring you another tale, another 25-minute interview, as we build a real-time oral history of AI that’s already garnered nearly 3.4 million listens and been acclaimed as one of the best AI and machine learning podcasts. Listen in and get inspired. https://blogs.nvidia.com/ai-podcast/