659: Open-Source Tools for Natural Language Processing

NLP practitioners: this episode is for you. From the awareness of linguistic elements and annotation to getting the necessary people in the room, Vincent Warmerdam presents to Jon Krohn a recipe for a successful project and the open-source NLP tools to get there.

This episode is brought to you by epic LinkedIn Learning instructor Keith McCormick (https://linkedin.com/learning/instructors/keith-mccormick). Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.

In this episode you will learn:
• How Vincent came to work with De Speld [08:57]
• Vincent’s role at Explosion [18:59]
• How users can apply spaCy [21:46]
• Prodigy: Annotate training data more efficiently with scripts [26:28]
• How to manage “skill anxiety” with Calmcode [32:32]
• How Vincent fixed bad labels [42:47]
• The value of understanding linguistics for NLP [54:42]
• How to constrain artificial stupidity [1:02:38]

Additional materials: www.superdatascience.com/659

About the Podcast

The latest machine learning, A.I., and data career topics from across both academia and industry are brought to you by host Dr. Jon Krohn on the Super Data Science Podcast. As the quantity of data on our planet doubles every couple of years and with this trend set to continue for decades to come, there's an unprecedented opportunity for you to make a meaningful impact in your lifetime. In conversation with the biggest names in the data science industry, Jon cuts through hype to fuel that professional impact. Whether you're curious about getting started in a data career or you're a deep technical expert, whether you'd like to understand what A.I. is or you'd like to integrate more data-driven processes into your business, we have inspiring guests and lighthearted conversation for you to enjoy. We cover tools, techniques, and implementation tricks across data collection, databases, analytics, predictive modeling, visualization, software engineering, real-world applications, commercialization, and entrepreneurship: everything you need to crush it with data science.