Weather forecasting with AI, Kaggle tips and tricks, dealing with missing data, deep learning with Jesper Dramsch, The Data Scientist Show #040

Jesper Dramsch is a scientist for machine learning at the European Centre for Medium-Range Weather Forecasts. They have a PhD in machine learning applied to geoscience from the Technical University of Denmark. They are a Kaggle Kernels Expert and TPU star, ranked in the top 81 of 100k worldwide. We talked about weather forecasting, things they learned from Kaggle, how to deal with missing data and outliers, deep learning, Keras vs PyTorch, XGBoost, their struggles as a PhD student, and working in the EU vs the US. Follow @DalianaLiu for more updates on data science and this show.

(00:01:27) how they got into ML
(00:09:10) how they handle missing data
(00:28:34) Transformers are eating the world
(00:49:36) Huber loss is a fantastic way to deal with extreme values
(00:54:48) their experience with Kaggle competitions
(01:02:59) Kaggle tricks that helped their models perform better
(01:08:18) PyTorch vs Keras
(01:30:30) working in different countries and cultures

Resources shared by Jesper:
The newsletter on missing data: https://buttondown.email/jesper/archive/towels-have-quite-a-dry-sense-of-humor/
The paper by Gael about missing data: https://academic.oup.com/gigascience/article/doi/10.1093/gigascience/giac013/6568998
The Huber Loss: https://en.wikipedia.org/wiki/Huber_loss
Skill Scores: https://en.wikipedia.org/wiki/Forecast_skill
Brier Skill in Weather: https://www.dwd.de/EN/ourservices/seasonals_forecasts/forecast_reliability.html
CRPS (Continuous Ranked Probability Score): https://datascience.stackexchange.com/questions/63919/what-is-continuous-ranked-probability-score-crps
ConvNext, Convnets for the 2020s: https://arxiv.org/abs/2201.03545
Transformers for ensemble forecasts: https://arxiv.org/abs/2106.13924
Books I recommend: https://www.amazon.com/shop/jesperdramsch/list/2DYS5KVR5TX0E
Blog posts I wrote about these books: https://dramsch.net/tags/books/
Short I made about Test-Time Augmentation: https://www.youtube.com/shorts/w4sAh9lKyls

Their links: https://dramsch.net/links
Their open PhD thesis: https://dramsch.net/phd
Newsletter: https://dramsch.net/newsletter
Twitter: https://dramsch.net/twitter
YouTube: https://dramsch.net/youtube
LinkedIn: https://dramsch.net/linkedin
Kaggle: https://dramsch.net/

About the Podcast

A deep dive into data scientists' day-to-day work, tools and models they use, how they tackle problems, and their career journeys. This podcast helps you grow a successful career in data science. Listening to an episode is like having lunch with an experienced mentor. Guests are data science practitioners from various industries, AI researchers, economists, and CTOs of AI companies. Host: Daliana Liu, an ex-Amazon senior data scientist with 180k followers on LinkedIn. Join 20k subscribers at www.dalianaliu.com to learn more about data science, career, and this show. Twitter @DalianaLiu.