#93: K-means clustering: machine learning algorithm to easily split observations into multiple buckets

K-means clustering is an algorithm for partitioning data into multiple, non-overlapping buckets. For example, if you have a bunch of points in two-dimensional space, this algorithm can easily find concentrated clusters of points. To be honest, that’s quite a simple task for humans. Just plot all the points on a piece of paper and find areas with higher density. For example, most of the points are located on the top-left of the plane, some at the bottom and a few at the centre-right. However, this is not that straightforward once you can no longer rely on graphical representation. For instance, when your data points live 3-, 4- or 100-dimensional space. Turns out, this is not that uncommon. Let me clarify. Read more: https://nurkiewicz.com/93 Get the new episode straight to your mailbox: https://nurkiewicz.com/newsletter

Om Podcasten

Podcast for developers, testers, SREs... and their managers. I explain complex and convoluted technologies in a clear way, avoiding buzzwords and hype. Never longer than 4 minutes and 16 seconds. Because software development does not require hours of lectures, dev advocates' slide decks and hand waving. For those of you, who want to combat FOMO, while brushing your teeth. 256 seconds is plenty of time. If I can't explain something within this time frame, it's either too complex, or I don't understand it myself. By Tomasz Nurkiewicz. Java Champion, CTO, trainer, O'Reilly author, blogger