Howdy, machine learning students! Today we’re going to introduce the concept of unsupervised machine learning algorithms.
Quick Recap: Supervised Learning
Before we jump in, let’s quickly recap our last article introducing supervised machine learning algorithms. This will give us the appropriate context for unsupervised learning.
In supervised machine learning problems, we supply pre-labeled data to the algorithm. By supplying data that’s already correctly labeled, we ask the algorithm to further predict (regression) or label (classification) new data.
Unsupervised Machine Learning = Unlabeled Data
The most immediate and prominent difference for unsupervised learning is the data. Above, we gave the algorithm “a boost” by supplying the intended “right” answers in the data. Below, in an unsupervised machine learning problem, there are no right answers…yet.
We’ve supplied the algorithm with data in the problem, but it’s provided without labels or “answers”. We are mandating that the algorithm discover structure and infer patterns/labels on its own. We could also compare the above example to a clustering problem.
So in unsupervised learning, we supply a large amount of unlabeled data, without explicitly identified form or structure. We ask the algorithm to come up with ideas of structure and segmentation on its own.
Some additional applications of unsupervised learning could include:
- Market segmentation of massive transaction data
- Large scale social networking data
- Astronomical data analysis
- Large scale market data
- Mass audio/voice analysis
- Large scale gene clustering
That was a bit of a quick one! The challenge with some these technical subject matter areas is sometimes we have limited room to run before going off into the technical weeds. This is one of those areas. Next, we’ll be covering some key concepts in the areas of machine learning model representation, cost function and parameter learning. Don’t worry too much about those yet, we’ll take it step by step. 🙂