Unsupervised Learning

Unsupervised learning is the machine learning task of inferring a function that describes structure in unlabeled data. Because the data is unlabeled, there is no error or reward signal to tell the algorithm how close it is to a correct solution. Unsupervised learning is especially important for problems where the correct answer is not known in advance.

Example of an unsupervised clustering algorithm^[1]

Given a set of data points (as above), an unsupervised learning algorithm can cluster the points into three different groups: red, blue, and yellow. Because the data is unlabeled, we do not know whether the clustering into three groups is the actual "correct" clustering of the data.
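As a rough illustration of how such a grouping can be computed, here is a minimal sketch of k-means (Lloyd's algorithm), one popular clustering method. The function names, the random initialization from data points, the fixed seed, and the handling of empty clusters are assumptions made for this example, not details taken from the figure above.

```python
import random

def squared_distance(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def kmeans(points, k, iterations=100, seed=0):
    """Plain k-means on a list of 2-D points (illustrative sketch)."""
    rng = random.Random(seed)
    # Assumption: start from k distinct data points chosen at random.
    centroids = rng.sample(points, k)
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                new_centroids.append((
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                ))
            else:
                # Assumption: an empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # no centroid moved, so the assignments are stable
        centroids = new_centroids
    return centroids, assignments
```

Calling kmeans(points, 3) returns three centroids and a cluster index for every point. The result depends on the starting centroids, and nothing in the algorithm checks whether three is the "right" number of groups, which mirrors the caveat above.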

Uses

Unsupervised learning is used in many contexts, a few of which are detailed below.

Clustering - Clustering is a popular unsupervised learning method used to group similar data together (in clusters). K-means clustering is a popular way of clustering data. As shown in the above example, since the data is not labeled, the clusters cannot be compared to a "correct" clustering of the data.
Anomaly detection - Anomaly detection, otherwise known as outlier detection, is the identification of data that does not conform to the rest of the dataset. This task does not require labeled data, as long as the majority of data points in the dataset are "normal" and the algorithm looks for data points that are the least similar to the rest of the data.

Example of outliers^[2]

In this example, we see two clusters of data (G1 and G2), along with outliers O1 and O2. Anomaly detection, a form of unsupervised learning, can determine that O1 and O2 are outliers even when the data is unlabeled. One method of doing so is a variant of k-nearest neighbors, where a data point is marked as an outlier or not by looking at its k nearest neighbors and the distance between the data point and these neighbors.
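The nearest-neighbor idea can be sketched as follows. This is one possible variant, assuming that a point's score is its mean distance to its k nearest neighbors and that a hand-picked threshold separates outliers from normal points; both the function names and the threshold are hypothetical choices for illustration.

```python
import math

def knn_outlier_scores(points, k):
    """Score each point by its mean distance to its k nearest neighbors."""
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores

def flag_outliers(points, k, threshold):
    """Return the points whose score exceeds the (assumed) threshold."""
    scores = knn_outlier_scores(points, k)
    return [p for p, s in zip(points, scores) if s > threshold]
```

Points inside a dense cluster have close neighbors and therefore low scores, while isolated points such as O1 and O2 have high scores. Choosing k and the threshold is itself a judgment call, since no labels are available to tune them.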

Neural networks

Many artificial neural networks use unsupervised learning, where an algorithm must learn to reach a certain goal on unlabeled data. The fundamental theory behind unsupervised neural networks is Hebbian theory, which describes the adaptation of neurons during learning. It details the basics of synaptic plasticity, which is the strengthening and weakening of interactions between neurons over time.

Simply put, when neurons in the brain (and likewise, artificial neurons in neural networks) are activated at the same time, the relationship between them strengthens, and when they are not activated at the same time, the relationship between them weakens. This plays a role in unsupervised learning, where trends among data must be determined without feedback (error or reward). By adjusting the weights between neurons in this way, machine learning algorithms can extract useful information from the given unlabeled data.
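A minimal sketch of a Hebbian weight update is shown below. It assumes bipolar activities (+1 for active, -1 for inactive), so that the product of two activities is positive when the neurons agree and negative when they disagree. The function name, the learning rate, and the bipolar coding are assumptions for illustration; practical networks usually add some form of normalization or decay to keep the weights bounded.

```python
def hebbian_update(weights, activity, learning_rate=0.1):
    """Apply one Hebbian step in place to a square weight matrix.

    weights[i][j] grows when neurons i and j have the same activity
    and shrinks when their activities differ (bipolar +1/-1 coding).
    """
    n = len(activity)
    for i in range(n):
        for j in range(n):
            if i != j:  # no self-connections in this sketch
                weights[i][j] += learning_rate * activity[i] * activity[j]
    return weights
```

Presenting the same pattern repeatedly, for example [1, 1, -1], steadily strengthens the connection between the first two neurons and weakens their connections to the third, without any error or reward signal.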

Latent variables

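A statistical approach to unsupervised learning is the method of moments, a way of estimating the parameters of a probability distribution. The moments of a random variable are the expected values of its powers, and they can be written as functions of the distribution's unknown parameters. The method estimates these moments from the data (the sample moments), sets them equal to the theoretical expressions, and solves the resulting equations for the parameters.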
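As a small, simplified illustration of moment matching, the sketch below estimates the shape and scale of a gamma distribution from data. The gamma distribution is chosen here only because its first two moments have simple closed forms (mean = shape × scale, variance = shape × scale²); the function name is hypothetical, and this is not the latent variable setting discussed in this section.

```python
def gamma_method_of_moments(samples):
    """Estimate (shape, scale) of a gamma distribution by moment matching."""
    n = len(samples)
    m1 = sum(samples) / n                 # first sample moment, estimates E[X]
    m2 = sum(x * x for x in samples) / n  # second sample moment, estimates E[X^2]
    variance = m2 - m1 ** 2               # Var[X] = E[X^2] - E[X]^2
    # Solve mean = shape * scale and variance = shape * scale^2.
    scale = variance / m1
    shape = m1 / scale
    return shape, scale
```

With enough samples drawn from a gamma distribution, the returned values approach the true parameters, even though no sample carries a label.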

In particular, the method of moments is used to learn the parameters of latent variable models. These are statistical models that contain variables that are not observed. An example of a latent variable model is the machine learning task of determining a topic (latent variable) based on the words (observed variable) of a document. For example, a document with "dog", "bone", and "chew" is related to the topic of "dogs", while a document with "cat", "scratch", and "meow" is related to the topic of "cats". In such a task, the method of moments (an unsupervised learning process) is very useful in extracting the topics of the documents.

The expectation-maximization (EM) algorithm is another approach to fitting latent variable models with unsupervised learning. It alternates between two steps: an expectation step, which uses the current parameter estimates to compute the expected log-likelihood over the latent variables, and a maximization step, which updates the parameters to maximize that expectation. It is further described in its detailed wiki. Overall, the method of moments and the EM algorithm are important uses of unsupervised learning in machine learning tasks.
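To make the EM idea concrete, here is a minimal sketch for a mixture of two one-dimensional Gaussians, where the latent variable is which component generated each data point. The initialization (means at the smallest and largest data values), the fixed number of iterations, and the small variance floor are assumptions chosen to keep the example short; they are not part of the algorithm's definition.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_two_gaussians(data, iterations=50):
    """Fit a two-component 1-D Gaussian mixture with EM (illustrative sketch)."""
    n = len(data)
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    # Assumed initialization: one component at each end of the data range.
    means = [min(data), max(data)]
    variances = [overall_var, overall_var]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: posterior probability that each component produced each point.
        resp = []
        for x in data:
            likes = [weights[c] * normal_pdf(x, means[c], variances[c])
                     for c in range(2)]
            total = sum(likes)
            resp.append([like / total for like in likes])
        # M-step: re-estimate parameters from the responsibility-weighted data.
        for c in range(2):
            nc = sum(r[c] for r in resp)
            weights[c] = nc / n
            means[c] = sum(r[c] * x for r, x in zip(resp, data)) / nc
            var = sum(r[c] * (x - means[c]) ** 2 for r, x in zip(resp, data)) / nc
            variances[c] = max(var, 1e-6)  # floor avoids a zero variance
    return weights, means, variances
```

On data drawn from two well-separated groups, the estimated means settle near the centers of the two groups. Like k-means, EM can converge to a poor local optimum when the starting values are unlucky.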

References

[1] hellisp. Cluster-2.gif. Retrieved June 1, 2016, from https://en.wikipedia.org/wiki/File:Cluster-2.svg
[2] Osrecki. Two-dimensional Outliers Example.png. Retrieved June 1, 2016, from https://commons.wikimedia.org/wiki/File:Two-dimensional_Outliers_Example.png
