The following problems appeared as a project in the edX course ColumbiaX: CSMM.102x Machine Learning.
In this assignment the following clustering algorithms will be implemented:
- Hard clustering with K-means
- Soft clustering with:
  a. Weighted K-means
  b. Gaussian mixture models (GMM) with Expectation-Maximization (EM)
The algorithms will be tested on several datasets, each with n data points {x_1,…,x_n}, where each x_i ∈ R^d.
Hard Clustering
- Each point is assigned to one and only one cluster (hard assignment).
- With K-means we try to find K centroids {μ_1,…,μ_K} and the corresponding assignments {c_1,…,c_n} of the data points, where each c_i ∈ {1,…,K} indicates which of the K clusters the observation x_i belongs to. The objective function that we seek to minimize can be written as

L = ∑_{i=1}^{n} ∑_{k=1}^{K} 1(c_i = k) ‖x_i − μ_k‖².
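As an illustration, here is a minimal NumPy sketch of K-means (Lloyd's algorithm), which alternately updates the assignments c_i with the centroids fixed and the centroids μ_k with the assignments fixed; the function name and the initialization scheme are illustrative choices, not the assignment's starter code:

```python
import numpy as np

def kmeans(X, K, n_iter=100, seed=0):
    """Hard K-means (Lloyd's algorithm) on data X of shape (n, d)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # initialize centroids with K distinct data points (illustrative choice)
    mu = X[rng.choice(n, size=K, replace=False)].copy()
    for _ in range(n_iter):
        # assignment step: c_i = argmin_k ||x_i - mu_k||^2
        d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)  # (n, K)
        c = d2.argmin(axis=1)
        # update step: mu_k = mean of the points assigned to cluster k
        for k in range(K):
            if np.any(c == k):  # keep the old centroid if cluster k is empty
                mu[k] = X[c == k].mean(axis=0)
    return mu, c
```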
Soft Clustering
- Each point is assigned to all the clusters with different weights or probabilities (soft assignment).
- With weighted K-means we try to compute the weights ϕ_i(k) of each data point i for each cluster k by minimizing the following objective (subject to ∑_k ϕ_i(k) = 1 for each i):

L = ∑_{i=1}^{n} ∑_{k=1}^{K} ϕ_i(k) ‖x_i − μ_k‖² + (1/β) ∑_{i=1}^{n} ∑_{k=1}^{K} ϕ_i(k) ln ϕ_i(k),

where β is the stiffness parameter; minimizing over ϕ with the centroids fixed gives the softmax weights ϕ_i(k) ∝ exp(−β‖x_i − μ_k‖²).
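A minimal sketch of these updates, assuming the weights are the softmax responsibilities given above and the centroids are then re-estimated as ϕ-weighted means (function name and initialization are again illustrative):

```python
import numpy as np

def weighted_kmeans(X, K, beta=10.0, n_iter=100, seed=0):
    """Soft (weighted) K-means with stiffness parameter beta."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n, d = X.shape
    mu = X[rng.choice(n, size=K, replace=False)].copy()
    for _ in range(n_iter):
        # weights: phi_i(k) proportional to exp(-beta * ||x_i - mu_k||^2)
        d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)  # (n, K)
        logits = -beta * d2
        logits -= logits.max(axis=1, keepdims=True)  # for numerical stability
        phi = np.exp(logits)
        phi /= phi.sum(axis=1, keepdims=True)
        # centroid update: phi-weighted mean of all the points
        mu = (phi.T @ X) / phi.sum(axis=0)[:, None]
    return mu, phi
```

As β → ∞ the weights collapse onto the nearest centroid and the algorithm reduces to ordinary hard K-means; smaller β spreads each point's weight over more clusters.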
- With GMM-EM we can do soft clustering too. The EM algorithm can be used to learn the parameters of a Gaussian mixture model. For this model, we assume the following generative process for the data:

c_i ~ Discrete(π),   x_i | c_i ~ N(μ_{c_i}, Σ_{c_i}),   i = 1,…,n.
In other words, the i-th observation is first assigned to one of K clusters according to the probabilities in the vector π, and the value of the observation x_i is then generated from one of K multivariate Gaussian distributions, using the mean and covariance indexed by c_i. The EM algorithm seeks to maximize the marginal likelihood

p(x_1,…,x_n | π, μ_1,…,μ_K, Σ_1,…,Σ_K) = ∏_{i=1}^{n} ∑_{k=1}^{K} π_k N(x_i | μ_k, Σ_k)
over all parameters π, μ_1,…,μ_K, Σ_1,…,Σ_K, using the cluster assignments c_1,…,c_n as the hidden data.
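A minimal sketch of EM for the Gaussian mixture, assuming NumPy and SciPy; the small ridge added to each covariance in the M-step is a numerical-stability choice of this sketch, not part of the assignment:

```python
import numpy as np
from scipy.stats import multivariate_normal

def gmm_em(X, K, n_iter=100, seed=0):
    """EM for a K-component Gaussian mixture; returns the parameters
    and the per-iteration log-likelihoods."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n, d = X.shape
    pi = np.full(K, 1.0 / K)                                  # mixing weights
    mu = X[rng.choice(n, size=K, replace=False)].copy()      # component means
    Sigma = np.stack([np.eye(d) for _ in range(K)])          # component covariances
    log_liks = []
    for _ in range(n_iter):
        # E-step: phi_i(k) = pi_k N(x_i | mu_k, Sigma_k) / sum_k' pi_k' N(x_i | mu_k', Sigma_k')
        dens = np.stack([pi[k] * multivariate_normal.pdf(X, mean=mu[k], cov=Sigma[k])
                         for k in range(K)], axis=1)          # (n, K)
        phi = dens / dens.sum(axis=1, keepdims=True)
        log_liks.append(np.log(dens.sum(axis=1)).sum())
        # M-step: re-estimate pi, mu, Sigma from the responsibilities
        Nk = phi.sum(axis=0)
        pi = Nk / n
        mu = (phi.T @ X) / Nk[:, None]
        for k in range(K):
            Xc = X - mu[k]
            Sigma[k] = (phi[:, k, None] * Xc).T @ Xc / Nk[k]
            Sigma[k] += 1e-6 * np.eye(d)                      # ridge for numerical stability
    return pi, mu, Sigma, log_liks
```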
The following figures show the algorithms to be implemented for clustering.



The following animations show the output of the clustering algorithms on a few datasets (with k=3 clusters) and how they converge over the iterations; weighted K-means is run with stiffness parameter β=10.
Dataset1



Dataset2



Dataset3



Dataset4



Dataset5



Dataset6



Dataset7



Dataset8

The next animation shows the results of weighted K-means with stiffness parameter β=10.

The next animation shows the results of weighted K-means with stiffness parameter β=1.


Here is how the log-likelihood for EM increases monotonically over the iterations on a particular dataset (each EM step is guaranteed not to decrease the log-likelihood):
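For example, using the hypothetical gmm_em sketch above (which records the log-likelihood at each iteration), such a curve can be reproduced on toy data:

```python
import numpy as np
import matplotlib.pyplot as plt

# toy data: three 2-d Gaussian blobs (illustrative only)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(100, 2))
               for c in ([0, 0], [3, 3], [0, 4])])

pi, mu, Sigma, log_liks = gmm_em(X, K=3)  # gmm_em from the sketch above
plt.plot(log_liks, marker="o")
plt.xlabel("EM iteration")
plt.ylabel("log-likelihood")
plt.show()
```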

The following animations show GMM-EM convergence on a few more datasets: