Clustering is the act of partitioning observations into groups, or clusters, such that the data points within a cluster are more similar to one another than to points in other clusters. Cluster analysis is commonly used in data mining, pattern recognition and machine learning. While MATLAB includes several clustering tools, this tutorial looks at the function kmeans. After assigning n observations to k clusters, we can use binary classification to investigate the sensitivity and specificity of our clustering.
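As a rough illustration of the partitioning step described above, the following minimal sketch calls kmeans on made-up data. It assumes the Statistics and Machine Learning Toolbox is installed; the synthetic data, the seed, the choice of k = 2 and the variable names are all assumptions for illustration, not part of the original tutorial.

```matlab
% Minimal sketch: partition synthetic 2-D points into k = 2 clusters.
% The data below are invented for illustration only.
rng(1);                                 % fix the random seed for repeatability
X = [randn(50, 2); randn(50, 2) + 4];   % two loose groups of 50 points each

k = 2;                                  % number of clusters (assumed)
[idx, C] = kmeans(X, k, 'Replicates', 5);

% idx(i) is the cluster index (1..k) assigned to row i of X;
% each row of C is the centroid of one cluster.
disp(C);
```

Using several replicates reruns the algorithm from different starting centroids and keeps the best solution, which makes the result less sensitive to initialization.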
Binary classification is the act of assigning an item to one of two groups based on specified measures or variables. We have previously discussed methods for determining the values of logic gates using neural networks (Part 1 and Part 2); here we begin a series on clustering algorithms that can be performed in MATLAB, including k-means clustering and Gaussian Mixture Models. Before doing so, we will discuss how classification is evaluated, focusing on sensitivity, specificity and how to calculate these values in MATLAB.
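To make the two measures concrete: sensitivity is the fraction of actual positives that are labeled positive, TP / (TP + FN), and specificity is the fraction of actual negatives that are labeled negative, TN / (TN + FP). The sketch below computes both in base MATLAB from two label vectors. The vectors actual and predicted are hypothetical values chosen for illustration, with 1 standing for the positive class and 0 for the negative class.

```matlab
% Minimal sketch: sensitivity and specificity from binary labels.
% Both vectors are hypothetical (1 = positive, 0 = negative).
actual    = [1 1 1 1 0 0 0 0 0 0];
predicted = [1 1 1 0 0 0 0 0 1 1];

% Count the four outcomes of a binary classification.
TP = sum(actual == 1 & predicted == 1);   % true positives
FN = sum(actual == 1 & predicted == 0);   % false negatives
TN = sum(actual == 0 & predicted == 0);   % true negatives
FP = sum(actual == 0 & predicted == 1);   % false positives

sensitivity = TP / (TP + FN);   % true positive rate
specificity = TN / (TN + FP);   % true negative rate

fprintf('Sensitivity: %.4f\n', sensitivity);
fprintf('Specificity: %.4f\n', specificity);
```

With these labels there are 3 true positives, 1 false negative, 4 true negatives and 2 false positives, so the sensitivity is 0.75 and the specificity is 4/6, about 0.667.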