Cluster Analysis and Unsupervised Machine Learning in Python
Data science techniques for pattern recognition, data mining, k-means clustering, hierarchical clustering, and KDE.
What you’ll learn
- Understand the regular K-Means algorithm
- Understand and enumerate the disadvantages of K-Means Clustering
- Understand the soft or fuzzy K-Means Clustering algorithm
- Implement Soft K-Means Clustering in code
- Understand Hierarchical Clustering
- Explain algorithmically how Hierarchical Agglomerative Clustering works
- Apply SciPy’s Hierarchical Clustering library to data
- Understand how to read a dendrogram
- Understand the different distance metrics used in clustering
- Understand the difference between single linkage, complete linkage, Ward linkage, and UPGMA
- Understand the Gaussian mixture model and how to use it for density estimation
- Write a GMM in Python code
- Explain when GMM is equivalent to K-Means Clustering
- Explain the expectation-maximization algorithm
- Understand how GMM overcomes some disadvantages of K-Means
- Understand the Singular Covariance problem and how to fix it
Requirements
- Know how to code in Python and NumPy
- Install NumPy and SciPy
- Be familiar with matrix arithmetic and probability
Who this course is for:
- Students and professionals interested in machine learning and data science
- People who want an introduction to unsupervised machine learning and cluster analysis
- People who want to know how to write their own clustering code
- Professionals interested in mining large data sets to find patterns automatically