#MLNotes: Clustering Process and checks before applying clustering on the available data.
Jan 9, 2021
This may be an oversimplified version, but this article is meant as a review note for beginners.
The process and checks are simple and are outlined below:
Data Checks
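- K-Means clustering works only on numeric data, not on categorical data, because it relies on distances and means.
- Outliers should be removed, since they pull the cluster centroids toward themselves.
- Data should be scaled so that features with large ranges do not dominate the distance calculation.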
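The three checks above can be sketched in a few lines of plain Python. This is only a minimal illustration, not a definitive recipe: the helper names (`numeric_columns`, `drop_outliers_iqr`, `standardize`) are made up for this note, the 1.5 x IQR rule is just one common way to flag outliers, and standard scaling (mean 0, standard deviation 1) is assumed as the scaling method.

```python
import statistics

def is_numeric(value):
    """Return True for ints and floats (bool is excluded on purpose)."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)

def numeric_columns(rows):
    """Return the indices of the columns whose values are all numeric."""
    n_cols = len(rows[0])
    return [j for j in range(n_cols) if all(is_numeric(r[j]) for r in rows)]

def drop_outliers_iqr(rows, k=1.5):
    """Keep rows where every feature lies in [Q1 - k*IQR, Q3 + k*IQR].

    The value k=1.5 is an assumed, conventional choice.
    """
    bounds = []
    for col in zip(*rows):
        q1, _, q3 = statistics.quantiles(col, n=4)
        iqr = q3 - q1
        bounds.append((q1 - k * iqr, q3 + k * iqr))
    return [
        r for r in rows
        if all(lo <= v <= hi for v, (lo, hi) in zip(r, bounds))
    ]

def standardize(rows):
    """Rescale every column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    # A constant column has standard deviation 0; divide by 1.0 instead.
    stds = [statistics.pstdev(c) or 1.0 for c in cols]
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]

# Made-up sample: age, income, and a categorical segment label.
raw = [
    [25, 40000, "A"], [30, 42000, "B"], [28, 41000, "A"],
    [27, 39000, "C"], [29, 43000, "B"], [26, 400000, "A"],
]
cols = numeric_columns(raw)                      # the label column is dropped
numeric = [[r[j] for j in cols] for r in raw]
clean = drop_outliers_iqr(numeric)               # the extreme income row is dropped
scaled = standardize(clean)
```

On a real project you would more likely reach for a data-frame library and a ready-made scaler, but the logic stays the same: keep numeric features, filter extreme rows, then scale.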
Check Clustering Tendency on the Available Data
- You may want to measure clustering tendency with the Hopkins statistic, which checks whether the data looks uniformly random or contains meaningful clusters.
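To make the idea concrete, here is a rough sketch of the Hopkins statistic in plain Python. The function names and parameters (`hopkins_statistic`, `sample_size`, `seed`) are my own for this note. It uses the convention H = sum(u^d) / (sum(u^d) + sum(w^d)), where w are nearest-neighbour distances of sampled real points, u are nearest-neighbour distances of uniform random points drawn inside the data's bounding box, and d is the number of features. Under this convention, values near 0.5 suggest random data and values near 1 suggest clustered data. Some references and libraries define it the other way round, so check the convention before interpreting a number.

```python
import math
import random

def nearest_distance(point, data, skip_index=None):
    """Euclidean distance from `point` to its nearest neighbour in `data`."""
    return min(
        math.dist(point, other)
        for i, other in enumerate(data)
        if i != skip_index
    )

def hopkins_statistic(data, sample_size=None, seed=0):
    """Estimate the Hopkins statistic of `data` (a list of numeric rows)."""
    rng = random.Random(seed)
    n, dim = len(data), len(data[0])
    # Sampling about 10% of the points is an assumed default.
    m = sample_size or max(1, n // 10)

    # w: distance from each sampled real point to its nearest other real point.
    sample_idx = rng.sample(range(n), m)
    w = [nearest_distance(data[i], data, skip_index=i) for i in sample_idx]

    # u: distance from uniform random points (inside the bounding box
    # of the data) to the nearest real point.
    lows = [min(col) for col in zip(*data)]
    highs = [max(col) for col in zip(*data)]
    u = []
    for _ in range(m):
        p = [rng.uniform(lo, hi) for lo, hi in zip(lows, highs)]
        u.append(nearest_distance(p, data))

    su = sum(x ** dim for x in u)
    sw = sum(x ** dim for x in w)
    if su + sw == 0:  # degenerate case: all points are identical
        return 0.5
    return su / (su + sw)
```

Because the statistic depends on random sampling, it is worth running it with a few different seeds and looking at the spread rather than trusting a single value.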
Apply Clustering Algorithm
- You may want to use the K-Means algorithm.
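For review purposes, here is a bare-bones sketch of K-Means (Lloyd's algorithm) in plain Python. It is meant to show the assign-then-update loop, not to replace a library implementation: the function name `kmeans` and its parameters are chosen for this note, the initial centroids are simply picked at random from the data, and the number of clusters `k` is assumed to be known in advance.

```python
import math
import random

def kmeans(data, k, max_iter=100, seed=0):
    """Cluster `data` (a list of numeric rows) into k groups.

    Returns (labels, centroids), where labels[i] is the cluster index
    assigned to data[i].
    """
    rng = random.Random(seed)
    # Assumed initialisation: k distinct rows picked at random.
    centroids = [list(p) for p in rng.sample(data, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in data
        ]
        if new_labels == labels:  # no change means the loop has converged
            break
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(data, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Made-up sample with two obvious groups.
points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels, centroids = kmeans(points, k=2)
```

The result depends on the random starting centroids, which is why library implementations usually run several initialisations and keep the best one.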