#MLNotes: Clustering process and checks before applying clustering to the available data

This may be an oversimplified version, but this article is meant as a review note for beginners.

The process and checks are simple and are outlined below:

Data Checks

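  • K-Means clustering works only on numeric data, not on categorical data, because it relies on distances to cluster means.
  • Outliers should be removed, since they pull the cluster centroids toward themselves.
  • Data should be scaled appropriately so that no single feature dominates the distance calculation.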
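
The three data checks above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the article does not say how to detect outliers or how to scale, so the 1.5 × IQR rule and z-score standardization used here are common choices, not the only ones, and the function names are hypothetical.

```python
import numpy as np

def remove_outliers_iqr(X, k=1.5):
    """Drop rows where any feature lies outside [Q1 - k*IQR, Q3 + k*IQR].
    The IQR rule is an assumption; the article does not prescribe a method."""
    q1, q3 = np.percentile(X, [25, 75], axis=0)
    iqr = q3 - q1
    inside = (X >= q1 - k * iqr) & (X <= q3 + k * iqr)
    return X[np.all(inside, axis=1)]

def standardize(X):
    """Scale each feature to zero mean and unit variance (z-score)."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # constant features: avoid division by zero
    return (X - mean) / std

def prepare_for_kmeans(X):
    """Run the three checks: numeric only, outliers removed, features scaled."""
    X = np.asarray(X)
    if not np.issubdtype(X.dtype, np.number):
        raise TypeError("K-Means needs numeric features; "
                        "encode or drop categorical columns first")
    return standardize(remove_outliers_iqr(X.astype(float)))

# Made-up example data: 100 ordinary points plus one obvious outlier.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(size=(100, 2)), [[50.0, 50.0]]])
prepared = prepare_for_kmeans(X)
print(X.shape, "->", prepared.shape)
```

Removing outliers before scaling matters here, because an extreme value would otherwise inflate the standard deviation used for scaling.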

Check Clustering Tendency of the Available Data

  • You may want to measure clustering tendency with the Hopkins statistic, which tests whether the data is spread uniformly at random or contains meaningful clusters.
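
As a rough illustration of the idea, the sketch below computes a Hopkins statistic with NumPy. It rests on a few assumptions: it uses plain nearest-neighbour distances (some formulations raise them to the power of the number of dimensions), it samples 10% of the points by default, and it follows the convention where values near 0.5 suggest random data and values near 1 suggest clustered data (some sources report the complement instead). The function name and parameters are hypothetical.

```python
import numpy as np

def hopkins_statistic(X, m=None, seed=0):
    """Return H = sum(u) / (sum(u) + sum(w)) for the data matrix X."""
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    if m is None:
        m = max(1, n // 10)  # sample size: an assumed default of 10%
    rng = np.random.default_rng(seed)

    # w: distance from m sampled real points to their nearest OTHER real point
    idx = rng.choice(n, size=m, replace=False)
    dist_real = np.linalg.norm(X[idx, None, :] - X[None, :, :], axis=2)
    dist_real[np.arange(m), idx] = np.inf  # ignore each point's distance to itself
    w = dist_real.min(axis=1)

    # u: distance from m uniform random points (inside the data's bounding box)
    # to their nearest real point
    uniform = rng.uniform(X.min(axis=0), X.max(axis=0), size=(m, d))
    u = np.linalg.norm(uniform[:, None, :] - X[None, :, :], axis=2).min(axis=1)

    return u.sum() / (u.sum() + w.sum())

# Made-up data: two tight blobs versus uniformly scattered points.
rng = np.random.default_rng(1)
clustered = np.vstack([rng.normal(0.0, 0.1, size=(250, 2)),
                       rng.normal(5.0, 0.1, size=(250, 2))])
scattered = rng.uniform(0.0, 1.0, size=(500, 2))
h_clustered = hopkins_statistic(clustered)
h_scattered = hopkins_statistic(scattered)
print(h_clustered, h_scattered)
```

The statistic depends on a random sample, so it is worth running it with a few different seeds before deciding whether clustering is worthwhile.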

Apply Clustering Algorithm

  • You may want to use the K-Means algorithm.
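
To make the algorithm concrete, here is a minimal NumPy sketch of the standard K-Means loop (assign each point to its nearest centroid, then move each centroid to the mean of its points). It is a teaching sketch, not a production implementation: the random initialization, the iteration cap, and the function name are assumptions, and in practice a library implementation such as scikit-learn's KMeans would usually be preferred.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Cluster X into k groups; returns (labels, centroids)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random (assumed init scheme).
    centroids = X[rng.choice(len(X), size=k, replace=False)]

    def assign(centroids):
        # Label each point with the index of its nearest centroid.
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        return distances.argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(centroids)
        # Move each centroid to the mean of its points;
        # keep the old centroid if a cluster ends up empty.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break  # centroids stopped moving: converged
        centroids = new_centroids

    return assign(centroids), centroids

# Made-up data: two well-separated blobs of 50 points each.
rng = np.random.default_rng(2)
data = np.vstack([rng.normal(0.0, 0.3, size=(50, 2)),
                  rng.normal(5.0, 0.3, size=(50, 2))])
labels, centroids = kmeans(data, k=2)
print(centroids)
```

The input here is assumed to be numeric, cleaned of outliers, and scaled, as described in the data checks.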

Written by Vaibhav Pandey

https://vaibhavpandey.co.uk, 9x Azure Certs, Master's Degree in AI 2023, PG Diploma in AI 2022, Dissertation in Cancer Prediction, Builds with AI