#MLNotes: Types of Data in a Machine Learning problem

Vaibhav Pandey
2 min read · Nov 29, 2020


This topic has probably appeared a million times on the internet. Here is a quick summary of the types of data in a machine learning problem, with real-world examples we can all relate to. Every data scientist should understand these data types well, because they help determine which type of algorithm to select for the machine learning problem under consideration.

Types of data

At a high level, data is classified as Qualitative (categorical) data and Quantitative (numeric) data.

Quantitative Data

1. Numerical

Numeric data belongs to the Quantitative category and has two subtypes:

1.A - Discrete

Discrete data can be counted and takes only specific, separate values; it cannot take fractional or decimal values. It should be stored in an integer data type (for example, int in Python or C#).

  • Number of rooms, fans, bulbs, bathrooms
  • Retail: count of SKUs (Stock Keeping Units)
  • Banking: number of frauds, payments, accounts, payees
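As a minimal sketch of discrete data in Python, the snippet below counts events with the standard library's `collections.Counter`. The transaction labels and variable names are made up for illustration and do not come from any real dataset.

```python
from collections import Counter

# Hypothetical transaction labels for one bank account (illustrative data only).
transactions = ["payment", "fraud", "payment", "payment", "fraud"]

# Counting produces discrete values: whole numbers with no fractional part.
counts = Counter(transactions)

fraud_count = counts["fraud"]
payment_count = counts["payment"]

# Counts are naturally held in Python's int type.
print(fraud_count, type(fraud_count).__name__)
print(payment_count, type(payment_count).__name__)
```

A count such as "2.5 frauds" has no meaning here, which is what makes the integer type the natural fit.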

1.B - Continuous

Continuous data can take fractional or decimal values and can fall anywhere within a range. A floating-point data type (for example, float in Python) should be used.

  • Salary, money, age, water in a container, etc.
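The difference between discrete and continuous values can be sketched with a small heuristic check. The function name `numeric_kind` and the sample values are assumptions chosen for illustration. The check only inspects the values it is given, so a continuous measurement that happens to contain only whole numbers would be labelled discrete.

```python
def numeric_kind(values):
    """Return 'discrete' if every value is a whole number, else 'continuous'.

    A rough heuristic for illustration, not a definitive test.
    """
    if all(float(v).is_integer() for v in values):
        return "discrete"
    return "continuous"


rooms = [2, 3, 4, 3]               # counts: whole numbers only
water_litres = [1.5, 0.75, 2.0]    # measurements: fractions allowed

print(numeric_kind(rooms))
print(numeric_kind(water_litres))
```

In practice, knowledge of what the column represents (a count or a measurement) is a more reliable guide than the values alone.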

Qualitative Data

Categorical

2. Nominal

Nominal data has no inherent order: unlike levels such as high, medium or low, its values cannot be ranked or arranged in any meaningful sequence. It is therefore also referred to as Unordered Categorical data. Key examples:

  • Continents — Asia, Africa, Europe
  • Countries — India, China, United States, United Kingdom, etc.
  • Gender — Male or Female
  • Furnishing status for a house: Furnished, Semi Furnished, Unfurnished (treated as unordered here, although it can be argued to have a natural order).
  • Color of the ball, bat, toothbrush — Blue, Green, Red, Black, etc.
  • Color of car — Red, Green, Blue, White, etc.
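Because nominal values have no order, one common way to turn them into numbers is one-hot encoding, where each category gets its own 0/1 column. The sketch below is a plain-Python illustration; the helper name `one_hot` and the sample colors are assumptions, and the categories are sorted alphabetically only to make the column order reproducible, not because the colors have a rank.

```python
def one_hot(values):
    """One-hot encode a list of nominal values (hypothetical helper)."""
    # Sorting fixes the column order; it does not imply any ranking.
    categories = sorted(set(values))
    rows = [[1 if value == category else 0 for category in categories]
            for value in values]
    return categories, rows


colors = ["Red", "Green", "Blue", "Red"]
categories, encoded = one_hot(colors)

print(categories)  # ['Blue', 'Green', 'Red']
for color, row in zip(colors, encoded):
    print(color, row)
```

Each row contains exactly one 1, so no color is represented as "larger" or "smaller" than another.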

3. Ordinal

Ordinal data has a natural order that can be used to rank or arrange the items in a collection. The values are categories, but like numbers they can be compared as lower or higher, although the gaps between levels are not necessarily equal. It is also referred to as Ordered Categorical data.

  • Income level — low, medium and high
  • Review ratings on products on e-commerce websites such as Amazon and Flipkart
  • Star ratings on electric devices
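The order in ordinal data can be made explicit with a mapping from each level to its rank. The sketch below uses the income levels from the list above; the mapping name `INCOME_ORDER` and the sample values are assumptions for illustration.

```python
# Explicit rank for each level; the integers encode order only,
# not the size of the gap between levels.
INCOME_ORDER = {"low": 0, "medium": 1, "high": 2}

levels = ["high", "low", "medium", "low"]

# Ordinal encoding: replace each level with its rank.
encoded = [INCOME_ORDER[level] for level in levels]

# Sorting by rank arranges the items from lowest to highest.
sorted_levels = sorted(levels, key=INCOME_ORDER.__getitem__)

print(encoded)        # [2, 0, 1, 0]
print(sorted_levels)  # ['low', 'low', 'medium', 'high']
```

Sorting the same strings alphabetically would give high, low, low, medium, which is why the order has to be stated explicitly rather than inferred from the labels.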

Further references for readers

Please visit the following link:

Khan Academy: https://www.youtube.com/watch?v=dOr0NKyD31Q

Vaibhav Pandey

https://vaibhavpandey.co.uk, 9x Azure Certified. I work for a tech major, am never dull, keep sharpening my skills and love sharing what I learn in the simplest form.