Machine Learning Archives - Machine Learning Plus

KL Divergence – What is it and mathematical details explained

Leave a Comment / Machine Learning / By Selva Prabhakaran

At its core, KL (Kullback-Leibler) Divergence is a statistical measure that quantifies the dissimilarity between two probability distributions. Think of it like a mathematical ruler that tells us the “distance” or difference between two probability distributions. Remember, in data science, we’re often working with probabilities – the chances of events happening. So, if we have …
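
As a rough illustration of the idea in this teaser, here is a minimal sketch for two discrete distributions given as lists of probabilities over the same outcomes. The function name `kl_divergence` and the example values are hypothetical, not taken from the article. Note that the result is not symmetric, which is why "distance" is in quotes above.

```python
import math


def kl_divergence(p, q):
    """Return D_KL(P || Q) in nats for two discrete distributions.

    p and q are assumed to be equal-length lists of probabilities
    over the same outcomes, each summing to 1.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue  # terms with p(x) = 0 contribute nothing
        if qi == 0:
            return math.inf  # P puts mass where Q has none
        total += pi * math.log(pi / qi)
    return total


# Hypothetical example: a fair coin versus a heavily biased coin.
fair = [0.5, 0.5]
biased = [0.9, 0.1]
print(kl_divergence(fair, biased))
print(kl_divergence(biased, fair))
```

Swapping the arguments gives a different value, so the order of the two distributions matters.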

Probe Method – How to select features for ML models

The Probe method is a highly intuitive approach to feature selection. If a feature in the dataset contains only random numbers, it is not going to be a useful feature. Any feature that has lower feature importance than a random feature is suspicious. In this one, we will see: What is the Probe Method for …

Cook’s Distance for Detecting Influential Observations

Cook’s distance measures the influence exerted by each observation on the trained model. It is computed by building a regression model and is therefore affected only by the X variables included in the model. What is Cook’s Distance? Cook’s distance measures the influence exerted by each data point (row / …

How to detect outliers with z-score

The Z score, also called the standard score, is used to scale the features in a dataset for machine learning model training. It can also be used to detect outliers. In this one, we will first see how to compute Z-scores and then use them to detect outliers. How is Z-score used in machine learning? Now, …
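
The two steps named in this teaser (compute Z-scores, then flag outliers) can be sketched with the standard library alone. The helper names, the sample data, and the cutoff of 3 are assumptions for illustration; 3 is a common convention, not a value stated in the teaser.

```python
import statistics


def z_scores(values):
    """Return (value - mean) / std for each value, using the population std."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0] * len(values)  # all values identical: nothing deviates
    return [(v - mean) / std for v in values]


def zscore_outliers(values, threshold=3.0):
    """Return the values whose absolute Z-score exceeds the threshold."""
    return [v for v, z in zip(values, z_scores(values)) if abs(z) > threshold]


# Hypothetical data: nineteen values near 11 and one extreme value.
data = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11,
        12, 10, 11, 13, 12, 11, 10, 12, 11, 100]
print(zscore_outliers(data))
```

With very small samples no point can reach a Z-score of 3, so the cutoff may need to be lowered or a different method used.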

How to detect outliers using Z score?

Z score is one of the most important concepts in statistics. It is also called the standard score. Typically, it is used to scale the features for machine learning, but it can also be used to detect outliers. Also Read: How to detect outliers with IQR and Box Plots How is Z-score used in machine learning? Now, …

How to detect outliers using IQR and Boxplots?

Let’s understand what outliers are, how to identify them using IQR and Boxplots, and how to treat them if appropriate. 1. What are outliers? In statistics, outliers are data points that differ significantly from the other data points in the dataset. There can be various reasons behind the outliers. It can be because of …

MICE imputation – How to predict missing values using machine learning in Python

MICE Imputation, short for ‘Multiple Imputation by Chained Equations’, is an advanced missing data imputation technique that uses multiple iterations of Machine Learning model training to predict the missing values using known values from other features in the data as predictors. What is MICE Imputation? You can impute missing values by predicting them using other …

Spline Interpolation – How to find the polynomial curve to interpolate missing values

Spline interpolation is a special type of interpolation where a piecewise lower order polynomial called a spline is fitted to the data points. That is, instead of fitting one higher order polynomial (as in polynomial interpolation), multiple lower order polynomials are fitted on smaller segments. This can be implemented in Python. You can do non-linear spline interpolation …

Interpolation in Python – How to interpolate missing data, formula and approaches

Interpolation can be used to impute missing data. Let’s see the formula and how to implement it in Python. But, you need to be careful with this technique and try to really understand whether or not this is a valid choice for your data. Often, interpolation is applicable when the data is in a sequence or …
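
The teaser does not say which formula the article uses. As one common case, here is a minimal sketch of linear interpolation over an evenly spaced sequence, where each gap is filled from its two nearest known neighbours with y = y0 + (y1 - y0) * (x - x0) / (x1 - x0). The function name `interpolate_linear` and the sample series are hypothetical.

```python
def interpolate_linear(values):
    """Fill None gaps that lie between two known values by linear interpolation.

    Positions are assumed to be evenly spaced (the list index is used as x).
    Leading or trailing None values have no neighbour on one side and are
    left as None.
    """
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        y0, y1 = values[left], values[right]
        for i in range(left + 1, right):
            # Straight line through (left, y0) and (right, y1), evaluated at i.
            filled[i] = y0 + (y1 - y0) * (i - left) / (right - left)
    return filled


# Hypothetical ordered series with missing entries marked as None.
series = [1.0, None, None, 4.0, None, 6.0]
print(interpolate_linear(series))
```

This only makes sense when neighbouring positions are meaningfully related, such as ordered or time-indexed data.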

Missing Data Imputation Approaches | How to handle missing values in Python

Machine Learning works on the idea of garbage in – garbage out. If you put in useless junk data to the machine learning algorithm, the results will also be, well, ‘junk’. The quality and consistency of results depend on the data provided. Missing values in data degrade the quality. Why clean the data before training …

Exploratory Data Analysis (EDA) – How to do EDA for Machine Learning Problems using Python

Exploratory Data Analysis, simply referred to as EDA, is the step where you understand the data in detail. You understand each variable individually by calculating frequency counts, visualizing the distributions, etc. You also study the relationships between the various combinations of predictor and response variables by creating scatterplots, computing correlations, etc. EDA is typically part of every …

ML Modeling – Problem statement and Data description

ML modeling is the step where machine learning is used to find patterns in data and use that learned knowledge to predict an outcome. The type of ML modeling problem we are going to solve here is called ‘Churn Modeling’. Let’s first understand the Churn modeling problem statement and then go over the data …

AdaBoost – An Introduction to AdaBoost

AdaBoost is one of the earliest implementations of the boosting algorithm. It forms the base of other boosting algorithms, like gradient boosting and XGBoost. This tutorial will take you through the math behind implementing this algorithm and also a practical example of using the scikit-learn AdaBoost API. Contents: What is boosting? What is AdaBoost? Algorithm …

How to formulate machine learning problem

Let’s understand how to define and formulate the machine learning problem (for predictive modeling) from a business problem. This structured approach should help you apply the process to most other types of predictive modeling problems at work. Introduction Often in ML teams, you will hear from the business/company departments about the problems and issues they …

Build your first ML project

Let’s build your first machine learning project with Python from scratch. “But I am a complete beginner, I am not ready yet!..” – Your mind voice. If you have been looking to get started in ML, but can’t really figure out how and where to start, then this one is for you. Just read on.. …

Setup Python environment for ML

Python is the most popular programming language used for AI and machine learning. Let’s see how to set up a Python environment for ML using Anaconda. How to install Python? Simply visit Python.org, go to the downloads section, download the latest version shown there, and install it like you would any other software. To do machine learning …

Train Test Split – How to split data into train and test for validating machine learning models?

The train-test split technique is a way of evaluating the performance of machine learning models. Whenever you build machine learning models, you will be training the model on a specific dataset (X and y). Once trained, you want to ensure the trained model is capable of performing well on the unseen test data as well. …
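
To make the idea concrete, here is a minimal sketch of a random hold-out split written with the standard library. The function name `split_train_test`, its parameters, and the toy data are hypothetical; in practice scikit-learn's `train_test_split` in `sklearn.model_selection` is the usual tool for this.

```python
import random


def split_train_test(X, y, test_size=0.2, seed=0):
    """Shuffle row indices and hold out a fraction of the rows as the test set.

    X and y are assumed to be equal-length sequences where X[i] pairs with y[i].
    """
    if len(X) != len(y):
        raise ValueError("X and y must have the same length")
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_test = int(round(len(X) * test_size))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    X_train = [X[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    y_train = [y[i] for i in train_idx]
    y_test = [y[i] for i in test_idx]
    return X_train, X_test, y_train, y_test


# Hypothetical toy data: ten rows, with y derived from X so pairs are easy to check.
X = list(range(10))
y = [x * 10 for x in X]
X_train, X_test, y_train, y_test = split_train_test(X, y, test_size=0.2)
print(len(X_train), len(X_test))
```

The model is then fitted on the training rows only and scored on the held-out rows, which it has never seen.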

Task Checklist for Almost Any Machine Learning Project

A cheat sheet of tasks and things to take care of for every end-to-end ML project. In this, I write down a checklist of items and tasks to check whenever you start with a new Data Science / ML project. Once you start off with the project there will be so many things going …

Data Science Roadmap – How to become a Data Scientist? (6 month self study plan)

Today, I discuss the Data Science Roadmap, the missing guide to self study machine learning. I’ll discuss what exactly you need to know and do in order to self study Data science / ML / AI / Stats. I will provide you with some of the best resources for each topic, why you need to …

Why learn the math behind Machine Learning and AI?

Why learn the math behind machine learning algorithms when you can readily implement them using Python libraries like scikit-learn, h2o, statsmodels, etc.? This is a fair question, especially coming from beginners, when it is easy to implement ML with a few lines of code and get the results fast. Now, you must understand that learning …
