Python Archives - Machine Learning Plus

How to convert Python code to Cython (and speed up 100x)?

Python / By Selva Prabhakaran

Using Cython, you can speed up existing Python code by 100x or more. This is possible because, with some basic code changes, Cython converts parts of the Python code to C. Even without any code change, a speedup of 2x is commonly observed, as in this post’s example. Because, everything …

How to convert Python to Cython inside Jupyter Notebooks?

Python / By Selva Prabhakaran

Let’s see how to cythonize Python code inside Jupyter notebooks step by step. In this post we will: define and time a Python function to benchmark; run Python using Cython in a Jupyter Notebook; cythonize the function. But let’s first answer a basic question: What is the difference between CPython …

KL Divergence – What is it and mathematical details explained

Machine Learning / By Selva Prabhakaran

At its core, KL (Kullback-Leibler) Divergence is a statistical measure that quantifies the dissimilarity between two probability distributions. Think of it like a mathematical ruler that tells us the “distance” or difference between two probability distributions. Remember, in data science, we’re often working with probabilities – the chances of events happening. So, if we have …
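
As a rough illustration of the measure described in this excerpt, here is a minimal sketch for two discrete distributions. The function name `kl_divergence` and the example values are assumptions made for illustration, not taken from the post. The result is in nats, and because KL divergence is not symmetric, the “distance” analogy holds only loosely.

```python
import math

def kl_divergence(p, q):
    """KL(P || Q) in nats for two discrete distributions of equal length."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0:
            continue  # a term with p_i = 0 contributes 0 by convention
        if q_i == 0:
            return math.inf  # P puts mass where Q has none
        total += p_i * math.log(p_i / q_i)
    return total

# Hypothetical example distributions
p = [0.5, 0.5]
q = [0.9, 0.1]
print(kl_divergence(p, q))  # divergence of Q from P
print(kl_divergence(q, p))  # a different value: the measure is asymmetric
```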

Probe Method – How to select features for ML models

Machine Learning / By Selva Prabhakaran

The Probe method is a highly intuitive approach to feature selection. If a feature in the dataset contains only random numbers, it is not going to be a useful feature. Any feature that has lower feature importance than a random feature is suspicious. In this one, we will see: What is the Probe Method for …
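
The idea in this excerpt can be sketched as follows. This is only an illustration on synthetic data: the feature names are hypothetical, and absolute Pearson correlation with the target is used here as a simple stand-in for whatever feature importance measure the post actually uses.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Synthetic data (assumption): one informative feature and one noise feature
signal = rng.normal(size=n)
noise = rng.normal(size=n)
y = 3.0 * signal + rng.normal(scale=0.5, size=n)

# The probe: a column of pure random numbers added alongside the real features
features = {"signal": signal, "noise": noise, "probe": rng.normal(size=n)}

def importance(x, target):
    """Stand-in importance score: absolute Pearson correlation with the target."""
    return abs(np.corrcoef(x, target)[0, 1])

scores = {name: importance(col, y) for name, col in features.items()}

# Keep only the real features that score higher than the random probe
kept = [name for name in features
        if name != "probe" and scores[name] > scores["probe"]]
print(scores)
print(kept)
```

In practice the probe is usually drawn several times, since a single random column can score high by chance.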

Cook’s Distance for Detecting Influential Observations

Machine Learning / By Selva Prabhakaran

Cook’s distance measures the influence exerted by each observation on the trained model. It is computed by building a regression model and is therefore affected only by the X variables included in the model. What is Cook’s Distance? Cook’s distance measures the influence exerted by each data point (row / …
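
A minimal sketch of the computation, assuming an ordinary least squares fit with an intercept and the standard textbook definition D_i = e_i² / (p · s²) · h_ii / (1 − h_ii)², where e_i is the residual, h_ii the leverage, p the number of fitted coefficients and s² the residual mean square. The function name and the synthetic data are illustrative assumptions, not the post's own code.

```python
import numpy as np

def cooks_distance(X, y):
    """Cook's distance of every row for an OLS fit of y on X (intercept added)."""
    X = np.column_stack([np.ones(len(y)), X])   # design matrix with intercept
    n, p = X.shape
    H = X @ np.linalg.pinv(X.T @ X) @ X.T       # hat (projection) matrix
    h = np.diag(H)                              # leverage of each observation
    resid = y - H @ y                           # OLS residuals
    s2 = resid @ resid / (n - p)                # residual mean square
    return (resid ** 2 / (p * s2)) * (h / (1 - h) ** 2)

# Synthetic example (assumption): a clean linear trend with one corrupted row
rng = np.random.default_rng(0)
x = np.arange(20, dtype=float)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=20)
y[-1] += 30.0                                   # make the last row influential

d = cooks_distance(x, y)
print(int(np.argmax(d)))                        # index of the most influential row
```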

How to detect outliers with z-score

Machine Learning / By Selva Prabhakaran

The Z-score, also called the standard score, is used to scale the features in a dataset for machine learning model training. It can also be used to detect outliers. In this one, we will first see how to compute Z-scores and then use them to detect outliers. How is Z-score used in machine learning? Now, …
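
A short sketch of the two steps named in this excerpt: compute Z-scores, then flag the values whose absolute Z-score exceeds a cutoff. The data and the cutoff of 3 are assumptions chosen for illustration. Note that in very small samples no point can reach a Z-score of 3, so the cutoff needs care there.

```python
import numpy as np

# Hypothetical data: 20 ordinary values and one extreme value
data = np.array([10, 11, 12, 13] * 5 + [95], dtype=float)

# Z-score: distance from the mean in units of (population) standard deviation
z = (data - data.mean()) / data.std()

threshold = 3.0                      # common rule-of-thumb cutoff (assumption)
outliers = data[np.abs(z) > threshold]
print(outliers)
```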

How to detect outliers using Z score?

Machine Learning / By Selva Prabhakaran

The Z-score is one of the most important concepts in statistics. It is also called the standard score. Typically it is used to scale the features for machine learning, but it can also be used to detect outliers. Also Read: How to detect outliers with IQR and Box Plots How is Z-score used in machine learning? Now, …

How to detect outliers using IQR and Boxplots?

Machine Learning / By Selva Prabhakaran

Let’s understand what outliers are, how to identify them using IQR and boxplots, and how to treat them if appropriate. 1. What are outliers? In statistics, outliers are those specific data points that differ significantly from other data points in the dataset. There can be various reasons behind the outliers. It can be because of …
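
A minimal sketch of IQR-based detection, assuming the common convention (the same one boxplot whiskers use) of flagging points more than 1.5 × IQR below the first quartile or above the third. The data values are hypothetical.

```python
import numpy as np

# Hypothetical data with one extreme value
data = np.array([10, 12, 11, 13, 12, 11, 10, 12, 11, 95], dtype=float)

q1, q3 = np.percentile(data, [25, 75])   # first and third quartiles
iqr = q3 - q1                            # interquartile range

lower = q1 - 1.5 * iqr                   # lower fence
upper = q3 + 1.5 * iqr                   # upper fence

outliers = data[(data < lower) | (data > upper)]
print(lower, upper, outliers)
```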

Select columns in PySpark dataframe – A Comprehensive Guide to Selecting Columns in different ways in PySpark dataframe

Introduction to PySpark – Unleashing the Power of Big Data using PySpark

Install opencv python – A Comprehensive Guide to Installing “OpenCV-Python”

install pip mac – How to install pip in MacOS?: A Comprehensive Guide

Python / By Selva Prabhakaran

Pip is a widely used package manager for Python, allowing you to install and manage Python packages easily. In this blog post, we’ll explore various methods to install Pip on MacOS. I’ll provide clear, reproducible code examples for each method, making it easy for you to get started with Pip on your MacOS system. Using …

Scrapy vs. Beautiful Soup: Which is better for web scraping?

Python / By Selva Prabhakaran

Web scraping is the technique of extracting data from a specific website or web page. This has wide applications in: research and publication; competitor and market studies; creating data for machine learning models. The extracted data can be stored in any format, be it CSV, TXT, JSON, an API, etc., so that it can …

add Python to PATH – How to add Python to the PATH environment variable in Windows?

Python / By Selva Prabhakaran

1. What is the purpose of adding Python to the PATH environment variable? Adding Python to the PATH environment variable in Windows allows you to run Python commands from any directory within the command prompt. Here are the steps to add Python to the PATH variable: 2. What is the PATH environment variable in Windows? …

AdaBoost – An Introduction to AdaBoost

Machine Learning / By Selva Prabhakaran

AdaBoost is one of the earliest implementations of the boosting algorithm. It forms the base of other boosting algorithms, like gradient boosting and XGBoost. This tutorial will take you through the math behind implementing this algorithm and also a practical example of using the scikit-learn AdaBoost API. Contents: What is boosting? What is AdaBoost? Algorithm …

Numpy.random.randint() in python

Python / By Selva Prabhakaran

The numpy.random.randint function is used to get random integers from low to high. The low value is included while the high value is excluded. The output values are drawn from the discrete uniform distribution over that range. random.randint(low, high=None, size=None, dtype=int) Purpose: The numpy random randint function is used for creating a …
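
A brief usage sketch of the signature quoted in this excerpt. The seed, bounds and shape are arbitrary choices for illustration. It simulates dice rolls, where 1 can appear but 7 never will because the high value is excluded.

```python
import numpy as np

np.random.seed(0)  # fixed seed so repeated runs give the same draws

# Integers drawn uniformly from [1, 7), i.e. 1 through 6, in a 2 x 5 array
rolls = np.random.randint(1, 7, size=(2, 5))
print(rolls)

# With only one bound given, values are drawn from [0, low)
flips = np.random.randint(2, size=10)
print(flips)
```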

How to use numpy.random.uniform() in python.

Python / By Selva Prabhakaran

The np.random.uniform() function is used to create an array with random samples from a uniform probability distribution over the given low and high values. random.uniform(low=0.0, high=1.0, size=None) Purpose: The numpy random uniform function is used for creating a numpy array with random float values in the interval from low to high. Parameters: low: float or array-like of floats, optional: Lowest …
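
A brief usage sketch of the signature quoted in this excerpt. The seed, bounds and sizes are arbitrary choices for illustration.

```python
import numpy as np

np.random.seed(0)  # fixed seed so repeated runs give the same draws

# 1000 floats drawn uniformly from the half-open interval [2.0, 5.0)
samples = np.random.uniform(low=2.0, high=5.0, size=1000)
print(samples[:5])
print(samples.mean())  # should be close to the midpoint, 3.5

# With no arguments, a single float from [0.0, 1.0) is returned
single = np.random.uniform()
print(single)
```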

Numpy.sort() in python

Python / By Selva Prabhakaran

The np.sort() function is used to sort an array along a specified axis. numpy.sort(a, axis=-1, kind=None, order=None) Purpose: This function is used for sorting the array. Parameters: a: array_like, the array to be sorted. axis: None or int, optional. Axis along which to sort; if None, the array is flattened before sorting. kind: …
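
A brief sketch of how the axis parameter changes the result. The example array is an arbitrary choice for illustration.

```python
import numpy as np

arr = np.array([[3, 1, 2],
                [0, 7, -1]])

by_row = np.sort(arr)              # default axis=-1: sort within each row
by_col = np.sort(arr, axis=0)      # sort within each column
flat = np.sort(arr, axis=None)     # flatten first, then sort

print(by_row)   # [[ 1  2  3] [-1  0  7]]
print(by_col)   # [[ 0  1 -1] [ 3  7  2]]
print(flat)     # [-1  0  1  2  3  7]
```

np.sort returns a sorted copy, so `arr` itself is left unchanged.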

numpy.median() – How to compute median in Python

Python / By Selva Prabhakaran

The numpy.median function is used to calculate the median of an array along a specific axis or multiple axes. The median is the middle value separating the higher half of a data sample from the lower half; in other words, it is the value in the middle when you sort the values. In this post, …
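
A brief sketch of the median over the whole array and along each axis. The example array is an arbitrary choice for illustration.

```python
import numpy as np

arr = np.array([[1, 3, 5],
                [2, 8, 4]])

overall = np.median(arr)           # all six values: average of the middle two
per_col = np.median(arr, axis=0)   # median down each column
per_row = np.median(arr, axis=1)   # median across each row

print(overall)   # 3.5
print(per_col)   # [1.5 5.5 4.5]
print(per_row)   # [3. 4.]
```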

Setup Python environment for ML

Machine Learning / By Selva Prabhakaran

Python is the most popular programming language used for AI and machine learning. Let’s see how to set up a Python environment for ML using Anaconda. How to install Python? Simply visit Python.org, go to the downloads section, download the latest version shown there, and install it as you would any other software. To do machine learning …
