Python

How to convert Python to Cython inside Jupyter Notebooks?

Let’s see how to cythonize Python code inside Jupyter notebooks step by step. In this post we will see how to define and time a Python function to benchmark, how to run Python using Cython in a Jupyter Notebook, and how to cythonize the function. But first, let’s answer a basic question: What is the difference between CPython …

KL Divergence

KL Divergence – What is it and mathematical details explained

At its core, KL (Kullback-Leibler) Divergence is a statistical measure that quantifies the dissimilarity between two probability distributions. Think of it like a mathematical ruler that tells us the “distance” or difference between two probability distributions. Remember, in data science, we’re often working with probabilities – the chances of events happening. So, if we have …
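As a rough illustration of the idea in this excerpt, here is a minimal sketch of KL divergence for two discrete distributions, using the standard definition KL(P || Q) = Σ p·log(p/q). The function name `kl_divergence` and the example distributions are assumptions made for illustration; they do not come from the post itself.

```python
import math

def kl_divergence(p, q):
    """KL(P || Q) in nats for two discrete distributions given as sequences.

    Hypothetical helper for illustration; assumes p and q each sum to 1.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of outcomes")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue          # terms with p = 0 contribute nothing
        if qi == 0:
            return math.inf   # P puts mass where Q has none
        total += pi * math.log(pi / qi)
    return total

# Illustrative distributions (assumed, not from the post)
p = [0.5, 0.5]
q = [0.9, 0.1]

print(kl_divergence(p, q))  # divergence of Q from P
print(kl_divergence(q, p))  # a different value: KL is not symmetric
print(kl_divergence(p, p))  # 0.0 for identical distributions
```

Because the two directions give different values, KL divergence is only a "distance" in the loose, quoted sense used above.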

Cook’s Distance for Detecting Influential Observations

Cook’s distance measures the influence exerted by each observation on a trained model. It is computed by building a regression model and is therefore affected only by the X variables included in that model. What is Cook’s Distance? Cook’s distance measures the influence exerted by each data point (row / …

How to detect outliers using IQR and Boxplots?

Let’s understand what outliers are, how to identify them using IQR and boxplots, and how to treat them where appropriate. 1. What are outliers? In statistics, outliers are data points that differ significantly from the other data points in the dataset. Outliers can arise for various reasons. It can be because of …
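The excerpt is cut off before it reaches the detection step, so the following is only a sketch of the commonly used 1.5 × IQR rule, which may differ from what the full post does. The helper name `iqr_outliers`, the multiplier `k`, the choice of the "inclusive" quantile method, and the sample data are all assumptions.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR].

    Hypothetical helper; k=1.5 is the conventional boxplot whisker rule.
    """
    # quantiles(n=4) returns the three quartile cut points Q1, Q2, Q3
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# Illustrative data with one obviously extreme value
data = [10, 12, 12, 13, 12, 11, 14, 13, 15, 102]
print(iqr_outliers(data))  # [102]
```

Different quartile conventions can shift the fences slightly, so borderline points may be flagged under one method and not another.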

Select columns in PySpark dataframe

Select columns in PySpark dataframe – A Comprehensive Guide to Selecting Columns in different ways in PySpark dataframe

PySpark is the Python API for Apache Spark, a powerful big data processing framework, and it allows you to process large volumes of data using the Python programming language. PySpark’s DataFrame API is a powerful tool for data manipulation and analysis. One of the most common tasks when working with DataFrames is selecting specific columns. In this blog post, we will …

Introduction to PySpark

Introduction to PySpark – Unleashing the Power of Big Data using PySpark

Introduction As we continue to generate massive volumes of data every day, the importance of scalable data processing and analysis tools cannot be overstated. One such powerful tool is Apache Spark, an open-source, distributed computing system that has become synonymous with big data processing. In this blog post, we will introduce you to PySpark, the …

Install opencv python

Install opencv python – A Comprehensive Guide to Installing “OpenCV-Python”

OpenCV (Open Source Computer Vision Library) is an open-source computer vision and machine learning software library. OpenCV-Python is a Python wrapper for the original OpenCV C++ library. Let’s see how to install OpenCV in Python. Introduction OpenCV enables users to perform image and video processing tasks with ease. In this blog post, we will provide …

Install pip mac

install pip mac – How to install pip in MacOS?: A Comprehensive Guide

Pip is a widely used package manager for Python, allowing you to install and manage Python packages easily. In this blog post, we’ll explore various methods to install pip on macOS. I’ll provide clear, reproducible code examples for each method, making it easy for you to get started with pip on your macOS system. Using …

Scrapy vs. Beautiful Soup: Which is better for web scraping?

Web scraping is the technique of extracting data from a specific website or web page. It has wide applications in research and publication, competitor and market studies, and creating data for machine learning models. The extracted data can be stored in any format, be it CSV, TXT, JSON, an API, etc., so that it can …

add Python to PATH – How to add Python to the PATH environment variable in Windows?

1. What is the purpose of adding Python to the PATH environment variable? Adding Python to the PATH environment variable in Windows allows you to run Python commands from any directory within the command prompt. Here are the steps to add Python to the PATH variable: 2. What is the PATH environment variable in Windows? …

An Introduction to AdaBoost

AdaBoost – An Introduction to AdaBoost

AdaBoost is one of the earliest implementations of the boosting algorithm. It forms the base of other boosting algorithms, like gradient boosting and XGBoost. This tutorial will take you through the math behind implementing this algorithm and also a practical example of using the scikit-learn AdaBoost API. Contents: What is boosting? What is AdaBoost? Algorithm …

Numpy.random.randint() in python

Numpy.random.randint() in python

The numpy.random.randint function returns random integers between a low and a high value. The low value is included in the range while the high value is excluded. The output values are drawn from the discrete uniform distribution over that range. random.randint(low, high=None, size=None, dtype=int) Purpose: The numpy random randint function is used for creating a …
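A short sketch of the signature quoted above may help. The seed and the specific ranges below are arbitrary choices for illustration, not values taken from the post.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only to make the run repeatable

# With a single argument, values are drawn from [0, low)
single = np.random.randint(5)

# low is inclusive, high is exclusive: simulates ten rolls of a six-sided die
rolls = np.random.randint(1, 7, size=10)

# size may be a tuple to request a multi-dimensional array
grid = np.random.randint(0, 100, size=(2, 3))

print(single)
print(rolls)
print(grid)
```

Note that the upper bound is written as 7 to allow a result of 6, since `high` itself is never returned.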

np.random.uniform

How to use numpy.random.uniform() in python.

The np.random.uniform() function is used to create an array with random samples from a uniform probability distribution between given low and high values. random.uniform(low=0.0, high=1.0, size=None) Purpose: The numpy random uniform function is used for creating a numpy array with random float values in the interval from low to high. Parameters: low: float or array-like of floats, optional: Lowest …
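The signature quoted above can be exercised with a minimal sketch like the following. The seed and the interval bounds are arbitrary, chosen only for illustration.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only to make the run repeatable

# Defaults: one float drawn from [0.0, 1.0)
x = np.random.uniform()

# Five floats from the interval [-1.0, 1.0)
samples = np.random.uniform(low=-1.0, high=1.0, size=5)

# A 2x3 array of floats from [10, 20)
matrix = np.random.uniform(10, 20, size=(2, 3))

print(x)
print(samples)
print(matrix)
```

When `size` is omitted and the bounds are scalars, a single float is returned rather than an array.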

Course Preview

Machine Learning A-Z™: Hands-On Python & R In Data Science