Numpy Tutorial – Your first numpy guide to build python coding foundations

This is part 1 of the NumPy tutorial, covering the core aspects of data manipulation and analysis with NumPy's ndarrays. NumPy is the most basic yet powerful package for data manipulation and scientific computing in Python.

Written by Selva Prabhakaran | 10 min read

NumPy is the backbone of data science in Python. This tutorial covers arrays, indexing, reshaping, and random numbers — all the basics you need to work with data. By the end, you’ll know how to create, inspect, and work with NumPy arrays like a pro.

Numpy Tutorial Part 1: Introduction to Arrays. Photo by Bryce Canyon.

Also Read:

Numpy Tutorial – Gentle Introduction [Part 1] [This Article]
Numpy – Vital Functions for Data Analysis [Part 2]

Contents

  1. Introduction to numpy
  2. How to create a numpy array?
  3. How to inspect the size and shape of a numpy array?
  4. How to extract specific items from an array?
    4.1 How to reverse the rows and the whole array?
    4.2 How to represent missing values and infinity?
    4.3 How to compute mean, min, max on the ndarray?
  5. How to create a new array from an existing array?
  6. Reshaping and Flattening Multidimensional arrays
    6.1 What is the difference between flatten() and ravel()?
  7. How to create sequences, repetitions, and random numbers?
    7.1 How to create repeating sequences?
    7.2 How to generate random numbers?
    7.3 How to get the unique items and the counts?

1. Introduction to Numpy

NumPy is the most important Python package for working with numbers and data. If you plan to do data analysis or machine learning, you need to learn it well.

Why? Because pandas is built on top of NumPy. Scikit-learn relies on it heavily too. These are the main tools you’ll use for data work and ML projects.

So what does NumPy actually give you?

At its core, NumPy gives you the ndarray — short for n-dimensional array. You can store many items of the same data type in an ndarray. The tools built around this array object make math and data tasks fast and easy.

You might think: “I can already store numbers in a Python list. I can do math with list comprehensions and for-loops. Why do I need NumPy?”

Great question. NumPy arrays have big advantages over lists. Let me show you by first creating a NumPy array.

2. How to create a numpy array?

There are many ways to create a NumPy array. The most common way is to pass a Python list to np.array.

Let’s start with a simple 1D array.

# Create a 1d array from a list
import numpy as np

list1 = [0, 1, 2, 3, 4]
arr1d = np.array(list1)

# Print the array and its type
print(type(arr1d))
print(arr1d)

The key difference between an array and a list? Arrays handle vectorized operations. A Python list does not.

This means when you apply an operation, it runs on every item in the array — not on the array object itself.

For example, try adding 2 to every item in a list:

list1 + 2  # TypeError — you can't do this with a list

That fails. But with a NumPy array, it just works:

import numpy as np

arr1d = np.array([0, 1, 2, 3, 4])

# Add 2 to each element
print(arr1d + 2)
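
For comparison, here is a minimal sketch of what the same addition takes with a plain list. The names list_plus_2 and arr_plus_2 are made up for this illustration.

```python
import numpy as np

list1 = [0, 1, 2, 3, 4]

# Plain Python: visit each item explicitly
list_plus_2 = [x + 2 for x in list1]

# NumPy: the operation is applied to every element at once
arr_plus_2 = np.array(list1) + 2

print(list_plus_2)
print(arr_plus_2)
```

Both give the same numbers. The array version simply lets you skip the loop.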

One thing to note: once you create a NumPy array, you can’t grow its size. You’d have to create a new array. Lists don’t have this limit — you can append freely.

But NumPy has many more strengths. Let’s keep going.

You can also pass a list of lists to create a 2D array (like a matrix):

import numpy as np

# Create a 2d array from a list of lists
list2 = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
arr2d = np.array(list2)
print(arr2d)

You can set the data type with the dtype argument. Common NumPy dtypes include: 'float', 'int', 'bool', 'str', and 'object'.

For tighter memory control, use 'float32', 'float64', 'int8', 'int16', or 'int32'.

import numpy as np

list2 = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]

# Create a float 2d array
arr2d_f = np.array(list2, dtype='float')
print(arr2d_f)

See those decimal points? That tells you it’s a float. You can convert to a different type with astype:

import numpy as np

list2 = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
arr2d_f = np.array(list2, dtype='float')

# Convert to int
print(arr2d_f.astype('int'))

# Chain conversions: float -> int -> str
print(arr2d_f.astype('int').astype('str'))

Here’s an important rule: every item in a NumPy array must share the same data type. Lists don’t have this rule.

If you need to mix types (like numbers and strings), set dtype='object':

import numpy as np

# Boolean array: any nonzero value becomes True
arr2d_b = np.array([1, 0, 10], dtype='bool')
print("Boolean array:", arr2d_b)

# Object array: can hold mixed types
arr1d_obj = np.array([1, 'a'], dtype='object')
print("Object array:", arr1d_obj)

You can always convert an array back to a Python list with tolist():

import numpy as np

arr1d_obj = np.array([1, 'a'], dtype='object')
print(arr1d_obj.tolist())

To sum up, the main differences between NumPy arrays and Python lists:

  1. Arrays support vectorized operations. Lists don’t.
  2. Arrays have a fixed size after creation. You must create a new array to change it.
  3. Every array has exactly one dtype. All items must match that type.
  4. NumPy arrays use much less memory than equivalent Python lists.
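
To get a rough feel for the memory point, the sketch below compares a list of 1000 integers with an int64 array of the same values. Treat the list figure as an estimate: it is built with sys.getsizeof, and the exact numbers depend on your Python version and platform.

```python
import sys
import numpy as np

n = 1000
py_list = list(range(n))
arr = np.arange(n, dtype='int64')

# A list stores references to separate int objects, so count both
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

# nbytes counts only the array's data buffer (8 bytes per int64 item)
arr_bytes = arr.nbytes

print("List bytes (approx):", list_bytes)
print("Array bytes:", arr_bytes)
```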

3. How to inspect the size and shape of a numpy array?

Every array has traits that tell you about its shape and layout. Let me walk you through the key ones.

Take arr2: we built it from a list of lists, so it has 2 axes (rows and columns, like a matrix). A list of lists of lists would give 3 axes, like a cube.

When someone hands you a NumPy array, what do you check first? Here are the five things I always look at:

  • Number of dimensions (ndim)
  • Items in each dimension (shape)
  • Data type (dtype)
  • Total item count (size)
  • A few sample values (through indexing)

import numpy as np

# Create a 2d array with 3 rows and 4 columns
list2 = [[1, 2, 3, 4], [3, 4, 5, 6], [5, 6, 7, 8]]
arr2 = np.array(list2, dtype='float')
print("Array:\n", arr2)

print('Shape:', arr2.shape)
print('Datatype:', arr2.dtype)
print('Size:', arr2.size)
print('Num Dimensions:', arr2.ndim)
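
The same checks work for the 3-axis case mentioned earlier. Here is a small sketch with a list of lists of lists. The name list3 and its values are made up for illustration.

```python
import numpy as np

# A list of lists of lists: 2 blocks, each with 2 rows and 3 columns
list3 = [[[1, 2, 3], [4, 5, 6]],
         [[7, 8, 9], [10, 11, 12]]]
arr3 = np.array(list3)

print('Shape:', arr3.shape)
print('Num Dimensions:', arr3.ndim)
print('Size:', arr3.size)
```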

4. How to extract specific items from an array?

You can pull out parts of an array using indexing, starting at 0 — just like Python lists.

But here’s the difference: NumPy arrays accept one index per dimension, separated by commas. Lists can’t do this.

import numpy as np

list2 = [[1, 2, 3, 4], [3, 4, 5, 6], [5, 6, 7, 8]]
arr2 = np.array(list2, dtype='float')

# Extract the first 2 rows and first 2 columns
print(arr2[:2, :2])

Note: if you try list2[:2, :2] on a regular list, you’ll get an error. Multi-dimensional indexing is a NumPy feature.
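
The one-index-per-dimension rule also covers single items, whole rows, and whole columns. A short sketch using the same array:

```python
import numpy as np

arr2 = np.array([[1, 2, 3, 4], [3, 4, 5, 6], [5, 6, 7, 8]], dtype='float')

# One item: row index 1, column index 2
print(arr2[1, 2])

# A whole row: give only the row index
print(arr2[0])

# A whole column: take every row (:) at column index 1
print(arr2[:, 1])
```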

NumPy also supports boolean indexing. You create a True/False mask, and only the True positions are kept:

import numpy as np

arr2 = np.array([[1, 2, 3, 4], [3, 4, 5, 6], [5, 6, 7, 8]], dtype='float')

# Create a boolean mask: which values are greater than 4?
b = arr2 > 4
print("Boolean mask:\n", b)

# Filter: keep only values where mask is True
print("Filtered values:", arr2[b])

4.1 How to reverse the rows and the whole array?

Reversing works the same way as with lists — use the ::-1 slice. For a full reversal of a 2D array, apply it to both axes:

import numpy as np

arr2 = np.array([[1, 2, 3, 4], [3, 4, 5, 6], [5, 6, 7, 8]], dtype='float')

# Reverse only the rows
print("Rows reversed:\n", arr2[::-1, ])

# Reverse both rows and columns
print("Fully reversed:\n", arr2[::-1, ::-1])

4.2 How to represent missing values and infinity?

Use np.nan for missing values and np.inf for infinity. Let’s insert some into our array and then replace them:

import numpy as np

arr2 = np.array([[1, 2, 3, 4], [3, 4, 5, 6], [5, 6, 7, 8]], dtype='float')

# Insert nan and inf
arr2[1, 1] = np.nan
arr2[1, 2] = np.inf
print("With nan and inf:\n", arr2)

# Replace nan and inf with -1
# Important: don't use arr2 == np.nan (it won't work!)
missing_bool = np.isnan(arr2) | np.isinf(arr2)
arr2[missing_bool] = -1
print("After replacement:\n", arr2)
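
Why doesn't the == comparison work? Because nan is never equal to anything, not even itself. The sketch below shows this on a small made-up array. It also uses np.isfinite, which flags nan, inf and -inf in a single call.

```python
import numpy as np

# nan is defined to be unequal to everything, including itself
print(np.nan == np.nan)   # False

arr = np.array([1.0, np.nan, np.inf, -np.inf, 4.0])

# An == comparison therefore never finds the nan
eq_mask = (arr == np.nan)

# isfinite is False for nan, inf and -inf, so invert it
bad_mask = ~np.isfinite(arr)

print("== mask:      ", eq_mask)
print("isfinite mask:", bad_mask)
```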

4.3 How to compute mean, min, max on the ndarray?

Every ndarray has built-in methods for basic stats. Here’s how to use them on the whole array, and also by row or column.

import numpy as np

arr2 = np.array([[1, 2, 3, 4], [3, -1, -1, 6], [5, 6, 7, 8]], dtype='float')

# Overall stats
print("Mean:", arr2.mean())
print("Max:", arr2.max())
print("Min:", arr2.min())

Need the min for each row or column? Use np.amin with the axis parameter. Axis 0 goes down columns. Axis 1 goes across rows.

import numpy as np

arr2 = np.array([[1, 2, 3, 4], [3, -1, -1, 6], [5, 6, 7, 8]], dtype='float')

print("Column-wise minimum:", np.amin(arr2, axis=0))
print("Row-wise minimum:", np.amin(arr2, axis=1))
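
If the axis numbers feel confusing, it can help to look at the shape of each result. This sketch uses the array's own min method, which accepts the same axis argument. The names col_min and row_min are just for illustration.

```python
import numpy as np

arr2 = np.array([[1, 2, 3, 4], [3, -1, -1, 6], [5, 6, 7, 8]], dtype='float')

# axis=0 collapses the rows: one result per column (4 values)
col_min = arr2.min(axis=0)

# axis=1 collapses the columns: one result per row (3 values)
row_min = arr2.min(axis=1)

print("Column-wise minimum:", col_min, "shape:", col_min.shape)
print("Row-wise minimum:   ", row_min, "shape:", row_min.shape)
```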

What if you want a custom function applied row-wise? That’s where np.apply_along_axis comes in — we’ll cover that in Part 2.

Here’s another useful one: the cumulative sum. Without an axis argument, np.cumsum flattens the array first and returns a 1D running total:

import numpy as np

arr2 = np.array([[1, 2, 3, 4], [3, -1, -1, 6], [5, 6, 7, 8]], dtype='float')
print("Cumulative sum:", np.cumsum(arr2))
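
np.cumsum also takes an axis argument. Here is a sketch on a small made-up array so the running totals are easy to check by hand:

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

# No axis: the array is flattened first
flat_cumsum = np.cumsum(arr)

# axis=0: running total down each column
col_cumsum = np.cumsum(arr, axis=0)

# axis=1: running total across each row
row_cumsum = np.cumsum(arr, axis=1)

print("Flattened:\n", flat_cumsum)
print("Down columns:\n", col_cumsum)
print("Across rows:\n", row_cumsum)
```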

5. How to create a new array from an existing array?

This is a common trap. When you slice an array and save it to a new name, the new name is just a view of the source. It points to the same spot in memory.

Change the view, and the original changes too. Watch:

import numpy as np

arr2 = np.array([[1, 2, 3, 4], [3, -1, -1, 6], [5, 6, 7, 8]], dtype='float')

# This is a VIEW, not a copy
arr2a = arr2[:2, :2]
arr2a[0, 0] = 100  # This changes arr2 as well!
print("Original after view change:\n", arr2)

To avoid this, use .copy(). This creates a completely separate array:

import numpy as np

arr2 = np.array([[100, 2, 3, 4], [3, -1, -1, 6], [5, 6, 7, 8]], dtype='float')

# This is a COPY — changes won't affect arr2
arr2b = arr2[:2, :2].copy()
arr2b[0, 0] = 999
print("Original after copy change:\n", arr2)
print("Copy:\n", arr2b)
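
Not sure whether you are holding a view or a copy? One way to check is np.shares_memory. In this sketch the names view_part and copy_part are made up for illustration.

```python
import numpy as np

arr2 = np.array([[1, 2, 3, 4], [3, -1, -1, 6], [5, 6, 7, 8]], dtype='float')

view_part = arr2[:2, :2]
copy_part = arr2[:2, :2].copy()

# True: the slice reuses arr2's memory
print("Slice shares memory:", np.shares_memory(arr2, view_part))

# False: the copy has its own memory
print("Copy shares memory: ", np.shares_memory(arr2, copy_part))
```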

6. Reshaping and Flattening Multidimensional arrays

Reshaping moves items into a new layout while keeping all the data intact.

Flattening turns any multi-axis array into a flat 1D array. Let me show both.

import numpy as np

arr2 = np.array([[100, 2, 3, 4], [3, -1, -1, 6], [5, 6, 7, 8]], dtype='float')

# Reshape from 3x4 to 4x3
print("Reshaped (4x3):\n", arr2.reshape(4, 3))
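
Two details are worth knowing about reshape. The new shape must hold exactly the same number of items, and you can pass -1 for one dimension to let NumPy work it out. A small sketch on a made-up 12-item array (the flag bad_shape_rejected exists only to record what happened):

```python
import numpy as np

arr = np.arange(12)  # 12 items: 0 to 11

# -1 asks NumPy to infer that dimension from the total size
two_rows = arr.reshape(2, -1)
print(two_rows.shape)  # (2, 6)

# A shape that doesn't fit 12 items raises a ValueError
bad_shape_rejected = False
try:
    arr.reshape(5, 3)
except ValueError as err:
    bad_shape_rejected = True
    print("ValueError:", err)
```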

6.1 What is the difference between flatten() and ravel()?

Both turn a multi-axis array into 1D. But there’s a key difference.

flatten() returns a copy. Changes to the flattened array won’t affect the original.

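ravel() returns a view whenever the memory layout allows it, which is the case for an ordinary array like the one below. Otherwise it falls back to a copy. Changes to a raveled view will affect the source. In return, it uses less memory since it skips the copy step.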

import numpy as np

arr2 = np.array([[100, 2, 3, 4], [3, -1, -1, 6], [5, 6, 7, 8]], dtype='float')

# flatten() returns a copy
b1 = arr2.flatten()
b1[0] = 999
print("After changing flatten result:")
print("Original unchanged:\n", arr2)

# ravel() returns a view
b2 = arr2.ravel()
b2[0] = 101
print("\nAfter changing ravel result:")
print("Original changed:\n", arr2)
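
One caveat worth checking yourself: ravel() can only return a view when the items already sit in memory in the order it needs. The sketch below uses a transposed array as an example where it has to copy instead. The array values are made up for illustration.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

# Ordinary array: ravel() can return a view
r1 = arr.ravel()
print("View of arr:  ", np.shares_memory(arr, r1))

# Transposed array: the items are no longer laid out in row order,
# so ravel() has to copy them
arr_t = arr.T
r2 = arr_t.ravel()
print("View of arr.T:", np.shares_memory(arr_t, r2))
```

If your code relies on changes flowing back to the source, a quick np.shares_memory check like this is a safe habit.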

7. How to create sequences, repetitions and random numbers using numpy?

np.arange creates custom number sequences. It works like Python’s range, but returns an ndarray:

import numpy as np

print("Default (0 to 4):", np.arange(5))
print("0 to 9:", np.arange(0, 10))
print("0 to 9, step 2:", np.arange(0, 10, 2))
print("10 to 1, decreasing:", np.arange(10, 0, -1))

With np.arange, you set the start, stop, and step. But what if you want exactly N numbers between two values?

That’s what np.linspace is for. You tell it how many numbers you want, and it figures out the spacing:

import numpy as np

# 10 evenly spaced numbers between 1 and 50
print(np.linspace(start=1, stop=50, num=10, dtype=int))

Note: forcing dtype=int discards the fractional part of each value, so the spacing isn’t perfectly even.
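
To see the effect of the integer dtype, compare the gaps between neighbours with and without it. This is a small sketch. np.diff returns the difference between consecutive items.

```python
import numpy as np

# Default float output: the gap between neighbours is constant
vals = np.linspace(start=1, stop=50, num=10)
print(vals)
print("Gaps:", np.diff(vals))

# Integer output: the gaps are no longer all the same
ints = np.linspace(start=1, stop=50, num=10, dtype=int)
print(ints)
print("Gaps:", np.diff(ints))
```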

There’s also np.logspace, which spaces values on a log scale. By default it uses base 10, so start=1 means 10^1 and stop=50 means 10^50:

import numpy as np
np.set_printoptions(precision=2)

# 10 values from 10^1 to 10^50
print(np.logspace(start=1, stop=50, num=10, base=10))
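
Another way to think about np.logspace: it raises the base to evenly spaced exponents. The sketch below uses smaller exponents (1 to 3, chosen only to keep the numbers readable) and compares it with 10 raised to a linspace.

```python
import numpy as np

exponents = np.linspace(start=1, stop=3, num=3)          # exponents 1, 2, 3
log_vals = np.logspace(start=1, stop=3, num=3, base=10)  # 10, 100, 1000

print(exponents)
print(log_vals)
print(10.0 ** exponents)  # same values as log_vals
```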

Need arrays filled with zeros or ones? Use np.zeros and np.ones:

import numpy as np

print("Zeros:\n", np.zeros([2, 2]))
print("Ones:\n", np.ones([2, 2]))

7.1 How to create repeating sequences?

Two handy functions here: np.tile repeats the whole array, while np.repeat repeats each element.

import numpy as np

a = [1, 2, 3]

# Tile: repeat the whole array twice
print('Tile:  ', np.tile(a, 2))

# Repeat: repeat each element twice
print('Repeat:', np.repeat(a, 2))

7.2 How to generate random numbers?

NumPy’s random module creates random numbers of any shape. Here are the most common functions:

import numpy as np

# Uniform random numbers between [0, 1)
print("Uniform 2x2:\n", np.random.rand(2, 2))

# Normal distribution (mean=0, variance=1)
print("Normal 2x2:\n", np.random.randn(2, 2))

# Random integers between [0, 10)
print("Random ints 2x2:\n", np.random.randint(0, 10, size=[2, 2]))

# Single random number
print("Single random:", np.random.random())

# Random choices from a list
print("Random vowels:", np.random.choice(['a', 'e', 'i', 'o', 'u'], size=10))

Every time you run these functions, you get different numbers. To get the same random numbers each time, set a seed:

import numpy as np

# Method 1: Using RandomState
rn = np.random.RandomState(100)
print("With RandomState:\n", rn.rand(2, 2))

# Method 2: Using seed
np.random.seed(100)
print("With seed:\n", np.random.rand(2, 2))

Both give the same results. The seed can be any number — just use the same seed each time to get the same output.
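
Newer NumPy versions (1.17 and later) also provide a Generator interface through np.random.default_rng, which the NumPy documentation recommends for new code. Here is a minimal sketch of seeding it. Note that it uses a different algorithm, so the same seed will not reproduce the numbers from RandomState or np.random.seed.

```python
import numpy as np

# Two generators created with the same seed
rng_a = np.random.default_rng(100)
rng_b = np.random.default_rng(100)

sample_a = rng_a.random((2, 2))   # uniform floats in [0, 1)
sample_b = rng_b.random((2, 2))

print(sample_a)
print("Same output:", np.array_equal(sample_a, sample_b))

# Integers in [0, 10) and standard normal draws
print(rng_a.integers(0, 10, size=(2, 2)))
print(rng_a.standard_normal((2, 2)))
```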

7.3 How to get the unique items and the counts?

Use np.unique to find distinct values. Set return_counts=True to see how many times each value appears:

import numpy as np

np.random.seed(100)
arr_rand = np.random.randint(0, 10, size=10)
print("Random array:", arr_rand)

# Get unique values and their counts
uniqs, counts = np.unique(arr_rand, return_counts=True)
print("Unique items:", uniqs)
print("Counts:      ", counts)
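
The two arrays returned by np.unique line up position by position, so you can pair them. The sketch below builds a value-to-count dictionary and picks the most frequent value. It uses a small fixed array (values chosen for illustration) so the result is easy to verify.

```python
import numpy as np

arr = np.array([8, 8, 3, 7, 7, 0, 4, 2, 5, 2])

uniqs, counts = np.unique(arr, return_counts=True)

# Pair each unique value with its count
count_map = dict(zip(uniqs.tolist(), counts.tolist()))
print(count_map)

# argmax returns the first position of the largest count;
# uniqs is sorted, so ties go to the smallest value
most_common = uniqs[np.argmax(counts)]
print("Most common:", most_common)
```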

8. Conclusion

That wraps up Part 1 of the NumPy series. You now know how to create arrays, check their shape, slice and index them, reshape and flatten, and make sequences and random numbers.

Next up: advanced NumPy for data analysis, where we’ll cover the key functions you need for real data work.
