How to Use Numpy Where Function?

The numpy.where() function selects elements of an array based on a condition. The condition can be as simple as a single value comparison or as involved as several comparisons combined with bit-wise operators.
You can also use this function to perform conditional replacements in the input data array.

In this article, you will learn about the possible use cases of the numpy “where” function.

numpy.where

__numpy.where(condition[, x, y])__

__Purpose:__ Return elements picked from x or y, depending on the condition

__Parameters:__

– **condition:** _array_like, bool_ Where True, yield x, otherwise yield y
– **x, y:** _array_like_ Values from which to choose. x, y and condition need to be broadcastable to some shape.

**Returns**

_ndarray_

An array with elements from x where condition is True, and elements from y elsewhere.

# Import Packages
import numpy as np
import warnings

warnings.filterwarnings("ignore")

1. Numpy.where function

The numpy.where() function takes a condition as its only required argument. When called with the condition alone, it returns the indices of the elements that satisfy it, packed in a tuple of index arrays. The condition is simply an expression that applies comparison operators to the input array. Let’s take an example to understand this.

Step 1: Create a numpy array

num_array = np.array([1,2,3,4,5,6,7,8,3,10])

Step 2: Use np.where() function

# Get the indices of the values less than or equal to 6

np.where(num_array <= 6)

(array([0, 1, 2, 3, 4, 5, 8]),)

Applying the condition returns a tuple containing an array of the indices of the values that are less than or equal to 6, not the values themselves.
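To make the return value more concrete, here is a minimal sketch that builds the boolean condition first and then compares np.where() with np.nonzero(), which the NumPy documentation describes as equivalent when only the condition is passed. The variable names are illustrative only.

```python
import numpy as np

num_array = np.array([1, 2, 3, 4, 5, 6, 7, 8, 3, 10])

# The comparison itself produces a boolean array, one entry per element
condition = num_array <= 6

# With a single argument, np.where() returns a tuple of index arrays
indices_where = np.where(condition)

# np.nonzero() on the same boolean array gives the same indices
indices_nonzero = np.nonzero(condition)

print(condition)
print(indices_where[0])  # the single index array inside the tuple
```

For a 1-D input the tuple has exactly one element, which is why the output above ends with a trailing comma.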

2. Numpy.where function for multi-dimensional data

The np.where() function is not limited to 1-D arrays. It can be applied to multi-dimensional arrays as well, in which case the condition is evaluated element-wise across every dimension. The returned tuple then holds one index array per dimension of the input array. Let’s understand it with some commonly used dimensions.

A. Applying np.where on 2-D numpy array

When the np.where() function is applied to a 2-D numpy array, a tuple containing two arrays is returned. The first array holds the row indices and the second holds the column indices of the matching elements. The position of each match is obtained by taking one value from each array, in order. See the example below.

Step 1: Create a 2-D numpy array

a_2d_array = np.array([[0, -1],
                       [1, 3], 
                      ])

Step 2: Use np.where() function to get the position of element -1

np.where(a_2d_array == -1)

(array([0]), array([1]))

Here, a tuple containing two arrays is returned: the first gives the row index and the second gives the column index. The position of -1 is therefore (0, 1).
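One possible way to read positions off the returned tuple is to unpack it into row and column arrays and pair them up with zip(). This is only a sketch; the names rows, cols, and positions are chosen for illustration.

```python
import numpy as np

a_2d_array = np.array([[0, -1],
                       [1, 3]])

# Unpack the tuple: one index array per dimension
rows, cols = np.where(a_2d_array == -1)

# Pair the i-th row index with the i-th column index
positions = [(int(r), int(c)) for r, c in zip(rows, cols)]

print(positions)
```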

B. Applying np.where on 3-D numpy array

When the np.where() function is applied to a 3-D numpy array, a tuple containing three arrays is returned. These arrays correspond to respective indices of an element in all three dimensions. An element position is determined by taking one value from each array in order. See the example below.

Step 1: Create a 3-D numpy array

a_3d_array = np.array([[[1, 3], [1, 4]],
                       [[5, -1], [3, 1]]
                      ])

Step 2: Use np.where() function to get all positions of element 1

np.where(a_3d_array == 1)

(array([0, 0, 1]), array([0, 1, 1]), array([0, 0, 1]))

Here, (0, 0, 0), (0, 1, 0), and (1, 1, 1) are the respective positions of the element 1.
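If you prefer one row per matching position instead of one array per dimension, np.argwhere() returns the same information in that layout. The sketch below assumes the same 3-D array as above and checks that it agrees with the transposed output of np.where().

```python
import numpy as np

a_3d_array = np.array([[[1, 3], [1, 4]],
                       [[5, -1], [3, 1]]])

# One index array per dimension
index_arrays = np.where(a_3d_array == 1)

# One row per matching element, one column per dimension
coords = np.argwhere(a_3d_array == 1)

print(coords)
```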

3. Getting the actual values using Indices

The np.where() function returns the indices of the values that satisfy a condition. To get the actual values, index the numpy array with these indices. Indexing this way always returns a 1-D array of the matching values, whatever the number of dimensions of the input array.

Step 1: Create a 1-D and 3-D numpy array

a_1d_array = np.array([1,2,-1,4,-5])

a_3d_array = np.array([[[1, 3], [1, -3]],
                       [[5, -1], [3, 1]]
                      ])

Step 2: Apply np.where() function to get negative values and store the result

indices_1d = np.where(a_1d_array < 0)

indices_3d = np.where(a_3d_array < 0)

Step 3: Index the numpy array using the stored indices

print("Negative values from 1-d array", a_1d_array[indices_1d])
print("Negative values from 3-d array", a_3d_array[indices_3d])

Negative values from 1-d array [-1 -5]
Negative values from 3-d array [-3 -1]

Thus, indexing the original numpy array with the result of np.where() returns the values that satisfy the condition as a 1-D array.
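When only the values are needed, the boolean condition can also be used directly as a mask, without calling np.where() at all. The following sketch compares the two approaches on a 3-D array; it is an alternative, not a replacement for cases where the indices themselves are required.

```python
import numpy as np

a_3d_array = np.array([[[1, 3], [1, -3]],
                       [[5, -1], [3, 1]]])

# Route 1: compute the indices first, then index with them
via_indices = a_3d_array[np.where(a_3d_array < 0)]

# Route 2: index with the boolean mask directly
via_mask = a_3d_array[a_3d_array < 0]

print(via_indices)
print(via_mask)
```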

4. Applying multiple conditions using numpy.where function

The np.where() function can apply multiple conditions to the array elements. Each condition must be enclosed in parentheses, and the conditions are combined using the element-wise operators & (and), | (or), and ~ (not). Let’s look at an example.

Step 1: Create a numpy array

num_array = np.array([1,2,3,4,5,6,7,8,9,10])

Step 2: Apply a combined condition in np.where() function

The condition below selects the elements that are greater than 2 and less than 5, and additionally includes 8 explicitly. Because & binds more tightly than |, the range check is evaluated first.

num_array[np.where((num_array > 2) & (num_array < 5) | (num_array == 8))]

array([3, 4, 8])
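The sketch below illustrates two points about combined conditions: the same mask can be written with np.logical_and() and np.logical_or(), and leaving out the parentheses around each comparison fails, because & is evaluated before the comparison operators. The variable names are illustrative.

```python
import numpy as np

num_array = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Operator form: each comparison is wrapped in parentheses
with_operators = (num_array > 2) & (num_array < 5) | (num_array == 8)

# Function form: the grouping is explicit in the call nesting
with_functions = np.logical_or(
    np.logical_and(num_array > 2, num_array < 5),
    num_array == 8,
)

selected = num_array[np.where(with_operators)]
print(selected)

# Without parentheses, Python evaluates `2 & num_array` first and then
# chains the comparisons, which raises a ValueError for arrays
error_raised = False
try:
    num_array > 2 & num_array < 5
except ValueError:
    error_raised = True

print(error_raised)
```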

5. Conditional replacement using numpy where function

The np.where() function is not restricted to returning indices of the elements. It can also perform replacements. The optional X and Y arguments, which must be supplied together, decide which values are used in place of the elements of the input array.

Where the condition is True, the value is taken from X; where it is False, the value is taken from Y. In this mode a new array with the chosen values is returned, so no indices are involved and the input array is left unchanged.

These replacements can be performed in three ways: Array broadcast, Specific value broadcast, and Processing elements for replacement. Let’s discuss them one by one.

Array broadcast

With this method, the replacement values passed as the X and Y arguments are arrays. In the example below each replacement array has the same size as the input array; more generally, it must be broadcastable to the shape of the condition. You can think of it as merging two arrays based on a condition.

The example below replaces even values with “Even” and odd values with “Odd”.

Step 1: Create a numpy array and the replacement arrays

num_array = np.array([1,5,4,6,13, 2])
replace_even = ['Even', 'Even', 'Even', 'Even', 'Even', 'Even']
replace_odd = ['Odd', 'Odd', 'Odd', 'Odd', 'Odd', 'Odd']

Step 2: Use np.where() function to make replacements

np.where(num_array%2 == 0, replace_even, replace_odd)

array(['Odd', 'Odd', 'Even', 'Even', 'Odd', 'Even'], dtype='<U4')

Specific value broadcast

One drawback of array broadcasting is that you need to keep track of the array size. If the input array size changes, the replacement arrays must change in the same way. With specific value broadcasting, you pass single replacement values directly as the X and Y arguments, and numpy broadcasts them to every element.

Let’s revisit the array broadcasting example and this time, apply specific value broadcast.

# using the same number array created in array broadcast

np.where(num_array%2 == 0, 'Even', 'Odd')

array(['Odd', 'Odd', 'Even', 'Even', 'Odd', 'Even'], dtype='<U4')
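A single value and an array can also be mixed in the same call. As a minimal sketch, the example below replaces negative numbers with 0 and keeps every other element as it is; the array values and the name cleaned are made up for illustration.

```python
import numpy as np

values = np.array([3, -1, 0, -7, 5])

# X is a single value broadcast to every position,
# Y is the original array, so non-negative elements are kept
cleaned = np.where(values < 0, 0, values)

print(cleaned)
print(values)  # the input array is not modified
```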

Processing elements for replacement

This is an extension of specific value broadcast. Sometimes the elements of the array need to be transformed before they are used as replacements. This can be achieved by passing expressions as the X and Y arguments; the computed values are then used as the replacements.

See the example below, where even numbers are replaced with their squares and odd numbers with their cubes.

Step 1: Create a numpy array

num_array = np.array([1,2,3,4,5,6,7,8,9,10])

Step 2: Use np.where() function

np.where(num_array%2 == 0, num_array**2, num_array**3)

array([  1,   4,  27,  16, 125,  36, 343,  64, 729, 100])
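One detail worth keeping in mind is that both expressions are computed for the whole array before np.where() chooses between them, so an expression that is invalid for some elements still runs on those elements. The sketch below shows this with a division by zero and, as one possible alternative, uses the where and out parameters of np.divide() to skip those positions. The arrays are illustrative.

```python
import numpy as np

numerator = np.array([1.0, 2.0, 3.0])
denominator = np.array([2.0, 0.0, 4.0])

# The division runs for every element, including the zero denominator,
# so the warning is silenced locally with np.errstate
with np.errstate(divide="ignore", invalid="ignore"):
    ratio_where = np.where(denominator != 0, numerator / denominator, 0.0)

# Alternative: only divide where the denominator is non-zero;
# positions that are skipped keep the value from `out`
ratio_safe = np.divide(
    numerator,
    denominator,
    out=np.zeros_like(numerator),
    where=denominator != 0,
)

print(ratio_where)
print(ratio_safe)
```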

6. Numpy.where function without condition expression

Conditional expressions passed to the np.where() function evaluate to boolean arrays. What if these boolean values are passed explicitly as the condition, in the form of a boolean list? The np.where() function still returns results, provided the condition and the X and Y arguments can be broadcast to a common shape, for example when all three have the same size.

np.where([True, False, False, True, True], [1,2,3,4,5], [6,7,8,9,10])

array([1, 7, 8, 4, 5])
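The three arguments do not have to share the same shape, as long as they can be broadcast together. The following sketch, with made-up values, combines a 2x2 condition with a row-shaped X and a column-shaped Y.

```python
import numpy as np

condition = np.array([[True, False],
                      [False, True]])   # shape (2, 2)
x = np.array([10, 20])                  # shape (2,), broadcast across rows
y = np.array([[1], [2]])                # shape (2, 1), broadcast across columns

result = np.where(condition, x, y)

print(result)
```

After broadcasting, X behaves like [[10, 20], [10, 20]] and Y like [[1, 1], [2, 2]], and the condition picks between them position by position.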

7. Practical Tips

Binarization

Binarization is the process of transforming data features into binary values of 1s and 0s. It is useful in image processing, where a grayscale image is converted into a black and white image. You can use the np.where() function to binarize array data based on a threshold value.

In the example below, a 2-D array has been binarized with a threshold of 0.5. A value greater than 0.5 will be marked as 1 else 0.

Step 1: Create a 2-D numpy array

a_2d_array = np.array([[0.2, 0.8, -0.6],
                       [-0.93, 0.34, 0.45], 
                      ])

Step 2: Use np.where() function to binarize the array at 0.5 threshold

np.where(a_2d_array > 0.5, 1, 0)

array([[0, 1, 0],
       [0, 0, 0]])
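For the specific case of 1s and 0s, casting the boolean mask to integers gives the same result. The sketch below compares the two approaches; np.where() remains the more general option when the two output values are something other than 1 and 0.

```python
import numpy as np

a_2d_array = np.array([[0.2, 0.8, -0.6],
                       [-0.93, 0.34, 0.45]])

# Binarize with np.where()
binary_where = np.where(a_2d_array > 0.5, 1, 0)

# Binarize by casting the boolean mask to integers
binary_cast = (a_2d_array > 0.5).astype(int)

print(binary_where)
print(binary_cast)
```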

8. Test your knowledge

Q1: np.where() function can be used by passing any two arguments among “condition”, “X” or “Y”. True or False?

Answer: False. The np.where() function requires the condition argument. The X and Y arguments are optional but must be passed together, along with the condition. Therefore, np.where() can be called with either 1 or 3 arguments, but not 2.
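A quick way to check this yourself is to call np.where() with the condition and only one of the two value arguments. In the sketch below the call is wrapped in try/except so the error message can be printed; the variable names are illustrative.

```python
import numpy as np

condition = np.array([True, False, True])

# One argument works and returns a tuple of index arrays
indices = np.where(condition)

# Two arguments are rejected by NumPy
two_argument_error = None
try:
    np.where(condition, [1, 2, 3])
except ValueError as exc:
    two_argument_error = str(exc)

print(indices)
print(two_argument_error)
```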

Q2: The array size for X and Y arguments is variable and independent of input array size. True or False?

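Answer: False. The X and Y arguments must be broadcastable to a common shape with the condition. They can have the same size as the input array or be single values, but an arbitrary, incompatible size raises an error.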

Q3: How can multiple conditions be applied in np.where() function?

Answer: Each condition needs to be enclosed in parentheses, and the conditions are combined using the element-wise operators & (and), | (or), and ~ (not).

Q4: Answer the following questions using the given dataset.

# Importing Packages
import numpy as np

a_3d_array = np.array([[[4, -3], [0, 3]],
                       [[9, -1], [-2, 1]]
                      ])

Q4.1: Write the logic to obtain indices of all the negative numbers and zeros from the 3-D array

Answer:

np.where(a_3d_array <= 0)

(array([0, 0, 1, 1]), array([0, 1, 0, 1]), array([1, 0, 1, 0]))

Q4.2: Binarize the 3-D array at the threshold of 1, i.e., values greater than 1 should be 1, else 0

Answer:

a_3d_array

array([[[ 4, -3],
        [ 0,  3]],

       [[ 9, -1],
        [-2,  1]]])

np.where(a_3d_array > 1, 1, 0)

array([[[1, 0],
        [0, 1]],

       [[1, 0],
        [0, 0]]])

Q4.3: Replace all negative values with the text “neg” and positive values as “pos”. (Assume zero to be positive category)

Answer:

np.where(a_3d_array < 0, 'neg', 'pos')

array([[['pos', 'neg'],
        ['pos', 'pos']],

       [['pos', 'neg'],
        ['neg', 'pos']]], dtype='<U3')

