Pandas iloc – How to select rows using index in DataFrames?


Pandas iloc is an indexer for integer-position-based selection, used to select specific rows and columns from pandas DataFrames and Series. It is accessed as an attribute and used with square brackets, for example df.iloc[0], rather than called with parentheses like a function.

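The iloc indexer works with integer positions rather than index labels. These positions can be passed in different ways: as a single integer, a list of integers, a slice, or a boolean list that marks which positions to keep.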
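Because iloc selects by position, it ignores whatever labels the index carries. The sketch below illustrates this on a small made-up DataFrame (the name toy and its values are assumptions for illustration, not part of the dataset used in this article).

```python
import pandas as pd

# Hypothetical toy data; the index labels are deliberately not 0, 1, 2.
toy = pd.DataFrame(
    {"name": ["A", "B", "C"], "sales": [10.0, 20.0, 30.0]},
    index=[100, 200, 300],
)

first_by_position = toy.iloc[0]   # first row, whatever its label is
row_by_label = toy.loc[100]       # row whose index label is 100

print(first_by_position["name"])               # A
print(first_by_position.equals(row_by_label))  # True
```

Here position 0 and label 100 happen to refer to the same row, which is why both selections match.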

In this article, you will learn different ways of subsetting pandas DataFrames and Series using iloc.

Reading the dataset

You can download the data here

# Load and read the dataset as a DataFrame
import pandas as pd

df = pd.read_csv('D:/PERSONAL/DATASETS/vgsales.csv')
df

Subsetting DataFrames with Pandas iloc

Here are some ways to subset a DataFrame using iloc.

1. Using a single integer value in Pandas iloc

You can pass a single integer value as the row index to select a single row across all the columns from the dataframe.

Example 1

# Subset a single row of the DataFrame
print(df.iloc[655])

By passing both a row position and a column position to iloc, you can also view a single data point.

Example 2

# Display the value at row index 203 and column index 5
df.iloc[203, 5]
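The two forms above return different kinds of objects. As a minimal sketch on a made-up DataFrame (the name games and its columns are assumptions chosen for illustration), a single integer gives back a Series, while a row and column pair gives back one value.

```python
import pandas as pd

# Hypothetical stand-in for the article's dataset.
games = pd.DataFrame({
    "Name": ["Alpha", "Bravo", "Charlie", "Delta"],
    "Year": [2001, 2005, 2010, 2015],
    "Sales": [1.5, 2.5, 3.5, 4.5],
})

row = games.iloc[2]        # one row, returned as a Series keyed by column name
value = games.iloc[2, 0]   # one cell: row position 2, column position 0

print(type(row).__name__)  # Series
print(row["Year"])         # 2010
print(value)               # Charlie
```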

2. Using a list of integer values

Using a list of integer values allows you to select specific rows and columns from the DataFrame, which may or may not be contiguous.

The first argument selects rows by position and the second argument selects columns by position. Each argument is a list of integer positions.

Example 1

# subset only rows
df.iloc[[2, 657, 658, 3445]]

Example 2

# subset both rows and columns
df.iloc[[2, 657, 658, 3445], [2, 3, 4]]
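A list always produces a DataFrame, even when it holds a single position, and the rows come back in the order given in the list. The following sketch uses a small made-up DataFrame (games is a hypothetical name) to show both points.

```python
import pandas as pd

# Hypothetical stand-in for the article's dataset.
games = pd.DataFrame({
    "Name": ["Alpha", "Bravo", "Charlie", "Delta"],
    "Year": [2001, 2005, 2010, 2015],
    "Sales": [1.5, 2.5, 3.5, 4.5],
})

subset = games.iloc[[3, 0], [0, 2]]   # rows 3 and 0 (in that order), columns 0 and 2
one_row_frame = games.iloc[[1]]       # a one-element list still returns a DataFrame

print(subset.shape)                   # (2, 2)
print(list(subset["Name"]))           # ['Delta', 'Alpha']
print(type(one_row_frame).__name__)   # DataFrame
```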

3. Slicing

With slices, you can subset a run of contiguous rows as well as columns from the DataFrame.

The row positions and column positions are passed to iloc as two separate slices, each written as start:stop. As with ordinary Python slicing, the start position is included and the stop position is excluded.

Example 1

# subset the first hundred rows
df.iloc[:100]

Example 2

# Subset the rows at positions 144 through 256 and the third, fourth
# and fifth columns (positions 2, 3 and 4) of the DataFrame
df.iloc[144:257, 2:5]
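The stop position of a slice is excluded, which is why 144:257 ends at row 256 and 2:5 ends at column position 4. A minimal sketch on a made-up DataFrame (frame and its column names are assumptions for illustration) makes the boundaries easy to check.

```python
import pandas as pd

# Hypothetical 10-row frame whose values make positions easy to read.
frame = pd.DataFrame({
    "a": range(0, 10),
    "b": range(10, 20),
    "c": range(20, 30),
    "d": range(30, 40),
})

block = frame.iloc[2:5, 1:3]   # rows 2, 3, 4 and columns at positions 1, 2
tail = frame.iloc[8:100]       # a slice running past the end is clipped
every_other = frame.iloc[::2]  # a step works as in plain Python slicing

print(block.shape)             # (3, 2)
print(list(block.columns))     # ['b', 'c']
print(len(tail))               # 2
print(list(every_other["a"]))  # [0, 2, 4, 6, 8]
```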

4. Using Boolean values

You can also use boolean values for subsetting a DataFrame. Pass them as a list with one value for every row (or every column) along the axis being subset.

  1. To display a particular row or column, pass True at its position.
  2. To leave out a row or column, pass False at its position.

Example

# Pass one boolean per column: True for the columns to display,
# False for the columns to leave out.
df.iloc[:, [True, False, False, False, False,
            True, False, False, False, False, False]]

The colon (:) passed as the row argument indicates that all the rows of the DataFrame are to be displayed.

Therefore, the line df.iloc[:,[True,False,False,False,False,True,False,False,False,False,False]] tells Python to display all the rows but only the first and the sixth columns of the DataFrame.
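Typing a long boolean list by hand is error-prone, so the mask can also be built in code. The sketch below is one possible approach on a made-up DataFrame (frame, wanted and the column names are assumptions for illustration). A row mask derived from a condition is converted to a NumPy array first, because iloc does not accept a boolean Series as a mask.

```python
import pandas as pd

# Hypothetical stand-in for the article's dataset.
frame = pd.DataFrame({
    "Rank": [1, 2, 3],
    "Name": ["Alpha", "Bravo", "Charlie"],
    "Year": [2001, 2005, 2010],
    "Sales": [1.5, 2.5, 3.5],
})

# Column mask: one boolean per column, True where the column is wanted.
wanted = {"Rank", "Sales"}
col_mask = [col in wanted for col in frame.columns]
cols = frame.iloc[:, col_mask]

# Row mask: one boolean per row, converted to a plain NumPy array for iloc.
row_mask = (frame["Sales"] > 2.0).to_numpy()
rows = frame.iloc[row_mask]

print(col_mask)               # [True, False, False, True]
print(list(cols.columns))     # ['Rank', 'Sales']
print(list(rows["Name"]))     # ['Bravo', 'Charlie']
```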


Subsetting pandas series with Pandas iloc

The iloc indexer can also be used to subset a pandas Series. The same selection styles shown above for DataFrames also work on a Series.

If the pandas Series is a column of a DataFrame,

  1. first select the column, and
  2. then subset the values of that column using iloc.

Since a pandas series is a one-dimensional data structure, it can be subsetted only along the rows. Therefore, it only accepts the row indices argument.
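As a minimal sketch of the two steps, the example below first selects a column and then applies iloc to the resulting Series. The DataFrame games and its columns are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the article's dataset.
games = pd.DataFrame({
    "Name": ["Alpha", "Bravo", "Charlie", "Delta"],
    "Year": [2001, 2005, 2010, 2015],
})

year = games["Year"]        # step 1: select the column (a Series)
second = year.iloc[1]       # step 2: a single position returns one value
first_two = year.iloc[:2]   # a slice returns a shorter Series
picked = year.iloc[[0, 3]]  # a list returns the chosen positions

print(second)               # 2005
print(list(first_two))      # [2001, 2005]
print(list(picked))         # [2001, 2015]
```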

Creating a pandas series

data = pd.Series(['Bitcoin', 'Ethereum', 'Litecoin', 'Cardano', 'Polkadot'])

To learn more about creating pandas Series, click here.

1. Using a single integer value

Here, you can pass a single integer value to view a particular value of the series.

Example

# Pass a single integer value
data.iloc[3]

2. Using a list of integer values

Passing a list of values to the series allows you to view a number of values of the series which may or may not be contiguous.

Example

# Pass a list of integer values
data.iloc[[1, 2, 4]]

3. Slicing

By using slicing, you can view a number of contiguous values of a series at once.

Example

# Pass a slice object
data.iloc[:3]

4. Using Boolean values

You can also use boolean values to display the values of a series.

The syntax is similar to the one used for subsetting a DataFrame. Pass a list of boolean values, one for each element of the Series.

  1. To display the value at a particular position, pass True.
  2. To leave out the value at a position, pass False.

Example

data.iloc[[True, False, True, True, False]]
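The boolean list does not have to be typed out. As a sketch of one alternative, the mask below is computed from a condition on the same Series and converted to a NumPy array before being handed to iloc. The condition itself (names longer than seven characters) is an arbitrary choice for illustration.

```python
import pandas as pd

data = pd.Series(['Bitcoin', 'Ethereum', 'Litecoin', 'Cardano', 'Polkadot'])

# One boolean per element: True where the name has more than 7 characters.
mask = (data.str.len() > 7).to_numpy()
long_names = data.iloc[mask]

print(mask.tolist())      # [False, True, True, False, True]
print(list(long_names))   # ['Ethereum', 'Litecoin', 'Polkadot']
```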

Practical Tips for Pandas iloc

  1. When subsetting a DataFrame with lists of integers or slices, column positions cannot be passed on their own: a row argument must come first, even if it is only a colon (:) for all rows. Subsetting rows alone is allowed, because the column argument can be omitted.
  2. Avoid hand-writing boolean lists for large DataFrames or Series, because a True or False value is required for every position along the axis.
  3. Use the value -1 as the position for subsetting the last row or the last column.
# Display all the values of the last column,
# down the rows
df.iloc[:, -1]
# Display all the values of the last row
# across all the columns
df.iloc[-1, :]
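The tips above can be sketched on a small made-up DataFrame (frame and its columns are assumptions for illustration). The example also shows that a single integer position outside the valid range raises an IndexError.

```python
import pandas as pd

# Hypothetical 4-row, 3-column frame.
frame = pd.DataFrame({
    "a": [1, 2, 3, 4],
    "b": [5, 6, 7, 8],
    "c": [9, 10, 11, 12],
})

rows_only = frame.iloc[[0, 1]]        # rows alone: no column argument needed
cols_only = frame.iloc[:, [0, -1]]    # columns alone: ':' is required for the rows
last_two_rows = frame.iloc[-2:]       # negative positions count from the end

print(list(cols_only.columns))        # ['a', 'c']
print(list(last_two_rows["a"]))       # [3, 4]

# A single integer beyond the last position is an error.
try:
    frame.iloc[10]
    out_of_bounds = False
except IndexError:
    out_of_bounds = True
print(out_of_bounds)                  # True
```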

Conclusion

In this article, you have learnt how to subset pandas DataFrames and Series using:

  1. Single integer values
  2. Lists of integer values
  3. Slices
  4. Boolean values


Test Your Knowledge for Pandas iloc

Q1: When subsetting a DataFrame with boolean values, you only need to pass True at the positions to be displayed. True or False?

Answer

False. You need to pass True at every position to be displayed and False at every other position, so that the list has one value for each row or column along that axis.

Q2: You have a DataFrame stored in a variable df. Write the code to subset all the rows and the first four columns of the DataFrame using the slicing method.

Answer

df.iloc[:,:4]

Q3: Write the code to view the second, fourth and fifth row and the seventh and ninth column of the DataFrame stored in the variable df using the iloc method.

Answer

df.iloc[[1,3,4],[6,8]]

Q4: Write the code to view the data point at the row index 234 and the column index 3 of a DataFrame stored in the variable df.

Answer

df.iloc[234,3]

This article was contributed by Shreyansh and edited by Leena.
