Pandas iloc is an integer-position-based indexer for selecting rows and columns from pandas DataFrames and Series. It is used with square brackets, as in df.iloc[rows, columns].
iloc selects by integer position, not by label. The positions can be passed in different ways: as a single integer, a list of integers, a slice object, or a list of boolean values.
In this article, you will learn the different ways of subsetting pandas DataFrames and Series using iloc.
Reading the dataset
You can download the data here
import pandas as pd

# Load and read the dataset as a DataFrame
df = pd.read_csv('D:/PERSONAL/DATASETS/vgsales.csv')
df
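If the CSV file is not at hand, a small stand-in DataFrame is enough to follow the examples. The sketch below uses hypothetical column names and invented rows, so it is not the real vgsales data; only the pd.read_csv call mirrors the article.

```python
import io

import pandas as pd

# Hypothetical stand-in data: column names and rows are invented for illustration.
csv_text = """Rank,Name,Platform,Year,Genre,Publisher
1,Game A,PC,2001,Puzzle,Studio X
2,Game B,Wii,2006,Sports,Studio Y
3,Game C,DS,2009,Racing,Studio X
"""

# read_csv accepts a file path or a file-like object, so StringIO works here.
sample_df = pd.read_csv(io.StringIO(csv_text))

print(sample_df.shape)  # (number of rows, number of columns)
print(sample_df.head())
```

Any iloc call in this article can be tried on sample_df, provided the positions stay within its three rows and six columns.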
Subsetting DataFrames with Pandas iloc
Here are some ways to subset a DataFrame using iloc.
1. Using a single integer value in Pandas iloc
You can pass a single integer value as the row index to select a single row across all the columns from the dataframe.
# Subset a single row of the DataFrame (row position 0 is only an illustrative choice)
print(df.iloc[0])
By specifying both the row and column positions, you can also view a specific data point.
# Display the value at row index 203 and column index 5
df.iloc[203, 5]
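The shape of the result depends on how the integer is passed. As a minimal sketch on a hypothetical toy frame (not the article's dataset), a bare integer returns the row as a Series, a one-element list keeps it as a DataFrame, and a row and column pair returns a single value:

```python
import pandas as pd

# Hypothetical toy frame, not the article's dataset.
toy = pd.DataFrame({"name": ["a", "b", "c"], "score": [10, 20, 30]})

row = toy.iloc[1]          # single integer: the row comes back as a Series
row_frame = toy.iloc[[1]]  # one-element list: a one-row DataFrame
value = toy.iloc[1, 1]     # row and column integers: a single value

print(type(row).__name__)
print(type(row_frame).__name__)
print(value)
```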
2. Using a list of integer values
Using a list of integer values allows you to select specific rows and columns from the DataFrame, which may or may not be contiguous.
The first argument subsets the rows and the second argument subsets the columns. Each argument is a list of integer positions.
# Subset only rows
df.iloc[[2, 657, 658, 3445]]
# Subset both rows and columns
df.iloc[[2, 657, 658, 3445], [2, 3, 4]]
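Two details are easy to miss with lists: iloc works on positions rather than index labels, and the result follows the order of the list. The sketch below uses a hypothetical toy frame with a non-default index to make both visible.

```python
import pandas as pd

# Hypothetical toy frame with a non-default index to separate labels from positions.
toy = pd.DataFrame(
    {"a": [1, 2, 3, 4], "b": [5, 6, 7, 8], "c": [9, 10, 11, 12]},
    index=[100, 200, 300, 400],
)

# Row positions 3 and 0 (labels 400 and 100) and column positions 2 and 0,
# returned in the order they were requested.
picked = toy.iloc[[3, 0], [2, 0]]

print(picked)
```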
3. Using slice objects
With slice objects, you can subset a range of contiguous rows as well as columns from the DataFrame.
The row positions and column positions are passed as separate slice objects to iloc. As with Python slices, the start position is included and the stop position is excluded.
# Subset the first hundred rows
df.iloc[:100]
# Subset the rows at positions 144 to 256 and the third, fourth and fifth columns (positions 2, 3 and 4)
df.iloc[144:257, 2:5]
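Slices in iloc follow ordinary Python slice rules. As a sketch on a hypothetical ten-row toy frame, the stop value is excluded, a step can be given, and a slice that runs past the end is simply truncated:

```python
import pandas as pd

# Hypothetical toy frame with ten rows and three columns.
toy = pd.DataFrame({"a": range(10), "b": range(10, 20), "c": range(20, 30)})

first_three = toy.iloc[:3]       # positions 0, 1, 2 (the stop value is excluded)
every_other = toy.iloc[::2, 1:]  # every second row, columns from position 1 onward
past_the_end = toy.iloc[8:50]    # a slice that runs past the end is truncated

print(len(first_three), every_other.shape, len(past_the_end))
```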
4. Using Boolean values
You can also use boolean values for subsetting a DataFrame. Pass them as a list with one value for every row or every column.
- For displaying a particular row or column, pass the boolean value True.
- For hiding a row or column, pass the boolean value False.
# Pass one Boolean value per column (True for the columns to be displayed, False for the rest)
df.iloc[:, [True, False, False, False, False, True, False, False, False, False, False]]
The colon (:) passed as the row argument indicates that all the rows of the DataFrame are to be displayed.
Therefore, the line
df.iloc[:, [True, False, False, False, False, True, False, False, False, False, False]]
tells pandas to display all the rows but only the first and the sixth columns of the DataFrame.
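Typing one boolean per column quickly gets tedious, so the mask can also be built programmatically. The sketch below is one possible approach, using a hypothetical toy frame with invented column names and Index.isin to produce the boolean array.

```python
import pandas as pd

# Hypothetical toy frame; the column names are invented for illustration.
toy = pd.DataFrame(
    {"rank": [1, 2], "name": ["a", "b"], "year": [2001, 2006], "sales": [1.5, 2.5]}
)

# Index.isin returns a NumPy boolean array with one entry per column.
column_mask = toy.columns.isin(["rank", "sales"])

# All rows, and only the columns where the mask is True.
subset = toy.iloc[:, column_mask]

print(column_mask.tolist())
print(subset.columns.tolist())
```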
Subsetting pandas series with Pandas iloc
The iloc method can also be used to subset a pandas Series. All the methods which can be applied on a pandas dataframe are also applicable on pandas Series.
If the pandas Series is a column of a DataFrame:
- first, the column is selected from the DataFrame, and
- then the values of that column are subsetted using iloc.
Since a pandas series is a one-dimensional data structure, it can be subsetted only along the rows. Therefore, it only accepts the row indices argument.
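The two steps described above can be sketched as follows, on a hypothetical toy frame whose column names are invented for illustration:

```python
import pandas as pd

# Hypothetical toy frame; column names are invented for illustration.
toy = pd.DataFrame({"coin": ["Bitcoin", "Ethereum", "Litecoin"], "position": [1, 2, 3]})

coins = toy["coin"]         # step 1: select the column, which is a Series
first_two = coins.iloc[:2]  # step 2: subset its values by position

print(type(coins).__name__)
print(first_two.tolist())
```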
Creating a pandas series
data = pd.Series(['Bitcoin', 'Ethereum', 'Litecoin', 'Cardano', 'Polkadot'])
To learn more about creating pandas Series, click here.
1. Using a single integer value
Here, you can pass a single integer value to view a particular value of the series.
# Pass a single integer value (position 2 is only an illustrative choice)
data.iloc[2]
2. Using a list of integer values
Passing a list of values to the series allows you to view a number of values of the series which may or may not be contiguous.
# Pass a list of integer values
data.iloc[[1, 2, 4]]
3. Using slice objects
By using slicing, you can view a number of contiguous values of a Series at once.
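As a minimal sketch, slicing the data Series created above might look like this; the bounds 1 and 4 are chosen only for illustration.

```python
import pandas as pd

data = pd.Series(['Bitcoin', 'Ethereum', 'Litecoin', 'Cardano', 'Polkadot'])

# Positions 1, 2 and 3; the stop position 4 is excluded.
middle = data.iloc[1:4]

print(middle.tolist())
```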
4. Using Boolean values
You can also use boolean values to display the values of a series.
The syntax is similar to the one used for subsetting a DataFrame. Pass a list with one boolean value for every element of the Series.
- For displaying a particular index, pass the boolean value True.
- For hiding an index, pass the boolean value False.
data.iloc[[True, False, True, True, False]]
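For longer Series, the boolean list can be derived from a condition instead of being typed by hand. The sketch below is one way to do it: iloc generally rejects a boolean Series as a mask, so the result of the condition is converted to a plain NumPy array first. The 'coin' suffix test is an arbitrary condition chosen for illustration.

```python
import pandas as pd

data = pd.Series(['Bitcoin', 'Ethereum', 'Litecoin', 'Cardano', 'Polkadot'])

# Build the mask from a condition, then convert it to a plain NumPy array,
# because iloc expects a boolean array rather than a boolean Series.
mask = data.str.endswith('coin').to_numpy()

print(mask.tolist())
print(data.iloc[mask].tolist())
```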
Practical Tips for Pandas iloc
- When subsetting a DataFrame with lists of integers or slice objects, you cannot pass column positions without also passing row positions; use a colon (:) for the rows if you want all of them. Passing only row positions, without any column positions, is allowed.
- Try to avoid subsetting DataFrames or Series with hand-written Boolean values, as it may not be feasible to pass a False boolean value for every row index of the DataFrame or Series.
- Use the value -1 as the index value for subsetting the last row or the last column.
# Display all the values of the last column, down the rows
df.iloc[:, -1]
# Display all the values of the last row, across all the columns
df.iloc[-1, :]
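The first and last tips can be tried together on a small frame. This is a sketch on a hypothetical toy frame, showing the colon placeholder for the rows and a few negative positions:

```python
import pandas as pd

# Hypothetical toy frame, not the article's dataset.
toy = pd.DataFrame({"a": [1, 2, 3, 4], "b": [5, 6, 7, 8], "c": [9, 10, 11, 12]})

cols_only = toy.iloc[:, [0, 2]]  # columns only: the rows still need a ':' placeholder
last_two_rows = toy.iloc[-2:]    # negative slice: the last two rows
last_cell = toy.iloc[-1, -1]     # last row, last column

print(cols_only.columns.tolist(), len(last_two_rows), last_cell)
```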
In this article, you have learnt how to subset pandas DataFrames and Series using:
- Single integer values
- Lists of integer values
- Slice objects
- Boolean values
Test Your Knowledge for Pandas iloc
Q1: In the Boolean method for subsetting DataFrames, you only need to pass the True value at the corresponding position of the index as a list. True or False?
Answer: False. You need to pass both the True value at the position of the index to be displayed and the False value at all the other indices which should not be displayed.
Q2: You have a DataFrame stored in a variable df. Write the code to subset all the rows and the first four columns of the DataFrame using the slicing method.
Answer: df.iloc[:, :4]
Q3: Write the code to view the second, fourth and fifth row and the seventh and ninth column of the DataFrame stored in the variable df using the iloc method.
Answer: df.iloc[[1, 3, 4], [6, 8]]
Q4: Write the code to view the data point at the row index 234 and the column index 3 of a DataFrame stored in the variable df.
Answer: df.iloc[234, 3]