DataFrame.reset_index in pandas resets the index of a DataFrame to the default integer index (0 to the number of rows minus 1), or removes one or more levels of a multi-level index. In the process, the original index is converted to a column.
By the end of this article, you will know the different features of the reset_index function and the parameters that can be customized to get the desired output. The article also covers use cases closely related to resetting the index in pandas.
pandas DataFrame.reset_index
Syntax
- DataFrame.reset_index(level=None, drop=False, inplace=False, col_level=0, col_fill='')
Purpose
- Reset the index, or a level of it. Resets the index of the DataFrame and uses the default integer index instead. If the DataFrame has a MultiIndex, this method can remove one or more levels.
Parameters
- level: int, str, tuple or list (default None). Only remove the given levels from the index. By default, all levels are removed.
- drop: bool (default False). If True, the old index is discarded instead of being added to the DataFrame as a column. By default, it is added.
- inplace: bool (default False). If True, apply the change to the current DataFrame object instead of returning a new one.
- col_level: int or str (default 0). If the columns have multiple levels, determines at which level the labels are inserted. By default, they are inserted into the first level (0).
- col_fill: object (default ''). If the columns have multiple levels, determines how the other levels are named. If None, the index name is repeated.
Returns
- DataFrame or None: a DataFrame with the new index, or None if inplace=True.
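The return behaviour described above can be checked directly. This is a minimal sketch on a small made-up DataFrame; the score column and the labels a, b, c are illustrative only and do not come from the article's data.

```python
import pandas as pd

df = pd.DataFrame({'score': [45, 34, 23]}, index=['a', 'b', 'c'])

# Default call: a new DataFrame comes back, the original is untouched
result = df.reset_index()
print(list(result.columns))  # ['index', 'score']
print(list(df.index))        # ['a', 'b', 'c']

# inplace=True: the original is modified and the call returns None
returned = df.reset_index(inplace=True)
print(returned)              # None
print(list(df.columns))      # ['index', 'score']
```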
1. How to reset the index?
To reset the index in pandas, call the .reset_index() method on the DataFrame object.
Step 1: Create a simple DataFrame
import pandas as pd
import numpy as np
# A dataframe with an initial index. The marks represented here are out of 50
df = pd.DataFrame({
'Networking': [45, 34, 23, 8, 21],
'Web Engineering': [32, 43, 23, 50, 21],
'Complier Design': [14, 42, 21, 12, 45]
}, index=['Abhishek', 'Saumya', 'Ayushi', 'Saksham', 'Rajveer']
)
df

Step 2: Reset the index
df.reset_index()

Applying the .reset_index() function moves the old index into the DataFrame as a separate column named index. The new index of the DataFrame is a range of integers from 0 to the number of rows minus 1.
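To confirm the range of the new index, you can inspect it after the reset. The sketch below reuses one column of the example data to keep it short.

```python
import pandas as pd

df = pd.DataFrame(
    {'Networking': [45, 34, 23, 8, 21]},
    index=['Abhishek', 'Saumya', 'Ayushi', 'Saksham', 'Rajveer'],
)

result = df.reset_index()

# Five rows, so the new labels run from 0 to 4
print(result.index)        # RangeIndex(start=0, stop=5, step=1)
print(list(result.index))  # [0, 1, 2, 3, 4]
```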
2. What happens if a named index is reset?
A named index is an index that has a name assigned to it. When a DataFrame with a named index is reset, the name of the index is used as the name of the new column, instead of the default name index.
Step 1: Create a DataFrame with Named Index
# Create a Series with name
namedIndex = pd.Series(['Abhishek', 'Saumya', 'Ayushi', 'Saksham', 'Rajveer'], name='initial_index')
# Create the dataframe and pass the named series as index
df = pd.DataFrame({
'Networking': [45, 34, 23, 8, 21],
'Web Engineering': [32, 43, 23, 50, 21],
'Complier Design': [14, 42, 21, 12, 45]
}, index=namedIndex
)
df

Step 2: Reset the Index
Resetting the index in this case returns a DataFrame with initial_index as the column name for the old index:
df.reset_index()
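If the index has no name yet, one possible way to control the resulting column name is to name the index just before resetting it. The sketch below uses rename_axis for that; the name Student is a hypothetical choice made for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {'Networking': [45, 34, 23]},
    index=['Abhishek', 'Saumya', 'Ayushi'],
)

# Give the unnamed index a name, then reset it
named = df.rename_axis('Student')
result = named.reset_index()

print(named.index.name)      # Student
print(list(result.columns))  # ['Student', 'Networking']
```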

3. How to persist the change?
Consider the DataFrame below, whose index is reset:
# Create the dataframe
df = pd.DataFrame({
'Networking': [45, 34, 23, 8, 21],
'Web Engineering': [32, 43, 23, 50, 21],
'Complier Design': [14, 42, 21, 12, 45]
}, index=['Abhishek', 'Saumya', 'Ayushi', 'Saksham', 'Rajveer']
)
# reset the index
df.reset_index()

The output above shows a DataFrame with the index reset. However, reset_index returned a new DataFrame; if you check the original, the change was not applied to it:
df

If you want to retain your changes, pass the inplace parameter with its value set to True, so that the index reset is applied to the DataFrame object itself when the reset_index function runs.
# reset the index with inplace=True
df.reset_index(inplace=True)
df
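An alternative to inplace=True is to assign the returned DataFrame back to the same variable. A minimal sketch on a shortened version of the example data:

```python
import pandas as pd

df = pd.DataFrame(
    {'Networking': [45, 34, 23]},
    index=['Abhishek', 'Saumya', 'Ayushi'],
)

# Assign the returned DataFrame back to the same name
df = df.reset_index()

print(list(df.columns))  # ['index', 'Networking']
print(list(df.index))    # [0, 1, 2]
```

Both approaches leave df with the reset index; reassignment also fits naturally into chained method calls.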

4. How to drop the old index?
You may want to discard the old index instead of having it added to the DataFrame as a column when the index is reset. You could remove the column afterwards with the .drop() function, but passing drop=True while resetting the index does it in one step.
Step 1: Create a DataFrame
df = pd.DataFrame({
'Networking': [45, 34, 23, 8, 21],
'Web Engineering': [32, 43, 23, 50, 21],
'Complier Design': [14, 42, 21, 12, 45]
}, index=['Abhishek', 'Saumya', 'Ayushi', 'Saksham', 'Rajveer']
)
df

Step 2: Reset the index with drop=True
df.reset_index(drop=True)
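The manual route and the drop=True route can be compared side by side. This sketch assumes the old index is unnamed, so the column to remove manually is called index.

```python
import pandas as pd

df = pd.DataFrame(
    {'Networking': [45, 34, 23], 'Web Engineering': [32, 43, 23]},
    index=['Abhishek', 'Saumya', 'Ayushi'],
)

# Manual: reset first, then drop the column that holds the old index
manual = df.reset_index().drop(columns='index')

# Direct: discard the old index while resetting
direct = df.reset_index(drop=True)

print(manual.equals(direct))  # True
```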

5. How to convert a column to an index?
You can also go the other way and make an existing column the index of your DataFrame with set_index. Note that the previous index is discarded. Follow these steps:
Step 1: Create a DataFrame with initial index
df = pd.DataFrame({
'Name': ['Abhishek', 'Saumya', 'Ayushi', 'Saksham', 'Rajveer'],
'Networking': [45, 34, 23, 8, 21],
'Web Engineering': [32, 43, 23, 50, 21],
'Complier Design': [14, 42, 21, 12, 45],
}, index=['One', 'Two', 'Three', 'Four', 'Five']
)
df

Step 2: Set the column as Index using set_index
# Set 'Name' as the index of the dataframe
df.set_index('Name', inplace=True)
df
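Since set_index and reset_index undo each other, a short round trip can make the relationship clearer. This is a sketch on a shortened version of the data above; resetting before calling set_index is shown as one way to keep the old labels as a column.

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['Abhishek', 'Saumya', 'Ayushi'], 'Networking': [45, 34, 23]},
    index=['One', 'Two', 'Three'],
)

# Column -> index: the old labels 'One', 'Two', 'Three' are discarded
indexed = df.set_index('Name')
print(list(indexed.index))     # ['Abhishek', 'Saumya', 'Ayushi']

# Index -> column again
restored = indexed.reset_index()
print(list(restored.columns))  # ['Name', 'Networking']

# To keep the old labels, move them into a column first
kept = df.reset_index().set_index('Name')
print(list(kept.columns))      # ['index', 'Networking']
```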

6. How to reset multi-level index?
# Create a Multi-Level Index
newIndex = pd.MultiIndex.from_tuples(
[('IT', 'Abhishek'),
('IT', 'Rajveer'),
('CSE', 'Saumya'),
('CSE', 'Saksham'),
('EEE', 'Ayushi')
],
names=['Branch', 'Name'])
# Optionally, you can also create multilevel columns
columns = pd.MultiIndex.from_tuples(
[('subject1', 'Networking'),
('subject2', 'Web Engineering'),
('subject3', 'Complier Design')
])
df = pd.DataFrame([
(45, 32, 14),
(21, 21, 25),
(23, 23, 21),
(8, 50, 12),
(34, 43, 42)
], index=newIndex,
columns=columns)
df

Here you can see that a value of the Branch level can map to multiple rows. This is a multi-level index. A multi-level index shows the data at a finer granularity, and it can be very useful when dealing with hierarchical data.
If you apply the .reset_index() function to such a DataFrame, by default all the levels are moved into the DataFrame as columns:
# convert multi-level index to columns.
df.reset_index()

Suppose you want to reset the index only at the Branch level. To do so, pass the level parameter to the reset_index function.
df.reset_index(level='Branch')

The Name level still remains as the index, because we specified Branch as the only level to reset.
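The level parameter also accepts a position or a list of levels, as noted in the parameter description. The sketch below uses single-level columns to keep the output simple, which is a simplification of the article's DataFrame.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [('IT', 'Abhishek'), ('CSE', 'Saumya'), ('EEE', 'Ayushi')],
    names=['Branch', 'Name'],
)
df = pd.DataFrame({'Networking': [45, 23, 34]}, index=index)

# A level can be selected by name or by position (Branch is level 0)
by_name = df.reset_index(level='Branch')
by_position = df.reset_index(level=0)
print(by_name.equals(by_position))   # True
print(by_name.index.name)            # Name

# Listing every level gives the same result as the default call
all_levels = df.reset_index(level=['Branch', 'Name'])
print(all_levels.equals(df.reset_index()))  # True
```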
7. Reset only one level in multi-level index
Consider our previous DataFrame when it was reset at the Branch level:
df.reset_index(level='Branch')

You can see that the Branch column, on being reset, is placed at the top column level (0) by default. You can change this by specifying the col_level parameter, which defines the column level at which the label of the moved index is placed. Look at the implementation below:
# Changing the level of column to 1
df.reset_index(level='Branch', col_level=1)
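One way to see the effect of col_level without rendering the table is to print the label of the inserted column, which is a tuple with one entry per column level. A sketch on a reduced two-row, two-column version of the DataFrame:

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [('IT', 'Abhishek'), ('CSE', 'Saumya')], names=['Branch', 'Name'])
columns = pd.MultiIndex.from_tuples(
    [('subject1', 'Networking'), ('subject2', 'Web Engineering')])
df = pd.DataFrame([(45, 32), (23, 23)], index=index, columns=columns)

top = df.reset_index(level='Branch')                 # default col_level=0
lower = df.reset_index(level='Branch', col_level=1)

# The moved index is inserted as the first column
print(top.columns[0])    # ('Branch', '')
print(lower.columns[0])  # ('', 'Branch')
```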

8. How to fill void levels?
Continuing the previous example, you can see that because the Branch label has been lowered to level 1, a void has been created at the level above it:
df.reset_index(level='Branch', col_level=1)

You can fill this level too, using the col_fill parameter, which takes the name to put there.
df.reset_index(level='Branch', col_level=1, col_fill='Department')
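The filled label can be inspected as a tuple as well. The sketch below, again on a reduced version of the DataFrame, also shows col_fill=None, which repeats the index name as described in the parameter list.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [('IT', 'Abhishek'), ('CSE', 'Saumya')], names=['Branch', 'Name'])
columns = pd.MultiIndex.from_tuples(
    [('subject1', 'Networking'), ('subject2', 'Web Engineering')])
df = pd.DataFrame([(45, 32), (23, 23)], index=index, columns=columns)

filled = df.reset_index(level='Branch', col_level=1, col_fill='Department')
repeated = df.reset_index(level='Branch', col_level=1, col_fill=None)

print(filled.columns[0])    # ('Department', 'Branch')
print(repeated.columns[0])  # ('Branch', 'Branch')
```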

9. Practical Tips
The .reset_index() function is very useful after preprocessing steps such as dropping rows with null values or filtering data. These steps may return a DataFrame whose index is no longer continuous. Let’s try a small example.
# Create a dataframe
df = pd.DataFrame({
'Name': ['Abhishek', 'Saumya', 'Ayushi', 'Ayush', 'Saksham', 'Rajveer'],
'Networking': [45, 34, 23, np.nan, 8, 21],
'Web Engineering': [32, 43, 23, np.nan, 50, 21],
'Complier Design': [14, 42, 21, 14, 12, 45]
})
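# numeric_only=True skips the 'Name' column, which cannot be summed with numbers
df['Percentage'] = (df.sum(axis=1, numeric_only=True) / 150 * 100).round(2)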
df

# Drop null values
df.dropna(axis=0, inplace=True)
# filter rows with percentage > 55
output = df[df.Percentage > 55]
output

As you can see in the table above, the row index has changed. Initially it was 0, 1, 2, … but now it is 0, 1, 5. In such cases, you can use the .reset_index() function to renumber the rows consecutively.
# Set drop=True if you don't want old index to be added as column
output.reset_index(drop=True)
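The gap in the labels and its repair can be verified by printing the index before and after the reset. This sketch uses a reduced table with a precomputed Percentage column so that it runs on its own; the values match the example above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['Abhishek', 'Saumya', 'Ayushi', 'Ayush', 'Saksham', 'Rajveer'],
    'Percentage': [60.67, 79.33, 44.67, np.nan, 46.67, 58.0],
})

# Drop rows with nulls, then keep rows with percentage > 55
filtered = df.dropna()
filtered = filtered[filtered['Percentage'] > 55]
print(list(filtered.index))  # [0, 1, 5]

# Renumber the surviving rows from 0
clean = filtered.reset_index(drop=True)
print(list(clean.index))     # [0, 1, 2]
```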

10. Test your knowledge
Q1: The index of a pandas DataFrame is reset as soon as the .reset_index() function is applied to it. True or False?
Answer: False. reset_index returns a new DataFrame and leaves the original unchanged. To apply the change to the original, use the inplace parameter or assign the result back.
Q2: What is the use of the drop parameter in the .reset_index() function?
Answer: It prevents the old index from being added to the DataFrame as a column while resetting the index.
Q3: Which parameter is used to change the default column level while resetting a multi-level index?
Answer: We use the col_level parameter to define the column level.
Q4: Answer the following questions using the given dataset.
import pandas as pd
import numpy as np
# Multi-Level Index
newIndex = pd.MultiIndex.from_tuples(
[('IT', 'Abhishek'),
('IT', 'Rajveer'),
('MAE', 'Jitender'),
('CSE', 'Saumya'),
('CSE', 'Saksham'),
('CSE', 'Ayushi')
],
names=['Branch', 'Name'])
# Multilevel columns
columns = pd.MultiIndex.from_tuples(
[('subject1', 'Computer Graphics'),
('subject2', 'Artificial Intelligence'),
('subject3', 'Micro Processors')
])
df = pd.DataFrame([
(21, 21, 25),
(45, 32, 14),
(8, 50, 12),
(23, 23, 21),
(34, 43, 42),
(42, 46, 21)
], index=newIndex,
columns=columns)
df['Percentage'] = round((df.sum(axis=1)/150)*100, 2)
df

Q4.1: Reset the index at the Branch level, and assign an upper level Department for Branch. Save the output as ques1.
Answer:
# Make a copy of dataframe
ques1 = df.copy()
# Reset the index, define the column level, name to fill in col_fill
ques1.reset_index(level='Branch', col_level=1, col_fill='Department', inplace=True)
ques1

Q4.2: Use the output of Q4.1 to add an upper level named Metric for Percentage. Make sure that Name still remains the index.
Answer:
# Make a copy of dataframe
ques2 = ques1.copy()
# Reset the index so that names are shifted to dataframe as column
ques2.reset_index(inplace=True)
# Set the index as Percentage
ques2.set_index('Percentage', inplace=True)
# Reset the index with column level and col_fill defined
ques2.reset_index(col_level=1, col_fill='Metric', inplace=True)
# Set the index again as Name
ques2.set_index('Name')


Q4.3: Calculate the rank of the students whose branch is CSE, sorted in decreasing order of Percentage. Print both the rank and the name of each student.
Answer:
# make a copy of dataframe
que3 = df.copy()
# Reset the index
que3.reset_index(inplace=True)
# filter the rows by Branch, and then sort by Percentage in decreasing order
output = que3[que3.Branch == 'CSE'].sort_values(by='Percentage', ascending=False)
# Reset the index
output.reset_index(drop=True, inplace=True)
print([(rank, name) for rank, name in zip(output.index.values + 1, output.Name.values)])

This blog has been contributed by Kaustubh Gupta, under the guidance of ML+ team.