
Pandas reset index – How to reset the index and convert the index to a column?


DataFrame.reset_index in pandas resets the index of a dataframe to the default integer index (0 to number of rows minus 1), or removes one or more levels of a multi-level index. By default, the original index is converted to a column.

By the end of this article, you will know the different features of the reset_index function and the parameters you can customize to get the desired output. The article also covers use cases closely related to resetting the index in pandas.

DataFrame.reset_index

 

Syntax

    • DataFrame.reset_index(level=None, drop=False, inplace=False, col_level=0, col_fill='')

Purpose

    • Reset the index, or a level of it. Reset the index of the DataFrame, and use the default one instead. If the DataFrame has a MultiIndex, this method can remove one or more levels.

Parameters

    • level:
          • int, str, tuple or list, (default None) Only remove the provided levels from the index. Removes all the levels by default.
    • drop:
          • bool, (default False) If True, do not add the old index to the dataframe as a column. By default, it is added.
    • inplace:
          • bool, (default False) Modify the current dataframe object in place instead of returning a new one.
    • col_level:
          • int or str, (default 0) If the columns have multiple levels, determines at which level the labels are to be inserted. By default, it is inserted into the first level (0).
    • col_fill:
          • object, (default '') If the columns have multiple levels, determines how the other levels are named. If None then the index name is repeated.

Returns

    • DataFrame or None, DataFrame with the new index or None if inplace=True

       

1. How to reset the index?

To reset the index in pandas, you simply need to chain the function .reset_index() with the dataframe object.

Step 1: Create a simple DataFrame

import pandas as pd
import numpy as np
# A dataframe with an initial index. The marks represented here are out of 50

df = pd.DataFrame({
                    'Networking': [45, 34, 23, 8, 21],
                    'Web Engineering': [32, 43, 23, 50, 21],
                    'Complier Design': [14, 42, 21, 12, 45]
                  }, index=['Abhishek', 'Saumya', 'Ayushi', 'Saksham', 'Rajveer']
                 )
df

Step 2: Reset the index

df.reset_index()

On applying the .reset_index() function, the old index is moved into the dataframe as a separate column named index. The new index of the dataframe is now integers ranging from 0 to the length of the dataframe minus 1.
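
A quick way to confirm this behaviour is to inspect the result in code. The sketch below rebuilds a smaller version of the dataframe (two subjects and three students, chosen only for brevity) and checks the new column name and the new index.

```python
import pandas as pd

# Smaller stand-in for the marks dataframe used above
df = pd.DataFrame(
    {'Networking': [45, 34, 23], 'Web Engineering': [32, 43, 23]},
    index=['Abhishek', 'Saumya', 'Ayushi'],
)

result = df.reset_index()

# The old labels now live in a column called 'index'
print(result.columns.tolist())   # ['index', 'Networking', 'Web Engineering']

# The new index runs from 0 to len(df) - 1
print(result.index.tolist())     # [0, 1, 2]
```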

2. What happens if a named index is reset?

For a dataframe with a named index, that is, an index with a name assigned to it, the index name is used as the column name instead of the default name index.

Step 1: Create a DataFrame with Named Index

# Create a Series with name
namedIndex = pd.Series(['Abhishek', 'Saumya', 'Ayushi', 'Saksham', 'Rajveer'], name='initial_index')

# Create the dataframe and pass the named series as index
df = pd.DataFrame({
                    'Networking': [45, 34, 23, 8, 21],
                    'Web Engineering': [32, 43, 23, 50, 21],
                    'Complier Design': [14, 42, 21, 12, 45]
                    }, index=namedIndex
                 )
df

Step 2: Reset the Index

Resetting the index in this case returns a dataframe with initial_index as the column name for the old index:

df.reset_index()
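
If the index has no name yet, one option is to give it a name just before resetting. The sketch below uses rename_axis for this on a reduced dataframe; the name student is only an illustrative choice and is not part of the example above.

```python
import pandas as pd

df = pd.DataFrame(
    {'Networking': [45, 34, 23]},
    index=['Abhishek', 'Saumya', 'Ayushi'],
)

# Name the index, then reset it; the new column takes the index name
result = df.rename_axis('student').reset_index()

print(result.columns.tolist())   # ['student', 'Networking']
```

Because rename_axis returns a new dataframe, the original df keeps its unnamed index.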

3. How to persist the change?

Consider the dataframe below, on which reset_index is called:

# Create the dataframe
df = pd.DataFrame({
                    'Networking': [45, 34, 23, 8, 21],
                    'Web Engineering': [32, 43, 23, 50, 21],
                    'Complier Design': [14, 42, 21, 12, 45]
                    }, index=['Abhishek', 'Saumya', 'Ayushi', 'Saksham', 'Rajveer']
                    )

# reset the index
df.reset_index()

The returned output shows the dataframe with its index reset. But if you check the dataframe itself, the change was not applied to it:

df

If you want to retain your changes, pass the inplace parameter and set its value to True, so that the index reset is applied to the dataframe object itself when reset_index runs.

# reset the index with inplace=True
df.reset_index(inplace=True)
df
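
An alternative to inplace=True is to assign the returned dataframe to a variable. The sketch below, on a reduced dataframe, compares both routes. Note that with inplace=True the call itself returns None, so its result should not be assigned.

```python
import pandas as pd

df = pd.DataFrame(
    {'Networking': [45, 34, 23]},
    index=['Abhishek', 'Saumya', 'Ayushi'],
)

# Option 1: keep the returned dataframe by assigning it
df_assigned = df.reset_index()

# Option 2: modify a copy in place; the call returns None
df_inplace = df.copy()
returned = df_inplace.reset_index(inplace=True)

print(returned)                         # None
print(df_assigned.equals(df_inplace))   # True
```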

4. How to drop the old index?

You might want to discard the old index rather than have it added as a column while resetting. Though you can do this manually with the .drop() function after resetting, you can save that step by passing drop=True while resetting the index.

Step 1: Create a DataFrame

df = pd.DataFrame({
                    'Networking': [45, 34, 23, 8, 21],
                    'Web Engineering': [32, 43, 23, 50, 21],
                    'Complier Design': [14, 42, 21, 12, 45]
                    }, index=['Abhishek', 'Saumya', 'Ayushi', 'Saksham', 'Rajveer']
                    )
df

Step 2: Reset the index with drop=True

df.reset_index(drop=True)
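
To see that drop=True matches the manual route, the following sketch (on a reduced dataframe) performs both and compares the results.

```python
import pandas as pd

df = pd.DataFrame(
    {'Networking': [45, 34, 23]},
    index=['Abhishek', 'Saumya', 'Ayushi'],
)

# Two-step route: reset, then drop the 'index' column that was added
manual = df.reset_index().drop(columns='index')

# One-step route
direct = df.reset_index(drop=True)

print(manual.equals(direct))     # True
print(direct.columns.tolist())   # ['Networking']
```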

5. How to convert a column to an index?

The reverse operation, turning an existing column into the index, is done with set_index. The chosen column replaces the current index, as the following steps show:

Step 1: Create a DataFrame with initial index

df = pd.DataFrame({
                    'Name': ['Abhishek', 'Saumya', 'Ayushi', 'Saksham', 'Rajveer'],
                    'Networking': [45, 34, 23, 8, 21],
                    'Web Engineering': [32, 43, 23, 50, 21],
                    'Complier Design': [14, 42, 21, 12, 45],
                  }, index=['One', 'Two', 'Three', 'Four', 'Five']
                 )
df

Step 2: Set the column as Index using set_index

# Set 'Name' as the index of the dataframe
df.set_index('Name', inplace=True)
df
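
Note that set_index replaces the existing index, so the labels One to Five are discarded. If those labels matter, one possible approach (a sketch, not the only option) is to pass append=True, which keeps the old index as an additional level. The example uses a two-row version of the dataframe for brevity.

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['Abhishek', 'Saumya'], 'Networking': [45, 34]},
    index=['One', 'Two'],
)

# Default: the old labels 'One', 'Two' are replaced by the 'Name' values
replaced = df.set_index('Name')
print(replaced.index.tolist())    # ['Abhishek', 'Saumya']

# append=True keeps the old index and adds 'Name' as a second level
appended = df.set_index('Name', append=True)
print(appended.index.tolist())    # [('One', 'Abhishek'), ('Two', 'Saumya')]

# reset_index undoes set_index and brings 'Name' back as a column
restored = replaced.reset_index()
print(restored.columns.tolist())  # ['Name', 'Networking']
```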

6. How to reset multi-level index?

# Create a Multi-Level Index
newIndex = pd.MultiIndex.from_tuples(
                                      [('IT', 'Abhishek'),
                                       ('IT', 'Rajveer'),
                                       ('CSE', 'Saumya'),
                                       ('CSE', 'Saksham'),
                                       ('EEE', 'Ayushi')
                                      ],
                                  names=['Branch', 'Name'])


# Optionally, you can also create multilevel columns
columns = pd.MultiIndex.from_tuples(
                                    [('subject1', 'Networking'),
                                     ('subject2', 'Web Engineering'),
                                     ('subject3', 'Complier Design')
                                    ])

df = pd.DataFrame([
                    (45, 32, 14),
                    (21, 21, 25),
                    (23, 23, 21),
                    (8, 50, 12),
                    (34, 43, 42)      
                    ], index=newIndex, 
                    columns=columns)
df

Here you can see that a Branch value can map to multiple rows. This is a multi-level index. A multi-level index shows the data at finer granularity and is very useful when dealing with hierarchical data.
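
The levels of such an index can also be inspected in code. The following sketch uses a reduced version of the dataframe (three rows, one plain column) and shows the number of levels, their names, and a selection on the outer level.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [('IT', 'Abhishek'), ('IT', 'Rajveer'), ('CSE', 'Saumya')],
    names=['Branch', 'Name'],
)
df = pd.DataFrame({'Networking': [45, 21, 23]}, index=index)

print(df.index.nlevels)        # 2
print(list(df.index.names))    # ['Branch', 'Name']

# Selecting on the outer level returns every row under that branch
it_rows = df.loc['IT']
print(it_rows.index.tolist())  # ['Abhishek', 'Rajveer']
```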

If you apply the .reset_index() function to this type of dataframe, by default all the levels are moved into the dataframe as columns:

# convert multi-level index to columns.
df.reset_index()

Suppose you want to reset the index at the Branch level only. To do so, pass the level parameter to the reset_index function.

df.reset_index(level='Branch')

Name still remains as the index, because we specified Branch as the only level to reset.
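
The level parameter also accepts an integer position or a list of levels, as listed in the parameters section. The sketch below illustrates both on a reduced dataframe with a single plain column, which is a simplification of the example above.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [('IT', 'Abhishek'), ('IT', 'Rajveer'), ('CSE', 'Saumya')],
    names=['Branch', 'Name'],
)
df = pd.DataFrame({'Networking': [45, 21, 23]}, index=index)

# 'Branch' is level 0, so both calls give the same result
by_name = df.reset_index(level='Branch')
by_position = df.reset_index(level=0)
print(by_name.equals(by_position))   # True
print(by_name.index.name)            # Name

# A list resets several levels at once; here that is every level
both = df.reset_index(level=['Branch', 'Name'])
print(both.columns.tolist())         # ['Branch', 'Name', 'Networking']
```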

7. Reset only one level in multi-level index

Consider our previous dataframe when it was reset at Branch level:

df.reset_index(level='Branch')

You can see that the Branch column, on being reset, is placed at the top column level (0) by default. You can change this by specifying the col_level parameter.

It defines the level at which the shifted index column should be placed. Look at an implementation below:

# Changing the level of column to 1
df.reset_index(level='Branch', col_level=1)
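
One way to see the effect of col_level without a rendered table is to print the label of the inserted column, which is a tuple with one entry per column level. The sketch below uses a two-row, two-subject version of the dataframe.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [('IT', 'Abhishek'), ('CSE', 'Saumya')], names=['Branch', 'Name']
)
columns = pd.MultiIndex.from_tuples(
    [('subject1', 'Networking'), ('subject2', 'Web Engineering')]
)
df = pd.DataFrame([(45, 32), (23, 23)], index=index, columns=columns)

top = df.reset_index(level='Branch')                 # col_level=0 (default)
lower = df.reset_index(level='Branch', col_level=1)

# Each column label is a (level 0, level 1) tuple
print(top.columns[0])     # ('Branch', '')
print(lower.columns[0])   # ('', 'Branch')
```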

8. How to fill void levels?

Continuing the previous example, you can see that as the Branch column level has been lowered (level 1), a void has been created at the level above it:

df.reset_index(level='Branch', col_level=1)

You can fill this level too, using the col_fill parameter, which takes the label to place in the empty level.

df.reset_index(level='Branch', col_level=1, col_fill='Department')
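
Printing the column label again makes the filled level visible. The sketch below repeats the call on a two-row, two-subject version of the dataframe and then selects the new top-level group by its label.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [('IT', 'Abhishek'), ('CSE', 'Saumya')], names=['Branch', 'Name']
)
columns = pd.MultiIndex.from_tuples(
    [('subject1', 'Networking'), ('subject2', 'Web Engineering')]
)
df = pd.DataFrame([(45, 32), (23, 23)], index=index, columns=columns)

named = df.reset_index(level='Branch', col_level=1, col_fill='Department')

# The empty top level is now filled with 'Department'
print(named.columns[0])                       # ('Department', 'Branch')

# The filled label can be used to select the group of columns under it
print(named['Department'].columns.tolist())   # ['Branch']
```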

9. Practical Tips

The .reset_index() function is very useful after preprocessing steps such as removing rows with null values or filtering data.

These steps may return a dataframe whose index is no longer continuous. Let’s try a small example.

# Create a dataframe

df = pd.DataFrame({
                    'Name': ['Abhishek', 'Saumya', 'Ayushi', 'Ayush', 'Saksham', 'Rajveer'],
                    'Networking': [45, 34, 23, np.nan, 8, 21],
                    'Web Engineering': [32, 43, 23, np.nan, 50, 21],
                    'Complier Design': [14, 42, 21, 14, 12, 45]
                    })

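# Percentage of total marks (3 subjects x 50 = 150); numeric_only=True skips the Name column
df['Percentage'] = (df.sum(axis=1, numeric_only=True) / 150 * 100).round(2)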

df
# Drop null values
df.dropna(axis=0, inplace=True)

# filter rows with percentage > 55
output = df[df.Percentage > 55]

output

As you can see in the table above, the indexing of rows has changed. Initially it was 0,1,2… but now it has changed to 0,1,5.

In such cases, you can use .reset_index() function to number the rows in the right order.

# Set drop=True if you don't want old index to be added as column
output.reset_index(drop=True)
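
The same cleanup can be written as a single method chain, which avoids modifying the original dataframe in place. This is a sketch on a reduced dataframe; the threshold of 30 marks is an arbitrary value chosen for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['Abhishek', 'Saumya', 'Ayush', 'Rajveer'],
    'Networking': [45, 34, np.nan, 21],
})

# Drop incomplete rows, filter, and renumber the rows in one chain
output = (
    df.dropna()
      .loc[lambda d: d['Networking'] > 30]
      .reset_index(drop=True)
)

print(output.index.tolist())     # [0, 1]
print(output['Name'].tolist())   # ['Abhishek', 'Saumya']
```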

10. Test your knowledge

Q1: The pandas dataframe index is reset as soon as the .reset_index() function is applied to it. True or False?

Answer: False. reset_index() returns a new dataframe and leaves the original unchanged. To apply the change to the original, pass inplace=True or assign the result back.

Q2: What is the use of drop parameter in .reset_index() function?


Answer: It is used to avoid old index being added to pandas dataframe while resetting the index.

Q3: Which parameter is used to change the default level of the column while resetting a multi-level index?

Answer: We use col_level parameter to define the level of column.

Q4: Answer the following questions using the given dataset.

import pandas as pd
import numpy as np

# Multi-Level Index
newIndex = pd.MultiIndex.from_tuples(
                                      [('IT', 'Abhishek'),
                                       ('IT', 'Rajveer'),
                                       ('MAE', 'Jitender'),
                                       ('CSE', 'Saumya'),
                                       ('CSE', 'Saksham'),
                                       ('CSE', 'Ayushi')
                                      ],
                                  names=['Branch', 'Name'])


# Multilevel columns
columns = pd.MultiIndex.from_tuples(
                                    [('subject1', 'Computer Graphics'),
                                     ('subject2', 'Artificial Intelligence'),
                                     ('subject3', 'Micro Processors')
                                    ])

df = pd.DataFrame([
                    (21, 21, 25),
                    (45, 32, 14),
                    (8, 50, 12),
                    (23, 23, 21),
                    (34, 43, 42),
                    (42, 46, 21)
                    ], index=newIndex, 
                    columns=columns)

df['Percentage'] = round((df.sum(axis=1)/150)*100, 2)

df

Q4.1: Reset the index at branch level, and assign an upper level Department for branch. Save the output as ques1.


Answer:

# Make a copy of dataframe
ques1 = df.copy()

# Reset the index, define the column level, name to fill in col_fill
ques1.reset_index(level='Branch', col_level=1, col_fill='Department', inplace=True)

ques1

Q4.2: Use the output of Q4.1 to add an upper level named Metric for Percentage. Make sure that Name still remains the index.

Answer:

# Make a copy of dataframe
ques2 = ques1.copy()

# Reset the index so that names are shifted to dataframe as column
ques2.reset_index(inplace=True)

# Set the index as Percentage
ques2.set_index('Percentage', inplace=True)

# Reset the index with column level and col_fill defined
ques2.reset_index(col_level=1, col_fill='Metric', inplace=True)

# Set the index again as Name
ques2 = ques2.set_index('Name')
ques2

Q4.3: Calculate the rank of the students where branch is CSE and sorted in decreasing order of Percentage. Print rank and name of student both.


Answer:

# make a copy of dataframe
que3 = df.copy()

# Reset the index
que3.reset_index(inplace=True)

# filter the rows by Branch, and then sort by Percentage in decreasing order
output = que3[que3.Branch == 'CSE'].sort_values(by='Percentage', ascending=False)


# Reset the index
output.reset_index(drop=True, inplace=True)

print([(rank, name) for rank, name in zip(output.index.values + 1, output.Name.values)])

This blog has been contributed by Kaustubh Gupta, under the guidance of ML+ team.
