
Pandas Add Column

Let’s see how to add new columns to an existing pandas DataFrame.

Adding columns to a DataFrame is one of the most crucial operations you have to perform while working on a project. It is required for several reasons such as adding new data which is relevant to the problem you are trying to solve or adding new features to improve the performance of the machine learning model.

In this article, you will see a number of methods to add columns to a pandas DataFrame, followed by some practical tips.

Creating a DataFrame for demonstration

import pandas as pd

# Create the data as a dictionary
data_df = {'Name': ['Samsung', 'Huawei', 'Apple', 'Oppo', 'Vivo'],
           'Founder': ['Lee Byung-Chul', 'Ren Zhengfei', 'Steve Jobs', 'Tony Chen', 'Shen Wei'],
           'Year Founded in': [1938, 1987, 1976, 2004, 2009]}

# Create the DataFrame
df = pd.DataFrame(data_df)
df
Basic Dataframe

Using a List to add a column in pandas

Create the new column as a list of values and assign it directly to the pandas DataFrame. The list must have the same number of values as the DataFrame has rows.

# Create the new column as a list
new_col = ['Lee Kun-hee', 'Xu Zhijun', 'Tim Cook', 'Tony Chen', 'Shen Wei']

# Assign the list to the DataFrame as a column
df['Current Chairperson'] = new_col
df
Using List to add new column in pandas
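As a minimal sketch of the length requirement, the example below uses a smaller, three-row DataFrame (an assumption made for brevity, not the article's data) and shows that a list of the wrong length is rejected. The variable name length_error is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Samsung', 'Huawei', 'Apple']})

# A list with one value per row is assigned row by row
df['Founder'] = ['Lee Byung-Chul', 'Ren Zhengfei', 'Steve Jobs']

# A list with a different length is rejected with a ValueError
try:
    df['Year Founded in'] = [1938, 1987]
    length_error = None
except ValueError as exc:
    length_error = str(exc)

print(df.columns.tolist())
print(length_error)
```

The failed assignment leaves the DataFrame unchanged, so only the 'Founder' column is added.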

Using List unpacking to add columns in pandas

List unpacking is the process of assigning the items of an iterable (a list or a tuple) to several targets in a single statement. The number of targets on the left must equal the number of items on the right.

You can use the list unpacking operation to assign multiple columns at once.

# Create the lists
new_col1 = ['Lee Kun-hee', 'Xu Zhijun', 'Tim Cook', 'Tony Chen', 'Shen Wei']
new_col2 = ['Android', 'HarmonyOS', 'macOS', 'ColorOS', 'FuntouchOS']

# Assign both the lists to the DataFrame using list unpacking
df['Current Chairperson'], df['Operating System Used'] = [new_col1, new_col2]
df
Using List Unpacking to add new column in pandas
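The sketch below illustrates how the unpacking rule plays out, assuming a small three-row DataFrame. The names founders, years, unpack_error and the columns 'A', 'B', 'C' are hypothetical and used only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Samsung', 'Huawei', 'Apple']})
founders = ['Lee Byung-Chul', 'Ren Zhengfei', 'Steve Jobs']
years = [1938, 1987, 1976]

# One target per list: the first list goes to 'Founder', the second to 'Year Founded in'
df['Founder'], df['Year Founded in'] = founders, years

# Three targets but only two lists: Python raises ValueError before assigning anything
try:
    df['A'], df['B'], df['C'] = [founders, years]
    unpack_error = None
except ValueError as exc:
    unpack_error = str(exc)

print(df.columns.tolist())
print(unpack_error)
```

The right-hand side can be written as a list or as a bare tuple; both forms are unpacked the same way.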

 

Using a Dictionary to add a column in pandas

You can add a new column to a pandas DataFrame using a dictionary. The keys of the dictionary should be the values of an existing column, and the value for each key will be the corresponding value of the new column.
After making the dictionary, pass its values as the new column to the DataFrame. The values are assigned by position, so the dictionary entries must be in the same order as the rows of the DataFrame.

# Create the dictionary containing the data of the new column
col_dict = {'Samsung': 'Lee Kun-hee', 'Huawei': 'Xu Zhijun',
            'Apple': 'Tim Cook', 'Oppo': 'Tony Chen', 'Vivo': 'Shen Wei'}

# Assign the values of the dictionary as the values of the new column
df['Current chairperson'] = col_dict.values()
df
Using Dictionary to add new column in pandas
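If the dictionary order cannot be guaranteed to match the row order, one alternative (not the approach shown above) is to look each row up by key with Series.map. This is a minimal sketch on a three-row DataFrame assumed for illustration, with the dictionary deliberately in a different order and one key missing.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Apple', 'Samsung', 'Vivo']})

# Keys match values in the existing 'Name' column; note the different order
col_dict = {'Samsung': 'Lee Kun-hee', 'Apple': 'Tim Cook'}

# map() looks up each 'Name' in the dictionary; names without a key become NaN
df['Current Chairperson'] = df['Name'].map(col_dict)

print(df)
```

Because the lookup is done by key, each company gets its own chairperson regardless of how the dictionary is ordered.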

Using the DataFrame.insert() method

With the other methods, the new column is created at the end of the DataFrame. With the DataFrame.insert method, you can add a new column between existing columns instead of at the end of the pandas DataFrame.

  • Syntax: pandas.DataFrame.insert(loc, column, value, allow_duplicates=False)
  • Purpose: To add a new column to a pandas DataFrame at a user-specified location.
  • Parameters:
    • loc: Int. The integer position at which to insert the new column. The value must be between zero and the total number of columns, inclusive.
    • column: String, number, or other hashable object. The label of the new column as displayed in the DataFrame.
    • value: Scalar, Series, or array-like. The values of the new column.
    • allow_duplicates: Boolean (default: False). Whether a column whose label duplicates an existing column label may be added.
# Create a list which contains the values of the new column
new_col = ['Lee Kun-hee', 'Xu Zhijun', 'Tim Cook', 'Tony Chen', 'Shen Wei']

# Assign the column by specifying the index position, column name and the values of the column
df.insert(loc=2, column='Current Chairperson', value=new_col)
df
Using insert method
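The parameters can be sketched on a two-row DataFrame (assumed here for brevity). The example shows insertion between columns, the error raised for a duplicate label, and the allow_duplicates flag; duplicate_error is a hypothetical variable name.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Samsung', 'Apple'],
                   'Year Founded in': [1938, 1976]})

# loc=1 places the new column between 'Name' and 'Year Founded in'
df.insert(loc=1, column='Founder', value=['Lee Byung-Chul', 'Steve Jobs'])

# Inserting an existing label fails unless allow_duplicates=True
try:
    df.insert(loc=0, column='Name', value=['a', 'b'])
    duplicate_error = None
except ValueError as exc:
    duplicate_error = str(exc)

# loc may equal the number of columns, which appends at the end
df.insert(loc=len(df.columns), column='Name', value=df['Name'],
          allow_duplicates=True)

print(df.columns.tolist())
```

Note that insert modifies the DataFrame in place and returns None, so its result should not be assigned back to df.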

Using the DataFrame.assign() method

The DataFrame.assign method returns a new DataFrame that contains the original columns plus the newly added ones. The original DataFrame is left unchanged.

Keep in mind that column names are passed to this method as keyword arguments, so they are written without quotes and must be valid Python identifiers. You cannot pass a quoted string or an integer as the column name directly.

  • Syntax: pandas.DataFrame.assign(**kwargs)
  • Purpose: To return a new DataFrame object having the new columns along with the columns of the original DataFrame.
  • Parameters: **kwargs: Keyword arguments of the form column_name=values that specify the columns to be added.
  • Returns: pandas DataFrame
# Create a list which contains the values of the new column
new_col = ['Lee Kun-hee', 'Xu Zhijun', 'Tim Cook', 'Tony Chen', 'Shen Wei']

# Assign the column to the DataFrame
df_2 = df.assign(Chairperson=new_col)
df_2
Using assign method
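The following sketch, on a two-row DataFrame assumed for illustration, shows that assign leaves the original untouched, that a value can be a callable computed from the DataFrame, and that a label containing a space can be supplied by unpacking a dictionary. The 'Age' column and the reference year 2024 are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Samsung', 'Apple'],
                   'Year Founded in': [1938, 1976]})

# Keyword names become column labels; a callable receives the DataFrame itself
df_2 = df.assign(
    Chairperson=['Lee Kun-hee', 'Tim Cook'],
    Age=lambda d: 2024 - d['Year Founded in'],  # 2024 is an assumed reference year
)

# Labels that are not valid identifiers can be passed by unpacking a dictionary
df_3 = df.assign(**{'Current Chairperson': ['Lee Kun-hee', 'Tim Cook']})

print(df.columns.tolist())
print(df_2.columns.tolist())
print(df_3.columns.tolist())
```

Since df itself is not modified, the result has to be stored in a variable (or assigned back to df) to keep the new columns.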

Using the .loc[] indexer

The loc indexer uses row and column index labels to access rows and columns.
However, you can also use it to add a new column to a pandas DataFrame.

Inside the square brackets, the first entry is the row labels and the second entry is the column labels.

You can use the colon symbol (:) to select all the rows and pass the name of the new column as the second entry. Then, assign a list containing the values of the new column.

# Create a list which contains the values of the new column
new_col = ['Lee Kun-hee', 'Xu Zhijun', 'Tim Cook', 'Tony Chen', 'Shen Wei']

# Assign the column to the DataFrame
df.loc[:, 'Current Chairperson'] = new_col
df
Using .loc method
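Besides a full list, .loc also accepts a single value or a row selection on the left-hand side. The sketch below assumes a three-row DataFrame; the 'Industry' and 'Era' columns and their values are hypothetical and exist only to illustrate the behaviour.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Samsung', 'Huawei', 'Apple'],
                   'Year Founded in': [1938, 1987, 1976]})

# A single value is repeated for every row selected by the colon
df.loc[:, 'Industry'] = 'Electronics'

# A boolean row mask fills only the matching rows; the others are left as NaN
df.loc[df['Year Founded in'] < 1980, 'Era'] = 'Before 1980'

print(df)
```

The second form is convenient when the new column should only be filled for rows that meet a condition.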

Practical Tips

  • If you are creating a duplicate column from an existing column using any method other than the DataFrame.insert() method, make sure that the column name of the duplicate column is different from the original; otherwise the existing column is overwritten and no duplicate column is created. For creating duplicate columns with the same name, use the DataFrame.insert method and set the ‘allow_duplicates’ parameter to True.
  • While creating a new column using a dictionary, make sure to use the .values() method of the dictionary so that the values of the dictionary are passed, in insertion order, as the values of the new column. If you assign the dictionary itself, the result depends on the pandas version: the keys may be used as the column values, or the dictionary may be aligned against the row index.
  • All the methods other than the DataFrame.insert() method will add the columns at the end of the pandas DataFrame.

Test Your Knowledge

Q1: To make a new column using the DataFrame.assign function, pass the column name as a string and then assign the list of values to the function. True or False?

Answer: False. The column name is passed to the DataFrame.assign function as a keyword argument (without quotes), not as a string, and the list of values is assigned to that keyword.

Q2: What is the object returned when you add new columns using the DataFrame.assign function?

Answer: We will get a new DataFrame with the new columns added to the columns of the original DataFrame.

Q3: Identify the error in the code and write the correct code for the following:

df = df.assign("new_col_name") = new_col

Answer: df = df.assign(new_col_name=new_col)

Q4: Assign the lists col_1, col_2, col_3 to a DataFrame df as new_col_1, new_col_2, new_col_3 using list unpacking.

Answer: df["new_col_1"], df["new_col_2"], df["new_col_3"] = [col_1, col_2, col_3]

Q5: Assign the dictionary data_dict to the DataFrame df as new_col.

Answer: df['new_col'] = data_dict.values()

The article was contributed by Shreyansh B and Shri Varsheni.
