
How to Replace Column Values in Pandas DataFrame

In this Pandas article, we will look at several ways to replace column values in a Pandas DataFrame based on a condition.
This is a commonly asked question in data analysis, data engineering, and data science interviews, and it is also useful in real Pandas applications where you need to replace specific values in a column when a condition is met.

Pandas provides multiple ways to replace the values of a DataFrame column. Let's explore each of them with an example.

You can learn more about Python Pandas on the Python Pandas tutorial page.

I have prepared a sample CSV dataset, and this sample data is used throughout the article.

Requirement: replace the value 'Male' with 'M' and 'Female' with 'F' in the 'emp_gender' column of the Pandas DataFrame.

Load CSV Data into Pandas DataFrame

Pandas has a function called read_csv() that loads CSV data into a Pandas DataFrame. We need the data in a DataFrame first, because all of the replacement methods below operate on a DataFrame or on one of its columns.

Let’s load CSV data into Pandas DataFrame using the read_csv() method.


import pandas as pd
df = pd.read_csv(
                 '../../Datasets/employee_dataset.csv'
                )
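
The CSV file itself is not reproduced in this article, so if you want to follow along without it, here is a minimal sketch that builds a comparable DataFrame from an in-memory string. Only the 'emp_gender' column name comes from the article; the other columns and all of the rows are made up for illustration.

```python
import io
import pandas as pd

# Hypothetical sample rows; only the emp_gender column name comes from the article.
csv_text = """emp_id,emp_name,emp_gender
1,Asha,Female
2,Ravi,Male
3,Meera,Female
4,John,Male
"""

# read_csv() accepts a file-like object as well as a file path.
df = pd.read_csv(io.StringIO(csv_text))
print(df)
```

Any of the examples in the following sections should work the same way on a DataFrame built like this.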

Now that the CSV data is loaded into a Pandas DataFrame, let's explore the methods for replacing column values based on a condition.

👉 Read CSV File into Pandas DataFrame

Replace Column Values in Pandas DataFrame using the Assignment Operator

The assignment operator ( = ) can be used to overwrite a specific column of the Pandas DataFrame with a modified copy of itself. In the example below, I call replace() on the 'emp_gender' column to turn 'Male' into 'M' and 'Female' into 'F', then assign the result back to the same column.


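import pandas as pd
df = pd.read_csv(
                 '../../Datasets/employee_dataset.csv'
                )

# replace() matches whole cell values by default, so 'Female' is not touched by the 'Male' rule
df['emp_gender'] = df['emp_gender'].replace('Male', 'M').replace('Female', 'F')
print(df)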
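
The two chained replace() calls can also be written as a single call that takes a dictionary of old-to-new values. The sketch below uses a small hypothetical DataFrame instead of the CSV file so that it runs on its own.

```python
import pandas as pd

# Hypothetical stand-in for the employee dataset.
df = pd.DataFrame({
    "emp_name": ["Ravi", "Asha", "John"],
    "emp_gender": ["Male", "Female", "Male"],
})

# One replace() call with a mapping of old value -> new value.
df["emp_gender"] = df["emp_gender"].replace({"Male": "M", "Female": "F"})
print(df)
```

This form is handy when the list of values to replace grows, since the mapping can be built or loaded separately.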

Replace Column Values in Pandas DataFrame using the replace() Method

The Pandas DataFrame has a method called replace() that replaces values across the DataFrame. It accepts single values, lists, and dictionaries, which makes it one of the most convenient ways to replace values.
By default, replace() returns a new DataFrame and leaves the original untouched. To modify the existing DataFrame directly, you can pass inplace=True. Here I do not pass inplace=True; instead, I assign the returned DataFrame back to df.

In the following example, I have replaced 'Male' with 'M' and 'Female' with 'F' using the replace() method. The nested dictionary limits the replacement to the 'emp_gender' column.


import pandas as pd
df = pd.read_csv(
                 '../../Datasets/employee_dataset.csv'
                )

# Outer key: column name; inner dictionary: old value -> new value
df = df.replace({'emp_gender': {'Male': 'M', 'Female': 'F'}})
print(df)
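
To see that replace() returns a new DataFrame and that the nested dictionary only affects the named column, here is a minimal sketch. The DataFrame and the 'manager_gender' column are hypothetical and exist only to show that other columns are left alone.

```python
import pandas as pd

# Hypothetical data; manager_gender is an invented second column.
df = pd.DataFrame({
    "emp_gender": ["Male", "Female"],
    "manager_gender": ["Female", "Male"],
})

# Without inplace=True the result is a new DataFrame.
new_df = df.replace({"emp_gender": {"Male": "M", "Female": "F"}})

print(df["emp_gender"].tolist())          # original column
print(new_df["emp_gender"].tolist())      # replaced column
print(new_df["manager_gender"].tolist())  # column not named in the mapping
```

Keeping the result in a separate variable, as done here, is a simple way to preserve the original data for comparison.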

Replace Column Values in Pandas DataFrame using loc Property

loc is a Pandas DataFrame property used to access a group of rows and columns by labels or by a boolean array. It can also be used to replace column values, either based on a condition or unconditionally.

Remember that assigning through loc modifies the existing DataFrame in place, so be careful when using it.

Let's see how to use the loc property.


import pandas as pd
df = pd.read_csv(
                 '../../Datasets/employee_dataset.csv'
                )

# Row selector: boolean condition; column selector: the column to overwrite
df.loc[df['emp_gender'] == 'Male', 'emp_gender'] = 'M'
df.loc[df['emp_gender'] == 'Female', 'emp_gender'] = 'F'
print(df)
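
Because loc only writes to the rows selected by the condition, values that match neither condition stay as they are. The following sketch illustrates this on a hypothetical column that also contains an 'Other' value and a missing value.

```python
import pandas as pd

# Hypothetical data with a value outside Male/Female and a missing entry.
df = pd.DataFrame({"emp_gender": ["Male", "Female", "Other", None]})

# Build the boolean row selector first, then assign through loc.
is_male = df["emp_gender"] == "Male"
df.loc[is_male, "emp_gender"] = "M"

is_female = df["emp_gender"] == "Female"
df.loc[is_female, "emp_gender"] = "F"

print(df)
```

Storing the condition in a named variable is optional, but it makes longer conditions easier to read and reuse.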

Replace Column Values in Pandas DataFrame using np.where() Method

NumPy is another popular open-source Python library. It provides a function called where() that picks one of two values for each row depending on a condition, so it can be used to replace values in a column. To use it, import where() from the numpy module.

In the example below, I have replaced 'Male' with 'M' and 'Female' with 'F' using the NumPy where() function. Note that every value that is not 'Male', including a missing value, becomes 'F', so this form is only safe when the column contains just these two values.


import pandas as pd
from numpy import where
df = pd.read_csv(
                 '../../Datasets/employee_dataset.csv'
                )

df['emp_gender'] = where(df['emp_gender'] == 'Male', 'M', 'F')
print(df)
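
If the column may contain values other than 'Male' and 'Female', one option is NumPy's select() function, which takes a list of conditions, a matching list of replacement values, and a default for rows where no condition holds. This is a sketch of that alternative on hypothetical data, not part of the original example.

```python
import numpy as np
import pandas as pd

# Hypothetical data with a value that should be left unchanged.
df = pd.DataFrame({"emp_gender": ["Male", "Female", "Other"]})

conditions = [
    df["emp_gender"] == "Male",
    df["emp_gender"] == "Female",
]
choices = ["M", "F"]

# Rows matching no condition fall back to their current value.
df["emp_gender"] = np.select(
    conditions, choices, default=df["emp_gender"].to_numpy()
)
print(df)
```

Passing the existing column as the default is what keeps unmatched rows intact.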

Replace Column Values in Pandas DataFrame using the mask() Method

mask() is another Pandas method that replaces existing values with a new value according to a condition. It takes the condition as its first parameter and the replacement value as its second parameter. Values are replaced wherever the condition evaluates to True, and all other values are kept.

Let’s see an example of using the DataFrame mask() method in order to replace the value in a specific column of Pandas DataFrame.


import pandas as pd
df = pd.read_csv(
                 '../../Datasets/employee_dataset.csv'
                )

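# Assign the result back instead of using inplace=True on df['emp_gender'],
# which is chained assignment and may leave df unchanged in newer pandas versions
df['emp_gender'] = df['emp_gender'].mask(df['emp_gender'] == 'Male', 'M')
df['emp_gender'] = df['emp_gender'].mask(df['emp_gender'] == 'Female', 'F')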
print(df)
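
Pandas also has a where() method on Series and DataFrame, which is the counterpart of mask(): it keeps values where the condition is True and replaces the rest. The sketch below, on a hypothetical three-row DataFrame, shows that the two give the same result when the condition is inverted.

```python
import pandas as pd

# Hypothetical stand-in for the employee dataset.
df = pd.DataFrame({"emp_gender": ["Male", "Female", "Male"]})
gender = df["emp_gender"]

# mask(): replace where the condition is True.
masked = gender.mask(gender == "Male", "M")

# where(): keep where the condition is True, replace everywhere else.
kept = gender.where(gender != "Male", "M")

print(masked.tolist())
print(masked.equals(kept))
```

Which one to use is mostly a matter of which condition reads more naturally for the problem at hand.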

Conclusion

In this article, we have seen various ways to replace column values in a Pandas DataFrame based on a condition. You now have multiple ways to tackle this problem, both in interviews and in your own Pandas applications. You can use any of them whenever you need to replace specific values in a specific column of a Pandas DataFrame.

If you found this article helpful, please share and keep visiting for further Pandas tutorials.

Thanks for your valuable time…
