Channel: Planet Python

Python for Beginners: Pandas Replace Values in Dataframe or Series

In Python, we use pandas dataframes to handle tabular data. This article discusses different ways to replace values in a pandas dataframe or series.

This article only discusses how to replace multiple values in a pandas dataframe or series using the replace() method. If you want to understand the syntax of the replace() method and how to replace a single value in a pandas dataframe, you can read this article on pandas replace value.

Replace Multiple Values in a Series With One Value

To replace multiple values with one value in a pandas series, we will pass the list of values that need to be replaced as the first input argument to the replace() method. Next, we will pass the new value as the second input argument to the replace() method.

After execution, every value in the series that appears in the list passed as the first input argument is replaced by the second input argument. You can observe this in the following example.

import pandas as pd
numbers=[100,90,80,90,70,100,60]
series=pd.Series(numbers)
print("The series is:")
print(series)
newSeries=series.replace([100, 90],"Distinction")
print("The updated series is:")
print(newSeries)

Output:

The series is:
0    100
1     90
2     80
3     90
4     70
5    100
6     60
dtype: int64
The updated series is:
0    Distinction
1    Distinction
2             80
3    Distinction
4             70
5    Distinction
6             60
dtype: object

In the above example, the given series contains the marks of a student in different subjects. If we want to specify that the student has scored a distinction when they have 90 or 100 marks, we can pass the list [100, 90] as the first input argument and the string “Distinction” as the second input argument to the replace() method, as shown above. You can observe that we get the modified series after execution of the replace() method.
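The output above reports dtype: object for the updated series, because it now mixes strings and integers. The following minimal sketch contrasts this with replacing the same values by another integer. The replacement value 0 is an arbitrary choice made only for illustration and is not part of the original example.

```python
import pandas as pd

series = pd.Series([100, 90, 80, 90, 70, 100, 60])

# Replacing with a string mixes strings and integers in one series,
# so pandas stores the result with the object dtype.
labelled = series.replace([100, 90], "Distinction")

# Replacing with another integer (0 is a hypothetical placeholder)
# keeps the original integer dtype.
zeroed = series.replace([100, 90], 0)

print(labelled.dtype)
print(zeroed.dtype)
print(zeroed.tolist())
```

If you need to keep doing arithmetic on the series, a numeric replacement value is usually the more convenient choice.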

Replace Multiple Values in a Series With Different Values

To replace multiple values in a series with different values, you can use two approaches. 

Replace Multiple Values in a Series Using a List in Pandas

In the first approach to replace multiple values in a pandas series, we will pass a list of values that need to be replaced as the first input argument to the replace() method. Next, we will pass a list of replacement values to the replace() method as the second input argument. Here, the number of elements in both lists should be the same.

After execution of the replace() method, the values in the first list are replaced by the corresponding values in the second list passed as the input argument. You can observe this in the following example.

import pandas as pd
numbers=[100,90,80,90,70,100,60]
series=pd.Series(numbers)
print("The series is:")
print(series)
newSeries=series.replace([100, 90, 80, 70, 60],["Hundred", "Ninety", "Eighty", "Seventy", "Sixty"])
print("The updated series is:")
print(newSeries)

Output:

The series is:
0    100
1     90
2     80
3     90
4     70
5    100
6     60
dtype: int64
The updated series is:
0    Hundred
1     Ninety
2     Eighty
3     Ninety
4    Seventy
5    Hundred
6      Sixty
dtype: object

Replace Multiple Values in a Series Using a Python Dictionary

Instead of using the lists, you can pass a python dictionary to the replace() method to replace multiple values in a series with different values. For this, we will first create a dictionary that contains the values that have to be replaced as keys and the replacements as the associated value for each key. Then, we will invoke the replace() method on the series and pass the dictionary as the input argument. After execution of the replace() method, we will get the modified series as shown below.

import pandas as pd
numbers=[100,90,80,90,70,100,60]
series=pd.Series(numbers)
print("The series is:")
print(series)
newSeries=series.replace({100:"Hundred", 90:"Ninety", 80:"Eighty", 70:"Seventy", 60:"Sixty"})
print("The updated series is:")
print(newSeries)

Output:

The series is:
0    100
1     90
2     80
3     90
4     70
5    100
6     60
dtype: int64
The updated series is:
0    Hundred
1     Ninety
2     Eighty
3     Ninety
4    Seventy
5    Hundred
6      Sixty
dtype: object

In the above examples, we have replaced the integer values in the series with their English names.
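The list approach and the dictionary approach describe the same mapping in two different forms. As a small sketch, we can build the dictionary from the two lists using zip() and check that both calls give the same series. The variable names old_values, new_values, and mapping are chosen here for illustration only.

```python
import pandas as pd

series = pd.Series([100, 90, 80, 90, 70, 100, 60])

old_values = [100, 90, 80, 70, 60]
new_values = ["Hundred", "Ninety", "Eighty", "Seventy", "Sixty"]

# Pair each old value with its replacement to get the dictionary form.
mapping = dict(zip(old_values, new_values))

from_lists = series.replace(old_values, new_values)
from_dict = series.replace(mapping)

# equals() compares the two results element by element.
print(from_lists.equals(from_dict))
```

The dictionary form keeps each value next to its replacement, which can make long mappings easier to read and maintain.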

Replace Multiple Values in a Dataframe With One Value

To replace multiple values in a pandas dataframe with a single value, you can pass a list of values that need to be replaced as the first input argument to the replace() method. Next, you can pass the replacement value as the second input argument to the replace() method. After execution, every value in the dataframe that appears in the list is replaced by the second input argument.

You can observe this in the following example.

import pandas as pd
myDicts=[{"Roll":1,"Maths":100, "Physics":80, "Chemistry": 90},
        {"Roll":2,"Maths":80, "Physics":100, "Chemistry": 90},
        {"Roll":3,"Maths":90, "Physics":80, "Chemistry": 70},
        {"Roll":4,"Maths":100, "Physics":100, "Chemistry": 90},
        {"Roll":5,"Maths":90, "Physics":90, "Chemistry": 80},
        {"Roll":6,"Maths":80, "Physics":70, "Chemistry": 70}]
df=pd.DataFrame(myDicts)
print("The input dataframe is:")
print(df)
newDf=df.replace([100, 90],"Distinction")
print("The updated dataframe is:")
print(newDf)

Output:

The input dataframe is:
   Roll  Maths  Physics  Chemistry
0     1    100       80         90
1     2     80      100         90
2     3     90       80         70
3     4    100      100         90
4     5     90       90         80
5     6     80       70         70
The updated dataframe is:
   Roll        Maths      Physics    Chemistry
0     1  Distinction           80  Distinction
1     2           80  Distinction  Distinction
2     3  Distinction           80           70
3     4  Distinction  Distinction  Distinction
4     5  Distinction  Distinction           80
5     6           80           70           70

In this example, we have replaced 90 and 100 with the term “Distinction” using the replace() method.
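When replace() is invoked on the whole dataframe, it looks at every column, not only the columns that contain marks. The following sketch uses a small hypothetical dataframe in which a roll number happens to be 90, to show the difference between replacing on the whole dataframe and replacing on a single column.

```python
import pandas as pd

# Hypothetical data: the roll number 90 coincides with a mark of 90.
df = pd.DataFrame({"Roll": [90, 91], "Maths": [90, 80]})

# Replacing on the whole dataframe also rewrites the matching roll number.
whole = df.replace([100, 90], "Distinction")

# Replacing on one column leaves the other columns untouched.
limited = df.copy()
limited["Maths"] = limited["Maths"].replace([100, 90], "Distinction")

print(whole)
print(limited)
```

In the dataframe used in this article, the roll numbers run from 1 to 6, so the Roll column is never affected.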

Replace Multiple Values in a Dataframe With Different Values

In a similar manner to a series, you can use two approaches to replace multiple values in a dataframe with different values.

Replace Multiple Values in a Pandas Dataframe Using a List

In the first approach to replace multiple values in a pandas dataframe, we will pass a list of values that need to be replaced as the first input argument to the replace() method. Also, we will pass a list of replacement values to the replace() method as the second input argument. 

After execution of the replace() method, the values in the first list are replaced by the corresponding values in the second list passed as the input argument. You can observe this in the following example.

import pandas as pd
myDicts=[{"Roll":1,"Maths":100, "Physics":80, "Chemistry": 90},
        {"Roll":2,"Maths":80, "Physics":100, "Chemistry": 90},
        {"Roll":3,"Maths":90, "Physics":80, "Chemistry": 70},
        {"Roll":4,"Maths":100, "Physics":100, "Chemistry": 90},
        {"Roll":5,"Maths":90, "Physics":90, "Chemistry": 80},
        {"Roll":6,"Maths":80, "Physics":70, "Chemistry": 70}]
df=pd.DataFrame(myDicts)
print("The input dataframe is:")
print(df)
newDf=df.replace([100, 90, 80, 70],["Hundred", "Ninety", "Eighty", "Seventy"])
print("The updated dataframe is:")
print(newDf)

Output:

The input dataframe is:
   Roll  Maths  Physics  Chemistry
0     1    100       80         90
1     2     80      100         90
2     3     90       80         70
3     4    100      100         90
4     5     90       90         80
5     6     80       70         70
The updated dataframe is:
   Roll    Maths  Physics Chemistry
0     1  Hundred   Eighty    Ninety
1     2   Eighty  Hundred    Ninety
2     3   Ninety   Eighty   Seventy
3     4  Hundred  Hundred    Ninety
4     5   Ninety   Ninety    Eighty
5     6   Eighty  Seventy   Seventy

If the number of elements in both the input lists is not the same, the program will run into a python ValueError exception as shown in the following example.

import pandas as pd
myDicts=[{"Roll":1,"Maths":100, "Physics":80, "Chemistry": 90},
        {"Roll":2,"Maths":80, "Physics":100, "Chemistry": 90},
        {"Roll":3,"Maths":90, "Physics":80, "Chemistry": 70},
        {"Roll":4,"Maths":100, "Physics":100, "Chemistry": 90},
        {"Roll":5,"Maths":90, "Physics":90, "Chemistry": 80},
        {"Roll":6,"Maths":80, "Physics":70, "Chemistry": 70}]
df=pd.DataFrame(myDicts)
print("The input dataframe is:")
print(df)
newDf=df.replace([100, 90, 80, 70],["Ninety", "Eighty", "Seventy"])
print("The updated dataframe is:")
print(newDf)

Output:

ValueError: Replacement lists must match in length. Expecting 4 got 3 

In this example, we have passed four elements in the list given to the replace() method as the first input argument. However, the list given as the second input argument contains only three elements, so the program runs into the ValueError exception.

Hence, you should make sure that both lists contain the same number of elements.
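If the lists come from user input or from another part of the program, you may prefer to handle the exception instead of letting the program stop. The following is a minimal sketch using try and except on a small hypothetical dataframe. The variable name error_message is chosen here for illustration.

```python
import pandas as pd

# Small hypothetical dataframe, used only to trigger the error.
df = pd.DataFrame({"Roll": [1, 2], "Maths": [100, 80]})

old_values = [100, 90, 80, 70]
new_values = ["Ninety", "Eighty", "Seventy"]  # one replacement is missing

error_message = None
try:
    df.replace(old_values, new_values)
except ValueError as error:
    # Keep the message so that it can be logged or shown to the user.
    error_message = str(error)

print(error_message)
```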

Replace Multiple Values in a Pandas Dataframe Using a Python Dictionary

Instead of using lists, you can pass a Python dictionary to the replace() method to replace multiple values in a pandas dataframe with different values.

For this, we will first create a dictionary that contains the values that have to be replaced as keys and the replacements as the associated value for each key. Then, we will invoke the replace() method on the dataframe and pass the dictionary as the input argument. After execution of the replace() method, we will get the modified dataframe as shown below.

import pandas as pd
myDicts=[{"Roll":1,"Maths":100, "Physics":80, "Chemistry": 90},
        {"Roll":2,"Maths":80, "Physics":100, "Chemistry": 90},
        {"Roll":3,"Maths":90, "Physics":80, "Chemistry": 70},
        {"Roll":4,"Maths":100, "Physics":100, "Chemistry": 90},
        {"Roll":5,"Maths":90, "Physics":90, "Chemistry": 80},
        {"Roll":6,"Maths":80, "Physics":70, "Chemistry": 70}]
df=pd.DataFrame(myDicts)
print("The input dataframe is:")
print(df)
newDf=df.replace({100:"Hundred", 90:"Ninety", 80:"Eighty", 70:"Seventy"})
print("The updated dataframe is:")
print(newDf)

Output:

The input dataframe is:
   Roll  Maths  Physics  Chemistry
0     1    100       80         90
1     2     80      100         90
2     3     90       80         70
3     4    100      100         90
4     5     90       90         80
5     6     80       70         70
The updated dataframe is:
   Roll    Maths  Physics Chemistry
0     1  Hundred   Eighty    Ninety
1     2   Eighty  Hundred    Ninety
2     3   Ninety   Eighty   Seventy
3     4  Hundred  Hundred    Ninety
4     5   Ninety   Ninety    Eighty
5     6   Eighty  Seventy   Seventy
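The dictionary does not need to cover every value in the dataframe. As a short sketch on a smaller hypothetical dataframe, the following call lists only 100 and 90 as keys, so the value 80 is left as it is.

```python
import pandas as pd

# Smaller hypothetical dataframe with the same column names as above.
df = pd.DataFrame({"Roll": [1, 2, 3],
                   "Maths": [100, 80, 90],
                   "Physics": [80, 100, 80]})

# Only 100 and 90 appear as keys; 80 has no entry in the dictionary.
partial = df.replace({100: "Hundred", 90: "Ninety"})

print(partial)
```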

Replace Multiple Values in a Column With One Value

A column in a pandas dataframe is actually a series. Hence, you can replace values in a column of a pandas dataframe as shown in the following examples.

To replace multiple values with a single value, you can pass the list of values that need to be replaced as the first input argument and the replacement value as the second input argument to the replace() method. After execution of the replace() method, you will get the modified column as shown below.

import pandas as pd
myDicts=[{"Roll":1,"Maths":100, "Physics":80, "Chemistry": 90},
        {"Roll":2,"Maths":80, "Physics":100, "Chemistry": 90},
        {"Roll":3,"Maths":90, "Physics":80, "Chemistry": 70},
        {"Roll":4,"Maths":100, "Physics":100, "Chemistry": 90},
        {"Roll":5,"Maths":90, "Physics":90, "Chemistry": 80},
        {"Roll":6,"Maths":80, "Physics":70, "Chemistry": 70}]
df=pd.DataFrame(myDicts)
print("The input dataframe is:")
print(df)
df["Maths"]=df["Maths"].replace([100, 90],"Distinction")
print("The updated dataframe is:")
print(df)

Output:

The input dataframe is:
   Roll  Maths  Physics  Chemistry
0     1    100       80         90
1     2     80      100         90
2     3     90       80         70
3     4    100      100         90
4     5     90       90         80
5     6     80       70         70
The updated dataframe is:
   Roll        Maths  Physics  Chemistry
0     1  Distinction       80         90
1     2           80      100         90
2     3  Distinction       80         70
3     4  Distinction      100         90
4     5  Distinction       90         80
5     6           80       70         70

Replace Multiple Values in a Column With Different Values

To replace multiple values in a column in the pandas dataframe with different values, we will pass a list of values that need to be replaced as the first input argument to the replace() method. Next, we will pass a list of new values to the replace() method as the second input argument. After execution of the replace() method, the values in the first list are replaced by the corresponding values in the second list passed as the input argument. You can observe this in the following example.

import pandas as pd
myDicts=[{"Roll":1,"Maths":100, "Physics":80, "Chemistry": 90},
        {"Roll":2,"Maths":80, "Physics":100, "Chemistry": 90},
        {"Roll":3,"Maths":90, "Physics":80, "Chemistry": 70},
        {"Roll":4,"Maths":100, "Physics":100, "Chemistry": 90},
        {"Roll":5,"Maths":90, "Physics":90, "Chemistry": 80},
        {"Roll":6,"Maths":80, "Physics":70, "Chemistry": 70}]
df=pd.DataFrame(myDicts)
print("The input dataframe is:")
print(df)
df["Maths"]=df["Maths"].replace([100, 90, 80, 70],["Hundred","Ninety", "Eighty", "Seventy"])
print("The updated dataframe is:")
print(df)

Output:

The input dataframe is:
   Roll  Maths  Physics  Chemistry
0     1    100       80         90
1     2     80      100         90
2     3     90       80         70
3     4    100      100         90
4     5     90       90         80
5     6     80       70         70
The updated dataframe is:
   Roll    Maths  Physics  Chemistry
0     1  Hundred       80         90
1     2   Eighty      100         90
2     3   Ninety       80         70
3     4  Hundred      100         90
4     5   Ninety       90         80
5     6   Eighty       70         70

Replace Multiple Values in a Column Using a Python Dictionary

Instead of using the lists, you can pass a python dictionary to the replace() method to replace multiple values in a column in the pandas dataframe with different values. For this, we will first create a dictionary that contains the values that have to be replaced as keys and the replacements as the associated value for each key. Then, we will invoke the replace() method on the column and pass the dictionary as the input argument. After execution of the replace() method, we will get the modified column as shown below.

import pandas as pd
myDicts=[{"Roll":1,"Maths":100, "Physics":80, "Chemistry": 90},
        {"Roll":2,"Maths":80, "Physics":100, "Chemistry": 90},
        {"Roll":3,"Maths":90, "Physics":80, "Chemistry": 70},
        {"Roll":4,"Maths":100, "Physics":100, "Chemistry": 90},
        {"Roll":5,"Maths":90, "Physics":90, "Chemistry": 80},
        {"Roll":6,"Maths":80, "Physics":70, "Chemistry": 70}]
df=pd.DataFrame(myDicts)
print("The input dataframe is:")
print(df)
df["Maths"]=df["Maths"].replace({100:"Hundred", 90:"Ninety", 80:"Eighty", 70:"Seventy"})
print("The updated dataframe is:")
print(df)

Output:

The input dataframe is:
   Roll  Maths  Physics  Chemistry
0     1    100       80         90
1     2     80      100         90
2     3     90       80         70
3     4    100      100         90
4     5     90       90         80
5     6     80       70         70
The updated dataframe is:
   Roll    Maths  Physics  Chemistry
0     1  Hundred       80         90
1     2   Eighty      100         90
2     3   Ninety       80         70
3     4  Hundred      100         90
4     5   Ninety       90         80
5     6   Eighty       70         70
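As an alternative to selecting the column first, the replace() method of a dataframe also accepts a nested dictionary, in which the outer key is a column name and the inner dictionary maps old values to replacements. The following is a minimal sketch of this form on a smaller hypothetical dataframe. It is not the approach used in the examples above.

```python
import pandas as pd

# Smaller hypothetical dataframe with the same column names as above.
df = pd.DataFrame({"Roll": [1, 2, 3],
                   "Maths": [100, 80, 90],
                   "Physics": [80, 100, 80]})

# Outer key: column name. Inner dictionary: old value -> replacement.
newDf = df.replace({"Maths": {100: "Hundred", 90: "Ninety", 80: "Eighty"}})

print(newDf)
```

Here, only the Maths column is searched for the given values, and the call returns a new dataframe instead of assigning to the column.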

Replace Values Inplace in a Pandas Dataframe

In all the above examples, the replace() method doesn’t modify the dataframe or series on which it is invoked. It returns a new dataframe or series object, which is why we assigned the result to a new variable or back to the column.

To modify the original dataframe while using the replace() method, you can set the inplace parameter to True as shown below.

import pandas as pd
myDicts=[{"Roll":1,"Maths":100, "Physics":80, "Chemistry": 90},
        {"Roll":2,"Maths":80, "Physics":100, "Chemistry": 90},
        {"Roll":3,"Maths":90, "Physics":80, "Chemistry": 70},
        {"Roll":4,"Maths":100, "Physics":100, "Chemistry": 90},
        {"Roll":5,"Maths":90, "Physics":90, "Chemistry": 80},
        {"Roll":6,"Maths":80, "Physics":70, "Chemistry": 70}]
df=pd.DataFrame(myDicts)
print("The input dataframe is:")
print(df)
df.replace({100:"Hundred", 90:"Ninety", 80:"Eighty", 70:"Seventy"},inplace=True)
print("The updated dataframe is:")
print(df)

Output:

The input dataframe is:
   Roll  Maths  Physics  Chemistry
0     1    100       80         90
1     2     80      100         90
2     3     90       80         70
3     4    100      100         90
4     5     90       90         80
5     6     80       70         70
The updated dataframe is:
   Roll    Maths  Physics Chemistry
0     1  Hundred   Eighty    Ninety
1     2   Eighty  Hundred    Ninety
2     3   Ninety   Eighty   Seventy
3     4  Hundred  Hundred    Ninety
4     5   Ninety   Ninety    Eighty
5     6   Eighty  Seventy   Seventy

In a similar manner, you can replace values in a pandas series inplace as shown in the following example.

import pandas as pd
numbers=[100,90,80,90,70,100,60]
series=pd.Series(numbers)
print("The series is:")
print(series)
series.replace([100, 90],"Distinction",inplace=True)
print("The updated series is:")
print(series)

Output:

The series is:
0    100
1     90
2     80
3     90
4     70
5    100
6     60
dtype: int64
The updated series is:
0    Distinction
1    Distinction
2             80
3    Distinction
4             70
5    Distinction
6             60
dtype: object
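To see the difference between the two behaviours side by side, the following short sketch calls replace() on the same series first without and then with the inplace parameter. The variable names returned, before, and after are chosen here for illustration.

```python
import pandas as pd

series = pd.Series([100, 90, 80])

# Without inplace=True, the result is returned and the original is left alone.
returned = series.replace([100, 90], "Distinction")
before = series.tolist()

# With inplace=True, the original series itself is updated.
series.replace([100, 90], "Distinction", inplace=True)
after = series.tolist()

print(before)
print(after)
```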

Conclusion

In this article, we have discussed different ways to replace multiple values in a pandas dataframe or series. To learn more about python programming, you can read this article on how to sort a pandas dataframe. You might also like this article on how to drop columns from a pandas dataframe.

I hope you enjoyed reading this article. Stay tuned for more informative articles.

Happy Learning!

The post Pandas Replace Values in Dataframe or Series appeared first on PythonForBeginners.com.

