Channel: Planet Python

Python for Beginners: Sort a Pandas Series in Python

A pandas series is used to handle sequential data in Python. In this article, we will discuss different ways to sort a pandas series in Python.

Sort a series using the sort_values() method

You can sort a pandas series using the sort_values() method. It has the following syntax.

Series.sort_values(*, axis=0, ascending=True, inplace=False, kind='quicksort', na_position='last', ignore_index=False, key=None)

Here, 

  • The axis parameter is used to decide if we want to sort a dataframe by a column or row. For series, the axis parameter is unused. It is defined just for the compatibility of the sort_values() method with pandas dataframes.
  • By default, the sort_values() method sorts a series in ascending order. If you want to sort a series in descending order, you can set the ascending parameter to False.
  • After execution, the sort_values() method returns the sorted series. It doesn’t modify the original series. To sort and modify the original series instead of creating a new series, you can set the inplace parameter to True.
  • The kind parameter is used to determine the sorting algorithm. By default, the “quicksort” algorithm is used. If your data has a specific pattern where another sorting algorithm can be more efficient, you can use the ‘mergesort’, ‘heapsort’, or ‘stable’ sorting algorithm.
  • The na_position parameter is used to determine the position of NaN values in the sorted series. By default, the NaN values are placed at the end of the sorted series. You can set the na_position parameter to “first” to place the NaN values at the top of the sorted series.
  • When we sort a series, each value keeps its original index label, so the index labels are rearranged along with the values. Due to this, the indices in the sorted series are no longer in order. If you want to reset the indices after sorting, you can set the ignore_index parameter to True.
  • The key parameter is used to perform operations on the series before sorting. It takes a vectorized function as its input argument. The function provided to the key parameter must take a pandas series as its input argument and return a pandas series. Before sorting, the function is applied to the series. The values in the output of the function are then used to sort the series. 
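
The kind parameter is the only one in this list that the examples in this article don't exercise. As a minimal sketch, assuming a small hypothetical series named scores with tied values, the following shows that kind="stable" keeps tied values in their original relative order.

```python
import pandas as pd

# Hypothetical data: the values 1 and 2 each appear twice under different labels.
scores = pd.Series([2, 1, 2, 1], index=["a", "b", "c", "d"])

# A stable sort keeps tied values in the order they had in the original series,
# so "b" stays ahead of "d" and "a" stays ahead of "c".
stable_sorted = scores.sort_values(kind="stable")
print(stable_sorted)
```

With the default quicksort, the relative order of tied values is not guaranteed, so a stable algorithm is the safer choice when that order matters.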

Sort a Series in Ascending Order in Python

To sort a series in ascending order, you can use the sort_values() method on the series object as shown in the following example.

import pandas as pd
numbers=[12,34,11,25,27,8,13]
series=pd.Series(numbers)
print("The original series is:")
print(series)
sorted_series=series.sort_values()
print("The sorted series is:")
print(sorted_series)

Output:

The original series is:
0    12
1    34
2    11
3    25
4    27
5     8
6    13
dtype: int64
The sorted series is:
5     8
2    11
0    12
6    13
3    25
4    27
1    34
dtype: int64

In the above example, we have first created a pandas series of 7 numbers. Then, we have sorted the series using the sort_values() method.

You can observe that the indices are also shuffled with the values in the series when it is sorted. To reset the index, you can set the ignore_index parameter to True as shown below.

import pandas as pd
numbers=[12,34,11,25,27,8,13]
series=pd.Series(numbers)
print("The original series is:")
print(series)
sorted_series=series.sort_values(ignore_index=True)
print("The sorted series is:")
print(sorted_series)

Output:

The original series is:
0    12
1    34
2    11
3    25
4    27
5     8
6    13
dtype: int64
The sorted series is:
0     8
1    11
2    12
3    13
4    25
5    27
6    34
dtype: int64

In this example, you can observe that the series returned by the sort_values() method has indices running from 0 to 6 instead of shuffled indices.
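
As a rough sketch of what ignore_index=True does, the same result can be obtained by sorting first and then calling reset_index(drop=True) on the result. The variable names via_parameter and via_reset are only illustrative.

```python
import pandas as pd

series = pd.Series([12, 34, 11, 25, 27, 8, 13])

# Option 1: let sort_values() relabel the result 0, 1, 2, ...
via_parameter = series.sort_values(ignore_index=True)

# Option 2: sort first, then drop the shuffled index in a separate step.
via_reset = series.sort_values().reset_index(drop=True)

print(via_parameter.tolist())
print(via_reset.tolist())
```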

Sort a Pandas Series in Descending Order

To sort a pandas series in descending order, you can set the ascending parameter in the sort_values() method to False. After execution, the sort_values() method will return a series sorted in descending order. You can observe this in the following example.

import pandas as pd
numbers=[12,34,11,25,27,8,13]
series=pd.Series(numbers)
print("The original series is:")
print(series)
sorted_series=series.sort_values(ascending=False,ignore_index=True)
print("The sorted series is:")
print(sorted_series)

Output:

The original series is:
0    12
1    34
2    11
3    25
4    27
5     8
6    13
dtype: int64
The sorted series is:
0    34
1    27
2    25
3    13
4    12
5    11
6     8
dtype: int64

In the above example, we have set the ascending parameter in the sort_values() method to False. Hence, after execution of the sort_values() method, we get a series that is sorted in descending order.

Sort a Series Having NaN Values in Python

To sort a pandas series with NaN values, you just need to invoke the sort_values() method on the pandas series as shown in the following example.

import pandas as pd
import numpy as np
numbers=[12,np.nan,11,np.nan,27,-8,13]
series=pd.Series(numbers)
print("The original series is:")
print(series)
series.sort_values(inplace=True,ignore_index=True)
print("The sorted series is:")
print(series)

Output:

The original series is:
0    12.0
1     NaN
2    11.0
3     NaN
4    27.0
5    -8.0
6    13.0
dtype: float64
The sorted series is:
0    -8.0
1    11.0
2    12.0
3    13.0
4    27.0
5     NaN
6     NaN
dtype: float64

In this example, the series contains NaN values. By default, the sort_values() method puts the NaN values at the end of the sorted series. If you want the NaN values at the start of the sorted series, you can set the na_position parameter to “first” as shown below.

import pandas as pd
import numpy as np
numbers=[12,np.nan,11,np.nan,27,-8,13]
series=pd.Series(numbers)
print("The original series is:")
print(series)
series.sort_values(inplace=True,ignore_index=True,na_position="first")
print("The sorted series is:")
print(series)

Output:

The original series is:
0    12.0
1     NaN
2    11.0
3     NaN
4    27.0
5    -8.0
6    13.0
dtype: float64
The sorted series is:
0     NaN
1     NaN
2    -8.0
3    11.0
4    12.0
5    13.0
6    27.0
dtype: float64

In the above two examples, you can observe that the data type of the series is float64, unlike the prior examples where the data type of the series was int64. This is because NaN is a floating-point value in Python. Hence, all the numbers are cast to the most compatible data type, which is float64.
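
The type change can be checked directly. The following is a small sketch with two hypothetical series, without_nan and with_nan, that differ only by one NaN value.

```python
import numpy as np
import pandas as pd

# Hypothetical series used only to compare the inferred data types.
without_nan = pd.Series([12, 11, 27])
with_nan = pd.Series([12, np.nan, 27])

print(without_nan.dtype)  # an integer dtype
print(with_nan.dtype)     # float64, because NaN is a float
print(isinstance(np.nan, float))
```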

Sort a Series Inplace in Python

In the examples that assign the output of the sort_values() method to a new variable, you can observe that the original series isn’t modified and we get a new sorted series. If you want to sort the series inplace, you can set the inplace parameter to True as shown below.

import pandas as pd
numbers=[12,34,11,25,27,8,13]
series=pd.Series(numbers)
print("The original series is:")
print(series)
series.sort_values(inplace=True,ignore_index=True)
print("The sorted series is:")
print(series)

Output:

The original series is:
0    12
1    34
2    11
3    25
4    27
5     8
6    13
dtype: int64
The sorted series is:
0     8
1    11
2    12
3    13
4    25
5    27
6    34
dtype: int64

In this example, we have set the inplace parameter to True in the sort_values() method. Hence, after execution of the sort_values() method, the original series is sorted instead of creating a new pandas series. In this case, the sort_values() method returns None.
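
Because of the None return value, assigning the result of an inplace sort to a variable is a common mistake. A minimal sketch, using a hypothetical variable named result:

```python
import pandas as pd

series = pd.Series([12, 34, 11])

# With inplace=True the series itself is sorted and nothing is returned.
result = series.sort_values(inplace=True)

print(result)           # None
print(series.tolist())  # the original series is now sorted
```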

Sort a Pandas Series Using a Key

By default, the values in the series are used for sorting. Now, suppose that you want to sort the series based on the magnitude of the values instead of their actual values. For this, you can use the key parameter.

We will pass the abs() function to the key parameter of the sort_values() method. After this, the values of the series will be sorted by their magnitude. You can observe this in the following example.

import pandas as pd
numbers=[12,-34,11,-25,27,-8,13]
series=pd.Series(numbers)
print("The original series is:")
print(series)
series.sort_values(inplace=True,ignore_index=True,key=abs)
print("The sorted series is:")
print(series)

Output:

The original series is:
0    12
1   -34
2    11
3   -25
4    27
5    -8
6    13
dtype: int64
The sorted series is:
0    -8
1    11
2    12
3    13
4   -25
5    27
6   -34
dtype: int64

In this example, we have a series of positive and negative numbers. Now, to sort the pandas series using the absolute value of the numbers, we have used the key parameter in the sort_values() method. In the key parameter, we have passed the abs() function.

When the sort_values() method is executed, the whole series is first passed to the abs() function, which returns a series of absolute values. The values returned by the abs() function are then used to compare the elements for sorting the series. This is why we get the series in which the elements are sorted by absolute value instead of actual value.
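
The key parameter is not limited to abs(). As a sketch, assuming a hypothetical series of strings named words, a lambda that lowercases the values gives a case-insensitive sort.

```python
import pandas as pd

# Hypothetical data with mixed upper and lower case.
words = pd.Series(["banana", "Cherry", "apple"])

# By default, uppercase letters sort before lowercase letters.
default_order = words.sort_values()

# The key function receives the whole series and must return a series
# of the same length; here it returns the lowercased values.
case_insensitive = words.sort_values(key=lambda s: s.str.lower())

print(default_order.tolist())
print(case_insensitive.tolist())
```

Only the comparison uses the lowercased values. The sorted series still contains the original strings.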

Suggested Reading: If you are into machine learning, you can read this article on regression in machine learning. You might also like this article on clustering mixed data types in Python.

The sort_index() Method in Python

Instead of sorting a series using the values, we can also sort it using the row indices. For this, we can use the sort_index() method. It has the following syntax.

Series.sort_index(*, axis=0, level=None, ascending=True, inplace=False, kind='quicksort', na_position='last', sort_remaining=True, ignore_index=False, key=None)

Here,

  • The axis parameter is unused in a similar manner to the sort_values() method.
  • The level parameter is used to sort the series by a certain index level when there are multilevel indices. To sort the series by multiple index levels in a specific order, you can pass the list of levels to the level parameter in the same order. 
  • By default, the series object is sorted by index values in ascending order. If you want the indices to be in descending order in the output series, you can set the ascending parameter to False.
  • After execution, the sort_index() method returns the sorted series. To sort and modify the original series by index instead of creating a new series, you can set the inplace parameter to True.
  • The kind parameter is used to determine the sorting algorithm. By default, the “quicksort” algorithm is used. If the index values are in a specific pattern where another sorting algorithm can be more efficient, you can use the ‘mergesort’, ‘heapsort’, or ‘stable’ sorting algorithm.
  • The na_position parameter is used to determine the position of NaN indices in the sorted series. By default, the NaN indices are placed at the end of the sorted series. You can set the na_position parameter to “first” to place the NaN indices at the top of the sorted series.
  • The sort_index() method sorts the indices in a specific order (ascending or descending). After sorting the indices, if you want to reset the index of the series, you can set the ignore_index parameter to True.
  • The key parameter is used to perform operations on the index of the series before sorting. It takes a vectorized function as its input argument. The function provided to the key parameter must take the index as its input argument and return an index of the same shape. Before sorting, the function is applied to the index. The values in the output of the function are then used to sort the series.
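
The level parameter only matters for a multilevel index, which the examples in this article don't use. The following is a minimal sketch, assuming a hypothetical two-level index whose levels are named letter and number.

```python
import pandas as pd

# Hypothetical two-level index; the level names are illustrative.
index = pd.MultiIndex.from_tuples(
    [("b", 2), ("a", 2), ("b", 1), ("a", 1)],
    names=["letter", "number"],
)
series = pd.Series([10, 20, 30, 40], index=index)

# Sort by the "number" level first. With the default sort_remaining=True,
# ties are then ordered by the remaining level, "letter".
by_number = series.sort_index(level="number")
print(by_number)
```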

Sort a Pandas Series by Index in Ascending Order

To sort a pandas series by index in ascending order, you can invoke the sort_index() method on the series object as shown in the following example.

import pandas as pd
import numpy as np
letters=["a","b","c","ab","abc","abcd","bc","d"]
numbers=[3,23,11,14,16,2,45,65]
series=pd.Series(letters)
series.index=numbers
print("The original series is:")
print(series)
sorted_series=series.sort_index()
print("The sorted series is:")
print(sorted_series)

Output:

The original series is:
3        a
23       b
11       c
14      ab
16     abc
2     abcd
45      bc
65       d
dtype: object
The sorted series is:
2     abcd
3        a
11       c
14      ab
16     abc
23       b
45      bc
65       d
dtype: object

In this example, we have a series of strings with numbers as index. As we have used the sort_index() method on the pandas series to sort it, the series is sorted by index values. Hence, we get a series where the index values are sorted.

After sorting, if you want to reset the index of the output series, you can set the ignore_index parameter to True in the sort_index() method as shown below.

import pandas as pd
import numpy as np
letters=["a","b","c","ab","abc","abcd","bc","d"]
numbers=[3,23,11,14,16,2,45,65]
series=pd.Series(letters)
series.index=numbers
print("The original series is:")
print(series)
sorted_series=series.sort_index(ignore_index=True)
print("The sorted series is:")
print(sorted_series)

Output:

The original series is:
3        a
23       b
11       c
14      ab
16     abc
2     abcd
45      bc
65       d
dtype: object
The sorted series is:
0    abcd
1       a
2       c
3      ab
4     abc
5       b
6      bc
7       d
dtype: object

In this example, we have set the ignore_index parameter to True in the sort_index() method. Hence, after sorting the series by original index values, the index of the series is reset.

Sort a Series by Index in Descending Order in Python

To sort a pandas series by index in descending order, you can set the ascending parameter in the sort_index() method to False as shown in the following example.

import pandas as pd
import numpy as np
letters=["a","b","c","ab","abc","abcd","bc","d"]
numbers=[3,23,11,14,16,2,45,65]
series=pd.Series(letters)
series.index=numbers
print("The original series is:")
print(series)
sorted_series=series.sort_index(ascending=False)
print("The sorted series is:")
print(sorted_series)

Output:

The original series is:
3        a
23       b
11       c
14      ab
16     abc
2     abcd
45      bc
65       d
dtype: object
The sorted series is:
65       d
45      bc
23       b
16     abc
14      ab
11       c
3        a
2     abcd
dtype: object

In this example, we have set the ascending parameter in the sort_index() method to False. Hence, the series is sorted by index in descending order.

Sort a Pandas Series by Index Having NaN Values

To sort a series by index when there are NaN values in the index, you just need to invoke the sort_index() method on the pandas series as shown in the following example.

import pandas as pd
import numpy as np
letters=["a","b","c","ab","abc","abcd","bc","d"]
numbers=[3,23,np.nan,14,16,np.nan,45,65]
series=pd.Series(letters)
series.index=numbers
print("The original series is:")
print(series)
sorted_series=series.sort_index()
print("The sorted series is:")
print(sorted_series)

Output:

The original series is:
3.0        a
23.0       b
NaN        c
14.0      ab
16.0     abc
NaN     abcd
45.0      bc
65.0       d
dtype: object
The sorted series is:
3.0        a
14.0      ab
16.0     abc
23.0       b
45.0      bc
65.0       d
NaN        c
NaN     abcd
dtype: object

In the above example, the index of the series contains NaN values. By default, the NaN values are placed at the end of the sorted series. If you want the NaN values at the start of the sorted series, you can set the na_position parameter to “first” as shown below.

import pandas as pd
import numpy as np
letters=["a","b","c","ab","abc","abcd","bc","d"]
numbers=[3,23,np.nan,14,16,np.nan,45,65]
series=pd.Series(letters)
series.index=numbers
print("The original series is:")
print(series)
sorted_series=series.sort_index(na_position="first")
print("The sorted series is:")
print(sorted_series)

Output:

The original series is:
3.0        a
23.0       b
NaN        c
14.0      ab
16.0     abc
NaN     abcd
45.0      bc
65.0       d
dtype: object
The sorted series is:
NaN        c
NaN     abcd
3.0        a
14.0      ab
16.0     abc
23.0       b
45.0      bc
65.0       d
dtype: object

In this example, you can observe that we have set the na_position parameter to "first" in the sort_index() method. Hence, the elements having NaN values as their index are kept at the start of the sorted series returned by the sort_index() method.

Interesting read: Advantages of being a programmer.

Sort a Series by Index Inplace in Python

By default, the sort_index() method doesn’t sort the original series. It returns a new series sorted by index. If you want to modify the original series, you can set the inplace parameter to True in the sort_index() method as shown below.

import pandas as pd
import numpy as np
letters=["a","b","c","ab","abc","abcd","bc","d"]
numbers=[3,23,np.nan,14,16,np.nan,45,65]
series=pd.Series(letters)
series.index=numbers
print("The original series is:")
print(series)
series.sort_index(inplace=True)
print("The sorted series is:")
print(series)

Output:

The original series is:
3.0        a
23.0       b
NaN        c
14.0      ab
16.0     abc
NaN     abcd
45.0      bc
65.0       d
dtype: object
The sorted series is:
3.0        a
14.0      ab
16.0     abc
23.0       b
45.0      bc
65.0       d
NaN        c
NaN     abcd
dtype: object

In this example, we have set the inplace parameter to True in the sort_index() method. Hence, the original series is sorted instead of creating a new series.

Sort a Pandas Series by Index Using a Key in Python

By using the key parameter, we can perform operations on the index of the series before sorting the series by index. For example, if you have negative numbers as the index in the series and you want to sort the series using the magnitude of the indices, you can pass the abs() function to the key parameter in the sort_index() method. 

import pandas as pd
import numpy as np
letters=["a","b","c","ab","abc","abcd","bc","d"]
numbers=[3,23,-100,14,16,-3,45,65]
series=pd.Series(letters)
series.index=numbers
print("The original series is:")
print(series)
series.sort_index(inplace=True,key=abs)
print("The sorted series is:")
print(series)

Output:

The original series is:
 3         a
 23        b
-100       c
 14       ab
 16      abc
-3      abcd
 45       bc
 65        d
dtype: object
The sorted series is:
 3         a
-3      abcd
 14       ab
 16      abc
 23        b
 45       bc
 65        d
-100       c
dtype: object

In this example, we have a series having positive and negative numbers as indices. Now, to sort the pandas series using the absolute value of the indices, we have used the key parameter in the sort_index() method. In the key parameter, we have passed the abs() function.

When the sort_index() method is executed, the index of the series is first passed to the abs() function. The values returned by the abs() function are then used to compare the indices for sorting the series. This is why we get the series in which the indices are sorted by absolute value instead of the actual value.
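
The same approach works for string labels. As a sketch, assuming a hypothetical series with mixed-case labels, a lambda that lowercases the index gives a case-insensitive sort by index.

```python
import pandas as pd

# Hypothetical series with mixed-case string labels.
series = pd.Series([1, 2, 3], index=["b", "C", "a"])

# By default, the uppercase label "C" sorts before the lowercase labels.
default_sorted = series.sort_index()

# The key function receives the index and returns an index of the same shape.
case_insensitive = series.sort_index(key=lambda idx: idx.str.lower())

print(list(default_sorted.index))
print(list(case_insensitive.index))
```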

Conclusion

In this article, we have discussed how to sort a pandas series in Python. For this, we have used the sort_values() and sort_index() methods. We have demonstrated different examples using different parameters of these methods.

I hope you enjoyed reading this article. To learn more about the pandas module, you can read this article on how to sort a pandas dataframe. You might also like this article on how to drop columns from a pandas dataframe.

Happy Learning!

The post Sort a Pandas Series in Python appeared first on PythonForBeginners.com.

