Easy Steps to Sort Pandas DataFrames, Series, Arrays with Examples

Sorting means arranging items in a systematic sequence according to some criterion. In this Python sorting tutorial, we are going to learn how to sort NumPy arrays, Pandas Series and Pandas DataFrames by rows and columns, with examples.

1. Pandas Sorting

Sorting is not exclusive to Pandas. It is one of the most common operations in coding and is generally applied to structures like arrays or, in our case, Series and DataFrames.

2. Parameters used in Pandas Sorting

Before starting to sort, let us get to know the parameters involved in sorting:

1. by: A column name or a list of column names to sort by. This is the parameter used in the sort_values() examples below.

2. ascending: A Boolean value, or a list of Booleans with one entry per column. The default value is True, which sorts in ascending order; False sorts in descending order.

3. axis: You can pass 0 or 1, or equivalently ‘index’ or ‘columns’. The default value is 0. This decides whether the sort runs along the index or along the columns.

4. inplace: A Boolean value. The default value is False. When it is True, the object itself is sorted and no new object is returned.

5. kind: ‘quicksort’, ‘mergesort’, ‘heapsort’ or ‘stable’. The default value is ‘quicksort’. This is optional, and for DataFrames it is applied only when you sort by a single column or label.

6. na_position: ‘first’ or ‘last’. The default option is ‘last’. ‘first’ puts NaNs at the beginning, while ‘last’ puts NaNs at the end.
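As a quick illustration of how these parameters combine, here is a minimal sketch. The DataFrame and its column names (name, score) are made up for this example and are not part of the tutorial's data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, invented for this illustration.
df = pd.DataFrame({"name": ["b", "a", "c", "d"],
                   "score": [2.0, np.nan, 3.0, 1.0]})

# by + ascending + na_position: highest score first, missing value on top.
by_score = df.sort_values(by="score", ascending=False, na_position="first")

# kind picks the algorithm; "mergesort" is a stable sort.
stable = df.sort_values(by="score", kind="mergesort")

# inplace=True reorders df itself and returns None.
result = df.sort_values(by="name", inplace=True)

print(by_score.index.tolist())
print(stable.index.tolist())
print(result, df["name"].tolist())
```

Note that the first two calls leave df untouched and return new DataFrames, while the last call changes df directly.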

3. How to Sort NumPy Arrays?

Before we start sorting, let’s create a NumPy array-

3.1. Creating a NumPy Array

For creating a NumPy array, you will have to import NumPy.

Type:

>>> import pandas as pd
>>> import numpy as np

NumPy arrays can be constructed directly via the numpy.array constructor.

For creating a 1-dimensional array, type:

>>> dataflair_ar1 = np.array([4, 5, 6, 7])

For creating a 2-dimensional array, type:

>>> dataflair_ar2 = np.array([[4, 5, 6], [7, 8, 9]])

After initializing both these arrays, when you access them, the output will be:

array([4, 5, 6, 7])     # 1D
array([[4, 5, 6],
       [7, 8, 9]])      # 2D

3.2 How to Sort a 1D Array?

We create an unsorted 1D array and sort it in place with its sort() method:

>>> dataflair_ar3=np.array([5,9,7])
>>> dataflair_ar3.sort()
>>> dataflair_ar3

The output is:

array([5, 7, 9])
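The sort() method above changes the array itself. If the original order still matters, a sketch along the following lines may help; the variable names are chosen only for this illustration.

```python
import numpy as np

original = np.array([5, 9, 7])

# np.sort returns a sorted copy and leaves the input unchanged.
sorted_copy = np.sort(original)

# Reversing the sorted copy gives descending order.
descending = sorted_copy[::-1]

# np.argsort returns the positions that would sort the array.
order = np.argsort(original)

print(original)
print(sorted_copy)
print(descending)
print(order)
```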

3.3 How to Sort a 2D Array?

We create a 2D array as:

>>> dataflair_ar4=np.array([[9,8],[11,0]])

3.3.1 First we sort it along axis 0, which sorts the values within each column:

>>> dataflair_ar4=np.array([[9,8],[11,0]])
>>> dataflair_ar4.sort(axis=0)
>>> dataflair_ar4

The output is:

array([[ 9,  0],
       [11,  8]])

3.3.2 Now we sort it along axis 1, which sorts the values within each row:

We recreate the 2D array dataflair_ar4 as:

>>> dataflair_ar4=np.array([[9,8],[11,0]])
>>> dataflair_ar4.sort(axis=1)
>>> dataflair_ar4

The output is:

array([[ 8,  9],
       [ 0, 11]])
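To compare the two axes side by side without recreating the array each time, one option is np.sort, which returns a copy. This is only a sketch; the variable names are illustrative.

```python
import numpy as np

grid = np.array([[9, 8], [11, 0]])

down_columns = np.sort(grid, axis=0)   # each column sorted top to bottom
across_rows = np.sort(grid, axis=1)    # each row sorted left to right
flattened = np.sort(grid, axis=None)   # flatten first, then sort

print(down_columns)
print(across_rows)
print(flattened)
```

Because np.sort does not modify grid, all three results come from the same unsorted input.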

4. How to Sort a Series with Pandas?

Before we start Pandas Sorting, let’s create a series-

4.1 Creating a Series in Pandas

Create a series by the following code:

>>> dataflair_se = pd.Series([np.nan, 3, 7, 11, 8])
>>> dataflair_se

The output will be:

0     NaN
1     3.0
2     7.0
3    11.0
4     8.0
dtype: float64

4.2 How to Sort a Series in Pandas?

4.2.1 Sorting a Pandas Series in an ascending order

The following syntax enables us to sort the series in ascending order:

>>> dataflair_se.sort_values(ascending=True)

The output is:

1       3.0
2       7.0
4       8.0
3      11.0
0       NaN
dtype: float64

4.2.2 Sorting a Pandas Series in a descending order

The following syntax enables us to sort the series in descending order:

>>> dataflair_se.sort_values(ascending=False)

The output is:

3    11.0
4     8.0
2     7.0
1     3.0
0     NaN
dtype: float64

4.2.3 Sorting values inplace

The following syntax enables us to sort the series inplace:

>>> dataflair_se.sort_values(ascending=False, inplace=True)
>>> dataflair_se

The output is:

3    11.0
4     8.0
2     7.0
1     3.0
0     NaN
dtype: float64

4.2.4 Sorting values while putting NaN first

The following syntax enables us to sort the series while putting NaN first:

>>> dataflair_se.sort_values(na_position='first')

Your output will be:

0     NaN
1     3.0
2     7.0
4     8.0
3    11.0
dtype: float64
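In all of the outputs above, each value keeps its original index label after sorting. The following sketch shows two ways to deal with that. It assumes Pandas 1.0 or later for the ignore_index argument, and the variable names are illustrative.

```python
import numpy as np
import pandas as pd

s = pd.Series([np.nan, 3, 7, 11, 8])

by_value = s.sort_values()                      # labels travel with their values
restored = by_value.sort_index()                # back to label order 0..4
renumbered = s.sort_values(ignore_index=True)   # sorted values, fresh labels 0..4

print(by_value.index.tolist())
print(restored.index.tolist())
print(renumbered)
```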

5. How to Sort a DataFrame with Pandas?

5.1 Creating a DataFrame in Pandas

Create a DataFrame using the following code:

>>> dataflair_df1 = pd.DataFrame({'col1' : [5, 2, 5, 2, 2, 1],'col2' : ['C', 'B', 'A', np.nan, 'C', 'D'],'col3': [9, 6, 0, 7, 5, 8]})
>>> dataflair_df1

Output-

   col1 col2  col3
0     5    C     9
1     2    B     6
2     5    A     0
3     2  NaN     7
4     2    C     5
5     1    D     8

5.2 Sorting the DataFrame

5.2.1 How to Sort Pandas DataFrames in Ascending Order?

We’ll see that pandas sorts DataFrame in the ascending order by default.

When we have to sort by a single column, we type:

>>> dataflair_df1.sort_values(by=['col1'])

The output lists the rows in increasing order of col1: the row with 1 first, then the rows with 2, then the rows with 5.

When we have to sort by multiple columns, we type:

>>> dataflair_df1.sort_values(by=['col1', 'col2'])

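The rows are ordered by col1 first, and rows with equal col1 values are then ordered by col2, with NaN last. The index order of the result is 5, 1, 4, 3, 2, 0.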
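Both ascending sorts can be checked with a short script. This is a sketch that rebuilds the same data under the shorter, illustrative name df. It passes kind='mergesort' to the single-column sort so that rows with equal col1 values keep their original relative order, which the default algorithm does not promise.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "col1": [5, 2, 5, 2, 2, 1],
    "col2": ["C", "B", "A", np.nan, "C", "D"],
    "col3": [9, 6, 0, 7, 5, 8],
})

# Single column, stable algorithm: ties stay in their original order.
single = df.sort_values(by="col1", kind="mergesort")

# Multiple columns: col2 breaks ties in col1, NaN goes last.
multi = df.sort_values(by=["col1", "col2"])

print(single)
print(multi)
```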

5.2.2 How to Sort Pandas DataFrames in Descending Order?

When we have to sort by a single column, we type:

>>> dataflair_df1.sort_values(by='col1', ascending=False)

The output lists the rows in decreasing order of col1: the rows with 5 first, then the rows with 2, then the row with 1.

When we have to sort by multiple columns, we type:

>>> dataflair_df1.sort_values(by=['col1', 'col2'], ascending=False)

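Both columns are sorted in descending order, with NaN still placed last within its col1 group. The index order of the result is 0, 2, 4, 1, 3, 5.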
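The ascending parameter also accepts a list with one Boolean per column, which allows different directions for different columns. A minimal sketch, rebuilding the same data under the illustrative name df:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "col1": [5, 2, 5, 2, 2, 1],
    "col2": ["C", "B", "A", np.nan, "C", "D"],
    "col3": [9, 6, 0, 7, 5, 8],
})

# One flag per column: col1 ascending, col2 descending.
mixed = df.sort_values(by=["col1", "col2"], ascending=[True, False])

print(mixed)
```

The list passed to ascending has to be the same length as the list passed to by.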

5.3 Sorting while putting NaN first

To put the NaN values at the top, we have to specify the code as:

>>> dataflair_df1.sort_values(by='col2', ascending=False, na_position='first')

The row with NaN in col2 comes first, followed by the remaining rows in descending order of col2: D, C, C, B, A.
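The position of NaN is controlled by na_position alone and does not depend on the sort direction. The sketch below, using the same data under the illustrative name df, shows this.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "col1": [5, 2, 5, 2, 2, 1],
    "col2": ["C", "B", "A", np.nan, "C", "D"],
    "col3": [9, 6, 0, 7, 5, 8],
})

nan_first = df.sort_values(by="col2", na_position="first")
nan_last = df.sort_values(by="col2", na_position="last")

# Descending order without na_position still leaves NaN at the end.
descending = df.sort_values(by="col2", ascending=False)

print(nan_first)
print(nan_last)
print(descending)
```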

5.4 How to Sort Pandas DataFrames by Index?

Let’s understand with an example-

>>> dataflair_df1 = pd.DataFrame({'col3' : [9, 6, 0, 7, 5, 8],'col1' : ['C', 'B', 'A', np.nan, 'C', 'D'],'col2': [20, 3, 5, 18, 15, 1],})
>>> dataflair_df1

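The columns appear in the order in which they were passed: col3, col1, col2. Now we sort the column labels with sort_index() and axis = 1-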

>>> dataflair_df1.sort_index(axis = 1,inplace = True)
>>> dataflair_df1

The columns are now in the order col1, col2, col3.
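sort_index() works on the row labels as well as the column labels. The following sketch uses a small, made-up DataFrame with shuffled row labels to show both directions.

```python
import pandas as pd

# Hypothetical data with row labels deliberately out of order.
df = pd.DataFrame(
    {"col3": [9, 6, 0], "col1": ["C", "B", "A"], "col2": [20, 3, 5]},
    index=[2, 0, 1],
)

by_row_label = df.sort_index()                  # axis=0: order rows by label
by_col_label = df.sort_index(axis=1)            # axis=1: order columns by name
reverse_rows = df.sort_index(ascending=False)   # row labels, largest first

print(by_row_label)
print(by_col_label)
print(reverse_rows)
```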

5.5 How to Sort Pandas DataFrames by Column Name?

Follow this example to know how to sort dataframes by column names-

>>> L = ['col3','col1','col2']
>>> dataflair_df1=dataflair_df1[L]
>>> dataflair_df1

The columns are back in the order col3, col1, col2.

We will sort using:

>>> dataflair_df1 = dataflair_df1.reindex(sorted(dataflair_df1.columns), axis=1)

After this, the columns are again in alphabetical order: col1, col2, col3.

5.6 How to Sort Pandas DataFrames by Rows?

Keep in mind that for sorting by rows, all elements have to be of the same type. So we will create a new DataFrame here, and we will have to label the index as well.

>>> dataflair_df1 = pd.DataFrame({'col3' : [4,8,7,9,3,5],'col1' : [14,18,27,29,6,1],'col2': [3,6,9,2,4,8]},index=['A','B','C','D','E','F'])
>>> dataflair_df1

Now we will sort the columns by the values in row D:

>>> dataflair_df1 = dataflair_df1.sort_values(by='D', axis=1)
>>> dataflair_df1

Row D holds 2 in col2, 9 in col3 and 29 in col1, so the columns are now in the order col2, col3, col1.
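The same sort can be verified with a short script, which also shows the descending variant. This is a sketch with the data rebuilt under the illustrative name df.

```python
import pandas as pd

df = pd.DataFrame(
    {"col3": [4, 8, 7, 9, 3, 5],
     "col1": [14, 18, 27, 29, 6, 1],
     "col2": [3, 6, 9, 2, 4, 8]},
    index=["A", "B", "C", "D", "E", "F"],
)

# axis=1 reorders the columns according to the values found in row 'D'.
by_row_d = df.sort_values(by="D", axis=1)
by_row_d_desc = df.sort_values(by="D", axis=1, ascending=False)

print(by_row_d)
print(by_row_d_desc)
```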

Summary

Sorting NumPy arrays and Pandas Series and DataFrames is quite straightforward. You only have to state your preferences explicitly when they differ from the default options. If you still have any doubts, feel free to ask them in the comment section below.