Easy Steps to Sort Pandas DataFrames, Series, Arrays with Examples

Sorting means arranging items in a systematic order determined by some criterion. In this Python sorting tutorial, we are going to learn how to sort NumPy arrays, Pandas Series, and Pandas DataFrames by rows and columns, with examples.

1. Pandas Sorting

Sorting is not exclusive to Pandas. It is one of the most common operations in programming and is generally applied to structures like arrays or, in our case, Series and DataFrames.

2. Parameters used in Pandas Sorting

Before starting to sort, let us get to know the parameters involved in sorting:

1. by: You have to pass the column name or a list of column names to sort by. In the sort_values() calls used throughout this tutorial, this parameter is named by, not columns.

2. ascending: You have to pass a Boolean value, or a list of Booleans with one entry per column in by. The default value is True. True sorts in ascending order and False sorts in descending order.

3. axis: You can pass 0 or 1; or ‘index’ or ‘columns’ for index and columns respectively. The default value is 0. With sort_values(), axis=0 reorders the rows based on column values, while axis=1 reorders the columns based on row values.

4. inplace: You pass a Boolean value. The default value is False, which returns a new sorted object. When it is True, the object is sorted in place and no new instance is created.

5. kind: ‘quicksort’, ‘mergesort’, ‘heapsort’. This is optional and the default is ‘quicksort’. For a DataFrame, it is applied only when you sort on a single column or label.

6. na_position: ‘first’, ‘last’. The default option is ‘last’. Quite expectedly, ‘first’ puts NaNs at the beginning, while ‘last’ puts NaNs at the end.
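As a rough illustration of how these parameters are passed, here is a minimal sketch. The DataFrame `demo` and its columns `key` and `tag` are hypothetical and exist only for this example. It uses kind='mergesort' because that algorithm is stable, so rows with equal keys keep their original relative order.

```python
import pandas as pd

# Hypothetical data, used only to illustrate the parameters listed above.
demo = pd.DataFrame({"key": [2, 1, 2, 1], "tag": ["w", "x", "y", "z"]})

# by: the column to sort on.
# kind="mergesort": a stable sort, so ties keep their original order.
# inplace is left at its default (False), so a new DataFrame is returned.
result = demo.sort_values(by="key", ascending=True, kind="mergesort")

print(result["tag"].tolist())  # ['x', 'z', 'w', 'y']
print(demo["tag"].tolist())    # ['w', 'x', 'y', 'z'] (original is unchanged)
```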


3. How to Sort NumPy Arrays?

Arrays are sorted with NumPy itself, the library that pandas is built on. Before we start sorting, let’s create a NumPy array.

3.1. Creating a NumPy Array

For creating a NumPy array, you will have to import NumPy.

Type:

>>> import pandas as pd
>>> import numpy as np

NumPy arrays can be constructed directly via the numpy.array constructor.

For creating a 1-dimensional array, type:

>>> dataflair_ar1 = np.array([4, 5, 6, 7])

For creating a 2-dimensional array, type:

>>> dataflair_ar2 = np.array([[4, 5, 6], [7, 8, 9]])

After initializing both these arrays, when you access them, the output will be:

Output-

array([4, 5, 6, 7])    # 1D
array([[4, 5, 6],
       [7, 8, 9]])     # 2D
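To confirm how many dimensions an array has, you can inspect its ndim and shape attributes. The short sketch below uses the hypothetical names `ar1` and `ar2` for the same two arrays.

```python
import numpy as np

ar1 = np.array([4, 5, 6, 7])
ar2 = np.array([[4, 5, 6], [7, 8, 9]])

# ndim is the number of dimensions; shape is the size along each one.
print(ar1.ndim, ar1.shape)  # 1 (4,)
print(ar2.ndim, ar2.shape)  # 2 (2, 3)
```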


3.2 How to Sort a 1D Array?

We create an unsorted 1D array and sort it in place with the array's sort() method:

>>> dataflair_ar3 = np.array([5, 9, 7])
>>> dataflair_ar3.sort()
>>> dataflair_ar3

The output is:

array([5, 7, 9])

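The sort() method shown above changes the array itself. If you want to keep the original, np.sort() returns a sorted copy instead. The sketch below (variable names are illustrative) also shows one common way to get descending order, since NumPy's sort has no descending flag.

```python
import numpy as np

arr = np.array([5, 9, 7])

# np.sort returns a sorted copy and leaves `arr` untouched.
sorted_copy = np.sort(arr)

# Reversing the sorted copy gives descending order.
descending = np.sort(arr)[::-1]

print(arr)          # [5 9 7]
print(sorted_copy)  # [5 7 9]
print(descending)   # [9 7 5]
```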

3.3 How to Sort a 2D Array?

We create a 2D array as:

>>> dataflair_ar4 = np.array([[9, 8], [11, 0]])


3.3.1 First we sort it along axis 0 (down each column):

>>> dataflair_ar4 = np.array([[9, 8], [11, 0]])
>>> dataflair_ar4.sort(axis=0)
>>> dataflair_ar4

The output is:

array([[ 9,  0],
       [11,  8]])


3.3.2 Now we sort it along axis 1 (across each row):

We recreate the 2D array dataflair_ar4 as:

>>> dataflair_ar4 = np.array([[9, 8], [11, 0]])
>>> dataflair_ar4.sort(axis=1)
>>> dataflair_ar4

The output is:

array([[ 8,  9],
       [ 0, 11]])

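The two axis choices can be compared side by side without recreating the array if you use np.sort(), which returns a copy. This is a small sketch with illustrative variable names. It also shows axis=None, which flattens the array before sorting.

```python
import numpy as np

grid = np.array([[9, 8], [11, 0]])

by_columns = np.sort(grid, axis=0)  # each column sorted top to bottom
by_rows = np.sort(grid, axis=1)     # each row sorted left to right
flat = np.sort(grid, axis=None)     # flattened first, then sorted

print(by_columns.tolist())  # [[9, 0], [11, 8]]
print(by_rows.tolist())     # [[8, 9], [0, 11]]
print(flat.tolist())        # [0, 8, 9, 11]
```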

4. How to Sort a Series with Pandas?

Before we start sorting with Pandas, let’s create a Series.

4.1 Creating a Series in Pandas

Create a series by the following code:

>>> dataflair_se = pd.Series([np.nan, 3, 7, 11, 8])
>>> dataflair_se

The output will be:

0     NaN
1     3.0
2     7.0
3    11.0
4     8.0
dtype: float64


4.2 How to Sort a Series in Pandas?

4.2.1 Sorting a Pandas Series in an ascending order

The following syntax enables us to sort the series in ascending order:

>>> dataflair_se.sort_values(ascending=True)

The output is:

1     3.0
2     7.0
4     8.0
3    11.0
0     NaN
dtype: float64
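Notice that each value keeps its original index label after sorting. If you would rather have the result renumbered from 0, sort_values() accepts ignore_index=True (available in pandas 1.0 and later). A minimal sketch, with the illustrative name `se` for the same Series:

```python
import numpy as np
import pandas as pd

se = pd.Series([np.nan, 3, 7, 11, 8])

kept = se.sort_values()                          # labels travel with values
renumbered = se.sort_values(ignore_index=True)   # labels reset to 0..n-1

print(list(kept.index))        # [1, 2, 4, 3, 0]
print(list(renumbered.index))  # [0, 1, 2, 3, 4]
```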

4.2.2 Sorting a Pandas Series in a descending order

The following syntax enables us to sort the series in descending order:

>>> dataflair_se.sort_values(ascending=False)

The output is:

3    11.0
4     8.0
2     7.0
1     3.0
0     NaN
dtype: float64


4.2.3 Sorting values inplace

The following syntax enables us to sort the series inplace:

>>> dataflair_se.sort_values(ascending=False, inplace=True)
>>> dataflair_se

The output is:

3    11.0
4     8.0
2     7.0
1     3.0
0     NaN
dtype: float64

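One thing to watch with inplace=True is that the call returns None, so its result should not be assigned back to the variable. The sketch below (with the illustrative names `se` and `returned`) shows this.

```python
import numpy as np
import pandas as pd

se = pd.Series([np.nan, 3, 7, 11, 8])

# With inplace=True the Series itself is reordered and None is returned.
returned = se.sort_values(ascending=False, inplace=True)

print(returned)        # None
print(list(se.index))  # [3, 4, 2, 1, 0]
```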

4.2.4 Sorting values while putting NaN first

The following syntax enables us to sort the series in ascending order while putting NaN values first:

>>> dataflair_se.sort_values(na_position='first')

Your output will be:

0     NaN
1     3.0
2     7.0
4     8.0
3    11.0
dtype: float64
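The na_position setting is independent of the sort direction, so it can be combined with ascending=False. A minimal sketch, again using the illustrative name `se`:

```python
import numpy as np
import pandas as pd

se = pd.Series([np.nan, 3, 7, 11, 8])

# NaN goes to the top, the remaining values run from largest to smallest.
nan_first_desc = se.sort_values(ascending=False, na_position="first")

print(list(nan_first_desc.index))  # [0, 3, 4, 2, 1]
```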

5. How to Sort a DataFrame with Pandas?

5.1 Creating a DataFrame in Pandas

Create a DataFrame using the following code:

>>> dataflair_df1 = pd.DataFrame({'col1': [5, 2, 5, 2, 2, 1],
...                               'col2': ['C', 'B', 'A', np.nan, 'C', 'D'],
...                               'col3': [9, 6, 0, 7, 5, 8]})
>>> dataflair_df1

Output-

   col1 col2  col3
0     5    C     9
1     2    B     6
2     5    A     0
3     2  NaN     7
4     2    C     5
5     1    D     8


5.2 Sorting the DataFrame

5.2.1 How to Sort Pandas DataFrames in Ascending Order?

We’ll see that pandas sorts a DataFrame in ascending order by default.

When we have to sort by a single column, we type:

>>> dataflair_df1.sort_values(by=['col1'])

The output shows the rows ordered by col1 from the smallest value (1) to the largest (5), with each row keeping its original index label.
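To check the single-column sort yourself, here is a self-contained sketch that rebuilds the DataFrame from section 5.1 under the illustrative name `df`. It checks only the col1 values, because the default quicksort does not promise any particular order among rows with equal col1.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"col1": [5, 2, 5, 2, 2, 1],
                   "col2": ["C", "B", "A", np.nan, "C", "D"],
                   "col3": [9, 6, 0, 7, 5, 8]})

# Ascending is the default, so no extra argument is needed.
by_col1 = df.sort_values(by=["col1"])

print(by_col1["col1"].tolist())  # [1, 2, 2, 2, 5, 5]
```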

When we have to sort by multiple columns, we type:

>>> dataflair_df1.sort_values(by=['col1', 'col2'])

In the output, the rows are ordered by col1 first. Rows with equal col1 values are then ordered by col2, with NaN placed last within its group.
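The multi-column sort can be verified with the sketch below, which rebuilds the DataFrame from section 5.1 as `df` (an illustrative name). Every (col1, col2) pair in this data is different, so the resulting row order is fully determined.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"col1": [5, 2, 5, 2, 2, 1],
                   "col2": ["C", "B", "A", np.nan, "C", "D"],
                   "col3": [9, 6, 0, 7, 5, 8]})

# col1 is the primary key; col2 breaks ties within each col1 value.
by_both = df.sort_values(by=["col1", "col2"])

print(list(by_both.index))  # [5, 1, 4, 3, 2, 0]
```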

5.2.2 How to Sort Pandas DataFrames in Descending Order?

When we have to sort by a single column, we type:

>>> dataflair_df1.sort_values(by='col1', ascending=False)

The output lists the rows from the largest col1 value (5) down to the smallest (1).

When we have to sort by multiple columns, we type:

>>> dataflair_df1.sort_values(by=['col1', 'col2'], ascending=False)

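Here ascending=False applies to both columns: the output is ordered by col1 from largest to smallest, and ties are ordered by col2 from D down to A, with NaN still placed last within its group.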
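A single Boolean sets the direction for every column in by. To mix directions, you can pass a list with one Boolean per column. The sketch below rebuilds the DataFrame from section 5.1 as `df` (an illustrative name) and shows both forms.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"col1": [5, 2, 5, 2, 2, 1],
                   "col2": ["C", "B", "A", np.nan, "C", "D"],
                   "col3": [9, 6, 0, 7, 5, 8]})

# Both columns descending.
both_desc = df.sort_values(by=["col1", "col2"], ascending=False)

# col1 descending, col2 ascending.
mixed = df.sort_values(by=["col1", "col2"], ascending=[False, True])

print(list(both_desc.index))  # [0, 2, 4, 1, 3, 5]
print(list(mixed.index))      # [2, 0, 1, 4, 3, 5]
```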

5.3 Sorting while putting NaN first

To put the NaN values at the top, we pass na_position='first':

>>> dataflair_df1.sort_values(by='col2', ascending=False, na_position='first')

In the output, the row with NaN in col2 comes first, followed by the remaining rows from D down to A.
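The sketch below reproduces this step on a rebuilt copy of the section 5.1 DataFrame, named `df` here for illustration. The two rows with 'C' in col2 may appear in either order, so the example checks the col2 values rather than the full index.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"col1": [5, 2, 5, 2, 2, 1],
                   "col2": ["C", "B", "A", np.nan, "C", "D"],
                   "col3": [9, 6, 0, 7, 5, 8]})

nan_first = df.sort_values(by="col2", ascending=False, na_position="first")

print(nan_first.index[0])                # 3 (the row holding NaN)
print(nan_first["col2"].tolist()[1:])    # ['D', 'C', 'C', 'B', 'A']
```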

5.4 How to Sort Pandas DataFrames by Index?

Let’s understand with an example. We create a DataFrame whose columns are deliberately out of order:

>>> dataflair_df1 = pd.DataFrame({'col3': [9, 6, 0, 7, 5, 8],
...                               'col1': ['C', 'B', 'A', np.nan, 'C', 'D'],
...                               'col2': [20, 3, 5, 18, 15, 1]})
>>> dataflair_df1

The output shows the columns in the order they were defined: col3, col1, col2.

Now we sort the column labels in place with sort_index(), using axis=1:

>>> dataflair_df1.sort_index(axis=1, inplace=True)
>>> dataflair_df1

The output now has the columns in label order: col1, col2, col3.
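sort_index() works on the row labels as well as the column labels. The following sketch uses a small hypothetical DataFrame `df` with shuffled row labels to show both axes.

```python
import pandas as pd

# Hypothetical frame: row labels and column labels are both out of order.
df = pd.DataFrame({"col3": [9, 6, 0],
                   "col1": ["C", "B", "A"],
                   "col2": [20, 3, 5]},
                  index=[2, 0, 1])

by_row_labels = df.sort_index()        # axis=0 (default): order the rows
by_col_labels = df.sort_index(axis=1)  # axis=1: order the columns

print(list(by_row_labels.index))    # [0, 1, 2]
print(list(by_col_labels.columns))  # ['col1', 'col2', 'col3']
```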

5.5 How to Sort Pandas DataFrames by Column Name?

Follow this example to see how to order a DataFrame's columns by name. First we put the columns in an arbitrary order by selecting them with a list:

>>> L = ['col3','col1','col2']
>>> dataflair_df1=dataflair_df1[L]
>>> dataflair_df1

The output shows the columns in the order given in L: col3, col1, col2.

We will sort the columns by name using:

>>> dataflair_df1 = dataflair_df1.reindex(sorted(dataflair_df1.columns), axis=1)
>>> dataflair_df1

The output has the columns back in alphabetical order: col1, col2, col3.
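Because reindex() takes any list of labels, the same pattern works for other column orders too. The sketch below uses a small hypothetical DataFrame `df` and Python's built-in sorted() with reverse=True to get reverse alphabetical order.

```python
import pandas as pd

# Hypothetical frame with columns defined out of order.
df = pd.DataFrame({"col3": [9, 6], "col1": ["C", "B"], "col2": [20, 3]})

alphabetical = df.reindex(sorted(df.columns), axis=1)
reverse = df.reindex(sorted(df.columns, reverse=True), axis=1)

print(list(alphabetical.columns))  # ['col1', 'col2', 'col3']
print(list(reverse.columns))       # ['col3', 'col2', 'col1']
```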

5.6 How to Sort Pandas DataFrames by Rows?

Keep in mind that for sorting by rows, all elements in the row have to be comparable with each other (for example, all numeric). So we will create a new DataFrame here, and we will label the index as well.

>>> dataflair_df1 = pd.DataFrame({'col3': [4, 8, 7, 9, 3, 5],
...                               'col1': [14, 18, 27, 29, 6, 1],
...                               'col2': [3, 6, 9, 2, 4, 8]},
...                              index=['A', 'B', 'C', 'D', 'E', 'F'])
>>> dataflair_df1

The output shows six rows labelled A to F with the columns col3, col1, col2.

Now we will sort with respect to a row, here the row labelled D:

>>> dataflair_df1 = dataflair_df1.sort_values(by='D', axis=1)
>>> dataflair_df1

In the output, the columns are ordered by their values in row D (2, 9, 29), that is: col2, col3, col1.
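The same call accepts ascending=False when sorting by a row. The sketch below rebuilds the DataFrame above under the illustrative name `df` and sorts the columns by row D (ascending) and by row E (descending).

```python
import pandas as pd

df = pd.DataFrame({"col3": [4, 8, 7, 9, 3, 5],
                   "col1": [14, 18, 27, 29, 6, 1],
                   "col2": [3, 6, 9, 2, 4, 8]},
                  index=["A", "B", "C", "D", "E", "F"])

# Row D holds col3=9, col1=29, col2=2, so ascending order is col2, col3, col1.
by_d = df.sort_values(by="D", axis=1)

# Row E holds col3=3, col1=6, col2=4, so descending order is col1, col2, col3.
by_e_desc = df.sort_values(by="E", axis=1, ascending=False)

print(list(by_d.columns))       # ['col2', 'col3', 'col1']
print(list(by_e_desc.columns))  # ['col1', 'col2', 'col3']
```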

Summary

Sorting NumPy arrays, Pandas Series, and DataFrames is quite straightforward. You only have to state your preferences explicitly when they differ from the default options.
