Pandas Concatenation – Best Tutorial for Concatenating Series & DataFrames

Pandas concatenation is the process of joining objects along one axis, with optional set logic (union or intersection) applied to the indexes on the other axes, if any (a series has only one axis). The main parameters of pd.concat() are the objects to join (objs), the axis to join along (axis), the handling of the other axes (join), and keys.

You may have seen many articles on pandas concatenation. In this tutorial, you will learn the concepts and practice concatenation on both pandas series and dataframes.

Before we start concatenation, we need to import the pandas library:

>>> import pandas as pd

 

1. How to concatenate pandas series?

1.1. How to create a pandas series?

>>> dataflair_a= pd.Series([1,2,3,4])
>>> dataflair_a

Output-

0   1
1   2
2   3
3   4
dtype: int64

Create another pandas series (b):

>>> dataflair_b= pd.Series([5,6,7,8])
>>> dataflair_b

Output-

0   5
1   6
2   7
3   8
dtype: int64

1.2. How to concatenate the pandas series?

>>> pd.concat([dataflair_a,dataflair_b])

Output-

0   1
1   2
2   3
3   4
0   5
1   6
2   7
3   8
dtype: int64
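Notice that both series keep their original labels, so 0 to 3 appears twice in the result. Here is a minimal sketch of what that means for label lookups (the variable name combined is ours, chosen for illustration):

```python
import pandas as pd

dataflair_a = pd.Series([1, 2, 3, 4])
dataflair_b = pd.Series([5, 6, 7, 8])

combined = pd.concat([dataflair_a, dataflair_b])

# Each input keeps its own labels, so the index contains duplicates.
print(combined.index.tolist())    # [0, 1, 2, 3, 0, 1, 2, 3]
print(combined.index.is_unique)   # False

# A label lookup returns every row that carries that label.
print(combined.loc[0].tolist())   # [1, 5]
```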

1.3. Clear the existing index and make a new index

>>> pd.concat([dataflair_a,dataflair_b], ignore_index=True)

Output-

0   1
1   2
2   3
3   4
4   5
5   6
6   7
7   8
dtype: int64
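If you have already concatenated without ignore_index, you can get the same result afterwards with reset_index(drop=True). This is a small sketch comparing the two routes; the names renumbered and reset are only for illustration:

```python
import pandas as pd

dataflair_a = pd.Series([1, 2, 3, 4])
dataflair_b = pd.Series([5, 6, 7, 8])

# Route 1: build a fresh 0..n-1 index while concatenating.
renumbered = pd.concat([dataflair_a, dataflair_b], ignore_index=True)

# Route 2: concatenate first, then throw away the old labels.
reset = pd.concat([dataflair_a, dataflair_b]).reset_index(drop=True)

print(renumbered.index.tolist())   # [0, 1, 2, 3, 4, 5, 6, 7]
print(renumbered.equals(reset))    # True
```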

1.4. How to add a hierarchical index on pandas series?

The keys argument labels each input object, which adds an outer level to the index of the result:

>>> pd.concat([dataflair_a, dataflair_b], keys=['a', 'b'])

Output-

a  0    1
   1    2
   2    3
   3    4
b  0    5
   1    6
   2    7
   3    8
dtype: int64
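The outer level makes it easy to get the original pieces back. A short sketch, assuming the same two series as above (keyed is a name we picked for the example):

```python
import pandas as pd

dataflair_a = pd.Series([1, 2, 3, 4])
dataflair_b = pd.Series([5, 6, 7, 8])

keyed = pd.concat([dataflair_a, dataflair_b], keys=['a', 'b'])

# The index now has two levels: the key and the original label.
print(keyed.index.nlevels)       # 2

# Selecting by key returns the rows that came from that input.
print(keyed.loc['a'].tolist())   # [1, 2, 3, 4]

# A (key, label) tuple selects a single value.
print(keyed.loc[('b', 2)])       # 7
```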

1.5. Label the index

>>> pd.concat([dataflair_a, dataflair_b], keys=['a', 'b'],names=['Series name', 'Row ID'])

Output-

Series name  Row ID
a            0         1
             1         2
             2         3
             3         4
b            0         5
             1         6
             2         7
             3         8
dtype: int64
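Once the index levels have names, you can refer to a level by name instead of by position. The sketch below uses Series.xs() to pick one row ID from every input; labelled and third_rows are illustrative names only:

```python
import pandas as pd

dataflair_a = pd.Series([1, 2, 3, 4])
dataflair_b = pd.Series([5, 6, 7, 8])

labelled = pd.concat(
    [dataflair_a, dataflair_b],
    keys=['a', 'b'],
    names=['Series name', 'Row ID'],
)

print(list(labelled.index.names))   # ['Series name', 'Row ID']

# Take the row with Row ID 2 from each input series.
third_rows = labelled.xs(2, level='Row ID')
print(third_rows.to_dict())         # {'a': 3, 'b': 7}
```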

2. How to concatenate pandas dataframes?

2.1. How to create pandas dataframes?

Create two pandas dataframes:

>>> dataflair_A = pd.DataFrame([['a', 1], ['b', 2]], columns=['letter', 'number'])
>>> dataflair_B = pd.DataFrame([['c', 3], ['d', 4]], columns=['letter', 'number'])

Print the first pandas dataframe

>>> dataflair_A

Output-

  letter  number
0      a       1
1      b       2

Print the second pandas dataframe

>>> dataflair_B

Output-

  letter  number
0      c       3
1      d       4

2.2. How to concatenate pandas dataframes?

>>> pd.concat([dataflair_A,dataflair_B])

Output-

  letter  number
0      a       1
1      b       2
0      c       3
1      d       4

2.3. Concatenating pandas dataframes having different columns

Concatenate pandas dataframes with different, overlapping columns and return everything

Create the third dataframe in Pandas

>>> dataflair_C = pd.DataFrame([['c', 3, 'duck'], ['d', 4, 'hen']],columns=['letter', 'number', 'bird'])
>>> dataflair_C

Concatenate it with A using:

>>> pd.concat([dataflair_A,dataflair_C])

Output-

  letter  number  bird
0      a       1   NaN
1      b       2   NaN
0      c       3  duck
1      d       4   hen

Notice the NaN values in the bird column for the rows that come from dataframe A, which has no such column.
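You can locate those missing values with isna() and, if you like, replace them with fillna(). A minimal sketch; the placeholder string 'unknown' is our assumption, not something the data requires:

```python
import pandas as pd

dataflair_A = pd.DataFrame([['a', 1], ['b', 2]], columns=['letter', 'number'])
dataflair_C = pd.DataFrame([['c', 3, 'duck'], ['d', 4, 'hen']],
                           columns=['letter', 'number', 'bird'])

combined = pd.concat([dataflair_A, dataflair_C])

# Rows that came from dataframe A have no bird value.
print(combined['bird'].isna().tolist())   # [True, True, False, False]

# Optionally replace the missing entries in that column only.
filled = combined.fillna({'bird': 'unknown'})
print(filled['bird'].tolist())            # ['unknown', 'unknown', 'duck', 'hen']
```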

2.4. Concatenating pandas dataframes with overlapping columns and only returning those

>>> pd.concat([dataflair_A,dataflair_C], join="inner")

Output-

  letter  number
0      a       1
1      b       2
0      c       3
1      d       4
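To see the difference between the two join modes side by side, you can compare the columns each one returns. A small sketch (outer and inner are illustrative names; join="outer" is the default of pd.concat()):

```python
import pandas as pd

dataflair_A = pd.DataFrame([['a', 1], ['b', 2]], columns=['letter', 'number'])
dataflair_C = pd.DataFrame([['c', 3, 'duck'], ['d', 4, 'hen']],
                           columns=['letter', 'number', 'bird'])

outer = pd.concat([dataflair_A, dataflair_C])                 # union of columns
inner = pd.concat([dataflair_A, dataflair_C], join="inner")   # intersection of columns

print(list(outer.columns))   # ['letter', 'number', 'bird']
print(list(inner.columns))   # ['letter', 'number']
print(inner.shape)           # (4, 2)
```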

2.5. How to combine pandas dataframes horizontally?

>>> pd.concat([dataflair_A,dataflair_B], axis=1)

Output-

  letter  number letter  number
0      a       1      c       3
1      b       2      d       4
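With axis=1 the rows are aligned on the index and the columns are placed side by side, so the column names repeat. One way to tell them apart is to pass keys, as in this sketch (the key labels 'A' and 'B' are our own choice):

```python
import pandas as pd

dataflair_A = pd.DataFrame([['a', 1], ['b', 2]], columns=['letter', 'number'])
dataflair_B = pd.DataFrame([['c', 3], ['d', 4]], columns=['letter', 'number'])

wide = pd.concat([dataflair_A, dataflair_B], axis=1)
print(wide.shape)            # (2, 4)
print(list(wide.columns))    # ['letter', 'number', 'letter', 'number']

# keys adds an outer column level, so each input can be selected by its label.
labelled = pd.concat([dataflair_A, dataflair_B], axis=1, keys=['A', 'B'])
print(labelled['B']['letter'].tolist())   # ['c', 'd']
```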

2.6. Concatenating pandas dataframes using .append()

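.append() does not modify the dataframe in place. It copies all the data into a new dataframe on every call, so calling it repeatedly, for example inside a loop, can lower your program’s performance significantly. Note that DataFrame.append() was deprecated in pandas 1.4 and removed in pandas 2.0, so the example below runs only on older versions; on current versions, use pd.concat() instead.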

>>> dataflair_A = pd.DataFrame([['a', 1], ['b', 2]], columns=['letter', 'number'])
>>> dataflair_A
>>> dataflair_B = pd.DataFrame([['c', 3], ['d', 4]], columns=['letter', 'number'])
>>> dataflair_B
>>> result = dataflair_A.append(dataflair_B)
>>> result

Output-

  letter  number
0      a       1
1      b       2
0      c       3
1      d       4
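Because every .append() call copies the data, a common alternative is to collect the pieces in a plain Python list and call pd.concat() once at the end. A minimal sketch, assuming the data arrives in small chunks (the letters list and the chunk size of 2 are made up for illustration):

```python
import pandas as pd

letters = ['a', 'b', 'c', 'd', 'e', 'f']   # hypothetical input data

frames = []
for start in range(0, len(letters), 2):
    chunk = pd.DataFrame({
        'letter': letters[start:start + 2],
        'number': [start + 1, start + 2],
    })
    frames.append(chunk)   # list.append: no dataframe is copied here

# One concatenation at the end instead of one copy per chunk.
result = pd.concat(frames, ignore_index=True)

print(result.shape)                # (6, 2)
print(result['number'].tolist())   # [1, 2, 3, 4, 5, 6]
```

This pattern also works on pandas 2.0 and later, where DataFrame.append() is no longer available.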

2.7. Concatenating pandas dataframes while ignoring indexes

If the indexes of your dataframes carry no real meaning (for example, they are just default row numbers), you can discard the overlapping indexes while concatenating. To do this, pass the ignore_index argument.

>>> result = pd.concat([dataflair_A,dataflair_B], ignore_index=True)
>>> result

Output-

  letter  number
0      a       1
1      b       2
2      c       3
3      d       4

The original indexes are discarded, and the result gets a new index running from 0 to 3.
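If you would rather be warned about overlapping indexes than silently renumber them, pd.concat() also accepts verify_integrity=True, which raises a ValueError when the concatenated index contains duplicates. A short sketch (overlap_error and clean are illustrative names):

```python
import pandas as pd

dataflair_A = pd.DataFrame([['a', 1], ['b', 2]], columns=['letter', 'number'])
dataflair_B = pd.DataFrame([['c', 3], ['d', 4]], columns=['letter', 'number'])

overlap_error = None
try:
    # Both dataframes use the labels 0 and 1, so the check fails.
    pd.concat([dataflair_A, dataflair_B], verify_integrity=True)
except ValueError as exc:
    overlap_error = exc

print(overlap_error is not None)   # True

# With ignore_index=True the labels are rebuilt, so there is no overlap.
clean = pd.concat([dataflair_A, dataflair_B], ignore_index=True)
print(clean.index.tolist())        # [0, 1, 2, 3]
```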

2.8. Concatenating pandas dataframes using mixed ndims

You can also concatenate a mix of dataframes and series. In that case, the series is treated as a single-column dataframe whose column name is the name of the series.

>>> dataflair_s = pd.Series(['S0', 'S1'], name='S')
>>> result = pd.concat([dataflair_A,dataflair_s], axis=1)
>>> result
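To confirm how the series was attached, you can inspect the columns of the result. A minimal, self-contained sketch of the same example:

```python
import pandas as pd

dataflair_A = pd.DataFrame([['a', 1], ['b', 2]], columns=['letter', 'number'])
dataflair_s = pd.Series(['S0', 'S1'], name='S')

result = pd.concat([dataflair_A, dataflair_s], axis=1)

# The series name becomes the name of the new column.
print(list(result.columns))   # ['letter', 'number', 'S']
print(result['S'].tolist())   # ['S0', 'S1']
```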

Summary

Now you can concatenate series and dataframes in pandas with pandas.concat() and, on older pandas versions, with DataFrame.append(). In our next pandas tutorial, we will discuss how to merge and join objects in pandas.

We hope this pandas concatenation tutorial helped you. Send us your suggestions and feedback so we can serve you better.
