Concat DataFrames in Pandas

You can concatenate dataframes along the row axis or the column axis. If you have several dataframes with the same fields and want to combine them into one, concatenate them along the row axis (the index). If you have additional fields for the records you already have, concatenate them along the column axis. In this tutorial, we’ll look at how to concatenate two or more dataframes in pandas.

The pandas concat() function is used to concatenate multiple dataframes into one. The following is its syntax:

pd.concat(objs, axis=0)

You pass a sequence of dataframe objects (objs) to concatenate and the axis along which to concatenate them (0 for rows, 1 for columns). The function returns the concatenated dataframe. The default for axis is 0. Note that you can use the concat function to concatenate pandas Series as well.
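As a quick illustration of the note on Series, here is a minimal sketch. The Series names and values below are made up for this example and are not part of the tutorial’s data.

```python
import pandas as pd

# Hypothetical Series, used only for illustration
ages_a = pd.Series([14, 15], name="Age")
ages_b = pd.Series([10, 13], name="Age")
names = pd.Series(["Sam", "Emma"], name="Name")

# axis=0 (the default) stacks the values into one longer Series
stacked = pd.concat([ages_a, ages_b])

# axis=1 places the Series side by side, which gives a DataFrame
# whose columns are named after the Series
table = pd.concat([names, ages_a], axis=1)

print(stacked)
print(table)
```

With axis=0 the result is still a Series, while with axis=1 the result is a dataframe with one column per input Series.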

Let’s look at some of the use-cases of the concat() function through examples –

If you have two or more dataframes having the same fields (or columns) and want to concat them into one:

import pandas as pd

# Create two dataframes with same columns
df1 = pd.DataFrame({'Name': ['Sam', 'Emma'], 'Age': [14, 15]})
df2 = pd.DataFrame({'Name': ['Karen', 'Rahul'], 'Age': [10, 13]})

# Print the dataframes
print("DataFrame df1:\n")
print(df1)
print("\nDataFrame df2:\n")
print(df2)

# Concatenate them along the row axis (axis=0, the default)
df_combined = pd.concat([df1, df2])

# Print the concatenated dataframe
print("\nCombined DataFrame:\n", df_combined)

Output:

DataFrame df1:

   Name  Age
0   Sam   14
1  Emma   15

DataFrame df2:

    Name  Age
0  Karen   10
1  Rahul   13

Combined DataFrame:
     Name  Age
0    Sam   14
1   Emma   15
0  Karen   10
1  Rahul   13

In the above example, the dataframes df1 and df2 have the same columns, Name and Age, and are concatenated along the index, that is, vertically (since the axis parameter is 0 by default). Also, note that the combined dataframe retains the index values of the individual dataframes, so the labels 0 and 1 each appear twice. You can reset the index by using the reset_index() function.

df_combined = df_combined.reset_index(drop=True)
print(df_combined)

Output:

    Name  Age
0    Sam   14
1   Emma   15
2  Karen   10
3  Rahul   13

The above is the combined dataframe with its index reset.
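If you don’t need the original index values at all, a shorter route is to pass ignore_index=True to concat(), which assigns a fresh 0 to n-1 index during the concatenation. The sketch below reuses the df1 and df2 from the example above.

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['Sam', 'Emma'], 'Age': [14, 15]})
df2 = pd.DataFrame({'Name': ['Karen', 'Rahul'], 'Age': [10, 13]})

# ignore_index=True discards the original row labels and
# numbers the combined rows 0, 1, 2, 3
df_combined = pd.concat([df1, df2], ignore_index=True)
print(df_combined)
```

This should give the same result as calling reset_index(drop=True) on the concatenated dataframe, in a single step.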

If you have two or more dataframes whose rows refer to the same records and want to combine all their fields into one dataframe, use the concat() function with axis=1:

import pandas as pd

# Create two dataframes with different columns for the same records
df1 = pd.DataFrame({'Name': ['Sam', 'Emma'], 'Age': [14, 15]})
df2 = pd.DataFrame({'Math': ['B', 'A+'], 'Science': ['A', 'B+']})

# Print the dataframes
print("DataFrame df1:\n")
print(df1)
print("\nDataFrame df2:\n")
print(df2)

# Concatenate them along the column axis (axis=1)
df_combined = pd.concat([df1, df2], axis=1)

# Print the concatenated dataframe
print("\nCombined DataFrame:\n", df_combined)

Output:

DataFrame df1:

   Name  Age
0   Sam   14
1  Emma   15

DataFrame df2:

  Math Science
0    B       A
1   A+      B+

Combined DataFrame:
    Name  Age Math Science
0   Sam   14    B       A
1  Emma   15   A+      B+

In the above example the dataframes df1 (storing the name and age of students) and df2 (storing their Math and Science grades) are concatenated horizontally (along the column axis).
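One thing to keep in mind with axis=1 is that concat() matches rows by their index labels, not by their position. The sketch below uses a hypothetical grades dataframe with a shifted index (an assumption made for illustration) to show what happens when the labels don’t line up, and one way to force a positional match.

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['Sam', 'Emma'], 'Age': [14, 15]}, index=[0, 1])
# Hypothetical grades dataframe whose index does not match df1
df2 = pd.DataFrame({'Math': ['B', 'A+']}, index=[1, 2])

# Rows are aligned on index labels; labels missing from one
# dataframe are filled with NaN in that dataframe's columns
outer = pd.concat([df1, df2], axis=1)
print(outer)

# Resetting both indexes first pairs the rows by position instead
aligned = pd.concat(
    [df1.reset_index(drop=True), df2.reset_index(drop=True)], axis=1
)
print(aligned)
```

If the combined dataframe shows unexpected NaN values after a column-wise concat, mismatched index labels are a likely cause.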

The concat() function has a number of parameters that can be used for more specific use-cases. For more, please refer to its official documentation.
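As a small taste of those parameters, here is a sketch of two of them, join and keys. The dataframe df3 and the labels class_a and class_b are hypothetical and exist only for this example.

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['Sam', 'Emma'], 'Age': [14, 15]})
# Hypothetical dataframe that shares only the Name column with df1
df3 = pd.DataFrame({'Name': ['Karen'], 'City': ['Pune']})

# join='inner' keeps only the columns present in every input;
# the default, join='outer', keeps all columns and fills gaps with NaN
common = pd.concat([df1, df3], join='inner', ignore_index=True)
print(common)

# keys adds an outer index level that records which input
# each row came from
labelled = pd.concat([df1, df3], keys=['class_a', 'class_b'])
print(labelled)
print(labelled.loc['class_b'])
```

The keys option can be handy when you want to trace rows back to their source dataframe after combining them.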

With this, we come to the end of this tutorial. The code examples and results presented in this tutorial were produced in a Jupyter Notebook with a Python (version 3.8.3) kernel and pandas version 1.0.5.

