You can concatenate dataframes along the row axis or the column axis. If you have multiple dataframes with the same fields and want to combine them into one, concatenate them along the row axis (the index). If instead you have additional fields for records you already have, concatenate them along the column axis. In this tutorial, we’ll look at how to concatenate two or more dataframes in pandas.
The pandas concat() function
The pandas concat() function is used to concatenate multiple dataframes into one. The following is its syntax:
pd.concat(objs, axis=0)
You pass the sequence of dataframe objects (objs) that you want to concatenate and the axis (0 for rows, 1 for columns) along which to concatenate them, and the function returns the concatenated dataframe. The default for axis is 0. Note that you can also use concat() to concatenate pandas Series.
Examples
Let’s look at some use cases of the concat() function through examples:
1. Concat DataFrames with same fields (Vertically)
If you have two or more dataframes having the same fields (or columns) and want to concat them into one:
import pandas as pd
# Create two dataframes with same columns
df1 = pd.DataFrame({'Name': ['Sam', 'Emma'], 'Age': [14, 15]})
df2 = pd.DataFrame({'Name': ['Karen', 'Rahul'], 'Age': [10, 13]})
# Print the dataframes
print("DataFrame df1:\n")
print(df1)
print("\nDataFrame df2:\n")
print(df2)
# Concatenate them along the row axis (axis=0, the default)
df_combined = pd.concat([df1, df2])
# Print the concatenated dataframe
print("\nCombined DataFrame:\n")
print(df_combined)
Output:
DataFrame df1:
   Name  Age
0   Sam   14
1  Emma   15
DataFrame df2:
    Name  Age
0  Karen   10
1  Rahul   13
Combined DataFrame:
    Name  Age
0    Sam   14
1   Emma   15
0  Karen   10
1  Rahul   13
In the above example, the dataframes df1 and df2 have the same columns, Name and Age, and are concatenated along the index, that is, vertically (since the axis parameter defaults to 0). Also note that the combined dataframe retains the index values from the individual dataframes. You can reset the index with the reset_index() function.
df_combined = df_combined.reset_index(drop=True)
print(df_combined)
Output:
    Name  Age
0    Sam   14
1   Emma   15
2  Karen   10
3  Rahul   13
The above is the combined dataframe with its index reset.
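As an alternative to calling reset_index() afterwards, concat() accepts an ignore_index parameter. The sketch below reuses the same df1 and df2 and is only meant to illustrate that option.

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['Sam', 'Emma'], 'Age': [14, 15]})
df2 = pd.DataFrame({'Name': ['Karen', 'Rahul'], 'Age': [10, 13]})

# ignore_index=True discards the original row labels and
# numbers the rows of the result from 0 to n-1
df_combined = pd.concat([df1, df2], ignore_index=True)
print(df_combined)
```

This gives the same index as the reset_index(drop=True) approach, in a single step.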
2. Concat DataFrames with rows referring to the same records (Horizontally)
If you have two or more dataframes whose rows refer to the same records and want to combine all their fields into one dataframe, use the concat() function with axis=1.
import pandas as pd
# Create two dataframes with different columns describing the same records
df1 = pd.DataFrame({'Name': ['Sam', 'Emma'], 'Age': [14, 15]})
df2 = pd.DataFrame({'Math': ['B', 'A+'], 'Science': ['A', 'B+']})
# Print the dataframes
print("DataFrame df1:\n")
print(df1)
print("\nDataFrame df2:\n")
print(df2)
# Concatenate them along the column axis (axis=1)
df_combined = pd.concat([df1, df2], axis=1)
# Print the concatenated dataframe
print("\nCombined DataFrame:\n")
print(df_combined)
Output:
DataFrame df1:
   Name  Age
0   Sam   14
1  Emma   15
DataFrame df2:
  Math Science
0    B       A
1   A+      B+
Combined DataFrame:
   Name  Age Math Science
0   Sam   14    B       A
1  Emma   15   A+      B+
In the above example, the dataframes df1 (storing the names and ages of students) and df2 (storing their Math and Science grades) are concatenated horizontally (along the column axis).
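One thing to keep in mind with axis=1 is that rows are matched by index label, not by position. The following sketch uses a hypothetical df2 whose index only partly overlaps that of df1 to show the effect.

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['Sam', 'Emma'], 'Age': [14, 15]})
# Hypothetical grades whose index (1, 2) only partly overlaps df1's index (0, 1)
df2 = pd.DataFrame({'Math': ['B', 'A+']}, index=[1, 2])

# Rows are matched by index label; labels missing from one side get NaN
misaligned = pd.concat([df1, df2], axis=1)
print(misaligned)

# Resetting both indexes first pairs the rows by position instead
by_position = pd.concat(
    [df1.reset_index(drop=True), df2.reset_index(drop=True)], axis=1
)
print(by_position)
```

If your dataframes are meant to line up row by row, check that their indexes agree (or reset them) before concatenating horizontally.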
The concat() function has a number of other parameters for more specific use cases. For details, refer to its official documentation.
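As a brief illustration of two of those parameters, the sketch below uses join and keys. The extra City column and the key labels 'first' and 'second' are assumptions made up for this example.

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['Sam', 'Emma'], 'Age': [14, 15]})
# Hypothetical second dataframe with an extra 'City' column
df2 = pd.DataFrame({'Name': ['Karen'], 'Age': [10], 'City': ['Pune']})

# join='inner' keeps only the columns common to every dataframe
common_only = pd.concat([df1, df2], join='inner', ignore_index=True)

# keys labels each source dataframe, producing a two-level row index
labelled = pd.concat([df1, df2], keys=['first', 'second'])

print(common_only)
print(labelled)
```

With the default join='outer', columns missing from a dataframe are filled with NaN, which is what happens to City for the rows coming from df1 in the second result.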
With this, we come to the end of this tutorial. The code examples and results presented in this tutorial were produced in a Jupyter Notebook with a Python (version 3.8.3) kernel and pandas version 1.0.5.