In this tutorial, we will look at the ValueError: Columns must be same length as key error and understand why it occurs. We will also look at how to resolve it with the help of some examples.
Understanding the ValueError: Columns must be same length as key error
This error occurs when you try to add several columns to a dataframe at once from a data structure like a list of lists or another dataframe, and the number of column names you assign to does not match the number of columns in that data structure.
Let’s look at an example.
import pandas as pd

# create two dataframes
df1 = pd.DataFrame({
    "Name": ["Jim", "Angela", "Dwight"]
})
df2 = pd.DataFrame({
    "Col1": [25, 29, 30],
    "Col2": ["Sales", "Accounting", "Sales"],
})
Here, we created two dataframes: df1 has the “Name” column, and df2 has the column “Col1” storing the age values and the column “Col2” storing the department names.
Now, let’s say you want to add columns from the dataframe df2 to the dataframe df1. If the number of column names you assign to does not match the number of columns in the dataframe you’re adding the data from, you’ll get the ValueError: Columns must be same length as key.
# add columns from df2 to df1
df1[["Age"]] = df2
print(df1)
Output:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[28], line 2
      1 # add columns from df2 to df1
----> 2 df1[["Age"]] = df2
      3 print(df1)

File ~/miniforge3/envs/dsp/lib/python3.8/site-packages/pandas/core/frame.py:3600, in DataFrame.__setitem__(self, key, value)
   3598     self._setitem_frame(key, value)
   3599 elif isinstance(key, (Series, np.ndarray, list, Index)):
-> 3600     self._setitem_array(key, value)
   3601 elif isinstance(value, DataFrame):
   3602     self._set_item_frame_value(key, value)

File ~/miniforge3/envs/dsp/lib/python3.8/site-packages/pandas/core/frame.py:3639, in DataFrame._setitem_array(self, key, value)
   3637 else:
   3638     if isinstance(value, DataFrame):
-> 3639         check_key_length(self.columns, key, value)
   3640         for k1, k2 in zip(key, value.columns):
   3641             self[k1] = value[k2]

File ~/miniforge3/envs/dsp/lib/python3.8/site-packages/pandas/core/indexers.py:428, in check_key_length(columns, key, value)
    426 if columns.is_unique:
    427     if len(value.columns) != len(key):
--> 428         raise ValueError("Columns must be same length as key")
    429 else:
    430     # Missing keys in columns are represented as -1
    431     if len(columns.get_indexer_non_unique(key)[0]) != len(value.columns):

ValueError: Columns must be same length as key
In the above example, we’re trying to add all the columns from df2 (total of two columns) into a single column “Age” in df1 and thus we get the error.
You’ll get the same error the other way around as well, that is, when you assign to more columns than the source dataframe actually has. Here, df2 has only two columns, so assigning it to three column names fails.
# add columns from df2 to df1
df1[["Age", "Department", "Gender"]] = df2
print(df1)
Output:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[29], line 2
      1 # add columns from df2 to df1
----> 2 df1[["Age", "Department", "Gender"]] = df2
      3 print(df1)

File ~/miniforge3/envs/dsp/lib/python3.8/site-packages/pandas/core/frame.py:3600, in DataFrame.__setitem__(self, key, value)
   3598     self._setitem_frame(key, value)
   3599 elif isinstance(key, (Series, np.ndarray, list, Index)):
-> 3600     self._setitem_array(key, value)
   3601 elif isinstance(value, DataFrame):
   3602     self._set_item_frame_value(key, value)

File ~/miniforge3/envs/dsp/lib/python3.8/site-packages/pandas/core/frame.py:3639, in DataFrame._setitem_array(self, key, value)
   3637 else:
   3638     if isinstance(value, DataFrame):
-> 3639         check_key_length(self.columns, key, value)
   3640         for k1, k2 in zip(key, value.columns):
   3641             self[k1] = value[k2]

File ~/miniforge3/envs/dsp/lib/python3.8/site-packages/pandas/core/indexers.py:428, in check_key_length(columns, key, value)
    426 if columns.is_unique:
    427     if len(value.columns) != len(key):
--> 428         raise ValueError("Columns must be same length as key")
    429 else:
    430     # Missing keys in columns are represented as -1
    431     if len(columns.get_indexer_non_unique(key)[0]) != len(value.columns):

ValueError: Columns must be same length as key
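Both mismatches can also be reproduced compactly by catching the exception instead of reading the full traceback. The following is a minimal sketch, assuming a pandas version that performs the length check shown in the traceback above; the helper name try_assign is made up for illustration and is not part of pandas.

```python
import pandas as pd

df2 = pd.DataFrame({
    "Col1": [25, 29, 30],
    "Col2": ["Sales", "Accounting", "Sales"],
})

def try_assign(keys):
    """Hypothetical helper: attempt df1[keys] = df2 on a fresh df1.

    Returns the error message if a ValueError is raised, otherwise None.
    """
    df1 = pd.DataFrame({"Name": ["Jim", "Angela", "Dwight"]})
    try:
        df1[keys] = df2
    except ValueError as err:
        return str(err)
    return None

too_few = try_assign(["Age"])                           # 1 key, 2 source columns
too_many = try_assign(["Age", "Department", "Gender"])  # 3 keys, 2 source columns
matching = try_assign(["Age", "Department"])            # 2 keys, 2 source columns

print("too few keys: ", too_few)
print("too many keys:", too_many)
print("matching keys:", matching)
```

Only the call where the number of keys equals the number of columns in df2 is expected to return None.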
How to fix this error?
To fix this error, make sure there is no mismatch in the number of columns you’re trying to add and the respective values in the data structure from which you are adding that data.
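One way to make the mismatch explicit is to compare the two counts yourself before assigning. This is only a sketch: add_columns is a hypothetical helper written for this tutorial, not a pandas function, and it assumes the source is a dataframe whose index lines up with the target.

```python
import pandas as pd

def add_columns(target, new_names, source):
    """Hypothetical helper: add source's columns to target under new_names.

    Raises a ValueError with both counts if they do not match.
    """
    n_names = len(new_names)
    n_source = source.shape[1]  # number of columns in the source dataframe
    if n_names != n_source:
        raise ValueError(
            f"got {n_names} column name(s) for {n_source} source column(s)"
        )
    target[new_names] = source
    return target

df1 = pd.DataFrame({"Name": ["Jim", "Angela", "Dwight"]})
df2 = pd.DataFrame({
    "Col1": [25, 29, 30],
    "Col2": ["Sales", "Accounting", "Sales"],
})

result = add_columns(df1, ["Age", "Department"], df2)
print(list(result.columns))
```

The check duplicates what pandas already does internally, but the custom message reports both numbers, which can make the cause easier to spot.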
In the earlier example with df1 and df2, we can fix this error by assigning to exactly two columns in df1, one for each column in df2.
# add columns from df2 to df1
df1[["Age", "Department"]] = df2
print(df1)
Output:
     Name  Age  Department
0     Jim   25       Sales
1  Angela   29  Accounting
2  Dwight   30       Sales
You can see that we do not get any errors here.
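Alternatively, if you want to add only specific columns from a dataframe, select just those columns on the right-hand side. For example, df1[["Age"]] = df2[["Col1"]] adds only the age values, since the ages are stored in the “Col1” column of df2.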
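As a short sketch of this approach, the example below rebuilds the two dataframes from earlier and adds the columns one at a time. The second assignment uses a single label and a single Series, which is an alternative way to add one column and is an addition to the approach described above.

```python
import pandas as pd

df1 = pd.DataFrame({"Name": ["Jim", "Angela", "Dwight"]})
df2 = pd.DataFrame({
    "Col1": [25, 29, 30],
    "Col2": ["Sales", "Accounting", "Sales"],
})

# one key on the left, one selected column on the right
df1[["Age"]] = df2[["Col1"]]

# for a single column, a plain label and a Series also work
df1["Department"] = df2["Col2"]

print(df1)
```

In both assignments the left-hand side names exactly as many columns as the right-hand side provides, so the length check passes.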
Conclusion
The ValueError: Columns must be same length as key occurs when there’s a mismatch between the number of columns you’re trying to add and the number of columns in the dataframe from which you’re taking the data.