In this tutorial, we will learn how to get a pandas dataframe row as a numpy array. Some familiarity with Python is helpful but not required.
How to get a dataframe’s row as a numpy array?
To obtain a pandas dataframe’s row as a numpy array, we first fetch the row and then convert it into a numpy array. The syntax is:
# Get the row using its integer position
row = df.iloc[row_index]
# Convert the row to a numpy array
row_as_array = row.to_numpy()
Here,
- df: The pandas dataframe.
- df.iloc[row_index]: The pandas dataframe property for accessing rows and columns by their integer positions.
- row.to_numpy(): The pandas built-in method for converting dataframe or series objects to a numpy array.
In the above code, we fetch the required row by its integer position using the iloc indexer of the pandas dataframe. The row thus obtained is a pandas Series object. To convert a Series to a numpy array, pandas provides the built-in to_numpy() method.
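As a minimal, self-contained sketch of the two steps described above (the small `scores` dataframe and its column names are made up purely for illustration), the following shows that `iloc` hands back a Series and that `to_numpy()` turns it into a one-dimensional array:

```python
import numpy as np
import pandas as pd

# Hypothetical data, used only for this illustration
scores = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# Step 1: fetch the row at integer position 0; the result is a Series
row = scores.iloc[0]
print(type(row).__name__)           # Series

# Step 2: convert the Series to a numpy array
row_as_array = row.to_numpy()
print(type(row_as_array).__name__)  # ndarray
print(row_as_array.shape)           # (2,) -> one value per column
```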
Alternatively, a pandas Series can be converted to a numpy array using numpy's built-in np.array() function. The syntax is:
import numpy as np
# Get the row using its integer position
row = df.iloc[row_index]
# Convert the row to a numpy array
row_as_array = np.array(row)
Here, df.iloc, as discussed earlier, returns the required row as a pandas Series object. This is then converted to a numpy array using the np.array() function.
Examples
Let’s understand the above syntax with some code examples.
For our examples, let’s create a dataframe containing the marks of some children in a class. To keep it simple, we will have only three subjects and three students.
import pandas as pd
import numpy as np

# Build a dictionary to pass to the dataframe constructor
d = {'Mary': [47, 25, 36], 'Alfred': [81, 85, 76], 'Jean': [93, 87, 71]}

# Create the DataFrame with the subject names as its index
df = pd.DataFrame(d, index=['Maths', 'English', 'Science'])

# Display the dataframe
df
Output:
         Mary  Alfred  Jean
Maths      47      81    93
English    25      85    87
Science    36      76    71
Example 1: Get the first row of a dataframe as a numpy array
To get the first row of a dataframe as a numpy array, we extract the row using df.iloc[0] and then convert it into a numpy array using the pandas to_numpy() method.
# Get the first row
first_row = df.iloc[0]
# Convert the row to a numpy array
first_row_as_np_array = first_row.to_numpy()
print("First row:", first_row_as_np_array)
Output:
First row: [47 81 93]
As we can see, we have the first row of the dataframe as a numpy array.
To check if the above variable is a numpy array, we can simply print its type.
print("Type of variable 'first_row_as_np_array':", type(first_row_as_np_array))
Output:
Type of variable 'first_row_as_np_array': <class 'numpy.ndarray'>
As we see in the output, the variable is a numpy array object.
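Beyond printing the type, a few other properties of the result can be inspected. The sketch below rebuilds the tutorial's dataframe so that it runs on its own; checking with `isinstance()` and looking at `ndim`, `shape` and `dtype` is one possible approach, not the only one:

```python
import numpy as np
import pandas as pd

# Same data as the tutorial's dataframe
d = {'Mary': [47, 25, 36], 'Alfred': [81, 85, 76], 'Jean': [93, 87, 71]}
df = pd.DataFrame(d, index=['Maths', 'English', 'Science'])

first_row_as_np_array = df.iloc[0].to_numpy()

# isinstance() gives a direct True/False answer
print(isinstance(first_row_as_np_array, np.ndarray))  # True

# The array is one-dimensional, with one element per column (student)
print(first_row_as_np_array.ndim)   # 1
print(first_row_as_np_array.shape)  # (3,)

# The element type is an integer type; its exact width can vary by platform
print(first_row_as_np_array.dtype)
```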
Example 2: Get the last row of a dataframe as a numpy array
To get the last row of the dataframe as a numpy array, we simply replace 0 with -1 in df.iloc[0], thereby indicating that we want to access the last row of the dataframe.
# Get the last row of the dataframe
last_row = df.iloc[-1]
# Convert the last row to a numpy array
last_row_as_np_array = last_row.to_numpy()
print("Last row:", last_row_as_np_array)
Output:
Last row: [36 76 71]
The above code returns the last row of the dataframe, which represents the marks of students in the Science subject.
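Negative positions in `iloc` count backwards from the end, in the same way as Python list indexing. The following sketch, which rebuilds the tutorial's dataframe so that it is self-contained, illustrates that `-1` and `len(df) - 1` refer to the same row:

```python
import numpy as np
import pandas as pd

# Same data as the tutorial's dataframe
d = {'Mary': [47, 25, 36], 'Alfred': [81, 85, 76], 'Jean': [93, 87, 71]}
df = pd.DataFrame(d, index=['Maths', 'English', 'Science'])

# -1 counts from the end, so it selects the same row as len(df) - 1
last_via_negative = df.iloc[-1].to_numpy()
last_via_length = df.iloc[len(df) - 1].to_numpy()
print(np.array_equal(last_via_negative, last_via_length))  # True

# -2 selects the second-to-last row (English in this dataframe)
second_to_last = df.iloc[-2].to_numpy()
print(second_to_last)  # [25 85 87]
```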
Example 3: Get any row of dataframe as a numpy array
To obtain any row in the dataframe, we pass the integer position of the required row, say i, to df.iloc[i]. Positions start at 0, so i = 1 selects the middle row of our dataframe.
# Get the middle row of the dataframe
row_number = 1
middle_row = df.iloc[row_number]
# Convert the middle row to a numpy array
middle_row_as_np_array = middle_row.to_numpy()
print("Middle Row:", middle_row_as_np_array)
Output:
Middle Row: [25 85 87]
Thus we see that the middle row is returned as an array, which represents the students' marks in English in our dataframe.
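To illustrate that the same pattern works for every valid position, the sketch below loops over all rows of the tutorial's dataframe (rebuilt here so the code runs on its own). The list name `all_rows` is chosen only for this example:

```python
import pandas as pd

# Same data as the tutorial's dataframe
d = {'Mary': [47, 25, 36], 'Alfred': [81, 85, 76], 'Jean': [93, 87, 71]}
df = pd.DataFrame(d, index=['Maths', 'English', 'Science'])

all_rows = []
# Valid integer positions run from 0 to len(df) - 1
for i in range(len(df)):
    row_as_array = df.iloc[i].to_numpy()
    all_rows.append(row_as_array)
    # df.index[i] is the label (subject) stored at position i
    print(i, df.index[i], row_as_array)

# Printed lines:
# 0 Maths [47 81 93]
# 1 English [25 85 87]
# 2 Science [36 76 71]
```

Passing a position outside this range, such as df.iloc[3] for this dataframe, raises an IndexError.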
Example 4: Using Numpy’s built-in function to convert Pandas DataFrame row to Numpy array
Alternatively, Numpy supports converting pandas dataframe rows (which are pandas series objects) to numpy arrays using the numpy.array() function.
# Get the middle row of the dataframe
row_number = 1
middle_row = df.iloc[row_number]
# Convert the middle row to a numpy array using the np.array() function
middle_row_as_np_array = np.array(middle_row)
print("Middle Row:", middle_row_as_np_array)
Output:
Middle Row: [25 85 87]
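We see that the above code gives the same output as in the previous examples. For a dataframe like ours, both methods produce an array with the same values and element type. The difference is that to_numpy() is a pandas method that also accepts options such as dtype and copy, whereas np.array() is a general numpy function that copies its input by default.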
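One way to confirm that the two approaches agree is to compare their results directly. This sketch rebuilds the tutorial's dataframe so that it runs on its own; the variable names `via_to_numpy`, `via_np_array` and `as_float` are chosen only for this illustration:

```python
import numpy as np
import pandas as pd

# Same data as the tutorial's dataframe
d = {'Mary': [47, 25, 36], 'Alfred': [81, 85, 76], 'Jean': [93, 87, 71]}
df = pd.DataFrame(d, index=['Maths', 'English', 'Science'])

middle_row = df.iloc[1]
via_to_numpy = middle_row.to_numpy()
via_np_array = np.array(middle_row)

# Same values and same element type from both approaches
print(np.array_equal(via_to_numpy, via_np_array))  # True
print(via_to_numpy.dtype == via_np_array.dtype)    # True

# to_numpy() can also convert the element type while building the array
as_float = middle_row.to_numpy(dtype=float)
print(as_float)  # [25. 85. 87.]
```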
Summary
In this tutorial, we looked at how to:
- Use iloc to access a dataframe row by its integer position.
- Use the pandas to_numpy() method on the dataframe row to convert it to a numpy array.
- Use numpy's built-in function, numpy.array(), to convert a pandas row to a numpy array.