The Numpy library in Python comes with a number of useful functions to work with and manipulate data in Numpy arrays. In this tutorial, we will look at how to remove duplicates from a Numpy array with the help of some examples.
How to remove duplicates from a Numpy array?
You can use the Numpy unique() function to remove duplicates from an array. Pass the array as an argument. The following is the syntax:
# remove duplicates from numpy array
np.unique(ar)
It returns a new Numpy array containing the sorted unique elements of the passed array. The original array is not modified.
You can also use the Numpy unique() function to remove duplicate rows or columns from a 2-D Numpy array by passing the axis parameter (see the examples below).
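One detail worth keeping in mind is what happens when a 2-D array is passed without an axis: np.unique() flattens the input and returns the sorted unique values as a 1-D array. The following is a minimal sketch of that behavior; the array name grid and its values are illustrative and not part of this tutorial's examples.

```python
import numpy as np

# illustrative 2-D array (hypothetical values, chosen only for this sketch)
grid = np.array([[3, 1, 3],
                 [2, 1, 2]])

# with no axis argument, the array is flattened before duplicates are removed,
# and the unique values come back sorted
flat_unique = np.unique(grid)
print(flat_unique)
```

This prints [1 2 3], a one-dimensional result, which is why the axis parameter is needed when the goal is to drop whole rows or columns.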
Remove duplicate values from a one-dimensional Numpy array
Let’s now look at an example of using the above syntax to remove duplicate values from a one-dimensional Numpy array.
First, let’s create a 1-D array.
import numpy as np

# create a numpy array
ar = np.array([1, 2, 2, 3, 2, 3, 1, 4])
# print the array
print(ar)
Output:
[1 2 2 3 2 3 1 4]
Here, we use the np.array() function to create a Numpy array of some numbers. You can see that there are duplicate values present in the above array.
Let’s now remove the duplicate values from the above array using the np.unique() function.
# remove duplicates from numpy array
print(np.unique(ar))
Output:
[1 2 3 4]
The resulting array contains only the unique values from the passed array, in ascending order.
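Because np.unique() sorts its result, the order in which values first appeared in the input is not kept. If that order matters, one possible approach (a sketch, not the only way) is to use the return_index parameter, which also returns the index of the first occurrence of each unique value. The array vals below is a hypothetical example chosen so that the two orders differ.

```python
import numpy as np

# hypothetical array whose first-appearance order differs from sorted order
vals = np.array([3, 1, 3, 2, 1, 4])

# return_index=True also returns, for each unique value, the index of its
# first occurrence in the original array
_, first_idx = np.unique(vals, return_index=True)

# sorting those indices and indexing the original array restores
# first-appearance order
ordered_unique = vals[np.sort(first_idx)]
print(ordered_unique)
```

This prints [3 1 2 4], whereas np.unique(vals) on its own would give [1 2 3 4].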
Remove duplicate rows or columns of two-dimensional Numpy array
You can also use the np.unique() function to remove duplicate rows or columns from a two-dimensional Numpy array.
Remove duplicate rows in Numpy
To remove duplicate rows of a 2-D Numpy array, use the np.unique() function with the axis=0 parameter.
Let’s look at an example. First, we will create a 2-D array.
# create a numpy array
ar = np.array([[1, 2, 1], [2, 3, 5], [1, 2, 1]])
# print the array
print(ar)
Output:
[[1 2 1]
 [2 3 5]
 [1 2 1]]
Here, we created a 2-D Numpy array with three rows and three columns. You can see that this array contains a duplicate row (the first and the last rows are duplicates).
Let’s now remove the duplicate rows using the np.unique() function.
# remove duplicate rows from numpy array
print(np.unique(ar, axis=0))
Output:
[[1 2 1]
 [2 3 5]]
The resulting Numpy array contains only the unique rows from the original array.
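To see which rows were duplicated, and not only the de-duplicated result, the return_counts parameter can be combined with axis=0. The following is a small sketch using the same array as above; the variable names unique_rows and counts are illustrative.

```python
import numpy as np

ar = np.array([[1, 2, 1], [2, 3, 5], [1, 2, 1]])

# return_counts=True also returns how many times each unique row occurs
unique_rows, counts = np.unique(ar, axis=0, return_counts=True)
print(unique_rows)
print(counts)

# rows that appeared more than once in the original array
print(unique_rows[counts > 1])
```

Here counts is [2 1], so the row [1 2 1] is reported as the one that occurred twice.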
Remove duplicate columns in Numpy
You can similarly remove duplicate columns from a 2-D Numpy array. For this, pass axis=1 to the np.unique() function.
Let’s look at an example. First, we will create a 2-D array.
# create a numpy array
ar = np.array([[1, 2, 2], [4, 1, 1], [3, 1, 1]])
# print the array
print(ar)
Output:
[[1 2 2]
 [4 1 1]
 [3 1 1]]
Here, we created a 2-D Numpy array with three rows and three columns. You can see that this array contains a duplicate column (the second and the third columns are duplicates).
Let’s now remove the duplicate columns using the np.unique() function.
# remove duplicate columns from numpy array
print(np.unique(ar, axis=1))
Output:
[[1 2]
 [4 1]
 [3 1]]
The resulting Numpy array contains only the unique columns from the original array.
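One way to reason about axis=1 is that it treats each column as a unit, much as axis=0 treats each row. As a rough sanity check (a sketch, with illustrative variable names), removing duplicate columns should give the same result as transposing the array, removing duplicate rows, and transposing back.

```python
import numpy as np

ar = np.array([[1, 2, 2], [4, 1, 1], [3, 1, 1]])

# remove duplicate columns directly
by_columns = np.unique(ar, axis=1)

# transpose so columns become rows, remove duplicate rows, transpose back
via_transpose = np.unique(ar.T, axis=0).T

# both approaches yield the same array
print(np.array_equal(by_columns, via_transpose))
```

This prints True for the array used in this example.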
Summary – Remove duplicates in Numpy
In this tutorial, we looked at how to remove duplicates from a Numpy array. The following is a short summary of the important points mentioned in this tutorial:
- Use the np.unique() function to remove duplicates from a Numpy array.
- Pass axis=0 to the np.unique() function to remove duplicate rows from a 2-D Numpy array.
- Pass axis=1 to the np.unique() function to remove duplicate columns from a 2-D Numpy array.