In this tutorial, we will look at how to check whether a NumPy array has any duplicates, with the help of some examples.
A numpy array is said to have duplicates if one or more values occur more than once in the array. For example, in the array [1, 2, 2, 3, 4], the value 2 occurs more than once and is thus a duplicate.
Methods to check if a numpy array has duplicates
To check if a numpy array has any duplicates, check if the count of unique values in the array is less than the length of the array. The idea is that if an array has any duplicate values, the number of unique values will be less than the size of the original array. You can use the following two methods:
- Get the unique values in the array using the numpy.unique() function and compare the length of the result with that of the original array.
- Convert the original array to a set and compare the length of the set with that of the original array.
The following is the syntax:
# check if numpy array has duplicates
# method 1
len(numpy.unique(ar)) < len(ar)
# method 2
len(set(ar)) < len(ar)
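The two expressions above can be wrapped in small reusable functions. The following is a minimal sketch for one-dimensional arrays; the names `has_duplicates` and `has_duplicates_set` are hypothetical helpers chosen for illustration and are not part of NumPy.

```python
import numpy as np

def has_duplicates(ar):
    """Hypothetical helper: True if any value in the 1-D array occurs more than once."""
    # method 1: fewer unique values than elements means something repeats
    return len(np.unique(ar)) < len(ar)

def has_duplicates_set(ar):
    """Hypothetical helper: the same check using a Python set."""
    # method 2: a set keeps only one copy of each value
    return len(set(ar)) < len(ar)

print(has_duplicates(np.array([1, 2, 2, 3, 4])))
print(has_duplicates_set(np.array([1, 2, 3, 4])))
```

Both functions assume a one-dimensional input, as in the examples in this tutorial.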
Let’s now look at some examples of using the above methods.
Example 1 – Using the numpy.unique() function
The numpy.unique() function returns a numpy array of the sorted unique values in the passed array. If the size of this array is less than the size of the original array, the original array has at least one duplicate value.
Let’s look at an example.
import numpy as np

# create two arrays
ar1 = np.array([1, 2, 2, 3, 4, 5])
ar2 = np.array([1, 2, 3, 4, 5])

# check if array has duplicates
print(len(np.unique(ar1)) < len(ar1))
print(len(np.unique(ar2)) < len(ar2))
Output:
True
False
Here, we created two numpy arrays: ar1, with duplicates, and ar2, without any duplicate values, and checked whether each had duplicates using the numpy.unique() function. We get True for ar1 and False for ar2, which are the correct results.
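If you also want to know which values are duplicated, and not just whether any exist, numpy.unique() can return the count of each unique value through its `return_counts` parameter. The snippet below is a small sketch of that idea as an optional extension of the check above; the variable names are illustrative.

```python
import numpy as np

ar1 = np.array([1, 2, 2, 3, 4, 5])

# get each unique value together with the number of times it occurs
values, counts = np.unique(ar1, return_counts=True)

# keep only the values that occur more than once
duplicated = values[counts > 1]
print(duplicated)
```

Here `duplicated` contains only the value 2, since it is the only value that appears more than once in ar1.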
Example 2 – Using a set
Alternatively, you can convert the numpy array to a set to get only the unique values and then compare the length of the set and that of the original array.
Let’s take the same example as above.
import numpy as np

# create two arrays
ar1 = np.array([1, 2, 2, 3, 4, 5])
ar2 = np.array([1, 2, 3, 4, 5])

# check if array has duplicates
print(len(set(ar1)) < len(ar1))
print(len(set(ar2)) < len(ar2))
Output:
True
False
We get the same results as above.
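Both examples use one-dimensional arrays. For a multi-dimensional array, `len(ar)` gives only the size of the first axis, and `set(ar)` fails because the rows are themselves arrays and cannot be hashed. The following sketch shows one possible way to adapt both checks, assuming you want to look for duplicates across all elements of the array; the array `ar2d` is made up for illustration.

```python
import numpy as np

# a 2-D array in which the value 2 appears twice
ar2d = np.array([[1, 2], [2, 3]])

# np.unique flattens its input, so compare against the total element count
print(np.unique(ar2d).size < ar2d.size)

# flatten to a plain Python list before building the set
flat = ar2d.ravel().tolist()
print(len(set(flat)) < len(flat))
```

Comparing against `ar2d.size` rather than `len(ar2d)` matters here: the array has three unique values but only two rows, so a comparison with `len(ar2d)` would wrongly report no duplicates.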