Numpy – Remove Duplicates From Array

The Numpy library in Python comes with a number of useful functions to work with and manipulate data in Numpy arrays. In this tutorial, we will look at how to remove duplicates from a Numpy array with the help of some examples.

How to remove duplicates from a Numpy array?

You can use the Numpy unique() function to remove duplicates from an array. Pass the array as an argument. The following is the syntax –

# remove duplicates from numpy array
np.unique(ar)

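It returns a new Numpy array containing the unique elements of the passed array in sorted order, so the original order of the elements is not preserved. If you pass a multi-dimensional array without specifying an axis, the array is flattened first and the unique values are returned as a 1-D array.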

You can also use the Numpy unique() function with the axis parameter to remove duplicate rows and columns from a 2-D Numpy array (see the examples below).
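To make the role of the axis parameter concrete, here is a minimal sketch comparing a call without axis to a call with axis=0 on the same 2-D array. The array name grid and its values are made up for illustration.

```python
import numpy as np

# a small 2-D array; the first and last rows are identical (illustrative values)
grid = np.array([[1, 2], [2, 3], [1, 2]])

# without axis, the array is flattened and unique scalar values are returned
flat_unique = np.unique(grid)

# with axis=0, whole rows are compared and duplicate rows are dropped
row_unique = np.unique(grid, axis=0)

print(flat_unique)
print(row_unique)
```

The first call returns a 1-D array of values, while the second keeps the 2-D shape and treats each row as a single item.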

Remove duplicate values from a one-dimensional Numpy array

Let’s now look at an example of using the above syntax to remove duplicate values from a one-dimensional Numpy array.

First, let’s create a 1-D array.

import numpy as np

# create a numpy array
ar = np.array([1, 2, 2, 3, 2, 3, 1, 4])
# print the array
print(ar)

Output:

[1 2 2 3 2 3 1 4]

Here, we use the np.array() function to create a Numpy array of some numbers. You can see that there are duplicate values present in the above array.

Let’s now remove the duplicate values from the above array using the np.unique() function.

# remove duplicates from numpy array
print(np.unique(ar))

Output:

[1 2 3 4]

The resulting array contains only the unique values from the passed array, returned in sorted order.
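Because np.unique() sorts its result, the order of first appearance is lost. If you need to keep that order, one possible approach (a sketch, not the only way) is to ask for the index of each value's first occurrence with return_index=True and then sort those indices. The array name values and its contents are made up for illustration.

```python
import numpy as np

# an array whose first-appearance order differs from sorted order
values = np.array([3, 1, 3, 2, 1])

# return_index=True also returns the index of the first occurrence of each unique value
sorted_unique, first_idx = np.unique(values, return_index=True)

# sorting the indices restores the order in which the values first appear
ordered_unique = values[np.sort(first_idx)]

print(sorted_unique)   # unique values in sorted order
print(ordered_unique)  # unique values in order of first appearance
```

Here sorted_unique is [1 2 3], while ordered_unique is [3 1 2].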

Remove duplicate rows or columns of two-dimensional Numpy array

You can also use the np.unique() function to remove duplicate rows or columns from a two-dimensional Numpy array.

Remove duplicate rows in Numpy

To remove duplicate rows from a 2-D Numpy array, pass axis=0 to the np.unique() function.

Let’s look at an example. First, we will create a 2-D array.

# create a numpy array
ar = np.array([[1, 2, 1], [2, 3, 5], [1, 2, 1]])
# print the array
print(ar)

Output:

[[1 2 1]
 [2 3 5]
 [1 2 1]]

Here, we created a 2-D Numpy array with three rows and three columns. You can see that this array contains a duplicate row (the first and the last rows are duplicates).

Let’s now remove the duplicate rows using the np.unique() function.

# remove duplicate rows from numpy array
print(np.unique(ar, axis=0))

Output:

[[1 2 1]
 [2 3 5]]

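The resulting Numpy array contains only the unique rows from the original array. Note that the rows are returned in sorted order (compared element by element), which may differ from their order in the original array.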
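If the original row order matters, the return_index option also works together with axis=0. The following is a minimal sketch under that assumption; the array name table and its values are illustrative.

```python
import numpy as np

# the first and last rows are duplicates; the first row sorts after the second
table = np.array([[2, 3, 5], [1, 2, 1], [2, 3, 5]])

# unique rows in sorted order, plus the index of each row's first occurrence
sorted_rows, first_idx = np.unique(table, axis=0, return_index=True)

# index the original array with the sorted indices to keep the original row order
ordered_rows = table[np.sort(first_idx)]

print(sorted_rows)
print(ordered_rows)
```

In this sketch sorted_rows starts with [1 2 1], whereas ordered_rows starts with [2 3 5], matching the order of the input.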

Remove duplicate columns in Numpy

You can similarly remove duplicate columns from a 2-D Numpy array. For this, pass axis=1 to the np.unique() function.

Let’s look at an example. First, we will create a 2-D array.

# create a numpy array
ar = np.array([[1, 2, 2], [4, 1, 1], [3, 1, 1]])
# print the array
print(ar)

Output:

[[1 2 2]
 [4 1 1]
 [3 1 1]]

Here, we created a 2-D Numpy array with three rows and three columns. You can see that this array contains a duplicate column (the second and the third columns are duplicates).

Let’s now remove the duplicate columns using the np.unique() function.

# remove duplicate columns from numpy array
print(np.unique(ar, axis=1))

Output:

[[1 2]
 [4 1]
 [3 1]]

The resulting Numpy array contains only the unique columns from the original array. As with rows, the columns are returned in sorted order (compared element by element).
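One way to think about axis=1 is that it treats each column as a single item, just as axis=0 does for rows. The sketch below checks this on the same array by removing duplicate rows from the transposed array and transposing the result back. The variable names by_columns and via_transpose are made up for illustration.

```python
import numpy as np

# same array as above; the second and third columns are duplicates
ar = np.array([[1, 2, 2], [4, 1, 1], [3, 1, 1]])

# remove duplicate columns directly
by_columns = np.unique(ar, axis=1)

# equivalent view: remove duplicate rows of the transpose, then transpose back
via_transpose = np.unique(ar.T, axis=0).T

print(by_columns)
print(np.array_equal(by_columns, via_transpose))
```

Both approaches give the same two-column array for this input.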

Summary – Remove duplicates in Numpy

In this tutorial, we looked at how to remove duplicates from a Numpy array. The following is a short summary of the important points mentioned in this tutorial –

  • Use the np.unique() function to remove duplicates from a Numpy array. The unique values are returned in sorted order.
  • Pass axis=0 to the np.unique() function to remove duplicate rows from a 2-D Numpy array.
  • Pass axis=1 to the np.unique() function to remove duplicate columns from a 2-D Numpy array.

Authors

  • Piyush Raj

    Piyush is a data professional passionate about using data to understand things better and make informed decisions. He has experience working as a Data Scientist in the consulting domain and holds an engineering degree from IIT Roorkee. His hobbies include watching cricket, reading, and working on side projects.

  • Gottumukkala Sravan Kumar