# Variance of Numpy Array with NaN Values

The Numpy library in Python comes with a number of useful built-in functions for computing common descriptive statistics like mean, median, standard deviation, etc. In this tutorial, we will look at how to get the variance of values in a Numpy array containing one or more NaN values.

## Can you use the `numpy.var()` function on an array with NaN values?

We use the `numpy.var()` function to get the variance of values in a Numpy array. But what happens if the array contains one or more NaN values?

Let’s find out.

```python
import numpy as np

# create array
ar = np.array([1, 2, np.nan, 3])
# get array variance
print(np.var(ar))
```

Output:

`nan`

Here, we created a one-dimensional Numpy array containing some numbers and a NaN value. We then applied the `numpy.var()` function, which resulted in `nan`. This happened because `numpy.var()` does not skip NaN values: any arithmetic involving NaN produces NaN, so a single NaN in the array makes the computed variance NaN as well.

Thus, you cannot use the `numpy.var()` function to calculate the variance of an array with NaN values.
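The propagation can be traced step by step. The following is a minimal sketch that reproduces the variance calculation by hand; the variable names `mean` and `squared_dev` are only illustrative.

```python
import numpy as np

ar = np.array([1, 2, np.nan, 3])

# the mean is already NaN because the sum includes a NaN
mean = ar.mean()
print(mean)

# every deviation from a NaN mean is NaN, so their average is NaN too
squared_dev = (ar - mean) ** 2
print(squared_dev)
print(squared_dev.mean())
```

Every intermediate value is NaN, which is why the final result is NaN as well.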

## How to ignore NaN values when calculating the variance of a Numpy array?

You can use the `numpy.nanvar()` function to calculate the variance of a Numpy array containing NaN values. Pass the array as an argument.

The following is the syntax –

```python
# variance of array with nan values
numpy.nanvar(ar)
```

It returns the variance of the non-NaN values in the array; the NaN values are ignored.
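One way to picture what "ignoring NaN values" means is to drop them yourself with a boolean mask and then call `numpy.var()` on what remains. The sketch below assumes a small example array; `valid`, `manual`, and `builtin` are illustrative names, not part of the Numpy API.

```python
import numpy as np

ar = np.array([1, 2, np.nan, 3])

# keep only the non-NaN values with a boolean mask
valid = ar[~np.isnan(ar)]

# the variance of the remaining values matches numpy.nanvar()
manual = np.var(valid)
builtin = np.nanvar(ar)
print(manual, builtin)
print(np.isclose(manual, builtin))
```

Both approaches give the same number here; `numpy.nanvar()` is simply the more convenient option, especially when working along an axis.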

Let’s look at some examples of using the `numpy.nanvar()` function.

### Example 1 – Variance of one-dimensional array with NaN values

Let’s apply the `numpy.nanvar()` function on the same array used in the example above.

```python
import numpy as np

# create array
ar = np.array([1, 2, np.nan, 3])
# get array variance
print(np.nanvar(ar))
```

Output:

`0.6666666666666666`

We get the variance in the above array as approximately 0.67. The `numpy.nanvar()` function ignores the NaN values when computing the variance.
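To see where this number comes from: the non-NaN values are 1, 2, and 3, their mean is 2, the squared deviations are 1, 0, and 1, and their average is 2/3. Like `numpy.var()`, `numpy.nanvar()` divides by the number of values by default (`ddof=0`), counting only the non-NaN ones. If you want the sample variance instead, you can pass `ddof=1`, as in this short sketch (`pop_var` and `sample_var` are illustrative names):

```python
import numpy as np

ar = np.array([1, 2, np.nan, 3])

# population variance (default, ddof=0): divide by the number of non-NaN values
pop_var = np.nanvar(ar)

# sample variance (ddof=1): divide by the number of non-NaN values minus one
sample_var = np.nanvar(ar, ddof=1)

print(pop_var)
print(sample_var)
```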

### Example 2 – Variance of multi-dimensional array with NaN values

The `numpy.nanvar()` function is very similar to the `numpy.var()` function in its arguments. For example, use the `axis` parameter to specify the axis along which to compute the variance.

First, let’s create a 2-D Numpy array.

```python
# create 2-D numpy array
ar = np.array([[1, np.nan, 3],
               [np.nan, 5, np.nan]])
# display the array
print(ar)
```

Output:

```
[[ 1. nan  3.]
 [nan  5. nan]]
```

Here, we used the `numpy.array()` function to create a Numpy array with two rows and three columns. You can see that there are some NaN values present in the array.

If you use the Numpy `nanvar()` function on an array without specifying the axis, it will return a single value: the variance of all the non-NaN values in the array, regardless of which row or column they are in.

```python
# variance of array
print(np.nanvar(ar))
```

Output:

`2.6666666666666665`

We get the variance of all the values inside the 2-D array.

Use the `numpy.nanvar()` function with `axis=1` to get the variance for each row in the array.

```python
# variance of each row in array
print(np.nanvar(ar, axis=1))
```

Output:

`[1. 0.]`

We get the variance of each row in the above 2-D array. The variance of values in the first row is 1 and the variance of values in the second row is 0.
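As a sanity check, the row-wise result can be reproduced by looping over the rows, dropping the NaN values in each, and calling `numpy.var()` on the rest. This is only a sketch to illustrate what `axis=1` does; `row_vars` is an illustrative name.

```python
import numpy as np

ar = np.array([[1, np.nan, 3],
               [np.nan, 5, np.nan]])

# compute each row's variance by hand after dropping its NaN values
row_vars = [np.var(row[~np.isnan(row)]) for row in ar]
print(row_vars)

# the result matches numpy.nanvar() with axis=1
print(np.allclose(row_vars, np.nanvar(ar, axis=1)))
```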

Use the `numpy.nanvar()` function with `axis=0` to get the variance of each column in the array.

```python
# variance of each column in array
print(np.nanvar(ar, axis=0))
```

Output:

`[0. 0. 0.]`

We get the variance of each column in the above 2-D array. In this example, each column has one NaN value and one non-NaN value. With only a single non-NaN value left in a column, there is no spread around the mean, so the variance is 0.
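A related edge case is a row or column that contains only NaN values. There is nothing left to compute the variance from, so `numpy.nanvar()` returns NaN for that slice and emits a `RuntimeWarning`. The sketch below uses a hypothetical array, `ar_all_nan`, whose last column is entirely NaN, and silences the warning to keep the output clean.

```python
import warnings
import numpy as np

# hypothetical array whose last column contains only NaN values
ar_all_nan = np.array([[1, 2, np.nan],
                       [3, 6, np.nan]])

with warnings.catch_warnings():
    # silence the RuntimeWarning Numpy raises for the all-NaN column
    warnings.simplefilter("ignore", category=RuntimeWarning)
    col_var = np.nanvar(ar_all_nan, axis=0)

print(col_var)
```

The first two columns get ordinary variances, while the last entry of `col_var` is NaN. If that matters for your data, check the result with `numpy.isnan()` afterwards.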

## Summary – Variance of Numpy array with NaN values

The following is a short summary of the important points mentioned in this tutorial.

1. Using the `numpy.var()` function on an array with NaN values results in NaN.
2. Use the `numpy.nanvar()` function to get the variance of values in an array containing one or more NaN values. It computes the variance by taking into account only the non-NaN values in the array.
3. Similar to the `numpy.var()` function, you can specify the axis along which you want to compute the variance with the `numpy.nanvar()` function.
