I have ocean and atmospheric datasets in a netCDF file. The ocean data contains nan, or some other fill value such as -999, over land areas. For this example, say it is nan. Sample data looks like this:-
import numpy as np
ocean = np.array([[2, 4, 5], [6, np.nan, 2], [9, 3, np.nan]])
atmos = np.array([[4, 2, 5], [6, 7, 3], [8, 3, 2]])
Now I want to apply multiple conditions to the ocean and atmos data to make a new array that holds only values from 1 to 8. For example, in the ocean data, values between 2 and 4 are assigned 1 and values between 4 and 6 are assigned 2. The same kind of comparison applies to the atmos dataset as well.
To simplify the comparison and assignment operation, I made a list of bin values and used np.digitize to make categories.
bin1 = [2, 4, 6]
bin2 = [4, 6, 8]
ocean_cat = np.digitize(ocean, bin1)
atmos_cat = np.digitize(atmos, bin2)
which produces the following result:-
[[1 2 2]
[3 3 1]
[3 1 3]]
[[1 0 1]
[2 2 0]
[3 0 0]]
Next I want the element-wise maximum of the two arrays above, so I used np.fmax.
final_cat = np.fmax(ocean_cat, atmos_cat)
print(final_cat)
which produces the below result:-
[[1 2 2]
[3 3 1]
[3 1 3]]
The above result is almost what I need. The only issue is that the nan values from the ocean array are missing. What I want in the final result is:-
[[1 2 2]
[3 nan 1]
[3 1 nan]]
Can someone help me replace the values with nan at the same indices as in the original ocean array?
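A simple option would be to mask the output with numpy.where, using np.isnan(ocean) as the condition so that NaN is restored at those positions and final_cat is kept everywhere else.
If both arrays can have NaNs, build the mask as np.isnan(ocean)|np.isnan(atmos) instead, or as np.isnan(ocean)&np.isnan(atmos) if you only want a NaN when both inputs are NaN.
Output: the array requested in the question, with NaN at the two positions where ocean is NaN.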
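The numpy.where masking could be sketched roughly as follows, reusing the arrays and bins from the question. The names masked, nan_any, nan_both, masked_any and masked_both are only illustrative and do not come from the original post. Note that the result becomes a float array, because an integer array cannot hold NaN.

```python
import numpy as np

ocean = np.array([[2, 4, 5], [6, np.nan, 2], [9, 3, np.nan]])
atmos = np.array([[4, 2, 5], [6, 7, 3], [8, 3, 2]])

ocean_cat = np.digitize(ocean, [2, 4, 6])
atmos_cat = np.digitize(atmos, [4, 6, 8])
final_cat = np.fmax(ocean_cat, atmos_cat)

# Put NaN back wherever the original ocean value was NaN.
masked = np.where(np.isnan(ocean), np.nan, final_cat)

# Illustrative variants for the case where atmos may also contain NaNs.
nan_any = np.isnan(ocean) | np.isnan(atmos)    # NaN if either input is NaN
nan_both = np.isnan(ocean) & np.isnan(atmos)   # NaN only if both inputs are NaN
masked_any = np.where(nan_any, np.nan, final_cat)
masked_both = np.where(nan_both, np.nan, final_cat)

print(masked)
```

With the sample data, atmos has no NaNs, so masked_any matches masked and masked_both contains no NaN at all.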
Generic approach for any number of input arrays:
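One way this could look is sketched below. The helper name max_category and its signature are hypothetical, chosen only for illustration; it assumes every input array has the same shape, that each array comes with its own list of bin edges, and that a NaN in any input should produce a NaN in the result.

```python
import numpy as np

def max_category(arrays, bins):
    """Hypothetical helper: element-wise maximum of np.digitize categories,
    with NaN wherever any of the input arrays is NaN."""
    # One category array per input, each using its own bin edges.
    cats = [np.digitize(a, b) for a, b in zip(arrays, bins)]
    # True where at least one input is NaN.
    nan_mask = np.logical_or.reduce([np.isnan(a) for a in arrays])
    # Element-wise maximum across all category arrays.
    combined = np.fmax.reduce(cats)
    return np.where(nan_mask, np.nan, combined)

ocean = np.array([[2, 4, 5], [6, np.nan, 2], [9, 3, np.nan]])
atmos = np.array([[4, 2, 5], [6, 7, 3], [8, 3, 2]])

result = max_category([ocean, atmos], [[2, 4, 6], [4, 6, 8]])
print(result)
```

To require NaN in all inputs rather than in any of them, np.logical_or.reduce could be swapped for np.logical_and.reduce.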