I'm very new to Python but I'm trying to produce a 2D Gaussian fit for some data. Specifically, stellar fluxes linked to certain positions in a coordinate system/grid. However not all of the positions in my grid have corresponding flux values. I don't really want to set these values to zero in case it biases my fit, but I can't seem to set them to nan
and still get my Gaussian fit to work. This is the code I'm using (modified slightly from here):
import numpy
import scipy
from numpy import *
from scipy import optimize
def gaussian(height, center_x, center_y, width_x, width_y):
width_x = float(width_x)
width_y = float(width_y)
return lambda x,y: height*exp(-(((center_x-x)/width_x)**2+((center_y-y)/width_y)**2)/2)
def moments(data):
total = nansum(data)
X, Y = indices(data.shape)
center_x = nansum(X*data)/total
center_y = nansum(Y*data)/total
row = data[int(center_x), :]
col = data[:, int(center_y)]
width_x = nansum(sqrt(abs((arange(col.size)-center_y)**2*col))/nansum(col))
width_y = nansum(sqrt(abs((arange(row.size)-center_x)**2*row))/nansum(row))
height = nanmax(data)
return height, center_x, center_y, width_x, width_y
def fitgaussian(data):
params = moments(data)
errorfunction = lambda p: ravel(gaussian(*p)(*indices(data.shape)) - data)
p, success = optimize.leastsq(errorfunction, params)
return p
parameters = fitgaussian(data)
fit = gaussian(*parameters)
My flux values are in a 2D array called data
. The code works if I have 0
instead of nan
values in this array, but otherwise my parameters
always come out as [nan nan nan nan nan]
. If there's a way to fix this, I would really appreciate your insight! The more detailed the explanation, the better. Thanks in advance!
The obvious thing to do is remove the NaNs from
data
. Doing so, however, also requires that the corresponding positions in the 2DX
,Y
location arrays also be removed:Now you can use
optimize.leastsq
(or the newer, simpleroptimize.curve_fit
) to fit the data to the model function:For example, if we generate some random
data
with NaNsso that
looks like
with the white spots showing where there are NaN values, then
yields