Trying to fit a sine function to phased light curve

2.1k views Asked by At
import numpy as np
import matplotlib.pyplot as plt
from lmfit import Model,Parameters


f2= "KELT_N16_lc_006261_V01_west_tfa.dat"    
t2="TIMES"   # file name

NewData2 = np.loadtxt(t2, dtype=float, unpack=True)
NewData = np.loadtxt(f2,dtype=float, unpack=True, usecols=(1,))

flux = NewData   
time= NewData2

new_flux=np.hstack([flux,flux])

# fold
period = 2.0232               # period (must be known already!)

foldTimes = ((time)/ period)  # divide by period to convert to phase
foldTimes = foldTimes % 1   # take fractional part of phase only (i.e. discard whole number part)


new_phase=np.hstack([foldTimes+1,foldTimes])

print len(new_flux)
print len(new_phase)


def Wave(x, new_flux,new_phase):
    wave = new_flux*np.sin(new_phase+x)
    return wave
model = Model(Wave)
print "Independent Vars:", model.independent_vars
print "Parameters:",model.param_names
p = Parameters()
p.add_many(('new_flux',13.42, True, None, None, None) )   
p.add_many(('new_phase',0,True, None, None, None) )   

result=model.fit(new_flux,x=new_phase,params=p,weights= None)


plt.scatter(new_phase,new_flux,marker='o',edgecolors='none',color='blue',s=5.0, label="Period: 2.0232  days")   
plt.ylim([13.42,13.54])
plt.xlim(0,2)
plt.gca().invert_yaxis()
plt.title('HD 240121 Light Curve with BJD Correction')
plt.ylabel('KELT Instrumental Magnitude')
plt.xlabel('Phase')
legend = plt.legend(loc='lower right', shadow=True)
plt.scatter(new_phase,result.best_fit,label="One Oscillation Fit", color='red',s=60.0)
plt.savefig('NewEpoch.png')
print result.fit_report()

I am trying to fit a sine function to phased light curve data for a research project. However, I am unsure as to where I am going wrong, and I believe it lays in my parameters. It appears that the fit has an amplitude that is too high, and a period that is too long. Any help would be appreciated. Thank you!

This is what the graph looks like now (Attempt at fitting a sine function to my dataset):

img

2

There are 2 answers

0
Spektre On

How could we help you with your uncommented code?

  • How do we know what is what and what should it do?
  • What method for fitting are you using?
  • Where is the data and in what form ?

I would start with computing the approximate sin wave parameters. Let assume you got some input data in form of n points with x,y coordinates. And want to fit a sin wave:

y(t) = y0+A*sin(x0+x(t)*f)

Where y0 is the y offset, x0 is phase offset, A is amplitude and f is angular frequency.

I would:

  1. Compute avg y value

    y0 = sum(data[i].y)/n where i={0,1,2,...n-1}
    

    this is the mean value representing possible y offset y0 of your sin wave.

  2. compute avg distance to y0

    d = sum (|data[i].y-y0|)/n where i={0,1,2,...n-1}
    

    If my memory serves well this should be the effective value of amplitude so:

    A = sqrt(2)*d
    
  3. find zero crossings in the dataset

    for this the dataset should be sorted by x so sort it if it is not. Remember index i for: first crossing i0, last crossing i1 and number of crossings found j from this we can estimate frequency and phase offset:

    f=M_PI*double(j-1)/(datax[i1]-datax[i0]);
    x0=-datax[i0]*f;
    

    To determine which half sin wave we aligned to just check the sign of middle point between first two zero crossings

    i1=i0+((i1-i0)/(j-1));
    if (datay[(i0+i1)>>1]<=y0) x0+=M_PI;
    

    Or check for specific zero crossing pattern instead.

    That is all now we have approximate x0,y0,f,A parametters of sinwave.

Here some C++ code I tested with (sorry I do not use Python):

//---------------------------------------------------------------------------
#include <math.h>
// input data
const int n=5000;
double datax[n];
double datay[n];
// fitted sin wave
double A,x0,y0,f;
//---------------------------------------------------------------------------
void data_generate() // genere random noisy sinvawe
    {
    int i;
    double A=150.0,x0=250.0,y0=200.0,f=0.03,r=20.0;
    double x,y;
    Randomize();
    for (i=0;i<n;i++)
        {
        x=800.0*double(i)/double(n);
        y=y0+A*sin(x0+x*f);
        datax[i]=x+r*Random();
        datay[i]=y+r*Random();
        }
    }
//---------------------------------------------------------------------------
void data_fit() // find raw approximate of x0,y0,f,A
    {
    int i,j,e,i0,i1;
    double x,y,q0,q1;
    // y0 = avg(y)
    for (y0=0.0,i=0;i<n;i++) y0+=datay[i]; y0/=double(n);
    // A = avg(|y-y0|)
    for (A=0.0,i=0;i<n;i++) A+=fabs(datay[i]-y0); A/=double(n); A*=sqrt(2.0);
    // bubble sort data by x asc
    for (e=1,j=n;e;j--)
     for (e=0,i=1;i<j;i++)
      if (datax[i-1]>datax[i])
        {
        x=datax[i-1]; datax[i-1]=datax[i]; datax[i]=x;
        y=datay[i-1]; datay[i-1]=datay[i]; datay[i]=y;
        e=1;
        }
    // find zero crossings
    for (i=0,j=0;i<n;)
        {
        // find value below zero
        for (;i<n;i++) if (datay[i]-y0<=-0.75*A) break; e=i;
        // find value above zero
        for (;i<n;i++) if (datay[i]-y0>=+0.75*A) break;
        if (i>=n) break;
        // find point closest to zero
        for (i1=e;e<i;e++)
         if (fabs(datay[i1]-y0)>fabs(datay[e]-y0)) i1=e;
        if (!j) i0=i1; j++;
        }
    f=2.0*M_PI*double(j-1)/(datax[i1]-datax[i0]);
    x0=-datax[i0]*f;
    }
//---------------------------------------------------------------------------

And preview:

view

The dots are generated noisy data and blue curve is fitted sin wave.

On top of all this you can build your fitting to increase precision. Does not matter which method you will use for the search around found parameters. For example I would go for:

0
M Newville On

A couple of comments/suggestions:

First, it is almost certainly better to replace

p = Parameters()
p.add_many(('new_flux',13.42, True, None, None, None) )
p.add_many(('new_phase',0,True, None, None, None) )

with

p = Parameters()
p.add('new_flux', value=13.42, vary=True)
p.add('new_phase', value=0, vary=True)

Second, your model does not include a DC offset, but your data clearly has one. The offset is approximately 13.4 and the amplitude of the sine wave is approximately 0.05. While you're at it, you probably want to include a scale the phase as a well as an offset, so that the model is

offset + amplitude * sin(scale*x + phase_shift)

You don't necessarily have to vary all of those, but making your model more general will allow to see how the phase shift and scale are correlated -- given the noise level in your data, that might be important.

With the more general model, you can try a few sets of parameter values, using model.eval() to evaluate a model with a set of Parameters. Once you have a better model and reasonable starting points, you should get a reasonable fit.