mutiple Gaussian Curves to fit scipy optimize curve_fit ,

59 views Asked by At

Refer, https://github.com/shuyu-wang/DSC_analysis_peak_separation/tree/main/DSC-Automatic%20multimodal%20decomposition

Consider X is a sum of Multiple Gaussin Function, and if we have to curve fit, I saw a solution for Two Gaussin Funtion as ,

def func2(x, a1, a2, m1, m2, s1, s2):
    return a1*np.exp(-((x-m1)/s1)**2) + a2*np.exp(-((x-m2)/s2)**2)

This func2 was used to with

# set parammers
AmpMin = 0.01
AmpMax = 1000
CenMin = min(x)
CenMax = max(x)
WidMin = 0.1
WidMax = 100

popt, pcov = curve_fit(func2, x, y, 
                       bounds=([AmpMin,AmpMin, CenMin,CenMin, WidMin,WidMin],
                               [AmpMax,AmpMax, CenMax,CenMax, WidMax,WidMax]))

would like to have a parameter of int Cluster. Cluster = 2, the above solution works

but generalizing def func2 to def funcn , I would like to

def fun1(x,a,m,s):
    return a*np.exp(-((x-m)/s)**2)

def funcn(x, as[],ms[],ss[]):
    accum = fun1(x,as[0],ms[0],ss[0])
    for i in range(1,len(as)):
        accum = accum + fun1(x,as[i],ms[i],ss[i])
    return accum

Now, how to pass funcn to curve_fit(funcn, x, y, bounds=# ? )

1

There are 1 answers

0
joni On

Your last code snippet is not valid Python due to the function parameters. One way to generalise your approach is the following:

from itertools import batched # Python3.12
from typing import Iterable

import matplotlib.pyplot as plt
import numpy as np
from scipy.optimize import curve_fit
 
def gaussian(x: Iterable[float], *params) -> Iterable[float]:
    a, m, s = params
    return a * np.exp(-(((x - m) / s) ** 2))


def funcN(x: Iterable[float], *all_params) -> Iterable[float]:
    result = np.zeros_like(x)
    for params in batched(all_params, 3):
        result += gaussian(x, *params)
    return result

Here, we assume that all unknown function coefficients (parameters) all_params are sorted like this (a0, m0, s0, a1, m1, s1, a2, m2, s2, ...), i.e. the first three parameters belong to the first function, the next three to the second function and so on. Hence, we need to apply the same ordering when defining the lower and upper bounds.

Here's minimal reproducible example for n = 2 and some random dummy data which illustrates this approach:

def main():
    # create some random dummy data
    true_coeffs = [0.05, 2, 1.0, 1.0, 3.0, 3.0]
    xdata = np.arange(-10, 10, 0.01)
    ydata = funcN(xdata, *true_coeffs) + 0.2 * np.random.rand(len(xdata))

    # set parameters
    AmpMin = 0.01
    AmpMax = 1000
    CenMin = min(xdata)
    CenMax = max(xdata)
    WidMin = 0.1
    WidMax = 100

    # coefficient / parameters bound
    lb = [AmpMin, AmpMin, CenMin, CenMin, WidMin, WidMin]
    ub = [AmpMax, AmpMax, CenMax, CenMax, WidMax, WidMax]

    # init guess
    p0 = np.ones(len(ub))

    # let's do a least-squares regression
    popt, pcov = curve_fit(funcN, xdata, ydata, p0=p0, bounds=(lb, ub))

    # plot the function
    plt.plot(xdata, ydata, "r.")
    plt.plot(xdata, funcN(xdata, *popt))
    plt.show()


if __name__ == "__main__":
    main()