I have a dataset from which I have generated graphs. I am able to extract peaks from these graph which are above a threshold using scipy
. I am trying to create a dataframe which contains peak features like peak value, peak width, peak height, slope of the curve that contains the peak, the number of points in the curve that contains the peak etc. I am struggling to find a way to extract the slope and number of points in the curve that contain peaks.
c_dict["L-04"][3][0]
data is present in the paste bin link.
This is the code that I have tried for extracting some of the peak features.
def extract_peak_features(c_dict,households):
peak_list=[]
width_list=[]
half_width_list=[]
smoke_list=[]
house_list=[]
for key,value in c_dict.items():
if not key.startswith("L-01") and not key.startswith("H"):
for k,v in value.items():
if k==3:
if len(v) > 0:
if key in households:
smoking = 1
else:
smoking = 0
peaks, _ = find_peaks(v[0],prominence=50)
half_widths = peak_widths(v[0], peaks, rel_height=0.5)[0]
widths = peak_widths(v[0], peaks, rel_height=1)[0]
if len(peaks) > 0:
peak_list.extend(np.array(v[0])[peaks])
width_list.extend(widths)
half_width_list.extend(half_widths)
smoke_list.extend([smoking] * len(peaks))
house_list.extend([key] * len(peaks))
print(key,len(peaks),len(widths),len(half_widths))
data = {"ID":house_list,"peaks":peak_list,"width":width_list,"half_width":half_width_list,"smoke":smoke_list}
df_peak_stats = pd.DataFrame(data=data)
return df_peak_stats
df_peak_stats = extract_peak_features(c_dict,households)
A code for plotting c_dict["L-04"][3][0]
data using scipy
and matplotlib
.
peaks, _ = find_peaks(c_dict["L-04"][3][0],prominence=50)
results_half = peak_widths(c_dict["L-04"][3][0], peaks, rel_height=0.5)
results_half[0] # widths
results_full = peak_widths(c_dict["L-04"][3][0], peaks, rel_height=1)
plt.plot(c_dict["L-04"][3][0])
plt.plot(peaks, np.array(c_dict["L-04"][3][0])[peaks], "x")
#plt.hlines(*results_half[1:], color="C2")
plt.hlines(*results_full[1:], color="C3")
plt.show()
In summary, I want to know how to extract the slope and number of points in the 4 curves above that contain the peaks.
Because the peaks in your data are localized, I created 4 subplots for each of the four peaks.
in which
test
is the array in your post. The code above basically locates the peaks in the array, in order to find the two associated points you want:based on this post, you can use
kneed
.