Get sparse region of KDE

Question

173 views Asked by Ali Pardhan At 06 October 2020 at 07:06

I have an array of 20k real numbers, and I use pd.DataFrame(scores).plot.kde(figsize=(24,8)) to get the below kernel density estimation. How can I purely programmatically select the indexes of the sparse regions, or conversely the dense region?

My current approach is of the form np.where(scores > np.percentile(scores, 99))[0]. I am wary of hard-coding the 99, as it may not work well in production. A potential solution, which I'm not sure how to approach, is selecting the indices where the density is below 20,000.

There is 1 answer

**JohanC** · Accepted Answer · 2020-10-06T11:31:08+00:00

Which region to consider "sparse" and which "dense" can be very subjective. It also depends heavily on what the data means. One idea is to decide upon some cut-off percentiles. The example below marks the 0.1th and 99.9th percentiles.

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# synthetic data: 10 random walks of 2000 steps each, flattened to 20,000 scores
df = pd.DataFrame({'score': np.random.randn(2000, 10).cumsum(axis=0).ravel()})
# cut-offs at the 0.1th and 99.9th percentiles
low, high = df['score'].quantile([.001, .999])
ax = df.plot.kde(figsize=(24, 8))
ax.axvline(low, color='crimson', ls=':')
ax.axvline(high, color='crimson', ls=':')
ax.set_ylim(bottom=0)  # avoid the kde "floating in the air"
plt.show()
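
The plot only marks the cut-offs. To actually get the indices, something along the following lines might work. This is a minimal sketch rather than a definitive solution: the synthetic `scores` array, the 512-point grid, and the `rel_cutoff` value of 5 % of the peak density are all assumptions made for illustration. The second half goes in the direction of the density threshold mentioned in the question, using `scipy.stats.gaussian_kde` (which, as far as I know, is what pandas uses under the hood for `plot.kde`).

```python
import numpy as np
from scipy.stats import gaussian_kde

# Synthetic stand-in for the real data (assumption): 20,000 scores
rng = np.random.default_rng(0)
scores = rng.standard_normal((2000, 10)).cumsum(axis=0).ravel()

# 1) Percentile cut-offs: indices outside / inside the 0.1th..99.9th percentile band
low, high = np.percentile(scores, [0.1, 99.9])
sparse_idx = np.where((scores < low) | (scores > high))[0]
dense_idx = np.where((scores >= low) & (scores <= high))[0]

# 2) Density cut-off: evaluate the KDE on a coarse grid, then interpolate
#    to every score (cheaper than evaluating the KDE at all 20,000 points)
kde = gaussian_kde(scores)
grid = np.linspace(scores.min(), scores.max(), 512)  # grid size is an arbitrary choice
density = np.interp(scores, grid, kde(grid))

rel_cutoff = 0.05  # hypothetical: "sparse" means below 5 % of the peak density
density_cutoff = rel_cutoff * density.max()
low_density_idx = np.where(density < density_cutoff)[0]

print(len(sparse_idx), len(dense_idx), len(low_density_idx))
```

Expressing the threshold relative to the peak density avoids a hard-coded absolute density value, but the fraction itself is still a judgment call that depends on the data.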
