Sampling from a custom empirical CDF

I have a CDF which I have defined:

cdf = pd.DataFrame.from_dict({'x':[10e6,20e6,50e6,100e6,250e6],'cdf':[0.4,0.6,0.7,0.8,1]})

I want to draw 10,000 samples from this cdf by two methods:

  1. Directly using the cdf without any smoothing
  2. Directly using the cdf but smoothing between points (e.g. using a spline)

I can probably do (1) by hand by sampling from a uniform distribution, but I'm not sure how to do (2) manually.

Is there any package that can help with sampling from a given CDF?

There is 1 answer

Answered by linpingta

np.interp and scipy.interpolate.interp1d may help.

An example:

import pandas as pd
import numpy as np
from scipy.interpolate import interp1d

cdf = pd.DataFrame.from_dict(
    {'x':[10e6,20e6,50e6,100e6,250e6],
     'cdf':[0.4,0.6,0.7,0.8,1]
})

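# Inverse-transform sampling: draw uniforms, then map them through the inverse CDF
u = np.random.uniform(0, 1, 10000)
# np.interp interpolates linearly between the CDF points;
# uniforms below the first CDF value (0.4) map to the smallest x
samples = np.interp(u, cdf['cdf'], cdf['x'])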
print(samples)

The output looks like this (exact values vary from run to run):

array([1.12425511e+08, 7.90655259e+07, 8.86092443e+07, ...,
       3.05881874e+07, 2.35671711e+08, 1.00000000e+07])
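
Note that np.interp interpolates linearly between the CDF points, so it already applies some smoothing. For method 1 (no smoothing at all), one possible approach is to treat the CDF as a step function and return, for each uniform draw, the first x whose CDF value is at least as large as the draw. The sketch below assumes that interpretation; the names x, cdf_vals, rng and step_samples are illustrative, and the seed is arbitrary.

```python
import numpy as np

# CDF points from the question
x = np.array([10e6, 20e6, 50e6, 100e6, 250e6])
cdf_vals = np.array([0.4, 0.6, 0.7, 0.8, 1.0])

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
u = rng.uniform(0, 1, 10_000)   # uniform draws in [0, 1)

# Step-function inverse: index of the first CDF value that is >= u
idx = np.searchsorted(cdf_vals, u, side="left")
step_samples = x[idx]           # every sample is one of the original x values
```

With this approach roughly 40% of the samples should equal 10e6, since the CDF is already 0.4 at the first point.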

For spline-based smoothing, see scipy.interpolate.interp1d: its "kind" parameter (for example kind='quadratic' or kind='cubic') selects the interpolation order and may be used for this scenario.
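
As a rough sketch of method 2, the inverse CDF can be interpolated with a smooth curve instead of straight lines. The example below is an assumption about how one might set this up, not the only way: it clips the uniform draws to the range covered by the CDF points, then evaluates both interp1d with kind="cubic" and, as a swapped-in alternative, PchipInterpolator, which preserves monotonicity. A plain cubic spline may overshoot between points, so the result is worth checking before use. All variable names are illustrative.

```python
import numpy as np
from scipy.interpolate import interp1d, PchipInterpolator

# CDF points from the question
x = np.array([10e6, 20e6, 50e6, 100e6, 250e6])
cdf_vals = np.array([0.4, 0.6, 0.7, 0.8, 1.0])

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
u = rng.uniform(0, 1, 10_000)

# Draws below the first CDF value (0.4) are clamped so they map to the smallest x
u_clipped = np.clip(u, cdf_vals[0], cdf_vals[-1])

# Cubic spline through the inverse CDF (x as a function of the CDF value);
# it is smooth but not guaranteed to be monotone
inv_cubic = interp1d(cdf_vals, x, kind="cubic")
cubic_samples = inv_cubic(u_clipped)

# Monotone alternative: piecewise cubic Hermite interpolation (PCHIP)
inv_pchip = PchipInterpolator(cdf_vals, x)
pchip_samples = inv_pchip(u_clipped)
```

Comparing a histogram of cubic_samples against pchip_samples is a quick way to see whether the cubic spline produces values outside the intended range.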