Using a CDF to calculate the number of occurrences from a Monte Carlo simulation


I've run a small Monte Carlo simulation of 800 runs using Python. One of the outputs of the simulation is a max_usage Excel file. The file gives me the "requirement," the number of times the asset got called within the 800 runs, and then the number of occurrences for each of the four categories (critical, essential, improving, and enhancing), as shown in the example below.

requirem  critical  essential  improving  enhancing
--------  --------  ---------  ---------  ---------
6         23        0          0          0
7         232       0          0          0
8         333       0          0          0
9         161       0          0          0
10        37        0          0          0
11        8         0          0          0
12        5         0          0          0
18        0         9          4          4
19        0         58         49         49
20        0         160        149        149
21        0         216        198        198
22        0         174        192        192
23        0         119        130        130
24        0         37         44         44
25        0         17         24         24
26        0         9          9          9
27        0         1          1          1

What I am attempting is to run through each of the assets and tables and find where a specified rate is reached for each category. In other words, I want to write a script that calculates the "requirement" needed for each asset at a rate of 0.875 for each of the four categories. For this example, critical would be 9, and essential, improving, and enhancing would be 23. I am trying to use the CDF (cumulative distribution function) to calculate this number, and also to produce plots with all four categories assessed and a dotted line running along the rate.
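To make the target concrete, here is a minimal sketch of that calculation using only the standard library. It assumes each column holds the number of runs per requirement, and that the value wanted is the smallest requirement whose cumulative share of that category's runs reaches the rate. The helper name `requirement_at_rate` is hypothetical, and the data is typed in from the example table rather than read from the Excel file.

```python
from itertools import accumulate

def requirement_at_rate(requirements, counts, rate=0.875):
    """Smallest requirement whose cumulative share of runs reaches `rate`.

    Hypothetical helper: assumes `counts[i]` is the number of runs
    that needed `requirements[i]`.
    """
    total = sum(counts)
    if total == 0:
        return None  # this category never occurred
    # Sort by requirement so the running sum goes from low to high
    pairs = sorted(zip(requirements, counts))
    running = accumulate(count for _, count in pairs)
    for (req, _), cum in zip(pairs, running):
        if cum / total >= rate:
            return req
    return pairs[-1][0]

# Data copied from the example table
requirements = [6, 7, 8, 9, 10, 11, 12, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27]
table = {
    "critical":  [23, 232, 333, 161, 37, 8, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    "essential": [0, 0, 0, 0, 0, 0, 0, 9, 58, 160, 216, 174, 119, 37, 17, 9, 1],
    "improving": [0, 0, 0, 0, 0, 0, 0, 4, 49, 149, 198, 192, 130, 44, 24, 9, 1],
    "enhancing": [0, 0, 0, 0, 0, 0, 0, 4, 49, 149, 198, 192, 130, 44, 24, 9, 1],
}

result = {name: requirement_at_rate(requirements, counts)
          for name, counts in table.items()}
print(result)
# {'critical': 9, 'essential': 23, 'improving': 23, 'enhancing': 23}
```

With a DataFrame, the same helper could presumably be fed `df[col].tolist()` for each category column, with the requirement column as the first argument.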

I found the code below in another post, but it doesn't quite fit the bill.

Firstly, it doesn't seem to be cumulative: the plot only goes to 300 when it should go to 800.

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from itertools import accumulate

df = pd.read_excel("asset_1.xlsx")

# START A PLOT
fig,ax = plt.subplots()

for col in df.columns:

    # SKIP IF IT HAS ANY INFINITE VALUES
    if not all(np.isfinite(df[col].values)):
        continue

    # USE numpy's HISTOGRAM FUNCTION TO COMPUTE BINS
    xh, xb = np.histogram(df[col], bins=20, density=True)

    # COMPUTE THE CUMULATIVE SUM WITH accumulate
    xh = list(accumulate(xh))
    # NORMALIZE THE RESULT
    xh = np.array(xh) / max(xh)

    plt.axhline(0.875, linestyle="--")

    # PLOT WITH LABEL
    ax.plot(xb[1:], xh, label=f"$CDF$({col})")
ax.legend()
plt.title("CDFs of Columns")
plt.show()
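
One likely reason the curve stops near 300, assuming asset_1.xlsx holds the table above: `np.histogram(df[col], ...)` bins the values stored in each column (the counts, which peak at 333), so the x-axis shows counts rather than requirements. A minimal sketch of the alternative is to accumulate the counts directly, so the running total reaches the number of runs. The helper name `cumulative_fractions` is hypothetical, and the data is the essential column typed in from the example.

```python
from itertools import accumulate

def cumulative_fractions(counts):
    """Running share of all runs, in the order the counts are given.

    Hypothetical helper: assumes the counts are already sorted by requirement.
    """
    total = sum(counts)
    return [cum / total for cum in accumulate(counts)]

# Essential column from the example table (rows with a zero count omitted)
requirements = [18, 19, 20, 21, 22, 23, 24, 25, 26, 27]
essential = [9, 58, 160, 216, 174, 119, 37, 17, 9, 1]

print(sum(essential))  # total number of runs in this category

fractions = cumulative_fractions(essential)
for req, frac in zip(requirements, fractions):
    print(req, round(frac, 3))
```

These pairs could then be drawn with `ax.step(requirements, fractions, where="post")`, with a single `ax.axhline(0.875, linestyle="--")` placed outside the loop for the rate line.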
