I have four sets of random data samples generated by four dependent random variables, and I want to fit a copula to them. For this, I need the probability integral transform (PIT). One way is to estimate the marginal CDF of each random variable from the data and then apply the PIT to obtain marginals that are uniformly distributed on the interval [0, 1].

But is there a single command in R that does this? I believe R has such a function, pit, in the isodistrreg package, but I need help understanding its arguments.

pit(predictions, y, randomize = TRUE, seed = NULL)

Why does one need to supply both predictions and the numeric vector y? Why can't I pass only the generated random data, i.e., y, and get its probability integral transform (PIT)?

Am I missing something? Where is the gap in my understanding?

PS: In the absence of a consolidated command, I am using this method:

Step 1: Estimate the CDF from the data. I am using the ecdf function for this.

Step 2: Generate uniform random numbers in the range [0, 1].

Step 3: Apply the inverse of this CDF to the generated uniform random samples. For this I use the quantile function.

But I want to apply the quantile function to my estimated CDF, i.e., to the output of ecdf, and not to the raw data, because the raw data contain missing and NaN entries, which gives this error:

Error in quantile.default(matrixdata[, 1], unigf) : missing values and NaN's not allowed if 'na.rm' is FALSE

I can set na.rm = TRUE, but I would like to know about other options.
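
For reference, the three steps above can be sketched in base R roughly as follows. This is a minimal sketch on toy data: the vector x is a hypothetical stand-in for matrixdata[, 1], and unigf is the name taken from the error message. It relies on ecdf sorting its input, which drops NA and NaN entries, and on quantile with type = 1, which is the inverse of the empirical CDF.

```r
# Toy data with missing values (hypothetical stand-in for matrixdata[, 1])
set.seed(1)
x <- c(rnorm(50), NA, NaN)

# Step 1: empirical CDF; ecdf() sorts its input, which drops NA and NaN
Fn <- ecdf(x)

# PIT of the observed data: values in (0, 1], NA wherever x is missing
u_obs <- Fn(x)

# Step 2: uniform random numbers on [0, 1]
unigf <- runif(1000)

# Step 3: invert the empirical CDF; type = 1 is the inverse-ECDF definition,
# and na.rm = TRUE drops the same entries that ecdf() dropped
x_new <- quantile(x, probs = unigf, na.rm = TRUE, type = 1, names = FALSE)
```

With na.rm = TRUE and type = 1, quantile works on the same non-missing values that ecdf used, so x_new contains only observed data values.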

Dr. Alenezi On

Working with copula data takes three steps.

  • First, transform the margins to uniform ones. You can either estimate the marginal distributions and apply their CDFs (this is known as the two-stage estimation method), or apply the empirical probability integral transform using the pobs function from the copula or VineCopula packages.

  • Fit the copula model to the copula data (from step 1).

  • Then, you can return to the original scale of the data using the inverse method, i.e., by applying the inverse marginal CDFs (quantile functions).
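
The three steps might look like this with the copula package. This is only a sketch under stated assumptions: the matrix dat is simulated stand-in data, the Gaussian copula is an arbitrary choice for illustration, and rows with missing values are simply dropped before the transform.

```r
library(copula)

# Hypothetical stand-in for the four dependent samples (replace with real data)
set.seed(42)
dat <- matrix(rnorm(400), ncol = 4) %*% chol(0.5 + 0.5 * diag(4))
dat[3, 2] <- NA  # one missing entry, as in the question

# Step 1: pseudo-observations, i.e. the rank-based empirical PIT per column
dat_cc <- dat[complete.cases(dat), ]  # assumption: drop incomplete rows
u <- pobs(dat_cc)

# Step 2: fit a copula to the copula data (Gaussian family assumed here)
fit <- fitCopula(normalCopula(dim = 4, dispstr = "un"), data = u, method = "mpl")

# Step 3: simulate from the fitted copula and map back to the data scale
# with the empirical quantile function of each margin
u_sim <- rCopula(500, fit@copula)
x_sim <- sapply(1:4, function(j) {
  quantile(dat[, j], probs = u_sim[, j], na.rm = TRUE, names = FALSE)
})
```

Here pobs takes only the data, which is what the question asks for; no separate predictions argument is involved.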

If you need further help, please provide your complete R code.