I would like to estimate power of the following problem. I am interested in comparing two groups that both follow Weibull distribution. So, group A has two parameters (shape par = a1,scale par = b1) and two parameters has group B (a2, b2). By simulating random variables from distribution of interest (for example assuming different scale and shape parameters, i.e. a1=1.5*a2, and b1=b2*0.5; or either differences between groups are just in either shape or scale parameters), apply log-likelihood ratio test to test if a1=a2 and b1=b2 (or e.g. a1=a1, when we know that b1=b2), and estimate power of the test.
The questions would be what are log-likelihoods for the full models, and how to code it in R when a)having exact data, and b) for interval-censored data ?
That is, for reduced model (when a1=a2,b1=b2) log-likelihoods for exact and interval-censored data are:
LL.reduced.exact <- function(par,data){sum(log(dweibull(data,shape=par[1],scale=par[2])))};
LL.reduced.interval.censored<-function(par, data.lower, data.upper) {sum(log((1-pweibull(data.lower, par[1], par[2])) – (1-pweibull(data.upper, par[1],par[2]))))}
What is it for the full model, when a1!=a2, b1!=b2, taking into account two different observational schemes, i.e. when 4 parameters have to be estimated (or, in case when interested in looking at diferences in shape parameters, 3 parameters have to be estimated)?
Is it possible to estimate it buy building two log-likelihoods for separate groups and add it together (i.e.LL.full<-LL.group1+LL.group2)?
Regarding log-likelihood for interval-censored data, censoring is non-informative and all observations are interval-censored. Any better ideas how to perform this task will be appreciated.
Please, find the R Code for exact data below to illustrate the problem. Thank you very much in advance.
R Code:
# n (sample size) = 500
# sim (number of simulations) = 1000
# alpha = .05
# Parameters of Weibull distributions:
#group 1: a1=1, b1=20
#group 2: a2=1*1.5 b2=b1
n=500
sim=1000
alpha=.05
a1=1
b1=20
a2=a1*1.5
b2=b1
#OR: a1=1, b1=20, a2=a1*1.5, b2=b1*0.5
# the main question is how to build this log-likelihood model, when a1!=a2, and b1=b2
# (or a1!=a2, and b1!=b2)
LL.full<-?????
LL.reduced <- function(par,data){sum(log(dweibull(data,shape=par[1],scale=par[2])))}
LR.test<-function(red,full,df) {
lrt<-(-2)*(red-full)
pvalue<-1-pchisq(lrt,df)
return(data.frame(lrt,pvalue))
}
rejections<-NULL
for (i in 1:sim) {
RV1<-rweibull (n, a1, b1)
RV2<-rweibull (n, a2, b2)
RV.Total<-c(RV1, RV2)
par.start<-c(1, 15)
mle.full<- ????????????
mle.reduced<-optim(par.start, LL, data=RV.Total, control=list(fnscale=-1))
LL.full<-?????
LL.reduced<-mle.reduced$value
LRT<-LR.test(LL.reduced, LL.full, 1)
rejections1<-ifelse(LRT$pvalue<alpha,1,0)
rejections<-c(rejections, rejections1)
}
table(rejections)
sum(table(rejections)[[2]])/sim # estimated power
Yes, you can sum the log-likelihoods for the two groups (if they were calculated separately). Just like you would sum the log-likelihoods for a vector of observations, where each observation has different generative parameters.
I prefer to think in terms of one large vector (ie, of the shape parameter) which contains values that vary according to the structure of the covariates (ie, group membership). In a linear model context, this vector could equal the linear predictor (once appropriately transformed by link function): the dot product of the design matrix and the vector of regression coefficients.
Here is a (non-functionalized) example:
To write a log-likelihood function to find the MLE of a Weibull model where the shape parameter(s) are some linear function of covariates, you could use the same approach:
Then your power simulation might look like this:
An identity link works fine in the above example, but for more complex models some sort of transformation might be required.
And you don't have to use linear algebra in the log-likelihood function -- obviously, you can construct the vector of shapes in any way you see fit (as long as you explicitly index the appropriate generative parameters in the vector
par
).Interval-censored data
The cumulative distribution function F(T) of the Weibull distribution (
pweibull
in R) gives the probability of failure before time T. So, if an observation is interval censored between times T[0] and T[1], the probability that the object fails between T[0] and T[1] is F(T[1]) - F(T[0]) : the probability that the object failed before T[1] minus the probability that it failed before T[0] (the integral of the PDF between T[0] and T[1]). Because the Weibull CDF is already implemented in R it is trivial to modify the likelihood function above:Or if you don't want to use a model matrix, etc., and just restrict yourself to indexing the shape parameter vector by groups, you could do something like:
Test that both functions give the same solution: