calculate p value for ROC AUC by subgroups


I'm trying to see if there is a statistically significant difference in ROC AUC between subgroups in SAS. I can't use roccontrast because I only have one model. Here is my code:

 proc logistic data=data plots(only)=roc;
         by Race;
         model htn (event='1') = bmi ;
         ods output roccurve=ROCdata;
         run;

Race has 4 categories: hispanic, black, white and other

I tried roccontrast, but it didn't work. I would really appreciate any guidance on this!


There is 1 answer

Robert Long

As you have found, roccontrast cannot be used here: it compares the ROC curves of several models fitted to the same data, not one model fitted separately to different subgroups. However, you can still compare the ROC curves by subgroup using an approach similar to the one described in the answer linked to the question:

Run PROC LOGISTIC with a BY statement. You've already done this part. The BY statement fits the logistic regression separately for each subgroup (hispanic, black, white, and other); note that the input data must be sorted by Race for BY processing to work. The ODS OUTPUT roccurve=ROCdata; statement captures the points of each subgroup's ROC curve (sensitivity and 1 - specificity at each cutoff).

To visually compare the ROC curves for each subgroup, use the SGPLOT procedure on the ROCdata. This will help you see how the ROC curves differ across the subgroups.

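proc sgplot data=ROCdata;
    /* _1MSPEC_ (1 - specificity) and _SENSIT_ (sensitivity) are the variable
       names in the ROCCurve output; confirm them with PROC CONTENTS */
    series x=_1MSPEC_ y=_SENSIT_ / group=Race;
    xaxis label="1-Specificity";
    yaxis label="Sensitivity";
    title "ROC Curves by Race";
run;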

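To statistically compare the AUCs between subgroups, you will need to compute a test statistic. This involves pairwise comparisons of the AUCs that take their standard errors into account. Note that the ROC curve coordinates in ROCdata do not themselves contain the AUC or its standard error; you need a per-subgroup AUC estimate and standard error, for example from the ROC association statistics that PROC LOGISTIC reports when a ROC statement is specified. Because the subgroups consist of different subjects, the AUC estimates are independent, so you can perform a Z-test for each pair of subgroups.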
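One possible way to get a per-subgroup AUC together with its standard error is sketched below. It assumes that adding a ROC statement makes PROC LOGISTIC produce the ROCAssociation ODS table, and that this table stores the AUC in a variable named Area and its standard error in StdErr; check both with PROC CONTENTS on your installation. The dataset names data_sorted and AUCdata and the label 'bmi' are placeholders chosen for illustration.

```sas
/* BY-group processing needs the input sorted by the BY variable */
proc sort data=data out=data_sorted;
    by Race;
run;

proc logistic data=data_sorted;
    by Race;
    model htn(event='1') = bmi;
    /* The ROC statement requests ROC association statistics
       (AUC, standard error, confidence limits) for this predictor */
    roc 'bmi' bmi;
    /* Assumed variable names in the output: Area (AUC), StdErr */
    ods output ROCAssociation=AUCdata;
run;

/* Inspect the result; if there is more than one row per Race
   (e.g. one for the fitted model and one for the labelled ROC model),
   keep only one of them before comparing subgroups */
proc print data=AUCdata;
run;
```

With a single predictor, the fitted model and the labelled ROC model describe the same curve, so either row can be kept.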

In the DATA step, the comparison calculates the Z-score for the difference in AUCs between each pair of subgroups. Here's a generalised approach. It assumes a dataset (called AUCdata here, a placeholder name) with exactly one row per race category and the variables Race, Area (the AUC) and StdErr (its standard error):

data pairwise_comparison;
    /* Assumes AUCdata has one row per Race (4 rows) with variables
       Race (character), Area and StdErr; update with your exact names */
    array races[4] $ 20 _temporary_;
    array auc[4] _temporary_;    /* AUCs */
    array auc_se[4] _temporary_; /* Standard errors */

    /* Load every subgroup's estimates into the arrays */
    do k = 1 to n;
        set AUCdata nobs=n;
        races[k] = Race;
        auc[k] = Area;
        auc_se[k] = StdErr;
    end;

    /* Compare each pair of subgroups exactly once */
    do i = 1 to n;
        do j = i+1 to n;
            race1 = races[i];
            race2 = races[j];
            auc_diff = auc[i] - auc[j];
            se_diff = sqrt(auc_se[i]**2 + auc_se[j]**2);
            z = auc_diff / se_diff;
            p_value = (1 - probnorm(abs(z))) * 2; /* two-tailed test */
            output;
        end;
    end;
    stop; /* all rows were read in the loop above */
    keep race1 race2 auc_diff se_diff z p_value;
run;

The pairwise_comparison dataset will contain the Z-scores and p-values for each pair of subgroups.
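For reference, the statistic computed for each pair of subgroups $i$ and $j$ can be written as below. The notation is introduced here for illustration: $\widehat{\mathrm{AUC}}_i$ is the estimated AUC of subgroup $i$, $\mathrm{SE}_i$ its standard error, and $\Phi$ the standard normal CDF (probnorm in the code). The formula relies on the two estimates being independent and approximately normally distributed.

$$
Z_{ij} = \frac{\widehat{\mathrm{AUC}}_i - \widehat{\mathrm{AUC}}_j}{\sqrt{\mathrm{SE}_i^2 + \mathrm{SE}_j^2}},
\qquad
p_{ij} = 2\,\bigl(1 - \Phi\bigl(\lvert Z_{ij}\rvert\bigr)\bigr)
$$

With four subgroups this gives six pairwise tests, one for each row of pairwise_comparison.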