Cross validation and ROC curve using Matlab: how to plot the mean ROC curve?

Question

Asked by Antonio Mendes on 16 October 2020 at 05:52 (619 views)

I am using k-fold cross-validation with k = 10, so I have 10 ROC curves, and I would like to average them. I can't simply average the Y values returned by perfcurve, because the vectors returned for the different folds are not the same size.

[X1,Y1,T1,AUC1] = perfcurve(t_test(1),resp(1),1);
.
.
.
[X10,Y10,T10,AUC10] = perfcurve(t_test(10),resp(10),1);

How can I solve this and plot the average of the 10 ROC curves?

There are 2 answers

**Antonio Mendes** · Answer 1 · 2020-10-17T04:18:33+00:00

I solved it using MATLAB's perfcurve. Instead of a single vector, I pass a list (cell array) of 1xn vectors for both the labels and the scores. perfcurve then treats the inputs as the results of a k-fold cross-validation and returns the average ROC curve and its confidence interval, in addition to the AUC and its confidence interval.

[X1,Y1,T1,AUC1] = perfcurve(t_test_list,resp_list,1);

t_test_list and resp_list are lists of size 1xk (k is the number of folds), and each element of a list is a 1xn vector holding the labels or the scores of one fold, respectively.

resp = nnet(x_test(i));
t_test_act = t_test(i);

resp has size 2xn (n is the number of predicted samples). There are two classes.

t_test_act contains the labels of the current test set. It also has size 2xn and is composed of 0s and 1s (each column has one 1 and one 0, indicating the true class of the sample).

resp_list{i} = resp(1,:);           % scores
t_test_list{i} = t_test_act(1,:);   % labels
[X1,Y1,T1,AUC1] = perfcurve(t_test_list,resp_list,1);
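The approach above can be sketched end to end as follows. This is only an illustration under stated assumptions: the labels and scores are synthetic stand-ins for the network outputs, the 'XVals' option is an optional addition (not part of the answer) that asks perfcurve to average vertically at fixed false positive rates, and the Statistics and Machine Learning Toolbox is required.

```matlab
% Sketch with synthetic data (assumption: stands in for the real folds).
rng(0);                          % reproducible random numbers
k = 10;                          % number of folds
t_test_list = cell(1, k);        % true labels, one 1xn vector per fold
resp_list   = cell(1, k);        % scores, one 1xn vector per fold
for i = 1:k
    n = 40 + randi(20);                      % fold sizes may differ
    labels = double(rand(1, n) > 0.5);       % synthetic 0/1 labels
    labels(1:2) = [1 0];                     % make sure both classes occur
    scores = 0.6*labels + 0.8*rand(1, n);    % noisy scores, higher for class 1
    t_test_list{i} = labels;
    resp_list{i}   = scores;
end

% Cell-array inputs: each element is treated as one cross-validation fold.
% With 'XVals', Y is averaged at these fixed false positive rates:
% Y(:,1) = mean, Y(:,2) = lower bound, Y(:,3) = upper bound.
xq = linspace(0, 1, 50);
[X, Y, T, AUC] = perfcurve(t_test_list, resp_list, 1, 'XVals', xq);

figure; hold on;
plot(X(:,1), Y(:,1), 'b-', 'LineWidth', 1.5);   % mean ROC curve
plot(X(:,1), Y(:,2), 'b:');                     % lower confidence bound
plot(X(:,1), Y(:,3), 'b:');                     % upper confidence bound
xlabel('False positive rate'); ylabel('True positive rate');
title(sprintf('Mean ROC, AUC = %.3f [%.3f, %.3f]', AUC(1), AUC(2), AUC(3)));
hold off;
```

Without 'XVals', perfcurve averages over thresholds instead, in which case X also comes back with three columns (mean and bounds); plotting X(:,1) against Y(:,1) works in both cases.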

**saastn** · Answer 2 · 2020-10-16T22:22:41+00:00

So, you have k curves with different numbers of points, all bounded to the interval [0, 1] in both dimensions. First, you need to calculate interpolated values for each curve at a common set of query points. The resulting curves all have the same number of points, so you can compute their mean. The interp1 function will do the interpolation part.

%% generating sample data
k = 10;
X = cell(k, 1);
Y = cell(k, 1);
hold on;
for i=1:k
    n = 10+randi(10);
    X{i} = sort([0 1 rand(1, n)]);
    Y{i} = sort([0 1 rand(1, n)].^.5);
end

%% Calculating interpolations
% location of query points
X2 = linspace(0, 1, 50);
n = numel(X2);
% initializing values for different curves at different query points
Y2 = zeros(k, n);
for i=1:k
    % finding interpolated values for i-th curve
    Y2(i, :) = interp1(X{i}, Y{i}, X2);
end
% finding the mean
meanY = mean(Y2, 1);

Note that the interpolation method can affect your results. ROC plot data are step-like (staircase) data, so to find the exact values on such curves you should use previous-neighbor interpolation ('previous') instead of linear interpolation, which is the default method of interp1:

Y2(i, :) = interp1(X{i}, Y{i}, X2); % linear
Y3(i, :) = interp1(X{i}, Y{i}, X2, 'previous');

This is how it affects the final results: [figure: mean curve computed with linear vs. previous-neighbor interpolation]
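One practical detail when applying this to real ROC output: a staircase curve repeats the same X value at every vertical step, and interp1 requires unique sample points. A minimal sketch of one way to handle this, assuming it is acceptable to keep the highest Y at each repeated X (the values in Xs and Ys are made up for illustration):

```matlab
% Hypothetical staircase ROC points: vertical steps repeat the same X.
Xs = [0 0   0.2 0.2 0.6 0.6 1];
Ys = [0 0.5 0.5 0.8 0.8 1.0 1];

% Collapse repeated X values, keeping the highest Y at each one
% (the value the step function takes just to the right of the jump).
[Xu, ~, idx] = unique(Xs(:));            % sorted unique X; idx maps Xs to Xu
Yu = accumarray(idx, Ys(:), [], @max);   % max Y for each unique X

xq = [0.1 0.2 0.4 0.6 0.9];                    % query points
yLinear   = interp1(Xu, Yu, xq);               % default: linear
yPrevious = interp1(Xu, Yu, xq, 'previous');   % step-wise

disp([xq; yLinear; yPrevious]);
```

For these points the linear result is [0.65 0.8 0.9 1 1] and the previous-neighbor result is [0.5 0.8 0.8 1 1]: the two methods agree at the step locations and differ in between, where linear interpolation cuts across the corners of the staircase.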
