Getting the parameters of a data ellipse produced by the car package in R

3.2k views Asked by At

I am using the dataEllipse function from the car package in R to get an elliptic confidence region for my data. For example:

datapoints_x = c(1,3,5,7,8,6,5,4,9)
datapoints_y = c(3,6,8,9,5,8,7,4,8)
ellipse = dataEllipse(cbind(datapoints_x, datapoints_y), levels=0.95)

The output are two vectors x and y corresponding to points that define the ellipse:

head(ellipse)
#             x        y
# [1,] 12.79906 10.27685
# [2,] 12.74248 10.84304
# [3,] 12.57358 11.34255
# [4,] 12.29492 11.76781
# [5,] 11.91073 12.11238
# [6,] 11.42684 12.37102

But rather than that I am interested in the length of the ellipsis axes and its center. Is there a way to get this without carrying out the PCA by myself?

1

There are 1 answers

2
josliber On BEST ANSWER

From ?dataEllipse you read that these functions are mostly plotting functions, not functions designed to give you the fitted ellipse. However reading the source code of dataEllipse, it becomes clear that the function used to fit the ellipse is cov.wt from the stats package. This function should be able to give you the center and covariance matrix used to specify the ellipse location and shape:

set.seed(144)
x <- rnorm(1000)
y <- 3*x + rnorm(1000)
(ell.info <- cov.wt(cbind(x, y)))
# $cov
#          x         y
# x 1.022985  3.142274
# y 3.142274 10.705215
# 
# $center
#           x           y 
# -0.09479274 -0.23889445 
# 
# $n.obs
# [1] 1000

The center of the ellipse is now readily available from ell.info$center. The directions of the axes are accessible as the eigenvectors of the covariance matrix (columns of eigen.info$vectors below).

(eigen.info <- eigen(ell.info$cov))
# $values
# [1] 11.63560593  0.09259443
# 
# $vectors
#           [,1]       [,2]
# [1,] 0.2839051 -0.9588524
# [2,] 0.9588524  0.2839051

Finally you need to know the length of the axes (I'll give the length from the center to the ellipse, aka the radius on that axis):

(lengths <- sqrt(eigen.info$values * 2 * qf(.95, 2, length(x)-1)))
# [1] 8.3620448 0.7459512

Now we can get the four endpoints of the axes of the ellipse:

ell.info$center + lengths[1] * eigen.info$vectors[,1]
#        x        y 
# 2.279234 7.779072 
ell.info$center - lengths[1] * eigen.info$vectors[,1]
#         x         y 
# -2.468820 -8.256861 
ell.info$center + lengths[2] * eigen.info$vectors[,2]
#           x           y 
# -0.81004983 -0.02711513 
ell.info$center - lengths[2] * eigen.info$vectors[,2]
#          x          y 
#  0.6204643 -0.4506738 

We can confirm these are accurate from using dataEllipse:

library(car)
dataEllipse(x, y, levels=0.95)

enter image description here