Making 3D plot of data with LDA boundary

88 views Asked by At

I am playing around with linear discriminant analysis and trying to plot the data in plotly in R and then plot the discriminating plane. When I try this I can get the data and the plane, but the plane does not match with the the 3D plotted data. I get the data "floating" above the plane rather than the plane cutting through the data. Using the coefficients the lda function gives us with parameters SB.lda, OPS.lda, and tSho.lda I try to solve for z under the assumption that the equation is SBx + OPSy + tShoz = 0 because the lda function does not give us a constant.

When I reference this post (https://stats.stackexchange.com/questions/166942/why-are-discriminant-analysis-results-in-r-lda-and-spss-different-constant-t) and use the method to generate the constant I get a constant of -13.82201 which when plotted does not match with the data.

What am I missing?

library(plotly)
library(MASS)
train <- data.frame(read.table("https://raw.githubusercontent.com/gunsnfloyd/algorithmdata/main/train.tsv",header=TRUE))
head(train)
model <- lda(WS~ SB + OPS  + tSho ,data = train)

SB.lda <- 0.008056699
OPS.lda <- 15.174308879
tSho.lda <- 0.205465030

groupmean<-(model$prior%*%model$means)
constant<-(groupmean%*%model$scaling)
-constant


fig <- plot_ly(train, x = ~SB, y = ~OPS, z = ~tSho, color = ~WS, colors = c('#BF382A', '#0C4B8E'))
fig <- fig %>% add_markers()
fig <- fig %>% layout(scene = list(xaxis = list(title = 'SB'),
                                   yaxis = list(title = 'OPS'),
                                   zaxis = list(title = 'tSho')))

x.lda <- seq(0, 400, length.out = 100) #Range based on SB values
y.lda <- seq(0.5, 0.9, length.out = 100) #Range based on OPS values
z.lda <- outer(x.lda, y.lda, function(x, y) ( (-SB.lda * x - OPS.lda * y) / tSho.lda + 13.82201))

fig <- fig %>% add_surface(x = x.lda, y = y.lda, z = z.lda,colors = "#0C4B8E")      
fig
1

There are 1 answers

1
Julian Selke On BEST ANSWER

I get a different result from solving the equation:

# general equation: A * x + B * y + C * z + c = 0

# solve for z:
  SB * x + OPS * y + tSho * z - 13.82201 = 0
  SB * x + OPS * y - 13.82201            = - tSho * z
 -SB * x - OPS * y + 13.82201            = tSho * z
(-SB * x - OPS * y + 13.82201) / tSho    = z

# calculate values of plane:
x.lda <- seq(0, 400, length.out = 100) #Range based on SB values
y.lda <- seq(0.5, 0.9, length.out = 100) #Range based on OPS values
z.lda <- outer(x.lda, y.lda, function(x, y) (-SB.lda * x - OPS.lda * y + 13.82201) / tSho.lda )

The resulting plot:

plot