Here's what I've done so far; I'm having difficulty figuring out the regression line.
- Before we get started, we want to generate two tables: one for 2002 and another for the average of the 1999-2001 seasons. We want to define per-plate-appearance statistics, keeping only players with at least 100 plate appearances. Here is how we create the 2002 table, followed by a similar table with rates computed over 1999-2001.
library(Lahman)
library(dplyr)
data("Batting")

# 2002 table: per-plate-appearance singles and BB rates, players with at least 100 PA
dat <- Batting %>%
  filter(yearID == 2002) %>%
  mutate(pa = AB + BB,
         singles = (H - X2B - X3B - HR) / pa,
         bb = BB / pa) %>%
  filter(pa >= 100) %>%
  select(playerID, singles, bb)

# 1999-2001 table: each player's per-season rates averaged across the three seasons
avg <- Batting %>%
  filter(yearID %in% 1999:2001) %>%
  mutate(pa = AB + BB,
         singles = (H - X2B - X3B - HR) / pa,
         bb = BB / pa) %>%
  filter(pa >= 100) %>%
  group_by(playerID) %>%
  summarise(avg_singles = mean(singles), avg_bb = mean(bb))
- Compute the correlation between 2002 and the previous seasons for singles and BB.
# join the 2002 rates to the 1999-2001 averages and compute the correlations
dat <- inner_join(dat, avg, by = "playerID")
rdat <- dat %>%
  summarise(singles_r = cor(singles, avg_singles), bb_r = cor(bb, avg_bb))
rdat
- Note that the correlation is higher for BB. To quickly get an idea of the uncertainty associated with these correlation estimates, we will fit a linear model and compute confidence intervals for the slope coefficient. However, first make scatterplots to confirm that fitting a linear model is appropriate.
library(ggplot2)
# scatterplots with the 1999-2001 average on the x-axis, since it is the predictor
dat %>%
  ggplot(aes(avg_singles, singles)) +
  geom_point(alpha = 0.5)

dat %>%
  ggplot(aes(avg_bb, bb)) +
  geom_point(alpha = 0.5)
- Now fit a linear model for each metric and use the confint function to compare the estimates.
I would use the lm function to solve this question, regressing the 2002 rate on the 1999-2001 average and passing the fit to confint, and likewise for bb. An example of what I have in mind is sketched below.
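Here is a minimal sketch of that step, assuming the dat table built above (columns singles, bb, avg_singles, avg_bb); the model formulas and the fit_singles/fit_bb names are my reading of the prompt rather than a confirmed solution.

# regress the 2002 singles rate on the 1999-2001 average and get a 95% CI for the slope
fit_singles <- lm(singles ~ avg_singles, data = dat)
confint(fit_singles)

# same idea for walks (BB) per plate appearance
fit_bb <- lm(bb ~ avg_bb, data = dat)
confint(fit_bb)

If the interval for the avg_bb slope is tighter relative to its estimate than the one for avg_singles, that is consistent with the higher correlation seen for BB; summary(fit_singles) and summary(fit_bb) also report the standard errors directly.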