I am running a panel regression to estimate the effect of a change in a company's employee satisfaction, measured by its glassdoor.com rating (1 to 5), on its stock price (adjusted by Fama-French). I have a panel of 50 companies and 43 quarters. Because I am interested in the change, both series are first differenced, i.e. x = (rating in Q2 - rating in Q1) and y = (alpha in Q2 - alpha in Q1). I now want to standardize my data. My question is: do I standardize the whole dataset at once, or should I first group it by quarter and then standardize within each quarter?
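To pin down the two options, here they are in notation. The symbols are my own and only illustrative: $\Delta x_{it}$ is the differenced value for company $i$ in quarter $t$.

```latex
z^{\text{pooled}}_{it} = \frac{\Delta x_{it} - \overline{\Delta x}}{s},
\qquad
z^{\text{quarter}}_{it} = \frac{\Delta x_{it} - \overline{\Delta x}_{t}}{s_{t}}
```

Here $\overline{\Delta x}$ and $s$ are the mean and standard deviation over all company-quarters, while $\overline{\Delta x}_{t}$ and $s_{t}$ are computed over the companies observed in quarter $t$ only.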
library(plm)    # pdata.frame
library(dplyr)  # %>%, mutate, group_by, lead

# Importing the data
general_reviews <- read.csv("reviews_ general.csv")

# Data is a panel with company as the entity and Year_Quarter as the time
general_data <- pdata.frame(general_reviews, index = c("company", "Year_Quarter", "Type"))

# Building the first difference, as I am interested in the change
general_data <- general_data %>%
  mutate(alpha = as.numeric(as.character(alpha))) %>%
  group_by(company) %>%
  mutate(A1d = dplyr::lead(alpha, 1) - alpha) %>%
  ungroup()  # drop the company grouping so a later scale() is not applied per company

# Do I use:
general_data <- general_data %>%
  mutate(A1s = scale(alpha))

# or:
general_data <- general_data %>%
  group_by(Year_Quarter) %>%
  mutate(A1s = scale(alpha))
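To make the difference between the two options concrete, here is a minimal base R sketch on made-up data. The data frame `toy`, the helper `zscore`, and the column names are hypothetical and not from my real dataset. The difference here is taken backwards (current quarter minus previous quarter), which is an assumption for the illustration.

```r
# Toy panel: 3 hypothetical companies x 4 quarters (random, made-up numbers).
set.seed(1)
toy <- expand.grid(company = c("A", "B", "C"), quarter = 1:4)
toy <- toy[order(toy$company, toy$quarter), ]
toy$alpha <- rnorm(nrow(toy))

# First difference within each company: value in quarter q minus value in q - 1.
# The first quarter of each company has no predecessor, so it becomes NA.
toy$d_alpha <- ave(toy$alpha, toy$company,
                   FUN = function(v) c(NA, diff(v)))

# Helper (hypothetical): z-score that ignores missing values.
zscore <- function(v) (v - mean(v, na.rm = TRUE)) / sd(v, na.rm = TRUE)

# Option 1: pooled standardization over all company-quarters.
toy$z_pooled <- zscore(toy$d_alpha)

# Option 2: standardization within each quarter (across companies).
toy$z_by_quarter <- ave(toy$d_alpha, toy$quarter, FUN = zscore)
```

As far as I can tell, the pooled version keeps differences in the quarterly averages, whereas the per-quarter version forces every quarter to have mean 0 and standard deviation 1, so anything common to all companies in a quarter is removed.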