I have a survey dataset with a variable that is measured with error, and I want to replace its values with imputed ones. I also have a training dataset made up of surveys from past years that share variables with the current one, so I want the imputation to use the correlations with those covariates. In addition, I know the following about the distribution of the target variable:
- It is normally distributed
- Its true mean is known
- Its true standard deviation is known
It has been suggested that I use a Bayesian regression to incorporate this prior knowledge together with the covariates. On most forums I have seen this done with Stan/brms models, but I know little about this kind of model.
In short, I am looking for R code that replaces the values measured with error so that the resulting imputed variable is normally distributed, has the known mean and the known standard deviation, and is consistent with the covariate relationships in the training dataset.
Here is a reproducible dataset to which the code I am asking for could be applied:
# Set seed for reproducibility
set.seed(42)
# Number of observations
n <- 1000
# Simulate 'variable_with_error' with random measurement error
measurement_error <- rnorm(n, mean = 0, sd = 1)
variable_with_error <- rnorm(n, mean = 0, sd = 1) + measurement_error
# Create some predictor variables
predictor1 <- rnorm(n, mean = 0, sd = 1)
predictor2 <- rnorm(n, mean = 0, sd = 1)
predictor3 <- rnorm(n, mean = 0, sd = 1)
# Combine into a data frame
df <- data.frame(
  variable_with_error = variable_with_error,
  predictor1 = predictor1,
  predictor2 = predictor2,
  predictor3 = predictor3
)
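
To make the requirements stated above concrete, here is a minimal sketch of how a candidate imputed vector might be checked against the known mean, the known standard deviation and normality. It is not an imputation method. The names `check_imputation`, `true_mean`, `true_sd` and `tol` are hypothetical, and the values 0 and 1 are placeholders, since the real known values are not given here. It also does not check consistency with the covariates.

```r
# Hypothetical helper (not from any package): compares a candidate imputed
# vector with the known moments and runs a normality test.
check_imputation <- function(x, true_mean, true_sd, tol = 0.05) {
  list(
    mean_ok  = abs(mean(x) - true_mean) < tol,
    sd_ok    = abs(sd(x) - true_sd) < tol,
    # Shapiro-Wilk accepts between 3 and 5000 observations. A large p-value
    # only means normality is not rejected, not that it is proven.
    normal_p = shapiro.test(x)$p.value
  )
}

# Placeholder values, assumed for illustration only
true_mean <- 0
true_sd   <- 1

# Stand-in for the output of an imputation model
set.seed(42)
candidate <- rnorm(1000, mean = true_mean, sd = true_sd)

# Standardize and rescale so the sample moments match the targets exactly.
# This is done only so the example has a known outcome.
candidate <- true_mean + true_sd * (candidate - mean(candidate)) / sd(candidate)

result <- check_imputation(candidate, true_mean, true_sd)
print(result)
```

An acceptable answer would produce an imputed variable for which `mean_ok` and `sd_ok` are `TRUE` and normality is not rejected, while also preserving the relationships with the predictors learned from the training dataset.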