I would like to know if there is another way to write the function:
gam(VariableResponse ~ s(CovariateName1) + s(CovariateName2) + ... + s(CovariateName100),
family = gaussian(link = identity), data = MyData)
in the mgcv package without typing all 100 covariate names as above? Suppose that MyData contains only VariableResponse in column 1, CovariateName1 in column 2, etc.
Many thanks!
Yes, use the brute force approach to generate a formula by pasting together the covariate names with the strings 's(' and ')' and then collapsing the whole thing with ' + '. Then convert the resultant string to a formula and pass that to gam(). You may need to fix issues with the formula's environment if gam() can't find the variables you name, as it is going to do some non-standard evaluation (NSE) on the formula to identify which terms need smooths estimating and hence need to be replaced by a basis expansion.
We'll ignore the last 5 of those columns for the purposes of this example.
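To have something concrete to work with, here is a sketch that simulates example data with mgcv's gamSim(). The choice of example 1, the seed, and the object name df are assumptions made for illustration. With this simulation the last 5 columns hold the true functions used to generate the response rather than covariates, which is why they are dropped.

```r
library("mgcv")

set.seed(1)  # arbitrary seed, only so the example is reproducible

# Simulate from a four-covariate additive model (gamSim example 1)
df <- gamSim(1, n = 400, verbose = FALSE)

# Inspect the column names: response first, then covariates,
# then the true functions used in the simulation
names(df)

# Keep the response and covariates; drop the last 5 columns
df <- df[, seq_len(ncol(df) - 5)]
```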
Make the formula
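A minimal sketch of the paste-and-collapse step, using base R only. The covariate names x0 to x3 and the response name y are assumed here for illustration; with the layout described in the question they would come from names(MyData)[-1] and names(MyData)[1].

```r
# Covariate names (assumed); in the question's layout: names(MyData)[-1]
covars <- c("x0", "x1", "x2", "x3")

# Wrap each name in s( ) and join the terms with " + "
rhs <- paste0("s(", covars, ")", collapse = " + ")

# Prepend the response and convert the string to a formula
fm <- as.formula(paste("y ~", rhs))
fm
```

If gam() later reports that it cannot find a variable, the env argument of as.formula(), or assigning to environment(fm), is the place to adjust where the formula's names are looked up.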
Now fit the model
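Putting the pieces together, a fit might look like the sketch below. The simulated data from gamSim(), the object names, and the use of method = "REML" are assumptions for illustration; the family matches the one in the question.

```r
library("mgcv")

set.seed(1)  # arbitrary seed for reproducibility
# Simulated data (assumed stand-in): keep y and the covariates x0..x3
df <- gamSim(1, n = 400, verbose = FALSE)[, 1:5]

# Build y ~ s(x0) + s(x1) + ... from the column names
rhs <- paste0("s(", names(df)[-1], ")", collapse = " + ")
fm <- as.formula(paste(names(df)[1], "~", rhs))

# Fit the GAM; REML smoothness selection is a choice made here,
# drop the method argument to use gam()'s default
m <- gam(fm, family = gaussian(link = "identity"), data = df,
         method = "REML")
summary(m)
```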
You do have to be careful about fitting GAMs this way, however: concurvity (the nonlinear counterpart to multicollinearity in linear models) can cause catastrophically bad estimates of the smooth functions.
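One way to look for this after fitting is mgcv's concurvity() function, sketched below on simulated data (the data, seed, and object names are assumptions for illustration). It returns indices between 0 and 1 for each term, where values near 1 suggest a smooth can be closely approximated by the other terms in the model.

```r
library("mgcv")

set.seed(1)  # arbitrary seed for reproducibility
# Simulated data (assumed stand-in): y and covariates x0..x3
df <- gamSim(1, n = 400, verbose = FALSE)[, 1:5]

m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML")

# Overall concurvity of each term with the rest of the model;
# rows give the worst-case, observed, and estimate measures
cc <- concurvity(m, full = TRUE)
cc
```

Calling concurvity(m, full = FALSE) instead gives pairwise measures between terms, which can help to locate which covariates are involved.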