I want to run a regression analysis in R to explain the variation in my DV, g_law_tot, which is the growth rate of the total budget from t-1 to t. I have yearly data from 1994 to 2023 (30 observations), with a set of political, institutional and economic variables as IVs. I am not interested in forecasting; I only want to understand which factors best explain the annual percentage change in the budget. Is a time-series model the right choice here? And how do I decide which specific model is best in my case? I've got lost in too many videos, readings and blogs.
Thank you so much!
Below is my data frame (dput() output):
df <- structure(list(year = c(1994, 1995, 1996, 1997, 1998, 1999, 2000,
2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011,
2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022,
2023), end_legislative_term = c(0, 0, 0, 0, 0, 0, 1, 0, 0, 0,
0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0),
technocratic = c(0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0), C_enpp = c(7.88,
7.88, 6.07, 6.07, 6.07, 6.07, 6.07, 5.45, 5.45, 5.45, 5.45,
5.45, 5.09, 5.09, 3.08, 3.08, 3.08, 3.08, 3.08, 3.52, 3.52,
3.52, 3.52, 3.52, 4.38, 4.38, 4.38, 4.38, 5.64, 5.64), leading_ch = c(0,
1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
0, 1, 0, 1, 0, 1, 0, 1, 0, 1), steering_center = c(0, 0,
0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1,
1, 1, 1, 0, 0, 0, 0, 0, 0), gpd_growth = c(2.683230835, 1.266540391,
1.830276287, 1.810314719, 1.625659888, 3.78691271, 1.951454556,
0.25403202, 0.13847967, 1.424073911, 0.817498767, 1.790934831,
1.486917388, -0.962084833, -5.280713695, 1.713274462, 0.707045938,
-2.980514474, -1.841095694, -0.004870179, 0.778657835, 1.293374251,
1.667491666, 0.926246385, 0.482993514, -8.974277401, 8.313634284,
3.724824143, 0.691742081, NA), g_law_tot = c(13.1533324313674,
6.60474423446604, -11.8440505854976, 2.8432575110453, 10.6431848644662,
7.29624633255459, 2.81686804707955, -0.0275394074771063,
-0.570750090159, -0.192522133860973, -0.553271817228385,
-2.94082586015983, 4.28422079688282, 5.5247420527077, -4.450067276262,
-0.0110042515499731, -3.47617754038942, -0.98458875847075,
2.91395911842109, 4.04542948833524, 5.56044089389971, -2.05342343455547,
-0.0998891519709444, 2.70233019929693, 2.48770368114364,
2.31876491767493, 15.1682979912092, 6.53052765440996, 4.09519458060663,
-4.94486838456369)), row.names = c(NA, -30L), class = c("tbl_df",
"tbl", "data.frame"))
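
To make the question concrete, here is a minimal sketch of the kind of baseline I have in mind. It is not a recommendation: it assumes a plain OLS fit with lm() using all six IVs, followed by a Ljung-Box test on the residuals to see whether serial correlation is left over. The function name fit_baseline and the choice of lag = 3 are only illustrative assumptions.

```r
# Sketch only: baseline OLS on the yearly data, then a residual
# autocorrelation check. `fit_baseline` is a hypothetical helper name.
fit_baseline <- function(data) {
  # Drop incomplete rows (in my data, 2023 has no gpd_growth value)
  # and make sure the observations are in time order.
  d <- data[complete.cases(data), ]
  d <- d[order(d$year), ]

  # Plain linear model with every IV entered additively.
  fit <- lm(g_law_tot ~ end_legislative_term + technocratic + C_enpp +
              leading_ch + steering_center + gpd_growth, data = d)

  # Ljung-Box test on the residuals: a small p-value would suggest
  # remaining autocorrelation, so the usual OLS standard errors
  # might not be trustworthy. lag = 3 is an arbitrary choice here.
  lb <- Box.test(residuals(fit), lag = 3, type = "Ljung-Box")

  list(fit = fit, ljung_box = lb)
}

# Usage, assuming `df` is the data frame shown above:
# res <- fit_baseline(df)
# summary(res$fit)
# res$ljung_box
```

With only about 29 usable observations and six predictors, a fit like this has few degrees of freedom, which is part of why I am unsure whether this or a dedicated time-series model is more appropriate.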