R "for loop" and/or Apply to transform several variables dynamically


I am trying to translate/replicate into R a shorthand "for loop" technique that I would use in EViews. I'm trying to replicate a "for loop" where I would divide one time series variable by another (vectors) and save it as a new series.

As I use a common naming convention (for example GDP (real), GDPn (nominal) and GDP_P (prices), see EViews example below), I can declare the list of variables once and use changes in the suffix ("n" or "_P") to create dynamic series names and loop through the calculations I need. My input data is national accounts expenditure series.

'EViews shorthand "for next" loop:

%CATS = "GDP CONS INV GOV EX IM"
 for %CATS {%cats}
   series {%cats}_P= {%cats}n / {%cats}
 next

'Which is shorthand replication of below ("series" declares a series of the subsequent name):

series GDP_P    = GDPn / GDP
series CONS_P   = CONSn / CONS
series INV_P    = INVn /  INV
series GOV_P    = GOVn / GOV
series EX_P     = EXn / EX
series IM_P     = IMn / IM

So far I've tried an R for loop (which I have read is not the preferred way in R): I created a vector of the series names and used assign(paste(...)) to do the calculation. An example is below, but it does not work. From what I have read about "for", the loop variable "i" can only take its values from a vector of values or a vector of names, with no further context:

cats<-c("GDP","CONS","GOV","INV","EX","IM")
for (i in cats){
  assign(paste(i, "_P",sep=""), paste(i, "n",sep="")/i)
}

I've also done a lot of reading on the "apply" function and its derivatives, but I can't see how they would work in the above scenario. Any suggestions for how to do this in R would be helpful.


There are 2 answers

Answer by iod (best answer)

Your loop should work like this:

cats<-c("GDP","CONS","GOV","INV","EX","IM")
for (i in cats){
  assign(paste(i, "_P",sep=""), get(paste(i, "n",sep=""))/get(i))
}

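get() takes the string you provide and returns the object of that name. Your original attempt failed because paste() only produces character strings, and R cannot divide one string by another.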
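As a minimal, self-contained sketch of the loop above, the following uses made-up numbers for two categories only (the values are illustrative assumptions, not real national accounts data):

```r
# Toy data: nominal (suffix "n") and real series for two categories.
GDPn  <- c(110, 121, 133)
GDP   <- c(100, 105, 110)
CONSn <- c(66, 72, 80)
CONS  <- c(60, 63, 66)

cats <- c("GDP", "CONS")
for (i in cats) {
  # get() looks up the object whose name is the given string;
  # assign() creates an object whose name is built from a string.
  assign(paste0(i, "_P"), get(paste0(i, "n")) / get(i))
}

GDP_P   # nominal GDP divided by real GDP
CONS_P  # nominal CONS divided by real CONS
```

paste0(x, y) is simply shorthand for paste(x, y, sep="").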

There's also a non-for-loop way of doing it, using the idea from one of the answers here:

txt<-paste0(cats, "_P <- ", cats, "n/", cats)
eval(parse(text=txt))

txt is a character vector holding every line you would otherwise have typed to create the vectors manually; eval(parse(text=txt)) then parses each of those commands and executes them one by one.

You can of course skip the assigning of the text to txt -- I just wanted it to be clearer what's going on here:

eval(parse(text=paste0(cats, "_P <- ", cats, "n/", cats)))
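To see what this approach actually runs, it can help to inspect the generated commands before evaluating them. A small sketch with toy values (assumed purely for illustration):

```r
# Toy data for two categories.
cats  <- c("GDP", "CONS")
GDPn  <- c(110, 121); GDP  <- c(100, 110)
CONSn <- c(66, 72);   CONS <- c(60, 60)

# Build one assignment statement per category, as text.
txt <- paste0(cats, "_P <- ", cats, "n/", cats)
# txt is now c("GDP_P <- GDPn/GDP", "CONS_P <- CONSn/CONS")
print(txt)

# parse() turns each string into an R expression; eval() runs them in order.
eval(parse(text = txt))

GDP_P
CONS_P
```

Because eval(parse()) executes whatever text it is given, printing txt first is a cheap way to confirm the commands are the ones you intended.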
Answer by Parfait

Consider working with lists especially for many similar elements. Doing so you can better manage your global environment and process data more compactly and efficiently. For you this means maintaining 3 lists of vectors instead of 18 separate named vectors (2 original sets and new 3rd set). The use of assign to dynamically create variables on the fly usually indicates the opportunity to use a named list.

Specifically, gather your items in GDPn_list and GDP_list, then use Map (the non-simplifying wrapper around mapply) to iterate element-wise over both equal-length lists, calling the division function / on each pair. Then name the resulting list with setNames(). The example below uses random data; as the OP, you can use the commented-out lines to build the lists from your own vectors.

Original Data

cats <- c("GDP","CONS","GOV","INV","EX","IM")

set.seed(9272018)
GDPn_list <- setNames(replicate(6, runif(50)*120, simplify=FALSE), paste0(cats, "n"))
# GDPn_list <- list(GDPn, CONSn, GOVn, INVn, EXn, IMn)

str(GDPn_list)
# List of 6
#  $ GDPn : num [1:50] 52.4 31.9 10.6 118.4 66 ...
#  $ CONSn: num [1:50] 18.27 22.3 95.13 87.44 9.79 ...
#  $ GOVn : num [1:50] 48.83 69.73 113.61 35.53 1.21 ...
#  $ INVn : num [1:50] 51.9 96.9 28.2 67.2 19 ...
#  $ EXn  : num [1:50] 28.3 94.3 42.3 65.5 83.6 ...
#  $ IMn  : num [1:50] 109.3 26.6 60.2 78.2 55.5 ...

GDP_list <- setNames(replicate(6, runif(50)*100, simplify=FALSE), cats)
# GDP_list <- list(GDP, CONS, GOV, INV, EX, IM)

str(GDP_list)    
# List of 6
#  $ GDP : num [1:50] 51.1 65.9 41.5 24.5 87.3 ...
#  $ CONS: num [1:50] 47.66 77.32 46.97 48.61 2.98 ...
#  $ GOV : num [1:50] 32.6 70.3 21.5 73.4 97.8 ...
#  $ INV : num [1:50] 80.7 16.8 57.4 80.7 12.1 ...
#  $ EX  : num [1:50] 38.1 78.1 40.6 62.8 61.9 ...
#  $ IM  : num [1:50] 39.8 84.8 11.4 39.7 14.7 ...

New Data

GDPp_list <- setNames(Map(`/`, GDPn_list, GDP_list), paste0(cats, "p"))

str(GDPp_list)    
# List of 6
#  $ GDPp : num [1:50] 1.025 0.484 0.256 4.835 0.756 ...
#  $ CONSp: num [1:50] 0.383 0.288 2.025 1.799 3.286 ...
#  $ GOVp : num [1:50] 1.4969 0.9921 5.2891 0.4844 0.0124 ...
#  $ INVp : num [1:50] 0.644 5.775 0.491 0.832 1.578 ...
#  $ EXp  : num [1:50] 0.744 1.207 1.043 1.043 1.352 ...
#  $ IMp  : num [1:50] 2.747 0.314 5.293 1.971 3.783 ...

You can still reference the underlying numeric vectors by name or by index number without losing any functionality or data:

GDPp_list$GDPp
GDPp_list$CONSp
GDPp_list$GOVp
...

GDPp_list[[1]]
GDPp_list[[2]]
GDPp_list[[3]]
...

And if the vectors are all of equal length, you can build a matrix from your lists, this time using mapply (which simplifies the result):

GDPp_matrix <- mapply(`/`, GDPn_list, GDP_list)
colnames(GDPp_matrix) <- paste0(cats, "p")

head(GDPp_matrix)

#           GDPp      CONSp       GOVp       INVp       EXp      IMp
# [1,] 1.0252871  0.3832836 1.49687150  0.6436575 0.7441159 2.746551
# [2,] 0.4835700  0.2884577 0.99208666  5.7753575 1.2067694 0.314102
# [3,] 0.2562130  2.0251752 5.28913247  0.4910816 1.0429316 5.292843
# [4,] 4.8345697  1.7987625 0.48436284  0.8322211 1.0431301 1.970523
# [5,] 0.7563794  3.2859395 0.01236608  1.5781949 1.3518592 3.783420
# [6,] 0.1515318 10.9332338 1.10608066 13.7953500 0.7211371 1.918249
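
If the series already exist as separate vectors in your workspace, one way to gather them into named lists is mget(), which fetches several objects by name at once. The sketch below uses toy values (assumed for illustration) and hypothetical list names n_list, r_list and p_list; list2env() is shown only in case you still want the results as standalone variables:

```r
# Toy data for two categories.
GDPn  <- c(110, 121); GDP  <- c(100, 110)
CONSn <- c(66, 72);   CONS <- c(60, 60)
cats  <- c("GDP", "CONS")

# mget() returns a list named after the objects it fetched.
n_list <- mget(paste0(cats, "n"))   # nominal series
r_list <- mget(cats)                # real series

# Element-wise division across the two lists, renamed with the "_P" suffix.
p_list <- setNames(Map(`/`, n_list, r_list), paste0(cats, "_P"))
str(p_list)

# Optional: copy the list elements back out as individual variables.
invisible(list2env(p_list, envir = environment()))
GDP_P
```

Keeping the results in p_list avoids cluttering the workspace; the list2env() step is only a convenience for code that expects GDP_P, CONS_P and so on to exist as separate objects.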