I searched quite a bit for a way to simplify the code for the problem below, but without success. I assume that some kind of apply-magic could speed things up a little, but so far I still have difficulty with these kinds of functions.
I have a data.frame data, structured as follows:
year  iso3c  gdppc  elec  solid  liquid  heat
2010  USA    1567   1063  1118   835     616
2015  USA    1571   NA    NA     NA      NA
2020  USA    1579   NA    NA     NA      NA
...   USA    ...    NA    NA     NA      NA
2100  USA    3568   NA    NA     NA      NA
2010  ARG    256    145   91     85      37
2015  ARG    261    NA    NA     NA      NA
2020  ARG    270    NA    NA     NA      NA
...   ARG    ...    NA    NA     NA      NA
2100  ARG    632    NA    NA     NA      NA
As you can see, I have a historical starting value for 2010 and a complete scenario for gdppc up to 2100. I want to let the values for elec, solid, liquid and heat grow according to some elasticity with respect to the development of gdppc, but separately for each country (coded in iso3c).
I have the elasticities defined in a separate data.frame parameters:
item value
elec 0.5
liquid 0.2
solid -0.1
heat 0.1
So far I am using a nested for loop:
for (e in seq_len(nrow(parameters))) {
  # column to project, e.g. "elec" (as.character in case item is a factor)
  item <- as.character(parameters[e, "item"])
  for (c in seq_along(levels(data$iso3c))) {
    # subset: one country, one variable
    tmp <- subset(data, select = c("year", "iso3c", "gdppc", item),
                  subset = (iso3c == levels(data$iso3c)[c]))
    # grow the 2010 value by elasticity * gdppc growth, compounded over time
    tmp[tmp$year %in% seq(2015, 2100, 5), item] <-
      tmp[tmp$year == 2010, item] *
      cumprod(1 + (tmp[tmp$year %in% seq(2015, 2100, 5), "gdppc"] /
                   tmp[tmp$year %in% seq(2010, 2095, 5), "gdppc"] - 1) * parameters[e, "value"])
    # write the projected values back into data
    data[data$iso3c == levels(data$iso3c)[c] & data$year %in% seq(2015, 2100, 5), item] <-
      tmp[tmp$year > 2010, item]
  }
}
The outer loop runs over the columns and the inner one over the countries (I have 180+ countries). First, a subset containing the data for a single country and the variable of interest is selected. Then I let that variable grow with a certain elasticity to the growth in gdppc, and finally I put the subset back into place in data.
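The growth rule described above can be sketched for a single country and a single variable as follows. This is only an illustration: the helper name project_series and its arguments are hypothetical, and it assumes the gdppc values are ordered by year with the base year first.

```r
# Hypothetical helper: project one series forward from its base-year value.
# base_value: the historical 2010 value
# gdppc:      GDP per capita for all years, base year first
# elasticity: elasticity of the variable with respect to gdppc
project_series <- function(base_value, gdppc, elasticity) {
  # relative gdppc growth between consecutive time steps
  growth <- gdppc[-1] / gdppc[-length(gdppc)] - 1
  # compound the base value by elasticity * growth at each step
  c(base_value, base_value * cumprod(1 + growth * elasticity))
}

# USA, elec, first three years from the table above, elasticity 0.5
elec_usa <- project_series(1063, c(1567, 1571, 1579), 0.5)
```

The first element is the unchanged 2010 value; each later element is the previous one scaled by 1 + elasticity * growth.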
I have already tried running the outer loop in parallel using foreach, but was not successful in recombining the results. Since I have to run similar calculations quite often, I would be very grateful for any help.
Thanks
Here's one way. Note I renamed your parameters data.frame to p.
The basic idea is to use the melt(...) function to reshape your original data into "long" format, where the values in the four columns solid, liquid, elec, and heat are all in one column, value, and the column variable indicates which metric value refers to. Now, using data tables, you can fill in the values easily. Then, reshape the result back into wide format using dcast(...).
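A minimal sketch of these three steps, assuming the data.table package (which provides its own melt and dcast methods) and a toy input covering only 2010 to 2020. This is not necessarily the exact code originally posted; the names long, elasticity and result are illustrative.

```r
library(data.table)

# Toy input mirroring the question's layout (first three years only)
data <- data.frame(
  year   = rep(c(2010, 2015, 2020), times = 2),
  iso3c  = rep(c("USA", "ARG"), each = 3),
  gdppc  = c(1567, 1571, 1579, 256, 261, 270),
  elec   = c(1063, NA, NA, 145, NA, NA),
  solid  = c(1118, NA, NA, 91, NA, NA),
  liquid = c(835, NA, NA, 85, NA, NA),
  heat   = c(616, NA, NA, 37, NA, NA)
)
p <- data.frame(item  = c("elec", "liquid", "solid", "heat"),
                value = c(0.5, 0.2, -0.1, 0.1))

# Wide -> long: one row per (year, country, variable)
long <- melt(as.data.table(data),
             id.vars = c("year", "iso3c", "gdppc"),
             variable.name = "variable", value.name = "value",
             variable.factor = FALSE)

# Attach the elasticity that belongs to each variable
long[, elasticity := p$value[match(variable, p$item)]]

# Within each country/variable, compound the base-year value forward
setorder(long, iso3c, variable, year)
long[, value := {
  growth <- c(0, gdppc[-1] / gdppc[-.N] - 1)  # 0 keeps the base year unchanged
  value[1] * cumprod(1 + growth * elasticity)
}, by = .(iso3c, variable)]

# Long -> wide again
result <- dcast(long, year + iso3c + gdppc ~ variable, value.var = "value")
```

Because the grouping is done once with by, there is no explicit loop over countries or columns; the same code should work unchanged with the full 2010 to 2100 series and all 180+ countries.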