I am working in R with a regular dataframe (df) and a shapefile (map2); they share a common column called CD116FP. df has 103552 lines while map2 has 444. I am loading the shapefile in the following way:
map2 <- read_sf("D:/Data/tl_2019_us_cd116.shp")
My end-goal is to use the function mapview() to view the map included in map2 with the "intensity" that is described in df under the column np_scores. I hence do not want observations of df that do not appear on map2.
Here are my thoughts and failures:
- If these two objects were regular dataframes, a reasonable candidate would be to use merge() to combine both objects. However, if you apply that function in this case, the resulting object loses its spatial properties and mapview does not know how to read it.
- Another approach that I used was trying this line of code:
map2m<-data.frame(map2, df[match(map2$CD116FP, df$CD116FP),])
But the result has dimensions that are too big (much bigger than 444 lines) and hence mapview crashes when trying to plot the desired map.
- At last, I went full-on brute force and just constructed a loop to add the column np to map2:
map2$np = 10
for (i in c(1:nrow(map2))) {
  for (j in c(1:nrow(df))) {
    if (identical(map2$CD116FP[i], df$CD116FP[j])) {
      map2$np[i] = df$np_score[j]
    } else {
      map2$np[i] = 0
    }
  }
}
However, this approach just takes way too much time given the dimensions of my dataframe.
Do you have any suggestions?
I'm a bit puzzled by the structure of your data. Your df has over 100,000 rows, so I'm guessing that the same CD116FP occurs multiple times in df, and the np_scores will presumably vary across these instances. If you want to merge these to map2 you will need to aggregate them first.
Let's try to recreate a similar setup:
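
A minimal sketch of such a setup, under assumptions: instead of reading the real shapefile, it builds a hypothetical three-district map2 by hand (the helper make_square and the district codes are made up for illustration), and it simulates a df with 103552 rows whose np_scores are random numbers.

```r
library(sf)

set.seed(1)

# Hypothetical helper: a unit square whose lower-left corner is at (x0, 0).
make_square <- function(x0) {
  st_polygon(list(rbind(
    c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1), c(x0, 1), c(x0, 0)
  )))
}

# Stand-in for map2: three square "districts" keyed by CD116FP.
# The real map2 comes from read_sf() on the shapefile and has 444 rows.
map2 <- st_sf(
  CD116FP  = c("01", "02", "03"),
  geometry = st_sfc(make_square(0), make_square(1), make_square(2), crs = 4326)
)

# Simulated df with the same number of rows as in the question.
# "99" is a made-up code that does not occur in map2.
df <- data.frame(
  CD116FP   = sample(c("01", "02", "99"), 103552, replace = TRUE),
  np_scores = runif(103552)
)
```

Each CD116FP value appears many times in this df, which is the situation that makes aggregation necessary before joining.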
I have made df have the same number of rows as your data to show that this solution will scale to your problem.
Let's aggregate the np_scores with dplyr:
Now that map2 has the aggregated np_scores, we can plot it, for example in ggplot:
Or in mapview:
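
The aggregate, join and plot steps could be sketched roughly as below. This is a sketch under assumptions, not necessarily the exact code used here: the toy map2 and df are hypothetical, mean() is just one possible way to aggregate, and the names df_agg, map2_np and np are made up for illustration.

```r
library(sf)
library(dplyr)

# Hypothetical toy data: three square "districts" and a small df in which
# "01" occurs twice, "03" never occurs, and "99" is not on the map.
make_square <- function(x0) {
  st_polygon(list(rbind(
    c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1), c(x0, 1), c(x0, 0)
  )))
}
map2 <- st_sf(
  CD116FP  = c("01", "02", "03"),
  geometry = st_sfc(make_square(0), make_square(1), make_square(2), crs = 4326)
)
df <- data.frame(
  CD116FP   = c("01", "01", "02", "99"),
  np_scores = c(1, 3, 5, 7)
)

# Collapse df to one row per district (assumption: the mean is what you want).
df_agg <- df %>%
  group_by(CD116FP) %>%
  summarise(np = mean(np_scores))

# left_join() keeps every row of map2 and drops codes that are only in df.
# With the sf object as the first argument, the result is still an sf object.
map2_np <- left_join(map2, df_agg, by = "CD116FP")

# Plot with ggplot2 if it is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(map2_np) +
    ggplot2::geom_sf(ggplot2::aes(fill = np))
}

# Or view interactively with mapview, colouring by the new column.
if (interactive() && requireNamespace("mapview", quietly = TRUE)) {
  mapview::mapview(map2_np, zcol = "np")
}
```

Districts that have no rows in df (here "03") end up with NA in np rather than 0; replace the NAs afterwards if you prefer a different default.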
Created on 2020-09-19 by the reprex package (v0.3.0)