Making a geographic heatmap using ggmap in R for depicting frequency distribution


I would like to create a geographic heatmap to depict the frequency distribution of a genetic lineage.

My data file has four columns: location, latitude, longitude, and the frequency of the genetic lineage. The first few rows are shown below; the full file has many more.

Location Latitude Longitude Frequency
01 43.03879 42.72047 0.1304
02 38.58569 68.76037 0.0500
03 42.87779 74.60669 0.0500

... ...

Here is my R script for producing the heatmap:

library(ggplot2)
library(ggmap)
library(RColorBrewer) 


# retrieving the freq data

freq <- read.csv("freq_data.csv", sep = ',', header = TRUE,
                    strip.white = TRUE)

# defining the map bounds
map_bounds <- c(left = min(freq$Longitude) - 7, 
                right = max(freq$Longitude) + 7, 
                top = max(freq$Latitude) + 7, 
                bottom = min(freq$Latitude) - 7)

# create a base map using Stadia Maps
base_map <- get_stadiamap(map_bounds, zoom = 3, scale = 2, 
                          maptype = "stamen_terrain_background")


# convert the map into a ggmap object
ggmap_map <- ggmap(base_map, extent="device", legend="none")



# add heatmap layer
ht_map <- ggmap_map + geom_density2d(data = freq,
                                     aes(x = Longitude, 
                                         y = Latitude))

ht_map <- ht_map + stat_density2d(data = freq,
                                  aes(x = Longitude,
                                      y = Latitude,
                                      color = Frequency,
                                      fill = after_stat(level),
                                      alpha = after_stat(level)),
                                  geom = "polygon")

# define the contour color
ht_map01 <- ht_map + scale_fill_gradientn(colors = rev(brewer.pal(7, "Spectral")))

# add freq info
ht_map02 <- ht_map01 + geom_point(data = freq,
                                  aes(x = Longitude, y = Latitude),
                                  fill = "salmon",
                                  shape = 21,
                                  size = freq$Frequency * 100,
                                  alpha = 0.8)

# draw the final map
ht_map02

This is the heatmap produced by the script.

The resulting heatmap, however, is not what I expected: it highlights the geographic areas where many sites are clumped together. Ideally, the heatmap would instead highlight the sites that have higher frequencies (drawn as larger circles on the map).

What changes should I make to the script to produce such a heatmap? I would really appreciate any suggestions!
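One possible direction, offered as a rough sketch rather than a definitive fix: a 2D density layer estimates how tightly the sites cluster, and the Frequency column does not enter that estimate, which would explain why the clumped areas are highlighted. A surface that follows the frequency values could instead be built by interpolating Frequency between sites. The example below uses inverse-distance weighting (IDW), which is a swapped-in technique and not part of the original script. The helper `idw_grid` and its parameters `n`, `pad`, and `power` are hypothetical names chosen for illustration, `pad = 7` simply mirrors the margin used for the map bounds, distances are taken in plain degrees rather than along the Earth's surface, and the data frame holds only the three rows shown above.

```r
# Sample rows copied from the question; the real file has many more.
freq <- data.frame(
  Location  = c("01", "02", "03"),
  Latitude  = c(43.03879, 38.58569, 42.87779),
  Longitude = c(42.72047, 68.76037, 74.60669),
  Frequency = c(0.1304, 0.0500, 0.0500)
)

# Hypothetical helper (not from any package): inverse-distance-weighted
# estimate of Frequency on a regular n x n grid around the sites.
# `power` controls how quickly a site's influence fades with distance.
idw_grid <- function(sites, n = 50, pad = 7, power = 2) {
  grid <- expand.grid(
    Longitude = seq(min(sites$Longitude) - pad,
                    max(sites$Longitude) + pad, length.out = n),
    Latitude  = seq(min(sites$Latitude) - pad,
                    max(sites$Latitude) + pad, length.out = n)
  )
  # Distance matrix: one row per grid point, one column per site.
  # Assumption: plain Euclidean distance in degrees, not great-circle distance.
  d <- sqrt(outer(grid$Longitude, sites$Longitude, "-")^2 +
            outer(grid$Latitude,  sites$Latitude,  "-")^2)
  # Weights shrink with distance; pmax() avoids division by zero at a site.
  w <- 1 / pmax(d, 1e-9)^power
  # Weighted average of the site frequencies at every grid point.
  grid$Frequency <- as.vector((w %*% sites$Frequency) / rowSums(w))
  grid
}

surface <- idw_grid(freq)
head(surface)
```

If this direction fits, the `surface` data frame could presumably be drawn over the base map with a tile layer such as `geom_tile(data = surface, aes(x = Longitude, y = Latitude, fill = Frequency), alpha = 0.5)` in place of the density layers; that step needs the Stadia base map and has not been tried here. IDW is only one choice of interpolator, and a different `power` or a geostatistical method such as kriging may suit the data better.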
