I'm trying to plots data onto a map. I would like to generate data for specific points on the map (e.g. transit times to one or more prespecified location) for a specific city.

I found data for New York City here: https://data.cityofnewyork.us/City-Government/Borough-Boundaries/tqmj-j8zm

It looks like they have a shape file available. I'm wondering if there is a way to sample a latitude-longitude grid within the bounds of the shape file for each borough (perhaps using shapely package, etc).

Sorry if this is naive, I'm not very familiar with working with these files--I'm doing this as a fun project to learn about them

1 Answers

lstbl On Best Solutions

I figured out how to do this. Essentially, I just created a full grid of points and then removed those that did not fall within the shape files corresponding to the boroughs. Here is the code:

import geopandas
from geopandas import GeoDataFrame, GeoSeries
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize
import matplotlib.cm as cm
%matplotlib inline
import seaborn as sns
from shapely.geometry import Point, Polygon
import numpy as np
import googlemaps
from datetime import datetime
plt.rcParams["figure.figsize"] = [8,6]

# Get the shape-file for NYC
boros = GeoDataFrame.from_file('./Borough Boundaries/geo_export_b641af01-6163-4293-8b3b-e17ca659ed08.shp')
boros = boros.set_index('boro_code')
boros = boros.sort_index()

# Plot and color by borough
boros.plot(column = 'boro_name')

# Get rid of are that you aren't interested in (too far away)
plt.gca().set_xlim([-74.05, -73.85])
plt.gca().set_ylim([40.65, 40.9])

# make a grid of latitude-longitude values
xmin, xmax, ymin, ymax = -74.05, -73.85, 40.65, 40.9
xx, yy = np.meshgrid(np.linspace(xmin,xmax,100), np.linspace(ymin,ymax,100))
xc = xx.flatten()
yc = yy.flatten()

# Now convert these points to geo-data
pts = GeoSeries([Point(x, y) for x, y in zip(xc, yc)])
in_map =  np.array([pts.within(geom) for geom in boros.geometry]).sum(axis=0)
pts = GeoSeries([val for pos,val in enumerate(pts) if in_map[pos]])

# Plot to make sure it makes sense:

# Now get the lat-long coordinates in a dataframe
coords = []
for n, point in enumerate(pts):
    coords += [','.join(__ for __ in _.strip().split(' ')[::-1]) for _ in str(point).split('(')[1].split(')')[0].split(',')]

which results in the following plots: boroughs


I also got a matrix of lat-long coordinates I used to make a transport-time map for every point in the city to Columbia Medical Campus. Here is that map:


and a zoomed-up version so you can see how the map is made up of the individual points: transit_time_zoom