I've got a dataset of refineries in Texas (GeoJSON here - https://pastebin.com/R0D9fif9 ):
Name,Latitude,Longitude
Marathon Petroleum,29.374722,-94.933611
Marathon Petroleum,29.368733,-94.903253
Valero,29.367617,-94.909515
LyondellBasell,29.71584,-95.234814
Valero,29.722213,-95.255198
Exxon,29.743865,-95.009208
Shell,29.720425,-95.12495
Petrobras,29.722466,-95.208807
I would like to create a printed map from these points, but they lie too close together at the given resolution.
Since every refinery has to be listed in the legend, I can't cluster them. So I would like to:
Get the centroid - that was easy
import csv
from shapely.geometry import MultiPoint

# Read the CSV into a dict of columns: {header: [values, ...]}
with open('refineries.csv', newline='') as infile:
    reader = csv.DictReader(infile)
    data = {}
    for row in reader:
        for header, value in row.items():
            try:
                data[header].append(value)
            except KeyError:
                data[header] = [value]

# CSV values are strings, so convert to float and order them as (lon, lat)
listo = [(float(lon), float(lat))
         for lon, lat in zip(data['Longitude'], data['Latitude'])]
points1 = MultiPoint(listo)

# The same points, hard-coded
points = MultiPoint([(-94.933611, 29.374722), (-94.903253, 29.368733),
                     (-94.909515, 29.367617), (-95.234814, 29.71584),
                     (-95.255198, 29.722213), (-95.009208, 29.743865),
                     (-95.12495, 29.720425), (-95.208807, 29.722466)])
print(points.centroid)
Shift all points away from the centroid until a minimum distance between every pair of points is reached
Could you please help me here? Thanks in advance!
It depends on how exactly you want to shift the points away from the centroid. One way would be to calculate, for each point, its great-circle distance and azimuth with respect to the centroid, and then rescale all distances by a common factor so that the distance between the two closest points is larger than a specified threshold. pyproj can be used to calculate the azimuths and distances.
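The rescaling idea can be sketched roughly as follows. This is only an illustration, not a definitive implementation: it uses the standard library with a spherical Earth model (an assumption) instead of pyproj, and the helper names `distance_azimuth`, `destination`, `min_pairwise` and `spread`, as well as the 5 km threshold, are made up for this example. With pyproj, `Geod.inv` and `Geod.fwd` could take the place of the two trigonometric helpers and would give ellipsoidal results.

```python
import math
from itertools import combinations

R = 6371000.0  # mean Earth radius in metres (spherical approximation, an assumption)

def distance_azimuth(lon1, lat1, lon2, lat2):
    """Great-circle distance (m) and initial azimuth (deg) from point 1 to point 2."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dl = math.radians(lon2 - lon1)
    a = math.sin((p2 - p1) / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    dist = 2 * R * math.asin(min(1.0, math.sqrt(a)))
    az = math.atan2(math.sin(dl) * math.cos(p2),
                    math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dl))
    return dist, math.degrees(az)

def destination(lon, lat, az, dist):
    """Point reached from (lon, lat) after travelling dist metres along azimuth az."""
    p1, l1, t, d = math.radians(lat), math.radians(lon), math.radians(az), dist / R
    p2 = math.asin(math.sin(p1) * math.cos(d) + math.cos(p1) * math.sin(d) * math.cos(t))
    l2 = l1 + math.atan2(math.sin(t) * math.sin(d) * math.cos(p1),
                         math.cos(d) - math.sin(p1) * math.sin(p2))
    return math.degrees(l2), math.degrees(p2)

def min_pairwise(points):
    """Smallest great-circle distance (m) between any two points."""
    return min(distance_azimuth(*a, *b)[0] for a, b in combinations(points, 2))

def spread(points, min_dist):
    """Scale every point's distance from the centroid by one common factor,
    keeping its azimuth, until all pairs are at least min_dist metres apart.
    Assumes the input points are distinct (otherwise the factor is undefined)."""
    cx = sum(lon for lon, _ in points) / len(points)
    cy = sum(lat for _, lat in points) / len(points)
    polar = [distance_azimuth(cx, cy, lon, lat) for lon, lat in points]
    scale = max(1.0, min_dist / min_pairwise(points))
    for _ in range(1000):
        moved = [destination(cx, cy, az, scale * d) for d, az in polar]
        if min_pairwise(moved) >= min_dist:
            return moved, scale
        scale *= 1.01  # curvature makes the first guess slightly short; nudge it up
    raise ValueError("could not reach the requested minimum distance")

# (lon, lat) pairs from the question
points = [(-94.933611, 29.374722), (-94.903253, 29.368733),
          (-94.909515, 29.367617), (-95.234814, 29.71584),
          (-95.255198, 29.722213), (-95.009208, 29.743865),
          (-95.12495, 29.720425), (-95.208807, 29.722466)]

moved, scale = spread(points, 5000.0)  # 5 km threshold, chosen arbitrarily
print(scale)
for lon, lat in moved:
    print(lon, lat)
```

Because a single factor is applied to all radial distances, the relative layout of the points is preserved, but the result covers a much larger area than the original, so the shifted positions are only suitable for a schematic map.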
Alternatively, this might be combined with an approach that also relaxes the azimuths. This would in principle result in a smaller scaling factor for the radial distances. However, it would also slightly distort the visual distribution of the points. The method described above might also be improved by ignoring any outlier points in the rescaling, i.e., points which are already sufficiently far from the centroid and which have no nearby neighbors.