Decompress Bing Maps GeoData (borders) with Python

546 views Asked by At

I have been trying to decompress the Bing Maps location/border shape algorithm using Python. My end goal is to have custom regions/borders created from combining multiple counties and cities, and save the location data to our database for faster and more accurate location based analysis.

My strategy is as follows, but I'm a little stuck on #2, since I can't seem to accurately decompress the code:

  1. Retrieve County/City borders from Bing Maps GeoData API - They refer to it as "shape"
  2. Decompress their "shape" data to get the latitude and longitude of the border points
  3. Remove the points that have the same lat/lng as other shapes (The goal is to make one large shape of multiple counties, as opposed to 5-6 separate shapes)
  4. Compress the end result and save in the database

The function I am using seems to work for the example of 'vx1vilihnM6hR7mEl2Q' provided in the Point Compression Algorithm documentation. However, when I insert something a little more complex, like Cook County, the formula seems to be off (tested by inserting several of the points into different polygon mapping/drawing applications that also use Bing Maps). It basically creates a line at the south side of Chicago that vigorously goes East and West into Indiana, without much North-South movement. Without knowing what the actual coordinates of any counties are supposed to be, I'm not sure how to figure out where I'm going wrong.

Any help is greatly appreciated, even if it is a suggestion of a different strategy.

Here is the python code (sorry for the overuse of the decimal format - my poor attempt to ensure the error wasn't a result of inadvertently losing precision):

safeCharacters = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_-'

def decodeBingBorder(compressedData): latLng = [] pointsArray = [] point = [] lastLat = Decimal(0) lastLng = Decimal(0)

# Assigns the number of of each character based on the respective index of 'safeCharacters'
# numbers under 32 indicate it is the last number of the combination of the point, and a new point is begun
for char in compressedData:
    num = Decimal(safeCharacters.index(char))
    if num < 32:
        point.append(num)
        pointsArray.append(point)
        point = []
    else:
        num -= Decimal(32)
        point.append(num)

# Loops through each point to determine the lat/lng of each point
for pnt in pointsArray:
    result = Decimal(0)

    # This revereses step 7 of the Point Compression Algorithm https://msdn.microsoft.com/en-us/library/jj158958.aspx
    for num in reversed(pnt):
        if result == 0:
            result = num
        else:
            result = result * Decimal(32) + num

    # This was pretty much taken from the Decompression Algorithm (not in Python format) at https://msdn.microsoft.com/en-us/library/dn306801.aspx
    # Determine which diaganal it's on
    diag = Decimal(int(round((math.sqrt(8 * result + 5) - 1) / 2)))

    # submtract the total number of points from lower diagonals, and get the X and Y from what's left over
    latY = Decimal(result - Decimal(diag * (diag + 1) / 2))
    lngX = Decimal(diag - latY)

    # undo the sign encoding
    if latY % 2 == 1:
        latY = (latY + Decimal(1)) * Decimal(-1)
    if lngX % 2 == 1:
        lngX = (lngX + Decimal(1)) * Decimal(-1)
    latY /= 2
    lngX /= 2

    # undo the delta encoding
    lat = latY + lastLat
    lng = lngX + lastLng
    lastLat = lat
    lastLng = lng

    # position the decimal point
    lat /= Decimal(100000)
    lng /= Decimal(100000)

    # append the point to the latLng list in a string format, as opposed to the decimal format
    latLng.append([str(lat), str(lng)])

return latLng

The compressed algorithm:

1440iqu9vJ957r8pB_825syB6rh_gXh1-ntqB56sk2B2nq07Mwvq5f64r0m0Fni11ooE4kkvxEy4wzMuotr_DvsiqvFozvt-Lw9znxH-r5oxLv9yxCwhh7wKnk4sB8o0Rvv56D8snW5n1jBg50K4kplClkpqBpgl9F4h4X_sjMs85Ls6qQi6qvqBr188mBqk-pqIxxsx5EpsjosI-8hgIoygDigU94l_4C

This is the result:

[['41.46986', '-87.79031'], ['41.47033', '-87.52569'], ['41.469145', '-87.23372'], ['41.469395', '-87.03741'], ['41.41014', '-86.7114'], ['41.397545', '-86.64553'], ['41.3691', '-86.47018'], ['41.359585', '-86.41984'], ['41.353585', '-86.9637'], ['41.355725', '-87.43971'], ['41.35561', '-87.52716'], ['41.3555', '-87.55277'], ['41.354625', '-87.63504'], ['41.355635', '-87.54018'], ['41.360745', '-87.40351'], ['41.362315', '-87.29262'], ['41.36214', '-87.43194'], ['41.360915', '-87.44473'], ['41.35598', '-87.58256'], ['41.3551', '-87.59025'], ['41.35245', '-87.59828'], ['41.34782', '-87.60784'], ['41.34506', '-87.61664'], ['41.34267', '-87.6219'], ['41.34232', '-87.62643'], ['41.33809', '-87.63286'], ['41.33646', '-87.63956'], ['41.32985', '-87.65056'], ['41.33069', '-87.65596'], ['41.32965', '-87.65938'], ['41.33063', '-87.6628'], ['41.32924', '-87.66659'], ['41.32851', '-87.71306'], ['41.327105', '-87.75963'], ['41.329515', '-87.64388'], ['41.32698', '-87.73614'], ['41.32876', '-87.61933'], ['41.328275', '-87.6403'], ['41.328765', '-87.63857'], ['41.32866', '-87.63969'], ['41.32862', '-87.70802']]

4

There are 4 answers

0
Mark Hebert On BEST ANSWER

As mentioned by rbrundritt, storing the data from Big Maps is against the terms of use. However, there are other sources of this same data available, such as http://nationalmap.gov/boundaries.html

In the interest of solving the problem, and to store this and other coordinate data more efficiently, I solved the problem by removing the 'round' function when calculating 'diag'. This should be what replaces it:

diag = int((math.sqrt(8 * result + 5) - 1) / 2)

All of the 'Decimal' crap I added is not necessary, so you can remove it if you wish.

0
rbrundritt On

You can't store the boundary data from the Bing Maps GeoData API or any data derived from it in a database. This is against the terms of use of the platform.

0
Maksym Zaitsev On

You can also do

diag=int(round((sqrt(8 * number + 1)/ 2)-1/2.))

Don't forget to subtract longitude*2 from latitude to get N/E coordinates!

0
ivan On

Maybe it will be usefull, i found bug in code.

invert pair function should be

diag = math.floor((math.sqrt(8 * result + 1) - 1) / 2)

after fixing this, your implementation work correct