Format/round numerical legend labels in GeoPandas


I'm looking for a way to format or round the numerical legend labels in maps produced by the .plot() method in GeoPandas. For example:

gdf.plot(column='pop2010', scheme='QUANTILES', k=4, legend=True)

This gives me a legend with many decimal places.

I want the legend label to be integers.


There are 2 answers

Answer by Brendan (best answer)

As I recently encountered the same issue, and a solution does not appear to be readily available on Stack Overflow or other sites, I thought I would post the approach I took in case it is useful.

First, a basic plot using the geopandas world map:

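import geopandas
import matplotlib.pyplot as plt

# load world data set (geopandas.datasets was removed in GeoPandas 1.0;
# on newer versions, read a local copy of the Natural Earth file instead)
world_orig = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))
# keep rows with a population estimate and exclude Antarctica
world = world_orig[(world_orig['pop_est'] > 0) & (world_orig['name'] != "Antarctica")].copy()
world['gdp_per_cap'] = world['gdp_md_est'] / world['pop_est']

# basic plot; .plot() returns a matplotlib Axes, and scheme= requires mapclassify
ax = world.plot(column='pop_est', figsize=(12,8), scheme='fisher_jenks',
                cmap='YlGnBu', legend=True,
                legend_kwds={'loc': 'lower left'})  # instead of setting the private leg._loc = 3
plt.show()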

world map v1

The method I used relies on the get_texts() method of the matplotlib.legend.Legend object: iterate over the items in leg.get_texts(), split each label into its lower and upper bounds, build a new string with the desired formatting, and set it with set_text().

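# formatted legend
ax = world.plot(column='pop_est', figsize=(12,8), scheme='fisher_jenks',
                cmap='YlGnBu', legend=True,
                legend_kwds={'loc': 'lower left'})  # instead of setting the private leg._loc = 3
leg = ax.get_legend()

for lbl in leg.get_texts():
    label_text = lbl.get_text()
    # labels look like 'lower - upper' or, in newer GeoPandas versions, 'lower, upper';
    # dropping commas and taking the first and last tokens handles both layouts
    parts = label_text.replace(',', ' ').split()
    lower, upper = parts[0], parts[-1]
    # thousands separators, no decimal places
    new_text = f'{float(lower):,.0f} - {float(upper):,.0f}'
    lbl.set_text(new_text)

plt.show()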

This is very much a 'trial and error' approach, so I wouldn't be surprised if there were a better way. Still, perhaps this will be helpful.
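Splitting on whitespace ties the loop to one particular label layout, and the layout can differ between GeoPandas versions or when interval brackets are enabled. As a rough sketch (the helper name format_interval_label and its fmt parameter are made up for illustration, and it assumes each label contains exactly two plain numbers without thousands separators), the numbers can be pulled out with a regular expression instead:

```python
import re

# Matches plain or scientific-notation numbers such as 140.00, -3.5 or 1.2e+08.
_NUMBER = re.compile(r'[-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?')


def format_interval_label(label_text, fmt='{:,.0f}'):
    """Reformat a legend label that contains a lower and an upper bound.

    Hypothetical helper: labels without exactly two numbers are returned
    unchanged, so non-interval entries are left alone.
    """
    numbers = _NUMBER.findall(label_text)
    if len(numbers) != 2:
        return label_text
    lower, upper = (float(n) for n in numbers)
    return f'{fmt.format(lower)} - {fmt.format(upper)}'


print(format_interval_label('140.00 - 3398.00'))
print(format_interval_label('( 91.51,  97.93]'))
```

Inside the loop this would be used as lbl.set_text(format_interval_label(lbl.get_text())). A label written without spaces around the dash, such as '1-5', would be read as 1 and -5, so this is only suitable when the separator is padded or is a comma.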

world map v2

Answer by steven

Method 1

According to the GeoPandas changelog, since version 0.8.0 (June 24, 2020) you can pass fmt in legend_kwds to format the legend labels. For example, if you want no decimal places, set fmt='{:.0f}', using the same format specification you would use in an f-string. Here's an example for a quantiles map:

import matplotlib.pyplot as plt
import numpy as np
import mapclassify
import geopandas as gpd

gdf = gpd.read_file(
    gpd.datasets.get_path('naturalearth_lowres')
)
np.random.seed(0)
gdf = gdf.assign(
    random_col=np.random.normal(100, 10, len(gdf))
)

# plot quantiles map
fig, ax = plt.subplots(figsize=(10, 10))
gdf.plot(
    column='random_col',
    scheme='quantiles', k=5, cmap='Blues',
    legend=True,
    legend_kwds=dict(fmt='{:.0f}', interval=True),
    ax=ax
)

This gives a legend whose interval bounds are rounded to integers.
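Because fmt is an ordinary format template, candidate values can be previewed without drawing a map. The snippet below is only a sketch: it assumes the template is applied to each class bound with str.format, and the sample value is arbitrary.

```python
# Arbitrary sample value standing in for a class bound.
value = 1234567.891

# A few templates that could be passed as legend_kwds=dict(fmt=...).
templates = ['{:.2f}', '{:.0f}', '{:,.0f}', '{:.2e}']

# Map each template to the text it produces for the sample value.
previews = {template: template.format(value) for template in templates}

for template, text in previews.items():
    print(f'{template!r:>12} -> {text}')
```

The '{:,.0f}' variant adds thousands separators, which tends to help with population-sized numbers.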


Method 2

Under the hood, GeoPandas uses PySAL's mapclassify to compute the class breaks that appear in the legend. For the quantiles map (k=5) above, we can get the same classification directly from mapclassify.Quantiles():

mapclassify.Quantiles(gdf.random_col, k=5)

The call returns a mapclassify.classifiers.Quantiles object, which prints as:

Quantiles               

    Interval       Count
------------------------
[ 74.47,  91.51] |    36
( 91.51,  97.93] |    35
( 97.93, 103.83] |    35
(103.83, 109.50] |    35
(109.50, 123.83] |    36

The object has a bins attribute, a NumPy array containing the upper bound of each class:

array([ 91.51435701,  97.92957441, 103.83406507, 109.49954895,
       123.83144775])

Thus, we can use this attribute to recover all the class bounds, since the upper bound of one class equals the lower bound of the next. The only one missing is the lower bound of the lowest class, which equals the minimum value of the column being classified. Here's an example that rounds all bounds to integers (it reuses ax from the plot in Method 1):

# get all upper bounds
upper_bounds = mapclassify.Quantiles(gdf.random_col, k=5).bins
# insert minimal value in front to get all bounds
bounds = np.insert(upper_bounds, 0, gdf.random_col.min())
# format the numerical legend here
intervals = [
    f'{bounds[i]:.0f}-{bounds[i+1]:.0f}' for i in range(len(bounds)-1)
]

# get all the legend labels
legend_labels = ax.get_legend().get_texts()
# replace the legend labels
for interval, legend_label in zip(intervals, legend_labels):
    legend_label.set_text(interval)

The legend now shows each class as lower-upper, with both bounds rounded to integers.

Because we are working at a lower level, we can fully control how the legend labels look, for example dropping the brackets and putting a - between the bounds.
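The label-building step can also be separated from the plotting code. The function below is a sketch, not part of GeoPandas or mapclassify: labels_from_bins and its parameters are names chosen for this example, and the minimum of 74.47 is read off the rounded table above rather than computed from the data.

```python
def labels_from_bins(upper_bounds, minimum, fmt='{:.0f}', sep='-'):
    """Build one 'lower<sep>upper' label per class.

    upper_bounds: upper bound of each class, in ascending order
                  (for example the .bins array of a mapclassify classifier)
    minimum:      lower bound of the first class (the column minimum)
    """
    bounds = [minimum, *upper_bounds]
    # Pair each bound with its successor: (b0, b1), (b1, b2), ...
    return [
        f'{fmt.format(lower)}{sep}{fmt.format(upper)}'
        for lower, upper in zip(bounds, bounds[1:])
    ]


# Upper bounds copied from the bins array shown earlier.
upper_bounds = [91.51435701, 97.92957441, 103.83406507,
                109.49954895, 123.83144775]
labels = labels_from_bins(upper_bounds, minimum=74.47)
print(labels)  # ['74-92', '92-98', '98-104', '104-109', '109-124']
```

The resulting list can be zipped with ax.get_legend().get_texts() exactly as in the loop above. Passing a different fmt or sep, such as fmt='{:,.1f}' and sep=' to ', changes the look without touching the classification.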


Method 3

Besides GeoPandas' .plot() method, you can also consider the choropleth() function offered by geoplot. It lets you choose among classification schemes and numbers of classes, and it accepts a legend_labels argument for setting the legend labels directly. For example:

import geopandas as gpd
import geoplot as gplt

gdf = gpd.read_file(
    gpd.datasets.get_path('naturalearth_lowres')
)

legend_labels = [
    '< 2.4', '2.4 - 6', '6 - 15', '15 - 38', '38 - 140 M'
]
gplt.choropleth(
    gdf, hue='pop_est', cmap='Blues', scheme='quantiles',
    legend=True, legend_labels=legend_labels
)
