Convert (latitude, longitude, variable) scattered dataset to gridded data


I have a pandas data frame (90720 rows) consisting of longitude, latitude, and variable columns. The data represent points on a 1.3 km resolution grid but are not in any particular order within the data frame. An example of the dataset looks like:

lon lat var
40.601700 -90.078857 0.006614
40.598942 -90.031372 0.048215
40.592426 -89.920563 0.012860
40.591480 -89.904724 0.006642
40.590546 -89.888916 0.005383
43.642635 -89.904724 0.012860
40.590546 -84.545715 0.012860

I would like to convert these lat/lon/var points into a gridded dataset. Most approaches I have tried (df.pivot) require significant memory due to the size of the data frame. The final gridded data should have a shape of (288,315). Ultimately, I want to plot this data with plt.colormesh() to compare it with other datasets. I appreciate any suggestions!

There is 1 answer

Answer by OCa

This is strongly inspired by "Resampling irregularly spaced data to a regular grid in Python", asked 12 years ago. I updated the code to run on current Python and improved its readability. It uses plt.pcolormesh, which I assume is what you meant by plt.colormesh.

With df being the data frame from your question:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import griddata

# grid sizing
# number of grid points:
nx, ny = 288, 315
# grid window
xmin, xmax = 40, 45
ymin, ymax = -91, -84

# Generate a regular grid to interpolate the data.
X, Y = np.meshgrid(np.linspace(xmin, xmax, nx), 
                   np.linspace(ymin, ymax, ny))

# Interpolate onto the regular grid using the "cubic" method
Z = griddata(points = (df.lon, df.lat),
             values = df['var'],
             xi = (X, Y),
             method = 'cubic')

# Plot the results
plt.figure()
plt.pcolormesh(X, Y, Z)
plt.scatter(x=df.lon, y=df.lat, c=df['var'])
plt.colorbar()
plt.axis([xmin, xmax, ymin, ymax])
plt.xlabel('longitude')
plt.ylabel('latitude')
plt.title('Overlay: Scatter and Grid')
plt.show()
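
A note on the shape of Z: np.meshgrid with its default indexing returns arrays of shape (ny, nx), so Z above comes out as (315, 288) rather than (288, 315). If you need the other orientation, transposing should be enough. Here is a minimal sketch to check this. The data frame is a hypothetical stand-in (500 random points with made-up values), not your real data:

```python
import numpy as np
import pandas as pd
from scipy.interpolate import griddata

# Hypothetical stand-in for the real df: 500 random points inside the grid window
rng = np.random.default_rng(0)
df = pd.DataFrame({
    'lon': rng.uniform(40, 45, 500),
    'lat': rng.uniform(-91, -84, 500),
})
df['var'] = 0.01 * df['lon'] + 0.02 * df['lat']  # made-up values, for illustration only

# Same grid sizing as above
nx, ny = 288, 315
X, Y = np.meshgrid(np.linspace(40, 45, nx), np.linspace(-91, -84, ny))

Z = griddata(points=(df['lon'].to_numpy(), df['lat'].to_numpy()),
             values=df['var'].to_numpy(),
             xi=(X, Y),
             method='cubic')

print(Z.shape)    # (315, 288): rows follow Y (ny), columns follow X (nx)
print(Z.T.shape)  # (288, 315)
```

Which axis of your (288, 315) target is longitude is something only you can confirm, so treat the transpose as an assumption to verify against your other datasets.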

With your initial dataset as circles:

[Figure: gridded scatter, cubic interpolation]
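
One thing to watch for: with method='linear' or method='cubic', griddata returns NaN for grid nodes that fall outside the convex hull of the input points, so the edges of the grid may come out empty. A possible workaround, sketched below with made-up stand-in data, is to fill those nodes from a second pass with method='nearest'. Whether such extrapolated edge values are acceptable depends on what you compare the grid against.

```python
import numpy as np
from scipy.interpolate import griddata

# Hypothetical scattered points that do not reach the edges of the grid window
rng = np.random.default_rng(0)
lon = rng.uniform(41, 44, 300)
lat = rng.uniform(-90, -85, 300)
var = 0.01 * lon + 0.02 * lat  # made-up values, for illustration only

# A small grid that extends beyond the scattered points
X, Y = np.meshgrid(np.linspace(40, 45, 50), np.linspace(-91, -84, 60))

# Cubic interpolation leaves NaN outside the convex hull of the points
Z_cubic = griddata((lon, lat), var, (X, Y), method='cubic')

# Nearest-neighbour interpolation is defined everywhere
Z_near = griddata((lon, lat), var, (X, Y), method='nearest')

# Keep the cubic values where available, fall back to nearest elsewhere
Z_filled = np.where(np.isnan(Z_cubic), Z_near, Z_cubic)

print(np.isnan(Z_cubic).any())   # True
print(np.isnan(Z_filled).any())  # False
```

If you would rather leave the outside region empty, you can skip the fill step and keep the NaN values as they are.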