Regrid#

[1]:
import xarray
import numpy
import pandas
import climtas
import xesmf
import dask.array

We have a large Dask dataset that we’d like to regrid to a different resolution, so we can compare it with a different dataset

[2]:
time = pandas.date_range('20010101', '20040101', freq='D', closed='left')

data = dask.array.random.random((len(time),50,100), chunks=(90,25,25))
lat = numpy.linspace(-90, 90, data.shape[1])
lon = numpy.linspace(-180, 180, data.shape[2], endpoint=False)

da = xarray.DataArray(data, coords=[('time', time), ('lat', lat), ('lon', lon)], name='temperature')
da.lat.attrs['standard_name'] = 'latitude'
da.lon.attrs['standard_name'] = 'longitude'

da
[2]:
<xarray.DataArray 'temperature' (time: 1095, lat: 50, lon: 100)>
dask.array<random_sample, shape=(1095, 50, 100), dtype=float64, chunksize=(90, 25, 25), chunktype=numpy.ndarray>
Coordinates:
  * time     (time) datetime64[ns] 2001-01-01 2001-01-02 ... 2003-12-31
  * lat      (lat) float64 -90.0 -86.33 -82.65 -78.98 ... 78.98 82.65 86.33 90.0
  * lon      (lon) float64 -180.0 -176.4 -172.8 -169.2 ... 169.2 172.8 176.4

Here’s a variable on the target grid, that we’d like to regrid da to

[3]:
t_lat = numpy.linspace(-90, 90, 10)
t_lon = numpy.linspace(0, 360, 50, endpoint=False)

t_da = xarray.DataArray(dask.array.zeros((len(t_lat), len(t_lon))), coords=[('lat', t_lat), ('lon', t_lon)])

t_da
[3]:
<xarray.DataArray 'zeros-18791f2cff3f47059e412e1e9c5470f9' (lat: 10, lon: 50)>
dask.array<zeros, shape=(10, 50), dtype=float64, chunksize=(10, 50), chunktype=numpy.ndarray>
Coordinates:
  * lat      (lat) float64 -90.0 -70.0 -50.0 -30.0 -10.0 ... 30.0 50.0 70.0 90.0
  * lon      (lon) float64 0.0 7.2 14.4 21.6 28.8 ... 331.2 338.4 345.6 352.8

The main way to do a regridding in Xarray is to use xesmf, which uses the ESMF library to do the actual regridding. However data that is chunked horizontally produces an error message, which is an issue for very large grids.

[4]:
re = xesmf.Regridder(da, t_da, method='bilinear')

try:
    re(da)
except Exception as e:
    print("Error", e)
/home/swales/miniconda3/envs/dev/lib/python3.8/site-packages/dask/array/core.py:378: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
  o = func(*args, **kwargs)
Error dimension lat on 0th function argument to apply_ufunc with dask='parallelized' consists of multiple chunks, but is also a core dimension. To fix, either rechunk into a single dask array chunk along this dimension, i.e., ``.chunk(lat: -1)``, or pass ``allow_rechunk=True`` in ``dask_gufunc_kwargs`` but beware that this may significantly increase memory usage.
/home/swales/miniconda3/envs/dev/lib/python3.8/site-packages/xesmf/frontend.py:466: FutureWarning: ``output_sizes`` should be given in the ``dask_gufunc_kwargs`` parameter. It will be removed as direct parameter in a future version.
  dr_out = xr.apply_ufunc(

Instead you have to remove the horizontal chunking before doing a xesmf regrid.

[5]:
re(da.chunk({'lat': None, 'lon': None}))
[5]:
<xarray.DataArray 'temperature' (time: 1095, lat: 10, lon: 50)>
dask.array<transpose, shape=(1095, 10, 50), dtype=float64, chunksize=(90, 10, 50), chunktype=numpy.ndarray>
Coordinates:
  * time     (time) datetime64[ns] 2001-01-01 2001-01-02 ... 2003-12-31
  * lon      (lon) float64 0.0 7.2 14.4 21.6 28.8 ... 331.2 338.4 345.6 352.8
  * lat      (lat) float64 -90.0 -70.0 -50.0 -30.0 -10.0 ... 30.0 50.0 70.0 90.0
Attributes:
    regrid_method:  bilinear

The climtas.regrid functions support regridding horizontally chunked data. By default the regridding is bilinear, with weights generated by CDO, but you can also supply your own regridding weights or have the library generate them for you from ESMF or CDO.

[6]:
climtas.regrid.regrid(da, t_da)
[6]:
<xarray.DataArray 'temperature' (time: 1095, lat: 10, lon: 50)>
dask.array<reshape, shape=(1095, 10, 50), dtype=float64, chunksize=(90, 10, 50), chunktype=numpy.ndarray>
Coordinates:
  * time     (time) datetime64[ns] 2001-01-01 2001-01-02 ... 2003-12-31
  * lat      (lat) float64 -90.0 -70.0 -50.0 -30.0 -10.0 ... 30.0 50.0 70.0 90.0
  * lon      (lon) float64 0.0 7.2 14.4 21.6 28.8 ... 331.2 338.4 345.6 352.8