{ "cells": [ { "cell_type": "markdown", "id": "cc2904fd-533c-4a66-901f-3e4d10686b3d", "metadata": {}, "source": [ "# Regrid" ] }, { "cell_type": "code", "execution_count": 1, "id": "69554423-9c37-41c0-98e9-88691a829ead", "metadata": {}, "outputs": [], "source": [ "import xarray\n", "import numpy\n", "import pandas\n", "import climtas\n", "import xesmf\n", "import dask.array" ] }, { "cell_type": "markdown", "id": "dfd7e675-ec1c-439e-872c-1252fb571e46", "metadata": {}, "source": [ "We have a large Dask dataset that we'd like to regrid to a different resolution, so we can compare it with a different dataset" ] }, { "cell_type": "code", "execution_count": 2, "id": "763858b4-deca-4f29-981d-bc09af00e8d2", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.DataArray 'temperature' (time: 1095, lat: 50, lon: 100)>\n",
       "dask.array<random_sample, shape=(1095, 50, 100), dtype=float64, chunksize=(90, 25, 25), chunktype=numpy.ndarray>\n",
       "Coordinates:\n",
       "  * time     (time) datetime64[ns] 2001-01-01 2001-01-02 ... 2003-12-31\n",
       "  * lat      (lat) float64 -90.0 -86.33 -82.65 -78.98 ... 78.98 82.65 86.33 90.0\n",
       "  * lon      (lon) float64 -180.0 -176.4 -172.8 -169.2 ... 169.2 172.8 176.4
" ], "text/plain": [ "\n", "dask.array\n", "Coordinates:\n", " * time (time) datetime64[ns] 2001-01-01 2001-01-02 ... 2003-12-31\n", " * lat (lat) float64 -90.0 -86.33 -82.65 -78.98 ... 78.98 82.65 86.33 90.0\n", " * lon (lon) float64 -180.0 -176.4 -172.8 -169.2 ... 169.2 172.8 176.4" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "time = pandas.date_range('20010101', '20040101', freq='D', closed='left')\n", "\n", "data = dask.array.random.random((len(time),50,100), chunks=(90,25,25))\n", "lat = numpy.linspace(-90, 90, data.shape[1])\n", "lon = numpy.linspace(-180, 180, data.shape[2], endpoint=False)\n", "\n", "da = xarray.DataArray(data, coords=[('time', time), ('lat', lat), ('lon', lon)], name='temperature')\n", "da.lat.attrs['standard_name'] = 'latitude'\n", "da.lon.attrs['standard_name'] = 'longitude'\n", "\n", "da" ] }, { "cell_type": "markdown", "id": "d0a294ae-cb5b-482f-983b-d3dc953fd995", "metadata": {}, "source": [ "Here's a variable on the target grid, that we'd like to regrid `da` to" ] }, { "cell_type": "code", "execution_count": 3, "id": "df30fb03-c45b-4bb0-bc81-d5e4010161c8", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.DataArray 'zeros-18791f2cff3f47059e412e1e9c5470f9' (lat: 10, lon: 50)>\n",
       "dask.array<zeros, shape=(10, 50), dtype=float64, chunksize=(10, 50), chunktype=numpy.ndarray>\n",
       "Coordinates:\n",
       "  * lat      (lat) float64 -90.0 -70.0 -50.0 -30.0 -10.0 ... 30.0 50.0 70.0 90.0\n",
       "  * lon      (lon) float64 0.0 7.2 14.4 21.6 28.8 ... 331.2 338.4 345.6 352.8
" ], "text/plain": [ "\n", "dask.array\n", "Coordinates:\n", " * lat (lat) float64 -90.0 -70.0 -50.0 -30.0 -10.0 ... 30.0 50.0 70.0 90.0\n", " * lon (lon) float64 0.0 7.2 14.4 21.6 28.8 ... 331.2 338.4 345.6 352.8" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "t_lat = numpy.linspace(-90, 90, 10)\n", "t_lon = numpy.linspace(0, 360, 50, endpoint=False)\n", "\n", "t_da = xarray.DataArray(dask.array.zeros((len(t_lat), len(t_lon))), coords=[('lat', t_lat), ('lon', t_lon)])\n", "\n", "t_da" ] }, { "cell_type": "markdown", "id": "3be98c61-5e98-411f-b729-ec7525af3fc1", "metadata": {}, "source": [ "The main way to do a regridding in Xarray is to use [xesmf](https://xesmf.readthedocs.io/en/latest/), which uses the ESMF library to do the actual regridding. However data that is chunked horizontally produces an error message, which is an issue for very large grids." ] }, { "cell_type": "code", "execution_count": 4, "id": "2d0b8190-aa7b-4ddf-b7f9-d21fa525de7e", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/home/swales/miniconda3/envs/dev/lib/python3.8/site-packages/dask/array/core.py:378: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison\n", " o = func(*args, **kwargs)\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Error dimension lat on 0th function argument to apply_ufunc with dask='parallelized' consists of multiple chunks, but is also a core dimension. To fix, either rechunk into a single dask array chunk along this dimension, i.e., ``.chunk(lat: -1)``, or pass ``allow_rechunk=True`` in ``dask_gufunc_kwargs`` but beware that this may significantly increase memory usage.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/home/swales/miniconda3/envs/dev/lib/python3.8/site-packages/xesmf/frontend.py:466: FutureWarning: ``output_sizes`` should be given in the ``dask_gufunc_kwargs`` parameter. It will be removed as direct parameter in a future version.\n", " dr_out = xr.apply_ufunc(\n" ] } ], "source": [ "re = xesmf.Regridder(da, t_da, method='bilinear')\n", "\n", "try:\n", " re(da)\n", "except Exception as e:\n", " print(\"Error\", e)" ] }, { "cell_type": "markdown", "id": "12ffe355-b8ba-400c-9014-a00575791adf", "metadata": {}, "source": [ "Instead you have to remove the horizontal chunking before doing a xesmf regrid." ] }, { "cell_type": "code", "execution_count": 5, "id": "29102c42-e37f-4117-a198-b231b19343c9", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.DataArray 'temperature' (time: 1095, lat: 10, lon: 50)>\n",
       "dask.array<transpose, shape=(1095, 10, 50), dtype=float64, chunksize=(90, 10, 50), chunktype=numpy.ndarray>\n",
       "Coordinates:\n",
       "  * time     (time) datetime64[ns] 2001-01-01 2001-01-02 ... 2003-12-31\n",
       "  * lon      (lon) float64 0.0 7.2 14.4 21.6 28.8 ... 331.2 338.4 345.6 352.8\n",
       "  * lat      (lat) float64 -90.0 -70.0 -50.0 -30.0 -10.0 ... 30.0 50.0 70.0 90.0\n",
       "Attributes:\n",
       "    regrid_method:  bilinear
" ], "text/plain": [ "\n", "dask.array\n", "Coordinates:\n", " * time (time) datetime64[ns] 2001-01-01 2001-01-02 ... 2003-12-31\n", " * lon (lon) float64 0.0 7.2 14.4 21.6 28.8 ... 331.2 338.4 345.6 352.8\n", " * lat (lat) float64 -90.0 -70.0 -50.0 -30.0 -10.0 ... 30.0 50.0 70.0 90.0\n", "Attributes:\n", " regrid_method: bilinear" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "re(da.chunk({'lat': None, 'lon': None}))" ] }, { "cell_type": "markdown", "id": "5ddc9dfb-a47f-46be-8156-1d7d55ea8b04", "metadata": {}, "source": [ "The [climtas.regrid](api/regrid.rst) functions support regridding horizontally chunked data. By default the regridding is bilinear, with weights generated by CDO, but you can also supply your own regridding weights or have the library generate them for you from ESMF or CDO." ] }, { "cell_type": "code", "execution_count": 6, "id": "3cde370d-ee50-453c-adae-6235ba7628c8", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.DataArray 'temperature' (time: 1095, lat: 10, lon: 50)>\n",
       "dask.array<reshape, shape=(1095, 10, 50), dtype=float64, chunksize=(90, 10, 50), chunktype=numpy.ndarray>\n",
       "Coordinates:\n",
       "  * time     (time) datetime64[ns] 2001-01-01 2001-01-02 ... 2003-12-31\n",
       "  * lat      (lat) float64 -90.0 -70.0 -50.0 -30.0 -10.0 ... 30.0 50.0 70.0 90.0\n",
       "  * lon      (lon) float64 0.0 7.2 14.4 21.6 28.8 ... 331.2 338.4 345.6 352.8
" ], "text/plain": [ "\n", "dask.array\n", "Coordinates:\n", " * time (time) datetime64[ns] 2001-01-01 2001-01-02 ... 2003-12-31\n", " * lat (lat) float64 -90.0 -70.0 -50.0 -30.0 -10.0 ... 30.0 50.0 70.0 90.0\n", " * lon (lon) float64 0.0 7.2 14.4 21.6 28.8 ... 331.2 338.4 345.6 352.8" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "climtas.regrid.regrid(da, t_da)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.6" } }, "nbformat": 4, "nbformat_minor": 5 }