{ "cells": [ { "cell_type": "markdown", "id": "0702ec6c-f370-4cad-bb32-bb80d4335bdb", "metadata": {}, "source": [ "# Event" ] }, { "cell_type": "code", "execution_count": 1, "id": "4ce4b199-5572-4e33-83b6-593ce09e57a4", "metadata": {}, "outputs": [], "source": [ "import xarray\n", "import numpy\n", "import pandas\n", "import climtas\n", "import xesmf\n", "import dask.array" ] }, { "cell_type": "markdown", "id": "bfc85e52-2dc0-44f7-b271-8e59361fbc6e", "metadata": {}, "source": [ "We have a Dask-backed dataset, and we'd like to identify periods where the value is above some threshold." ] }, { "cell_type": "code", "execution_count": 2, "id": "7c1dcb98-fbaf-403e-a388-523b73fc6821", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.DataArray 'temperature' (time: 1095, lat: 50, lon: 100)>\n",
       "dask.array<random_sample, shape=(1095, 50, 100), dtype=float64, chunksize=(90, 25, 25), chunktype=numpy.ndarray>\n",
       "Coordinates:\n",
       "  * time     (time) datetime64[ns] 2001-01-01 2001-01-02 ... 2003-12-31\n",
       "  * lat      (lat) float64 -90.0 -86.33 -82.65 -78.98 ... 78.98 82.65 86.33 90.0\n",
       "  * lon      (lon) float64 -180.0 -176.4 -172.8 -169.2 ... 169.2 172.8 176.4
" ], "text/plain": [ "\n", "dask.array\n", "Coordinates:\n", " * time (time) datetime64[ns] 2001-01-01 2001-01-02 ... 2003-12-31\n", " * lat (lat) float64 -90.0 -86.33 -82.65 -78.98 ... 78.98 82.65 86.33 90.0\n", " * lon (lon) float64 -180.0 -176.4 -172.8 -169.2 ... 169.2 172.8 176.4" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "time = pandas.date_range('20010101', '20040101', freq='D', closed='left')\n", "\n", "data = dask.array.random.random((len(time),50,100), chunks=(90,25,25))\n", "lat = numpy.linspace(-90, 90, data.shape[1])\n", "lon = numpy.linspace(-180, 180, data.shape[2], endpoint=False)\n", "\n", "da = xarray.DataArray(data, coords=[('time', time), ('lat', lat), ('lon', lon)], name='temperature')\n", "da.lat.attrs['standard_name'] = 'latitude'\n", "da.lon.attrs['standard_name'] = 'longitude'\n", "\n", "da" ] }, { "cell_type": "markdown", "id": "402209fd-7865-4577-b9d3-3ccf5bef1625", "metadata": {}, "source": [ "[climtas.event.find_events](api/event.rst#climtas.event.find_events) will create a Pandas table of events. You give it an array of boolean values, `True` if an event is active, which can e.g. be generated by comparing against a threshold like a mean or percentile." ] }, { "cell_type": "code", "execution_count": 3, "id": "5b9da0f5-50d3-463e-a624-6beda2d93fd0", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
      time  lat  lon  event_duration
0        2   23   12              10
1        3    1   12              11
2        3    9   21              11
3        3   20    6              11
4        6    0    6              10
...    ...  ...  ...             ...
2589  1074   31   82              11
2590  1079   36   79              11
2591  1081   34   98              12
2592  1084   25   79              10
2593  1083   35   78              11
\n", "

2594 rows × 4 columns

\n", "
" ], "text/plain": [ " time lat lon event_duration\n", "0 2 23 12 10\n", "1 3 1 12 11\n", "2 3 9 21 11\n", "3 3 20 6 11\n", "4 6 0 6 10\n", "... ... ... ... ...\n", "2589 1074 31 82 11\n", "2590 1079 36 79 11\n", "2591 1081 34 98 12\n", "2592 1084 25 79 10\n", "2593 1083 35 78 11\n", "\n", "[2594 rows x 4 columns]" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "threshold = da.mean('time')\n", "\n", "events = climtas.event.find_events(da > threshold, min_duration = 10)\n", "\n", "events" ] }, { "cell_type": "markdown", "id": "ed0c70f2-1c28-4cfb-bdee-9a68508df45a", "metadata": {}, "source": [ "Since the result is a Pandas table normal Pandas operations will work, here's a histogram of event durations. The values in the event table are the array indices where events start and the number of steps the event is active for." ] }, { "cell_type": "code", "execution_count": 4, "id": "7e560257-05d6-43b5-b21e-d4ff718d22db", "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "events.hist('event_duration', grid=False);" ] }, { "cell_type": "markdown", "id": "a7771f19-191d-4bfa-afa2-e1b2afcd4e34", "metadata": {}, "source": [ "You can convert from the indices to coordinates using [climtas.event.event_coords](api/event.rst#climtas.event.event_coords). Events still active at the end of the dataset are marked with a duration of NaT (not a time)" ] }, { "cell_type": "code", "execution_count": 5, "id": "dfe44704-7145-4d68-9509-8ef613101703", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
            time         lat     lon  event_duration
0     2001-01-03   -5.510204  -136.8         10 days
1     2001-01-04  -86.326531  -136.8         11 days
2     2001-01-04  -56.938776  -104.4         11 days
3     2001-01-04  -16.530612  -158.4         11 days
4     2001-01-07  -90.000000  -158.4         10 days
...          ...         ...     ...             ...
2589  2003-12-11   23.877551   115.2         11 days
2590  2003-12-16   42.244898   104.4         11 days
2591  2003-12-18   34.897959   172.8         12 days
2592  2003-12-21    1.836735   104.4         10 days
2593  2003-12-20   38.571429   100.8         11 days
\n", "

2594 rows × 4 columns

\n", "
" ], "text/plain": [ " time lat lon event_duration\n", "0 2001-01-03 -5.510204 -136.8 10 days\n", "1 2001-01-04 -86.326531 -136.8 11 days\n", "2 2001-01-04 -56.938776 -104.4 11 days\n", "3 2001-01-04 -16.530612 -158.4 11 days\n", "4 2001-01-07 -90.000000 -158.4 10 days\n", "... ... ... ... ...\n", "2589 2003-12-11 23.877551 115.2 11 days\n", "2590 2003-12-16 42.244898 104.4 11 days\n", "2591 2003-12-18 34.897959 172.8 12 days\n", "2592 2003-12-21 1.836735 104.4 10 days\n", "2593 2003-12-20 38.571429 100.8 11 days\n", "\n", "[2594 rows x 4 columns]" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "coords = climtas.event.event_coords(da, events)\n", "\n", "coords" ] }, { "cell_type": "markdown", "id": "297b852e-4d75-4505-8fb3-cca70e7f0eb7", "metadata": {}, "source": [ "To get statistics for each event use [climtas.event.map_events](api/event.rst#climtas.event.map_events). This takes a function that is given the event's data and returns a dict of different statistics. It's helpful to use `.values` here to return a number instead of a DataArray with coordinates and attributes." ] }, { "cell_type": "code", "execution_count": 6, "id": "6b91ec41-86d7-4cc3-bab4-22e90afdb14f", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
                    sum                mean
0     7.720646191765621  0.7720646191765621
1     7.545168908782278  0.6859244462529344
2     8.776404953284827  0.7978549957531661
3     8.820629283522708  0.8018753894111552
4     7.350620818505183  0.7350620818505182
...                 ...                 ...
2589  8.710790884253576  0.7918900803866887
2590   6.73266081786626  0.6120600743514782
2591  8.868592214995974  0.7390493512496645
2592  7.790211594277864  0.7790211594277864
2593  8.669867314918553  0.7881697559016867
\n", "

2594 rows × 2 columns

\n", "
" ], "text/plain": [ " sum mean\n", "0 7.720646191765621 0.7720646191765621\n", "1 7.545168908782278 0.6859244462529344\n", "2 8.776404953284827 0.7978549957531661\n", "3 8.820629283522708 0.8018753894111552\n", "4 7.350620818505183 0.7350620818505182\n", "... ... ...\n", "2589 8.710790884253576 0.7918900803866887\n", "2590 6.73266081786626 0.6120600743514782\n", "2591 8.868592214995974 0.7390493512496645\n", "2592 7.790211594277864 0.7790211594277864\n", "2593 8.669867314918553 0.7881697559016867\n", "\n", "[2594 rows x 2 columns]" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "stats = climtas.event.map_events(da, events, lambda x: {'sum': x.sum().values, 'mean': x.mean().values})\n", "\n", "stats" ] }, { "cell_type": "markdown", "id": "dfae98e3-96a0-409b-b2c7-0a7762a50eb4", "metadata": {}, "source": [ "Again this is a Pandas dataframe, so you can join the different tables really simply" ] }, { "cell_type": "code", "execution_count": 7, "id": "1d07ec7b-a291-48ae-9ed4-7e3276302993", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
            time         lat     lon  event_duration                sum                mean
0     2001-01-03   -5.510204  -136.8         10 days  7.720646191765621  0.7720646191765621
1     2001-01-04  -86.326531  -136.8         11 days  7.545168908782278  0.6859244462529344
2     2001-01-04  -56.938776  -104.4         11 days  8.776404953284827  0.7978549957531661
3     2001-01-04  -16.530612  -158.4         11 days  8.820629283522708  0.8018753894111552
4     2001-01-07  -90.000000  -158.4         10 days  7.350620818505183  0.7350620818505182
...          ...         ...     ...             ...                ...                 ...
2589  2003-12-11   23.877551   115.2         11 days  8.710790884253576  0.7918900803866887
2590  2003-12-16   42.244898   104.4         11 days   6.73266081786626  0.6120600743514782
2591  2003-12-18   34.897959   172.8         12 days  8.868592214995974  0.7390493512496645
2592  2003-12-21    1.836735   104.4         10 days  7.790211594277864  0.7790211594277864
2593  2003-12-20   38.571429   100.8         11 days  8.669867314918553  0.7881697559016867
\n", "

2594 rows × 6 columns

\n", "
" ], "text/plain": [ " time lat lon event_duration sum \\\n", "0 2001-01-03 -5.510204 -136.8 10 days 7.720646191765621 \n", "1 2001-01-04 -86.326531 -136.8 11 days 7.545168908782278 \n", "2 2001-01-04 -56.938776 -104.4 11 days 8.776404953284827 \n", "3 2001-01-04 -16.530612 -158.4 11 days 8.820629283522708 \n", "4 2001-01-07 -90.000000 -158.4 10 days 7.350620818505183 \n", "... ... ... ... ... ... \n", "2589 2003-12-11 23.877551 115.2 11 days 8.710790884253576 \n", "2590 2003-12-16 42.244898 104.4 11 days 6.73266081786626 \n", "2591 2003-12-18 34.897959 172.8 12 days 8.868592214995974 \n", "2592 2003-12-21 1.836735 104.4 10 days 7.790211594277864 \n", "2593 2003-12-20 38.571429 100.8 11 days 8.669867314918553 \n", "\n", " mean \n", "0 0.7720646191765621 \n", "1 0.6859244462529344 \n", "2 0.7978549957531661 \n", "3 0.8018753894111552 \n", "4 0.7350620818505182 \n", "... ... \n", "2589 0.7918900803866887 \n", "2590 0.6120600743514782 \n", "2591 0.7390493512496645 \n", "2592 0.7790211594277864 \n", "2593 0.7881697559016867 \n", "\n", "[2594 rows x 6 columns]" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "coords.join(stats)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.6" } }, "nbformat": 4, "nbformat_minor": 5 }