Interactive plotting with HoloViews
Context#
We will use HoloViews, a tool that is part of the HoloViz ecosystem, together with Xarray to visualize the Vegetation Condition Index (VCI) [Kog95], a well-established indicator for detecting drought from remote sensing data.
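For reference, the VCI is commonly defined per pixel as 100 × (NDVI − NDVI_min) / (NDVI_max − NDVI_min), where the minimum and maximum are normally taken over a multi-year record for the same period of the year. The sketch below applies that formula with Xarray to a tiny synthetic array. The values, the variable names, and the use of the extremes of the sample itself (instead of a long-term record) are assumptions made purely for illustration.

```python
import numpy as np
import xarray as xr

# Synthetic NDVI cube (time, lat, lon); values are made up for illustration.
ndvi = xr.DataArray(
    np.array([
        [[0.2, 0.4], [0.6, 0.8]],
        [[0.4, 0.4], [0.3, 0.6]],
        [[0.6, 0.2], [0.9, 0.7]],
    ]),
    dims=("time", "lat", "lon"),
    name="NDVI",
)

# Per-pixel extremes. ASSUMPTION: computed over the sample itself,
# whereas the VCI normally relies on a multi-year record.
ndvi_min = ndvi.min("time")
ndvi_max = ndvi.max("time")

# VCI in percent: 0 = lowest observed NDVI, 100 = highest observed NDVI.
vci = 100 * (ndvi - ndvi_min) / (ndvi_max - ndvi_min)
vci.name = "VCI"

print(vci.isel(time=0).values)
```

Pixels whose NDVI never changes would produce a division by zero (NaN) here; a real workflow would need to decide how to handle them.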
Data#
We will use data that have been generated in the previous episode.
If the dataset is not present in the same folder as this Jupyter notebook, it will be downloaded from Zenodo using pooch, a handy Python library for downloading and caching data files locally (see the pooch documentation for further information).
import pooch
cgls_file = pooch.retrieve(
    url="https://zenodo.org/record/6969999/files/C_GLS_NDVI_20220101_20220701_Lombardia_S3_2_masked.nc",
    known_hash="md5:be3f16913ebbdb4e7af227f971007b22",
    path=".",
)
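The `known_hash` argument lets pooch check that the downloaded file matches the expected checksum. The sketch below illustrates the idea with Python's standard `hashlib` module. It is not pooch's own implementation; the helper name `matches_known_hash` and the temporary demo file are hypothetical.

```python
import hashlib
import os
import tempfile


def matches_known_hash(path, known_hash, chunk_size=65536):
    """Return True if the file at `path` matches an 'algorithm:hexdigest' string."""
    algorithm, expected = known_hash.split(":", 1)
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected.lower()


# Demo on a small temporary file (a stand-in for the downloaded dataset).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    demo_path = tmp.name

demo_hash = "md5:" + hashlib.md5(b"hello").hexdigest()
print(matches_known_hash(demo_path, demo_hash))          # True
print(matches_known_hash(demo_path, "md5:" + "0" * 32))  # False
os.remove(demo_path)
```

A mismatch usually means the download was truncated or the remote file changed, which is why pinning the hash makes the notebook more reproducible.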
Setup#
This episode uses the following main Python packages:
pooch [USR+20]
rioxarray [SBR+22]
matplotlib [Hun07]
cartopy [MetOffice15]
hvplot [RSB+20]
geopandas [JdBF+20]
Please install these packages if not already available in your Python environment.
Packages#
In this episode, Python packages are imported when we start to use them. However, as a best practice, we recommend installing and importing all the necessary libraries at the top of your Jupyter notebook.
import xarray as xr
Open local dataset#
cgls_ds = xr.open_dataset(cgls_file)
Tip
If the previous command raises an error, check that the input file some_hash-C_GLS_NDVI_20220101_20220701_Lombardia_S3_2_masked.nc has been downloaded (see the previous episode) and that it is in the same directory as your Jupyter notebook.
cgls_ds
<xarray.Dataset>
Dimensions:  (time: 20, lon: 984, lat: 612)
Coordinates:
  * time     (time) datetime64[ns] 2022-01-01 2022-01-11 ... 2022-07-11
  * lon      (lon) float64 8.502 8.505 8.508 8.511 ... 11.42 11.42 11.42 11.43
  * lat      (lat) float64 46.5 46.5 46.49 46.49 ... 44.69 44.69 44.68 44.68
Data variables:
    NDVI     (time, lat, lon) float64 ...
Clipping data according to a polygon#
One of the basic operations in GIS is clipping data with a vector geometry. Xarray cannot handle vector data directly, but Rioxarray makes this easy: it extends Xarray with most of the features that Rasterio (GDAL) provides.
Read a shapefile with the Area Of Interest (AOI)#
import geopandas as gpd
We define the area of interest using the Global Administrative Unit Layers GAUL G2015_2014 provided by FAO-UN (see Documentation).
GeoPandas, a Python library that extends Pandas with geometry types and spatial operations, will help us manage the geodata.
FAO officially distributes the data through a WFS service; the cell below shows how to retrieve them:
GAUL = gpd.read_file('https://data.apps.fao.org/map/gsrv/gsrv1/gaul/wfs?'
'service=WFS&version=2.0.0&'
'Request=GetFeature&'
'TypeNames=gaul:g2015_2014_2&'
'srsName=EPSG%3A4326&'
'maxFeatures=2500&'
'outputFormat=json')
Unfortunately, the service appears to be rather slow. As an alternative, the JRC MARS unit distributes the original dataset in shapefile format. We highly recommend this second approach because it fetches the data faster.
For the training course, we also created a tiny file containing information about Italy only.
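try:
    # Prefer the small local file prepared for the training course
    GAUL = gpd.read_file('Italy.geojson')
except Exception:
    # Fall back to the shapefile archive distributed by the JRC MARS unit
    GAUL = gpd.read_file('zip+https://mars.jrc.ec.europa.eu/asap/files/gaul1_asap.zip')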
Data are organized in a tabular structure: each row has an index, attribute data (the columns), and a geometry.
Geometries are stored as shapely geometry objects, which come in three basic families:
Points and Multi-Points
Lines and Multi-Lines
Polygons and Multi-Polygons
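As a minimal sketch of these geometry families, the snippet below builds one object of each kind directly with shapely. All coordinates are arbitrary values chosen for illustration and are not taken from the GAUL dataset.

```python
from shapely.geometry import LineString, MultiPoint, Point, Polygon

# Coordinates are arbitrary (x, y) pairs chosen for illustration.
point = Point(9.19, 45.46)
multi_point = MultiPoint([(9.19, 45.46), (10.22, 45.54)])
line = LineString([(0, 0), (3, 4)])
square = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])

for geom in (point, multi_point, line, square):
    print(geom.geom_type)

print(line.length)                       # 5.0
print(square.area)                       # 1.0
print(square.contains(Point(0.5, 0.5)))  # True
```

Each entry of a GeoDataFrame's geometry column is an object of this kind, so the same attributes and predicates can be used on individual rows.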
GAUL.head(5)
| | asap1_id | name1 | name1_shr | name0 | asap0_id | name0_shr | km2_tot | km2_crop | km2_range | an_crop | an_range | water_lim | geometry |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1172 | Friuli-venezia Giulia | Friuli | Italy | 172 | Italy | 7722 | 2995 | 1111 | 1 | 1 | 0 | POLYGON ((12.79517 46.64600, 12.80945 46.63482... |
| 1 | 1202 | Calabria | Calabria | Italy | 172 | Italy | 15203 | 7152 | 2127 | 1 | 1 | 1 | POLYGON ((16.42261 40.14416, 16.43624 40.12897... |
| 2 | 1204 | Campania | Campania | Italy | 172 | Italy | 13618 | 7376 | 2378 | 1 | 1 | 1 | POLYGON ((14.16099 41.48058, 14.16653 41.48039... |
| 3 | 1207 | Abruzzi | Abruzzi | Italy | 172 | Italy | 10797 | 4699 | 3300 | 1 | 1 | 1 | POLYGON ((13.91523 42.89441, 13.92206 42.85505... |
| 4 | 1211 | Emilia-romagna | Emilia-romagna | Italy | 172 | Italy | 22204 | 14958 | 771 | 1 | 1 | 1 | POLYGON ((12.49151 43.91780, 12.48442 43.90761... |
In the cell below, we select the polygon geometry whose name1 field equals Lombardia.
AOI_name = 'Lombardia'
AOI = GAUL[GAUL.name1 == AOI_name]
AOI_poly = AOI.geometry
AOI_poly
14 POLYGON ((10.23973 46.62177, 10.25084 46.61110...
Name: geometry, dtype: geometry
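The selection above relies on boolean indexing, which GeoPandas inherits from Pandas. The sketch below shows the mechanism on a tiny stand-in table; only the `name1` column mirrors the GAUL layer, while the `code` column and its values are made up.

```python
import pandas as pd

# Tiny stand-in table; the `code` column is hypothetical.
regions = pd.DataFrame(
    {"name1": ["Calabria", "Campania", "Lombardia"], "code": [1, 2, 3]}
)

mask = regions["name1"] == "Lombardia"  # boolean Series, one value per row
selected = regions[mask]                # keeps only the rows where mask is True

print(mask.tolist())            # [False, False, True]
print(selected.index.tolist())  # [2]
```

The selected row keeps its original index label, which is why the Lombardia polygon is still listed under index 14 in the output above.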
In a second step, we assign the EPSG:4326 geodetic coordinate reference system to the dataset. To achieve this we rely on rioxarray, which extends xarray with rasterio capabilities. The rio accessor becomes available once rioxarray has been imported.
import rioxarray  # registers the .rio accessor on xarray objects
cgls_ds.rio.write_crs(4326, inplace=True)
<xarray.Dataset>
Dimensions:      (time: 20, lon: 984, lat: 612)
Coordinates:
  * time         (time) datetime64[ns] 2022-01-01 2022-01-11 ... 2022-07-11
  * lon          (lon) float64 8.502 8.505 8.508 8.511 ... 11.42 11.42 11.43
  * lat          (lat) float64 46.5 46.5 46.49 46.49 ... 44.69 44.69 44.68 44.68
    spatial_ref  int64 0
Data variables:
    NDVI         (time, lat, lon) float64 ...
Once this is done, we can clip the data with the polygon obtained through geopandas earlier in the notebook.
NDVI_AOI = cgls_ds.NDVI.rio.clip(AOI_poly, crs=4326)
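To make the effect of clipping concrete, the sketch below reproduces the idea with plain NumPy: grid cells whose centre falls outside a polygon are set to NaN. This is a conceptual illustration only and not how rioxarray works internally (it delegates the rasterization to rasterio/GDAL). The grid, the triangle, and the helper `point_in_polygon` are all hypothetical.

```python
import numpy as np


def point_in_polygon(x, y, vertices):
    """Ray-casting test: True if (x, y) lies strictly inside the polygon."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Does a horizontal ray starting at (x, y) cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical 4 x 4 grid of cell centres and a triangular area of interest.
lon = np.array([0.5, 1.5, 2.5, 3.5])
lat = np.array([3.5, 2.5, 1.5, 0.5])
values = np.arange(16, dtype=float).reshape(4, 4)
triangle = [(0.0, 0.0), (3.6, 0.0), (0.0, 3.6)]

mask = np.array(
    [[point_in_polygon(x, y, triangle) for x in lon] for y in lat]
)
clipped = np.where(mask, values, np.nan)  # cells outside the triangle become NaN
print(clipped)
```

In the clipped NDVI array the same thing happens at a much finer resolution: pixels outside the Lombardia polygon no longer carry valid values, which is why they do not appear in the plots that follow.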
Visualize with matplotlib#
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
fig = plt.figure(1, figsize=[20, 10])
# We're using cartopy and are plotting in PlateCarree projection
# (see documentation on cartopy)
ax = plt.subplot(1, 1, 1, projection=ccrs.PlateCarree())
#ax.set_extent([15.5, 27.5, 36, 41], crs=ccrs.PlateCarree()) # lon1 lon2 lat1 lat2
ax.coastlines(resolution='10m')
ax.gridlines(draw_labels=True)
NDVI_AOI.sel(time='2022-06-01').plot(ax=ax, transform=ccrs.PlateCarree(), cmap="RdYlGn")
# One way to customize your title
plt.title("S3 NDVI over Lombardia", fontsize=18)
Text(0.5, 1.0, 'S3 NDVI over Lombardia')
Visualization with HoloViews#
import holoviews as hv
import hvplot.xarray
import hvplot.pandas
We can now plot the data with the HoloViews back-end through hvplot, which acts as a high-level plotting API.
NDVI_AOI.isel(time=0).hvplot(cmap="RdYlGn", width=1000, height=1000)
Looking at the distribution of the values can reveal a lot about the data.
NDVI_AOI[0].hvplot.hist(cmap="RdYlGn",bins=25, width=800, height=700)
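As a rough illustration of what a 25-bin histogram computes, the sketch below bins a synthetic NDVI-like array with NumPy. The random values and the block of NaN rows (standing in for pixels outside the area of interest) are assumptions; the sketch excludes NaNs explicitly before counting.

```python
import numpy as np

# Synthetic NDVI-like values with some masked (NaN) pixels.
rng = np.random.default_rng(seed=0)
ndvi = rng.uniform(-0.1, 0.9, size=(50, 60))
ndvi[:5, :] = np.nan  # pretend the first rows lie outside the area of interest

valid = ndvi[~np.isnan(ndvi)]  # drop NaNs before binning
counts, edges = np.histogram(valid, bins=25)

print(counts.sum() == valid.size)  # every valid pixel falls in exactly one bin
print(len(edges))                  # 26 edges delimit 25 bins
```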
Multi-plots using groupby#
To visualize all the available time steps interactively, we can group the data by time with the groupby argument.
NDVI_AOI.hvplot(groupby='time', cmap="RdYlGn", width=800, height=700)
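Conceptually, grouping by time means that each time value gets its own two-dimensional (lat, lon) frame, and the widget switches between those frames. The sketch below mimics this with Xarray alone on a small synthetic array; it is an illustration of the idea rather than of hvplot's internals, and the array values and the `frames` dictionary are hypothetical.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for NDVI_AOI with three time steps.
times = pd.to_datetime(["2022-01-01", "2022-01-11", "2022-01-21"])
ndvi = xr.DataArray(
    np.arange(3 * 2 * 2, dtype=float).reshape(3, 2, 2),
    coords={"time": times, "lat": [46.0, 45.0], "lon": [9.0, 10.0]},
    dims=("time", "lat", "lon"),
    name="NDVI",
)

# One 2D (lat, lon) frame per time value, keyed by its date string.
frames = {str(t)[:10]: ndvi.sel(time=t) for t in ndvi["time"].values}

for label, frame in frames.items():
    print(label, frame.dims, float(frame.mean()))
```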
We can add a histogram to the visualization.
NDVI_AOI.hvplot(groupby='time', cmap='RdYlGn', width=800, height=700).hist()