Using xarray to read EBAS data

See more at http://ebas.nilu.no/ and https://ebas.nilu.no/thredds/

Requirements: pip install threddsclient

The EBAS database collects observational data on atmospheric chemical composition and physical properties from a variety of national and international research projects and monitoring programs, such as ACTRIS, AMAP, EMEP, GAW and HELCOM, as well as from the Norwegian monitoring programs funded by the Norwegian Environment Agency, the Ministry of Climate and Environment and NILU – Norwegian Institute for Air Research.

See all files available: https://thredds.nilu.no/thredds/catalog/ebas/catalog.html

# threddsclient must be installed first.
# In a terminal, run: pip install threddsclient
import threddsclient
import xarray as xr
# Find the OPeNDAP URLs of all files on EBAS

all_opendap_urls = threddsclient.opendap_urls(
    'https://thredds.nilu.no/thredds/catalog/ebas/catalog.xml')
# Example 1: nephelometer scattering coefficient data
# get all data urls for one station, e.g., Zeppelin NO0042G
opendap_urls = [x for x in all_opendap_urls if 'NO0042G' in x]
# keep only the nephelometer (scattering) data urls
opendap_urls = [x for x in opendap_urls if 'nephelometer' in x]

opendap_urls
['https://thredds.nilu.no/thredds/dodsC/ebas/NO0042G.20100101000000.20150216111241.nephelometer..pm10.4y.1h.SE02L_TSI_3563_ZEP_dry.SE02L_scat_coef.lev2.nc',
 'https://thredds.nilu.no/thredds/dodsC/ebas/NO0042G.20080708135939.20181213000000.nephelometer..aerosol_humidified.3mo.6h.CH02L_TSI_3563_ZEP_ref+TSI_3563_ZEP_wet.CH02L_hygro_tandem_neph_CorrData.lev2.nc']
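The file names in the list above are dot-separated, so they can be split into fields before deciding which files to open. The sketch below does this with the standard library only. The function name `parse_ebas_name` and the field labels in `FIELDS` are assumptions read off the example URLs, not an official EBAS schema, so treat the labels as a guess at the meaning of each position.

```python
from typing import Dict

# Field labels are assumptions inferred from the example URLs,
# not an official description of the EBAS file naming scheme.
FIELDS = (
    "station", "start", "revision", "instrument_type", "component",
    "matrix", "period", "resolution", "instrument", "method", "level",
)


def parse_ebas_name(url: str) -> Dict[str, str]:
    """Split the file name at the end of an EBAS OPeNDAP URL into labelled fields."""
    name = url.rsplit("/", 1)[-1]      # keep only the file name
    if name.endswith(".nc"):
        name = name[:-3]               # drop the extension
    parts = name.split(".")            # empty fields stay as empty strings
    if len(parts) != len(FIELDS):
        raise ValueError(f"unexpected number of fields in {name!r}")
    return dict(zip(FIELDS, parts))


if __name__ == "__main__":
    example = (
        "https://thredds.nilu.no/thredds/dodsC/ebas/"
        "NO0042G.20100101000000.20150216111241.nephelometer..pm10.4y.1h."
        "SE02L_TSI_3563_ZEP_dry.SE02L_scat_coef.lev2.nc"
    )
    for key, value in parse_ebas_name(example).items():
        print(f"{key:16s} {value!r}")
```

Filtering on a parsed field (for example `info["instrument_type"] == "nephelometer"`) is stricter than a substring test on the whole URL, which can also match text in other parts of the name.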
# Read multiple files at once.
# The problem with this example is that the files come from
# different instruments, so they need to be processed one by one.

dsmf = xr.open_mfdataset(opendap_urls)
dsmf
<xarray.Dataset>
Dimensions:                                                         (
                                                                     time: 35451,
                                                                     metadata_time: 5,
                                                                     Wavelength: 3,
                                                                     RH_base: 1,
                                                                     RH_humidified: 1,
                                                                     ...
                                                                     aerosol_light_scattering_coefficient_prec1587_qc_flags: 2,
                                                                     aerosol_light_scattering_coefficient_perc8413_qc_flags: 2,
                                                                     pressure_qc_flags: 2,
                                                                     aerosol_light_backscattering_coefficient_prec1587_qc_flags: 2,
                                                                     aerosol_light_backscattering_coefficient_amean_qc_flags: 2,
                                                                     aerosol_light_backscattering_coefficient_perc8413_qc_flags: 2)
Coordinates:
  * time                                                            (time) datetime64[ns] ...
  * metadata_time                                                   (metadata_time) datetime64[ns] ...
  * Wavelength                                                      (Wavelength) float64 ...
  * RH_base                                                         (RH_base) float64 ...
  * RH_humidified                                                   (RH_humidified) float64 ...
  * RH_base_max                                                     (RH_base_max) float64 ...
  * RH                                                              (RH) float64 ...
  * Location                                                        (Location) |S64 ...
Dimensions without coordinates: tbnds,
                                aerosol_light_scattering_enhancement_factor_qc_flags,
                                aerosol_light_backscattering_coefficient_qc_flags,
                                aerosol_light_backscattering_enhancement_factor_qc_flags,
                                aerosol_light_scattering_coefficient_qc_flags,
                                temperature_qc_flags,
                                ...
                                aerosol_light_scattering_coefficient_prec1587_qc_flags,
                                aerosol_light_scattering_coefficient_perc8413_qc_flags,
                                pressure_qc_flags,
                                aerosol_light_backscattering_coefficient_prec1587_qc_flags,
                                aerosol_light_backscattering_coefficient_amean_qc_flags,
                                aerosol_light_backscattering_coefficient_perc8413_qc_flags
Data variables: (12/41)
    time_bnds                                                       (time, tbnds) datetime64[ns] dask.array<chunksize=(35451, 2), meta=np.ndarray>
    metadata_time_bnds                                              (metadata_time, tbnds) datetime64[ns] dask.array<chunksize=(5, 2), meta=np.ndarray>
    aerosol_light_scattering_enhancement_factor_qc                  (Wavelength, RH_base, RH_humidified, aerosol_light_scattering_enhancement_factor_qc_flags, time) float64 dask.array<chunksize=(3, 1, 1, 1, 35451), meta=np.ndarray>
    aerosol_light_backscattering_enhancement_factor_ebasmetadata    (Wavelength, RH_base_max, RH_humidified, metadata_time) object dask.array<chunksize=(3, 1, 1, 5), meta=np.ndarray>
    aerosol_light_scattering_enhancement_factor_ebasmetadata        (Wavelength, RH_base, RH_humidified, metadata_time) object dask.array<chunksize=(3, 1, 1, 5), meta=np.ndarray>
    aerosol_light_backscattering_coefficient_qc                     (Wavelength, RH, aerosol_light_backscattering_coefficient_qc_flags, time) float64 dask.array<chunksize=(3, 13, 1, 35451), meta=np.ndarray>
    ...                                                              ...
    aerosol_light_scattering_coefficient_prec1587                   (Wavelength, time) float64 dask.array<chunksize=(3, 35451), meta=np.ndarray>
    pressure                                                        (Location, time) float64 dask.array<chunksize=(1, 35451), meta=np.ndarray>
    aerosol_light_backscattering_coefficient_amean                  (Wavelength, time) float64 dask.array<chunksize=(3, 35451), meta=np.ndarray>
    aerosol_light_scattering_coefficient_perc8413                   (Wavelength, time) float64 dask.array<chunksize=(3, 35451), meta=np.ndarray>
    relative_humidity                                               (Location, time) float64 dask.array<chunksize=(1, 35451), meta=np.ndarray>
    aerosol_light_backscattering_coefficient_prec1587               (Wavelength, time) float64 dask.array<chunksize=(3, 35451), meta=np.ndarray>
Attributes: (12/47)
    Conventions:                   CF-1.7, ACDD-1.3
    featureType:                   timeSeries
    title:                         Ground based in situ observations of nephe...
    keywords:                      NO0042G, Zeppelin mountain (Ny-Ålesund), a...
    id:                            NO0042G.20080708135939.20181213000000.neph...
    naming_authority:              EBAS
    ...                            ...
    geospatial_lat_units:          degrees_north
    geospatial_lon_units:          degrees_east
    comment:                       {\n    "Data definition": "EBAS_1.1", \n  ...
    standard_name_vocabulary:      CF-1.7, ACDD-1.3
    history:                       None
    creator_url:                   ebas.nilu.no
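Since files from different instruments need to be handled one by one, an alternative to `xr.open_mfdataset` is to open each URL separately and keep the results in a dictionary. The following is a minimal sketch of that idea. The helper name `open_each` and its `opener` parameter are hypothetical. By default it uses `xr.open_dataset`, and the opener can be swapped for a stand-in when no network access is available.

```python
from typing import Any, Callable, Dict, Iterable, Optional


def open_each(urls: Iterable[str],
              opener: Optional[Callable[[str], Any]] = None) -> Dict[str, Any]:
    """Open every URL on its own and return {file name: dataset}.

    `opener` is a hypothetical hook; when omitted, xarray.open_dataset is used.
    """
    if opener is None:
        import xarray as xr  # imported lazily so the helper itself has no hard dependency
        opener = xr.open_dataset
    datasets = {}
    for url in urls:
        name = url.rsplit("/", 1)[-1]  # file name is used as the dictionary key
        datasets[name] = opener(url)
    return datasets


if __name__ == "__main__":
    # Demonstration with a stand-in opener, so nothing is downloaded here.
    demo = open_each(
        ["https://example.invalid/a.nc", "https://example.invalid/b.nc"],
        opener=lambda url: f"<dataset from {url}>",
    )
    for name, ds in demo.items():
        print(name, ds)
```

With the real opener, `open_each(opendap_urls)` would give one dataset per file, each of which can be inspected and cleaned separately.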
# Example 2: aerosol sulphate data
# Here, too, the files are shaped differently for different periods.
# It is best to read single files, analyse them, and then concatenate
# the results for longer time series.
# Note that catalogue file names do not fully describe what is in a file.

# get all data urls for one station, e.g., Zeppelin NO0042G
opendap_urls = [x for x in all_opendap_urls if 'NO0042G' in x]

# get all data urls which could contain sulphate data, exclude some
opendap_urls = [x for x in opendap_urls if 'filter_3pack' in x]
opendap_urls = [x for x in opendap_urls if 'sum_ammonia' not in x]
opendap_urls = [x for x in opendap_urls if 'sulphur_dioxide' not in x]

opendap_urls
['https://thredds.nilu.no/thredds/dodsC/ebas/NO0042G.20180101070000.20220405123416.filter_3pack...4y.1d.NO01L_f3p_d_0042.NO01L_IC.lev2.nc',
 'https://thredds.nilu.no/thredds/dodsC/ebas/NO0042G.20120101070000.20210421112338.filter_3pack...6y.1d.NO01L_f3p_d_0042.NO01L_IC.lev2.nc',
 'https://thredds.nilu.no/thredds/dodsC/ebas/NO0042G.20110101070000.20210421112338.filter_3pack.sulphate_corrected.aerosol.1y.1d.NO01L_f3p_d_0042.NO01L_IC.lev2.nc',
 'https://thredds.nilu.no/thredds/dodsC/ebas/NO0042G.20110101070000.20210420142507.filter_3pack...1y.1d.NO01L_f3p_d_0042.NO01L_IC.lev2.nc',
 'https://thredds.nilu.no/thredds/dodsC/ebas/NO0042G.20010101070000.20210420142507.filter_3pack...10y.1d.NO01L_f3p_d_0042.NO01L_IC.lev2.nc',
 'https://thredds.nilu.no/thredds/dodsC/ebas/NO0042G.19930101070000.20210421112338.filter_3pack..aerosol.18y.1d.NO01L_f3p_d_0042.NO01L_IC.lev2.nc',
 'https://thredds.nilu.no/thredds/dodsC/ebas/NO0042G.19930101070000.20210420142507.filter_3pack...8y.1d.NO01L_f3p_d_0042.NO01L_IC.lev2.nc',
 'https://thredds.nilu.no/thredds/dodsC/ebas/NO0042G.19891101070000.20210420142507.filter_3pack.sulphate_total.aerosol.3y.1d.NO01L_f3p_d_0042.NO01L_IC.lev2.nc']
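The order of this list depends on the catalogue, so a positional index such as `opendap_urls[5]` may point to a different file if the catalogue changes. One option is to sort the URLs by the start time embedded in the file name. The sketch below assumes that the second dot-separated field is the start time in `YYYYMMDDhhmmss` form, which matches the names shown above. The helper names `start_time` and `sort_by_start` are hypothetical.

```python
from datetime import datetime
from typing import Iterable, List


def start_time(url: str) -> datetime:
    """Parse the start time from an EBAS file name.

    Assumes the second dot-separated field is YYYYMMDDhhmmss.
    """
    name = url.rsplit("/", 1)[-1]
    return datetime.strptime(name.split(".")[1], "%Y%m%d%H%M%S")


def sort_by_start(urls: Iterable[str]) -> List[str]:
    """Return the URLs ordered from earliest to latest start time."""
    return sorted(urls, key=start_time)


if __name__ == "__main__":
    base = "https://thredds.nilu.no/thredds/dodsC/ebas/"
    urls = [
        base + "NO0042G.20180101070000.20220405123416.filter_3pack...4y.1d.NO01L_f3p_d_0042.NO01L_IC.lev2.nc",
        base + "NO0042G.19930101070000.20210421112338.filter_3pack..aerosol.18y.1d.NO01L_f3p_d_0042.NO01L_IC.lev2.nc",
        base + "NO0042G.19891101070000.20210420142507.filter_3pack.sulphate_total.aerosol.3y.1d.NO01L_f3p_d_0042.NO01L_IC.lev2.nc",
    ]
    for url in sort_by_start(urls):
        print(start_time(url).date(), url.rsplit("/", 1)[-1])
```

Several files can share the same start time, so sorting narrows the choice but does not replace a check of what each file actually contains.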
file_to_read = opendap_urls[5]
file_to_read
'https://thredds.nilu.no/thredds/dodsC/ebas/NO0042G.19930101070000.20210421112338.filter_3pack..aerosol.18y.1d.NO01L_f3p_d_0042.NO01L_IC.lev2.nc'
ds_single_file = xr.open_dataset(file_to_read)
ds_single_file
<xarray.Dataset>
Dimensions:                                      (time: 6574, tbnds: 2,
                                                  metadata_time: 18,
                                                  nitrate_ug_per_m3_qc_flags: 2,
                                                  chloride_qc_flags: 1,
                                                  sulphate_corrected_ug_per_m3_qc_flags: 1,
                                                  sulphate_total_ug_S_per_m3_qc_flags: 1,
                                                  ...
                                                  nitrate_ug_N_per_m3_qc_flags: 2,
                                                  sulphate_corrected_ug_S_per_m3_qc_flags: 1,
                                                  ammonium_ug_per_m3_qc_flags: 2,
                                                  potassium_qc_flags: 1,
                                                  calcium_qc_flags: 1,
                                                  magnesium_qc_flags: 1)
Coordinates:
  * time                                         (time) datetime64[ns] 1993-0...
  * metadata_time                                (metadata_time) datetime64[ns] ...
Dimensions without coordinates: tbnds, nitrate_ug_per_m3_qc_flags,
                                chloride_qc_flags,
                                sulphate_corrected_ug_per_m3_qc_flags,
                                sulphate_total_ug_S_per_m3_qc_flags,
                                sodium_qc_flags, ammonium_ug_N_per_m3_qc_flags,
                                sulphate_total_ug_per_m3_qc_flags,
                                nitrate_ug_N_per_m3_qc_flags,
                                sulphate_corrected_ug_S_per_m3_qc_flags,
                                ammonium_ug_per_m3_qc_flags,
                                potassium_qc_flags, calcium_qc_flags,
                                magnesium_qc_flags
Data variables: (12/41)
    time_bnds                                    (time, tbnds) datetime64[ns] ...
    metadata_time_bnds                           (metadata_time, tbnds) datetime64[ns] ...
    nitrate_ug_per_m3_qc                         (nitrate_ug_per_m3_qc_flags, time) float64 ...
    chloride_qc                                  (chloride_qc_flags, time) float64 ...
    ammonium_ug_per_m3                           (time) float64 0.0 0.0 ... 0.4
    sulphate_corrected_ug_S_per_m3               (time) float64 0.216 ... 0.25
    ...                                           ...
    ammonium_ug_per_m3_qc                        (ammonium_ug_per_m3_qc_flags, time) float64 ...
    calcium                                      (time) float64 0.05 ... 0.03
    magnesium                                    (time) float64 0.09 ... 0.02
    potassium_qc                                 (potassium_qc_flags, time) float64 ...
    calcium_qc                                   (calcium_qc_flags, time) float64 ...
    magnesium_qc                                 (magnesium_qc_flags, time) float64 ...
Attributes: (12/92)
    Conventions:                       CF-1.8, ACDD-1.3
    featureType:                       timeSeries
    title:                             Ground based in situ observations of f...
    keywords:                          NO0042G, mass_concentration_of_chlorid...
    id:                                NO0042G.19930101070000.20210421112338....
    naming_authority:                  EBAS
    ...                                ...
    geospatial_lat_units:              degrees_north
    geospatial_lon_units:              degrees_east
    comment:                           {\n    "Data definition": "EBAS_1.1", ...
    standard_name_vocabulary:          CF-1.7, ACDD-1.3
    history:                           None
    creator_url:                       ebas.nilu.no
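Example 2 recommends reading single files, analysing them, and then concatenating for longer time series. A minimal sketch of the concatenation step is shown below. It assumes each file has already been opened as its own dataset, that the variable of interest has only a `time` dimension, and that keeping one entry per duplicated timestamp is acceptable. The helper name `concat_variable` is hypothetical, and the demonstration uses small synthetic datasets rather than EBAS files.

```python
import numpy as np
import pandas as pd
import xarray as xr


def concat_variable(datasets, name):
    """Join one variable from several datasets into a single time series."""
    # Skip datasets that do not contain the variable at all.
    pieces = [ds[name] for ds in datasets if name in ds]
    if not pieces:
        raise KeyError(f"no dataset contains {name!r}")
    combined = xr.concat(pieces, dim="time").sortby("time")
    # Keep one entry per timestamp where the files overlap.
    duplicated = combined.get_index("time").duplicated(keep="first")
    return combined.isel(time=np.flatnonzero(~duplicated))


if __name__ == "__main__":
    # Synthetic stand-ins for two files that overlap by one day.
    t1 = pd.date_range("2000-01-01", periods=3, freq="D")
    t2 = pd.date_range("2000-01-03", periods=3, freq="D")
    ds1 = xr.Dataset({"sulphate": ("time", [1.0, 2.0, 3.0])}, coords={"time": t1})
    ds2 = xr.Dataset({"sulphate": ("time", [3.0, 4.0, 5.0])}, coords={"time": t2})
    print(concat_variable([ds2, ds1], "sulphate"))
```

With real data the call might look like `concat_variable(list_of_datasets, "sulphate_corrected_ug_S_per_m3")`, bearing in mind that not every file necessarily contains that variable and that quality flags should be checked before the series are merged.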