Save zarr files to EOSC (CESNET) using s3fs#

import s3fs
import xarray as xr
import zarr

Get a sample file#

ds = xr.tutorial.open_dataset("").rename({"air": "Tair"})

Save your results to Remote object storage#

  • If you have not done so already, create your credentials by following this link

  • Verify your credentials in $HOME/.aws/credentials. It should look like
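For reference, the AWS shared credentials file is an INI file with one section per profile. The sketch below shows that layout and how it could be read with the Python standard library. It assumes the `default` profile; the key values are placeholders (not real credentials), and `read_aws_credentials` is a hypothetical helper name.

```python
import configparser

# Placeholder content in the standard AWS shared-credentials layout.
# The key values are made up for illustration only.
SAMPLE_CREDENTIALS = """\
[default]
aws_access_key_id = EXAMPLEACCESSKEY
aws_secret_access_key = EXAMPLESECRETKEY
"""


def read_aws_credentials(text, profile="default"):
    """Return (access_key, secret_key) for one profile of a credentials file."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser[profile]  # raises KeyError if the profile is missing
    return section["aws_access_key_id"], section["aws_secret_access_key"]


access_key, secret_key = read_aws_credentials(SAMPLE_CREDENTIALS)
```

To check a real file, the same helper could be given the text of `$HOME/.aws/credentials`, for example via `pathlib.Path.home().joinpath(".aws", "credentials").read_text()`.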

It is important to save your results in 'your' bucket. The credential created here gives access to a common space for pangeo-eosc cloud users, so take care not to overwrite other users' data.

Define your s3 storage parameters#

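# `path` must point to your own area of the shared bucket; it is not defined in this section
s3_prefix = "s3://" + path
# Read the keys from the AWS CLI configuration (IPython shell magic; each call returns a list of output lines)
access_key = !aws configure get aws_access_key_id
access_key = access_key[0]
secret_key = !aws configure get aws_secret_access_key
secret_key = secret_key[0]
client_kwargs = {'endpoint_url': ''}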
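The `path` variable used above is not defined in this section. The sketch below shows one way such a prefix might be composed, assuming it combines the shared bucket with a personal sub-folder. The names `user_prefix`, `example-bucket` and `example-user` are hypothetical placeholders, not values from this tutorial.

```python
def user_prefix(bucket, user_name):
    """Build 's3://<bucket>/<user_name>' so each user writes under their own folder."""
    bucket = bucket.strip("/")
    user_name = user_name.strip("/")
    if not bucket or not user_name:
        # Refuse to build a prefix that would point at the shared bucket root.
        raise ValueError("both bucket and user_name must be non-empty")
    return f"s3://{bucket}/{user_name}"


# Hypothetical values, for illustration only.
s3_prefix = user_prefix("example-bucket", "example-user")
uri = f"{s3_prefix}/S3Map_zarr"
```

Rejecting an empty user name keeps a misconfigured notebook from writing directly into the common space.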

Set your zarr store (with S3Map)#

zarr_file_name = 'S3Map_zarr'
s3 = s3fs.S3FileSystem(client_kwargs=client_kwargs, key=access_key, secret=secret_key)
uri = f"{s3_prefix}/{zarr_file_name}"
store_s3 = s3fs.S3Map(root=uri, s3=s3, check=False)

Write your file down to zarr#

%time ds.to_zarr(store=store_s3, mode='w', consolidated=True)
/srv/conda/envs/notebook/lib/python3.9/site-packages/xarray/core/ SerializationWarning: saving variable None with floating point data as an integer dtype without any _FillValue to use for NaNs
  return to_zarr(  # type: ignore
CPU times: user 714 ms, sys: 70.3 ms, total: 784 ms
Wall time: 50 s
<xarray.backends.zarr.ZarrStore at 0x7ff66f37ce40>

Verify that your ‘zarr’ file is at your storage#

target = s3fs.S3FileSystem(anon=False, client_kwargs=client_kwargs)
target.ls(path, detail=True, refresh=True)
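Beyond listing the folder, it can help to confirm that the store holds objects and that the consolidated metadata was written. The sketch below assumes the Zarr v2 layout, where `consolidated=True` produces a `.zmetadata` object at the store root. It works with any fsspec-style filesystem exposing `find(path, detail=True)`; `summarize_store` is a hypothetical helper name.

```python
def summarize_store(fs, root):
    """Summarize the objects found under `root` on an fsspec-style filesystem."""
    # With detail=True, fsspec filesystems return {path: info_dict}.
    listing = fs.find(root, detail=True)
    total_bytes = sum(info.get("size", 0) or 0 for info in listing.values())
    # Assumption: Zarr v2 keeps consolidated metadata in a '.zmetadata' object.
    consolidated = any(p.rstrip("/").endswith("/.zmetadata") for p in listing)
    return {
        "n_objects": len(listing),
        "total_bytes": total_bytes,
        "consolidated": consolidated,
    }
```

With the objects defined in this notebook, it could be called as `summarize_store(target, path + '/S3Map_zarr')`.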

Open your zarr storage#

<xarray.Dataset>
Dimensions:  (time: 2920, lat: 25, lon: 53)
Coordinates:
  * lat      (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0
  * lon      (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0
  * time     (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00
Data variables:
    Tair     (time, lat, lon) float32 dask.array<chunksize=(730, 13, 27), meta=np.ndarray>
Attributes:
    Conventions:  COARDS
    description:  Data is from NMC initialized reanalysis\n(4x/day).  These a...
    platform:     Model
    title:        4x daily NMC reanalysis (1948)
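The cell that produced the output above is not shown. The sketch below is one way the store might be reopened, assuming the same s3fs and xarray versions used for writing (a Zarr v2-era environment) and the same endpoint and keys. `open_remote_zarr` is a hypothetical helper name, and this is not necessarily the call the original notebook used.

```python
def open_remote_zarr(uri, endpoint_url, access_key, secret_key):
    """Open a Zarr store on S3-compatible storage as a lazily loaded xarray Dataset."""
    # Imported here so the helper can be defined without the optional packages installed.
    import s3fs
    import xarray as xr

    fs = s3fs.S3FileSystem(
        key=access_key,
        secret=secret_key,
        client_kwargs={"endpoint_url": endpoint_url},
    )
    store = s3fs.S3Map(root=uri, s3=fs, check=False)
    # consolidated=True reads the metadata written by to_zarr(..., consolidated=True).
    return xr.open_zarr(store, consolidated=True)
```

In this notebook it could be called as `open_remote_zarr(uri, client_kwargs['endpoint_url'], access_key, secret_key)`. The returned dataset is backed by dask arrays, so values are only fetched when computed.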

Delete the zarr file you created#

target.rm(path + '/S3Map_zarr', recursive=True)

Any user can delete another user's files, so please be careful, especially when using recursive=True.
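Because a recursive delete in a shared space is hard to undo, a small guard can refuse any path outside your own folder. The sketch below assumes your data lives under a single personal prefix. `is_within_prefix` and `safe_rm` are hypothetical helpers, and `fs` is any fsspec-style filesystem with an `rm(path, recursive=...)` method.

```python
import posixpath


def _normalize(path):
    """Drop an optional 's3://' scheme and resolve '.', '..' and duplicate slashes."""
    return posixpath.normpath(path.removeprefix("s3://").strip("/"))


def is_within_prefix(target, own_prefix):
    """True only if `target` lies strictly below `own_prefix`."""
    return _normalize(target).startswith(_normalize(own_prefix) + "/")


def safe_rm(fs, target, own_prefix):
    """Recursively delete `target`, but only if it is inside `own_prefix`."""
    if not is_within_prefix(target, own_prefix):
        raise PermissionError(f"refusing to delete {target!r}: outside {own_prefix!r}")
    fs.rm(target, recursive=True)
```

With the names used in this notebook, the delete step could then be written as `safe_rm(target, path + '/S3Map_zarr', path)`.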