Save files to EOSC (CESNET)
This example shows you how to save your zarr file to the object storage at https://object-store.cloud.muni.cz
import s3fs
import xarray as xr
import zarr
Get a sample file#
ds = xr.tutorial.open_dataset("air_temperature.nc").rename({"air": "Tair"})
Save your results to Remote object storage#
If you have not done so already, create your credentials by following this link
Verify your credentials in
$HOME/.aws/credentials
It should look like
[default]
aws_access_key_id=xxxxx
aws_secret_access_key=yyyy
aws_endpoint_url=https://object-store.cloud.muni.cz
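As a quick sanity check, the file can be parsed with Python's standard `configparser`. This is a minimal sketch, assuming the INI layout shown above; the helper name `read_profile` and the sample values are illustrative and not part of the original notebook.

```python
import configparser

REQUIRED_KEYS = ("aws_access_key_id", "aws_secret_access_key")


def read_profile(text, profile="default"):
    """Parse credentials-file text and return the settings of one profile.

    Raises KeyError if the profile or a required key is missing.
    """
    # interpolation=None keeps any '%' characters in values untouched
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    if not parser.has_section(profile):
        raise KeyError(f"profile [{profile}] not found")
    settings = dict(parser[profile])
    missing = [key for key in REQUIRED_KEYS if key not in settings]
    if missing:
        raise KeyError(f"missing keys in [{profile}]: {missing}")
    return settings


# Placeholder values, same layout as the file shown above.
sample = """
[default]
aws_access_key_id=xxxxx
aws_secret_access_key=yyyy
aws_endpoint_url=https://object-store.cloud.muni.cz
"""

settings = read_profile(sample)
print(sorted(settings))
```

To check the real file, pass its contents instead of the sample, for example `read_profile(pathlib.Path.home().joinpath(".aws", "credentials").read_text())`.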
It is important to save your results in 'your' bucket. [The credential created here](../EOSC_to_bucket.md) gives access to a common space shared by all pangeo-eosc cloud users, so take care not to overwrite other users' data.
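One way to reduce the risk of overwriting existing data is to check whether the target already exists before writing. The following is a minimal sketch and not part of the original notebook: `check_free` is a hypothetical helper that only assumes the filesystem object offers an `exists(path)` method, as fsspec filesystems such as `s3fs.S3FileSystem` do. The `FakeFS` class is a stand-in so the example runs without network access.

```python
def check_free(fs, target):
    """Raise FileExistsError if `target` already exists on filesystem `fs`."""
    if fs.exists(target):
        raise FileExistsError(f"{target} already exists; choose another name")
    return target


class FakeFS:
    """In-memory stand-in for a remote filesystem (illustration only)."""

    def __init__(self, existing):
        self._existing = set(existing)

    def exists(self, path):
        return path in self._existing


fs = FakeFS({"tmp/fepit", "tmp/guillaumeeb"})
print(check_free(fs, "tmp/tinaok/FSStore_zarr"))
```

The check and the later write are two separate steps, so this guards against mistakes rather than against two users writing at the same moment.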
Define your s3 storage parameters#
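# Use your own user name so that you write under your own prefix
your_name = 'tinaok'
path = 'tmp/' + your_name
s3_prefix = "s3://" + path
print(s3_prefix)
# `!` runs a shell command from IPython/Jupyter and returns its output lines
access_key = !aws configure get aws_access_key_id
access_key = access_key[0]
secret_key = !aws configure get aws_secret_access_key
secret_key = secret_key[0]
# Point the S3 client at the CESNET object store instead of AWS
client_kwargs = {'endpoint_url': 'https://object-store.cloud.muni.cz'}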
s3://tmp/tinaok
Define your s3 storage and define your zarr store#
zarr_file_name = "FSStore_zarr"
uri = f"{s3_prefix}/{zarr_file_name}"
store = zarr.storage.FSStore(uri, client_kwargs=client_kwargs, key=access_key, secret=secret_key)
Write your file down to zarr with FSStore#
%time ds.to_zarr(store=store, mode='w', consolidated=True)
/srv/conda/envs/notebook/lib/python3.9/site-packages/xarray/core/dataset.py:2060: SerializationWarning: saving variable None with floating point data as an integer dtype without any _FillValue to use for NaNs
return to_zarr( # type: ignore
CPU times: user 870 ms, sys: 169 ms, total: 1.04 s
Wall time: 51.3 s
<xarray.backends.zarr.ZarrStore at 0x7ffb4b8619e0>
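xarray can also build the remote store itself when given a URI together with `storage_options`, which avoids creating the `FSStore` by hand. The sketch below is an alternative under stated assumptions, not the method used in this notebook: `make_storage_options` is a hypothetical helper, the credentials are placeholders, and the `to_zarr` call is left commented out because it needs valid credentials and network access.

```python
def make_storage_options(access_key, secret_key, endpoint_url):
    """Build the keyword arguments that fsspec forwards to s3fs.S3FileSystem."""
    return {
        "key": access_key,
        "secret": secret_key,
        "client_kwargs": {"endpoint_url": endpoint_url},
    }


# Placeholder credentials; in the notebook these come from `aws configure get`.
storage_options = make_storage_options(
    "xxxxx", "yyyy", "https://object-store.cloud.muni.cz"
)

# With the dataset `ds` and the `uri` defined earlier, the write would then be:
# ds.to_zarr(uri, mode="w", consolidated=True, storage_options=storage_options)
print(storage_options["client_kwargs"])
```

Passing a URI rather than a store object may also be easier to carry over to newer zarr-python releases, where the store classes differ from `zarr.storage.FSStore`.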
Verify that your ‘zarr’ file is at your storage#
target = s3fs.S3FileSystem(anon=False, client_kwargs=client_kwargs)
target.ls(path)  # optionally pass detail=True, refresh=True
['tmp/tinaok/FSStore_zarr']
Open your zarr storage#
xr.open_zarr(store)
<xarray.Dataset>
Dimensions:  (time: 2920, lat: 25, lon: 53)
Coordinates:
  * lat      (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0
  * lon      (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0
  * time     (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00
Data variables:
    Tair     (time, lat, lon) float32 dask.array<chunksize=(730, 13, 27), meta=np.ndarray>
Attributes:
    Conventions:  COARDS
    description:  Data is from NMC initialized reanalysis\n(4x/day).  These a...
    platform:     Model
    references:   http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...
    title:        4x daily NMC reanalysis (1948)
Delete the zarr file you created#
target.ls(path)
['tmp/tinaok/FSStore_zarr']
target.rm(path+'/FSStore_zarr', recursive=True)
target.ls('tmp/')
['tmp/fepit',
'tmp/fepitzarr-demo',
'tmp/guillaumeeb',
'tmp/xarray-demo-dask-s3']
target.rm('tmp/fepit/xesmf_regridding.ipynb')
As the last command shows, one user can delete another user's files. Please be careful, especially when using recursive=True.
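A small guard can make accidental deletions outside your own prefix less likely. This is a minimal sketch, not part of the original notebook: `is_inside` and `guarded_rm` are hypothetical helpers, and `guarded_rm` only assumes that the filesystem object has an `rm(path, recursive=...)` method, as `s3fs.S3FileSystem` does.

```python
import posixpath


def is_inside(prefix, target):
    """Return True if `target` lies strictly inside `prefix` (bucket/key paths)."""
    prefix = posixpath.normpath(prefix)
    target = posixpath.normpath(target)
    # The trailing "/" prevents "tmp/tinaok2" from matching "tmp/tinaok"
    return target.startswith(prefix + "/")


def guarded_rm(fs, prefix, target, recursive=False):
    """Delete `target` only when it is inside the caller's own prefix."""
    if not is_inside(prefix, target):
        raise PermissionError(f"refusing to delete {target}: outside {prefix}")
    fs.rm(target, recursive=recursive)


my_prefix = "tmp/tinaok"
print(is_inside(my_prefix, "tmp/tinaok/FSStore_zarr"))
print(is_inside(my_prefix, "tmp/fepit/xesmf_regridding.ipynb"))
```

In the notebook this would be called as `guarded_rm(target, path, path + '/FSStore_zarr', recursive=True)`. The guard is only a client-side convention; it does not change what the shared credentials are allowed to delete.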