sup3r.preprocessing.loaders.xr.LoaderX

class LoaderX(file_paths, features='all', res_kwargs=None, chunks='auto', feature_aliases=None, BaseLoader=None)

Bases: BaseLoader

Base xarray loader. Can load any file type supported by xarray. Primarily used to “load” NetCDF or Zarr files. The .data attribute provides access to the data in the files. This object provides a __getitem__ method that can be used by Sampler objects to build batches or by other objects to derive or extract specific features, regions, or time periods.

Parameters:
  • file_paths (str | pathlib.Path | list) – Location(s) of files to load

  • features (list | str) – Features to return in loaded dataset. If ‘all’ then all available features will be returned.

  • res_kwargs (dict) – Additional keyword arguments passed through to the BaseLoader. BaseLoader is usually xr.open_mfdataset for NETCDF files and MultiFileResourceX for H5 files.

  • chunks (dict | str | None) – Dictionary of chunk sizes passed through to dask.array.from_array() for H5 data or to xr.Dataset().chunk() for NETCDF data. The dictionary is converted to a tuple when used in from_array(). This argument can also be “auto” instead of a dictionary. If this is None, the data is not chunked and is instead loaded directly into memory.

  • feature_aliases (dict) – Optional dictionary of feature aliases to use when loading data. This is useful for renaming features to expected sup3r names. For example, {‘sp’: ‘pressure_0m’, ‘u10’: ‘u_10m’}.

  • BaseLoader (Callable) – Optional override for the base loader. The default for H5 files is MultiFileResourceX and for NETCDF or ZARR files is xarray.open_mfdataset.
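The argument handling described above can be sketched in plain Python. This is only an illustration of the documented behaviour, not sup3r's implementation: the helper names (rename_features, select_features, chunks_as_tuple), the dimension names, and the choice to fall back to "auto" for dimensions missing from the chunk dictionary are all assumptions made for this example.

```python
# Illustrative only: plain-Python stand-ins, not sup3r's implementation.
# All function names and dimension names below are hypothetical.


def rename_features(available, feature_aliases=None):
    """Map source variable names to the names expected downstream."""
    aliases = feature_aliases or {}
    return [aliases.get(name, name) for name in available]


def select_features(available, features="all"):
    """Return every available feature for 'all', else the requested subset."""
    if features == "all":
        return list(available)
    requested = [features] if isinstance(features, str) else list(features)
    missing = [f for f in requested if f not in available]
    if missing:
        raise KeyError(f"Features not found: {missing}")
    return requested


def chunks_as_tuple(chunks, dims):
    """Order a chunk dictionary by dimension so it can be used as a tuple."""
    if chunks is None or isinstance(chunks, str):
        # None (no chunking) and 'auto' pass through unchanged.
        return chunks
    # Assumption: dimensions missing from the dictionary fall back to 'auto'.
    return tuple(chunks.get(dim, "auto") for dim in dims)


source_names = ["sp", "u10", "v10"]
renamed = rename_features(source_names, {"sp": "pressure_0m", "u10": "u_10m"})
print(renamed)
print(select_features(renamed, ["u_10m"]))
print(chunks_as_tuple({"time": 24, "south_north": 100},
                      ("south_north", "west_east", "time")))
```

Keeping the three steps as separate functions mirrors how the parameters are documented: aliases rename variables, features filters them, and chunks only affects how the arrays are partitioned.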

Methods

  • BASE_LOADER(file_paths, **kwargs) – Lowest level interface to data.

  • post_init_log([args_dict]) – Log additional arguments after initialization.

  • wrap(data) – Return a Sup3rDataset object or tuple of such.

Attributes

  • timer

  • data – Return underlying data.

  • file_paths – Get file paths for input data.

  • shape – Get shape of underlying data.

classmethod BASE_LOADER(file_paths, **kwargs)

Lowest level interface to data.

property data

Return underlying data.

Returns:

Sup3rDataset

See also

wrap()

property file_paths

Get file paths for input data.

post_init_log(args_dict=None)

Log additional arguments after initialization.

property shape

Get shape of underlying data.

wrap(data)

Return a Sup3rDataset object or tuple of such. The result is a tuple when the .data attribute belongs to a Collection object like BatchHandler. Otherwise it is a Sup3rDataset object, which wraps a 3-tuple, 2-tuple, or 1-tuple (i.e. len(data) == 3, len(data) == 2, or len(data) == 1). It is a 3-tuple when .data belongs to a container object like DualSamplerWithObs, a 2-tuple when .data belongs to a dual container object like DualSampler, and a 1-tuple otherwise.
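As a rough illustration of the member counts described above, the sketch below labels a wrapped dataset by its length. Plain tuples stand in for Sup3rDataset objects, and the helper name describe_wrapped and the label strings are hypothetical; none of this is taken from the sup3r code base.

```python
# Illustrative only: plain tuples stand in for wrapped datasets.
# MEMBER_KINDS and describe_wrapped are hypothetical names.
MEMBER_KINDS = {
    1: "single container",
    2: "dual container",
    3: "dual container with observations",
}


def describe_wrapped(data):
    """Return a label for a wrapped dataset based on its member count."""
    n_members = len(data)
    if n_members not in MEMBER_KINDS:
        raise ValueError(f"Expected 1, 2 or 3 members, got {n_members}")
    return MEMBER_KINDS[n_members]


for members in [("a",), ("a", "b"), ("a", "b", "c")]:
    print(len(members), describe_wrapped(members))
```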