sup3r.preprocessing.rasterizers.dual.DualRasterizer#
- class DualRasterizer(data: Sup3rDataset | tuple[Dataset, Dataset] | dict[str, Dataset], regrid_workers=1, regrid_lr=True, run_qa=True, s_enhance=1, t_enhance=1, lr_cache_kwargs=None, hr_cache_kwargs=None)[source]#
Bases:
Container
Object containing xr.Dataset instances for low- and high-res data (usually ERA5 and WTK, respectively). This essentially just regrids the low-res data to the coarsened high-res grid, which is useful for caching prepared data that can then go directly to a
DualSampler or DualBatchQueue.
Note
When first extracting the low_res data, make sure to extract a region that completely overlaps the high_res region. It is easiest to load the full low_res domain and let
DualRasterizer select the appropriate region through regridding.
Initialize data container lr and hr
Data instances. Typically lr = ERA5 data and hr = WTK data.
- Parameters:
data (Sup3rDataset | tuple[xr.Dataset, xr.Dataset] | dict[str, xr.Dataset]) – A tuple of xr.Dataset instances. The first must be low-res and the second must be high-res data.
regrid_workers (int | None) – Number of workers to use for regridding routine.
regrid_lr (bool) – Flag to regrid the low-res data to the high-res grid. This will take care of any minor inconsistencies in different projections. Disable this if the grids are known to be the same.
run_qa (bool) – Flag to run QA on the regridded low-res data. This checks for NaNs and fills them if there are not too many.
s_enhance (int) – Spatial enhancement factor
t_enhance (int) – Temporal enhancement factor
lr_cache_kwargs (dict) – Cache kwargs for the call to lr_data.cache_data(cache_kwargs). Must include ‘cache_pattern’ key if not None, and can also include dictionary of chunk tuples with feature keys
hr_cache_kwargs (dict) – Cache kwargs for the call to hr_data.cache_data(cache_kwargs). Must include ‘cache_pattern’ key if not None, and can also include dictionary of chunk tuples with feature keys
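The run_qa behaviour described above (check for NaNs, fill them if there are not too many) can be illustrated with a minimal sketch. This is not the sup3r implementation: the function name `nn_fill`, the 1D layout, and the `max_nan_frac` threshold are assumptions made for illustration only.

```python
import math


def nn_fill(values, max_nan_frac=0.1):
    """Fill NaNs in a 1D sequence with the nearest valid value.

    max_nan_frac is a hypothetical threshold, not a sup3r parameter.
    """
    nan_idx = [i for i, v in enumerate(values) if math.isnan(v)]
    if not nan_idx:
        return list(values)
    frac = len(nan_idx) / len(values)
    if frac > max_nan_frac:
        # "Too many" NaNs: refuse to fill rather than invent data.
        raise ValueError(f"{frac:.1%} of values are NaN; refusing to fill")
    valid_idx = [i for i, v in enumerate(values) if not math.isnan(v)]
    filled = list(values)
    for i in nan_idx:
        # Nearest valid neighbour by index distance (ties go to the lower index).
        nearest = min(valid_idx, key=lambda j: abs(j - i))
        filled[i] = values[nearest]
    return filled


series = [1.0, 2.0, float("nan"), 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
filled = nn_fill(series)
```

The real check operates on gridded data rather than a flat list, so treat this only as a picture of the "fill if few, otherwise fail" idea.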
Methods
Check for NaNs after regridding and do NN fill if needed.
derive(feature[, strict]) – Resolve feature name to a feature in the underlying data.
Get regridder object
post_init_log([args_dict]) – Log additional arguments after initialization.
update_hr_data() – Set the high resolution data attribute and check if hr_data.shape is divisible by s_enhance.
update_lr_data() – Regrid low_res data for all requested noncached features.
wrap(data) – Return a Sup3rDataset object or tuple of such.
Attributes
- property data#
Return underlying data.
- Returns:
See also
- update_hr_data()[source]#
Set the high resolution data attribute and check if hr_data.shape is divisible by s_enhance. If not, take the largest shape that can be.
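As a rough illustration of "take the largest shape that can be", the sketch below rounds each spatial dimension down to the nearest multiple of s_enhance. The helper `trim_shape` is hypothetical, and the sketch assumes only the two spatial dimensions are checked; which edge of the grid is trimmed is not stated here and is not modelled.

```python
def trim_shape(hr_shape, s_enhance):
    """Largest shape not exceeding hr_shape that s_enhance divides evenly."""
    return tuple(s_enhance * (n // s_enhance) for n in hr_shape)


s_enhance = 4
hr_shape = (101, 203)  # hypothetical (lat, lon) shape of the high-res grid
trimmed = trim_shape(hr_shape, s_enhance)

# The coarsened high-res grid, onto which low-res data is regridded,
# then has this shape.
lr_shape = tuple(n // s_enhance for n in trimmed)
```

With these numbers the high-res grid is trimmed to (100, 200) and the matching low-res grid is (25, 50).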
- update_lr_data()[source]#
Regrid low_res data for all requested noncached features. Cached features are loaded instead if they are available and overwrite=False.
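The cache-or-regrid decision can be sketched as follows. This is an illustration under stated assumptions, not the library's code: `split_features` is a hypothetical helper, and the sketch assumes the cache pattern contains a `{feature}` placeholder that is filled in per feature.

```python
import os
import tempfile


def split_features(features, cache_pattern, overwrite=False):
    """Partition features into those loadable from cache and those to regrid."""
    cached, to_regrid = [], []
    for feature in features:
        # Assumed convention: one cache file per feature.
        path = cache_pattern.format(feature=feature)
        if os.path.exists(path) and not overwrite:
            cached.append(feature)
        else:
            to_regrid.append(feature)
    return cached, to_regrid


with tempfile.TemporaryDirectory() as tmp:
    pattern = os.path.join(tmp, "lr_{feature}.nc")
    # Pretend u_10m was cached by an earlier run.
    open(pattern.format(feature="u_10m"), "w").close()

    cached, to_regrid = split_features(["u_10m", "v_10m"], pattern)
    _, forced = split_features(["u_10m", "v_10m"], pattern, overwrite=True)
```

Here only v_10m would be regridded on the first call, while overwrite=True sends both features back through regridding.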
- derive(feature, strict=True)#
Resolve feature name to a feature in the underlying data. This is used for handling feature aliases and for deriving new features from existing ones.
- post_init_log(args_dict=None)#
Log additional arguments after initialization.
- property shape#
Get shape of underlying data.
- wrap(data)#
Return a
Sup3rDataset object or tuple of such. This is a tuple when the .data attribute belongs to a Collection object like BatchHandler. Otherwise this is a Sup3rDataset object, which is either a wrapped 3-tuple, 2-tuple, or 1-tuple (e.g. len(data) == 3, len(data) == 2 or len(data) == 1). This is a 3-tuple when .data belongs to a container object like DualSamplerWithObs, a 2-tuple when .data belongs to a dual container object like DualSampler, and a 1-tuple otherwise.