reVeal normalize
Execute the normalize step from a config file.
Convert specified attribute values of input grid to a scale of 0 to 1 using the specified method(s). Outputs a new GeoPackage containing the input grid with added attributes for normalized attributes.
The general structure for calling this CLI command is given below
(add --help to print help info to the terminal).
Usage
reVeal normalize [OPTIONS]
Options
- -c, --config_file <config_file>
Required. Path to the normalize configuration file. Below is a sample template config, shown in JSON, YAML, and TOML.

JSON:

    {
        "execution_control": {
            "option": "local",
            "allocation": "[REQUIRED IF ON HPC]",
            "walltime": "[REQUIRED IF ON HPC]",
            "qos": "normal",
            "memory": null,
            "queue": null,
            "feature": null,
            "conda_env": null,
            "module": null,
            "sh_script": null,
            "keep_sh": false,
            "num_test_nodes": null
        },
        "log_directory": "./logs",
        "log_level": "INFO",
        "grid": "[REQUIRED]",
        "attributes": null,
        "normalize_method": null,
        "invert": false
    }

YAML:

    execution_control:
      option: local
      allocation: '[REQUIRED IF ON HPC]'
      walltime: '[REQUIRED IF ON HPC]'
      qos: normal
      memory: null
      queue: null
      feature: null
      conda_env: null
      module: null
      sh_script: null
      keep_sh: false
      num_test_nodes: null
    log_directory: ./logs
    log_level: INFO
    grid: '[REQUIRED]'
    attributes: null
    normalize_method: null
    invert: false

TOML:

    log_directory = "./logs"
    log_level = "INFO"
    grid = "[REQUIRED]"
    invert = false

    [execution_control]
    option = "local"
    allocation = "[REQUIRED IF ON HPC]"
    walltime = "[REQUIRED IF ON HPC]"
    qos = "normal"
    keep_sh = false
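As a minimal sketch, a local-execution config can also be generated programmatically and then passed to the command with -c. The grid path "grid.gpkg", the file name "normalize_config.json", and the method name "minmax" are placeholder assumptions for illustration; valid method names are defined by reVeal.config.normalize.NormalizeMethodEnum. The command is only assembled here, not executed.

```python
import json
import tempfile
from pathlib import Path


def write_normalize_config(path, grid, normalize_method):
    """Write a minimal local-execution config for ``reVeal normalize``."""
    config = {
        "execution_control": {"option": "local"},
        "log_directory": "./logs",
        "log_level": "INFO",
        "grid": str(grid),  # path to the input vector dataset
        "normalize_method": normalize_method,  # default method for all numeric attributes
        "invert": False,
    }
    Path(path).write_text(json.dumps(config, indent=4))
    return config


# Placeholder values (assumptions, not taken from the reVeal documentation)
out_dir = Path(tempfile.mkdtemp())
config_path = out_dir / "normalize_config.json"
config = write_normalize_config(config_path, "grid.gpkg", "minmax")

# Command line that would run the step with this config
command = ["reVeal", "normalize", "-c", str(config_path)]
print(" ".join(command))
```

Null-valued keys from the template are left out of this sketch, which keeps the file limited to the settings actually being set.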
Parameters
- execution_control : dict
Dictionary containing execution control arguments. Allowed arguments are:
- option:
({‘local’, ‘kestrel’, ‘eagle’, ‘awspc’, ‘slurm’, ‘peregrine’}) Hardware run option. Determines the type of job scheduler to use as well as the base AU cost. The “slurm” option is a catchall for HPC systems that use the SLURM scheduler and should only be used if desired hardware is not listed above. If “local”, no other HPC-specific keys are required in execution_control (they are ignored if provided).
- allocation:
(str) HPC project (allocation) handle.
- walltime:
(int) Node walltime request in hours.
- qos:
(str, optional) Quality-of-service specifier. For Kestrel users: This should be one of {‘standby’, ‘normal’, ‘high’}. Note that ‘high’ priority doubles the AU cost. By default, "normal".
- memory:
(int, optional) Node memory max limit (in GB). By default, None, which uses the scheduler’s default memory limit. For Kestrel users: If you would like to use the full node memory, leave this argument unspecified (or set to None) if you are running on standard nodes. However, if you would like to use the bigmem nodes, you must specify the full upper limit of memory you would like for your job, otherwise you will be limited to the standard node memory size (250GB).
- queue:
(str, optional; PBS ONLY) HPC queue to submit job to. Examples include: ‘debug’, ‘short’, ‘batch’, ‘batch-h’, ‘long’, etc. By default, None, which uses “test_queue”.
- feature:
(str, optional) Additional flags for SLURM job (e.g. “-p debug”). By default, None, which does not specify any additional flags.
- conda_env:
(str, optional) Name of conda environment to activate. By default, None, which does not load any environments.
- module:
(str, optional) Module to load. By default, None, which does not load any modules.
- sh_script:
(str, optional) Extra shell script to run before command call. By default, None, which does not run any scripts.
- keep_sh:
(bool, optional) Option to keep the HPC submission script on disk. Only has effect if executing on HPC. By default, False, which purges the submission scripts after each job is submitted.
- num_test_nodes:
(str, optional) Number of nodes to submit before terminating the submission process. This can be used to test a new submission configuration without submitting all nodes (i.e. only running a handful to ensure the inputs are specified correctly and the outputs look reasonable). By default, None, which submits all node jobs.
Only the option key is required for local execution. For execution on the HPC, the allocation and walltime keys are also required. All other options are populated with default values, as seen above.
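The requirement rules above can be sketched as a small pre-flight check. This is an illustrative helper, not part of reVeal; the function name `missing_execution_keys` and the allocation handle "my_project" are hypothetical.

```python
# Keys that must be set whenever the run is not local (per the text above)
HPC_REQUIRED_KEYS = ("allocation", "walltime")


def missing_execution_keys(execution_control):
    """Return the required keys that are absent from an execution_control dict."""
    if "option" not in execution_control:
        return ["option"]
    if execution_control["option"] == "local":
        # Local runs need nothing else; HPC-specific keys are ignored if present
        return []
    return [key for key in HPC_REQUIRED_KEYS if execution_control.get(key) is None]


local = {"option": "local"}
hpc = {"option": "kestrel", "allocation": "my_project", "walltime": 2}  # hypothetical handle
incomplete = {"option": "kestrel", "qos": "high"}

print(missing_execution_keys(local))
print(missing_execution_keys(hpc))
print(missing_execution_keys(incomplete))
```

Running a check like this before submission catches a missing allocation or walltime without waiting for the scheduler to reject the job.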
- log_directory : str
Path to directory where logs should be written. Path can be relative and does not have to exist on disk (it will be created if missing). By default, "./logs".
- log_level : {“DEBUG”, “INFO”, “WARNING”, “ERROR”}
String representation of desired logger verbosity. Suitable options are DEBUG (most verbose), INFO (moderately verbose), WARNING (only log warnings and errors), and ERROR (only log errors). By default, "INFO".
- grid : str
Path to vector dataset for which attribute normalization will be performed. Must be an existing vector dataset in a format that can be opened by pyogrio. Does not strictly need to be a grid, or even a polygon dataset, but must be a vector dataset.
- attributes : dict, optional
Attributes to be normalized. Must be a dictionary keyed by the name of the output column for each normalized attribute. Each value must be another dictionary with the following keys:
- attribute: String indicating the name of the attribute to normalize.
- normalize_method: Method to use for normalizing the attribute to a scale from 0 to 1. Refer to reVeal.config.normalize.NormalizeMethodEnum.
- invert: Boolean option. If specified as True, normalized values will be inverted such that low values will be closer to 1, and higher values closer to 0. Default is False, under which values are normalized with low values closer to 0 and high values closer to 1.
If attributes is not specified, normalize_method must be provided.
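To make the shape of the attributes dictionary and the effect of invert concrete, here is a small sketch. The column names "slope" and "slope_score" and the method name "minmax" are assumptions, and `minmax_normalize` is a generic min-max scaling written for illustration, with inversion taken to mean 1 minus the scaled value. It is not reVeal's implementation; the actual methods are those listed in reVeal.config.normalize.NormalizeMethodEnum.

```python
# Example attributes entry: output column name -> settings for one attribute
attributes = {
    "slope_score": {
        "attribute": "slope",          # hypothetical input column
        "normalize_method": "minmax",  # assumed method name
        "invert": True,                # low slope -> score near 1
    },
}


def minmax_normalize(values, invert=False):
    """Scale values linearly to [0, 1]; optionally flip so low values score high."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant input: no spread to scale, so return zeros (a choice made for this sketch)
        return [0.0 for _ in values]
    scaled = [(v - lo) / (hi - lo) for v in values]
    return [1.0 - s for s in scaled] if invert else scaled


slope = [0.0, 5.0, 10.0]
print(minmax_normalize(slope))               # [0.0, 0.5, 1.0]
print(minmax_normalize(slope, invert=True))  # [1.0, 0.5, 0.0]
```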
- normalize_method : str, optional
Optional default method to be used for normalization. If specified, this method will be applied to all numeric attributes in the input grid that are not specified separately in the input attributes. Each output column will be named based on the corresponding input column plus a suffix “_score”. If normalize_method is not specified, attributes must be provided. Refer to reVeal.config.normalize.NormalizeMethodEnum.
- invert : bool, optional
If specified as True and normalize_method is provided, all attributes not specified separately in attributes will be normalized with values inverted (i.e., low values will be closer to 1, and higher values closer to 0). Default is False, under which values are normalized with low values closer to 0 and high values closer to 1. Note that this parameter will have no effect if normalize_method is not specified.
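The interaction between attributes, normalize_method, and invert can be sketched as a planning function that lists which output columns would be produced. This is one reading of the rules above, not reVeal's code: `plan_normalization` is a hypothetical helper, the method names are placeholders, and "not specified separately" is assumed to mean the input column does not appear as an attribute value in attributes.

```python
def plan_normalization(numeric_columns, attributes=None, normalize_method=None, invert=False):
    """Map each output column to (input column, method, invert flag)."""
    if not attributes and normalize_method is None:
        raise ValueError("Provide attributes, normalize_method, or both.")

    plan = {}
    explicit_inputs = set()

    # Explicit entries carry their own method and invert setting
    for out_col, spec in (attributes or {}).items():
        plan[out_col] = (spec["attribute"], spec["normalize_method"], spec.get("invert", False))
        explicit_inputs.add(spec["attribute"])

    # The default method covers the remaining numeric columns, with a "_score" suffix
    if normalize_method is not None:
        for col in numeric_columns:
            if col not in explicit_inputs:
                plan[f"{col}_score"] = (col, normalize_method, invert)

    return plan


plan = plan_normalization(
    ["slope", "dist"],  # hypothetical numeric columns in the grid
    attributes={
        "slope_norm": {"attribute": "slope", "normalize_method": "percentile", "invert": True}
    },
    normalize_method="minmax",
)
print(plan)
```

In this example the top-level invert stays False, so only the explicitly configured "slope" column is inverted.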
Note that you may remove any keys with a null value if you do not intend to update them yourself.
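Pruning null keys from a filled-in template can be automated. The helper below is an illustrative sketch (the name `drop_nulls` and the example values are assumptions); it removes every key whose value is None, including keys nested under execution_control.

```python
def drop_nulls(config):
    """Recursively remove keys whose value is None (null in JSON/YAML)."""
    return {
        key: drop_nulls(value) if isinstance(value, dict) else value
        for key, value in config.items()
        if value is not None
    }


template = {
    "execution_control": {
        "option": "local",
        "qos": "normal",
        "memory": None,
        "queue": None,
        "keep_sh": False,
    },
    "log_directory": "./logs",
    "grid": "grid.gpkg",           # placeholder path
    "attributes": None,
    "normalize_method": "minmax",  # assumed method name
    "invert": False,
}

print(drop_nulls(template))
```

False values are kept, since only None is treated as null.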