geopfa.extrapolation

extrapolation.py - Gaussian Process Regression (GPR) for extrapolating values on a 2D grid over multiple Z slices.

This module provides functions for:

- I/O and Data Handling: reading data into GeoDataFrames and saving/loading GPy models.
- Pre-processing: preparing data for modeling (extracting slices, standardizing).
- Modeling: building Gaussian Process models (with composite kernels) and making predictions.
- Validation/Evaluation: assessing model performance and diagnostic tests on residuals.
- Visualization: plotting residuals, uncertainty, and comparison of predictions vs true values.

Functions

assess_gp_model_fit(model, Y_true, Y_pred, ...)

Compute regression performance metrics and model diagnostics for a trained Gaussian Process model.
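This summary does not say which metrics are reported. As a rough illustration only, the sketch below computes three common regression metrics (RMSE, MAE, and R²) from true and predicted values. The function name `regression_metrics` and the choice of metrics are assumptions made for this example, not the module's implementation.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Hypothetical helper: RMSE, MAE, and R^2 for paired true/predicted values."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    resid = y_true - y_pred

    # Residual and total sums of squares for the coefficient of determination.
    ss_res = float(np.sum(resid ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))

    return {
        "rmse": float(np.sqrt(np.mean(resid ** 2))),
        "mae": float(np.mean(np.abs(resid))),
        # R^2 is undefined when the true values are constant.
        "r2": 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan"),
    }
```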

backfill_gdf(gdf, value_col, *[, z_value, ...])

Perform Gaussian Process-based extrapolation (or interpolation) to fill missing values in a GeoDataFrame at a specified Z slice.

backfill_gdf_at_height(gdf, *, value_col, ...)

Backfill missing values in a GeoDataFrame at a specific Z slice using a precomputed 2D grid of fill-in predictions.

bootstrap_assess_residuals_stats(Y_true, ...)

Perform bootstrap-based residual diagnostic tests for Gaussian Process residuals.

build_and_fit_gp(X_train_stdized, ...[, ...])

Build and train a Sparse Gaussian Process regression model using a globally stabilized multi-kernel combination (RBF + Matern32 + optional Bias/White).

build_combined_kernel(X_stdized, Y_stdized)

Construct a composite kernel consisting of optional components: RBF, Matern 3/2, long-scale RBF, Bias, and White noise kernels.

build_matern32_kernel_global(X_stdized, ...)

Construct a Matern 3/2 kernel with ARD lengthscales constrained by the global radius of the dataset in standardized space.

build_rbf_kernel_global(X_stdized, Y_stdized)

Construct an RBF kernel whose ARD lengthscale bounds are derived from the global radius of the standardized training coordinates.

check_param_limits_hit_from_constraints(...)

Identify kernel parameters whose optimized values lie at or extremely near their constrained lower or upper bounds.
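The idea of flagging parameters that sit at their bounds can be sketched without GPy. The example below assumes a plain dictionary mapping each parameter name to a `(value, lower, upper)` tuple. That layout, the name `params_at_bounds`, and the tolerance rule are all hypothetical and may differ from what the module does.

```python
import numpy as np


def params_at_bounds(params, rel_tol=1e-3):
    """Hypothetical helper: report parameters lying at or very near a bound.

    params maps name -> (value, lower, upper). A value counts as "at" a bound
    when it lies within rel_tol * (upper - lower) of that bound.
    """
    hits = {}
    for name, (value, lower, upper) in params.items():
        atol = rel_tol * (upper - lower)
        if np.isclose(value, lower, rtol=0.0, atol=atol):
            hits[name] = "lower"
        elif np.isclose(value, upper, rtol=0.0, atol=atol):
            hits[name] = "upper"
    return hits
```

A parameter pinned at a bound after optimization often suggests that the bound, rather than the data, determined the fitted value.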

compute_global_radius(X_std)

Compute a global spatial radius for standardized 2D coordinates.

compute_lengthscale_bounds_from_global_radius(R)

Compute dataset-agnostic kernel lengthscale bounds based on the global spatial radius of the standardized coordinate domain.

drop_z_from_geometry(gdf[, geom_col])

Remove Z coordinates from 3D Point geometries in a GeoDataFrame, converting them into standard 2D Points.

estimate_variance(Y)

Estimate the unbiased sample variance of an array.
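For reference, the unbiased sample variance divides the sum of squared deviations by n - 1 rather than n. A minimal NumPy sketch follows; `unbiased_variance` is a hypothetical stand-in and may handle edge cases differently from the module's function.

```python
import numpy as np


def unbiased_variance(y):
    """Hypothetical stand-in: sample variance with Bessel's correction (ddof=1)."""
    y = np.asarray(y, dtype=float).ravel()
    if y.size < 2:
        raise ValueError("need at least two values for an unbiased estimate")
    return float(np.var(y, ddof=1))
```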

get_gaussian_noise_bounds(lik_params)

Extract Gaussian noise variance bounds from a likelihood-parameter dictionary.

get_predictions(model, X[, kvals_df, ...])

Generate GP predictions, optionally converting back to original Y-units and optionally reshaping predictions into a 2D grid defined by x/y coordinates.

grid_from_tidy(df)

Convert tidy (x, y, value) data into a rectangular grid and aligned meshgrids.
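One way to turn tidy data into a grid is to index each point by its position among the sorted unique x and y values. The sketch below assumes three 1D arrays as input and fills missing cells with NaN. The name `tidy_to_grid`, the argument layout, and the return order are illustrative assumptions, not the module's signature.

```python
import numpy as np


def tidy_to_grid(x, y, values):
    """Hypothetical helper: pivot (x, y, value) triples into a 2D array.

    Returns (XX, YY, Z), where Z[i, j] is the value at (xs[j], ys[i]) and
    cells with no observation are NaN.
    """
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    values = np.asarray(values, dtype=float).ravel()

    # Sorted unique coordinates plus each point's index into them.
    xs, ix = np.unique(x, return_inverse=True)
    ys, iy = np.unique(y, return_inverse=True)

    Z = np.full((ys.size, xs.size), np.nan)
    Z[iy, ix] = values

    XX, YY = np.meshgrid(xs, ys)
    return XX, YY, Z
```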

load_2d_data(gdf, *, value_col[, z_value, ...])

Prepare all arrays needed for 2D Gaussian process interpolation.

load_gpy_model(filepath)

Load a serialized GPy model from disk using joblib.

make_prediction_comparison_plot_2d(X_grid, ...)

Plot side-by-side heatmaps comparing ground truth and predicted values on a shared color scale.

plot_array_with_coords(x, y, Z[, title, ...])

Plot a 2D array Z using provided X-Y coordinates and a colorbar.

plot_residuals(Y_true, Y_pred)

Plot residual diagnostics: a histogram of residuals and a scatter plot of residuals versus predicted values.

prepare_slice(gdf, *, value_col[, z_value, ...])

Prepare a tidy slice of (x, y, value) for an optional z-level filter.

recommend_likelihood_params(Y[, Y_var])

Recommend initialization parameters and a prior for the Gaussian likelihood (observation-noise variance) used in GPy models.

report_model_fit(model_assessment[, ...])

Report (or warn about) model performance diagnostics.

save_gpy_model(model, filepath)

Save a GPy model to disk using joblib.

standardize_xy(X_train, X_full)

Standardize coordinate arrays using the mean and standard deviation computed from the training coordinates.
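The key point is that the full coordinate set is scaled with the training statistics, not its own, so both arrays share one coordinate system. A minimal sketch under that reading is shown below. The name `standardize_with_train_stats`, the zero-spread guard, and the returned tuple are assumptions made for illustration.

```python
import numpy as np


def standardize_with_train_stats(X_train, X_full):
    """Hypothetical helper: z-score both arrays using training mean and std."""
    X_train = np.asarray(X_train, dtype=float)
    X_full = np.asarray(X_full, dtype=float)

    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0)
    # Avoid division by zero for a coordinate with no spread (assumed behavior).
    std = np.where(std == 0, 1.0, std)

    return (X_train - mean) / std, (X_full - mean) / std, mean, std
```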

update_gdf_with_predictions(gdf, Y_grid, ...)

Update a GeoDataFrame with prediction values taken from a 2D grid.