geopfa.extrapolation.backfill_gdf

backfill_gdf(gdf, value_col, *, z_value=None, z_tol=1e-06, test_size=0.2, seed=42, verbose=True)

Perform Gaussian Process-based extrapolation (or interpolation) to fill missing values in a GeoDataFrame at a specified Z slice.

This high-level routine orchestrates a complete 2D GP modeling pipeline:

  1. Slice preparation via load_2d_data()

  2. Sparse GP training via build_and_fit_gp()

  3. Model evaluation (regression metrics + optional bootstrap diagnostics)

  4. Prediction on missing grid cells

  5. Reconstruction of a full 2D backfilled prediction array

  6. GeoDataFrame merge using either:

       • backfill_gdf_at_height() (if z_value is provided), or

       • update_gdf_with_predictions() (if 2D / no Z slice)
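
Steps 4 and 5 above can be pictured with a small, library-free sketch. It is not the geopfa implementation: backfill_grid is a hypothetical helper, missing cells are assumed to be NaN, and the predictor is a plain mean of the known cells standing in for the trained GP.

```python
import math

def backfill_grid(grid, predict):
    """Return a copy of `grid` with NaN cells replaced by predict(row, col)."""
    filled = [row[:] for row in grid]      # copy so the input grid stays untouched
    for i, row in enumerate(grid):
        for j, value in enumerate(row):
            if math.isnan(value):          # assumption: NaN marks a missing cell
                filled[i][j] = predict(i, j)
    return filled

nan = float("nan")
grid = [
    [1.0, 2.0, nan],
    [2.0, nan, 4.0],
    [3.0, 4.0, 5.0],
]

# Placeholder predictor: the mean of the known cells (NOT a Gaussian Process).
known = [v for row in grid for v in row if not math.isnan(v)]
mean_known = sum(known) / len(known)

filled = backfill_grid(grid, lambda i, j: mean_known)
```

Known cells pass through unchanged and only the missing cells receive predictions, which is the property the reconstructed 2D array is described as having.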

Parameters:
  • gdf (geopandas.GeoDataFrame) – Input GeoDataFrame containing 3D point geometries and the target column. Must include columns 'x', 'y', 'z' if Z slicing is used.

  • value_col (str) – Name of the value column to fill.

  • z_value (float or None, optional) – Z value of the slice to fill. If None, the data are treated as 2D.

  • z_tol (float, optional) – Allowed tolerance for selecting points with z ≈ z_value.

  • test_size (float, optional) – Fraction of known points reserved for model validation.

  • seed (int, optional) – Random seed for train/validation splitting.

  • verbose (bool, optional) – Whether to print progress and diagnostics and to show plots.
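
To make the roles of z_tol, test_size, and seed concrete, here is a minimal standard-library sketch. The helpers select_slice and split_known are hypothetical, and the absolute-tolerance comparison and shuffle-based split are assumptions about how such parameters are commonly applied, not a description of geopfa internals.

```python
import random

def select_slice(points, z_value, z_tol=1e-6):
    """Keep points whose z lies within z_tol of z_value (assumed absolute tolerance)."""
    return [p for p in points if abs(p["z"] - z_value) <= z_tol]

def split_known(known, test_size=0.2, seed=42):
    """Shuffle reproducibly, then hold out a fraction of points for validation."""
    rng = random.Random(seed)              # same seed gives the same split
    shuffled = known[:]
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_size))
    return shuffled[n_test:], shuffled[:n_test]

# Twenty illustrative points alternating between two depths.
points = [
    {"x": float(i), "y": 0.0, "z": 0.0 if i % 2 == 0 else 100.0, "value": float(i)}
    for i in range(20)
]

slice_points = select_slice(points, z_value=0.0)
train, test = split_known(slice_points, test_size=0.2, seed=42)
```

With ten points in the slice and test_size=0.2, eight points are used for training and two are held out; repeating the call with the same seed reproduces the same split.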

Returns:

geopandas.GeoDataFrame – A copy of the input GeoDataFrame with missing values at the selected slice filled using GP predictions.

Notes

  • All behavior from the original implementation is preserved.

  • No modifications are made to the original gdf; a filled copy is returned.

  • Plotting and diagnostics occur only if verbose=True.
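
Example

A hedged usage sketch based only on the signature documented above. The column name "temperature", the synthetic grid, and the use of NaN to mark missing values are illustrative assumptions; the geopandas and geopfa imports are deferred into the function so the data-building part can be run on its own.

```python
import math

def make_demo_records(n=10):
    """Build an n-by-n grid at z = 0.0 with some values missing (NaN)."""
    records = []
    for i in range(n):
        for j in range(n):
            missing = (i + j) % 7 == 0     # arbitrary pattern of gaps
            records.append({
                "x": float(i),
                "y": float(j),
                "z": 0.0,
                "temperature": float("nan") if missing else float(i + j),
            })
    return records

def fill_demo_slice(records):
    """Call backfill_gdf on the demo data (requires pandas, geopandas, geopfa)."""
    import pandas as pd
    import geopandas as gpd
    from geopfa.extrapolation import backfill_gdf

    df = pd.DataFrame(records)
    gdf = gpd.GeoDataFrame(
        df, geometry=gpd.points_from_xy(df["x"], df["y"], df["z"])
    )
    # Keyword-only arguments follow the documented signature.
    return backfill_gdf(
        gdf, "temperature",
        z_value=0.0, z_tol=1e-6, test_size=0.2, seed=42, verbose=False,
    )

records = make_demo_records()
# filled = fill_demo_slice(records)  # uncomment where geopfa is installed
```

Passing verbose=False suppresses the plots and diagnostics, which is usually preferable in scripts and batch jobs.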