geopfa.extrapolation.build_and_fit_gp

build_and_fit_gp(X_train_stdized, Y_train_stdized, optimize_restarts=0, verbose=False, save_path=None, n_inducing=300, lower_frac=0.02, upper_frac=0.2)

Build and train a Sparse Gaussian Process regression model using a globally stabilized multi-kernel combination (RBF + Matern32 + optional Bias/White).

This method improves extrapolation stability by:

  • Constraining ARD lengthscales based on global radius,

  • Distributing inducing points across the domain via KMeans,

  • Anchoring the prior with a constant-mean mapping,

  • Adding bias and white kernels to reduce drift,

  • Using a Gamma prior on the Gaussian likelihood noise variance.
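The first item can be illustrated with a minimal sketch of how lengthscale bounds might be derived from a global radius. This page does not define "global radius", so the definition used here (largest Euclidean distance from the data centroid) is an assumption, and `lengthscale_bounds` is a hypothetical helper, not part of geopfa. Only the `lower_frac` and `upper_frac` defaults are taken from the signature above.

```python
import numpy as np


def lengthscale_bounds(X, lower_frac=0.02, upper_frac=0.2):
    """Return (lower, upper) lengthscale bounds as fractions of a global radius.

    Assumption: the global radius is the largest Euclidean distance from
    the centroid of X to any training point. The library may define it
    differently.
    """
    X = np.asarray(X, dtype=float)
    centroid = X.mean(axis=0)
    radius = float(np.linalg.norm(X - centroid, axis=1).max())
    return lower_frac * radius, upper_frac * radius


# Four corner points of a square centred on the origin.
X = np.array([[-1.0, -1.0], [1.0, 1.0], [-1.0, 1.0], [1.0, -1.0]])
lower, upper = lengthscale_bounds(X)
```

With the default fractions, the admissible lengthscales span one order of magnitude (2% to 20% of the radius), which keeps the kernel from collapsing to very short or very long correlation ranges.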

Parameters:
  • X_train_stdized (numpy.ndarray) – Standardized training coordinates, shape (N, D).

  • Y_train_stdized (numpy.ndarray) – Standardized training targets, shape (N, 1).

  • optimize_restarts (int, optional) – Number of multi-start optimization attempts. Default is 0.

  • verbose (bool, optional) – Whether to print optimization diagnostics. Default is False.

  • save_path (str or None, optional) – If provided, the model is saved to this path via joblib. Default is None.

  • n_inducing (int, optional) – Target number of inducing points for SparseGPRegression. Default is 300.

  • lower_frac (float, optional) – Fraction of the global radius used as the lower lengthscale bound. Default is 0.02.

  • upper_frac (float, optional) – Fraction of the global radius used as the upper lengthscale bound. Default is 0.2.

Returns:

  • model (GPy.models.SparseGPRegression) – Trained sparse Gaussian Process model.

  • constraint_info (dict) – Dictionary containing kernel and likelihood constraint diagnostics.
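Example:

The function expects inputs that are already standardized, with targets shaped (N, 1). The sketch below shows one way to prepare them. It assumes z-score standardization (zero mean, unit variance per column), which this page does not specify; the data and the `standardize` helper are hypothetical. The final call is left as a comment because it requires geopfa and GPy to be installed.

```python
import numpy as np

# Hypothetical raw data: 50 points in 2-D with one target value each.
rng = np.random.default_rng(0)
X_raw = rng.uniform(0.0, 1000.0, size=(50, 2))
y_raw = rng.normal(150.0, 20.0, size=50)


def standardize(a):
    """Z-score each column; return the scaled array plus the mean and std."""
    mean = a.mean(axis=0)
    std = a.std(axis=0)
    return (a - mean) / std, mean, std


X_train_stdized, x_mean, x_std = standardize(X_raw)
# Targets must be a column vector of shape (N, 1), not a flat array.
Y_train_stdized, y_mean, y_std = standardize(y_raw.reshape(-1, 1))

# With geopfa installed, the call would then look like:
# from geopfa.extrapolation import build_and_fit_gp
# model, constraint_info = build_and_fit_gp(
#     X_train_stdized, Y_train_stdized, n_inducing=300
# )
```

Keeping `y_mean` and `y_std` makes it possible to map predictions back to the original units afterwards.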