geopfa.extrapolation.bootstrap_assess_residuals_stats

bootstrap_assess_residuals_stats(Y_true, Y_pred, *, alpha=0.05, n_boot=100, sample_size=500, random_state=None)

Perform bootstrap-based residual diagnostic tests for Gaussian Process residuals.

This function repeatedly draws bootstrap samples of the residuals and applies the following standard diagnostic tests to each sample:

  • Shapiro-Wilk (normality, small/medium N)

  • D’Agostino K² (normality via skewness & kurtosis)

  • Jarque-Bera (normality via joint skew/kurtosis)

  • Levene (homoscedasticity / equal variance)

  • Ljung-Box (no autocorrelation at lag 10)

Per-test p-values are collected across bootstrap samples, and summary statistics (mean/median p-value, rejection rate) are returned.
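The procedure described above can be sketched roughly as follows. This is a minimal illustration, not the geopfa implementation: the helper name `bootstrap_pvalues` is hypothetical, resampling with replacement is an assumption, and only the three normality tests available in `scipy.stats` are shown. Levene and Ljung-Box are omitted because the source does not say how residuals are grouped or which library computes the autocorrelation test.

```python
import numpy as np
from scipy import stats


def bootstrap_pvalues(residuals, n_boot=100, sample_size=500, seed=0):
    """Collect per-test p-values over bootstrap resamples (illustrative only)."""
    rng = np.random.default_rng(seed)
    residuals = np.asarray(residuals, dtype=float).ravel()
    # Cap the sample size at the number of available residuals.
    size = min(sample_size, residuals.size)
    pvals = {"Shapiro-Wilk": [], "D'Agostino": [], "Jarque-Bera": []}
    for _ in range(n_boot):
        # Assumption: samples are drawn with replacement.
        sample = rng.choice(residuals, size=size, replace=True)
        # Each scipy test returns (statistic, p-value); keep the p-value.
        _, p_sw = stats.shapiro(sample)
        _, p_k2 = stats.normaltest(sample)
        _, p_jb = stats.jarque_bera(sample)
        pvals["Shapiro-Wilk"].append(p_sw)
        pvals["D'Agostino"].append(p_k2)
        pvals["Jarque-Bera"].append(p_jb)
    return {name: np.asarray(p) for name, p in pvals.items()}


# Residuals would normally be Y_true - Y_pred; synthetic values are used here.
demo_residuals = np.random.default_rng(1).normal(size=300)
demo = bootstrap_pvalues(demo_residuals, n_boot=20, sample_size=100)
```

Each array in the returned dictionary holds one p-value per bootstrap iteration, which is the raw material for the summary statistics the function reports.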

Parameters:
  • Y_true (numpy.ndarray) – True target observations.

  • Y_pred (numpy.ndarray) – Predicted values from the GP.

  • alpha (float, optional) – Significance threshold used for computing rejection rates. Defaults to 0.05.

  • n_boot (int, optional) – Number of bootstrap iterations. Defaults to 100.

  • sample_size (int, optional) – Size of each bootstrap sample. If larger than the dataset, the full dataset is used. Defaults to 500.

  • random_state (int, numpy.random.Generator, or None, optional) – Seed or initialized generator for reproducibility. Defaults to None.
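One plausible way to handle `sample_size` and `random_state` as described is sketched below. The helper `draw_bootstrap_indices` is hypothetical, and capping the sample size at the dataset length is only one reading of "the full dataset is used"; the actual function may differ. `numpy.random.default_rng` accepts an integer seed, an existing `Generator`, or `None`, which matches the types listed for `random_state`.

```python
import numpy as np


def draw_bootstrap_indices(n_obs, sample_size=500, random_state=None):
    """Return row indices for one bootstrap sample (illustrative only)."""
    # default_rng passes an existing Generator through unchanged and
    # builds a new one from an int seed or from fresh entropy (None).
    rng = np.random.default_rng(random_state)
    # Assumption: an oversized sample_size is capped at the dataset size.
    size = min(sample_size, n_obs)
    return rng.integers(0, n_obs, size=size)


# The same integer seed gives the same indices on every call.
first = draw_bootstrap_indices(1000, sample_size=10, random_state=7)
second = draw_bootstrap_indices(1000, sample_size=10, random_state=7)
```

Passing a shared `Generator` instead of an integer advances its state between calls, so successive samples differ while the overall run stays reproducible.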

Returns:

pandas.DataFrame

A dataframe with rows for each test:

['Shapiro-Wilk', "D'Agostino", 'Jarque-Bera', 'Levene', 'Ljung-Box']

and columns:
  • Test

  • Purpose

  • Mean p-value

  • Median p-value

  • Rejection Rate (fraction of p < alpha)

  • Warn (True if rejection rate > 0.05)
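The summary columns can be reproduced from a list of bootstrap p-values roughly as follows. This is a sketch under stated assumptions: `summarize_pvalues` and its `warn_threshold` argument are hypothetical names, and NaN p-values are not handled here because the source does not describe that logic.

```python
import numpy as np


def summarize_pvalues(pvalues, alpha=0.05, warn_threshold=0.05):
    """Summarize bootstrap p-values for one test (illustrative only)."""
    p = np.asarray(pvalues, dtype=float)
    # Fraction of bootstrap samples in which the test rejects at level alpha.
    rejection_rate = float(np.mean(p < alpha))
    return {
        "Mean p-value": float(np.mean(p)),
        "Median p-value": float(np.median(p)),
        "Rejection Rate": rejection_rate,
        # The documented rule: warn when the rejection rate exceeds 0.05.
        "Warn": bool(rejection_rate > warn_threshold),
    }


# One of four p-values falls below alpha, so the rejection rate is 0.25.
row = summarize_pvalues([0.01, 0.2, 0.4, 0.6])
```

With this rule, a test that rejects in more than 5% of bootstrap samples is flagged, regardless of how large the mean or median p-value is.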

Notes

  • All exception handling is preserved exactly as in the original version.

  • Tests that fail or produce NaN p-values are handled identically.

  • The logic for the “Warn” column is unchanged.