krigekit.kriging_indicator#

Python wrapper for Multiple Indicator Kriging (MIK) and Sequential Indicator Simulation (SIS) via the Fortran t_kriging_indicator type.

Each of the K categories (or threshold classes) is treated as one indicator variable (ivar = 1..K). Estimation produces K probability values per block; simulation produces a single drawn category (encoded as a one-hot binary vector).

Typical workflow — estimation#

>>> ik = IndicatorKriging(ncat=3, ndim=2)
>>> ik.set_categorical_obs(coord=obs_coord, categories=obs_cat,
...                        category_labels=[1, 2, 3], nmax=20)
>>> for k in range(1, 4):
...     ik.set_vgm(ivar=k, jvar=k, vtype="sph", nugget=0, sill=1.0,
...                a_major=1000, a_minor1=500, a_minor2=500)
...     ik.set_search(ivar=k)
>>> ik.set_grid(coord=grid_coord)
>>> ik.solve()
>>> probs, var = ik.get_results()   # probs.shape == (ngrid, ncat)
>>> del ik

Typical workflow — SIS#

>>> ik = IndicatorKriging(ncat=3, ndim=2, nsim=100)
>>> ik.set_categorical_obs(...)
>>> ik.set_indicator_vgm(vtype="sph", sill=0.2, a_major=500)
>>> ik.set_grid(coord=grid_coord)
>>> ik.set_sim()    # generates U(0,1) draws in Python and passes to Fortran
>>> for k in range(1, 4): ik.set_search(ivar=k)
>>> ik.solve()
>>> sims, _ = ik.get_results()     # sims.shape == (ngrid, ncat, nsim)
...                                 # each [:, :, i] is a one-hot encoded realisation

Classes#

IndicatorKriging

Multiple Indicator Kriging and Sequential Indicator Simulation,

Module Contents#

class krigekit.kriging_indicator.IndicatorKriging(ncat: int, nvar: int | None = None, ndim: int = 2, ndrift: int = 0, unbias: int = 1, nsim: int = 0, anisotropic_search: bool = False, weight_correction: bool = False, use_old_weight: bool = False, store_weight: bool = False, cross_validation: bool = False, write_mat: bool = False, neglect_error: bool = True, varying_vgm: bool = False, std_ck: bool = False, verbose: bool = False, pf_cache: bool = False, weight_file: str = '', bounds: tuple | None = None, seed: int | None = None)#

Bases: krigekit.kriging.Kriging

Multiple Indicator Kriging and Sequential Indicator Simulation, with optional secondary co-variate support.

Extends Kriging — all setup, solve, and results methods are inherited unchanged. The differences are:

  • The Fortran object is a t_kriging_indicator (created via krige_ind_create), which overrides prepare, sim_draw, and post_solve to implement indicator-specific behaviour.

  • ncat names the K indicator categories. nvar defaults to ncat (pure MIS) but can be set larger to add secondary continuous co-variates for co-kriging MIS (see below).

  • set_categorical_obs() is a convenience method that converts raw category labels into K binary indicator datasets.

Parameters:
  • ncat (int) – Number of categories K. Indicator variables occupy ivar = 1..ncat.

  • nvar (int, optional) – Total number of co-kriging variables. Defaults to ncat (pure MIS). Set nvar = ncat + M to include M secondary continuous variables (ivar = ncat+1 .. nvar). Secondary variables contribute to the kriging weights but are excluded from the CDF draw and probability normalisation.

  • ndim (int) – Number of spatial dimensions (2 or 3).

  • nsim (int) – 0 = estimation (returns probabilities); >0 = SIS (returns one-hot draws).

:param All other keyword arguments are passed through to Kriging.:

Notes

For SIS (nsim > 0), call set_sim() after set_grid().

Co-kriging MIS example (K=3 categories + 1 secondary variable):

ik = IndicatorKriging(ncat=3, nvar=4, ndim=2)
ik.set_categorical_obs(coord, cats, nmax=20)   # loads ivar=1,2,3
ik.set_obs(ivar=4, coord=sec_coord, value=sec_val)  # secondary
for iv in range(1, 5):
    for jv in range(1, 5):
        ik.set_vgm(ivar=iv, jvar=jv, ...)
ik.set_grid(coord=grid_coord)
for k in range(1, 5):
    ik.set_search(ivar=k)
ik.solve()
probs, var = ik.get_results()   # shape (ngrid, 3) — secondary excluded
set_sim(randpath: numpy.ndarray | None = None, sample: numpy.ndarray | None = None)#

Set up Sequential Indicator Simulation parameters.

Overrides set_sim(). When sample is None, delegates to the parent with no sample so that the Fortran set_sim_indicator override generates U(0, 1) draws directly; no sample array is created in Python. When sample is supplied, validates that every value lies in [0, 1] then delegates to the parent.

Parameters:
  • randpath (ndarray of int, shape (nblocks,), optional) – Random visiting order (1-based). Generated with a random permutation if omitted.

  • sample (ndarray, shape (nblocks, nvar, nsim), optional) – Pre-drawn U(0, 1) samples. Every value must lie in [0, 1]. When None, Fortran generates U(0, 1) via set_sim_indicator.

set_categorical_obs(coord: numpy.ndarray, categories: numpy.ndarray, category_labels=None, nmax: int = 32, maxdist: float | None = None, variance: numpy.ndarray | None = None)#

Build K binary indicator datasets from raw category labels.

For each category k (ivar = 1..K), creates a binary indicator array I_k = (categories == category_labels[k-1]) and calls set_obs(ivar=k, ...)().

Parameters:
  • coord (ndarray, shape (nobs, ndim)) – Sample coordinates.

  • categories (array-like, shape (nobs,)) – Integer or string category label at each sample.

  • category_labels (list, optional) – Ordered list of K unique labels — one per indicator variable. If omitted, sorted unique values from categories are used. The order determines which label maps to ivar=1, 2, …, K.

  • nmax (int) – Maximum neighbours per indicator variable.

  • maxdist (float) – Maximum search distance (0 = unlimited).

  • variance (ndarray, shape (nobs,) or None) – Measurement error variance at each sample (0 if omitted).

set_indicator_vgm(vtype: str = 'sph', nugget: float = 0.0, sill: float = 1.0, a_major: float = 1.0, a_minor1: float | None = None, a_minor2: float | None = None, azimuth: float = 0.0, dip: float = 0.0, plunge: float = 0.0, cross: str = 'same', proportions: numpy.ndarray | None = None)#

Set indicator variograms for all K² pairs in one call.

This is a convenience wrapper around set_vgm() that handles the three standard approaches for the cross-variogram sills.

Parameters:
  • vtype – Shared variogram model — same shape and range are used for all pairs. sill is the base partial sill (see cross for how it is applied to off-diagonal pairs).

  • nugget – Shared variogram model — same shape and range are used for all pairs. sill is the base partial sill (see cross for how it is applied to off-diagonal pairs).

  • sill – Shared variogram model — same shape and range are used for all pairs. sill is the base partial sill (see cross for how it is applied to off-diagonal pairs).

  • a_major – Shared variogram model — same shape and range are used for all pairs. sill is the base partial sill (see cross for how it is applied to off-diagonal pairs).

  • a_minor1 – Shared variogram model — same shape and range are used for all pairs. sill is the base partial sill (see cross for how it is applied to off-diagonal pairs).

  • a_minor2 – Shared variogram model — same shape and range are used for all pairs. sill is the base partial sill (see cross for how it is applied to off-diagonal pairs).

  • azimuth – Shared variogram model — same shape and range are used for all pairs. sill is the base partial sill (see cross for how it is applied to off-diagonal pairs).

  • dip – Shared variogram model — same shape and range are used for all pairs. sill is the base partial sill (see cross for how it is applied to off-diagonal pairs).

  • plunge – Shared variogram model — same shape and range are used for all pairs. sill is the base partial sill (see cross for how it is applied to off-diagonal pairs).

  • cross ({"same", "independent", "proportional"}) –

    Controls how cross-variogram sills are computed:

    • "same" (default) — sill is identical for all K² pairs. Simple and robust; post_solve normalisation absorbs the error.

    • "independent" — auto-variogram sill as given (or from proportions); cross-variogram sill = 0 (no co-kriging). Equivalent to running K separate ordinary kriging systems.

    • "proportional" — auto sills ∝ p_k (1 − p_k) from proportions; cross sills = √(s_k · s_l). Satisfies the LMC positive-definiteness condition for each nested structure. Requires proportions.

  • proportions (array-like of shape (K,), optional) –

    Observed category proportions [p_1, …, p_K]. When given:

    • auto sill for category k = p_k · (1 − p_k) (indicator variance)

    • cross sill for k ≠ l = √(s_k · s_l) (if cross="proportional")

    Ignored when cross="same".

Notes

Theoretical sills differ per category (p_k varies), and cross-sills are theoretically negative (−p_k · p_l for jointly exhaustive indicators). cross="same" uses one positive value for all pairs, relying on post_solve_indicator normalisation to produce a valid probability simplex. cross="proportional" is more principled; cross="independent" is the most conservative (no coupling between indicator variables).

Examples

Same sill for all 16 pairs (current practice):

ik.set_indicator_vgm(vtype="sph", nugget=0.02, sill=0.19,
                     a_major=500, a_minor1=80, azimuth=90)

Category-specific sills from observed proportions, proportional cross-terms:

props = np.array([0.18, 0.23, 0.21, 0.38])
ik.set_indicator_vgm(vtype="sph", nugget=0.02, sill=0.19,
                     a_major=500, a_minor1=80, azimuth=90,
                     cross="proportional", proportions=props)

Independent kriging (zero cross-covariance):

ik.set_indicator_vgm(vtype="sph", nugget=0.02, sill=0.19,
                     a_major=500, a_minor1=80, azimuth=90,
                     cross="independent", proportions=props)