Inference scales and the κ diagnostic

The session-level phi / phi_inv pair selects the inference scale: the scale on which the delta method and the κ diagnostic are computed. Wald CI endpoints are formed on that scale and then back-transformed for the report.

Common scale helpers:

| Helper | phi | When to use |
| --- | --- | --- |
| Margins.linear_scale(...) | identity | additive contrasts; AME on response |
| Margins.log_scale(...) | exp | rate ratios, risk ratios, hazard ratios |
| Margins.logit_scale(...) | expit | odds ratios, probabilities |
| Margins.correlation_scale(...) | tanh | Fisher-z transformed correlations |
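
As a reminder of the transformations behind the four helpers, the sketch below writes each one as a plain NumPy pair and checks that the round trip recovers the input. It does not call pymargins. The `SCALES` dictionary and the names `to_inference` / `to_report` are hypothetical, chosen for illustration; the second function in each pair is the one listed in the table (identity, exp, expit, tanh).

```python
import numpy as np

# Illustrative pairs only: (to_inference, to_report) are hypothetical names,
# not pymargins attributes.
SCALES = {
    "identity": (lambda v: v, lambda v: v),
    "log": (np.log, np.exp),
    "logit": (lambda p: np.log(p / (1 - p)), lambda z: 1 / (1 + np.exp(-z))),
    "fisher_z": (np.arctanh, np.tanh),
}

# Values in (0, 1) are valid inputs for all four transformations.
values = np.array([0.1, 0.5, 0.9])

for name, (to_inference, to_report) in SCALES.items():
    z = to_inference(values)   # move to the inference scale
    back = to_report(z)        # back-transform to the reporting scale
    assert np.allclose(back, values)
    print(name, np.round(z, 3))
```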

The rule of thumb: pick the scale on which the contrast is most nearly linear in β. That keeps κ small, keeps the symmetric Wald CI on that scale honest, and makes the back-transformed reporting CI asymmetric in the right direction.
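
To make "asymmetric in the right direction" concrete, here is a minimal NumPy sketch that compares two 95% Wald intervals for a small probability: one built directly on the response scale with the delta-method standard error, and one built on the logit scale and back-transformed with expit. The point estimate and standard error are illustrative numbers, not pymargins output, and the code is a generic textbook calculation rather than the library's implementation.

```python
import numpy as np


def expit(z):
    """Inverse logit: maps the real line to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


# Illustrative values (assumptions, not output from any fit).
eta_hat = -3.0   # estimated linear predictor on the logit scale
se_eta = 0.8     # its standard error on the logit scale
z975 = 1.959963984540054  # 97.5% standard normal quantile

p_hat = expit(eta_hat)

# Wald CI on the response scale: delta method gives se_p = p(1 - p) * se_eta.
se_p = p_hat * (1.0 - p_hat) * se_eta
ci_response = (p_hat - z975 * se_p, p_hat + z975 * se_p)

# Wald CI on the logit scale, then back-transformed endpoint by endpoint.
ci_logit = (expit(eta_hat - z975 * se_eta), expit(eta_hat + z975 * se_eta))

lower_arm = p_hat - ci_logit[0]
upper_arm = ci_logit[1] - p_hat

print("response-scale CI:  ", ci_response)
print("back-transformed CI:", ci_logit)
print("arms (lower, upper):", lower_arm, upper_arm)
```

With these inputs the response-scale interval is symmetric and its lower endpoint falls below zero, while the back-transformed interval stays inside (0, 1) and has its longer arm pointing away from the boundary.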

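import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

from pymargins import Margins

# Simulate a binary outcome with a low baseline probability (intercept -3)
# and a steep slope, so predicted probabilities are strongly nonlinear in β.
rng = np.random.default_rng(0)
n = 1500
df = pd.DataFrame({"x": rng.normal(0, 1, n)})
lp = -3.0 + 1.8 * df["x"]  # true linear predictor on the logit scale
df["y"] = rng.binomial(1, 1 / (1 + np.exp(-lp)))

# Fit a logistic regression; this fit is passed to the scale helpers below.
fit = smf.glm("y ~ x", data=df, family=sm.families.Binomial()).fit()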

κ as a pre-flight diagnostic

print(Margins.linear_scale(fit, at="overall").diagnose().summary())
print(Margins.log_scale(fit, at="overall").diagnose().summary())
print(Margins.logit_scale(fit, at="overall").diagnose().summary())
Session diagnostic (50 design points)
  Session: Margins session
  Model: GLMResultsWrapper
  Adapter: StatsmodelsGLMAdapter
  Inference scale: identity
  Variance: default
  Confidence level: 0.95
  At: overall
  Method: delta (κ-threshold=0.3)
  n_sim: 4000
  n_boot: 1000 
  Gradient backend: autodiff
  Diagnostics: enabled
  Strict: False
  κ min:    0.010
  κ median: 0.131
  κ max:    0.337
  Verdict:  delta_unreliable
  Delta method is unreliable for this analytical posture. Use method='simulation' or method='bootstrap' for inference, or reconsider the inference scale (phi).
Session diagnostic (50 design points)
  Session: Margins session
  Model: GLMResultsWrapper
  Adapter: StatsmodelsGLMAdapter
  Inference scale: log
  Variance: default
  Confidence level: 0.95
  At: overall
  Method: delta (κ-threshold=0.3)
  n_sim: 4000
  n_boot: 1000 
  Gradient backend: autodiff
  Diagnostics: enabled
  Strict: False
  κ min:    0.001
  κ median: 0.006
  κ max:    0.124
  Verdict:  delta_borderline
  Delta method is borderline. Consider running specific estimands with method='simulation' or enabling automatic fallback via the session's kappa_threshold.
Session diagnostic (50 design points)
  Session: Margins session
  Model: GLMResultsWrapper
  Adapter: StatsmodelsGLMAdapter
  Inference scale: logit
  Variance: default
  Confidence level: 0.95
  At: overall
  Method: delta (κ-threshold=0.3)
  n_sim: 4000
  n_boot: 1000 
  Gradient backend: autodiff
  Diagnostics: enabled
  Strict: False
  κ min:    0.000
  κ median: 0.000
  κ max:    0.000
  Verdict:  delta_reliable
  Delta method is reliable across the design. Spot-check with method='simulation' for estimands at extreme covariate values.

When κ exceeds the session threshold (kappa_threshold=0.3 by default), the next call falls back to simulation automatically. The summary on every result reports which inference path was actually used.
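
The fallback can be pictured with the sketch below. It is a generic parametric-simulation interval (draw coefficients from their estimated sampling distribution, evaluate the estimand on each draw, take percentiles), not pymargins' actual code path. The helpers `choose_method` and `simulate_ci` are hypothetical, and the coefficient vector and covariance matrix are illustrative values; only the threshold of 0.3 and the default of 4000 draws come from the diagnostic output above.

```python
import numpy as np


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-z))


def choose_method(kappa, kappa_threshold=0.3):
    """Hypothetical decision rule: simulate when κ exceeds the threshold."""
    return "simulation" if kappa > kappa_threshold else "delta"


def simulate_ci(estimand, beta_hat, cov, n_sim=4000, level=0.95, seed=0):
    """Percentile interval from draws of β ~ N(beta_hat, cov)."""
    rng = np.random.default_rng(seed)
    draws = rng.multivariate_normal(beta_hat, cov, size=n_sim)
    values = np.array([estimand(b) for b in draws])
    alpha = 1.0 - level
    lo, hi = np.quantile(values, [alpha / 2.0, 1.0 - alpha / 2.0])
    return float(lo), float(hi)


# Illustrative coefficients and covariance (assumptions, not a real fit).
beta_hat = np.array([-3.0, 1.8])
cov = np.array([[0.04, -0.01],
                [-0.01, 0.02]])


def prob_at_x1(beta):
    """Estimand: predicted probability at x = 1."""
    return expit(beta[0] + beta[1] * 1.0)


# κ max values reported above for the identity and log scales.
for kappa in (0.337, 0.124):
    print(kappa, "->", choose_method(kappa))

lo, hi = simulate_ci(prob_at_x1, beta_hat, cov)
print("point estimate:", prob_at_x1(beta_hat))
print("simulation CI: ", (lo, hi))
```

Because the percentiles are taken on the estimand itself, the interval needs no linearization and cannot leave the estimand's range, which is why simulation is a reasonable substitute when the delta method is flagged.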

See The κ curvature diagnostic for the math.