Inference scales and the κ diagnostic
The session-level phi / phi_inv pair picks the inference scale:
the scale on which the delta method and the κ diagnostic are
computed, and on which the CI is built before its endpoints are
back-transformed for the report.
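To make the role of the pair concrete, here is a minimal sketch in plain NumPy. It is not how pymargins implements anything: the helper `wald_ci_on_scale`, the derivative argument `dphi`, and the numbers `p_hat` and `se` are all hypothetical, and the sketch assumes phi is monotone increasing so that the lower endpoint maps to the lower endpoint.

```python
import numpy as np

Z95 = 1.959963984540054  # standard normal 97.5% quantile


def wald_ci_on_scale(estimate, se, phi, phi_inv, dphi, z=Z95):
    """Symmetric Wald CI built on the phi scale, reported on the original scale."""
    t = phi(estimate)                  # point estimate on the inference scale
    se_t = abs(dphi(estimate)) * se    # delta method: SE(phi(theta)) ~ |phi'(theta)| * SE(theta)
    lower, upper = t - z * se_t, t + z * se_t
    return phi_inv(lower), phi_inv(upper)  # back-transform the endpoints


def identity(p):
    return p


def one(p):
    return 1.0


def logit(p):
    return np.log(p / (1.0 - p))


def expit(t):
    return 1.0 / (1.0 + np.exp(-t))


def dlogit(p):
    return 1.0 / (p * (1.0 - p))


# Hypothetical estimate: a small probability with a fairly large standard error.
p_hat, se = 0.05, 0.03

ci_identity = wald_ci_on_scale(p_hat, se, identity, identity, one)
ci_logit = wald_ci_on_scale(p_hat, se, logit, expit, dlogit)

print("identity scale:", ci_identity)
print("logit scale:   ", ci_logit)
```

On the identity scale the interval is symmetric around the estimate and its lower endpoint falls below zero. On the logit scale the interval is symmetric only before back-transformation, so the reported endpoints stay inside (0, 1) and sit asymmetrically around the estimate.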
Common scale helpers:
| Helper | | When to use |
|---|---|---|
| | identity | additive contrasts; AME on response |
| | | rate ratios, risk ratios, hazard ratios |
| | | odds ratios, probabilities |
| | | Fisher-z transformed correlations |
The rule of thumb: pick the scale on which the contrast is most
nearly linear in β. That keeps κ small, the symmetric Wald CI
honest, and the back-transformed reporting CI asymmetric in the right
direction.
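The rule of thumb can be checked numerically. The sketch below is only an illustration under stated assumptions: `linearization_error` is a hypothetical stand-in for curvature, not the library's κ, and the coefficient vector, design point, and perturbation `delta` are made up (the coefficients echo the simulated model used on this page). It measures how far a predicted probability, viewed on three scales, departs from its first-order approximation in β.

```python
import numpy as np


def expit(t):
    return 1.0 / (1.0 + np.exp(-t))


def linearization_error(g, beta, delta, eps=1e-6):
    """Relative gap between g(beta + delta) - g(beta) and its first-order approximation."""
    # Central finite-difference gradient of g at beta.
    grad = np.array([
        (g(beta + eps * e) - g(beta - eps * e)) / (2.0 * eps)
        for e in np.eye(len(beta))
    ])
    first_order = grad @ delta
    exact = g(beta + delta) - g(beta)
    return abs(exact - first_order) / abs(first_order)


beta = np.array([-3.0, 1.8])    # hypothetical logistic coefficients
x = np.array([1.0, 1.0])        # design point: intercept plus x = 1
delta = np.array([0.15, 0.12])  # hypothetical perturbation of beta


def prob(b):
    return expit(x @ b)


# The same estimand expressed on three candidate inference scales.
scales = {
    "identity": lambda b: prob(b),
    "log": lambda b: np.log(prob(b)),
    "logit": lambda b: np.log(prob(b)) - np.log1p(-prob(b)),
}

errors = {name: linearization_error(g, beta, delta) for name, g in scales.items()}
for name, err in errors.items():
    print(f"{name:>8}: {err:.4f}")
```

For a logistic model the logit of a predicted probability is exactly the linear predictor, so its linearization error is zero up to floating-point noise, while the identity and log scales both show a visible gap at this design point.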
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf
from pymargins import Margins
rng = np.random.default_rng(0)
n = 1500
df = pd.DataFrame({"x": rng.normal(0, 1, n)})
lp = -3.0 + 1.8 * df["x"]
df["y"] = rng.binomial(1, 1 / (1 + np.exp(-lp)))
fit = smf.glm("y ~ x", data=df, family=sm.families.Binomial()).fit()
κ as a pre-flight diagnostic
print(Margins.linear_scale(fit, at="overall").diagnose().summary())
print(Margins.log_scale(fit, at="overall").diagnose().summary())
print(Margins.logit_scale(fit, at="overall").diagnose().summary())
Session diagnostic (50 design points)
Session: Margins session
Model: GLMResultsWrapper
Adapter: StatsmodelsGLMAdapter
Inference scale: identity
Variance: default
Confidence level: 0.95
At: overall
Method: delta (κ-threshold=0.3)
n_sim: 4000
n_boot: 1000
Gradient backend: autodiff
Diagnostics: enabled
Strict: False
κ min: 0.010
κ median: 0.131
κ max: 0.337
Verdict: delta_unreliable
Delta method is unreliable for this analytical posture. Use method='simulation' or method='bootstrap' for inference, or reconsider the inference scale (phi).
Session diagnostic (50 design points)
Session: Margins session
Model: GLMResultsWrapper
Adapter: StatsmodelsGLMAdapter
Inference scale: log
Variance: default
Confidence level: 0.95
At: overall
Method: delta (κ-threshold=0.3)
n_sim: 4000
n_boot: 1000
Gradient backend: autodiff
Diagnostics: enabled
Strict: False
κ min: 0.001
κ median: 0.006
κ max: 0.124
Verdict: delta_borderline
Delta method is borderline. Consider running specific estimands with method='simulation' or enabling automatic fallback via the session's kappa_threshold.
Session diagnostic (50 design points)
Session: Margins session
Model: GLMResultsWrapper
Adapter: StatsmodelsGLMAdapter
Inference scale: logit
Variance: default
Confidence level: 0.95
At: overall
Method: delta (κ-threshold=0.3)
n_sim: 4000
n_boot: 1000
Gradient backend: autodiff
Diagnostics: enabled
Strict: False
κ min: 0.000
κ median: 0.000
κ max: 0.000
Verdict: delta_reliable
Delta method is reliable across the design. Spot-check with method='simulation' for estimands at extreme covariate values.
When κ exceeds the session threshold (kappa_threshold=0.3 by
default), the next call automatically falls back to simulation. The
summary on every result reports which inference path was actually used.
See The κ curvature diagnostic for the math.
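As a rough picture of what a threshold-driven fallback involves, the following sketch switches between a delta-method interval and a simulation interval based on a supplied κ value. Everything here is an assumption for illustration: the function names `delta_ci`, `simulation_ci`, and `infer`, the coefficient vector, and the covariance matrix are hypothetical, and the library's actual fallback logic may differ. The κ values passed in are the maxima from the identity and log summaries above.

```python
from statistics import NormalDist

import numpy as np


def expit(t):
    return 1.0 / (1.0 + np.exp(-t))


def delta_ci(g, beta_hat, cov, level=0.95, eps=1e-6):
    """First-order (delta method) Wald interval for g(beta)."""
    grad = np.array([
        (g(beta_hat + eps * e) - g(beta_hat - eps * e)) / (2.0 * eps)
        for e in np.eye(len(beta_hat))
    ])
    se = float(np.sqrt(grad @ cov @ grad))
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    center = g(beta_hat)
    return center - z * se, center + z * se


def simulation_ci(g, beta_hat, cov, level=0.95, n_sim=4000, seed=0):
    """Percentile interval from draws of beta ~ N(beta_hat, cov)."""
    rng = np.random.default_rng(seed)
    draws = rng.multivariate_normal(beta_hat, cov, size=n_sim)
    values = np.array([g(b) for b in draws])
    alpha = 1.0 - level
    lower, upper = np.quantile(values, [alpha / 2.0, 1.0 - alpha / 2.0])
    return float(lower), float(upper)


def infer(g, beta_hat, cov, kappa, kappa_threshold=0.3):
    """Use the delta method unless kappa exceeds the threshold."""
    if kappa > kappa_threshold:
        return "simulation", simulation_ci(g, beta_hat, cov)
    return "delta", delta_ci(g, beta_hat, cov)


beta_hat = np.array([-3.0, 1.8])                    # hypothetical estimates
cov = np.array([[0.02, -0.005], [-0.005, 0.015]])   # hypothetical covariance
x = np.array([1.0, 1.0])                            # design point


def predicted_prob(b):
    return expit(x @ b)


# kappa values taken from the identity and log summaries above.
method_hi, ci_hi = infer(predicted_prob, beta_hat, cov, kappa=0.337)
method_lo, ci_lo = infer(predicted_prob, beta_hat, cov, kappa=0.124)
print(method_hi, ci_hi)
print(method_lo, ci_lo)
```

The simulation path never linearizes the estimand, which is why it stays usable when curvature is high, at the cost of Monte Carlo noise that shrinks as `n_sim` grows.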