Simultaneous confidence intervals and multiple-testing correction

When you ask for m margins in one call, the per-margin CIs are pointwise by default. Family-wise coverage requires a wider critical value.

Correlated margins: simultaneous CIs

For a single vector result where the components are correlated (e.g. predictions at adjacent ages, or slopes over several regions), use conf_int(simultaneous=True). This is correlation-aware and strictly tighter than Bonferroni:

res = m.predict(atexog={"age": [25, 35, 45, 55, 65]})

res.conf_int(level=0.95)                       # pointwise
res.conf_int(level=0.95, method="bonferroni")  # works with any method
res.conf_int(level=0.95, method="sidak")
res.conf_int(level=0.95, method="sup-t")       # requires draws (sim/bootstrap)

Choosing a method

Method

Assumption

When to use

pointwise

None

Single pre-registered comparison

bonferroni

None

Conservative fallback; always valid but often wide

sidak

Independent tests (or positively dependent)

Slightly less conservative than Bonferroni; good for large families

sup-t

Draws available (simulation or bootstrap)

Recommended when tests are correlated — e.g. predictions at adjacent ages

sup-t reads the family-wise critical value off the empirical distribution of max_j |t_j| from the simulation or bootstrap draw matrix. It is typically narrower than Bonferroni and Šidák when the margins in the family are correlated — exactly the case for predictions evaluated at neighbouring covariate profiles.

Example: all-pairs comparisons with simultaneous CIs

from pymargins import all_pairwise

scen, W = all_pairwise("region", ["N", "S", "E", "W"])
res = m.contrasts(scenarios=scen, contrasts=W)

# sup-t uses the joint bootstrap/simulation draws
res.conf_int(level=0.95, method="sup-t")

Independent results: multiple-comparison adjustment

When you have independent results from separate calls (e.g. estimates from different cohorts or subgroups that do not share a joint covariance), use pymargins.adjust():

from pymargins import adjust

results = {
    "cohort_a": m.predict(atexog={"tx": 1, "cohort": "a"}),
    "cohort_b": m.predict(atexog={"tx": 1, "cohort": "b"}),
    "cohort_c": m.predict(atexog={"tx": 1, "cohort": "c"}),
}

adj = adjust(results, method="holm")
print(adj.summary())
print(adj.to_frame())

adjust wraps statsmodels.stats.multitest.multipletests and accepts any of its method names ("bonferroni", "holm", "sidak", "fdr_bh", etc.). Pass a single MarginsResult, a list, or a dict.

Subgroup equality tests via pairwise contrasts

When you compute an estimand within subgroups (over="region"), the natural follow-up is to test whether the subgroup estimates are equal. Use pairwise_contrasts() to form all pairwise differences with proper joint inference, then joint_test() or adjust():

from pymargins import diff_matrix, adjust

# Slopes within each region
slopes = m.dydx("age", over="region")

# All pairwise differences (k*(k-1)/2 contrasts)
diffs = slopes.pairwise_contrasts()
print(diffs.summary())

# Joint Wald test: H0 — all slopes are equal
print(diffs.joint_test().summary())

# Or correct for multiple comparisons across the pairs
adj = adjust(diffs, method="holm")
print(adj.summary())

Building contrast matrices manually

diff_matrix(k, kind="reference") returns a (k-1, k) matrix of each level versus the first level. Use it directly with contrast() on any vector result:

C = diff_matrix(4, kind="reference")   # 3 rows: [2-1, 3-1, 4-1]
ref_diffs = slopes.contrast(C, labels=["S-N", "E-N", "W-N"])
print(ref_diffs.summary())

For all pairwise differences, kind="pairwise" returns a (k*(k-1)/2, k) matrix. pairwise_contrasts() is simply this matrix applied automatically.

When to use which

  • conf_int(simultaneous=True) — one vector result with correlated components. Uses the exact joint covariance or empirical correlation from draws. Tighter and more powerful than generic p-value adjustment.

  • pairwise_contrasts() — you already have a vector result and want to test all pairwise equalities. The contrasts share Σ̂, so joint inference is exact.

  • adjust(...) — several independent results that do not share a session or Σ̂. The only option when results come from different models or non-overlapping samples.