Linear contrasts with contrasts

import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf
from pymargins import Margins

rng = np.random.default_rng(42)
n = 2000
df = pd.DataFrame({
    "age": rng.integers(20, 75, n),
    "female": rng.binomial(1, 0.52, n),
    "treated": rng.binomial(1, 0.40, n),
    "region": rng.choice(["N", "S", "E", "W"], n),
    "group": rng.choice(["A", "B"], n),
    "preexist": rng.binomial(1, 0.4, n),
})
lp = (-1.5 + 0.04 * df["age"] - 0.3 * df["female"] + 0.8 * df["treated"]
      + 0.2 * (df["region"] == "S") + 0.4 * (df["region"] == "E")
      - 0.1 * (df["region"] == "W")
      + 0.6 * (df["group"] == "B") + 0.4 * df["preexist"])
df["y"] = rng.binomial(1, 1 / (1 + np.exp(-lp)))

fit = smf.glm("y ~ age + female + treated + C(region) + C(group) + preexist", data=df,
              family=sm.families.Binomial()).fit()
m = Margins.log_scale(fit, at="overall")

Margins.contrasts forms a weighted sum of scenario predictions on the inference scale. It is the workhorse for risk differences, risk ratios, odds ratios, lift, reference-level comparisons, and difference-in-differences — any estimand that is a linear combination of predictions.

For nonlinear compositions (ratios on the raw scale, NNT, reciprocals, custom utility functions) use evaluate. See Contrasts vs evaluate — choosing the right tool for the decision rule.

How contrasts works

contrasts takes a list of scenarios and a weight vector (or matrix). The engine:

  1. Computes the inference-scale prediction for each scenario: hᵢ = φ⁻¹( mean_predict(beta, scenario_i) ).

  2. Forms the weighted sum: Σᵢ wᵢ · hᵢ.

  3. Runs delta-method inference (or simulation/bootstrap if κ is high).

  4. Back-transforms CI endpoints with phi for reporting.

Mathematically:

result = φ( Σᵢ wᵢ · φ⁻¹(pᵢ) )

where pᵢ is the aggregated response-scale prediction for scenario i.

Because the inference is on a linear combination of φ⁻¹(pᵢ), the delta method is exact to the extent that the individual hᵢ are locally linear. This is why simple contrasts usually have smaller κ than evaluate calls.

Risk difference (linear scale)

The simplest contrast: the arithmetic difference between two predicted probabilities.

from pymargins import Margins, pairwise

m = Margins.linear_scale(fit, at="overall")

scen, w = pairwise("treated", [1, 0])
res = m.contrasts(scenarios=scen, contrasts=w)
print(res.summary())
=============================================================
              Margins Result (delta, level=0.95)             
=============================================================
           estimate  std err       z  P>|z|  [95% Conf. Int.]
-------------------------------------------------------------
treated=1    0.1492   0.0185  8.0667  0.000    0.1129, 0.1854
=============================================================

n = 2000
κ: 0.039
Delta-vs-sim disagreement: 1.543%

On the linear scale the contrast is p₁ p₀. The CI is symmetric on the probability scale.

Risk ratio (log scale)

A ratio is a difference on the log scale. The back-transform turns the log-ratio into a ratio with an asymmetric CI.

m = Margins.log_scale(fit, at="overall")

scen, w = pairwise("treated", [1, 0])
res = m.contrasts(scenarios=scen, contrasts=w)
print(res.summary())
=============================================================
              Margins Result (delta, level=0.95)             
=============================================================
           estimate  std err       z  P>|z|  [95% Conf. Int.]
-------------------------------------------------------------
treated=1    1.2247   0.0253  8.0232  0.000    1.1656, 1.2869
=============================================================

n = 2000
Note: std err is on the inference scale; estimate and CI are on the reporting scale.
κ: 0.046
Delta-vs-sim disagreement: 0.354%

The point estimate is exp(log(p₁) log(p₀)) = p₁ / p₀. Because the inference is on the log scale, the delta method is exact and κ is small.

Odds ratio (logit scale)

For probabilities near 0 or 1, the logit scale keeps the CI inside (0, 1) for each arm before forming the odds ratio.

m = Margins.logit_scale(fit, at="overall")

scen, w = pairwise("treated", [1, 0])
res = m.contrasts(scenarios=scen, contrasts=w)
print(res.summary())
=============================================================
              Margins Result (delta, level=0.95)             
=============================================================
           estimate  std err       z  P>|z|  [95% Conf. Int.]
-------------------------------------------------------------
treated=1    0.6877   0.1053  7.4956  0.000    0.6417, 0.7302
=============================================================

n = 2000
Note: std err is on the inference scale; estimate and CI are on the reporting scale.
κ: 0.017
Delta-vs-sim disagreement: 0.172%

The back-transform is expit(logit(p₁) logit(p₀)), which simplifies to the odds ratio (p₁/(1−p₁)) / (p₀/(1−p₀)).

Lift (RR − 1)

Lift is a risk ratio minus one. The easiest path is log_scale for the ratio, then subtract one from the estimate and CI endpoints:

m = Margins.log_scale(fit, at="overall")
res = m.contrasts(scenarios=scen, contrasts=w)

lift_est = float(res.estimate) - 1.0
lift_ci = (float(res.conf_int_lower) - 1.0, float(res.conf_int_upper) - 1.0)

Reference-level contrasts

Compare every level of a factor against a common baseline. The weight matrix has one row per comparison.

from pymargins import reference

scen, W = reference("region", ["N", "S", "E", "W"], ref_level="N")
res = m.contrasts(scenarios=scen, contrasts=W)
print(res.summary())
===========================================================
             Margins Result (delta, level=0.95)            
===========================================================
        estimate  std err        z  P>|z|  [95% Conf. Int.]
-----------------------------------------------------------
S_vs_N    0.9850   0.0381  -0.3969  0.691    0.9142, 1.0613
E_vs_N    1.0892   0.0341   2.5057  0.012    1.0188, 1.1646
W_vs_N    0.9357   0.0395  -1.6794  0.093    0.8660, 1.0112
===========================================================

n = 2000
Note: std err is on the inference scale; estimate and CI are on the reporting scale.
κ: max=0.052
Delta-vs-sim disagreement: 0.513%

All-pairs comparisons

Compare every level against every other level. The result carries a joint covariance, so simultaneous CIs are available.

from pymargins import all_pairwise

scen, W = all_pairwise("region", ["N", "S", "E", "W"])
res = m.contrasts(scenarios=scen, contrasts=W)

# Joint covariance is stored in the result for post-hoc combination
print(res.summary())
=========================================================================
                    Margins Result (delta, level=0.95)                   
=========================================================================
                      estimate  std err        z  P>|z|  [95% Conf. Int.]
-------------------------------------------------------------------------
region=S_vs_region=N    0.9850   0.0381  -0.3969  0.691    0.9142, 1.0613
region=E_vs_region=N    1.0892   0.0341   2.5057  0.012    1.0188, 1.1646
region=W_vs_region=N    0.9357   0.0395  -1.6794  0.093    0.8660, 1.0112
=========================================================================

n = 2000
Note: std err is on the inference scale; estimate and CI are on the reporting scale.
κ: max=0.052
Delta-vs-sim disagreement: 0.411%

Testing a non-zero null

# Test whether the risk ratio exceeds 1.5
scen_test, w_test = pairwise("treated", [1, 0])
print(m.log_scale(fit, at="overall").contrasts(scenarios=scen_test, contrasts=w_test).test(value=1.5).summary())
Hypothesis test (wald)
  H0: estimand = 1.5
  Alternative: two-sided
  Statistic: -8.0238
  P-value:   1.11e-15

The null value is interpreted on the reporting scale and lifted onto the inference scale via phi_inv automatically.

Predictions over a grid

Evaluate predictions at several values of a moderator:

res = m.predict(atexog={"age": [25, 45, 65], "treated": [0, 1]})
print(res.summary())
=======================================================================
                   Margins Result (delta, level=0.95)                  
=======================================================================
                   estimate  std err         z  P>|z|  [95% Conf. Int.]
-----------------------------------------------------------------------
age=25, treated=0    0.4721   0.0462  -16.2417  0.000    0.4312, 0.5168
age=25, treated=1    0.6728   0.0352  -11.2429  0.000    0.6279, 0.7209
age=45, treated=0    0.6590   0.0208  -20.0132  0.000    0.6326, 0.6865
age=45, treated=1    0.8178   0.0172  -11.6708  0.000    0.7907, 0.8459
age=65, treated=0    0.8083   0.0181  -11.7357  0.000    0.7800, 0.8375
age=65, treated=1    0.9085   0.0115   -8.3763  0.000    0.8883, 0.9291
=======================================================================

n = 2000
Note: std err is on the inference scale; estimate and CI are on the reporting scale.
κ: max=0.112
Delta-vs-sim disagreement: 0.621%

For a contrast at a specific moderator value, build the scenarios explicitly:

scen, w = pairwise("treated", [1, 0])
res = m.contrasts(
    scenarios=[
        {"atexog": {"treated": 1, "age": 45}, "label": "treated@45"},
        {"atexog": {"treated": 0, "age": 45}, "label": "control@45"},
    ],
    contrasts=w,
)
print(res.summary())
==============================================================
              Margins Result (delta, level=0.95)              
==============================================================
            estimate  std err       z  P>|z|  [95% Conf. Int.]
--------------------------------------------------------------
treated@45    1.2411   0.0270  8.0112  0.000    1.1772, 1.3084
==============================================================

n = 2000
Note: std err is on the inference scale; estimate and CI are on the reporting scale.
κ: 0.048
Delta-vs-sim disagreement: 0.405%

2×2 Difference-in-differences

DiD is a contrast across four cells with weights [+1, −1, −1, +1]. See Difference-in-differences on the response scale for the full derivation and response-scale motivation (Ai & Norton, 2003).

from pymargins import did

scen, w = did("group", "preexist",
              treated_level="B", control_level="A",
              post_level=1, pre_level=0)
print(m.contrasts(scenarios=scen, contrasts=w).summary())
========================================================
           Margins Result (delta, level=0.95)           
========================================================
     estimate  std err        z  P>|z|  [95% Conf. Int.]
--------------------------------------------------------
did    0.9542   0.0125  -3.7399  0.000    0.9311, 0.9780
========================================================

n = 2000
Note: std err is on the inference scale; estimate and CI are on the reporting scale.
κ: max=0.140
Delta-vs-sim disagreement: 0.276%

See also