Difference-in-differences on the response scale
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf
from pymargins import Margins

# Simulate a binary outcome with a group x preexist interaction on the logit scale.
rng = np.random.default_rng(42)
n = 4000
df = pd.DataFrame({
    "group": rng.choice(["A", "B"], n),
    "preexist": rng.binomial(1, 0.4, n),
    "age": rng.integers(20, 80, n),
})
# Linear predictor (log-odds) used to draw the outcome.
lp = (-1.5 + 0.6 * (df["group"] == "B") + 0.4 * df["preexist"]
      + 0.5 * ((df["group"] == "B") & (df["preexist"] == 1))
      + 0.02 * df["age"])
df["condX"] = rng.binomial(1, 1 / (1 + np.exp(-lp)))

# Logistic regression with the full group x preexist interaction, adjusted for age.
fit = smf.glm("condX ~ C(group) * C(preexist) + age", data=df,
              family=sm.families.Binomial()).fit()
m = Margins.linear_scale(fit, at="overall")
For a 2×2 DiD on a nonlinear model, evaluate the four cells on the response scale and difference them. The interaction coefficient itself is on the link scale, so it does not in general equal the difference-in-differences in predicted probabilities, and it is not the answer to the question (Ai & Norton, 2003).
from pymargins import Margins, did

m = Margins.linear_scale(fit, vcov="HC3", at="overall")
# did() returns the cell scenarios and the matching contrast weights.
scen, w = did(
    "group", "preexist",
    treated_level="B", control_level="A",
    post_level=1, pre_level=0,
)
print(m.contrasts(scenarios=scen, contrasts=w).summary())
=======================================================
Margins Result (delta, level=0.95)
=======================================================
     estimate  std err       z  P>|z|  [95% Conf. Int.]
-------------------------------------------------------
did    0.0998   0.0308  3.2382  0.001    0.0394, 0.1601
=======================================================
n = 4000
κ: max=0.018
Delta-vs-sim disagreement: 1.713%
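To see why the link-scale interaction is not the quantity of interest, the four cells can be evaluated by hand. The sketch below is only an illustration: it plugs in the data-generating coefficients from the simulation above rather than the fitted estimates, and the helper names (`prob`, `did_response`, `did_link`) are hypothetical, not part of pymargins.

```python
import math

# Data-generating coefficients from the simulation above (not fitted estimates).
B0, B_GROUP, B_PRE, B_INTER, B_AGE = -1.5, 0.6, 0.4, 0.5, 0.02


def linpred(group_b, preexist, age):
    """Log-odds for one cell; group_b and preexist are 0/1 indicators."""
    return (B0 + B_GROUP * group_b + B_PRE * preexist
            + B_INTER * group_b * preexist + B_AGE * age)


def prob(group_b, preexist, age):
    """Predicted probability (response scale) for one cell."""
    return 1.0 / (1.0 + math.exp(-linpred(group_b, preexist, age)))


def did_response(age):
    """(B,1 - B,0) - (A,1 - A,0) in predicted probabilities."""
    return ((prob(1, 1, age) - prob(1, 0, age))
            - (prob(0, 1, age) - prob(0, 0, age)))


def did_link(age=60):
    """The same double difference on the log-odds scale."""
    return ((linpred(1, 1, age) - linpred(1, 0, age))
            - (linpred(0, 1, age) - linpred(0, 0, age)))


for a in (20, 40, 60, 80):
    print(f"age={a}: link-scale DiD={did_link(a):.3f}, "
          f"response-scale DiD={did_response(a):.4f}")
```

On the link scale the double difference is the interaction coefficient at every age, while the response-scale value changes with age because the logistic curve is nonlinear. That dependence on covariates is what averaging over the sample, or fixing a representative profile, resolves.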
At a single representative patient profile (here, age 60):
print(m.contrasts(
    scenarios=did(
        "group", "preexist",
        treated_level="B", control_level="A",
        post_level=1, pre_level=0,
        age=60,
    )[0],
    contrasts=[+1, -1, -1, +1],
).summary())
=======================================================================
Margins Result (delta, level=0.95)
=======================================================================
                     estimate  std err       z  P>|z|  [95% Conf. Int.]
-----------------------------------------------------------------------
group=B, preexist=1    0.0880   0.0308  2.8539  0.004    0.0276, 0.1484
=======================================================================
n = 4000
κ: 0.022
Delta-vs-sim disagreement: 3.382%
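The four cell predictions and the two simple effects are all functions of the same fitted coefficients, so they share one joint covariance matrix. The DiD’s delta-method standard error is computed from that full matrix and therefore accounts for the covariances between cells. It is still a first-order approximation rather than an exact result.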
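The delta-method calculation can be sketched by hand. This is a minimal illustration under stated assumptions, not pymargins' implementation: it uses only NumPy, refits the logistic model with a small Newton-Raphson loop, and uses the model-based covariance rather than HC3. The data are regenerated from the same design but not the same random stream, so the numbers will not match the tables above. With the statsmodels fit, `fit.params` and `fit.cov_params()` would play the roles of `beta` and `V`.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 4000
group_b = rng.integers(0, 2, n).astype(float)      # 1 = group B
preexist = rng.binomial(1, 0.4, n).astype(float)
age = rng.integers(20, 80, n).astype(float)


def design(g, p, a):
    """Columns: intercept, group B, preexist, interaction, age."""
    return np.column_stack([np.ones_like(g), g, p, g * p, a])


X = design(group_b, preexist, age)
true_beta = np.array([-1.5, 0.6, 0.4, 0.5, 0.02])
y = rng.binomial(1, 1 / (1 + np.exp(-X @ true_beta)))

# Newton-Raphson for the logistic MLE.
beta = np.zeros(X.shape[1])
for _ in range(100):
    mu = 1 / (1 + np.exp(-X @ beta))
    hess = X.T @ (X * (mu * (1 - mu))[:, None])
    step = np.linalg.solve(hess, X.T @ (y - mu))
    beta += step
    if np.max(np.abs(step)) < 1e-10:
        break

# Model-based covariance of beta, evaluated at the converged estimate.
mu = 1 / (1 + np.exp(-X @ beta))
V = np.linalg.inv(X.T @ (X * (mu * (1 - mu))[:, None]))

# Four cells at age 60, ordered (B,1), (B,0), (A,1), (A,0).
cells = design(np.array([1.0, 1.0, 0.0, 0.0]),
               np.array([1.0, 0.0, 1.0, 0.0]),
               np.full(4, 60.0))
w = np.array([1.0, -1.0, -1.0, 1.0])

p = 1 / (1 + np.exp(-cells @ beta))      # response-scale predictions
J = (p * (1 - p))[:, None] * cells       # Jacobian d p / d beta, shape (4, 5)
cov_cells = J @ V @ J.T                  # joint covariance of the four cells

did_hat = w @ p
se_joint = np.sqrt(w @ cov_cells @ w)                      # full covariance
se_naive = np.sqrt(np.sum(w**2 * np.diag(cov_cells)))      # ignores covariances

print(f"DiD = {did_hat:.4f}")
print(f"SE using joint covariance = {se_joint:.4f}")
print(f"SE treating cells as independent = {se_naive:.4f}")
```

The difference between the two standard errors comes entirely from the off-diagonal entries of `cov_cells`, which is the information lost when cell-wise standard errors are combined separately.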