Bootstrap inference

import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf
from pymargins import Margins

rng = np.random.default_rng(42)
n = 2000
df = pd.DataFrame({
    "age": rng.integers(20, 75, n),
    "female": rng.binomial(1, 0.52, n),
    "treated": rng.binomial(1, 0.40, n),
})
lp = -1.5 + 0.04 * df["age"] - 0.3 * df["female"] + 0.8 * df["treated"]
df["y"] = rng.binomial(1, 1 / (1 + np.exp(-lp)))

fit = smf.glm("y ~ age + female + treated", data=df,
              family=sm.families.Binomial()).fit()
m = Margins.log_scale(fit, at="overall")

Switch the session’s inference method to "bootstrap" and pick the number of replicates. The default scheme is pairs — rows resampled IID with replacement.

m = Margins.log_scale(fit, method="bootstrap", n_boot=2000, vcov="HC3")
print(m.dydx("age").summary())
==========================================================
          Margins Result (bootstrap, level=0.95)          
==========================================================
     estimate  std err  statistic  P>|z|  [95% Conf. Int.]
----------------------------------------------------------
age    0.0088   0.0640    -4.7302  0.000    0.0077, 0.0098
==========================================================

n = 2000
Note: std err is on the inference scale; estimate and CI are on the reporting scale.
κ: 0.117

Parallelism uses thread pools; BLAS threads are pinned to 1 per worker to avoid oversubscription:

m = Margins.log_scale(fit, method="bootstrap", n_boot=2000, n_jobs=-1)

Point estimates under bootstrap

The point estimate stays the analytic g(β̂)pymargins does not report the bootstrap mean as the estimate, matching Stata’s convention. The bootstrap is used only for the standard error and the empirical quantiles of the CI. This keeps the estimator consistent even when the bootstrap distribution is biased (e.g. in small samples).

Failed refits

Failed refits are caught and counted; a RuntimeWarning fires when the failure rate exceeds 5%. If you see this warning, inspect the model specification — non-convergence on 5% of bootstrap samples usually indicates separation, perfect multicollinearity, or a misspecified link function.

For correlated data use Cluster and block bootstrap.