Causal Estimands¶

Part of Statistical Methods — MORIE’s statistical-methods reference.

MORIE supports the causal estimands summarised below. This page defines each estimand and maps it to the corresponding Python function.

Summary¶

ATE — Average Treatment Effect (all units): morie.causal.run_propensity_ipw_analysis().
ATT / ATTE — Average Treatment Effect on the Treated (treated units only): morie.causal.estimate_att().
ATC — Average Treatment Effect on the Controls (control units only): morie.causal.estimate_atc().
GATE — Group Average Treatment Effect (subgroups, e.g. gender): morie.causal.estimate_gate().
CATE — Conditional Average Treatment Effect (each individual unit): morie.causal.estimate_cate().
LATE — Local Average Treatment Effect (compliers, IV context): morie.causal.estimate_late().
PLR-ATE — DML–PLR ATE (all units): morie.effects.estimate_ate().
IRM-ATE — DML–IRM ATE, heterogeneous (all units): morie.causal.estimate_irm().
DR-ATE — AIPW doubly robust ATE (all units): morie.causal.estimate_aipw().

—

Average Treatment Effect (ATE)¶

The ATE averages over the full population:

\[\text{ATE} = \mathbb{E}[Y_i(1) - Y_i(0)]\]

where \(Y_i(1)\) and \(Y_i(0)\) are the potential outcomes under treatment and control for unit \(i\). Identification requires unconfoundedness (\(Y(0), Y(1) \perp T \mid X\)) and overlap (\(0 < P(T=1 \mid X) < 1\)).
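As a rough illustration of how the two assumptions are used, the sketch below computes a Hájek-normalised IPW estimate of the ATE in plain Python. It is not MORIE's implementation: the helper name `ipw_ate`, the toy data, and the assumption that the propensity scores are known rather than estimated are all invented for the example.

```python
def ipw_ate(y, t, e):
    """Hajek-normalised IPW estimate of the ATE (illustrative helper, not a MORIE function)."""
    # Treated units are weighted by 1 / e, control units by 1 / (1 - e).
    w1 = [ti / ei for ti, ei in zip(t, e)]
    w0 = [(1 - ti) / (1 - ei) for ti, ei in zip(t, e)]
    mu1 = sum(w * yi for w, yi in zip(w1, y)) / sum(w1)
    mu0 = sum(w * yi for w, yi in zip(w0, y)) / sum(w0)
    return mu1 - mu0


# Toy data (assumed): two covariate strata with known propensity 0.25 and 0.75.
# The true effect is 2 in the first stratum and 4 in the second, so the ATE is 3.
t = [1, 0, 0, 0, 1, 1, 1, 0]
e = [0.25] * 4 + [0.75] * 4
y = [3, 1, 1, 1, 7, 7, 7, 3]

n_treated = sum(t)
naive = (sum(yi for yi, ti in zip(y, t) if ti == 1) / n_treated
         - sum(yi for yi, ti in zip(y, t) if ti == 0) / (len(t) - n_treated))
ate = ipw_ate(y, t, e)
```

On this toy sample the unweighted difference in means is 4.5, while the weighted estimate recovers 3 up to floating-point error, because treated units are over-represented in the high-effect stratum.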

—

Average Treatment Effect on the Treated (ATT)¶

The ATT conditions on actually treated units:

\[\text{ATT} = \mathbb{E}[Y_i(1) - Y_i(0) \mid T_i = 1]\]

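Under unconfoundedness and overlap, the ATT can be estimated by Hájek-normalised IPW, where \(\hat{e}(X_i)\) is the estimated propensity score \(P(T_i = 1 \mid X_i)\). Treated units receive weight 1 and control units receive the odds weight \(\hat{e}(X_i) / (1 - \hat{e}(X_i))\):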

\[\widehat{\text{ATT}} = \frac{\sum_{T_i=1} Y_i}{n_1} - \frac{\sum_{T_i=0} w_i Y_i}{\sum_{T_i=0} w_i}, \quad w_i = \frac{\hat{e}(X_i)}{1 - \hat{e}(X_i)}\]
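The estimator above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the code behind `morie.causal.estimate_att()`: the helper name `hajek_att` and the toy data are hypothetical, and the propensity scores are taken as given.

```python
def hajek_att(y, t, e):
    """Treated mean minus the odds-weighted control mean (illustrative helper)."""
    y_treated = [yi for yi, ti in zip(y, t) if ti == 1]
    y_control = [yi for yi, ti in zip(y, t) if ti == 0]
    # Each control unit is weighted by the odds e / (1 - e) of being treated.
    w = [ei / (1 - ei) for ti, ei in zip(t, e) if ti == 0]
    weighted_control_mean = sum(wi * yi for wi, yi in zip(w, y_control)) / sum(w)
    return sum(y_treated) / len(y_treated) - weighted_control_mean


# Toy data (assumed): effect 2 in the low-propensity stratum, 4 in the other.
# Three of the four treated units sit in the second stratum, so the ATT is 3.5.
t = [1, 0, 0, 0, 1, 1, 1, 0]
e = [0.25] * 4 + [0.75] * 4
y = [3, 1, 1, 1, 7, 7, 7, 3]

att = hajek_att(y, t, e)
```

The odds weights reshape the control group so that its covariate distribution matches that of the treated group, which is why only the control term is weighted.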

The ATT is the relevant estimand when the treated group is the primary policy target (e.g. cannabis users in CPADS).

Python entry point: morie.causal.estimate_att()

—

Average Treatment Effect on the Controls (ATC)¶

The ATC conditions on the control population:

\[\text{ATC} = \mathbb{E}[Y_i(1) - Y_i(0) \mid T_i = 0]\]

Estimation mirrors the ATT with the weights reversed: control units receive weight 1, and treated units receive weight \((1 - \hat{e}(X_i)) / \hat{e}(X_i)\).
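Written out by analogy with the ATT estimator, the reversed weighting gives

\[\widehat{\text{ATC}} = \frac{\sum_{T_i=1} w_i Y_i}{\sum_{T_i=1} w_i} - \frac{\sum_{T_i=0} Y_i}{n_0}, \quad w_i = \frac{1 - \hat{e}(X_i)}{\hat{e}(X_i)}\]

A minimal Python sketch of this expression follows. The helper name `hajek_atc` and the toy data are assumptions made for illustration and do not reflect how `morie.causal.estimate_atc()` is implemented.

```python
def hajek_atc(y, t, e):
    """Inverse-odds-weighted treated mean minus the control mean (illustrative helper)."""
    y_treated = [yi for yi, ti in zip(y, t) if ti == 1]
    y_control = [yi for yi, ti in zip(y, t) if ti == 0]
    # Each treated unit is weighted by the inverse odds (1 - e) / e.
    w = [(1 - ei) / ei for ti, ei in zip(t, e) if ti == 1]
    weighted_treated_mean = sum(wi * yi for wi, yi in zip(w, y_treated)) / sum(w)
    return weighted_treated_mean - sum(y_control) / len(y_control)


# Toy data (assumed): effect 2 in the low-propensity stratum, 4 in the other.
# Three of the four controls sit in the first stratum, so the ATC is 2.5.
t = [1, 0, 0, 0, 1, 1, 1, 0]
e = [0.25] * 4 + [0.75] * 4
y = [3, 1, 1, 1, 7, 7, 7, 3]

atc = hajek_atc(y, t, e)
```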

Python entry point: morie.causal.estimate_atc()

—

Group Average Treatment Effect (GATE)¶

The GATE generalises the ATE to subpopulations defined by a discrete group variable \(G \in \{g_1, \ldots, g_K\}\):

\[\text{GATE}_k = \mathbb{E}[Y_i(1) - Y_i(0) \mid G_i = g_k]\]

MORIE computes the GATE by averaging the AIPW doubly-robust influence function within each stratum of \(G\). The result is a DataFrame with one row per group, reporting the AIPW estimate, standard error, 95% confidence interval, and sample size.
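The per-stratum averaging can be sketched as follows. This is a simplified illustration, not MORIE's code: the helper names, the toy inputs, and the normal-approximation interval are assumptions, the nuisance estimates are taken as given, and a list of dicts stands in for the DataFrame to keep the example dependency-free.

```python
from math import sqrt
from statistics import mean, stdev


def aipw_scores(y, t, e, mu1, mu0):
    """Per-unit AIPW (doubly-robust) scores from given nuisance estimates."""
    return [m1 - m0 + ti * (yi - m1) / ei - (1 - ti) * (yi - m0) / (1 - ei)
            for yi, ti, ei, m1, m0 in zip(y, t, e, mu1, mu0)]


def gate_by_group(scores, groups, crit=1.96):
    """Average the scores within each group; one result row per group."""
    rows = []
    for g in sorted(set(groups)):
        psi = [s for s, gi in zip(scores, groups) if gi == g]
        est = mean(psi)
        se = stdev(psi) / sqrt(len(psi))  # needs at least two units per group
        rows.append({"group": g, "estimate": est, "se": se,
                     "ci_low": est - crit * se, "ci_high": est + crit * se,
                     "n": len(psi)})
    return rows


# Toy inputs (assumed); in practice e, mu1 and mu0 come from fitted models.
y = [5, 3, 6, 3]
t = [1, 0, 1, 0]
e = [0.5, 0.5, 0.5, 0.5]
mu1 = [4, 4, 6, 6]
mu0 = [2, 2, 3, 3]
groups = ["a", "a", "b", "b"]

table = gate_by_group(aipw_scores(y, t, e, mu1, mu0), groups)
```

Because each group's estimate is a plain mean of per-unit scores, its standard error is the within-group standard deviation of those scores divided by the square root of the group size.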

GATE is useful for examining heterogeneity by gender, age group, province, or any other subgroup of substantive interest.

Python entry point: morie.causal.estimate_gate()

—

Conditional Average Treatment Effect (CATE)¶

The CATE is the expected treatment effect as a function of the covariates \(X\); evaluating it at each unit's covariates \(X_i\) yields a per-unit effect estimate:

\[\tau(x) = \mathbb{E}[Y_i(1) - Y_i(0) \mid X_i = x]\]

MORIE implements the T-learner (two separate outcome models):

Fit \(\hat{\mu}_1(X)\) on treated units \(\{i : T_i = 1\}\).
Fit \(\hat{\mu}_0(X)\) on control units \(\{i : T_i = 0\}\).
\(\widehat{\text{CATE}}_i = \hat{\mu}_1(X_i) - \hat{\mu}_0(X_i)\).
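The three T-learner steps can be sketched as below. To stay dependency-free, the example swaps the random forest for a one-covariate least-squares line; the helpers `fit_line` and `t_learner` and the toy data are hypothetical and are not part of MORIE.

```python
def fit_line(x, y):
    """Least-squares fit of y = a + b * x; returns a prediction function."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
    a = y_bar - b * x_bar
    return lambda x_new: a + b * x_new


def t_learner(x, t, y):
    """Fit separate outcome models per arm, then difference their predictions."""
    mu1 = fit_line([xi for xi, ti in zip(x, t) if ti == 1],
                   [yi for yi, ti in zip(y, t) if ti == 1])
    mu0 = fit_line([xi for xi, ti in zip(x, t) if ti == 0],
                   [yi for yi, ti in zip(y, t) if ti == 0])
    return [mu1(xi) - mu0(xi) for xi in x]


# Toy data (assumed): Y(1) = 1 + 2x and Y(0) = x, so the true CATE is 1 + x.
x = [0, 1, 2, 0, 1, 2]
t = [1, 1, 1, 0, 0, 0]
y = [1, 3, 5, 0, 1, 2]

cate = t_learner(x, t, y)
```

Each unit receives a prediction from both models, including the model trained on the arm it was not observed in, which is where the overlap assumption matters.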

MORIE also implements the S-learner (a single outcome model with treatment as a feature):

Fit \(\hat{\mu}(T, X)\) on the full sample.
\(\widehat{\text{CATE}}_i = \hat{\mu}(1, X_i) - \hat{\mu}(0, X_i)\).
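A minimal sketch of the S-learner steps follows. In place of a random forest it uses a saturated cell-mean model over a discrete covariate, which is an assumption made to keep the example free of dependencies; the helper names and toy data are likewise hypothetical.

```python
from collections import defaultdict


def fit_cell_means(t, x, y):
    """Single model over (t, x): the mean outcome observed in each cell."""
    cells = defaultdict(list)
    for ti, xi, yi in zip(t, x, y):
        cells[(ti, xi)].append(yi)
    means = {cell: sum(vals) / len(vals) for cell, vals in cells.items()}
    # A KeyError here means a (t, x) cell is empty, i.e. overlap fails.
    return lambda t_new, x_new: means[(t_new, x_new)]


def s_learner(t, x, y):
    """Fit one model on the full sample, then toggle the treatment feature."""
    mu = fit_cell_means(t, x, y)
    return [mu(1, xi) - mu(0, xi) for xi in x]


# Toy data (assumed): binary covariate, effect 2 when x = 0 and 4 when x = 1.
t = [1, 0, 0, 1, 1, 0]
x = [0, 0, 0, 1, 1, 1]
y = [3.0, 0.5, 1.5, 6.5, 7.5, 3.0]

cate = s_learner(t, x, y)
```

With a flexible base learner the model can pick up treatment-covariate interactions on its own; a purely additive model in \(T\) and \(X\) would instead return the same effect for every unit.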

Both learners use sklearn.ensemble.RandomForestRegressor by default.

Python entry point: morie.causal.estimate_cate()

—

Local Average Treatment Effect (LATE)¶

When the treatment assignment \(T_i\) is endogenous (e.g. influenced by unobserved factors), a valid binary instrument \(Z_i\) that satisfies:

Relevance: \(\text{Cov}(T_i, Z_i) \neq 0\)
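Exclusion: \(Z_i\) affects \(Y_i\) only through \(T_i\), so that \(Z_i \perp \varepsilon_i \mid X_i\)
Monotonicity: \(T_i(1) \geq T_i(0)\) for all units (no defiers), where \(T_i(z)\) is the treatment unit \(i\) would take when \(Z_i = z\)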

identifies the LATE (also called the Complier Average Causal Effect):

\[\text{LATE} = \frac{\mathbb{E}[Y_i \mid Z_i=1] - \mathbb{E}[Y_i \mid Z_i=0]} {\mathbb{E}[T_i \mid Z_i=1] - \mathbb{E}[T_i \mid Z_i=0]}\]

This is the average treatment effect among compliers: units who take up treatment when \(Z=1\) and do not when \(Z=0\).
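Replacing the expectations in the ratio with sample means gives the Wald estimator, sketched below for a binary instrument without covariates. The helper name `wald_late` and the toy data are assumptions for illustration and are not taken from MORIE.

```python
def wald_late(y, t, z):
    """Ratio of the instrument's effect on Y to its effect on T."""
    def mean_where(values, flag):
        picked = [v for v, zi in zip(values, z) if zi == flag]
        return sum(picked) / len(picked)

    reduced_form = mean_where(y, 1) - mean_where(y, 0)  # effect of Z on Y
    first_stage = mean_where(t, 1) - mean_where(t, 0)   # effect of Z on T
    if first_stage == 0:
        raise ValueError("instrument has no effect on treatment in this sample")
    return reduced_form / first_stage


# Toy data (assumed): Y = 1 + 2T; the instrument raises take-up from 25% to 75%.
z = [1, 1, 1, 1, 0, 0, 0, 0]
t = [1, 1, 1, 0, 1, 0, 0, 0]
y = [3, 3, 3, 1, 3, 1, 1, 1]

late = wald_late(y, t, z)
```

Here the instrument shifts the mean outcome by 1.0 and take-up by 0.5, so the estimate is 2.0. A first stage close to zero makes the ratio unstable, which is the weak-instrument problem.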

With covariates, MORIE uses 2SLS (two-stage least squares):

Stage 1: \(\hat{T}_i = \hat{\gamma}_0 + \hat{\gamma}_1 Z_i + \hat{\gamma}_2 X_i\)
Stage 2: \(Y_i = \alpha + \theta \hat{T}_i + \beta X_i + \varepsilon_i\), where \(\hat{\theta}\) is the LATE estimate
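The two stages can be sketched in plain Python as follows. This is an illustration only, not the code behind `morie.causal.estimate_late()`: the helpers `ols` and `two_stage_ls` are hypothetical, both stages are assumed to include an intercept, and the toy data are constructed so that the unobserved term `u` is correlated with treatment but exactly uncorrelated with the instrument and the covariate.

```python
def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(a[idx][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(k):
            if row != col:
                f = a[row][col] / a[col][col]
                a[row] = [v - f * p for v, p in zip(a[row], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


def two_stage_ls(y, t, z, x):
    """Return the stage-2 coefficient on the fitted treatment."""
    stage1 = [[1.0, zi, xi] for zi, xi in zip(z, x)]
    gamma = ols(stage1, t)
    t_hat = [sum(g * v for g, v in zip(gamma, row)) for row in stage1]
    stage2 = [[1.0, th, xi] for th, xi in zip(t_hat, x)]
    return ols(stage2, y)[1]


# Toy data (assumed): Y = 1 + 2T + 0.5X + u, with u driving take-up.
z = [0, 0, 0, 0, 1, 1, 1, 1]
x = [0, 1, 0, 1, 0, 1, 0, 1]
u = [1, 1, -1, -1, 1, 1, -1, -1]
t = [1, 1, 0, 0, 1, 1, 1, 0]
y = [1 + 2 * ti + 0.5 * xi + ui for ti, xi, ui in zip(t, x, u)]

theta = two_stage_ls(y, t, z, x)
naive = ols([[1.0, ti, xi] for ti, xi in zip(t, x)], y)[1]
```

On this sample the two-stage estimate recovers the coefficient of 2 up to floating-point error, while the naive regression of \(Y\) on \(T\) and \(X\) overstates it. Standard errors read off a hand-run second stage are not valid, because they ignore that \(\hat{T}_i\) is itself estimated; a dedicated IV routine corrects for this.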

Python entry point: morie.causal.estimate_late()

—

Doubly-Robust AIPW (ATE)¶

See Causal Inference for the full AIPW derivation and formula.

