Propensity Scores

Part of Statistical Methods — MORIE’s statistical-methods reference.

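The propensity score \(e(X) = P(T=1 \mid X)\) is the probability of receiving treatment given the observed covariates \(X\). It summarizes the observed confounders in a single scalar: conditional on \(e(X)\), the covariate distribution is the same in the treated and control groups, so the groups can be balanced on the score alone rather than matched on every covariate (Rosenbaum & Rubin 1983).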

Estimation

MORIE estimates propensity scores via logistic regression (default) or random forest, depending on the module configuration.

Logistic regression:

\[\log \frac{e(X_i)}{1 - e(X_i)} = \beta_0 + \beta^\top X_i\]

Implemented in morie.causal.compute_propensity_scores() using sklearn.linear_model.LogisticRegression with max_iter=1000.
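As a rough illustration of the logistic-regression path, the sketch below fits sklearn.linear_model.LogisticRegression with max_iter=1000 on synthetic data and reads the estimated scores from predict_proba. The helper name estimate_propensity_scores, the synthetic data, and the use of scikit-learn's default regularization are assumptions made for this example; it is not the body of morie.causal.compute_propensity_scores().

```python
import numpy as np
from sklearn.linear_model import LogisticRegression


def estimate_propensity_scores(X, t):
    """Hypothetical helper: return the estimated P(T=1 | X) for each row of X."""
    model = LogisticRegression(max_iter=1000)
    model.fit(X, t)
    # predict_proba columns follow model.classes_; pick the column for T=1.
    col = list(model.classes_).index(1)
    return model.predict_proba(X)[:, col]


# Synthetic data (illustrative only): treatment depends on the first covariate.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
true_e = 1.0 / (1.0 + np.exp(-(0.5 + 1.2 * X[:, 0])))
t = rng.binomial(1, true_e)

e_hat = estimate_propensity_scores(X, t)
```

Each entry of e_hat is the inverse logit of the fitted linear predictor \(\hat\beta_0 + \hat\beta^\top X_i\), so it always lies strictly between 0 and 1.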

Diagnostics

After propensity estimation:

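Overlap check: histogram of \(\hat{e}(X)\) by treatment group. Scores concentrated near 0 or 1, or ranges where only one group is represented, indicate potential positivity violations.
Effective Sample Size (ESS): see Causal Inference.
Covariate balance: compute the absolute standardized mean difference (SMD) for each covariate before and after weighting. After weighting, the SMD should be \(< 0.1\) for all covariates; the unweighted values serve as the baseline for comparison.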
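The checks above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions: inverse-probability-of-treatment weights (\(1/\hat{e}\) for treated, \(1/(1-\hat{e})\) for control), the Kish formula \((\sum w)^2 / \sum w^2\) for ESS, and an SMD that pools the weighted group variances. MORIE's own definitions may differ (see Causal Inference for ESS). All function names and the 0.05/0.95 cutoffs are hypothetical.

```python
import math


def ipt_weights(e_hat, t):
    """Inverse-probability-of-treatment weights (an assumed weighting scheme)."""
    return [1.0 / e if ti == 1 else 1.0 / (1.0 - e) for e, ti in zip(e_hat, t)]


def effective_sample_size(w):
    """Kish effective sample size: (sum w)^2 / sum w^2."""
    return sum(w) ** 2 / sum(wi * wi for wi in w)


def _wmean(x, w):
    return sum(xi * wi for xi, wi in zip(x, w)) / sum(w)


def _wvar(x, w):
    m = _wmean(x, w)
    return sum(wi * (xi - m) ** 2 for xi, wi in zip(x, w)) / sum(w)


def smd(x, t, w=None):
    """Absolute standardized mean difference of covariate x between groups."""
    if w is None:
        w = [1.0] * len(x)  # unweighted case
    x1, w1 = zip(*[(xi, wi) for xi, ti, wi in zip(x, t, w) if ti == 1])
    x0, w0 = zip(*[(xi, wi) for xi, ti, wi in zip(x, t, w) if ti == 0])
    pooled_sd = math.sqrt((_wvar(x1, w1) + _wvar(x0, w0)) / 2.0)
    return abs(_wmean(x1, w1) - _wmean(x0, w0)) / pooled_sd


def extreme_share(e_hat, lo=0.05, hi=0.95):
    """Share of scores outside [lo, hi]; the cutoffs are illustrative only."""
    return sum(1 for e in e_hat if e < lo or e > hi) / len(e_hat)


# Toy data (made up): one covariate, four control and four treated units.
x = [1.0, 2.0, 3.0, 4.0, 2.0, 3.0, 4.0, 5.0]
t = [0, 0, 0, 0, 1, 1, 1, 1]
e_hat = [0.2, 0.4, 0.6, 0.8, 0.2, 0.4, 0.6, 0.8]

w = ipt_weights(e_hat, t)
smd_before = smd(x, t)
smd_after = smd(x, t, w)
ess = effective_sample_size(w)
```

In practice each covariate (or each dummy column of a categorical covariate) would be passed through smd() in turn, and the largest post-weighting value compared against the 0.1 threshold.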

CPADS covariates

The default covariate set for the propensity-scores module is drawn from CPADS_REQUIRED_VARIABLES:

age_group
gender
province_region
mental_health
physical_health
alcohol_past12m

Treatment: cannabis_any_use
Outcome: heavy_drinking_30d or ebac_tot
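To show how this covariate set might feed a propensity model, the sketch below one-hot encodes the six covariates with pandas.get_dummies and extracts the treatment indicator. It assumes the covariates are stored as categorical labels and the treatment as 0/1. The category labels in the toy rows are invented for illustration and are not the actual CPADS codings.

```python
import pandas as pd

COVARIATES = [
    "age_group",
    "gender",
    "province_region",
    "mental_health",
    "physical_health",
    "alcohol_past12m",
]
TREATMENT = "cannabis_any_use"

# Toy rows with made-up category labels; real CPADS codings may differ.
df = pd.DataFrame(
    {
        "age_group": ["18-19", "20-22", "18-19", "23+"],
        "gender": ["woman", "man", "woman", "man"],
        "province_region": ["west", "east", "west", "east"],
        "mental_health": ["good", "fair", "good", "poor"],
        "physical_health": ["good", "good", "fair", "good"],
        "alcohol_past12m": ["yes", "yes", "no", "yes"],
        "cannabis_any_use": [1, 0, 0, 1],
    }
)

# drop_first=True removes one reference level per covariate so the dummy
# columns are not collinear with the model intercept.
X = pd.get_dummies(df[COVARIATES], columns=COVARIATES, drop_first=True, dtype=float)
t = df[TREATMENT].astype(int)
```

The resulting X and t have the shape expected by a scikit-learn classifier such as LogisticRegression.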

References

Rosenbaum PR, Rubin DB (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70(1):41–55. https://doi.org/10.1093/biomet/70.1.41
Austin PC (2011). An introduction to propensity score methods for reducing the effects of confounding in observational studies. Multivariate Behavioral Research, 46(3):399–424. https://doi.org/10.1080/00273171.2011.568786
