Propensity Scores¶
Part of Statistical Methods — MORIE’s statistical-methods reference.
The propensity score \(e(X) = P(T=1 \mid X)\) summarizes confounding information into a single scalar, allowing balancing without direct covariate matching (Rosenbaum & Rubin 1983).
Estimation¶
MORIE estimates propensity scores via logistic regression (default) or random forest, depending on the module configuration.
Logistic regression:
Implemented in morie.causal.compute_propensity_scores() using
sklearn.linear_model.LogisticRegression with max_iter=1000.
Diagnostics¶
After propensity estimation:
Overlap check — histogram of \(\hat{e}(X)\) by treatment group. Extreme values near 0 or 1 indicate potential positivity violations.
Effective Sample Size (ESS) — see Causal Inference.
Covariate balance — standardized mean differences before and after weighting should be \(< 0.1\) for all covariates.
CPADS covariates¶
The default covariate set for the propensity-scores module is drawn from
CPADS_REQUIRED_VARIABLES:
age_groupgenderprovince_regionmental_healthphysical_healthalcohol_past12m
Treatment: cannabis_any_use
Outcome: heavy_drinking_30d or ebac_tot
References¶
Rosenbaum PR, Rubin DB (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70(1):41–55. https://doi.org/10.1093/biomet/70.1.41
Austin PC (2011). An introduction to propensity score methods for reducing the effects of confounding in observational studies. Multivariate Behavioral Research, 46(3):399–424. https://doi.org/10.1080/00273171.2011.568786