`ddml` is an implementation of double/debiased machine learning estimators as proposed by Chernozhukov et al. (2018). The key feature of `ddml` is the straightforward estimation of nuisance parameters using (short-)stacking (Wolpert, 1992), which allows for combining multiple machine learners to increase robustness to the underlying data generating process. See also Ahrens et al. (2024b) for a detailed illustration of the practical benefits of combining DDML with (short-)stacking.
`ddml` is the sister R package to our Stata package, mirroring its key features while also leveraging R to simplify estimation with user-provided machine learners and/or sparse matrices. See also Ahrens et al. (2024a) for additional discussion of the supported causal models and the benefits of (short-)stacking.
Install the latest development version from GitHub (requires the devtools package):

```r
if (!require("devtools")) {
  install.packages("devtools")
}
devtools::install_github("thomaswiemann/ddml", dependencies = TRUE)
```
Install the latest public release from CRAN:

```r
install.packages("ddml")
```
To illustrate `ddml` with a simple example, consider the included random subsample of 5,000 observations from the data of Angrist & Evans (1998). The data contains information on the labor supply of mothers, their children, as well as demographic characteristics. See `?AE98` for details.
```r
# Load ddml and set seed
library(ddml)
set.seed(75523)

# Construct variables from the included Angrist & Evans (1998) data
y = AE98[, "worked"]
D = AE98[, "morekids"]
Z = AE98[, "samesex"]
X = AE98[, c("age", "agefst", "black", "hisp", "othrace", "educ")]
```
`ddml_late` estimates the local average treatment effect (LATE) using double/debiased machine learning (see `?ddml_late`). Since the statistical properties of machine learners depend heavily on the underlying (unknown!) structure of the data, adaptively combining multiple machine learners can increase robustness. In the snippet below, `ddml_late` estimates the LATE with short-stacking based on three base learners:

- linear regression (see `?ols`)
- lasso via glmnet (see `?mdl_glmnet`)
- gradient boosting via xgboost (see `?mdl_xgboost`)
```r
# Estimate the local average treatment effect using short-stacking with base
# learners ols, glmnet, and xgboost.
late_fit_short <- ddml_late(y, D, Z, X,
                            learners = list(list(fun = ols),
                                            list(fun = mdl_glmnet),
                                            list(fun = mdl_xgboost,
                                                 args = list(nrounds = 100,
                                                             max_depth = 1))),
                            ensemble_type = 'nnls1',
                            shortstack = TRUE,
                            sample_folds = 10,
                            silent = TRUE)
summary(late_fit_short)
#> LATE estimation results:
#>
#>       Estimate Std. Error t value Pr(>|t|)
#> nnls1   -0.221      0.187   -1.18    0.236
```
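Short-stacking fits the ensemble weights once, using the cross-fitted predictions of the base learners, while conventional stacking recomputes the weights within each sample fold by cross-validation and is therefore more computationally demanding (see `vignette("stacking")`). As a minimal sketch, assuming that `shortstack = FALSE` selects conventional stacking as in the package defaults, the comparison fit is:

```r
# A minimal sketch contrasting short-stacking with conventional stacking:
# same learners and ensemble type as above, but with shortstack = FALSE the
# stacking weights are re-estimated within each cross-fitting fold.
late_fit_stack <- ddml_late(y, D, Z, X,
                            learners = list(list(fun = ols),
                                            list(fun = mdl_glmnet),
                                            list(fun = mdl_xgboost,
                                                 args = list(nrounds = 100,
                                                             max_depth = 1))),
                            ensemble_type = 'nnls1',
                            shortstack = FALSE, # conventional stacking
                            sample_folds = 10,
                            silent = TRUE)
summary(late_fit_stack)
```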
Check out our articles to learn more about `ddml`:
vignette("ddml")
is a more detailed introduction to
ddml
vignette("stacking")
discusses computational benefits
of short-stackingvignette("new_ml_wrapper")
shows how to write
user-provided base learnersvignette("sparse")
illustrates support of sparse
matrices (see ?Matrix
)vignette("did")
discusses integration with the
diff-in-diff package did
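To preview the wrapper convention covered in `vignette("new_ml_wrapper")`, below is a minimal sketch of a user-provided base learner. The ridge wrapper `mdl_ridge` and its class are hypothetical illustrations (not part of `ddml`); the sketch assumes the documented convention that a learner is a function of the outcome `y` and feature matrix `X` returning an object whose `predict` method accepts a matrix of new observations.

```r
# Hypothetical ridge wrapper built on glmnet (not part of ddml); assumes the
# wrapper convention from vignette("new_ml_wrapper"): fit on (y, X), return
# an object whose predict method takes a matrix of new observations.
mdl_ridge <- function(y, X, ...) {
  fit <- glmnet::cv.glmnet(X, y, alpha = 0, ...) # alpha = 0 selects ridge
  class(fit) <- c("mdl_ridge", class(fit))
  fit
}

predict.mdl_ridge <- function(object, newdata = NULL, ...) {
  class(object) <- setdiff(class(object), "mdl_ridge")
  as.numeric(predict(object, newx = newdata, s = "lambda.min"))
}
```

Such a wrapper can then be passed alongside the built-in learners, e.g. as `list(fun = mdl_ridge)` in the `learners` argument.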
For additional applied examples, see our case studies:

- `vignette("example_401k")` revisits the effect of 401k participation on retirement savings
- `vignette("example_BLP95")` considers flexible demand estimation with endogenous prices

`ddml` is built to easily (and quickly) estimate common causal parameters with multiple machine learners. With its support for short-stacking, sparse matrices, and easy-to-learn syntax, we hope `ddml` is a useful complement to `DoubleML`, the expansive R and Python package. `DoubleML` supports many advanced features such as multiway clustering and stacking.
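As one further hedged sketch of those capabilities, assuming the `AE98` variables constructed above: `ddml_plm` estimates the partially linear model, and the estimators also accept sparse matrices from the Matrix package (see `vignette("sparse")`; how much a given base learner benefits from sparse input depends on the learner, with `mdl_glmnet` being a natural fit).

```r
# A minimal sketch (assumptions: AE98 variables from above; sparse-matrix
# support as described in vignette("sparse")): estimate the partially
# linear model with X stored as a sparse Matrix object.
library(Matrix)
X_sparse <- Matrix(as.matrix(X), sparse = TRUE)
plm_fit <- ddml_plm(y, D, X_sparse,
                    learners = list(list(fun = ols),
                                    list(fun = mdl_glmnet)),
                    ensemble_type = 'nnls1',
                    shortstack = TRUE,
                    sample_folds = 10,
                    silent = TRUE)
summary(plm_fit)
```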
Ahrens A, Hansen C B, Schaffer M E, Wiemann T (2024a). “ddml: Double/debiased machine learning in Stata.” Stata Journal, 24(1), 3-45.
Ahrens A, Hansen C B, Schaffer M E, Wiemann T (2024b). “Model averaging and double machine learning.” https://arxiv.org/abs/2401.01645
Angrist J, Evans W (1998). “Children and Their Parents’ Labor Supply: Evidence from Exogenous Variation in Family Size.” American Economic Review, 88(3), 450-477.
Chernozhukov V, Chetverikov D, Demirer M, Duflo E, Hansen C B, Newey W, Robins J (2018). “Double/debiased machine learning for treatment and structural parameters.” The Econometrics Journal, 21(1), C1-C68.
Wolpert D H (1992). “Stacked generalization.” Neural Networks, 5(2), 241-259.