Commit f9d1e32

Merge branch 'master' of github.com:DoubleML/doubleml-for-py into m-prepare-release-0.1.0
2 parents 1803879 + a8dbb38 commit f9d1e32

12 files changed, +39 -67 lines changed

.github/workflows/deploy_docu.yml

Lines changed: 35 additions & 9 deletions
@@ -15,13 +15,13 @@ on:
 jobs:
   build:

-    runs-on: ubuntu-latest
+    runs-on: ubuntu-20.04

     steps:
-    - uses: actions/checkout@v2
-      with:
-        persist-credentials: false
-    - name: Install SSH Client
+    - name: Check out the repo containing the Python package
+      uses: actions/checkout@v2
+
+    - name: Install SSH Client for deploying the docu to github pages
       uses: webfactory/ssh-agent@v0.4.1
       with:
         ssh-private-key: ${{ secrets.DEPLOY_KEY }}
@@ -37,14 +37,40 @@ jobs:
       run: |
         python -m pip install --upgrade pip
         pip install -r requirements-dev.txt
-        pip install .
+        pip install -e .
+
+    - name: Add R repository
+      run: |
+        sudo apt install dirmngr gnupg apt-transport-https ca-certificates software-properties-common
+        sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9
+        sudo add-apt-repository 'deb https://cloud.r-project.org/bin/linux/ubuntu focal-cran40/'
     - name: Install R
-      uses: r-lib/actions/setup-r@v1
+      run: |
+        sudo apt-get update
+        sudo apt-get install r-base
+        sudo apt-get install r-base-dev
+
+    - name: Get user library folder
+      run: |
+        mkdir ${GITHUB_WORKSPACE}/tmp_r_libs_user
+        echo R_LIBS_USER=${GITHUB_WORKSPACE}/tmp_r_libs_user >> $GITHUB_ENV
+
+    - name: Query R version
+      run: |
+        writeLines(sprintf("R-%i.%i", getRversion()$major, getRversion()$minor), ".github/R-version")
+      shell: Rscript {0}
+
+    - name: Cache R packages
+      uses: actions/cache@v2
       with:
-        r-version: 'release'
-    - name: Install R kernel for Jupyter
+        path: ${{ env.R_LIBS_USER }}
+        key: doubleml-user-guide-${{ hashFiles('.github/R-version') }}
+
+    - name: Install R kernel for Jupyter and the R package DoubleML
       run: |
         install.packages('remotes')
+        remotes::install_github("DoubleML/doubleml-for-r")
+        install.packages(c("knitr", "rmarkdown", "testthat", "patrick", "mvtnorm", "dplyr", "glmnet", "lgr", "ggplot2", "ranger", "hdm", "sandwich", "AER", "rpart"))
         install.packages('IRkernel')
         IRkernel::installspec()
       shell: Rscript {0}

README.md

Lines changed: 1 addition & 1 deletion
@@ -48,7 +48,7 @@ It further can be readily extended with regards to
 - ... alternative resampling schemes,
 - ...

-![OOP structure of the DoubleML package](/doc/oop.svg?raw=true)
+![An overview of the OOP structure of the DoubleML package is given in the graphic available at https://github.com/DoubleML/doubleml-for-py/blob/master/doc/oop.svg](/doc/oop.svg?raw=true)

 ## Installation

doc/guide/algorithms.rst

Lines changed: 0 additions & 4 deletions
@@ -87,7 +87,6 @@ The DML algorithm can be selected via parameter ``dml_procedure='dml1'`` vs. ``d
 .. tabbed:: R

     .. jupyter-execute::
-        :raises:

         library(DoubleML)
         library(mlr3)
@@ -117,7 +116,6 @@ stores the estimate :math:`\tilde{\theta}_0` in its ``coef`` attribute.
 .. tabbed:: R

     .. jupyter-execute::
-        :raises:

         dml_plr_obj$coef

@@ -135,7 +133,6 @@ are stored in the attribute ``psi``.
 .. tabbed:: R

     .. jupyter-execute::
-        :raises:

         dml_plr_obj$psi[1:5, ,1]

@@ -152,7 +149,6 @@ For the DML1 algorithm, the estimates for the different folds
 .. tabbed:: R

     .. jupyter-execute::
-        :raises:

         dml_plr_obj$all_dml1_coef

doc/guide/basics.rst

Lines changed: 0 additions & 7 deletions
@@ -60,7 +60,6 @@ The nuisance functions are given by
 .. tabbed:: R

     .. jupyter-execute::
-        :raises:

         library(DoubleML)
         set.seed(1234)
@@ -126,7 +125,6 @@ efficient.
 .. tabbed:: R

     .. jupyter-execute::
-        :raises:

         library(ggplot2)

@@ -219,7 +217,6 @@ other half of observations indexed with :math:`i \in I`
 .. tabbed:: R

     .. jupyter-execute::
-        :raises:

         non_orth_score = function(y, d, g_hat, m_hat, smpls) {
             u_hat = y - g_hat
@@ -231,7 +228,6 @@ other half of observations indexed with :math:`i \in I`


     .. jupyter-execute::
-        :raises:

         library(mlr3)
         library(mlr3learners)
@@ -334,7 +330,6 @@ orthogonalized regressor :math:`V = D - m(X)`. We then use the final estimate
 .. tabbed:: R

     .. jupyter-execute::
-        :raises:

         library(data.table)
         lgr::get_logger("mlr3")$set_threshold("warn")
@@ -412,7 +407,6 @@ induced by overfitting. Cross-fitting performs well empirically.
 .. tabbed:: R

     .. jupyter-execute::
-        :raises:

         set.seed(3333)

@@ -489,7 +483,6 @@ The third term :math:`c^*` vanishes in probability if sample splitting is applie
 .. tabbed:: R

     .. jupyter-execute::
-        :raises:

         g_all = ggplot(data.frame(theta_ols, theta_nonorth, theta_orth_nosplit, theta_dml)) +
                     geom_density(aes(x = theta_ols), fill = "dark blue", alpha = 0.3, color = "dark blue") +

doc/guide/data_backend.rst

Lines changed: 0 additions & 4 deletions
@@ -25,7 +25,6 @@ demonstrated in the following. We download the Bonus data set from the Pennsylva
 .. tabbed:: R

     .. jupyter-execute::
-        :raises:

         library(DoubleML)

@@ -70,7 +69,6 @@ serving as treatment variable :math:`D` and the columns ``x_cols=`` specifying t
 .. tabbed:: R

     .. jupyter-execute::
-        :raises:

         # Specify the data and the variables for the causal model

@@ -132,7 +130,6 @@ variable ``y`` and a treatment variable ``d``
 .. tabbed:: R

     .. jupyter-execute::
-        :raises:

         # Generate data
         set.seed(3141)
@@ -157,7 +154,6 @@ To specify the data and the variables for the causal model from arrays we call
 .. tabbed:: R

     .. jupyter-execute::
-        :raises:

         obj_dml_data_sim = double_ml_data_from_matrix(X = X, y = y, d = d)
         obj_dml_data_sim

doc/guide/learners.rst

Lines changed: 0 additions & 8 deletions
@@ -216,7 +216,6 @@ package for R.
 .. tabbed:: R

     .. jupyter-execute::
-        :raises:

         library(DoubleML)
         library(mlr3)
@@ -244,7 +243,6 @@ Setting hyperparameters:
 .. tabbed:: R

     .. jupyter-execute::
-        :raises:

         set.seed(3141)
         ml_g = lrn("regr.ranger", num.trees=10)
@@ -281,7 +279,6 @@ Setting treatment-variable-specific or fold-specific hyperparameters:
 .. tabbed:: R

     .. jupyter-execute::
-        :raises:

         set.seed(3141)
         ml_g = lrn("regr.ranger")
@@ -307,7 +304,6 @@ The following example illustrates how to set parameters for each fold.
 .. tabbed:: R

     .. jupyter-execute::
-        :raises:

         learner = lrn("regr.ranger")
         ml_g = learner$clone()
@@ -340,7 +336,6 @@ To illustrate the parameter tuning, we generate data from a sparse partially lin
 .. tabbed:: R

     .. jupyter-execute::
-        :raises:

         library(DoubleML)
         library(mlr3)
@@ -387,7 +382,6 @@ for tuning, each of the two folds would be split up into 5 subfolds and the erro
 .. tabbed:: R

     .. jupyter-execute::
-        :raises:

         library(DoubleML)
         library(mlr3)
@@ -432,7 +426,6 @@ external parameter tuning of the nuisance parts. The optimally chosen parameters
 .. tabbed:: R

     .. jupyter-execute::
-        :raises:

         library(DoubleML)
         library(mlr3)
@@ -458,7 +451,6 @@ as provided by the ``ranger`` package.
 .. tabbed:: R

     .. jupyter-execute::
-        :raises:

         library(DoubleML)
         library(mlr3)

doc/guide/models.rst

Lines changed: 0 additions & 4 deletions
@@ -35,7 +35,6 @@ Estimation is conducted via its ``fit()`` method:
 .. tabbed:: R

     .. jupyter-execute::
-        :raises:

         library(DoubleML)
         library(mlr3)
@@ -88,7 +87,6 @@ Estimation is conducted via its ``fit()`` method:
 .. tabbed:: R

     .. jupyter-execute::
-        :raises:

         library(DoubleML)
         library(mlr3)
@@ -137,7 +135,6 @@ Estimation is conducted via its ``fit()`` method:
 .. tabbed:: R

     .. jupyter-execute::
-        :raises:

         library(DoubleML)
         library(mlr3)
@@ -187,7 +184,6 @@ Estimation is conducted via its ``fit()`` method:
 .. tabbed:: R

     .. jupyter-execute::
-        :raises:

         library(DoubleML)
         library(mlr3)

doc/guide/resampling.rst

Lines changed: 0 additions & 13 deletions
@@ -36,7 +36,6 @@ implemented in ``DoubleMLPLR``.
 .. tabbed:: R

     .. jupyter-execute::
-        :raises:

         library(DoubleML)
         library(mlr3)
@@ -71,7 +70,6 @@ The default setting is ``n_folds = 5`` and ``n_rep = 1``, i.e.,
 .. tabbed:: R

     .. jupyter-execute::
-        :raises:

         dml_plr_obj = DoubleMLPLR$new(obj_dml_data, ml_g, ml_m, n_folds = 5, n_rep = 1)
         print(dml_plr_obj$n_folds)
@@ -92,7 +90,6 @@ The :math:`K`-fold random partition is stored in the ``smpls`` attribute of the
 .. tabbed:: R

     .. jupyter-execute::
-        :raises:

         dml_plr_obj$smpls

@@ -119,7 +116,6 @@ stored in the attributes ``psi_a`` and ``psi_b``.
 .. tabbed:: R

     .. jupyter-execute::
-        :raises:

         dml_plr_obj$fit()
         print(dml_plr_obj$psi_a[1:5, ,1])
@@ -142,7 +138,6 @@ It results in :math:`M` random :math:`K`-fold partitions being drawn.
 .. tabbed:: R

     .. jupyter-execute::
-        :raises:

         dml_plr_obj = DoubleMLPLR$new(obj_dml_data, ml_g, ml_m, n_folds = 5, n_rep = 10)
         print(dml_plr_obj$n_folds)
@@ -170,7 +165,6 @@ The third dimension refers to the treatment variable and becomes non-singleton i
 .. tabbed:: R

     .. jupyter-execute::
-        :raises:

         dml_plr_obj$fit()
         print(dml_plr_obj$psi_a[1:5, ,1])
@@ -199,7 +193,6 @@ and the asymptotic standard error :math:`\hat{\sigma}/\sqrt{N}` in ``se``.
 .. tabbed:: R

     .. jupyter-execute::
-        :raises:

         print(dml_plr_obj$coef)
         print(dml_plr_obj$se)
@@ -218,7 +211,6 @@ The parameter estimates :math:`(\tilde{\theta}_{0,m})_{m \in [M]}` and asymptoti
 .. tabbed:: R

     .. jupyter-execute::
-        :raises:

         print(dml_plr_obj$all_coef)
         print(dml_plr_obj$all_se)
@@ -246,7 +238,6 @@ initialization of the ``DoubleMLPLR`` object.
 .. tabbed:: R

     .. jupyter-execute::
-        :raises:

         set.seed(314)
         dml_plr_obj_internal = DoubleMLPLR$new(obj_dml_data, ml_g, ml_m, n_folds = 4)
@@ -273,7 +264,6 @@ and set the partition via the ``set_sample_splitting()`` method.
 .. tabbed:: R

     .. jupyter-execute::
-        :raises:

         dml_plr_obj_external = DoubleMLPLR$new(obj_dml_data, ml_g, ml_m, draw_sample_splitting = FALSE)

@@ -312,7 +302,6 @@ Note that cross-fitting performs well empirically and is recommended to remove b
 .. tabbed:: R

     .. jupyter-execute::
-        :raises:

         dml_plr_obj_external = DoubleMLPLR$new(obj_dml_data, ml_g, ml_m,
                                                n_folds = 2, apply_cross_fitting = FALSE)
@@ -339,7 +328,6 @@ via ``set_sample_splitting()`` needs to be applied, like for example:
 .. tabbed:: R

     .. jupyter-execute::
-        :raises:

         dml_plr_obj_external = DoubleMLPLR$new(obj_dml_data, ml_g, ml_m,
                                                n_folds = 2, apply_cross_fitting = FALSE,
@@ -381,7 +369,6 @@ justification, see also :ref:`bias_overfitting`.
 .. tabbed:: R

     .. jupyter-execute::
-        :raises:

         dml_plr_no_split = DoubleMLPLR$new(obj_dml_data, ml_g, ml_m,
                                            n_folds = 1, apply_cross_fitting = FALSE)
