The derivation of visit variables like AVISIT, AVISITN, AWLO, AWHI, …
or period, subperiod, or phase variables like APERIOD, TRT01A, TRT02A,
ASPER, PHSDTM, PHEDTM, … is highly study-specific. Therefore {admiral}
cannot provide functions which derive these variables. However, for
common scenarios like visit assignments based on time windows or
deriving BDS period variables from ADSL period variables, functions are
provided which support those derivations.
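Visit Variables (AVISIT, AVISITN, AWLO, AWHI, …)
The most common ways of deriving AVISIT and AVISITN are:
- The variables are set to the collected visits (VISIT and VISITNUM).
- The variables are set based on time windows.
The former can be achieved simply by calling mutate(), like in the
vignettes and the template scripts.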
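As a minimal sketch of the first approach, the collected visit variables can be copied over with mutate(). The input dataset name vs, its records, and the choice to copy the values unchanged are illustrative assumptions; a real study may need to remap or blank out certain visits.

```r
library(dplyr)
library(tibble)

# Hypothetical SDTM-like records with collected visits (illustrative only)
vs <- tribble(
  ~USUBJID, ~VISIT,      ~VISITNUM,
  "1",      "SCREENING", 1,
  "1",      "WEEK 1",    2,
  "1",      "WEEK 2",    3
)

advs <- vs %>%
  mutate(
    AVISIT = VISIT,     # analysis visit copied from the collected visit
    AVISITN = VISITNUM  # numeric analysis visit copied from the collected visit number
  )
```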
For the latter a (study-specific) reference dataset needs to be created
which provides for each visit the start and end day (AWLO and AWHI) and
the values of other visit related variables (AVISIT, AVISITN, AWTARGET,
…).
windows <- tribble(
  ~AVISIT, ~AWLO, ~AWHI, ~AVISITN, ~AWTARGET,
  "BASELINE", -30, 1, 0, 1,
  "WEEK 1", 2, 7, 1, 5,
  "WEEK 2", 8, 15, 2, 11,
  "WEEK 3", 16, 22, 3, 19,
  "WEEK 4", 23, 30, 4, 26
)
Then the visits can be assigned based on the analysis day (ADY) by
calling derive_vars_joined():
adbds <- tribble(
  ~USUBJID, ~ADY,
  "1", -33,
  "1", -2,
  "1", 3,
  "1", 24,
  "2", NA
)

adbds1 <- adbds %>%
  derive_vars_joined(
    dataset_add = windows,
    filter_join = AWLO <= ADY & ADY <= AWHI,
    join_type = "all"
  )
USUBJID | ADY | AVISIT | AWLO | AWHI | AVISITN | AWTARGET |
---|---|---|---|---|---|---|
1 | -33 | NA | NA | NA | NA | NA |
1 | -2 | BASELINE | -30 | 1 | 0 | 1 |
1 | 3 | WEEK 1 | 2 | 7 | 1 | 5 |
1 | 24 | WEEK 4 | 23 | 30 | 4 | 26 |
2 | NA | NA | NA | NA | NA | NA |
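In the output above, the record with ADY = -33 receives no visit because it lies below the lowest window bound (-30). Records can also be left unassigned, or matched more than once, if the windows themselves contain gaps or overlaps. The following is a minimal sketch of a sanity check on the reference dataset, assuming integer analysis days; the helper columns NEXT_AWLO and CONTIGUOUS are hypothetical names introduced only for this illustration.

```r
library(dplyr)
library(tibble)

windows <- tribble(
  ~AVISIT,    ~AWLO, ~AWHI, ~AVISITN, ~AWTARGET,
  "BASELINE",   -30,     1,        0,         1,
  "WEEK 1",       2,     7,        1,         5,
  "WEEK 2",       8,    15,        2,        11,
  "WEEK 3",      16,    22,        3,        19,
  "WEEK 4",      23,    30,        4,        26
)

# Compare the upper bound of each window with the lower bound of the next one
window_check <- windows %>%
  arrange(AWLO) %>%
  mutate(
    NEXT_AWLO = lead(AWLO),
    # The last window has no successor and is treated as contiguous
    CONTIGUOUS = is.na(NEXT_AWLO) | NEXT_AWLO == AWHI + 1
  )

# Windows followed by a gap or an overlap
problem_windows <- filter(window_check, !CONTIGUOUS)
```

For the windows defined in this vignette, problem_windows is empty, since each window starts one day after the previous one ends.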
If periods, subperiods, or phases are used, in simpler ADaM applications the corresponding variables have to be consistent across all datasets. This can be achieved by defining the periods, subperiods, or phases once and then using this definition for all datasets. The definition can be stored in ADSL or in a separate dataset. In this vignette’s examples, this separate dataset is called the period or phase reference dataset, depending on what data it contains.
Note that periods, subperiods, or phases can be defined differently across datasets (for instance, they may be different across safety and efficacy analyses) in which case the start/stop dates should be defined in the individual datasets, instead of in ADSL. However, this vignette will not cover this scenario.
The sections below showcase the tools available in {admiral} to work
with period, subperiod, and phase variables. However, at some point
study-specific code will always be required. There are two options:
1. Study-specific code is used to first derive the variables PxxSwSDT
and PxxSwEDT in ADSL. Then create_period_dataset() and
derive_vars_joined() can be used to derive period/subperiod variables
like ASPER or ASPRSDT in BDS and OCCDS datasets. See the
create_period_dataset() example later in this vignette.
2. Study-specific code is used to derive a dataset with one observation
per patient, period, and subperiod (see the period reference dataset
example below). Then derive_vars_period() can be used to derive
PxxSwSDT and PxxSwEDT in ADSL, and derive_vars_joined() can be used to
derive period/subperiod variables like ASPER or ASPRSDT in BDS and
OCCDS datasets.
It depends on the specific definition of the periods/subperiods which option works best. If the definition is based on other ADSL variables, the first option would work best. If the definition is based on vertically structured data like exposure data (EX dataset), the second option should be used. This vignette contains examples for both workflows.
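To make the first option more concrete for subperiods, here is a minimal sketch. It assumes an ADSL in which the PxxSwSDT and PxxSwEDT variables have already been derived by study-specific code; the dataset adsl_sub, its dates, and the output name subperiod_ref are invented for illustration.

```r
library(admiral)
library(dplyr)
library(tibble)
library(lubridate)

# Hypothetical ADSL: two subperiods in period 1 and one subperiod in period 2
adsl_sub <- tribble(
  ~STUDYID, ~USUBJID, ~P01S1SDT, ~P01S1EDT, ~P01S2SDT, ~P01S2EDT, ~P02S1SDT, ~P02S1EDT,
  "xyz", "1", "2021-01-04", "2021-01-19", "2021-01-20", "2021-02-06", "2021-02-07", "2021-03-07",
  "xyz", "2", "2021-02-02", "2021-02-17", "2021-02-18", "2021-03-02", "2021-03-03", "2021-04-01"
) %>%
  # Convert all subperiod start/end variables from character to date
  mutate(across(matches("^P\\d\\dS\\d[SE]DT$"), ymd))

# Transpose the ADSL subperiod variables into a vertical reference dataset
subperiod_ref <- create_period_dataset(
  adsl_sub,
  new_vars = exprs(ASPRSDT = PxxSwSDT, ASPREDT = PxxSwEDT)
)
```

The result should contain one observation per subject and subperiod, identified by the numeric variables APERIOD and ASPER, which can then be passed to derive_vars_joined() as in the examples further below.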
The {admiral} functions expect separate reference datasets for periods,
subperiods, and phases. For periods the numeric variable APERIOD is
expected, for subperiods the numeric variables APERIOD and ASPER, and
for phases the numeric variable APHASEN.
The period/phase reference dataset should be created according to the design and ADaM needs of the study in question. It should contain one observation per subject and period, subperiod, or phase. See the next section for an example dataset.
Consider a simple crossover study with the following design:
Given this design, an option could be to split the study into two periods:
Alternatively (or additionally) one could split into two phases:
Below, we present two example workflows: one where a period reference dataset is created from the exposure dataset EX, and the other where a phase reference dataset is created using ADSL variables.
Below we create a period reference dataset starting from the exposure dataset EX. Consider the following exposure dataset:
ex <- tribble(
  ~STUDYID, ~USUBJID, ~VISIT, ~EXTRT, ~EXSTDTC,
  "xyz", "1", "Day 1", "Drug X", "2022-01-02",
  "xyz", "1", "Week 4", "Drug X", "2022-02-05",
  "xyz", "1", "Week 8", "Drug X", "2022-03-01",
  "xyz", "1", "Week 12", "Drug X", "2022-04-03",
  "xyz", "1", "Week 16", "Drug Y", "2022-05-03",
  "xyz", "1", "Week 20", "Drug Y", "2022-06-02",
  "xyz", "1", "Week 24", "Drug Y", "2022-07-01",
  "xyz", "1", "Week 28", "Drug Y", "2022-08-04",
  "xyz", "2", "Day 1", "Drug Y", "2023-10-20",
  "xyz", "2", "Week 4", "Drug Y", "2023-11-21",
  "xyz", "2", "Week 8", "Drug Y", "2023-12-19",
  "xyz", "2", "Week 12", "Drug Y", "2024-01-19",
  "xyz", "2", "Week 16", "Drug X", "2024-02-20",
  "xyz", "2", "Week 20", "Drug X", "2024-03-17",
  "xyz", "2", "Week 24", "Drug X", "2024-04-22",
  "xyz", "2", "Week 28", "Drug X", "2024-05-21"
)
Then to create a period reference dataset, code like this (or similar) would suffice:
period_ref <- ex %>%
  # Select visits marking the start of each period
  filter(VISIT %in% c("Day 1", "Week 16")) %>%
  # Create APERIOD, APERSDT, TRTA based on SDTM counterparts
  mutate(
    APERIOD = case_when(
      VISIT == "Day 1" ~ 1,
      VISIT == "Week 16" ~ 2
    ),
    TRTA = EXTRT,
    APERSDT = convert_dtc_to_dt(EXSTDTC)
  ) %>%
  # Create APEREDT based on start date of next period
  arrange(USUBJID, APERSDT) %>%
  group_by(USUBJID) %>%
  mutate(
    APEREDT = lead(APERSDT) - 1 # one day before start of next period
  ) %>%
  # Tidy up
  ungroup() %>%
  select(-starts_with("EX"), -VISIT)
STUDYID | USUBJID | APERIOD | TRTA | APERSDT | APEREDT |
---|---|---|---|---|---|
xyz | 1 | 1 | Drug X | 2022-01-02 | 2022-05-02 |
xyz | 1 | 2 | Drug Y | 2022-05-03 | NA |
xyz | 2 | 1 | Drug Y | 2023-10-20 | 2024-02-19 |
xyz | 2 | 2 | Drug X | 2024-02-20 | NA |
The workflow above populates the Period End Date APEREDT for all
periods except the last one. The value of the last period could then be
populated with the End of Study Date (EOSDT) from ADSL, to obtain:
adsl <- tribble(
  ~STUDYID, ~USUBJID, ~TRTSDT, ~TRTEDT, ~EOSDT,
  "xyz", "1", ymd("2022-01-02"), ymd("2022-08-04"), ymd("2022-09-10"),
  "xyz", "2", ymd("2023-10-20"), ymd("2024-05-21"), ymd("2024-06-30")
)

period_ref <- period_ref %>%
  left_join(adsl, by = c("STUDYID", "USUBJID")) %>%
  # APERIOD is numeric, so compare against numbers rather than strings
  mutate(APEREDT = case_when(
    APERIOD == 1 ~ APEREDT,
    APERIOD == 2 ~ EOSDT
  )) %>%
  select(-EOSDT, -TRTSDT, -TRTEDT)
STUDYID | USUBJID | APERIOD | TRTA | APERSDT | APEREDT |
---|---|---|---|---|---|
xyz | 1 | 1 | Drug X | 2022-01-02 | 2022-05-02 |
xyz | 1 | 2 | Drug Y | 2022-05-03 | 2022-09-10 |
xyz | 2 | 1 | Drug Y | 2023-10-20 | 2024-02-19 |
xyz | 2 | 2 | Drug X | 2024-02-20 | 2024-06-30 |
If the treatment start and end dates are already included in ADSL, we
can derive the phase variables directly in ADSL and create a phase
reference dataset by employing create_period_dataset().
Here is an example command to achieve this goal:
adsl1 <- adsl %>%
  mutate(
    PH1SDT = TRTSDT,
    PH1EDT = TRTEDT + 28,
    APHASE1 = "TREATMENT",
    PH2SDT = TRTEDT + 29,
    PH2EDT = EOSDT,
    APHASE2 = "FUP"
  )

phase_ref <- create_period_dataset(
  adsl1,
  new_vars = exprs(PHSDT = PHwSDT, PHEDT = PHwEDT, APHASE = APHASEw)
)
STUDYID | USUBJID | APHASEN | PHSDT | PHEDT | APHASE |
---|---|---|---|---|---|
xyz | 1 | 1 | 2022-01-02 | 2022-09-01 | TREATMENT |
xyz | 1 | 2 | 2022-09-02 | 2022-09-10 | FUP |
xyz | 2 | 1 | 2023-10-20 | 2024-06-18 | TREATMENT |
xyz | 2 | 2 | 2024-06-19 | 2024-06-30 | FUP |
If a period/phase reference dataset is available, the ADSL variables
for periods, subperiods, or phases can be created from this dataset by
calling derive_vars_period().
For example, the period reference dataset from the previous section can
be used to add the period variables (APxxSDT, APxxEDT) to ADSL:
adsl2 <- derive_vars_period(
  adsl,
  dataset_ref = period_ref,
  new_vars = exprs(APxxSDT = APERSDT, APxxEDT = APEREDT)
)
STUDYID | USUBJID | AP01SDT | AP01EDT | AP02SDT | AP02EDT |
---|---|---|---|---|---|
xyz | 1 | 2022-01-02 | 2022-05-02 | 2022-05-03 | 2022-09-10 |
xyz | 2 | 2023-10-20 | 2024-02-19 | 2024-02-20 | 2024-06-30 |
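The same call pattern should also apply to phases, using the placeholder w for the phase number instead of xx. The following is a minimal sketch, assuming a phase reference dataset shaped like the phase_ref output shown earlier (rebuilt here with tribble() so that the block is self-contained); adsl_min and adsl_phase are illustrative names.

```r
library(admiral)
library(dplyr)
library(tibble)
library(lubridate)

# Minimal ADSL containing only the subject keys (illustrative)
adsl_min <- tribble(
  ~STUDYID, ~USUBJID,
  "xyz",    "1",
  "xyz",    "2"
)

# Phase reference dataset with one observation per subject and phase
phase_ref <- tribble(
  ~STUDYID, ~USUBJID, ~APHASEN, ~PHSDT,       ~PHEDT,       ~APHASE,
  "xyz",    "1",      1,        "2022-01-02", "2022-09-01", "TREATMENT",
  "xyz",    "1",      2,        "2022-09-02", "2022-09-10", "FUP",
  "xyz",    "2",      1,        "2023-10-20", "2024-06-18", "TREATMENT",
  "xyz",    "2",      2,        "2024-06-19", "2024-06-30", "FUP"
) %>%
  mutate(PHSDT = ymd(PHSDT), PHEDT = ymd(PHEDT))

# Transpose the phase reference dataset into ADSL phase variables
adsl_phase <- derive_vars_period(
  adsl_min,
  dataset_ref = phase_ref,
  new_vars = exprs(PHwSDT = PHSDT, PHwEDT = PHEDT, APHASEw = APHASE)
)
```

This is the reverse of the create_period_dataset() call used to build phase_ref, so the resulting ADSL should carry PH1SDT, PH1EDT, APHASE1, PH2SDT, PH2EDT, and APHASE2.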
If a period/phase reference dataset is available, BDS and OCCDS
variables for periods, subperiods, or phases can be created by calling
derive_vars_joined().
For example, the variables APHASEN, PHSDT, PHEDT, APHASE can be derived
from the phase reference dataset defined above.
adae <- tribble(
  ~STUDYID, ~USUBJID, ~ASTDT,
  "xyz", "1", "2022-01-31",
  "xyz", "1", "2022-05-02",
  "xyz", "1", "2022-09-03",
  "xyz", "1", "2022-09-09",
  "xyz", "2", "2023-12-25",
  "xyz", "2", "2024-06-19"
) %>%
  mutate(ASTDT = ymd(ASTDT))

adae1 <- adae %>%
  derive_vars_joined(
    dataset_add = phase_ref,
    by_vars = exprs(STUDYID, USUBJID),
    filter_join = PHSDT <= ASTDT & ASTDT <= PHEDT,
    join_type = "all"
  )
STUDYID | USUBJID | ASTDT | APHASEN | PHSDT | PHEDT | APHASE |
---|---|---|---|---|---|---|
xyz | 1 | 2022-01-31 | 1 | 2022-01-02 | 2022-09-01 | TREATMENT |
xyz | 1 | 2022-05-02 | 1 | 2022-01-02 | 2022-09-01 | TREATMENT |
xyz | 1 | 2022-09-03 | 2 | 2022-09-02 | 2022-09-10 | FUP |
xyz | 1 | 2022-09-09 | 2 | 2022-09-02 | 2022-09-10 | FUP |
xyz | 2 | 2023-12-25 | 1 | 2023-10-20 | 2024-06-18 | TREATMENT |
xyz | 2 | 2024-06-19 | 2 | 2024-06-19 | 2024-06-30 | FUP |
Treatment Variables (TRTxxP, TRTxxA, TRTP, TRTA, …)
In studies with multiple periods the treatment can differ by period,
e.g. in a crossover trial (see the previous section for an example
design showcasing this). CDISC defines variables for planned and actual
treatments in ADSL (TRTxxP, TRTxxA, TRxxPGy, TRxxAGy, …) and
corresponding variables in BDS and OCCDS datasets (TRTP, TRTA, TRTPGy,
TRTAGy, …). They can be derived in the same way (and in the same step)
as the period, subperiod, and phase variables.
If the treatment information is included in the period/phase reference
dataset, the treatment ADSL variables can be created by calling
derive_vars_period(). This is showcased below using the period
reference dataset from previous sections.
adsl <- derive_vars_period(
  adsl,
  dataset_ref = period_ref,
  new_vars = exprs(
    APxxSDT = APERSDT,
    APxxEDT = APEREDT,
    TRTxxA = TRTA
  )
)
STUDYID | USUBJID | TRT01A | TRT02A | AP01SDT | AP01EDT | AP02SDT | AP02EDT |
---|---|---|---|---|---|---|---|
xyz | 1 | Drug X | Drug Y | 2022-01-02 | 2022-05-02 | 2022-05-03 | 2022-09-10 |
xyz | 2 | Drug Y | Drug X | 2023-10-20 | 2024-02-19 | 2024-02-20 | 2024-06-30 |
If a period/phase reference dataset is available, BDS and OCCDS
variables for treatment can be created by calling derive_vars_joined().
For example, the variables APERIOD and TRTA can be derived from the
period reference dataset defined above.
adae <- tribble(
  ~STUDYID, ~USUBJID, ~ASTDT,
  "xyz", "1", "2022-01-31",
  "xyz", "1", "2022-05-02",
  "xyz", "1", "2022-08-24",
  "xyz", "1", "2022-09-09",
  "xyz", "2", "2023-12-25",
  "xyz", "2", "2024-06-07"
) %>%
  mutate(ASTDT = ymd(ASTDT))

adae2 <- adae %>%
  derive_vars_joined(
    dataset_add = period_ref,
    by_vars = exprs(STUDYID, USUBJID),
    new_vars = exprs(APERIOD, TRTA),
    join_vars = exprs(APERSDT, APEREDT),
    join_type = "all",
    filter_join = APERSDT <= ASTDT & ASTDT <= APEREDT
  )
STUDYID | USUBJID | ASTDT | APERIOD | TRTA |
---|---|---|---|---|
xyz | 1 | 2022-01-31 | 1 | Drug X |
xyz | 1 | 2022-05-02 | 1 | Drug X |
xyz | 1 | 2022-08-24 | 2 | Drug Y |
xyz | 1 | 2022-09-09 | 2 | Drug Y |
xyz | 2 | 2023-12-25 | 1 | Drug Y |
xyz | 2 | 2024-06-07 | 2 | Drug X |
If no period/phase reference dataset is available but period/phase
variables are in ADSL, then the former can again be created from ADSL by
calling create_period_dataset(), as was showcased in the phase reference
dataset example above. This time, when calling create_period_dataset(),
we just need to make sure we include TRTA = TRTxxA as part of the
new_vars argument to create the treatment variables as well.
period_ref1 <- adsl %>%
  create_period_dataset(
    new_vars = exprs(APERSDT = APxxSDT, APEREDT = APxxEDT, TRTA = TRTxxA)
  )
STUDYID | USUBJID | APERIOD | APERSDT | APEREDT | TRTA |
---|---|---|---|---|---|
xyz | 1 | 1 | 2022-01-02 | 2022-05-02 | Drug X |
xyz | 1 | 2 | 2022-05-03 | 2022-09-10 | Drug Y |
xyz | 2 | 1 | 2023-10-20 | 2024-02-19 | Drug Y |
xyz | 2 | 2 | 2024-02-20 | 2024-06-30 | Drug X |