Multiple Assignment with unpack

In R there are many functions that return named lists or other structures keyed by names. Often, you want to unpack the elements of such a list into separate variables, for ease of use. One example is the use of split() to partition a larger data frame into a named list of smaller data frames, each corresponding to some grouping.

x	group
1	train
2	calibrate
3	test
4	train
5	calibrate
6	test
7	train
8	calibrate
9	test
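The table above, and the by-hand approach it motivates, can be sketched as follows. This is a reconstruction under stated assumptions: the construction of d is inferred from the table, and the names parts, train_data, calibrate_data, and test_data are taken from the prose and cleanup code later in this document, not from code shown at this point.

```r
# Assumed construction of the example data frame d (inferred from the table above).
d <- data.frame(
  x = 1:9,
  group = rep(c('train', 'calibrate', 'test'), times = 3),
  stringsAsFactors = FALSE)

# Manual unpacking: keep the whole split result, then copy out each piece by name.
parts <- split(d, d$group)
train_data <- parts$train
calibrate_data <- parts$calibrate
test_data <- parts$test

# parts stays behind in the environment after the pieces are extracted.
print(names(parts))
```

Each element has to be copied out by hand, and parts lingers until it is explicitly removed.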

A multiple assignment notation lets us assign all the smaller data frames to variables in one step, without leaving a possibly large temporary variable (such as a parts list holding the full result of split()) in our environment. One such notation is unpack().

Basic `unpack()` example

# clear out the earlier results
rm(list = c('train_data', 'calibrate_data', 'test_data', 'parts'))

# split d and unpack the smaller data frames into separate variables
unpack(split(d, d$group),
       train_data = train,
       test_data = test,
       calibrate_data = calibrate)

knitr::kable(train_data)

	x	group
1	1	train
4	4	train
7	7	train

knitr::kable(calibrate_data)

	x	group
2	2	calibrate
5	5	calibrate
8	8	calibrate

knitr::kable(test_data)

	x	group
3	3	test
6	6	test
9	9	test

You can also use unpack with an assignment notation similar to the one used with the zeallot::%<-% operator:

# split d and unpack the smaller data frames into separate variables
unpack[traind = train, testd = test, cald = calibrate] := split(d, d$group)

knitr::kable(traind)

	x	group
1	1	train
4	4	train
7	7	train

knitr::kable(cald)

	x	group
2	2	calibrate
5	5	calibrate
8	8	calibrate

knitr::kable(testd)

	x	group
3	3	test
6	6	test
9	9	test

Reusing the list names as variables

If you are willing to assign the elements of the list into variables with the same names, you can just use the names:

unpack(split(d, d$group), train, test, calibrate)

knitr::kable(train)

	x	group
1	1	train
4	4	train
7	7	train

knitr::kable(calibrate)

	x	group
2	2	calibrate
5	5	calibrate
8	8	calibrate

knitr::kable(test)

	x	group
3	3	test
6	6	test
9	9	test

# try the unpack[] assignment notation

rm(list = c('train', 'test', 'calibrate'))

unpack[test, train, calibrate] := split(d, d$group)

knitr::kable(train)

	x	group
1	1	train
4	4	train
7	7	train

knitr::kable(calibrate)

	x	group
2	2	calibrate
5	5	calibrate
8	8	calibrate

knitr::kable(test)

	x	group
3	3	test
6	6	test
9	9	test

Mixed notation is allowed:

rm(list = c('train', 'test', 'calibrate'))
unpack(split(d, d$group), train, holdout=test, calibrate)

knitr::kable(train)

	x	group
1	1	train
4	4	train
7	7	train

knitr::kable(calibrate)

	x	group
2	2	calibrate
5	5	calibrate
8	8	calibrate

knitr::kable(holdout)

	x	group
3	3	test
6	6	test
9	9	test

Unpacking only parts of a list

You can also unpack only a subset of the list’s elements:

rm(list = c('train', 'holdout', 'calibrate'))

unpack(split(d, d$group), train, test)

knitr::kable(train)

	x	group
1	1	train
4	4	train
7	7	train

knitr::kable(test)

	x	group
3	3	test
6	6	test
9	9	test

# we didn't unpack the calibrate set
calibrate

## Error in eval(expr, envir, enclos): object 'calibrate' not found

`unpack` checks for unknown elements

If unpack is asked to unpack an element it doesn’t recognize, it throws an error. In this case, none of the elements are unpacked, as unpack is deliberately an atomic operation.

# the split call will not return an element called "holdout"
unpack(split(d, d$group), training = train, testing = holdout)

## Error in write_values_into_env(unpack_environment = unpack_environment, : wrapr::unpack all source names must be in value, missing: 'holdout'.

# train was not unpacked either
training

## Error in eval(expr, envir, enclos): object 'training' not found
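One way to confirm the all-or-nothing behavior programmatically is to catch the error and then check that no target variable was created. This is a minimal sketch, assuming the wrapr package is installed and that d matches the example table; the variables msg and training_exists are illustrative names only.

```r
library(wrapr)

# Assumed example data, matching the table at the start of this document.
d <- data.frame(
  x = 1:9,
  group = rep(c('train', 'calibrate', 'test'), times = 3),
  stringsAsFactors = FALSE)

# Attempt the failing unpack and capture the error message instead of stopping.
msg <- tryCatch({
  unpack(split(d, d$group), training = train, testing = holdout)
  NA_character_  # only reached if unpack unexpectedly succeeds
}, error = function(e) conditionMessage(e))

# Check whether the valid target (training) was assigned despite the failure.
training_exists <- exists('training', inherits = FALSE)

print(msg)
print(training_exists)
```

Wrapping the call in tryCatch() lets a script test for the failure and carry on, rather than halting at the error.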

Other multiple assignment packages

`zeallot`

The zeallot package already supplies excellent positional or ordered unpacking.

The primary difference between zeallot’s %<-% operator and unpack is that %<-% is a positional unpacker: you must unpack the list based on the order of its elements. This style may be more appropriate in the Python world, where many functions return unnamed tuples of results.

unpack is a named unpacker: assignments are based on the names of elements in the list, and the assignments can be in any order. We feel this is more appropriate for R, as R has not emphasized positional unpacking; R functions tend to return named lists or named structures. For named lists or named structures it may not be safe to rely on value positions.

For unpacking named lists, we recommend unpack. For unpacking unnamed lists, use %<-%.
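The risk of relying on positions shows up even in the split() example used throughout this document. In the sketch below (base R only; d is assumed to match the example table, and first_by_position and train_by_name are illustrative names), the list elements come back ordered by the sorted group labels, not by order of first appearance, so the first element is not the train set.

```r
# Assumed example data, matching the table at the start of this document.
d <- data.frame(
  x = 1:9,
  group = rep(c('train', 'calibrate', 'test'), times = 3),
  stringsAsFactors = FALSE)

parts <- split(d, d$group)

# Elements are ordered by the sorted levels of the grouping variable.
print(names(parts))

# Positional access: the first element holds the calibrate rows, not train.
first_by_position <- parts[[1]]
print(unique(first_by_position$group))

# Named access: unaffected by the order of the elements.
train_by_name <- parts[['train']]
print(train_by_name$x)
```

A positional unpacker would bind its first target to the calibrate rows here, while a named unpacker picks up train regardless of where it sits in the list.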

`vadr`

vadr::bind supplies named unpacking, but appears to use a “SOURCE = DESTINATION” notation. That is the reverse of the “DESTINATION = SOURCE” form in which both R assignments and argument bindings are already written.

`tidytidbits`

tidytidbits supplies positional unpacking with a %=% notation.

Nina Zumel and John Mount

2023-08-19
