• plug_split() specifies the type of split used in the analysis. It accepts a function .f that is applied to the data; .f must return an rsplit object (see the rsample package and the Details section). If a model was fit before the split was added, it must be refit.

• drop_split() removes the split specification from the tidyflow. Other preprocessing steps, such as the recipe, are kept.

• replace_split() first removes the split, then adds a new split specification. Any model that was fit using the previous split must be refit.

Usage

plug_split(x, .f, ...)

drop_split(x)

replace_split(x, .f, ...)

Arguments

x

A tidyflow

.f

A function to be applied to the dataset in the tidyflow. Must return an object of class rsplit. See package rsample.

...

Arguments passed to .f. These arguments must be named. The processing of ... respects the quotation rules of .f: if the function accepts variables both as strings and as unquoted names, the user can supply either form. See the Examples section.

Value

x, updated with either a new or removed split specification.

Details

The split specification is an optional step in the tidyflow. You can add a data frame, prepare a recipe and fit the model without splitting the data into training and testing sets.

When applied to the data, the function .f must return an object of class rsplit. Functions that do so come from the rsample package, for example initial_split() and initial_time_split().
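The rsplit requirement can be checked outside of a tidyflow. The sketch below uses only rsample; returns_rsplit() is a hypothetical helper written for illustration and is not part of tidyflow or rsample. It does not reproduce how tidyflow validates .f internally.

```r
library(rsample)

# Hypothetical helper (not part of tidyflow or rsample): applies a candidate
# split function to the data and reports whether the result is an rsplit.
returns_rsplit <- function(.f, data, ...) {
  inherits(.f(data, ...), "rsplit")
}

set.seed(2313)
ok_initial <- returns_rsplit(initial_split, mtcars, prop = 0.8)
ok_time    <- returns_rsplit(initial_time_split, mtcars)

# vfold_cv() returns a collection of resamples (an rset), not a single
# rsplit, so it would not qualify as a split function.
ok_vfold   <- returns_rsplit(vfold_cv, mtcars, v = 5)

c(initial_split = ok_initial, initial_time_split = ok_time, vfold_cv = ok_vfold)
```

Functions that produce resamples rather than a single training/testing split belong to the resample step of the tidyflow, not the split step.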

Examples

library(tidyflow)
library(tibble)
library(rsample)

wf <-
 mtcars %>%
 tidyflow() %>%
 plug_split(initial_split, prop = 0.8, strata = "cyl")

wf
#> ══ Tidyflow ════════════════════════════════════════════════════════════════════
#> Data: 32 rows x 11 columns
#> Split: initial_split w/ prop = ~0.8, strata = ~"cyl"
#> Recipe/Formula: None
#> Resample: None
#> Grid: None
#> Model: None

# Strata as unquoted name
wf <- replace_split(wf, initial_split, prop = 0.8, strata = cyl)

wf
#> ══ Tidyflow ════════════════════════════════════════════════════════════════════
#> Data: 32 rows x 11 columns
#> Split: initial_split w/ prop = ~0.8, strata = ~cyl
#> Recipe/Formula: None
#> Resample: None
#> Grid: None
#> Model: None

drop_split(wf)
#> ══ Tidyflow ════════════════════════════════════════════════════════════════════
#> Data: 32 rows x 11 columns
#> Split: None
#> Recipe/Formula: None
#> Resample: None
#> Grid: None
#> Model: None

# New split function
replace_split(wf, initial_time_split)
#> ══ Tidyflow ════════════════════════════════════════════════════════════════════
#> Data: 32 rows x 11 columns
#> Split: initial_time_split w/ default args
#> Recipe/Formula: None
#> Resample: None
#> Grid: None
#> Model: None
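
The quoting behaviour shown above comes from the split function itself, so it can be explored directly with rsample. This is a minimal sketch, assuming a version of rsample in which strata accepts both a string and an unquoted column name; the seed values are arbitrary.

```r
library(rsample)

# Set the same seed before each call so the two splits are comparable.
set.seed(52131)
split_chr <- initial_split(mtcars, prop = 0.8, strata = "cyl")

set.seed(52131)
split_sym <- initial_split(mtcars, prop = 0.8, strata = cyl)

# Both calls return rsplit objects; compare the training rows they select.
inherits(split_chr, "rsplit")
inherits(split_sym, "rsplit")
identical(training(split_chr), training(split_sym))

# initial_time_split() does no random sampling: the first `prop` share of
# rows goes to training and the remaining rows go to testing.
time_split <- initial_time_split(mtcars)
nrow(training(time_split)) + nrow(testing(time_split)) == nrow(mtcars)
```

Because the arguments in ... are forwarded to .f, whichever form the split function accepts is the form that can be used in plug_split() and replace_split().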