plug_split() specifies the type of splitting used in the analysis. It accepts a function .f that will be applied to the data. Only functions which return an rsplit object will be allowed. See package rsample and the details section. If a model has been fit before adding the split, it will need to be refit.
drop_split() removes the split specification from the tidyflow. Note that it keeps other preprocessing steps such as the recipe.
replace_split() first removes the split, then adds a new split specification. Any model that has already been fit based on this split will need to be refit.
plug_split(x, .f, ...)
drop_split(x)
replace_split(x, .f, ...)
x: A tidyflow.
.f: A function to be applied to the dataset in the tidyflow. Must return an object of class rsplit. See package rsample.
...: Arguments passed to .f. These arguments must be named. The processing of ... respects the quotation rules of .f. In other words, if the function accepts variables both as strings and as names, the user can specify either form. See the examples section.
x, updated with either a new or removed split specification.
The split specification is an optional step in the tidyflow. You can add a dataframe, prepare a recipe and fit the model without splitting into training/testing.
When applied to the data, the function .f must return an object of class rsplit. Functions that do so come from the rsample package, for example initial_split.
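One way to check whether a candidate function meets this requirement is to call it on the data directly and inspect the class of the result. The sketch below uses only rsample; the variable names are illustrative, and it is not meant to describe how tidyflow validates .f internally.

```r
library(rsample)

# initial_split() and initial_time_split() each return a single rsplit object,
# so either one is a candidate for .f.
split_random <- initial_split(mtcars, prop = 0.8)
split_time <- initial_time_split(mtcars, prop = 0.8)
is_rsplit_random <- inherits(split_random, "rsplit")
is_rsplit_time <- inherits(split_time, "rsplit")

# vfold_cv() returns a collection of resamples (an rset), not a single rsplit,
# so it does not satisfy the requirement described above.
folds <- vfold_cv(mtcars, v = 5)
is_rsplit_folds <- inherits(folds, "rsplit")

# The training and testing partitions together cover every row of the data.
n_train <- nrow(training(split_random))
n_test <- nrow(testing(split_random))
```

Functions that return an rset, such as vfold_cv, describe resampling and not a single training/testing partition, which is why they fall outside what .f accepts here.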
library(tibble)
library(rsample)
library(tidyflow)
# Strata as a string
wf <-
  mtcars %>%
  tidyflow() %>%
  plug_split(initial_split, prop = 0.8, strata = "cyl")
wf
#> ══ Tidyflow ════════════════════════════════════════════════════════════════════
#> Data: 32 rows x 11 columns
#> Split: initial_split w/ prop = ~0.8, strata = ~"cyl"
#> Recipe/Formula: None
#> Resample: None
#> Grid: None
#> Model: None
# Strata as unquoted name
wf <- replace_split(wf, initial_split, prop = 0.8, strata = cyl)
wf
#> ══ Tidyflow ════════════════════════════════════════════════════════════════════
#> Data: 32 rows x 11 columns
#> Split: initial_split w/ prop = ~0.8, strata = ~cyl
#> Recipe/Formula: None
#> Resample: None
#> Grid: None
#> Model: None
# Remove the split; the result is not assigned, so wf keeps its split
drop_split(wf)
#> ══ Tidyflow ════════════════════════════════════════════════════════════════════
#> Data: 32 rows x 11 columns
#> Split: None
#> Recipe/Formula: None
#> Resample: None
#> Grid: None
#> Model: None
# New split function
replace_split(wf, initial_time_split)
#> ══ Tidyflow ════════════════════════════════════════════════════════════════════
#> Data: 32 rows x 11 columns
#> Split: initial_time_split w/ default args
#> Recipe/Formula: None
#> Resample: None
#> Grid: None
#> Model: None