Add a recipe to a tidyflow — plug

plug_recipe() specifies the recipe used in the analysis. It accepts a function .f that is applied to the data. Only functions that return a recipe object are allowed. See the recipes package for how to create a recipe.
drop_recipe() removes the recipe function from the tidyflow. Note that it keeps other preprocessing steps such as the split and resample.
replace_recipe() first removes the recipe function, then adds the new recipe function. Any model that has already been fit based on this recipe will need to be refit.

plug_recipe(x, .f, ..., blueprint = NULL)

drop_recipe(x)

replace_recipe(x, .f, ..., blueprint = NULL)

Arguments

x: A tidyflow
.f: A function or a formula with a recipe inside. See the details section.
...: Not used.
blueprint: A hardhat blueprint used for fine tuning the preprocessing. If NULL, hardhat::default_recipe_blueprint() is used.

Value

The tidyflow x, updated with a new recipe function or with its recipe function removed.

Details

To fit a tidyflow, one of plug_formula() or plug_recipe() must be specified, but not both.
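
As a rough illustration of this either/or rule, the sketch below swaps a recipe for a formula by dropping the recipe first. The object names `tflow` and `tflow_formula` are made up for this example, and it assumes that `plug_formula()` accepts a plain model formula as its second argument.

```r
library(tidyflow)
library(recipes)
library(parsnip)

# A tidyflow whose preprocessing is a recipe (hypothetical example object)
tflow <-
  mtcars %>%
  tidyflow() %>%
  plug_recipe(~ recipe(mpg ~ cyl, data = .)) %>%
  plug_model(set_engine(linear_reg(), "lm"))

# Remove the recipe before plugging a formula, so that only one of the
# two preprocessors is specified at any time
tflow_formula <-
  tflow %>%
  drop_recipe() %>%
  plug_formula(mpg ~ cyl)

fit(tflow_formula)
```

The model specification is left in place by `drop_recipe()`, so only the preprocessing changes between the two tidyflows.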

.f can be either a function or a formula. Either form must take a single argument, which is assumed to be the data, and must return a recipe built from that argument.

If a function is supplied, it is used as is. Its single argument receives the data, and its return value should be the recipe built from that data.
If a formula is supplied, e.g. ~ recipe(mpg ~ cyl, data = .), it is converted to a function. The data is passed as the first argument of the converted function and any other arguments are ignored. Inside the formula, the data can be referred to as either . or .x. See the examples section for more details.
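
The two forms can be compared side by side outside a tidyflow. This is only a sketch: it assumes the formula-to-function conversion behaves like `rlang::as_function()`, which this page does not confirm, and the names `rec_fun`, `rec_formula` and `rec_from_formula` are made up for the illustration.

```r
library(recipes)

# Form 1: an ordinary function with a single data argument
rec_fun <- function(.x) {
  recipe(mpg ~ cyl, data = .x) %>%
    step_log(cyl, base = 10)
}

# Form 2: a one-sided formula in which `.` stands for the data
rec_formula <- ~ recipe(mpg ~ cyl, data = .) %>% step_log(cyl, base = 10)

# Assumption: the conversion is similar to rlang::as_function()
rec_from_formula <- rlang::as_function(rec_formula)

# Both forms return a recipe when applied to the data
from_function <- rec_fun(mtcars)
from_formula <- rec_from_formula(mtcars)

class(from_function)
class(from_formula)
```

Either `rec_fun` or `rec_formula` could then be handed to `plug_recipe()` as `.f`.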

Because the recipe step in a tidyflow is not well suited to exploration, we suggest constructing the recipe outside the tidyflow and applying it to the data beforehand to make sure it works. Once the recipe can be fitted without errors, provide the function or formula to plug_recipe(). Defining a recipe without testing it on the data can lead to errors in the recipe that are best fixed interactively.
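
One possible way to run that check is to estimate and apply the recipe by hand with `prep()` and `bake()` from recipes. The object names `rec`, `prepped` and `baked` are illustrative only.

```r
library(recipes)

recipe_fun <- function(.x) {
  recipe(mpg ~ ., data = .x) %>%
    step_center(all_predictors()) %>%
    step_scale(all_predictors())
}

# Build the recipe on the data, as a tidyflow would
rec <- recipe_fun(mtcars)

# Estimate the steps; errors in the recipe definition surface here
prepped <- prep(rec, training = mtcars)

# Apply the estimated steps to the training data and inspect the result
baked <- bake(prepped, new_data = NULL)
head(baked)
```

If `prep()` and `bake()` run without errors and the result looks as expected, `recipe_fun` is ready to be passed to `plug_recipe()`.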

Examples

library(tidyflow)
library(recipes)
library(parsnip)

# Passing a function to `plug_recipe`
recipe_fun <- function(.x) {
  recipe(mpg ~ ., data = .x) %>%
    step_center(all_predictors()) %>%
    step_scale(all_predictors())
}

# Let's make sure that it works with the data first
recipe_fun(mtcars)
#> Recipe
#> 
#> Inputs:
#> 
#>       role #variables
#>    outcome          1
#>  predictor         10
#> 
#> Operations:
#> 
#> Centering for all_predictors()
#> Scaling for all_predictors()

# Specify the function to be applied to the data in `plug_recipe`
tflow <-
 mtcars %>%
 tidyflow() %>%
 plug_recipe(recipe_fun) %>%
 plug_model(set_engine(linear_reg(), "lm"))

# Fit the model
fit(tflow)
#> ══ Tidyflow [trained] ══════════════════════════════════════════════════════════
#> Data: 32 rows x 11 columns
#> Split: None
#> Recipe: available
#> Resample: None
#> Grid: None
#> Model:
#> Linear Regression Model Specification (regression)
#> 
#> Computational engine: lm 
#> 
#> ══ Results ═════════════════════════════════════════════════════════════════════
#> 
#> 
#> Fitted model:
#> 
#> Call:
#> stats::lm(formula = ..y ~ ., data = data)
#> 
#> Coefficients:
#> 
#> ...
#> and 5 more lines.

# Specify a formula of a recipe. Remove the old one and specify one on the
# fly:
tflow %>%
 replace_recipe(~ recipe(mpg ~ cyl, data = .) %>% step_log(cyl, base = 10)) %>%
 fit()
#> ══ Tidyflow [trained] ══════════════════════════════════════════════════════════
#> Data: 32 rows x 11 columns
#> Split: None
#> Recipe: available
#> Resample: None
#> Grid: None
#> Model:
#> Linear Regression Model Specification (regression)
#> 
#> Computational engine: lm 
#> 
#> ══ Results ═════════════════════════════════════════════════════════════════════
#> 
#> 
#> Fitted model:
#> 
#> Call:
#> stats::lm(formula = ..y ~ ., data = data)
#> 
#> Coefficients:
#> 
#> ...
#> and 3 more lines.

# Note how the function argument can be either `.` or `.x`
tflow %>%
  replace_recipe(~ {
    .x %>%
      recipe(mpg ~ cyl + am) %>%
      step_log(cyl, base = 10) %>%
      step_mutate(am = factor(am)) %>%
      step_dummy(am)
  }) %>%
 fit()
#> ══ Tidyflow [trained] ══════════════════════════════════════════════════════════
#> Data: 32 rows x 11 columns
#> Split: None
#> Recipe: available
#> Resample: None
#> Grid: None
#> Model:
#> Linear Regression Model Specification (regression)
#> 
#> Computational engine: lm 
#> 
#> ══ Results ═════════════════════════════════════════════════════════════════════
#> 
#> 
#> Fitted model:
#> 
#> Call:
#> stats::lm(formula = ..y ~ ., data = data)
#> 
#> Coefficients:
#> 
#> ...
#> and 3 more lines.