Add formula terms to a tidyflow

plug_formula() specifies the terms of the model using a formula.
drop_formula() removes the formula, along with any downstream objects created when the formula is used for preprocessing, such as terms. If the model has already been fit, the fit is removed as well.
replace_formula() removes the existing formula and replaces it with the new one. Any model that was already fit with the previous formula must be refit.

plug_formula(x, formula, ..., blueprint = NULL)

drop_formula(x)

replace_formula(x, formula, ..., blueprint = NULL)

Arguments

x: A tidyflow
formula: A formula specifying the terms of the model. Avoid preprocessing inside the formula; use a recipe instead if preprocessing is required.
...: Not used.
blueprint: A hardhat blueprint used for fine tuning the preprocessing. If NULL, hardhat::default_formula_blueprint() is used.

Value

The tidyflow x, updated with either a new or removed formula preprocessor.

Details

To fit a tidyflow, one of plug_formula() or plug_recipe() must be specified, but not both.

By default, tidyflow lets workflows decide which factor/character transformation to apply: leave factors as they are, transform them to N-1 dummies, or use one-hot encoding with N dummy columns. The transformation depends on the specific model supplied in plug_model(). See add_formula for more details on how transformations are handled.

However, plug_formula() lets you override the type of transformation through the blueprint. For example, by passing default_formula_blueprint(intercept = TRUE, indicators = "none") to the blueprint argument of plug_formula(), you can enforce that all factors/characters are left untransformed. You can also use default_formula_blueprint(intercept = TRUE, indicators = "traditional") or default_formula_blueprint(intercept = TRUE, indicators = "one_hot") to transform all factors/characters to N-1 dummies or N dummies, respectively.

For example, to transform all factors/characters to one-hot encoding, you can pass the blueprint to plug_formula:

# Blueprint that one-hot encodes every factor/character predictor
bp <- hardhat::default_formula_blueprint(intercept = TRUE, indicators = "one_hot")
iris %>%
  tidyflow(seed = 21315) %>%
  plug_formula(Sepal.Length ~ Species, blueprint = bp) %>%
  plug_model(parsnip::set_engine(parsnip::linear_reg(), "lm")) %>%
  fit()
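
To see what each indicators option does before plugging a blueprint into a tidyflow, the blueprint can be applied with hardhat directly. The following is a minimal sketch that uses only hardhat::mold(); the helper predictor_names() is a hypothetical name introduced for illustration, and the extra Sepal.Width predictor is added only so the formula contains a numeric term as well.

```r
library(hardhat)

# Hypothetical helper: return the predictor column names that hardhat
# produces for a given `indicators` setting.
predictor_names <- function(indicators) {
  bp <- default_formula_blueprint(intercept = TRUE, indicators = indicators)
  molded <- mold(Sepal.Length ~ Species + Sepal.Width, iris, blueprint = bp)
  names(molded$predictors)
}

# One entry per encoding; lapply() keeps the names of the input vector.
encodings <- lapply(
  c(none = "none", traditional = "traditional", one_hot = "one_hot"),
  predictor_names
)

encodings
```

With "none", Species should remain a single factor column; with "traditional" and "one_hot" it should expand into N-1 and N dummy columns, respectively.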

For custom transformations between types (for example, applying one-hot encoding to factors but not to characters), you can pass a recipe that includes a step_dummy step to plug_recipe(). See this vignette for more details.
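
As a rough sketch of the recipe route, the code below builds a recipe that one-hot encodes a single named factor with recipes::step_dummy(). The helper one_hot_species() is a hypothetical name, and the commented-out tidyflow lines assume that plug_recipe() accepts a function that takes the data and returns a recipe; check the plug_recipe() documentation before relying on that.

```r
library(recipes)

# Hypothetical helper: build the recipe from whatever data it receives.
# Only the named factor column (Species) is one-hot encoded.
one_hot_species <- function(data) {
  rec <- recipe(Sepal.Length ~ Species, data = data)
  step_dummy(rec, Species, one_hot = TRUE)
}

# Inspect the columns the recipe produces on the training data.
baked <- bake(prep(one_hot_species(iris)), new_data = NULL)
names(baked)

# Assumed usage inside a tidyflow (not run here):
# iris %>%
#   tidyflow(seed = 21315) %>%
#   plug_recipe(one_hot_species)
```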

Examples


# Just for the pipe: %>%
library(tibble)

tflow <-
  mtcars %>%
  tidyflow(seed = 652341) %>% 
  plug_formula(mpg ~ .)

tflow
#> ══ Tidyflow ════════════════════════════════════════════════════════════════════
#> Data: 32 rows x 11 columns
#> Split: None
#> Formula: mpg ~ .
#> Resample: None
#> Grid: None
#> Model: None

drop_formula(tflow)
#> ══ Tidyflow ════════════════════════════════════════════════════════════════════
#> Data: 32 rows x 11 columns
#> Split: None
#> Recipe/Formula: None
#> Resample: None
#> Grid: None
#> Model: None

replace_formula(tflow, mpg ~ disp)
#> ══ Tidyflow ════════════════════════════════════════════════════════════════════
#> Data: 32 rows x 11 columns
#> Split: None
#> Formula: mpg ~ disp
#> Resample: None
#> Grid: None
#> Model: None