Why does R drop attributes when subsetting?

PUBLISHED ON MAR 16, 2019 — R

I lost about an hour yesterday because R did something I found completely unpredictable: it dropped an attribute without a warning.

df <- data.frame(x = rep(c(1, 2), 20))

attr(df$x, "label") <- "This is clearly a label"

df$x
##  [1] 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1
## [36] 2 1 2 1 2
## attr(,"label")
## [1] "This is clearly a label"

The label is clearly there. To my surprise, if I subset this data frame, R drops the attribute.

new_df <- df[df$x == 2, , drop = FALSE]

new_df$x
##  [1] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2

It doesn’t matter whether I use bracket subsetting ([) or subset().

new_df <- subset(df, x == 2)

new_df$x
##  [1] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
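
As far as I can tell, both results come from the same place. subset() on a data frame ends up calling [ internally, and [ on a plain vector behaves the same way: the help page ?Extract says that subsetting (except by an empty index) drops all attributes except names, dim and dimnames. A minimal sketch of that, using a standalone vector x that is not part of the example above:

```r
x <- rep(c(1, 2), 20)
attr(x, "label") <- "This is clearly a label"

# Subsetting with a logical index drops the custom attribute
sub_x <- x[x == 2]
attributes(sub_x)

# An empty index returns the vector with its attributes intact
full_x <- x[]
attributes(full_x)
```

So the data frame method seems to simply inherit what [ does to each column.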

That’s not good: R is dropping attributes silently. For my specific purpose I ended up using dplyr::filter, which preserves the attribute in this case.

library(dplyr)

df %>%
  filter(x == 2) %>%
  pull(x)
##  [1] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
## attr(,"label")
## [1] "This is clearly a label"
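
If pulling in dplyr is not an option, one base R workaround is to copy the attribute back after subsetting. This is only a sketch: subset_keep_labels is a hypothetical helper name of mine, not an existing function, and it assumes the only attribute worth keeping is "label".

```r
df <- data.frame(x = rep(c(1, 2), 20))
attr(df$x, "label") <- "This is clearly a label"

# Hypothetical helper: subset rows with [, then restore each column's "label"
subset_keep_labels <- function(data, rows) {
  out <- data[rows, , drop = FALSE]
  for (col in names(out)) {
    # Assigning NULL is harmless when the original column has no label
    attr(out[[col]], "label") <- attr(data[[col]], "label", exact = TRUE)
  }
  out
}

new_df <- subset_keep_labels(df, df$x == 2)
attr(new_df$x, "label")
```

Other attributes would still be lost, so a more general version would need to loop over attributes() and skip names, dim and dimnames.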