Notes, content and exercises for the RECSM 2020 course Machine Learning for Social Scientists. These materials are intended to introduce social scientists to concepts in machine learning using traditional social science examples and datasets. For now, they are not intended to be a book but rather supporting material for the course. Perhaps they will evolve enough to become a book some day.

To be able to run everything in the material and the slides, make sure you can install the packages in the code chunk below. If you’re using Windows, be sure to have the latest version of R and Rtools installed before installing these packages. You can download Rtools from here. Read the instructions in detail to make sure it’s installed correctly.

all_pkgs <- c('devtools', 'tidymodels', 'ggplot2', 'baguette', 'rpart.plot', 'vip', 'plotly', 'dplyr', 'ggfortify', 'tidyflow', 'tidyr')
install.packages(all_pkgs, dependencies = TRUE)

Installing these packages can take a while (more than 30 minutes), so don’t worry if it seems slow.

Once that’s finished, make sure all of these are installed by running this:

setdiff(c(all_pkgs, "tidyflow"), row.names(installed.packages()))

The expected result is character(0), which means that none of the packages are missing.
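If the check returns package names instead of an empty result, it can help to see exactly which ones failed and retry only those. Below is a minimal sketch of that idea. The helper `missing_pkgs()` is a hypothetical name and not part of the course material; it only wraps the same `setdiff()` check shown above.

```r
# Hypothetical helper (not part of the course material): return the packages
# in `pkgs` that do not appear in `installed`. By default, `installed` is the
# set of packages currently installed in your R library.
missing_pkgs <- function(pkgs, installed = rownames(installed.packages())) {
  setdiff(pkgs, installed)
}

all_pkgs <- c('devtools', 'tidymodels', 'ggplot2', 'baguette', 'rpart.plot',
              'vip', 'plotly', 'dplyr', 'ggfortify', 'tidyflow', 'tidyr')

to_install <- missing_pkgs(all_pkgs)

if (length(to_install) == 0) {
  message("All packages are installed.")
} else {
  message("Still missing: ", paste(to_install, collapse = ", "))
  # Uncomment the next line to retry only the missing packages:
  # install.packages(to_install, dependencies = TRUE)
}
```

Retrying only the missing packages avoids repeating the full installation, which is useful given how long the first run can take.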

Slides for course:

Day 1 - Morning: Slides

Day 1 - Afternoon: Slides

Day 2 - Morning: Slides

Day 2 - Afternoon: Slides

Day 3 - Morning: Slides

Day 3 - Afternoon: Slides

Slides for ISEAK:

Thursday: Slides

Friday: Slides

Friday: Slides

Friday: Slides