Evaluating Regression Models

Packages

Loading required package: airports
Loading required package: cherryblossom
Loading required package: usdata
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
Warning: package 'nycflights13' was built under R version 4.4.2
── Attaching packages ────────────────────────────────────── tidymodels 1.2.0 ──
✔ broom        1.0.6     ✔ rsample      1.2.1
✔ dials        1.3.0     ✔ tune         1.2.1
✔ infer        1.0.7     ✔ workflows    1.1.4
✔ modeldata    1.4.0     ✔ workflowsets 1.1.0
✔ parsnip      1.2.1     ✔ yardstick    1.3.1
✔ recipes      1.1.0     
── Conflicts ───────────────────────────────────────── tidymodels_conflicts() ──
✖ scales::discard() masks purrr::discard()
✖ dplyr::filter()   masks stats::filter()
✖ recipes::fixed()  masks stringr::fixed()
✖ dplyr::lag()      masks stats::lag()
✖ yardstick::spec() masks readr::spec()
✖ recipes::step()   masks stats::step()
• Learn how to get started at https://www.tidymodels.org/start/

Iris data

Up through this point, we have covered simple linear regression (SLR) and multiple linear regression (MLR). You may be asking yourself: how do we choose between them? There are many different ways we can justify a choice of model. In this activity, we are going to justify a model based on its predictive performance.

The iris data set comes pre-loaded in R. Pull up the help file for the iris data set (?iris) to see what it's all about. In this case, we are interested in sepal length, and will fit a variety of models to investigate it.

The goal of this activity is to choose a model that “performs the best” when predicting new observations.

Testing vs Training data

We are going to take 80 percent of the iris data set and use these data to train our models. We are then going to take the remaining 20 percent and use these data to test our models' predictive performance. Before we start, let's give the iris data set an id column. This will be useful for us in a bit!

iris_full <- iris |>
  mutate(id = 1:nrow(iris))

To take a random sample from our data set, we can use the sample_n() function. The syntax for this function is the number of rows to sample, followed by replace =. Because the sample is random, we call set.seed() first so that the split is reproducible. Demo: Write the code below to sample 80% of the rows from the iris_full data set. Name this R object train.

set.seed(123) # makes the random sample reproducible

train <- iris_full |>
  sample_n(0.8 * nrow(iris_full), replace = FALSE)


Now, let's create a test data set! That is, we need the remaining 20% of rows. There are a number of functions that can do this in R for us, but let's critically think about how we could accomplish this using the tools we have learned in class… Name the object test.

test <- iris_full |>
  anti_join(train, by = "id")
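As an aside (not required for this activity), the rsample package, which loads with tidymodels, can do this split in one step with initial_split(). A minimal sketch, assuming tidymodels is installed:

```r
library(tidymodels) # provides rsample

set.seed(123)                             # reproducible split
split   <- initial_split(iris, prop = 0.8) # 80% train / 20% test
train_rs <- training(split)               # 120 of iris's 150 rows
test_rs  <- testing(split)                # the remaining 30 rows
```

Note that no id column is needed here: rsample tracks which rows went where for us.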

Fitting models

Let's start by fitting a SLR model and getting predictions on the test set.

model1 <- linear_reg() |> # linear regression (lm engine by default)
  fit(Sepal.Length ~ Sepal.Width, data = train) # y ~ x

pred <- predict(model1, new_data = test) # tibble with a .pred column
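Since the activity asks us to compare a variety of models, an MLR candidate can be fit the same way. The sketch below rebuilds the split so it runs on its own, and the choice of Petal.Length as a second predictor is just an illustration, not part of the assignment:

```r
library(tidymodels)

# rebuild the train/test split from the earlier steps
set.seed(123)
iris_full <- iris |> mutate(id = 1:nrow(iris))
train <- iris_full |> sample_n(0.8 * nrow(iris_full), replace = FALSE)
test  <- iris_full |> anti_join(train, by = "id")

# MLR: two predictors instead of one
model2 <- linear_reg() |>
  fit(Sepal.Length ~ Sepal.Width + Petal.Length, data = train)

pred2 <- predict(model2, new_data = test) # tibble with a .pred column
```

The same MAE comparison below then tells us whether the extra predictor actually helps on the test set.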

Writing functions

I think this is a good opportunity for us to continue sharpening our coding skills. Let's learn how to write functions. Then, we are going to write a function that calculates Mean Absolute Error.

Let’s start off with an example, and comment through it.

exchange <- function(money){ #define the arguments (); notice {
  CAD <- money * 1.45 # notice money shows up here
  return(CAD)
}

exchange(5)
[1] 7.25

Now, let’s write our own function to calculate Mean Absolute Error. Let’s name this mean_abs_error

mean_abs_error <- function(obs, pred, n){
  
  merror <- (sum(abs(obs-pred)))*(1/n)
  return(merror)
}


mean_abs_error(obs = test$Sepal.Length, pred = pred, n = nrow(test))
[1] 0.7402839
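For reference, base R's mean() collapses the sum and the 1/n into one call, so our function can be checked against a one-liner. The vectors below are made up for illustration:

```r
# toy observed and predicted values (made up for this example)
obs  <- c(5.1, 4.9, 6.0)
pred <- c(5.0, 5.1, 5.5)

# mean(abs(obs - pred)) is the same as sum(abs(obs - pred)) / length(obs)
mae_by_hand <- mean(abs(obs - pred))
```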

Doing this in yardstick

We can create a data frame with the predictions and the observed data, then run it through the mae() function. Let's see if we get the same results.

data <- data.frame(pred, test$Sepal.Length)

mae(data, truth = test.Sepal.Length, estimate = .pred)
# A tibble: 1 × 3
  .metric .estimator .estimate
  <chr>   <chr>          <dbl>
1 mae     standard       0.740
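yardstick can also compute several metrics at once through metric_set(), which is handy when comparing candidate models. A sketch using toy vectors (not the iris predictions):

```r
library(yardstick)
library(dplyr)

# toy observed/predicted values, made up for illustration
results <- tibble(
  truth    = c(3.0, 5.0, 4.0),
  estimate = c(2.5, 5.5, 4.0)
)

# bundle MAE, RMSE, and R-squared into one callable metric set
my_metrics <- metric_set(mae, rmse, rsq)
my_metrics(results, truth = truth, estimate = estimate)
```

Each metric comes back as a row of the same .metric / .estimator / .estimate tibble we saw above.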