Multiple Linear Regression

Lecture 21

Dr. Elijah Meyer

NC State University
ST 295 - Spring 2025

Checklist

– Homework 4 (assigned today; due Monday at 11:59pm)

> This homework assignment will be shorter than the others 

– Quiz-8 (assigned Thursday; due Monday at 11:59pm)

Draft components: For Peer Review due on April 10th

Draft components

You have already received (or will receive by today) feedback on your project. Here are the next steps:

Draft components

– Choose a data set

– Respond to all issues and/or close them for that data set

– Update your about section

Draft components

In your report.qmd, you are going to start writing up your report!

– Introduction and data

– Exploratory data analysis (summary stats + graphs)

– Methodology (at least the proposed idea)

  > We will have covered Simple and Multiple Linear regression by the due date 
  > We will cover how to choose your "best" model + logistic regression 
  > Prepare material for the remainder of the semester will be posted today

Project website

Demo

Questions?

Last time

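library(tidyverse)   # read_csv()
library(tidymodels)  # linear_reg(), fit(), tidy()
library(kableExtra)  # kbl()

# Read the fish data
fish <- read_csv("data/fish.csv")

# Simple linear regression: weight explained by height
fish_hw_fit <- linear_reg() |> 
  fit(weight ~ height, data = fish)

# Coefficient table, rounded to 3 digits
fish_hw_tidy <- tidy(fish_hw_fit)
fish_hw_tidy |>
  kbl(digits = 3)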
term         estimate  std.error  statistic  p.value
(Intercept)  -288.415     33.953     -8.494        0
height         60.916      2.636     23.111        0

For each 1 cm increase in height, we estimate that weight increases by 60.916 grams, on average.

Mean response

The equation

\[ \widehat{\text{weight}} = -288.415 + 60.916 \times \text{height} \]
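
To see how the equation is used, here is a minimal sketch that plugs a height into it. The helper `predict_weight()` and the heights of 10 cm and 11 cm are hypothetical and chosen only for illustration; the coefficients are copied from the output above.

```r
# Coefficients copied from the fitted simple linear regression output
intercept <- -288.415
slope_height <- 60.916

# Hypothetical helper (not from the lecture): predicted mean weight in grams
predict_weight <- function(height_cm) {
  intercept + slope_height * height_cm
}

# Hypothetical fish heights, chosen only for illustration
predict_weight(10)                       # 320.745
predict_weight(11) - predict_weight(10)  # 60.916, the slope
```

The second line shows the slope interpretation directly: two fish that differ by 1 cm in height differ by 60.916 grams in predicted mean weight.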

Moving forward

More than 1 explanatory variable

Why?

– It allows for a better understanding of the relationship between your variables

– Helps control for confounding variables

What’s a confounding variable?

Confounding variable

A confounding variable is one that has a relationship with both x and y. Including it in our model is a way to “account for it” and get a better understanding of the relationship between x and y.
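
A small sketch can make this concrete. The data below are made up for illustration (they are not the fish data), and the names `x`, `y`, `z`, and `toy` are assumptions. Here `z` is related to both `x` and `y`, so leaving it out changes the estimated slope for `x`.

```r
# Made-up toy data (an assumption for illustration, not the fish data)
z <- rep(c(0, 1), each = 5)      # confounder: group indicator
x <- c(1:5, 4:8)                 # x tends to be larger when z = 1
y <- 2 * x - 10 * z              # y depends on both x and z, with no noise

toy <- data.frame(x, y, z)

# Ignoring z: the slope for x mixes the effect of x with the effect of z
slope_simple <- coef(lm(y ~ x, data = toy))[["x"]]

# Including z: the slope for x is estimated while holding z constant
slope_adjusted <- coef(lm(y ~ x + z, data = toy))[["x"]]

slope_simple    # about 0.235
slope_adjusted  # 2
```

Once `z` is in the model, the slope for `x` recovers the value of 2 that was used to build `y`.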

Simple vs Multiple Linear Regression

fish_hw_fit <- linear_reg() |> 
  fit(weight ~ height, data = fish)

tidy(fish_hw_fit) |>
  kable(digits = 3)
term         estimate  std.error  statistic  p.value
(Intercept)  -288.415     33.953     -8.494        0
height         60.916      2.636     23.111        0

Simple vs Multiple Linear Regression

Note: There are Bream and Roach species of fish in the data set.

fish_hw_fit <- linear_reg() |> 
  fit(weight ~ height + species, data = fish)

tidy(fish_hw_fit) |>
  kable(digits = 3)
term          estimate  std.error  statistic  p.value
(Intercept)   -828.233     69.685    -11.885        0
height          95.241      4.544     20.958        0
speciesRoach   342.665     41.833      8.191        0
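
One way to read this output is to write it as an equation. This sketch assumes R's default indicator coding, where Roach equals 1 for a Roach and 0 for a Bream (the baseline species):

\[ \widehat{\text{weight}} = -828.233 + 95.241 \times \text{height} + 342.665 \times \text{Roach} \]

Plugging in each species gives two lines with the same slope and different intercepts:

\[ \text{Bream: } \widehat{\text{weight}} = -828.233 + 95.241 \times \text{height} \]

\[ \text{Roach: } \widehat{\text{weight}} = (-828.233 + 342.665) + 95.241 \times \text{height} = -485.568 + 95.241 \times \text{height} \]

So, for fish of the same height, we estimate that a Roach weighs 342.665 grams more than a Bream, on average.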

AE

Additive model