library(tidyverse)
library(tidymodels)
library(kableExtra)
fish <- read_csv("data/fish.csv")
Modeling fish
For this application exercise, we will work with data on fish. The dataset we will use, called `fish`, covers two common fish species in fish market sales.
The data dictionary is below:
variable | description |
---|---|
species | Species name of fish |
weight | Weight, in grams |
length_vertical | Vertical length, in cm |
length_diagonal | Diagonal length, in cm |
length_cross | Cross length, in cm |
height | Height, in cm |
width | Diagonal width, in cm |
Model fitting
- Demo: Fit a model to predict fish weights from their heights. Comment the code below.
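fish_hw_fit <- linear_reg() |>           # specify a linear regression model (default engine: lm)
  fit(weight ~ height, data = fish)      # estimate weight as a function of height
fish_hw_tidy <- tidy(fish_hw_fit)        # collect the coefficient estimates in a tibble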
Model summary
- Demo: Display the model summary. Next, show how you can extract these values from the model output. Hint: pull up the help file for `pull()`.
Why might we do this?
Who remembers inline code? Writing `r nrow(mtcars)` in the text will produce the number of rows in the mtcars data set (32)! This helps us avoid errors and improve reproducibility.
Now, write a sentence saying what the slope coefficient is, and report this using inline code! Hint: Feel free to create an R object to help you with this.
This sentence should have a backtick (`) followed by the letter r. It should then have an R object name that represents the slope coefficient … followed by another backtick to close the inline code.
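A minimal sketch of the extraction step described above. Because the real `fish_hw_tidy` depends on `data/fish.csv`, the tibble below is a hypothetical stand-in that holds only the `term` and `estimate` columns, filled with the rounded estimates (about -288 and 60.9) that match the `predict()` output in this document. The names `fish_hw_tidy_sketch` and `slope` are assumptions for illustration.

```r
library(dplyr)

# Hypothetical stand-in for tidy(fish_hw_fit): rounded estimates only.
fish_hw_tidy_sketch <- tibble(
  term     = c("(Intercept)", "height"),
  estimate = c(-288, 60.9)
)

# Keep the row for the slope, then pull the estimate out as a plain number.
slope <- fish_hw_tidy_sketch |>
  filter(term == "height") |>
  pull(estimate)

slope
```

With an object like `slope` in hand, the sentence could read: For each additional cm of height, we expect weight to be higher by `r slope` grams, on average.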
LaTeX
LaTeX is a typesetting system widely used in academia for communicating and publishing scientific documents and for technical note-taking in many fields, owing partly to its support for complex mathematical notation. Let’s demo how this works!
Demo: Write out your model using mathematical notation.
We do this by setting up a LaTeX chunk using two $ signs to start it and two more to end it. We then call for math text using backslashes, with the notation that we want. For example, if we want a hat on something, we use `\widehat{}` with what we want the hat on inside the braces. We can do similar things for beta.
We will pick up this skill as we continue working with it in class.
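As a minimal sketch of the notation described above, the fitted model could be written as follows. The symbols for the estimated intercept and slope follow a common convention and are an assumption here, not taken from the exercise.

```latex
$$
\widehat{weight} = \hat{\beta}_0 + \hat{\beta}_1 \times height
$$
```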
Take reproducibility up a notch
Let’s combine LaTeX and in-line code!
Below, create a LaTeX chunk that writes out our estimated equation, using `int` and `est`.
We combine the new topics above by creating a LaTeX chunk and embedding R inline code!
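One way this could look, as a sketch: `int` and `est` are assumed to hold the rounded intercept and slope (here typed in by hand to match the `predict()` output in this document; in practice they would be pulled from `fish_hw_tidy`). The comment shows the line as it might appear in the Quarto source, and the code builds the text the rendered equation would contain.

```r
# Assumed rounded estimates; in practice, extract these from fish_hw_tidy.
int <- -288   # estimated intercept
est <- 60.9   # estimated slope

# In the Quarto source, the equation would be written as:
#   $$\widehat{weight} = `r int` + `r est` \times height$$
# The line below builds the text that inline code would substitute in.
equation <- paste0("$$\\widehat{weight} = ", int, " + ", est, " \\times height$$")
cat(equation, "\n")
```

Because the numbers come from R objects rather than being typed into the equation, the displayed equation updates automatically if the data or model change.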
Predict “by hand”
R practice: Predict what the weight of a fish would be with a height of 10 cm, 15 cm, and 20 cm using this model “by hand”. Create an R object called x with the three height values, and plug it into an equation.
x <- c(10, 15, 20)
-288 + 60.9 * x
[1] 321.0 625.5 930.0
Predict using R
We don’t always have to do this by hand… we have pre-built functions to help us! To do this, we are going to use the function `predict()`. The syntax for `predict()` is as follows:
predict(model, data.frame(x = number or character))
Calculate your predictions above again, but now using the `predict()` function.
predict(fish_hw_fit, data.frame(height = c(10,15,20)))
# A tibble: 3 × 1
.pred
<dbl>
1 321.
2 625.
3 930.
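To see that `predict()` does the same arithmetic as the "by hand" approach, here is a small self-contained sketch. It swaps in base R's `lm()` and the built-in `mtcars` data as stand-ins (an assumption made so the code runs without `data/fish.csv` or tidymodels); the object names are illustrative.

```r
# Stand-in model: mpg as a function of wt, using built-in mtcars data.
fit <- lm(mpg ~ wt, data = mtcars)
b <- coef(fit)

new_wt <- c(2, 3, 4)

# "By hand": intercept + slope * x, using the unrounded coefficients.
by_hand <- unname(b["(Intercept)"] + b["wt"] * new_wt)

# Using predict(): the new data must use the same column name as the model.
via_predict <- unname(predict(fit, newdata = data.frame(wt = new_wt)))

all.equal(by_hand, via_predict)
```

Small differences between hand calculations and `predict()` usually come from rounding the coefficients before plugging them in.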
Interpret
Besides prediction, we can also interpret each coefficient to get a better sense of the relationship between our response and explanatory variable. Let’s interpret each coefficient below, and pay special attention to the wording:
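Intercept: For a fish with a height of 0 cm, we estimate the mean weight to be -288 grams. A negative weight is not physically possible, so the intercept is not meaningful in this context.
Slope: For each 1 cm increase in height, we estimate the mean weight to increase by 60.9 grams.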
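The two interpretations can be checked numerically with a short sketch. The rounded coefficients are assumed values that match the `predict()` output in this document, and `predict_weight()` is a hypothetical helper written for this illustration.

```r
# Assumed rounded estimates for the fitted line.
int <- -288
est <- 60.9

# Hypothetical helper: estimated mean weight (grams) at a given height (cm).
predict_weight <- function(height) int + est * height

# Slope: the change in estimated mean weight for a 1 cm increase in height.
diff_one_cm <- predict_weight(11) - predict_weight(10)

# Intercept: the estimated mean weight at a height of 0 cm.
at_zero <- predict_weight(0)
```

The difference between any two predictions one cm apart equals the slope, which is why the slope is read as a per-cm change.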