Homework Assignment

HW 33 — Regression from an ML Perspective

📘 Related: Lesson 33 🛠 MATLAB required 📁 Dataset: physics_grades.mat

📖 Background

For this assignment, you will approach curve fitting from a machine learning perspective. Your task will be to build a model to predict the final grades for Physics 215 students based on their Physics 110 grade.

In particular, your friend received a 3.5 (out of 4.0) in Physics 110 and is asking you to help determine if they should take Physics 215. They heard that people typically get a lower grade in 215 than they do in 110 and will not take the course if they anticipate getting less than a B. Should your friend take 215?

For reference, the mapping between letter grade and numerical grade is:

Letter    Numerical
A         4.00
A−        3.67
B+        3.34
B         3.01
B−        2.68
C+        2.35
C         2.02
C−        1.69
D         1.36
F         1.03
Reference: See the Lesson 33 notes — the loss function formulation, regularization term, and design matrix notation used in this assignment are all covered there.
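
As a rough sketch of what such a loss might look like (the Lesson 33 notes remain the authoritative formulation), a quadratic model with an L2, ridge-style penalty could be written as shown here. The sum-of-squares form, the choice to penalize all three coefficients, and the symbols \(n\), \(x_i\), and \(y_i\) are assumptions made for illustration.

\[
L(\beta) = \sum_{i=1}^{n} \left( y_i - \beta_0 - \beta_1 x_i - \beta_2 x_i^2 \right)^2 + \lambda \sum_{j=0}^{2} \beta_j^2
\]

Here \(x_i\) and \(y_i\) are the Physics 110 and Physics 215 grades of student \(i\) in the training set, and \(\lambda \ge 0\) sets the strength of the penalty. Setting \(\lambda = 0\) recovers an unregularized least-squares loss.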

Tasks

Dataset: physics_grades.mat is available on Microsoft Teams under the Supplemental folder.
  1. Load and split the data. The MATLAB file physics_grades.mat contains 389 individuals' grades for both Physics 110 (first column) and Physics 215 (second column). Load this data and split it into a set for training your model and a set for testing.
  2. Set up the loss function. Using a quadratic polynomial as your model, write the regularized training loss as an anonymous function of your model parameters. The function should take a column vector of model parameters \([\beta_0;\ \beta_1;\ \beta_2]\) and return a scalar loss value. The \(x\) values are the Physics 110 grades and the \(y\) values are the Physics 215 grades, both taken from the training set. You may choose the regularization weight \(\lambda\) as you see fit.

    Also set up a similar loss function to measure the error on the test data. This testing loss function should not include the regularization term, and the \(x, y\) data should be the grades in the testing set.
  3. Train the model. Train your model using MATLAB's fminsearch. Note that this function requires only a function handle (your anonymous loss function) and an initial guess for the parameters; it does not need the gradient. The output will be the model parameters that minimize the loss on the training data.
  4. Evaluate on test data. Measure the loss on your test data using your test loss function from Part 2.
  5. Plot the results. Create a scatter plot of the complete grade set and overlay your quadratic model on the same axes.
  6. Make a prediction. Using the learned model parameters, predict your friend's Physics 215 grade. Should they take the course?
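
The six tasks above can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not a reference solution: it uses synthetic placeholder data instead of physics_grades.mat (the variable name stored in that file is not given here), an 80/20 split, \(\lambda = 0.1\), a sum-of-squares loss with an L2 penalty on all three coefficients, and hypothetical variable names such as g110, g215, and betaHat. Check the Lesson 33 notes for the loss formulation your course expects.

```matlab
% Placeholder data standing in for physics_grades.mat (ASSUMPTION).
% With the real file, load it (e.g. S = load('physics_grades.mat')) and
% inspect fieldnames(S) to find the 389-by-2 grade matrix.
rng(0);                                    % reproducible placeholder data
n    = 389;                                % number of students per the assignment
g110 = 1 + 3*rand(n, 1);                   % synthetic Physics 110 grades in [1, 4]
g215 = min(4, max(1, g110 - 0.3 + 0.4*randn(n, 1)));  % synthetic 215 grades

% Task 1: random train/test split (80/20 is an arbitrary choice).
idx    = randperm(n);
nTrain = round(0.8 * n);
xTrain = g110(idx(1:nTrain));      yTrain = g215(idx(1:nTrain));
xTest  = g110(idx(nTrain+1:end));  yTest  = g215(idx(nTrain+1:end));

% Task 2: quadratic model and loss functions as anonymous functions.
% beta is a column vector [beta0; beta1; beta2].
model     = @(beta, x) beta(1) + beta(2)*x + beta(3)*x.^2;
lambda    = 0.1;                           % regularization weight (arbitrary)
trainLoss = @(beta) sum((yTrain - model(beta, xTrain)).^2) ...
                    + lambda * sum(beta.^2);   % L2 penalty (assumed form)
testLoss  = @(beta) sum((yTest - model(beta, xTest)).^2);  % no penalty term

% Task 3: minimize the training loss; fminsearch needs no gradient.
beta0   = [0; 1; 0];                       % initial guess: the line y = x
opts    = optimset('MaxFunEvals', 5000, 'MaxIter', 5000);
betaHat = fminsearch(trainLoss, beta0, opts);

% Task 4: evaluate the fitted parameters on the held-out data.
fprintf('Test loss: %.4f\n', testLoss(betaHat));

% Task 5: scatter plot of all grades with the fitted curve overlaid.
xGrid = linspace(min(g110), max(g110), 200)';
figure;
scatter(g110, g215, 10, 'filled');
hold on;
plot(xGrid, model(betaHat, xGrid), 'r-', 'LineWidth', 2);
xlabel('Physics 110 grade');
ylabel('Physics 215 grade');
legend('Students', 'Quadratic fit', 'Location', 'northwest');
hold off;

% Task 6: prediction for a 3.5 in Physics 110, compared with a B (3.01).
pred = model(betaHat, 3.5);
fprintf('Predicted Physics 215 grade: %.2f\n', pred);
if pred >= 3.01
    disp('Prediction is at or above a B.');
else
    disp('Prediction is below a B.');
end
```

Because the data here is synthetic, the printed numbers say nothing about the real course; rerun the same steps on the actual dataset before advising your friend. It is also worth trying a few values of lambda and more than one random split to see how much the prediction moves.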