EME 210
Data Analytics for Energy Systems

Loss Functions, Training Diagnostics, and Machine Learning Systems


Read It: Loss Functions, Training Diagnostics, and Machine Learning Systems

Loss Functions

ADD IMAGE: L27: Slide 10

[00:47:36.51] Just real quickly, some loss functions that we can look at, again in terms of the language that's used in the machine learning tools we'll go over in the code. Mean squared error: you take the difference between your labels and what you predict, square it, sum it, and divide by n. Then mean absolute error: instead of taking the square, you look at the absolute value. But [INAUDIBLE].

[00:48:00.91] Mean absolute percentage error: you do the same thing, looking at the absolute value of the difference, but you divide by the true value. So, it's all in reference to what the actual label was: what's the percentage difference? And then there's mean squared logarithmic error; the function for it is shown there.
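The four loss functions above can be written out directly. Here is a minimal sketch in NumPy; the sample labels and predictions are made up for illustration, and library implementations may differ in details (for example, the log offset used in MSLE):

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    # squared differences, summed and divided by n
    return np.mean((y_true - y_pred) ** 2)

def mean_absolute_error(y_true, y_pred):
    # absolute value of the differences instead of the square
    return np.mean(np.abs(y_true - y_pred))

def mean_absolute_percentage_error(y_true, y_pred):
    # absolute difference relative to the true label, as a percentage
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

def mean_squared_logarithmic_error(y_true, y_pred):
    # squared difference of log(1 + value); assumes non-negative values
    return np.mean((np.log1p(y_true) - np.log1p(y_pred)) ** 2)

y_true = np.array([2.0, 4.0, 8.0])   # illustrative labels
y_pred = np.array([3.0, 4.0, 6.0])   # illustrative predictions
print(mean_squared_error(y_true, y_pred))             # (1 + 0 + 4) / 3
print(mean_absolute_error(y_true, y_pred))            # (1 + 0 + 2) / 3
print(mean_absolute_percentage_error(y_true, y_pred)) # mean of 50%, 0%, 25%
```

These names mirror the conventions you'll see in the machine learning tools covered in the code.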

Training Diagnostics

ADD IMAGE: L27: Slide 11

[00:48:25.06] All right, so one final word. So, I've talked about the architecture, the choices that you can make in setting this thing up. I've talked about how that fitting procedure works with the backpropagation. Let's say you execute all that, and you're looking at error over the epochs. So that's what we're looking at in these example plots here.  

[00:48:48.88] We're looking at error, or our loss function, whatever it may be. It could be mean squared error, it could be mean absolute error, any of the ones we just went over. And we're looking at how that behaves over our epochs, a.k.a. our iterations. It starts out really high because we've started with a really bad guess for the w's, but as we get better and better guesses for the w's, the error decreases.

[00:49:09.80] What do the colors represent? Well, the red is the error for our training set, the data we're fitting to, and the blue is the error for our validation set, a.k.a. our test data set. Ideally, what we want to see is that both of these decrease, and in terms of our choice of how many epochs we run, we can terminate when they've more or less leveled off.

[00:49:40.85] When we stop seeing appreciable decreases, we can say, OK, it's good enough. It's a case of diminishing returns if we continue on: we're going to be waiting, and we're just going to get a really minute improvement. So that's ideally what we want to see.
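The stopping rule described above, terminating once the decreases stop being appreciable, can be sketched as a simple loop over the recorded losses. The tolerance and patience values below are illustrative choices, not values from the lecture:

```python
def stop_epoch(losses, tol=1e-3, patience=3):
    """Return the first epoch after which `patience` consecutive
    improvements are each smaller than `tol` (i.e., the curve has
    leveled off), or the last epoch if that never happens."""
    flat = 0
    for epoch in range(1, len(losses)):
        improvement = losses[epoch - 1] - losses[epoch]
        if improvement < tol:
            flat += 1
            if flat >= patience:
                return epoch
        else:
            flat = 0  # an appreciable decrease resets the counter
    return len(losses) - 1

# A loss curve that drops quickly, then levels off
curve = [10.0, 4.0, 2.0, 1.2, 1.1999, 1.1998, 1.1997, 1.1996]
print(stop_epoch(curve))  # stops at epoch 6, after three tiny improvements
```

Machine learning libraries typically offer this as a built-in "early stopping" option with similar tolerance and patience parameters.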

[00:49:58.70] In terms of the overfitting that I talked about before, we can identify it by looking at this kind of diagnostic plot. Again, overfitting is where we have so many nodes that we're essentially describing each and every data point we've given the model, and it does a terrible job at predicting because it's tied too tightly to our existing data. How that shows up is an increase in our validation error.

[00:50:27.63] We're hugging our existing data so tightly that we're doing a terrible job at predicting. How does that show up? Our prediction error starts to go up. So, the best place to stop here would be right at this point, but that still gives us a worse model than what we would have gotten above.

[00:50:45.19] Really, if you start seeing this overfitting, you should go back and either redesign your architecture so that you have fewer nodes, or get more data, which is probably harder to do. But if you can, that's the best thing to do. One more point: more often than not, you won't see something so smooth. Well, maybe I shouldn't say more often than not, but sometimes you won't see something that behaves so smoothly.

[00:51:12.73] Sometimes you'll get some erratic patterns in these. They might jump all over the place. Or they might be smooth, and then you get a sudden jump up. And then it starts going down again, as we see here.  

[00:51:24.09] And so the question is, well, should we stop here? This started to go back up. We started to get this overfitting. Really, the answer is just to push on a little bit further. Do more iterations until we get as low as we possibly can and stop here.  
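The advice above, push on past a temporary uptick and keep the best result, amounts to taking the epoch with the lowest validation error over the whole run rather than stopping at the first increase. A minimal sketch, using a made-up noisy error curve:

```python
# Illustrative validation error: a temporary uptick at epoch 3,
# then further improvement before the curve bottoms out.
val_error = [5.0, 3.0, 2.1, 2.4, 2.2, 1.8, 1.6, 1.7, 1.9]

# Stopping at the first increase (epoch 3) would have been premature;
# the best model is the one with the lowest validation error overall.
best_epoch = min(range(len(val_error)), key=lambda e: val_error[e])
print(best_epoch, val_error[best_epoch])  # epoch 6, error 1.6
```

In practice, this is why training loops often save a checkpoint of the weights at the best-so-far validation error, so the model from that epoch can be restored afterward.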

[00:51:40.43] Make sense? And so here I've just defined overfitting. Again, the model can't generalize to any new data because we've got so many terms that, essentially, one of them is describing each and every data point.

Machine Learning Systems

ADD IMAGE: L27: Slide 12

[00:51:57.30] I'll leave you with this to think about. So, this is sort of a cynical view on machine learning. This is your machine learning system? Yep, you pour the data into this big pile of linear algebra, which is essentially what's being done here, and collect answers on the other side.  

[00:52:10.44] And he says, well, what if they're wrong? Well, you just stir this up and adjust your w's until it's right. That's kind of a very cynical outlook on machine learning. It does have some truth to it.  

[00:52:21.06] It is kind of a very brute force and naive approach to doing regression. But it tends to work really well in lots of settings. It's very adaptable. So, there's some use to it.  

[00:52:34.72] There's a famous quote by George Box that says, all models are wrong, but some are useful, which is absolutely true. No model is going to perfectly describe what's happening in nature, but we can make some good use out of some models.  


Try It: OPTION 1 GOOGLE COLAB

  1. Click the Google Colab file used in the video here.
  2. Go to the Colab file and click "File," then "Save a copy in Drive." This will create a new Colab file that you can edit in your own Google Drive account.
  3. Once you have it saved in your Drive, try to implement the following code to import a file of your choice by mounting your Google Drive:

Note: You must be logged into your PSU Google Workspace in order to access the file.

from google.colab import drive

drive.mount('/content/drive')

import pandas as pd

# after mounting, files in your Drive are found under /content/drive/MyDrive/
df = pd.read_csv('/content/drive/MyDrive/yourfilename.csv')

df # print the dataframe

Once you have implemented this code on your own, come back to this page to test your knowledge.


OPTION 2 : DATACAMP

 Try It: OPTION 2 DataCamp - Apply Your Coding Skills

Dictionaries are a quick way to create a variable from scratch. However, their functionality is limited, so we will often want to convert those dictionaries into DataFrames. Try to code this conversion in the cell below. Hint: Make sure to import the Pandas library.

Starter code:

# This will get executed each time the exercise gets initialized.
# Create a simple dictionary
mydict = {'Name': ['Amy', 'Bob', 'Clair', 'Daisy'],
          'Birthday': ['9/3/1991', '4/21/1988', '4/21/1990', '11/11/1989'],
          'Age': [31, 34, 32, 33]}

# convert the dictionary to a DataFrame

# print the values in the 'Birthday' column

Solution:

# Create a simple dictionary
mydict = {'Name': ['Amy', 'Bob', 'Clair', 'Daisy'],
          'Birthday': ['9/3/1991', '4/21/1988', '4/21/1990', '11/11/1989'],
          'Age': [31, 34, 32, 33]}

# convert the dictionary to a DataFrame
import pandas as pd
mydataframe = pd.DataFrame(mydict)

# print the values in the 'Birthday' column
mydataframe['Birthday']


Assess It: Check Your Knowledge

Knowledge Check