EME 210
Data Analytics for Energy Systems

Standard Deviation and Standard Error


  Read It: Standard Deviation and Standard Error 

When working with sampling distributions, we distinguish the standard deviation from the standard error. Both are calculated the same way (i.e., with the usual standard deviation formula); the difference is the type of data they are applied to. The standard deviation is applied to the actual data (a sample or the population) and measures how much the individual observations vary around their mean. The standard error is applied to the sampling distribution and measures how much a sample statistic, such as the sample mean, varies from sample to sample, and so how far a sample estimate typically falls from the true population parameter. The video below shows how to calculate each measure in Python.
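As a minimal sketch of this distinction, the NumPy snippet below applies the same standard deviation formula to two different inputs: one sample of raw observations, and a collection of sample means. The synthetic population, the sample size, the number of samples, and all variable names are assumptions made for illustration; they are not taken from the course data.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical population: 10,000 synthetic values (illustrative only).
population = rng.normal(loc=50.0, scale=10.0, size=10_000)

n = 40             # observations per sample (assumed)
num_samples = 500  # number of repeated samples (assumed)

# Draw repeated samples; each row is one sample of n observations.
samples = rng.choice(population, size=(num_samples, n), replace=True)

# Standard deviation: applied to actual data (here, the first sample).
sample_std = samples[0].std(ddof=1)

# Sampling distribution: one mean per sample.
sample_means = samples.mean(axis=1)

# Standard error: the same formula, applied to the sample means.
standard_error = sample_means.std(ddof=1)

print(f"standard deviation of one sample: {sample_std:.2f}")
print(f"standard error of the mean:       {standard_error:.2f}")
```

In a run like this, the standard error should come out noticeably smaller than the standard deviation of a single sample, because averaging smooths out the variation between individual observations.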

  Watch It: Video - Standard Deviation and Standard Error (03:26 minutes)

Transcript:

Welcome back to the videos where we're discussing sampling distributions. In this video, we're going to talk about the difference between standard deviation and standard error. And so, this is a really important distinction because, while they sound very similar, they measure two different things. Our standard deviation is essentially going to measure the variability of the sample or the population. Whereas, the standard error is going to measure the variability of the sample means. And what makes it more confusing is that we use the same function to calculate both. It's just what we call it becomes different. So this is an important distinction to keep in mind, but we'll go ahead and walk through how we can calculate each of these in Python.

So we're going to start with the standard deviation of each sample and the true population. So we're going to use our DF concat data, and we're going to say group by label. So similar to what we did to come up with the sample means, but in this case we're going to use the .std() function, for standard deviation, and then we say rename, and we'll give new column names. Outcome will become standard deviation, and we can go ahead and print that. So this is our standard deviation: we've got one for each sample, and we've got the true standard deviation. And so, recall that we had our sample means from above. We called them data means, and I'll just print them here for you.

And so, these were our sample means. If we want the standard error, it is the standard deviation of the sample means. So the sample standard error is then data means .std(). Same calculation, but then we can see here that this is our standard error. Whereas, these were our standard deviations. And so, the only difference is what you're using the command on. If you're getting standard error, it should be on a sampling distribution of some statistic. If you're wanting standard deviation, it should be on actual data, whether that's the sample that you collected or the population.

Credit: © Penn State is licensed under CC BY-NC-SA 4.0
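The workflow described in the video can be sketched in pandas roughly as follows. This is an illustrative reconstruction, not the course notebook: the data are synthetic, and the names df_concat, label, outcome, and data_means are assumed spellings of what is spoken in the transcript. Excluding the population row from the sample means is also an assumption.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=1)

# Hypothetical population (synthetic; stands in for the course data).
population = pd.DataFrame({
    "outcome": rng.normal(loc=100.0, scale=15.0, size=5_000),
    "label": "population",
})

# Draw a few samples and tag each with its own label.
samples = []
for i in range(1, 6):
    s = population.sample(n=50, random_state=i).copy()
    s["label"] = f"sample_{i}"
    samples.append(s)

# Stack the samples and the population into one frame.
df_concat = pd.concat(samples + [population], ignore_index=True)

# Standard deviation of actual data: one value per sample, plus the population.
data_std = (
    df_concat.groupby("label")[["outcome"]]
    .std()
    .rename(columns={"outcome": "standard_deviation"})
)
print(data_std)

# Sample means: one value per sample (population row excluded by assumption).
data_means = (
    df_concat[df_concat["label"] != "population"]
    .groupby("label")[["outcome"]]
    .mean()
)

# Standard error: the same .std() call, applied to the sample means.
standard_error = data_means["outcome"].std()
print(f"standard error: {standard_error:.3f}")
```

The two results come from the same .std() call; only the input differs. Grouped raw observations give standard deviations, while the column of sample means gives the standard error.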

  Assess It: Check Your Knowledge
