EME 210
Data Analytics for Energy Systems


Significance



Read It: Significance

So far in this course, we have been using various data samples, most of which are quite large (i.e., they have a large value of n). However, in real-world applications, you might not have access to such large datasets. Imagine you are in a situation where it is too expensive or infeasible to collect more than a dozen data points. At that point, the sample size is likely not going to be considered "sufficiently large," and some of our tests will produce very different results. In the video below, we walk through an experiment to see how the p-value changes with increasing sample size, leading to different conclusions.

 Watch It: Video - Significance (10:47 minutes)

Click here for a transcript.

Hello, and welcome back to the lecture series. In this lesson we're going to talk about more of the traditional statistical side of hypothesis testing. So normally if you look at a statistics textbook it won't teach you how to conduct these randomization procedures. Instead, they'll focus on the statistical theory. So the first part of this lecture is actually having, we'll focus on significance.

So we're here in Google Colab. And so, in order to focus on significance we're actually going to look at a deck of cards. So we're going to use a proportion. So we've got our proportion, is the number of red cards divided by the total number of cards. And in theory, it should be 50-50. So a standard deck of cards is half red, half black. And then our alternative is that red's more than 50 percent. In effect, that we've got a deck that isn't fair. So, we're going to go ahead and get started with that. So I'm going to type in x equals np dot array. And this is how we're going to create an array before we get started. So, I have a deck of cards here. So I'll shuffle it. And I'm just going to draw the top card. And we're going to record whether or not it's red or black. So, first card is a five of diamonds. That is red. Second card, six of diamonds, also red. Six of hearts. Nine of diamonds. Need to have a comma there. Nine of hearts. And we'll do one more, four of diamonds.
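As a minimal sketch of the setup described above, the draws can be recorded as a NumPy array of colors (the entries here follow the six red draws in the video; the variable name x matches the video):

```python
import numpy as np

# Record each draw's color: 'r' for red, 'b' for black.
# These six entries match the six red cards drawn in the video.
x = np.array(['r', 'r', 'r', 'r', 'r', 'r'])
```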

So we'll go ahead and stop there. And we're going to use this data to look into how the results of significance testing can change based off of the data that we have here. So with that, we can go ahead and get started. All right, so we have this data here that I drew out of my deck of cards. And you know the first six draws we had were red data points. And so we can go ahead and develop our hypothesis test. This is a single proportion test, so we need to know the sample size, which is just the length of our x-array. We need to know how many iterations we're going to do. So that's capital N, and then we need to conduct the binomial test, where we specify our sample size, our null proportion, and the number of iterations. So we can run, this looks like it's still connecting, and I need to run my libraries here.
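The randomization setup described here can be sketched as follows (a sketch following the video's variable names n and N; the seed is added here only for reproducibility):

```python
import numpy as np

np.random.seed(0)  # for reproducibility (not in the video)

x = np.array(['r', 'r', 'r', 'r', 'r', 'r'])  # the six red draws

n = len(x)   # sample size (little n)
N = 10000    # number of simulated repetitions (capital N)

# Under the null hypothesis, each card drawn is red with probability 0.5.
# Each entry of `counts` is the number of reds in one simulated set of n draws.
counts = np.random.binomial(n, 0.5, N)
```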

So we've got NumPy, pandas, plotnine. And later in this lecture series we'll get into scipy dot stats. So run this; this sets up our binomial data set, and then we need to actually calculate our proportions. So we can say p-hat is just counts divided by the sample size. And our data p-hat is from the actual data set. So we have the length of x in which x equals r, and then we divide that by little n. So we've got our data here, and as we have done in previous lectures we can go ahead and visualize this. So we need to create a data frame. So we say pd dot DataFrame of just our sampling distribution of proportions, and I'll go ahead and rename these columns to just say p hat. And then we can do our ggplot. And so we're going to plot our p hat data frame and we're going to do a dotplot. So geom underscore dotplot. We have our AES where we just need to provide the x value, which is p hat. And we can set the dot size to something relatively small. And then we can add our vertical line where the x-intercept is set to data p hat, the color is set to blue, and the line type is set to dashed. So we can run that. And we can look at this, and we can already see that, you know, the majority of our data is certainly below our sample statistic, where 100 percent of our data came from the red draws. So we can already sort of guess that our p-value is going to be quite low. But let's see how low it actually is.

And so, we can say print the p-value. And this is just the length of the p hat data frame where the p hat column is greater than or equal to our sample proportion, divided by the number of iterations that we did. And so we can see this p-value is .021, where we therefore reject the null hypothesis in favor of the alternative that the proportion of reds is actually higher than 50 percent. Well, let's say, for example, that I only did this four times or three times. So we just drew three cards. We can go ahead and run this, and rerun all of these different things. We can see how our proportions are sort of spreading out. They're not becoming as normally distributed as we would expect. And then we run this, and suddenly our p-value is greater than 0.05, which means we fail to reject the null hypothesis that the deck is 50 percent red. So if I go back here, and, you know, let's say we continue to draw cards. I just drew a ten of diamonds, an eight of hearts, and then a queen of hearts. So now we have even more reds. We're up to nine reds drawn. Run through all of this. We can see it's becoming a little bit more normally distributed in the plot. And now our p-value is zero. One hundred percent of the simulated data is below our sample statistic. So we've got as low as we can go, which provides really strong evidence that this deck that I'm drawing from is not actually a fair deck. There are many more reds than we would get in a normal, fair deck. And if we continue to increase the number of draws, and they continue to be red, we'll continue to have this p-value drop to zero.

And so, this is really good evidence of how, you know, the significance of our hypothesis test can change depending on how large your data set is, but also where you set your significance level. And so, if we go back to when we had six red draws and run through these, our p-value is .016, which under a normal significance level we would reject the null hypothesis. But sometimes people will use 99 percent significance levels, which becomes 0.01. And in this case, if we followed that significance level, we would fail to reject the null hypothesis, just barely. And so, this is why, you know, determining that significance level is really important. And sticking to it, and acknowledging which level you're working at. But also making sure that you get a large enough sample size so that you feel confident in your final statistical conclusion.
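The sample-size effect discussed above can also be checked exactly: when every one of n draws comes up red, the one-sided p-value under a fair-deck null is simply 0.5**n (the transcript's simulated .016 approximates the exact 0.015625 for n = 6). A small illustrative sketch:

```python
# Exact one-sided p-value when all n draws are red under a fair deck:
# P(all n draws red | p = 0.5) = 0.5**n
for n in [3, 6, 9]:
    print(n, 0.5 ** n)

# n = 3 -> p = 0.125    : fail to reject at the 0.05 level
# n = 6 -> p = 0.015625 : reject at 0.05, but (just barely) not at 0.01
# n = 9 -> p ≈ 0.00195  : reject at either level
```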

Credit: © Penn State is licensed under CC BY-NC-SA 4.0

 Try It: DataCamp - Apply Your Coding Skills

  1. Use this link to a card drawing simulator.
  2. Draw some cards to create your own x variable with card colors.
  3. Then, edit the code below to calculate the p-value.
  4. Try drawing different amounts of cards and see how the p-value changes.