EME 210
Data Analytics for Energy Systems

Correlation One-Line Test


Read It: Correlation One-Line Test

We can also conduct a one-line test for correlation using the stats.pearsonr function from the scipy.stats library. You can read more about the function in the documentation linked here. The video below demonstrates how to use it.

 Watch It: Video -  One Line Test Correlation (2:31 minutes)

Click here for a transcript.

Welcome back. In this short video, I'm going to talk about the optional one-liner for correlation. And so, when we went through all the hypothesis tests in lesson six, we then learned all of those one-liner tests; this will be the one-liner for correlation. And so, in particular, we're using this pearsonr function in scipy.stats, which can be found here. And we need to provide the x data, the y data, and the alternative that we want to use. So, I'm just going to save this as results, and say stats.pearsonr.

And I'm going to use the original data here. We don't want to use our simulated data because we shuffled it so much that we don't have the original anymore. So we need to go back to our merged data frame, which is, you know, the reason why we make copies. So, I give it the first variable and the second variable, and then the alternative is "greater".

So, we can run this. We don't get any results because I'm storing them somewhere, but then I can come in and say print the correlation coefficient. And this will be the zeroth term of our results vector. And then we can print the p-value.

And this is the first term, so remember that Python counts from zero onward. And so, we go zero and one. And so, here it's printing out the same correlation coefficient that we got above, but it's giving us a p-value that is much more specific than our p-value here, like we've seen before. And it tells us that we reject the null hypothesis in favor of the alternative: this correlation coefficient is statistically significantly greater than zero.

Credit: © Penn State is licensed under CC BY-NC-SA 4.0
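
The steps described in the video can be sketched as follows. This is a minimal illustration rather than the notebook from the video: the data frame, its column names (x and y), the random seed, and the strength of the relationship are all assumptions made up for this example, since the merged data frame from the video is not reproduced on this page. It also assumes a SciPy version recent enough for pearsonr to accept the alternative keyword.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Hypothetical data standing in for the merged data frame used in the video:
# y is built to be positively related to x, plus random noise.
rng = np.random.default_rng(seed=210)
x = rng.normal(size=500)
df = pd.DataFrame({"x": x, "y": 0.5 * x + rng.normal(size=500)})

# One-line test. Null hypothesis: the correlation is zero.
# Alternative hypothesis: the correlation is greater than zero.
results = stats.pearsonr(df["x"], df["y"], alternative="greater")

# Index the result as in the video: term 0 is the correlation
# coefficient and term 1 is the p-value.
r = results[0]
p_value = results[1]

print("Correlation Coefficient: ", r)
print("p-value: ", p_value)
```

Because the example data are generated with a positive relationship, the p-value should come out well below common significance levels, which mirrors the conclusion reached in the video.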

Try It: GOOGLE COLAB

  1. Click the Google Colab file used in the video here.
  2. Go to the Colab file and click "File" then "Save a copy in Drive". This will create a new Colab file that you can edit in your own Google Drive account.
  3. Once you have it saved in your Drive, try to edit the following code to implement the one-line correlation test. 

Note: You must be logged into your PSU Google Workspace in order to access the file.

# import the libraries used below
import numpy as np
import pandas as pd
from scipy import stats

# create some data
df = pd.DataFrame({'x': np.random.randint(0,100,1000),
                   'y': np.random.randint(0,200,1000)})

# run the test with alternative = greater
results = ...
print('Correlation Coefficient: ', ...)
print('p-value: ', ...)

Once you have implemented this code on your own, come back to this page to test your knowledge.


 Assess It: Check Your Knowledge

Knowledge Check