EME 210
Data Analytics for Energy Systems

Summary


By now, you may have observed a theme in developing randomization distributions for hypothesis testing. Without giving specific code, our hypothesis testing procedure generally goes like this:

  1. Formulate null and alternative hypotheses
  2. Calculate the sample statistic from original data
  3. Compute the randomization distribution
    1. Simulate a new sample under the assumption that the null hypothesis is true
    2. Calculate the sample statistic from the new, simulated sample
    3. Repeat many times and save the collection of simulated sample statistics (this collection is your randomization distribution)
  4. [Optional, but recommended] Visualize the randomization distribution and original sample statistic (histograms or dotplots are great for this)
  5. Find the p-value as the proportion of simulated sample statistics that are at least as extreme as the original sample statistic, in the direction(s) given by the alternative hypothesis
  6. State the conclusion of your hypothesis test, in the context of the original hypotheses and larger problem
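
As an illustration only, the steps above can be sketched in Python for a single-mean test. The data, the null value of 50, the number of simulations, and the 0.05 significance level are all made-up assumptions, and drawing each new sample by resampling the shifted data with replacement is one common way to carry out step 3, not necessarily the exact code used in this course.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seeded so the sketch is reproducible

# Hypothetical data: 25 made-up measurements (not from the course)
sample = rng.normal(loc=52.0, scale=5.0, size=25)

# Step 1: hypotheses (assumed for illustration): H_o: mu = 50, H_a: mu > 50
null_value = 50.0

# Step 2: sample statistic from the original data
observed_mean = sample.mean()

# Step 3: randomization distribution
# Shift the sample so its mean agrees with the null value, then
# (assumption) resample it with replacement to simulate new samples.
shifted = sample - observed_mean + null_value
n_sims = 5000
sim_means = np.empty(n_sims)
for i in range(n_sims):
    resample = rng.choice(shifted, size=shifted.size, replace=True)
    sim_means[i] = resample.mean()

# Step 4 (optional): plot a histogram of sim_means and mark observed_mean

# Step 5: right-tailed p-value, matching H_a: mu > null value
p_value = np.mean(sim_means >= observed_mean)

# Step 6: conclusion, using an assumed significance level of 0.05
alpha = 0.05
decision = "reject H_o" if p_value < alpha else "fail to reject H_o"
print(f"observed mean = {observed_mean:.2f}, p-value = {p_value:.4f}: {decision}")
```

For a left-tailed alternative the comparison in step 5 would flip to `<=`, and a two-tailed alternative would compare distances from the null value instead.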

The table below summarizes the various tests that we've covered so far, but bear in mind that as long as you can calculate a statistic (a single-valued summary) from your data and formulate hypotheses, you can adopt the framework above to perform a hypothesis test. This is a valuable feature of the randomization approach!
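
To illustrate that flexibility, here is a minimal sketch that plugs a statistic not listed in this lesson, a difference of medians, into the same framework. The function names `randomization_pvalue` and `diff_of_medians` and the two small data sets are hypothetical, and the test is assumed to be two-tailed with a null value of 0.

```python
import numpy as np

def randomization_pvalue(a, b, statistic, n_sims=5000, seed=0):
    """Two-tailed randomization test of H_o: no difference between groups.

    `statistic` is any function of two samples that returns one number.
    """
    rng = np.random.default_rng(seed)
    observed = statistic(a, b)
    pooled = np.concatenate([a, b])
    sims = np.empty(n_sims)
    for i in range(n_sims):
        # Reallocate the pooled observations between the two groups
        shuffled = rng.permutation(pooled)
        sims[i] = statistic(shuffled[:len(a)], shuffled[len(a):])
    # Proportion of simulated statistics at least as far from 0 as observed
    p_value = np.mean(np.abs(sims) >= abs(observed))
    return observed, p_value

def diff_of_medians(a, b):
    return np.median(a) - np.median(b)

# Made-up data for illustration only
group_a = np.array([12.1, 14.3, 11.8, 15.2, 13.7, 12.9])
group_b = np.array([10.4, 11.1, 12.0, 9.8, 10.9, 11.5])

observed, p_value = randomization_pvalue(group_a, group_b, diff_of_medians)
print(f"observed difference of medians = {observed:.2f}, p-value = {p_value:.4f}")
```

Swapping in a different statistic only requires passing a different function; the simulation loop stays the same.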

| Test | Hypotheses | Statistic | Randomization Procedure |
|---|---|---|---|
| Single Mean | $H_o: \mu = \text{null value}$; $H_a: \mu < \text{null value}$ or $\mu > \text{null value}$ or $\mu \neq \text{null value}$ | Sample mean, $\bar{x}$ | Shift sample so that mean agrees with null value |
| Single Proportion | $H_o: p = \text{null value}$; $H_a: p < \text{null value}$ or $p > \text{null value}$ or $p \neq \text{null value}$ | Sample proportion, $\hat{p}$ | Draw random samples from binomial distribution with $p = \text{null value}$ |
| Difference of Means | $H_o: \mu_A - \mu_B = \text{null value}$; $H_a: \mu_A - \mu_B < \text{null value}$ or $\mu_A - \mu_B > \text{null value}$ or $\mu_A - \mu_B \neq \text{null value}$ | Difference of sample means, $\bar{x}_A - \bar{x}_B$ | Reallocate observations between samples A and B |
| Mean of Differences (paired comparison) | $H_o: \mu_{A-B} = \text{null value}$; $H_a: \mu_{A-B} < \text{null value}$ or $\mu_{A-B} > \text{null value}$ or $\mu_{A-B} \neq \text{null value}$ | Mean of differences in samples, $\bar{x}_{A-B}$ | Reallocate between paired observations (multiply each paired difference by 1 or -1, chosen at random) |
| Difference of Proportions | $H_o: p_A - p_B = \text{null value}$; $H_a: p_A - p_B < \text{null value}$ or $p_A - p_B > \text{null value}$ or $p_A - p_B \neq \text{null value}$ | Difference of sample proportions, $\hat{p}_A - \hat{p}_B$ | Reallocate observations between samples A and B |
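
Two of the randomization procedures in the table above, the binomial draw for a single proportion and the random sign flip for paired differences, can be sketched as follows. This is a rough illustration, not the course's own code: the counts, the paired differences, and the null values (0.30 and 0) are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
n_sims = 5000

# --- Single proportion (hypothetical: 36 successes out of 100) ---
# Assumed hypotheses: H_o: p = 0.30, H_a: p > 0.30
n, successes, p_null = 100, 36, 0.30
p_hat = successes / n
# Each simulated sample is one binomial draw with p set to the null value
sim_counts = rng.binomial(n=n, p=p_null, size=n_sims)
sim_p_hats = sim_counts / n  # randomization distribution of p-hat
# Compare counts (integers) rather than proportions to avoid rounding issues
p_value_prop = np.mean(sim_counts >= successes)

# --- Mean of differences, paired (hypothetical A minus B differences) ---
# Assumed hypotheses: H_o: mu_{A-B} = 0, H_a: mu_{A-B} != 0
diffs = np.array([1.2, -0.4, 0.8, 2.1, 0.3, -0.6, 1.5, 0.9])
obs_mean_diff = diffs.mean()
# Multiply each paired difference by 1 or -1, chosen at random
signs = rng.choice([-1, 1], size=(n_sims, diffs.size))
sim_mean_diffs = (signs * diffs).mean(axis=1)
# Two-tailed: count simulated means at least as far from 0 as the observed one
p_value_paired = np.mean(np.abs(sim_mean_diffs) >= abs(obs_mean_diff))

print(f"single proportion: p-hat = {p_hat:.2f}, p-value = {p_value_prop:.4f}")
print(f"paired: mean difference = {obs_mean_diff:.3f}, p-value = {p_value_paired:.4f}")
```

The sign flip works for a null value of 0 because, if there is no systematic difference, each pair's A and B labels are interchangeable.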

A major challenge is figuring out which form of hypothesis test is appropriate for your given problem. The flowchart below can help, and reviewing problems in this course and textbook will help you build a sense of what test to perform.

[Figure: Hypothesis Testing Table]
Credit: © Penn State is licensed under CC BY-NC-SA 4.0


Reminder - Complete all of the Lesson 5 tasks!

You have reached the end of Lesson 5! Double-check the to-do list on the Lesson 5 Overview page to make sure you have completed all of the activities listed there before you begin Lesson 6.