EME 210
Data Analytics for Energy Systems

Uniform and Triangular Distributions

Read It: Uniform and Triangular Distributions

INSERT IMAGE: L29:Slide 7

Uniform distribution:

[00:19:53.60] We saw this with z. We actually saw this with the dice rolling. It's saying we're just going to draw any value between A and B with equal likelihood. Now with the dice rolling, we limited it to integers. So, it's got to be 1, 2, 3, 4, 5, or 6. It can't be 1.5 or 2.25 or something like that. But you could also have a continuous version of it, where we do not limit it to integers. We could say any value between 1 and 6. It could be 1.1111.
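The difference between the integer-only dice roll and the continuous version can be sketched with NumPy's random generator. This is only an illustration: the seed, sample size, and variable names are assumptions, not part of the lecture.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # arbitrary seed, used only for reproducibility

# Discrete uniform: a fair six-sided die. `high` is exclusive, so 7 yields 1 through 6.
dice_rolls = rng.integers(low=1, high=7, size=10_000)

# Continuous uniform: any real value in [1, 6), such as 1.1111 or 2.25.
continuous_draws = rng.uniform(low=1.0, high=6.0, size=10_000)

print(np.unique(dice_rolls))   # the distinct integer faces that were rolled
print(continuous_draws[:5])    # a few non-integer draws
```

Both calls give every allowed value the same chance; the only difference is whether the allowed values are the six integers or the whole interval.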

[00:20:31.13] But this is defined by just its lower and upper limits: say, lower bound A and upper bound B. It turns out the probability, or I should say the relative likelihood, of choosing any single value is 1 over the difference of those two, 1/(B - A). That's equal for all values between A and B. Another common one is the triangular distribution. The parameters it takes are now not only a lower and upper bound, A and B, but also a C that defines the peak.
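Written out, the relative likelihood of the continuous uniform distribution is a constant density between the two bounds. The notation f(x) is an assumption here; the lecture only names the bounds A and B.

```latex
f(x) =
\begin{cases}
\dfrac{1}{B - A}, & A \le x \le B \\[6pt]
0, & \text{otherwise}
\end{cases}
\qquad\text{so that}\qquad
\int_A^B \frac{1}{B - A}\,dx = 1
```

For the continuous version of the dice example, A = 1 and B = 6, so the density is 1/5 = 0.2 everywhere between 1 and 6.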

INSERT IMAGE: L29:Slide 8

Triangular Distribution:

[00:21:03.63] This is quite useful for cases like the GPA or exam scores where we know it's got to be truncated. There's got to be some bound. It can't go above 4 for a GPA, for example. It can't go below 0. But, like the normal distribution, it's got some values that are more likely than others. You might have a 3.16 GPA that's most likely. We want to make sure to capture that. So, it's kind of combining the best of both worlds of the normal and the uniform.
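A minimal sketch of the GPA example, assuming a lower bound of 0, an upper bound of 4, and a most likely value of 3.16 as mentioned above. The seed, sample size, and variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # arbitrary seed, used only for reproducibility

# Bounds and most likely value taken from the GPA example
lower, peak, upper = 0.0, 3.16, 4.0

# Draw simulated GPAs from a triangular distribution
gpas = rng.triangular(left=lower, mode=peak, right=upper, size=100_000)

# Every draw stays inside the bounds, unlike draws from a normal distribution
print(gpas.min(), gpas.max())

# The sample mean should be close to the theoretical mean (A + B + C) / 3
print(gpas.mean(), (lower + peak + upper) / 3)

# Share of draws below the peak; theory gives (C - A) / (B - A)
share_below_peak = np.mean(gpas <= peak)
print(share_below_peak, (peak - lower) / (upper - lower))
```

Because the peak sits close to the upper bound, most of the simulated GPAs fall below 3.16, but none can exceed 4 or drop below 0.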

[00:21:41.41] All right, so those are some typical ones. You can also go online: for the random package in numpy, there's a list of distributions that we can choose from. My main point here is not to overwhelm you; I just want to show you that there really are a lot of options here. So, we've got our triangular, we've got our uniform, and of course we've got our normal. We also have log normal. Let's say it this way: if some variable x is log normally distributed, then if you take the log of x, it would be normally distributed.
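The log-normal relationship can be checked numerically. In this sketch the parameters `mu` and `sigma` are assumed values chosen for illustration; in NumPy's `lognormal` they describe the mean and standard deviation of log(x), not of x itself.

```python
import numpy as np

rng = np.random.default_rng(seed=1)  # arbitrary seed, used only for reproducibility

mu, sigma = 1.0, 0.5  # illustrative mean and standard deviation of log(x)

# Draw log-normally distributed values
x = rng.lognormal(mean=mu, sigma=sigma, size=100_000)

# Taking the log should give back normally distributed values
log_x = np.log(x)

print(x.min() > 0)                 # log-normal draws are strictly positive
print(log_x.mean(), log_x.std())   # should be close to mu and sigma
```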

[00:22:35.93] And one beneficial feature of this is that it can't go less than 0, because you can't take the log of a negative number. But it has some mean to it, and it asymptotically approaches 0. [INAUDIBLE]. Anyway, you can explore these. But again, those most common ones are good starters. The Weibull distribution is a really good one. A lot of people say that wind speeds (any of you guys interested in wind power?), or wind velocities, I should say, follow a Weibull distribution, although I have a paper that shows that they don't really.
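As a rough sketch of the wind speed example, NumPy can draw from a Weibull distribution. The shape and scale values below are hypothetical, not fitted to any data, and as noted above a Weibull fit may not describe real wind speeds well. NumPy's `weibull` takes only a shape parameter, so the scale is applied by multiplication.

```python
import math
import numpy as np

rng = np.random.default_rng(seed=2)  # arbitrary seed, used only for reproducibility

shape_k = 2.0  # hypothetical shape parameter
scale_c = 8.0  # hypothetical scale parameter, in m/s

# Draw from the one-parameter Weibull, then rescale to the desired scale
wind_speeds = scale_c * rng.weibull(a=shape_k, size=100_000)

# Theoretical mean of a Weibull distribution: scale * Gamma(1 + 1/shape)
theoretical_mean = scale_c * math.gamma(1 + 1 / shape_k)

print(wind_speeds.min() >= 0)                 # simulated speeds are never negative
print(wind_speeds.mean(), theoretical_mean)   # sample mean vs. theoretical mean
```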


Try It: OPTION 1 GOOGLE COLAB

  1. Click the Google Colab file used in the video here.
  2. Go to the Colab file and click "File", then "Save a copy in Drive". This will create a new Colab file that you can edit in your own Google Drive account.
  3. Once you have it saved in your Drive, try to implement the following code to import a file of your choice by mounting your Google Drive:

Note: You must be logged into your PSU Google Workspace in order to access the file.

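from google.colab import drive
import pandas as pd

# Mount Google Drive so its files are visible under /content/drive
drive.mount('/content/drive')

# Replace 'yourfilename.csv' with your own file; adjust the path if the file is in a subfolder of your Drive
df = pd.read_csv('/content/drive/MyDrive/yourfilename.csv')

df  # display the DataFrame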

Once you have implemented this code on your own, come back to this page to test your knowledge.


Try It: OPTION 2 DataCamp - Apply Your Coding Skills

Dictionaries are a quick way to create a variable from scratch. However, their functionality is limited, so we will often want to convert those dictionaries into DataFrames. Try to code this conversion in the cell below. Hint: Make sure to import the Pandas library.

Starter code:

# Create a simple dictionary
mydict = {'Name': ['Amy', 'Bob', 'Clair', 'Daisy'],
          'Birthday': ['9/3/1991', '4/21/1988', '4/21/1990', '11/11/1989'],
          'Age': [31, 34, 32, 33]}

# Convert the dictionary to a DataFrame

# Print the values in the 'Birthday' column

Solution:

import pandas as pd

# Create a simple dictionary
mydict = {'Name': ['Amy', 'Bob', 'Clair', 'Daisy'],
          'Birthday': ['9/3/1991', '4/21/1988', '4/21/1990', '11/11/1989'],
          'Age': [31, 34, 32, 33]}

# Convert the dictionary to a DataFrame
mydataframe = pd.DataFrame(mydict)

# Print the values in the 'Birthday' column
mydataframe['Birthday']


Assess It: Check Your Knowledge

Knowledge Check (replace question)