EME 210
Data Analytics for Energy Systems

Getting Started with Python and Google Colab


Using Google Colaboratory for this Course

The assignments for this course rely on Google Colaboratory (Colab) notebooks. Each homework assignment and exam will have portions that require you to write your own Python code, and this code needs to be submitted in a Colab notebook, through Canvas (the Learning Management System for the course). Furthermore, examples in subsequent lessons are given in Colab notebooks. For these reasons, it is important to become familiar with how Colab notebooks work. Both Google Colab and Canvas are accessed online, so there is nothing you need to install. However, you do need to access course materials and assignments with your PSU Google Workspace account, so that your submissions are properly attributed to you.

Preliminary Task: Make sure your PSU Google Workspace account is set up

  1. Go to https://google.psu.edu/
  2. Click Launch
  3. A Google sign-in screen should open. Log in with your PSU email (e.g., xyz123@psu.edu) and associated password. You may need to perform two-factor authentication.
  4. If you encounter a new window asking what sort of account to use, select Organization G Suite Account
  5. Otherwise, you should be at your Google Account home page, in which case your account is successfully set up.

Overview

The following videos demonstrate:

  1. How to access Google Colab through Canvas (for Assignments, including their submission)
  2. How to create your own Google Colab notebooks from scratch (in Google Drive)
  3. The basics of adding code to a Colab notebook
  4. How to add Text Cells to a Colab notebook
  5. What to do if your Colab session freezes or stalls
  6. The basics of functions in Python

1. How to access Google Colab through Canvas

This first video starts in Canvas and walks through accessing and submitting a mock assignment that uses Colab. For all assignments and exams, you are given a template Colab notebook to which you add your work. The video also points out how to attach supplemental work. For example, you may find it easier to write the solutions to textbook problems on paper and scan that work. The scan can be attached to your submission as a separate document alongside your Colab notebook. The video also shows how to view feedback and comments on your graded work.

Accessing Colab for Homework
Click here for a transcript of Accessing Colab for Homework.

Hi. In this video I'm going to show you how to access Colab for your homework assignments through Canvas and submit them. So, here we go. So, I'm starting here in Canvas under our course assignments, here. And just to illustrate how to access Google Colab, we're going to do this through this Python and Colab practice, which you all will have access to, and you can use to test out the environment. So, we'll click on that. Takes a minute for Google assignments to load, but once it does load, it gives this view. And we can see here if we scroll down that we have an ipynb file. This is a Python notebook, Google Colab Python notebook. And so, we can click on that to access the assignment. And so, here I can insert all my work to be submitted. There's other things that I can attach to the homework assignment too, and I'll show you that functionality in a minute here.

But just to illustrate this, so to start out, we've got just an empty, well semi-empty, Colab notebook here. And I've got some prompts to do the exercise. And in the next video, I'll show you how to add work to this, how to add cells containing your code, and text, and things like that. But for now, suppose we add everything. We're complete and we're happy with it. Then we just want to make sure to click save. It'll say all changes saved. We can close this out. Back in Google assignments, we'll still have this file here. And if we want to make sure that it's updated, we can click on it again to make sure it all works in there. But otherwise, if everything looks good, we can click open to attach and submit. We'll have the Python notebook file here.

We can add other files. So, for example, we can access files from Google Drive, or you can upload files. So, if you wanted to answer the textbook questions in a Word doc, you could upload that here. Furthermore, you could create, for example, a Google doc that contains your answers to questions. Some questions may have you sketch out a solution, say, for example, what our distribution might look like. And so, you could add your sketch here, either as a slide, or as a JPEG, or what have you. So, we might have some other files added on here. But at the very least, you do need to have your Python notebook with the coding work contained in it.

Once that's all good, you click submit. It says resubmit here because I've done this exercise already. And so, if you haven't submitted yet, it'll just be a submit button. You click on that, and you're done. And that'll automatically get pushed to Canvas where the instructors will grade it. It will give you feedback that will look like this. So, this is overall feedback on the whole assignment. I'm saying to myself, “Nice job, Dr. Morgan!” and then any rubric that's used will have the grades entered here. So, you'll see, you get full credit for this example. It'd be 0.5 out of 0.5, and 0.5 out of 0.5. Additionally, instructors can leave comments directly in your Python notebook file, and you'll see those on the right-hand side. Okay, so that covers it for accessing your Colab homework assignments and submitting.

Credit: © Penn State is licensed under CC BY-NC-SA 4.0

2. How to create your own Google Colab notebooks from scratch

For general-purpose use, outside of this course, you may find it helpful to develop your own Colab notebooks. For example, you may find Python a convenient tool for solving some problems in another course. A Colab notebook is an easy way to develop this Python code and display your results. The video shows how easy it is to make your own blank Colab notebook through Google Drive. The notebook, ending with the extension ".ipynb", is treated just like any other file in Google Drive.

Access Colab from Google Drive
Click here for a transcript of Access Colab from Google Drive.

Hello. Today we're going to be talking about how you import data into Google Colab. We're going to talk about three different methods over the course of three videos, each of which is slightly different, but it'll give you good practice on how you can import data and then going forward you can always choose whichever one you feel the most comfortable with. So, without further ado, let's go ahead and get started. 

Overview

So, over here in Google Colab we have some text cells that describe these different methods. So, the first one I'm going to go over is mounting your Google Drive, the second is what I call the drag and drop, and the third is using a special upload files button. Before we get started, just a reminder of some key terminology. We'll be working with libraries, and we'll be working with functions within those libraries, and we're going to give libraries nicknames. And we do all of this through using the “from,” “import,” and “as” commands.

Tutorial

All right, so in order to mount your Google Drive… first, in order for this to work, you will need to have all of your data stored on your personal Google Drive, but once it's there you can always access it through this process. So, to start, we're going to import the drive library from the Google Colab meta library. So, we say from google.colab, so this is our meta, or larger, library. We're going to import a sub-library called drive, and then in order to connect to our Google Drive, we use that library “drive” with the mount command, and we say, “slash content slash drive,” in quotes. And so, we click this, and it's going to go through a process. So, you need to say, “yes.” Connect. Use your Penn State email address. Say, “allow,” and it'll sort of go through its process. And occasionally, it takes longer or shorter, depending on the current resources available. But once it's done, you'll see this check mark.

It'll tell you that we mounted it and if you click this file folder over here, we now see that there is a drive folder. And so, this is where we start step three. We click the file folder icon. We click into the drive folder, and you navigate to wherever you stored your data. So, I'm going to use this lecture three retail sales. I'm going to click these three dots over here and copy the path, and then I'm going to close that. And that path is where we're going to actually access the data.

Before we can do that though, we need to import another library. So, we say import pandas as pd. So, here the library is called pandas, and we're giving it a nickname pd so that when we use these commands, we don't need to repeatedly type pandas over and over again, we can just type pd, which makes things a little faster. And so then to actually read the file in, we give it a name. So, I'll just say “df” for data frame and the command is, “pd dot read underscore csv.” And then you open some quotes. And here is where you actually paste that file path that you copied earlier using step three.

And sometimes this will be where you end. In this particular file, we need to add an additional argument. We say skiprows equals four. And we're doing this because this particular data set has four rows of metadata where it's telling us the units, and the source of the data, and their own internal processes they went through. And we don't need that, we want to start with row five, which is where the headers are. And this is something that you generally only learn by opening the actual file up in Excel to figure out how many rows you need to skip. But we can run that. We can see that it's run here. If we open up the variable tab over here, this X and curly bracket, our data frame now shows up as “df”; it tells us the type of data, and the shape. We can also come over here, print the data frame by just typing the name and hitting run, and then we can see that we've got different variables, column headers, and different pieces of data within that data frame. So that is the first option that we have in order to upload data into Google Colab.
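
The effect of skipping metadata rows, as described in the transcript above, can be sketched as follows. This is a minimal illustration under stated assumptions: the Drive-mounting lines only work inside Colab, so they appear as comments with a placeholder path, and the small in-memory data set (its column names and values) is invented purely to stand in for a file that has four rows of metadata above its headers.

```python
import io
import pandas as pd

# Inside Colab, the steps from the video would look like this:
#   from google.colab import drive
#   drive.mount('/content/drive')
#   df = pd.read_csv('<path copied from the file browser>', skiprows=4)

# Stand-in for a CSV file: four metadata rows, then headers, then data.
# All of this text is hypothetical and used only for illustration.
raw_text = (
    "Source: hypothetical retail sales export\n"
    "Units: millions of dollars\n"
    "Processing: seasonally adjusted\n"
    "Note: illustrative values only\n"
    "month,sales\n"
    "Jan,100\n"
    "Feb,120\n"
)

# skiprows=4 tells pandas to ignore the first four rows, so that the
# fifth row is read as the column headers.
df = pd.read_csv(io.StringIO(raw_text), skiprows=4)

print(df.shape)  # number of rows and columns
print(df)
```

Typing `df` alone on the last line of a Colab code cell would display the data frame as well, which is what the video does.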

Credit: © Penn State is licensed under CC BY-NC-SA 4.0

3. The basics of adding code to a Colab notebook

Now that you know how to access or create a Colab notebook, let's get to the fun stuff: coding! The following video demonstrates how to add a "Code Cell" to the notebook. In this Code Cell, you will enter your Python code, and execute the code by pressing the "Play" button on the left side of the cell. It is important to bear in mind, especially as we progress to more complicated coding examples and assignments, that:

  • The lines of Python code within a Code Cell are executed in order from top to bottom.
  • The Code Cells are executed in the order in which you press the "Play" buttons (unless you do "Run All", which will run the cells in order from top to bottom in your notebook).

Colab Python Basics
Click here for a transcript of Colab Python Basics.

Hi. In this video, we're going to go over some basics of using Google Colab. In particular, inserting coding cells and running those computations. So, without further ado, here we go. So, we're in this practice document here. And to do my work, to insert the code, all I have to do is hover over one chunk of code and click, plus code. Okay, and so right now I've got a code cell. And I can enter any code I want here. The prompt for this exercise has me print out a statement, hello world. And so, I can type out my code to do that. In order to make this happen, we'll use the print function, which I will go over many functions in this course, so this is the first that you're introduced to. The print function will just print whatever argument we give it in the subsequent parentheses to the screen. And so, in quotes, I'll put “hello world!” exclamation point and that's my instruction for the computation here.

In order to make this happen, however, I need to click the play button. Pretty straightforward. Note that when I do this, it takes a minute because it's connecting up here to Google servers. When we get this green check mark it's connected, and it can execute. Here it's executed, and it's printed to the screen, hello world. Let's do something mathematical. So, if we want to insert a code cell down here, again hit code, also I can go to insert, and choose code cell up here. It'll be another way to do it. And so, the prompt here is to display the result of any mathematical operation. And so, if I just do two plus two, if I treat this like a regular calculator and just do two plus two, it'll print out four. Note that I don't have to use print here because it just gives the default output straight to this. However, if I assign this output to a variable by saying variable x for example, equals the result of this operation, it does not get displayed to screen unless I resort to my print command. Note that I do not use quotes here because this is not, I don't want to print literally x, I want to print what is contained in the object x, and it's 4. If I had used quotes, it would print literally x to the screen. You see the difference there? Furthermore, some other basic operations are subtraction, using the minus sign, you get zero, of course. Division, you get one, multiplication, four, and exponentiation we also get four.
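
The steps narrated above can be sketched as a single code cell. This is a minimal illustration rather than the exact notebook from the video; the operands for the last four operations are assumed to be 2 and 2, which is consistent with the results the video reports, and the variable names other than x are chosen here for readability.

```python
# Print a text message; the quotes mark the text as a literal string.
print("hello world!")

# Assign the result of an operation to a variable; nothing is displayed yet.
x = 2 + 2

# Without quotes, print shows what x contains; with quotes, the letter itself.
print(x)    # displays 4
print("x")  # displays x

# Other basic operations (operands assumed to be 2 and 2).
difference = 2 - 2   # 0
quotient = 2 / 2     # 1.0, because / always returns a float in Python 3
product = 2 * 2      # 4
power = 2 ** 2       # 4
print(difference, quotient, product, power)
```

In a notebook, a bare expression such as `2 + 2` on the last line of a cell is displayed automatically, which is why the video does not need print for that case.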

One other thing to point out here in terms of basics is, we can see what variables we have stored by looking at our variable list here. So, we click on this variable list. It opens up a side menu, and we see we have variable x stored as an integer here. If I make another variable, when I say, y equals four times eight, I print y to the screen. Not only am I left with what's printed for x, but I also have the output from y here, and we get y as a new variable here. Note that when I click play here, it reruns this whole cell in order from top to bottom. So, it does x first, prints x, does y second, prints y. Furthermore, the order that we run the cells, the coding cells, is important. So, if I insert another coding cell here, and I do another operation, I say z is equal to x plus y, we print z. Note that this is going to require x and y to exist before I run the cell. So, I need to run this one first, so that x and y are contained in memory. That is, I have them in the variable list here, and then I can run the second cell that has z, working with x, and y. And now we have the z variable. So, pay attention to the order that you run cells in because if we restart runtime here, and wipe out what's in memory, if I now try to run this cell without having x and y, I get an error, “x is not defined.” And so, you know, I'd have to pay attention and say, well, I should have run this one first, so that I have x and y, and now I can do z.
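
The dependence on cell order described above can be imitated outside a notebook. The sketch below is only an analogy, not how Colab is implemented: each string stands in for one code cell, and a dictionary (named memory here, a hypothetical name) stands in for the variable list of a freshly restarted runtime.

```python
# Each string stands in for one code cell; the dictionary stands in for the
# runtime's memory (the variable list). Both are illustrative assumptions.
first_cell = "x = 2 + 2\ny = 4 * 8"
second_cell = "z = x + y"

memory = {}  # empty, as it is right after restarting the runtime

# Running the second cell first fails because x and y do not exist yet.
try:
    exec(second_cell, memory)
    error_message = None
except NameError as err:
    error_message = str(err)
print("Second cell first:", error_message)

# Running the cells in the intended order works.
exec(first_cell, memory)
exec(second_cell, memory)
print("z =", memory["z"])
```

In an actual notebook you would not use exec; you would simply run the cell that defines x and y before the cell that computes z.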

Okay, in subsequent videos we'll go over how to use the text cell and more about functions, but for now, that gives you a good basic overview of how to use the coding cells in Google Colab.

Credit: © Penn State is licensed under CC BY-NC-SA 4.0

4. How to add Text Cells to a Colab notebook

Text Cells are very handy for providing an explanation of your work. Besides describing the steps you perform in your neighboring code cell, or interpreting the results of your code, Text Cells can also be used in this course to give your solutions to the textbook problems (if you prefer). You can even insert images into a Text Cell, which may be handy for some assignment problems which ask you to draw or sketch something. The following video gives a brief demonstration of some of these key features of Text Cells, and you are encouraged to explore other functionality of the Text Cells.

Adding Text Cells
Click here for a transcript of Adding Text Cells.

Hi. In this video, I want to explain how to insert a text cell, and how to edit some of the text there. So, if we're going back to this example, we've already done some mathematical operations here. I might want to explain what I've done. And so, I can hover over, and I can add a text cell by clicking this button. And anything I type here will give me some formatted text. And I have different formatting options here. I can change this, designated as a heading, bold, italicize, etc. These are all operations that you may already be familiar with from something like Microsoft Word. And so, I can explain what I've done. I can say, here I've calculated z as x plus y.

On the right-hand side, it gives me a preview of what I've typed. When I move away from this, it just gives me the result, so my explanatory text here. And I can always go back and edit this. Again, if I want to stylize it a bit and italicize some of these mathematical variables, I can do that. I can insert a hyperlink. Something that might be useful for some of the homework problems in this course would be to insert an image. So, if I click on, insert image, it gives me a dialog and I can navigate to anywhere to upload an image here. This is just an example image. And it shows as a bunch of gobbledygook, not a whole lot, but if I again navigate away from the cell, let this scroll to the very top here. If I scroll back down, it does have just the image there. Okay?

I can always delete what I have by clicking this button here. I can add comments to it. I can also move cells around in the document by using these arrows. So, I can move this down below, here. But I don't really want to do that because this cell depends on x and y, as we've discussed before. And that's about it.

Credit: © Penn State is licensed under CC BY-NC-SA 4.0

5. What to do if your Colab session freezes or stalls

Sometimes, you may unfortunately find that the code in your Colab notebook refuses to run or appears to be running but is taking an exceptionally long time. Often, this is due to an interruption of the connection to Google's servers (upon which your code is running). The following video shows you how to reset this connection, through a few different approaches you can try. In summary, we recommend trying these in order until your code is working again:

  1. Go to Runtime menu --> Restart Runtime (or Restart and Run All, if you want to also execute all your Code Cells in order)
  2. Save your Colab notebook and restart your browser (quit and open again)
  3. Save your Colab notebook and restart your computer
  4. Save a copy of your Colab notebook, as a separate file, and continue your work in that. Make sure to then attach this new notebook to your assignment submission.

Runtime Troubleshooting
Click here for a transcript of Runtime Troubleshooting.

Hi. In this video, I wanted to discuss troubleshooting some runtime errors that you may encounter, but hopefully, you don't. So, normally, you should see this green check mark indicating that you're safely connected to Google's server and everything's running smoothly. If you don't see that, usually just running a code cell will connect you to that runtime server. However, you may encounter the situation where either the execution of a code cell is taking a very long time, longer than you would expect it to. Maybe it's just chugging on indefinitely, or the whole thing freezes up. And you may still have this green check mark, or it may turn into some other symbol. But if you do, if things are still taking too long, or it seems frozen, something that you can try to do is go to runtime and restart runtime here. Okay, you’ll get this dialogue. You click, yes, and it just says restarting, initializing. It goes through this process of reconnecting again. Basically, you're just refreshing that connection to the services that Google provides.

Alternatively, you can do restart and run all. If you do that, then it will restart and run all the code cells in order from top down. So again, pay attention to the order that you have code, and how they integrate with one another. Until it gets to the bottom, and everything will be refreshed in your Colab session. Okay? If that does not help, hopefully, you've saved your work, or hopefully, you still can save your work. Click save, and you might just need to shut down your browser, reopen it, and continue on with your work in a new refreshed browser.

If that still doesn't help, another thing you can try is saving this, restarting your browser, and restarting your whole computer, basically to refresh your computer's connection to the internet. Yet another thing you can try is, file, and not just save, but save a copy in Drive. This will create an alternate version of the work you're working on, this Colab notebook, and you can always do file, and locate in Drive, to see where this new version is. And here we see a copy of Eugene Morgan python code practice. So, that is this current document. And I can continue on with my work here. I can finish my homework assignment, save it, and then when I go to submit the assignment in Canvas, I just need to make sure to attach this code here, and sub that in there. Okay? So, those are several things you can try if you do encounter the unfortunate situation where your runtime environment has frozen up. Okay?

Credit: © Penn State is licensed under CC BY-NC-SA 4.0

6. The basics of functions in Python

A previous video introduced you to the print function but did not explain much about how you would know how to use that function. This video provides some more general background on functions in Python (you will be introduced to many throughout this course). The typical format of a Python function call is as follows:

function_name(argument_1, argument_2, ..., argument_n, 
    parameter_1 = default_value_1, parameter_2 = default_value_2, ..., 
    parameter_n = default_value_n)

Different functions may only take a certain number of arguments, or some can take any number (such as print). Often, "argument" and "parameter" are conflated and used interchangeably. Either term indicates information that is passed into the function.
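
As a concrete instance of the pattern above, the sketch below defines and calls a small function. The function describe_reading and its parameter names are hypothetical, invented only for this illustration; they are not part of Python or of the course materials.

```python
def describe_reading(value, label, units="kWh", decimals=1):
    """Return a short text description of a numeric reading."""
    return f"{label}: {value:.{decimals}f} {units}"

# Positional arguments only; units and decimals fall back to their defaults.
default_text = describe_reading(12.3, "Daily load")

# Same positional arguments, with both defaults overridden by name.
custom_text = describe_reading(12.3, "Daily load", units="MWh", decimals=2)

print(default_text)
print(custom_text)
```

Here value and label play the role of argument_1 and argument_2, while units and decimals play the role of parameter_1 and parameter_2 with their default values.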

Basics of Functions in Python
Click here for a transcript of Basics of Functions in Python.

In this video, let's talk a little bit more about functions. So, when I introduce this print function you might have asked yourselves, “Well how did Dr. Morgan know to put in, ‘hello world’ here in quotes, or x or y here? How do we know how this print function works?” Well, in a subsequent video we'll talk a bit more about debugging code and getting help with functions and various things in Python. But I want to point one thing out to you and discuss the basic usage of how functions work in Python.

So, to do that, let's insert a new code cell and let's continue on with print as an example, here. Suggesting print is the built-in function. If I do a parenthesis, what automatically pops up in Google Colab, and this is very nice, is how to use the function. So, first, I would have to know that some function called print exists. And that's a separate issue. In this class, we’ll introduce you to many functions to use for data processing, visualization, and analysis, and statistics, and things like that. So, we'll cover a lot of those. So if we know print exists, we want to use it, well, the next question is well how do we use it? And that's what these instructions are telling you here. Although initially, this might not make a whole lot of sense to you.

One key thing to focus on here is this line of example code right here. This is generally stating how to use the print function. So, we have the function, print, open parentheses, some value. This could be a text or a number, as you've seen in examples previous. Dot, dot, dot means we could have any number of additional values included here. And then we've got some strange things. We've got sep, end, file, all those business. What are those? Well, all these things separated by commas here are called arguments to the function. These are things that could be supplied to the function and so it needs values, those are required arguments, and then these are optional arguments. And what's being shown here are the default values for those. And we can read about them further down.

So file, a file-like object or a stream. Default is to do the current system standard out. This is basically just printing to the screen, is the default. We see that here, sys.stdout means print to screen for this file argument. But you could provide paths to other files if you wanted to write to those. sep, string or text inserted between values, the default is a space. And then, end, string. Again, string is referring to text, appended after the last value; default, a new line, and so on and so forth.

Okay, so how can we put these to use? Well, we could have one value, we could say our value of z is… and then z. Let's see what happens here. So, we're supplying two values now. Of course, I didn't need to run these again. There we go. Our value of z is 36. And I could change how these are separated. I could say sep equals, and I could put anything I want in here. If I put a colon, it changes that space to a colon, if I wanted a new line, it's dash n, or I should say backslash n, is 36, and so on and so forth. And so, I could tack on any number of things here. I could say z equals x plus y. Note that each instance is separated by a new line. [unintelligible] Or want to go back to just a space again, include that, and so on, and so forth. So, I think you get the idea here. Okay, so that's a little bit more detail of how functions are structured and how they operate in Python, and how to learn about what arguments to supply to specific functions once you know what those are.
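
The experiments with print described above can be sketched as follows. This is an approximation of what the video shows, not a copy of it: the values of x and y are taken from the earlier transcript, and the in-memory text buffer (named buffer here) is an assumption added to illustrate the file argument, which expects an open file-like object.

```python
import io

x = 2 + 2
y = 4 * 8
z = x + y

# Default behavior: values separated by a space, followed by a new line.
print("Our value of z is", z)

# sep changes what is inserted between the values.
print("Our value of z is", z, sep=":")
print("Our value of z is", z, sep="\n")

# end changes what is appended after the last value.
print("z = x + y", end=" -> ")
print(z)

# file redirects the output to any file-like object instead of the screen;
# an in-memory text buffer is used here so that nothing is written to disk.
buffer = io.StringIO()
print("Our value of z is", z, sep=": ", file=buffer)
captured = buffer.getvalue()
```

Because sep, end, and file are optional, they are always passed by name, after all of the values to be printed.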

Credit: © Penn State is licensed under CC BY-NC-SA 4.0

Summary

You now have the ability to get started using Google Colab notebooks and developing your own Python code. The Python and Colab Practice assignment in Canvas is a non-graded assignment that gives you the opportunity to practice accessing and submitting an assignment, so you can see for yourself how that works without fear of making a mistake that costs you points. You can also experiment with Python code and Text Cells in this practice assignment.

