Lesson 1: Introduction to Cloud and Server GIS

Overview

It is an exciting time for Cloud GIS. There has been a huge upswing in interest in geospatial data, and the computing infrastructure to support the visualization and analysis of this data has matured alongside it. While the infrastructure to support cloud GIS is now accessible to everyone, the exact forms it will take are still unknown.

The purpose of this course is twofold: first, to give you practice with using a variety of cloud GIS services, and second, to give you an understanding of what cloud computing is more broadly, and how it should and should not be applied in various GIS problem contexts.

You will be creating a variety of cloud GIS services during this course. Most of these will fall under the heading of server GIS. The platforms for this will include GIS infrastructure as a service, GIS platforms as a service, and GIS software as a service. These platforms will be defined in subsequent lessons, along with the five essential aspects of cloud computing.

By the time this course is over, you should feel comfortable setting up and using server GIS using cloud computing, and you should have an understanding of how cloud computing can help solve GIS problems.

(source: Flickr.com)

Lesson Overview

The first lesson begins with a discussion of the definition of cloud computing, followed by using Amazon Web Services (specifically EC2) to create your own cloud by starting and stopping a virtual machine. Finally, we have our first cloud computing discussion, on the overall topic of Cloud GIS.

Lesson Objectives

At the successful completion of this lesson you should be able to:

set up your Amazon account;
create an Amazon instance from an Amazon Machine Image, and start, stop, and log in to that instance.

Deliverables

Complete: L01: Assignment
Participate: L01 Discussion: Cloud Computing

What is Cloud Computing?

Cloud computing is a concept and a phrase that has become increasingly popular. However, there are a number of competing definitions for what "cloud computing" entails. One very effective definition of cloud computing, which consists of five essential characteristics, three service models, and four deployment models, comes from the National Institute of Standards and Technology (NIST). The final version of this definition [1] was published in October of 2011, and is available from NIST. During this course, we will explore how the essential characteristics and service models in particular can be used in a GIS context. Let's consider them now in turn.

Essential Characteristics

The five essential characteristics can be hard to recall at times. During one sleepless night, I came up with the mnemonic NO-REM as a way to remember them.

N stands for Network access. Cloud computing services can be accessed from a variety of networked devices, such as workstations, mobile phones, and other servers. A GIS example is a geospatial information service that allows access from browsers and from other servers.

O stands for On-demand self-service. Cloud computing services should be accessible at will, without having to consult and get permission from a human being. A GIS example is the ability to start multiple map servers by using a browser interface.

R stands for Resource pooling. Cloud computing resources such as processing power, storage, and input-output (to borrow the components of the von Neumann architecture [2]) are provisioned for different clients from a common set of physical assets. Clients need not know (and often cannot know) exactly where the physical assets are. A GIS example is sharing computers owned and administered by Esri, Amazon, or Microsoft, without knowing or caring how these computers are being provisioned (as long as they stay up!).

E stands for Elasticity. Cloud computing services can be scaled up and down to meet demand and decrease waste. A GIS example is processing a large spatial data set quickly using many cloud computers, which are then discarded when the task is done.

M stands for Measured service. Cloud computing services are paid for by resources used (such as processing power, storage capacity, or number of user accounts). A GIS example is paying for a map server only for the hours it is up and the bandwidth it uses, rather than a whole computer.

All of the GIS examples of the essential characteristics are ones that you will experience during this course.
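To make the Measured service idea concrete, here is a minimal sketch of how a pay-for-what-you-use bill might add up for a map server. The function name and all three rates are hypothetical placeholders chosen for illustration; they are not Amazon's actual prices.

```python
def estimate_monthly_cost(hours_up, gb_transferred, gb_stored,
                          hourly_rate=0.02, transfer_rate=0.09,
                          storage_rate=0.10):
    """Estimate a monthly bill in dollars.

    All rates are made-up placeholders, not real provider pricing.
    """
    compute = hours_up * hourly_rate            # charged per hour the server is up
    bandwidth = gb_transferred * transfer_rate  # charged per GB sent out
    storage = gb_stored * storage_rate          # charged per GB kept on disk
    return round(compute + bandwidth + storage, 2)


# A server left on all month (720 hours) versus one run only during work sessions.
always_on = estimate_monthly_cost(hours_up=720, gb_transferred=5, gb_stored=30)
part_time = estimate_monthly_cost(hours_up=40, gb_transferred=5, gb_stored=30)
print(always_on, part_time)
```

Under these assumed rates, the storage and bandwidth portions are identical in both cases; only the hours the server is up change the total.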

Service Models

The three service models have a definite order to them. Infrastructure as a Service (IaaS) is the most fundamental layer, followed by Platform as a Service (PaaS) and Software as a Service (SaaS). I was quite impressed with the diagram explaining IaaS, PaaS, and SaaS [3] at venturebeat.com, so I adapted it slightly and re-made it below:

Figure 1.1 Comparison of Service Models

Credit: Frank Hardisty (adapted from venturebeat.com)

As you can see, in the traditional, computer-under-your-desk model, you manage everything. Moving to a Cloud Infrastructure as a Service (IaaS) means that the von Neumann troika of IO, storage, and processors is managed by the provider, and you bring everything else. In a GIS context, this would mean that you rent computing power from a cloud provider, and use it to solve GIS problems. We'll use the Amazon Elastic Compute Cloud (EC2) running ArcGIS Server to study IaaS.

A Cloud Platform as a Service means that the vendor provides the physical layer, the operating system, and the runtime. You are still able to add your own data and write software that runs on the cloud platform. Facebook, from a developer's perspective, offers a PaaS because APIs (programming tools) are available to write programs that run in Facebook. Google App Engine is another example: Google gives you the computing power and you write the apps.

Software as a Service (SaaS) is a bit easier to understand. You just use it without having to install anything or write any code. Online email is a very good example of SaaS. So is Google Maps. Many GIS and mapping companies are offering mapping and spatial data processing through a SaaS model.

Next, we will get started on the leading IaaS: Amazon's EC2.

Introduction to Amazon EC2

The Amazon Elastic Compute Cloud (EC2) is an infrastructure as a service (IaaS) cloud. This means that it provides computing power and resources that you can use for a fee. You take care of running the software; Amazon EC2 provides the hardware.

To understand Amazon EC2, it’s important to understand the concept of virtualization. When you use your computer at home, it’s very likely that you have one physical “box” sitting on or below your desk, with a power button, disk drives, a video card, and so on. The relationship between the physical machine and the machine you log into is 1 to 1. Virtualization, however, is the idea of hosting multiple “virtual machines” on a single physical box. These virtual machines share some hardware resources, but they appear to the end user as distinct machines that can be logged into and administered separately.

You may have used virtual machines at your place of employment; many companies are using them in the workplace because they are more flexible and cost efficient. Most often, an IT administrator will purchase or choose a powerful machine and configure it to be a “virtual server”, which is a physical machine that hosts multiple virtual machines. Obviously, it takes a powerful computer to act as a virtual server, and it takes a fair amount of IT administration skill to set one up.

Enter Amazon EC2. When you work with Amazon EC2, you create and run virtual machines in Amazon’s data centers. You don’t have to know too much about the details of the virtual servers (nor does Amazon want to reveal this). The idea is that you can focus on the software on your server and let Amazon take care of the hardware needs.

Of course, there is a cost for using these resources. You are charged hourly fees for the computing power used, and for the amount of data that you store on Amazon EC2. Most of the things you can do or use on Amazon EC2 have some sort of fee associated with them, but unless you are running a high-traffic site with many gigabytes of data, computing power and disk space are the two biggest cost concerns.

Advantages of Amazon EC2

The benefits of Amazon EC2 can be enormous in some situations. Here are a few of the immediate advantages:

You don’t have to purchase or set up a virtual server; instead, you use Amazon’s hardware infrastructure. This is especially useful if you don’t have an IT person on staff, or if you don’t have the money to purchase a virtual server. As you will see later in this lesson, it’s relatively painless to set up your own virtual machine on Amazon EC2.
You can easily obtain a machine to prototype or test a new application. If your organization is in a financial crunch, all of your machines may be in use or out of date, making it difficult to try new things. With Amazon EC2 you can obtain a machine for a few days or weeks for a relatively low cost, in order to test or learn new software and applications. In essence, this is what you’ll do in this course, as you use Amazon EC2 for just a few weeks so you can learn ArcGIS Server.
You can easily obtain a server that is public-facing (in other words, that can be accessed by anyone on the Internet). In some organizations it takes a fair amount of paperwork, official approval, and coordination with IT staff to get a public-facing server. This is for good reason, since any time you open up a server to the world, there are a lot more security risks that come into play. Setting up a public-facing server on Amazon EC2 carries somewhat less risk because the machine is not running on your organization’s hardware and can be completely isolated from your network if you choose.
You can add “auto scaling” rules that add or remove machines depending on how busy your site is at any given time. This is how Amazon EC2 gains the “Elastic” part of its name. This elasticity can be incredibly cost-efficient for certain types of sites, such as those responding to natural disasters. Suppose that you administer a weather site, and one afternoon a string of serious tornados hits. Your site will see a lot more traffic that day, especially if your site gets linked to by other sites. If you were hosting your site on premises, you might run out of hardware, or it might take some time to add new machines. If you were hosting your site on Amazon EC2 with auto scaling rules, your site could temporarily expand to use whatever amount of hardware was needed.
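The auto scaling idea in the last item can be sketched in a few lines. This is a simplified illustration, not how Amazon's Auto Scaling service is actually configured: the function name, the per-instance capacity, and the minimum and maximum limits are all assumptions made up for the example.

```python
import math


def desired_instance_count(requests_per_minute, capacity_per_instance=600,
                           min_instances=1, max_instances=10):
    """Return how many servers a simple scaling rule would ask for.

    capacity_per_instance, min_instances, and max_instances are
    hypothetical values chosen for illustration.
    """
    needed = math.ceil(requests_per_minute / capacity_per_instance)
    # Never drop below the minimum or grow past the maximum.
    return max(min_instances, min(max_instances, needed))


print(desired_instance_count(300))     # a quiet afternoon
print(desired_instance_count(4500))    # traffic spike during a tornado outbreak
print(desired_instance_count(100000))  # demand beyond the configured ceiling
```

The upper limit matters as much as the scaling itself: it caps how far the bill can grow during an unexpected surge.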

EC2 instances and AMIs

Before going forward, there are two important vocabulary terms that you should understand regarding Amazon EC2:

EC2 instance – An EC2 instance is a virtual machine running on Amazon EC2.
AMI – An Amazon Machine Image (AMI, sometimes pronounced “ah-mee” or "Amy") determines the files, settings, and software that are applied when you create a new EC2 instance. You can think of an AMI as a blueprint for creating an EC2 instance. Also, when you work with an EC2 instance, you can save a “snapshot” of your work at any time by creating an AMI. This way, if something ever happens to your instance, you can create a new instance and continue where you left off.

How do you get software like ArcGIS Server running on Amazon EC2?

Esri has created AMIs that have ArcGIS Server installed and configured. You will use these AMIs to create EC2 instances, thereby getting the server software running on Amazon EC2. Once you get the instance running, you can log into it using an application called Windows Remote Desktop. This is the same way that you would remotely log in to any other computer in your network, except this time the machine is outside your network, running on Amazon EC2.

You can perform all of these steps on your own home computer as long as it has an Internet connection. In fact, it's recommended that you use your home computer because some workplace IT departments have placed restrictions on accessing computers outside the firewall (like Amazon EC2 instances) using Remote Desktop. Please note that you cannot use a personal hotspot through a mobile phone to log in to your EC2 instances.

Introduction to Cloud and Server Security

Security is one of the biggest issues that causes organizations to hesitate when they consider cloud computing. It is a natural reaction; after all, for most modern organizations, their data is their lifeblood. How could we entrust that to people we don't know and have little control over?

It may surprise you to know that in the eyes of many security experts, using cloud computing can make your data more secure, not less secure, as long as the correct procedures are followed. This makes sense when you think about it: it's another aspect of the benefits of scale. I have a lot of confidence in the computer security folks here at Penn State; they are excellent. I'm also pretty confident that Amazon and Google employ even better computer security experts. However, no experts can protect you from yourself; if you start up a server, expose it to the world, and fail to patch it, it will get hacked. Therefore, the key is to follow the correct procedures.

This page describes how to safely complete this course and gives some general guidance and pointers to further information on how to safely administer a server in production.

Security in this course

In this course, we will be learning about and experimenting with a variety of cloud computing technologies. The most important method you should follow to use them safely is simply to follow the directions in the course in their entirety. Don't skip steps. Further general guidelines for security in this course can be divided into two parts. We will first use Infrastructure as a Service, by starting server instances on Amazon's compute cloud. Later in the course, we will use Platform as a Service and Software as a Service offerings like ArcGIS Online, Carto, and Mapbox. Good security practices for the second part of the course (platform and software as services) are easily described: use a strong, unique password for each service. If managing multiple strong passwords is an issue for you (I think it is for most humans), consider using software like 1Password [4]. The rest of this section gives some guidelines for safe usage of Amazon server instances for the Infrastructure as a Service portion of the course.

Stop your instances when not in use. As a general guideline, you should stop your Amazon cloud server instances while you are not using them. A good workflow while using cloud computer instances is to start them when you want to begin a work session, and then stop them when you are done. After all, a computer that is turned off is one that cannot be remotely hacked. Do not terminate an instance that you care about; it will be irretrievably deleted.
Understand the basics of public-key cryptography. Amazon relies on public-key cryptography to give you safe access to your instances. You are given a key pair file that you use to retrieve the initial administrator password for your EC2 instances. Keep your key pair file in a safe place where you can find it and make sure it is not exposed through any web-facing folder.
Don't surf the Internet from your servers. Please do not use the browsers installed on your cloud computing instances to go to any website unless you (1) trust it and (2) need to access it to complete your work. There are a lot of security vulnerabilities in JavaScript, Flash, and other entertaining and productive technologies. We accept these vulnerabilities in daily life because we want to use Internet services and because the consequences of a security breach while using a computer as a regular user are not too severe. However, while you are logged into a server as an administrator, you must not go surfing around the web. Only visit websites you must visit and that you trust. Should you visit CNN.com? No. Should you visit our course website to download a file for an assignment? Yes.
This last one is somewhat advanced and doesn't apply to exercises in this course, but it's worth sharing anyway. Be careful when making and sharing images based on EBS volumes (Amazon EC2's version of "hard drives"). Also, be very careful with whom you share access to the "hard drive" EBS volume that the instance uses. You should never make a public copy of such an EBS volume. Even files you have deleted may still be accessible to anyone you give access to. Amazon EC2 guru Eric Hammond ran a little contest to illustrate the dangers. He made a text file on an EBS volume that contained a gift card number, and then deleted it. He then made a public copy of the EBS volume and invited the readers of his blog to try to retrieve the "deleted" file and claim the prize. Soon there were posts from multiple people who were able to read the file. Details are here: Hidden Dangers in Creating Public EBS Snapshots on EC2 [5].

Even if you are not going to be sharing copies of your instances, you may need to think about how to safely put your server into production.

Security for servers in production

A production server is one that is serving your business content, live, to end users in a highly available and highly secure manner. A complete guide to production server security is unfortunately outside the scope of this course. We are not setting up servers that are ready to go into production. However, we will be covering some aspects of server security as we go along in the course. In addition, here are some general principles that you should know:

Turn off operating system components and services you are not using. If a computer is being used as a machine to launch web servers and databases, but is not running a web server or a database itself, then disable the web server and database software on that computer.
Limit account access. Use an administrative account only for administrative tasks. Normal usage should use an account with more restricted rights.
Limit port access. Use firewalls to restrict port availability, and be cautious when opening ports. If you are using Amazon Web Services, use Security Groups as well as a firewall on your operating system for defense in depth. A good introduction to using Security Groups [6] can be found at rightscale.com.
Sanitize any input allowed into your system. For example, if you allow people to upload images, test the uploads before saving them on your system to make sure they are valid images rather than malicious files. Be especially cautious of SQL or other data manipulation input.
Keep your operating system and server software patched and up to date.
Back up your data and systems. Data which is not backed up will eventually be lost. Failure rates for individual computer instances are usually high in a cloud environment, due to the use of commodity hardware. A good backup strategy involves backing up your data to two separate physical locations or two different "Regions" on Amazon EC2. Keeping a remote backup of your server can be a lifesaver in an emergency, and can make you a hero if you can get the system restored quickly. A good strategy for EBS-backed instances (that is the most popular kind of instance and the kind that we are using in this course) is to take regular snapshots of the boot EBS volume. You can then easily create a new server instance from one of the snapshots if disaster strikes.
Don't share your private keys or passwords with anyone. Responsible administrators will never ask you to share this kind of information. I'm sure you have already heard this, but it bears repeating.
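The "sanitize any input" principle above is easier to picture with a small example. The sketch below shows two common first-line defenses: checking that an upload starts with a known image signature, and passing user text to a database as a parameter rather than pasting it into the SQL string. The function names and the layers table are hypothetical, and a signature check alone is only a first filter, not a complete validation.

```python
import sqlite3

# Leading "magic bytes" for two common image formats.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
JPEG_SIGNATURE = b"\xff\xd8\xff"


def looks_like_image(data: bytes) -> bool:
    """First-pass check: does the upload start like a PNG or JPEG file?"""
    return data.startswith((PNG_SIGNATURE, JPEG_SIGNATURE))


def find_layers(conn, name):
    """Look up a layer by name using a parameterized query."""
    # The ? placeholder keeps user input as data, never as SQL code.
    query = "SELECT id, name FROM layers WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()


# Hypothetical in-memory table standing in for a real GIS catalog.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE layers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO layers (name) VALUES (?)",
                 [("roads",), ("rivers",)])

print(looks_like_image(PNG_SIGNATURE + b"rest of file"))
print(looks_like_image(b"<?php echo 'hi'; ?>"))
print(find_layers(conn, "roads"))
print(find_layers(conn, "roads' OR '1'='1"))
```

With the parameterized query, the classic injection string in the last line is treated as an odd layer name that matches nothing, rather than as extra SQL.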

A good resource for how to safely run a server in production is the National Institute of Standards and Technology's "Guide to General Server Security [7]," accessible at csrc.nist.gov/publications/nistpubs/800-123/SP800-123.pdf. It describes the necessary planning steps, how to secure your operating system, how to secure your server software, and how to maintain your security. It also describes the multiple personnel roles that are involved in good security practices. If you will be involved in helping to administer a server in production, I suggest you read this guide and follow its recommendations.

OK, enough of the heavy stuff; it's time to start your first cloud computer in this course, an Amazon instance!

Setting up your first Windows instance

Let's create an EC2 instance that is running Windows. The purpose of this exercise is to get you familiar with the basics of Amazon EC2 using some familiar software. Before you attempt this part of the lesson, you need to make sure you've obtained an Amazon account and enabled it for use with Amazon EC2. This should have been covered during the course orientation.

If you have any doubt about the above, contact the course instructor.

Here are the steps for getting Windows running on Amazon EC2. Since Amazon can potentially update their site at any given time, some minor adjustments may be required for these steps. Contact the instructor if you have questions, or, if you find an issue that you are able to work around, please mention it in a comment in the Technical Discussion forum.

Open a web browser to Amazon's AWS Management console page [8].

This application is called the AWS Management Console, and it helps you create and manage things on EC2, such as instances. This app has some quirks, and I've found that I have to run it in the Google Chrome browser to completely avoid them. Sometimes it will work in Firefox.
Click Sign in to the AWS Console.
If prompted to choose a Root or IAM user, select the Root option. In a production environment you would likely want to create other IAM users under which to run things. It's generally not recommended to use an Admin or Root login for day-to-day operations, but for our purposes in Geog 865 it will be just fine.
Sign in with your Amazon account name and password.
You should be taken to a screen with a bunch of Amazon Web Services listed, such as Elastic Beanstalk, S3, etc. These represent all the types of web services that Amazon offers. For now, you're interested only in EC2, which is Amazon's set of web services for renting hardware infrastructure.
Click the EC2 link. On the right, you'll see a summary of all the items you have running in Amazon EC2. There should be nothing listed. On the left, you'll see a menu of different categories of things you can create in EC2, such as Instances, Volumes, Elastic IPs, etc. You'll learn about a few of these as we go along.

In the upper-right corner, notice that a dropdown list allows you to pick the region you want to work in. It likely reads "US East (N. Virginia)" or a location closer to where you live. Amazon runs EC2 from various data centers placed around the world. You can choose which data center, or region, will house the resources you create. Typically, the closer you can place your region to your end users, the faster your services and apps will appear. But some organizations may also pick a region based on legal requirements relating to countries that can or cannot house their data. Be aware that costs are slightly higher in some regions. You can see a list of costs at Amazon's Elastic Compute Cloud Pricing page [9]. For this course, set the region to "US East (N. Virginia)."

Before you launch an actual instance, you'll create an Amazon virtual private cloud (VPC) which is sort of your own special space carved out of Amazon's cloud. Instances in a VPC can see each other and your own network fairly easily, but they're not immediately accessible from elsewhere without some extra work on your part. That's a good thing for security.
Click the Services dropdown in the upper left and click VPC (it's under Networking & Content Delivery).

Creating a VPC is potentially a very technical and complex activity, but it's something most people have to do at first. For that reason, Amazon has made a wizard for setting up a real basic VPC. This will suffice for our purposes.
Click Create VPC.
Select the VPC and more option.
Leave the Auto-generate box checked, and enter a name like Geog865VPC in the box below it.
Leave the default value, 10.0.0.0/16, in the IPv4 CIDR block box. This is a default private IP address [10] range for your machine that's only visible to other machines within your VPC. You'll create an Elastic IP later which will be the public address we'll use with ArcGIS Enterprise.
Leave the other options in their default settings and click Create VPC.
Click the Services dropdown in the upper left and click EC2 to go back to the EC2 resources page.
From the left menu, click Instances. Ensure that you are using the N. Virginia region, then click the Launch Instance button.
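If the 10.0.0.0/16 CIDR block you accepted for your VPC a few steps back seems opaque, Python's standard ipaddress module can unpack it. This is just an illustrative sketch: the /20 subnet split and the 10.0.12.34 address are assumptions chosen for the example, and the public address is the sample Elastic IP that appears later in this lesson.

```python
import ipaddress

vpc = ipaddress.ip_network("10.0.0.0/16")

print(vpc.is_private)      # private ranges are not routable on the public Internet
print(vpc.num_addresses)   # a /16 leaves 16 bits for addresses inside the VPC

# One hypothetical way to carve the VPC into smaller subnets.
subnets = list(vpc.subnets(new_prefix=20))
print(len(subnets), subnets[0], subnets[1])

# A private address inside the VPC versus a public, Elastic-IP-style address.
inside = ipaddress.ip_address("10.0.12.34")
public = ipaddress.ip_address("107.20.220.152")
print(inside in vpc, public in vpc, public.is_private)
```

The point to take away is that addresses inside the block only mean something within your VPC, which is why a separate public address is needed to reach the instance from outside.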

Now you're at a place where you can type a name for your instance. It used to be that your instances in the console were just assigned an ID. This was hard to keep track of once you had more than just a few instances, so Amazon allows you to type other metadata about the instance. This is stored as name/value pairs.
Enter something like, Geog 865 Windows instance, in the Name box.
In the list of AMIs, browse in the Windows section and select the one that's called Microsoft Windows Server 2019 Base.

The next section provides Instance Type options. On this panel, you'll choose the size, or computing power, of your instance. Micro instances (low-resource options suitable for many trial situations) are the type selected by default on this panel. Note that the instance size you choose drastically affects the price that you pay, so follow these instructions carefully.
Select t2.micro. This instance is sufficiently powerful for what we need to do, and at the time of this writing, it's part of the AWS free tier, which means that you won't be billed against your credit card until the free credits that AWS provides in their Free-Tier are exhausted. Then click Next: Configure Instance Details.

The next step in this process has to do with logging into your instance for the first time. You need to get a special file called a key pair that allows you to retrieve the instance's administrator password. This is a one-time action; you can use this key pair for the rest of the instances you launch in the course.
Click the link to Create a new key pair, type a name for your key pair (e.g., geog865), then click Create Key Pair. A small, text-based file with the extension of .pem will be downloaded to your machine. Keep this key pair file in a safe place that you remember for later in the course.
In the Network Settings section, a default VPC may be selected. We want to use the VPC we created in the previous steps, so click the Edit button and change the drop-down selection to show your VPC as the destination network for this instance.
Confirm that the Subnet box shows one of your "public" subnets. This specifies that the resource we are creating will have access to the Internet.
Leave the Create Security Group option selected and enter a name like Geog 865 Security Rules, along with a description if desired. Machine instances you create will operate under the firewall settings you specify here.

Your new security group will start out with a rule allowing Remote Desktop (RDP) access, so you can log in to your instance and administer it. Windows Remote Desktop requires port 3389 to be open. Note the Source IP address, which defaults to Anywhere (0.0.0.0/0); this allows any IP (i.e., any computer) to use Remote Desktop to reach your machine. In a production setting, you would not typically open RDP access to all addresses (0.0.0.0/0). Instead, you would specify your local computer's IP address or your organization's range of IP addresses, which is a much more secure approach. For our purposes in Geog865, leave the option set to Anywhere (0.0.0.0/0). This allows any computer to use Remote Desktop to contact your instance. Anyone connecting will still need your Windows username and password to log in, but the firewall on your EC2 instance won't block the attempt. This will avoid difficulties if you happen to use a different computer to work on your instance, or if you use your laptop in a public wi-fi environment, like Starbucks, which will assign your computer a different IP every time you connect.
Click the Add Security Group Rule button and select HTTP from the Protocol dropdown list, with a source of Anywhere. You have just allowed HTTP access on Port 80 to everyone, thereby letting Internet users access your web services. Port 80 is the most common port used on the Internet for incoming web traffic into a server.
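To see what the Source setting on these rules actually controls, here is a small sketch that mimics how an inbound rule decides whether a connection gets through. It is a toy model, not the AWS API: the InboundRule class, the is_allowed function, and the 203.0.113.0/24 "office" range are all hypothetical.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class InboundRule:
    """Toy stand-in for one security group rule: a port plus an allowed source range."""
    port: int
    source_cidr: str


def is_allowed(rules, client_ip, port):
    """Return True if any rule opens this port to this client address."""
    ip = ipaddress.ip_address(client_ip)
    return any(
        rule.port == port and ip in ipaddress.ip_network(rule.source_cidr)
        for rule in rules
    )


# The rules used in this course: RDP and HTTP open to Anywhere.
course_rules = [InboundRule(3389, "0.0.0.0/0"), InboundRule(80, "0.0.0.0/0")]

# A production-style variant: RDP limited to a hypothetical office range.
locked_down = [InboundRule(3389, "203.0.113.0/24"), InboundRule(80, "0.0.0.0/0")]

print(is_allowed(course_rules, "198.51.100.7", 3389))  # coffee-shop laptop, course rules
print(is_allowed(locked_down, "198.51.100.7", 3389))   # same laptop, restricted rules
print(is_allowed(locked_down, "203.0.113.25", 3389))   # office machine, restricted rules
print(is_allowed(course_rules, "198.51.100.7", 22))    # a port with no rule at all
```

Notice that a port with no matching rule is simply closed; rules only ever open access, so everything you have not explicitly allowed stays blocked.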

Some of the other settings are beyond the scope of this course. However, you will enable Termination Protection. Terminating your instance deletes it forever. Termination Protection is nothing fancy; it just prevents you from terminating an instance until you explicitly disable termination protection on the instance. It's a way of making you go through an extra step to make sure you don't accidentally do something you didn't intend to do, which is helpful for beginners.
Expand the Advanced Details section and Enable Termination Protection.
The last section shows a summary of the instance that will be created. Examine it, then click Launch Instance.
Return to the Instances section of the EC2 console, where you should see your instance listed. Within a minute or two, you'll see its status change from pending to running, but this does not mean the instance is ready yet. It takes around 10 minutes for Windows and the software running on your instance to configure itself. It's best not to disturb the instance while this is occurring.

Because you created your instance in Amazon VPC, it's not publicly visible by default. Furthermore, the name of the instance will change every time you stop and start the instance. In order to reach your instance in a consistent fashion from a remote desktop connection, you'll need to set up an Amazon Elastic IP. This is an unchanging address that Amazon allocates to you for your use. You can then associate it with any instance you choose. Every time you stop and start the instance, you'll associate it with this IP address.
At least 10 minutes after performing the previous step, open the AWS Management Console, enter the EC2 Dashboard, and click Elastic IPs.
Click Allocate New Address, and then Allocate using Amazon's pool of IPv4 addresses.
You should see an address appear in your list of Elastic IPs, such as 107.20.220.152. (You may need to Clear Filters to see any other IPs you already have.)
Check the box next to your new Elastic IP and click Actions > Associate address.
Choose your Geog 865 Windows instance from the dropdown list and click Associate.

IMPORTANT NOTE: The Elastic IP you create at this step must remain constant for the duration of this course. Please avoid deleting the Elastic IP or using new ones. You may create and destroy machine Instances, to which you associate the Elastic IP, but the Elastic IP itself must stay the same.
Notify me as soon as you've established your Elastic IP. Tell me what your Elastic IP is so I can help get a domain name for your server (a requirement for ArcGIS Enterprise). I will also help get everyone an SSL Certificate for your server (also a requirement for ArcGIS Enterprise). The Certificate will be associated with your domain name, so we need to get that established first. The process of registering domain names takes some time, so try to get me your IP as soon as possible.

Once you launch an instance, the instance starts automatically and your Amazon bill begins accruing. It's very important to understand that you begin amassing charges right away; Amazon does not wait until you log in to your instance to begin charging you. In order to control costs, you need to stop your instance whenever you aren't using it. Before you take a break, please immediately continue reading the next section of the lesson to understand how to properly stop and start your instance.

How to stop and start your instance

Fortunately, you don't have to repeat all the previous steps to complete the Launch Instance wizard every time you want to use Amazon EC2. Once you have an instance created, it's fairly easy to log in, start, and stop it. Before we talk about logging in, let's cover the basics of how to stop and start the instance. You'll need to begin using these techniques immediately, every time you use your instance, in order to keep costs down.

When people begin using Amazon EC2, they often ask about the difference between logging out, stopping, or terminating an instance.

You can close your Windows Remote Desktop Connection session or click the Windows Log Out button when you are finished using your instance. However, this does not stop the instance and you continue to accrue charges for it.
You can stop the instance, which is akin to pressing the power button to turn off a physical machine on your desk: the machine is still there, and you can start it later, but it's not using resources such as electricity or CPU time, and Amazon is not charging you for running it. (Amazon does continue to charge you for the disk space your instance is using, but this is a relatively small charge.) When you are working through this course, you should stop the instance when you are finished working on the lessons for the day. When you are ready to go back to the lessons, you can start the instance and continue working with your programs and data.
You can terminate an instance, which makes the instance go away forever. The only thing left behind is any disk drive that was attached to the instance. Terminating your instance will hopefully not be necessary until the very end of the course; however, you can keep it as an option if your instance gets corrupted. If you terminate your instance, then you will have to create a new one using the steps in the previous section of the lesson for the Launch Instance wizard.

If you fail to stop your instances after you have finished working, you will quickly use up the Amazon Free-Tier credits and start seeing charges to your credit card.
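
To get a feel for why stopping matters, here is a rough back-of-the-envelope sketch in Python. The hourly and per-gigabyte rates are made-up round numbers, not Amazon's actual prices, and the function name is hypothetical; only the 30 GB disk size comes from the instance in this lesson. The point is the shape of the arithmetic: running hours are billed while the instance is on, and disk space is billed either way.

```python
def estimated_monthly_cost(hours_running, hourly_rate, disk_gb, disk_rate_per_gb_month):
    """Rough monthly estimate: compute charge plus disk charge, in dollars."""
    compute = hours_running * hourly_rate      # only accrues while the instance runs
    storage = disk_gb * disk_rate_per_gb_month  # accrues even when the instance is stopped
    return round(compute + storage, 2)


HOURLY_RATE = 0.10  # hypothetical dollars per running hour
DISK_RATE = 0.10    # hypothetical dollars per GB per month
DISK_GB = 30        # size of the C: drive on the lesson's instance

# Left running all month versus run two hours a day and stopped otherwise.
always_on = estimated_monthly_cost(24 * 30, HOURLY_RATE, DISK_GB, DISK_RATE)
two_hours_a_day = estimated_monthly_cost(2 * 30, HOURLY_RATE, DISK_GB, DISK_RATE)
print(always_on, two_hours_a_day)
```

With these illustrative rates, the instance that is never stopped costs 75.00 for the month, while the one that runs two hours a day costs 9.00. Substitute the current prices for your instance type and region to get a realistic figure.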

Below are some reference instructions that you can use to stop and start your instance. (Do not stop your instance for at least 10 minutes after you first launch it. It needs time to configure Windows for the first time.)

You can return to this page throughout the course if you need help remembering how to stop and start your instance.

Stopping your Windows instance

Use the instructions below to stop a Windows instance like the one you created in Lesson 1. Do not use these instructions for ArcGIS Server instances.

Log in to the AWS Management Console and open the EC2 page.
Click Instances.
Right-click your instance and click Instance state > Stop.

This stops the clock on the charges for running your instance. When an instance is stopped, no one can use your server and you cannot log in.
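
For reference, the same stop action can be sketched in Python. This assumes the boto3 library and AWS credentials, which this lesson does not require, and the function name stop_and_wait is hypothetical. As with the console steps, do not use this for ArcGIS Server instances.

```python
def stop_and_wait(ec2, instance_id):
    """Stop one instance and block until EC2 reports it as stopped.

    `ec2` is assumed to be a boto3 EC2 client, e.g. boto3.client("ec2").
    Returns the state the instance was in before the stop request.
    """
    # Same action as Instance state > Stop in the console.
    response = ec2.stop_instances(InstanceIds=[instance_id])

    # Poll until the instance state is "stopped".
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    return response["StoppingInstances"][0]["PreviousState"]["Name"]


# Example use (the instance ID is a placeholder):
#   import boto3
#   stop_and_wait(boto3.client("ec2"), "i-0123456789abcdef0")
```

Waiting for the stopped state is optional, but it gives you confirmation that the stop actually went through before you walk away.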

Starting a Windows instance

Use the instructions below to start a Windows instance like the one you created in Lesson 1. When you start your instance, it takes a few minutes to boot up, but you shouldn't have to wait the full 10 minutes that you waited when you first launched the instance. Always follow the instructions below when you start your instance:

Log in to the AWS Management Console and click the Amazon EC2 tab.
Click Instances.
Right-click your instance and click Instance state > Start.
Wait a few minutes so that the machine can start up and configure itself correctly.
If you notice that your instance no longer has your Elastic IP associated with it, click Elastic IPs and check the box next to your Elastic IP.
Click Actions > Associate Address, choose your instance from the drop-down list, and click Associate.

After a few minutes, your instance will be ready to use with its Elastic IP. After repeating these steps enough times, you should have them memorized.
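
The start steps, including the check on the Elastic IP, can be sketched in Python as follows. This assumes the boto3 library and AWS credentials, which this lesson does not require; the function name start_and_check_ip is hypothetical, and allocation_id stands for the allocation ID shown next to your Elastic IP in the console.

```python
def start_and_check_ip(ec2, instance_id, allocation_id):
    """Start an instance and make sure the Elastic IP still points at it.

    `ec2` is assumed to be a boto3 EC2 client, e.g. boto3.client("ec2").
    Returns the Elastic IP address as a string.
    """
    # Same action as Instance state > Start in the console.
    ec2.start_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])

    # Look up which instance, if any, the Elastic IP is attached to.
    address = ec2.describe_addresses(AllocationIds=[allocation_id])["Addresses"][0]

    # Re-associate only if the address is not already on this instance.
    if address.get("InstanceId") != instance_id:
        ec2.associate_address(AllocationId=allocation_id, InstanceId=instance_id)

    return address["PublicIp"]
```

The "running" state only means the virtual machine has powered on. Windows still needs a few minutes to boot before a Remote Desktop connection will succeed.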

Viewing your bill

To view your accrued charges at any time, go to the AWS Management Console Billing page [11]. You can see detailed reports of your usage of each part of the service by clicking the Bill Details button at the upper right of the main Billing page.

I recommend that you view your credits after every lesson so that you understand whether you are in danger of excessive charges. If you consistently stop your instances after you are finished working, your costs will remain small.
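
If you prefer to check charges from a script, here is a minimal sketch in Python that asks the AWS Cost Explorer API for the month-to-date total. It assumes the boto3 library, AWS credentials, and that Cost Explorer is enabled on your account; none of these are required for this lesson, and the function names are hypothetical. AWS charges a small fee per Cost Explorer API request, so the Billing page remains the free way to look.

```python
import datetime


def month_to_date_period(today):
    """Build the TimePeriod argument: first of the month up to and including today."""
    start = today.replace(day=1)
    end = today + datetime.timedelta(days=1)  # the End date is exclusive
    return {"Start": start.isoformat(), "End": end.isoformat()}


def month_to_date_cost(ce, today):
    """Return (amount, unit) for this month's charges so far.

    `ce` is assumed to be a boto3 Cost Explorer client, e.g. boto3.client("ce").
    """
    response = ce.get_cost_and_usage(
        TimePeriod=month_to_date_period(today),
        Granularity="MONTHLY",
        Metrics=["UnblendedCost"],
    )
    # Add up every returned period and report the unit of the first one.
    totals = [r["Total"]["UnblendedCost"] for r in response["ResultsByTime"]]
    amount = sum(float(t["Amount"]) for t in totals)
    unit = totals[0]["Unit"] if totals else ""
    return amount, unit


# Example use:
#   import boto3
#   print(month_to_date_cost(boto3.client("ce"), datetime.date.today()))
```

How credits show up in this figure depends on your account, so treat the number as a prompt to look at the Billing page rather than as the final word.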

Logging in to your instance

After you've given your instance about 10 minutes to configure Windows, you can get ready to log in to the instance and start working with your software. The first thing you need to do is get the Administrator password so you can log in.

Log in to the AWS Management Console and view your list of instances, as you did in the previous walkthrough (Services > EC2 and then click Instances in the left menu).
Right-click your instance and click Get Windows Password.
Click Browse... and browse to the key pair file you saved in the previous walkthrough when you launched the instance. This is a small file with a .pem extension.
Click Decrypt Password and copy or write down the displayed password.
Open Windows Remote Desktop. In most versions of Windows, you can browse to this from Start > All Programs > Accessories > Remote Desktop Connection. In older versions of Windows, it may be in a folder called Communications.

Note: If you're using Windows 8.1 (or later) or any machine with a very high resolution display, you might notice that your Remote Desktop has very, very small fonts, making your EC2 instance difficult to use. In this event, you might like to (as I do) use Microsoft's "Remote Desktop Connection Manager" tool. It has the same functionality as Remote Desktop Connection, but with the ability to smartly scale the size of the screen.

Remote Desktop is a program that you can use to log in to other computers from your own computer. If you're new to Remote Desktop, you may want to take some time to read Remote Desktop Connection: frequently asked questions [12].
In Remote Desktop Connection, click the Options button > Local Resources tab > More button and check the box for Drives. Then click OK. This will permit you to copy data from your machine onto the remote machine (in this case, your Amazon EC2 instance).
In Remote Desktop Connection, under the General tab, type or paste the Elastic IP of your instance into the Computer input box. If you can't remember what this is, click Elastic IPs in the AWS Management Console and you will see it listed.
In the User name input box, type Administrator. Then click the Connect button.

You might see a warning message here about remote desktop connections harming your computer. Anytime you connect to a remote computer, there is the possibility that a malicious party could try to pose as the machine you are logging in to. Older versions of Remote Desktop were especially susceptible to this type of "man in the middle" attack. The work you are doing for this course is relatively benign and low risk, so you can click Connect.

If you are using a computer at work, it's possible that Remote Desktop connections to machines outside your corporate firewall are blocked. If this is the case, you need to work with your IT administrator to open communication through port 3389 on your machine to all machines in the Amazon subnet. If you work in a high-security environment (or any environment with lots of red tape), getting approval to change a firewall rule like this may be difficult or impossible, and it will be easier to perform these steps from home instead.
In the Password input box, carefully type or paste the administrator password that you obtained in the first few steps above (sometimes Windows will not allow you to paste a password). Then click OK.

You may see a window warning you that the identity of the remote computer cannot be verified. You can ignore this warning and click Yes.

In a few seconds, you should see Windows appear. You are now working in a remote desktop session that is connected to your Amazon EC2 instance. This behaves just like any program in Windows. You can minimize it or close it, but note that closing your remote desktop session does not stop your instance. Your instance will continue to accrue charges until you right-click it in the AWS Management Console and click Stop. This is what you should do if you are interrupted while performing these instructions, or need to take a break at any time.

The first thing you'll do on your instance is change the administrator password to something that only you know and can easily remember.
On your instance (not your own computer), click the search magnifying glass icon in the lower left and find Computer Management.
Expand Local Users and Groups and click Users.
In the list of users, right-click Administrator and click Set Password > Proceed. Type and confirm a new password that you can remember. In the future, you can use this password when logging in to your instance.
Now do a bit of exploring around the machine. Notice that you have the basic Windows programs available to you.
Take a screen capture of your entire desktop and save it to your local machine as evidence that you made it this far. You will need this in your lesson deliverables. At least some part of your Remote Desktop session should be visible in the screen capture. If the session fills your whole screen, you may need to shrink it so that part of your local desktop is showing before you can save the capture to the local machine.
Open Windows Explorer (using the little file folder icon at the bottom of the screen) and click This PC. Notice that your instance has a 30 GB C: drive (which we could have made bigger when we launched the instance). You should also see all the disk drives from your local computer listed (go ahead and browse them). This is how you can copy files from your own machine onto your EC2 instance.

The purpose of launching this instance was to show you how you can use the AWS Management Console to create and log in to a virtual machine. We're not going to use it for any future assignments, although an instance like this might be useful if you need a certain program or a particular level of computing power just for a short time.

In future lessons, we'll work with Windows instances similar to this one that are running ArcGIS Server. You will make remote desktop connections into them, change the administrator password, and move files around in the same way that you have observed here. Therefore, although there is no GIS in this lesson, the material covered here lays down the fundamentals that you'll use for at least the next three lessons and possibly your term project.
When you are done looking around in the instance, close your Remote Desktop session and stop the instance using the instructions on the previous page.

If something goes wrong with your instance, you can terminate it and create a new instance.

Typically when you log in to your instance, you'll open Remote Desktop Connection and type the user name Administrator, followed by the new password that you set above. To end your session, you can just close the remote desktop window. If you are going away for more than an hour, also make sure to stop your instance in the AWS Management Console.
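
If Remote Desktop cannot connect, one quick check is whether port 3389 on your instance is reachable from your computer at all, which helps distinguish a blocked firewall from a wrong password. Here is a minimal sketch using only the Python standard library. The function name can_reach is hypothetical, and the address in the comment is just the sample Elastic IP from earlier in this lesson, so replace it with your own.

```python
import socket


def can_reach(host, port=3389, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # Open and immediately close a connection; no data is sent.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name lookup failures.
        return False


# Example use:
#   print(can_reach("107.20.220.152"))
```

A result of False can mean several things: the instance is stopped, Windows is still booting, or a firewall between you and Amazon is blocking port 3389. Check the instance state in the AWS Management Console first.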

Assignment: Reflecting on your first AWS walkthrough

The first few lessons of this course have relatively lengthy technical walkthroughs with lots of moving parts; therefore, your assignments will largely consist of showing evidence via screen captures that you were able to complete these steps. I'll also ask one or two reflection questions that you will answer to accompany these images. In Lesson 4, you'll have the opportunity to complete a more complex walkthrough and make a video demonstration of your work.

Deliverables

Please create a new document and put the following in it:

The screen capture that you were instructed to take at the end of the walkthrough when you created your first Windows instance.
A thoughtful paragraph or two answering the following question:

Think about your current workplace (whether GIS-related or not) and how it uses computers and information technology. Then think about your recent experience launching a brand new Windows instance on which you could install and run all kinds of things. What are some ways that using virtual machines on an IaaS cloud like Amazon EC2 could benefit everyday computing and IT operations at your workplace? What are some technical or logistical obstacles that your workplace might face with using virtual machines on the cloud? If you aren't currently employed or you are a full-time student, consider this question from the perspective of a former job you've had, or of a graduate student using computers to conduct research.

Submit your document to the lesson drop box on Canvas.

Cloud Computing Discussions: Introduction

Cloud GIS

Cloud GIS is an emerging set of technologies, concepts, and work practices, rather than a settled set of services. This course is intended to provide you with both practical skills that you need to use cloud computing in your everyday work, and also the background and critical thinking you need to make decisions about how cloud computing should be used in your organization.

Cloud computing discussion

Most lessons feature a cloud computing discussion page that presents an important aspect of cloud computing and encourages you to envision its potential impact on GIS systems. Many of these will be accompanied by required readings that will guide you toward the discussion. We'll use several free online chapters of Rosenberg and Mateos's The Cloud at Your Service. Buying the entire book is optional, but I will list other chapters from it that will be helpful in some of the discussions.

The trends we will cover this term are:

GIS cloud computing
Cloud computing and you
Cloud computing economics
Cloud security
Cloud architectures
Cloud service vendors
The future of cloud computing

I'll ask you to participate in threaded discussions with your classmates based on several prompts that I will provide on the topics above. These constitute an important part of your participation grade.

On the next page, you'll find your first cloud computing assignment. In this assignment, we will examine what cloud computing is, how we are using it now, and how we plan to use it in GIS.

Cloud Computing Discussion: Cloud GIS

First, please read Chapter 1 in Rosenberg and Mateos's The Cloud at Your Service [13]. The book is available as a free preview from the publisher's website. Purchasing the entire book is optional for this class.

Also, please read this paper:

Spatial Cloud Computing [14]

and this one:

Cloud Computing: A Solution to Geographical Information Systems (GIS) [15].

Please note that I am not endorsing the opinions expressed in these papers; instead, I am hopeful that they will stimulate a good discussion. Please feel free to agree or disagree with the authors.

Deliverables for this week's emerging theme:

Post a comment directly on the discussion forum in Canvas that describes how you see cloud computing being used in GIS work, and conversely, how cloud computing might need to change to fit the needs of GIS, in addition to any reactions you have to the papers above.
Then, I'd like you to offer additional insight, critique, a counter-example, or something else constructive in response to one of your colleagues' posts. For possible full credit on this assignment, you need to do both.
Brownie points for linking to other technology demos, pictures, blog posts, etc., that you've found to enrich your posts so that we may all benefit.