It is an exciting time for Cloud GIS. There has been a huge upswing in interest in geospatial data, and the computing infrastructure to support the visualization and analysis of this data has been developed as well. While the infrastructure to support cloud GIS is now accessible to everyone, the exact forms it will take are still unknown.
The purpose of this course is twofold: first, to give you practice with using a variety of cloud GIS services, and second, to give you an understanding of what cloud computing is more broadly, and how it should and should not be applied in various GIS problem contexts.
You will be creating a variety of cloud GIS services during this course. Most of these will fall under the heading of server GIS. The platforms for this will include GIS infrastructure as a service, GIS platforms as a service, and GIS software as a service. These platforms will be defined in subsequent lessons, along with the five essential aspects of cloud computing.
By the time this course is over, you should feel comfortable setting up and using server GIS using cloud computing, and you should have an understanding of how cloud computing can help solve GIS problems.
The first lesson begins with a discussion of the definition of cloud computing, followed by the use of the Amazon Elastic Compute Cloud (EC2), part of Amazon Web Services, to create your own cloud by starting and stopping a virtual machine. Finally, we have our first cloud computing discussion, on the overall topic of Cloud GIS.
At the successful completion of this lesson you should be able to:
Cloud computing is a concept and a phrase that has become increasingly popular. However, there are a number of competing definitions for what "cloud computing" entails. One very effective definition of cloud computing, which consists of five essential characteristics, three service models, and four deployment models, comes from the National Institute of Standards and Technology (NIST). The final version of this definition [1] was published in October of 2011, and is available from NIST. During this course, we will explore how the essential characteristics and service models in particular can be used in a GIS context. Let's consider them now in turn.
The five essential characteristics can be hard to recall at times. During one sleepless night, I came up with the mnemonic NO-REM as a way to remember them.
N stands for Network access. Cloud computing services can be accessed from a variety of networked devices, such as workstations, mobile phones, and other servers. A GIS example is a geospatial information service that allows access from browsers and from other servers.
O stands for On-demand self-service. Cloud computing services should be accessible at will, without having to consult and get permission from a human being. A GIS example is the ability to start multiple map servers by using a browser interface.
R stands for Resource pooling. Cloud computing resources such as processing power, storage, and input-output (to use the Von Neumann architecture [2]) are provisioned for different clients from a common set of physical assets. Clients need not know (and often cannot know) exactly where the physical assets are. A GIS example is sharing computers owned and administered by Esri, Amazon, or Microsoft, without knowing or caring how these computers are being provisioned (as long as they stay up!).
E stands for Elasticity. Cloud computing services can be scaled up and down to meet demand and decrease waste. A GIS example is processing a large spatial data set quickly using many cloud computers, which are then discarded when the task is done.
M stands for Measured service. Cloud computing services are paid for by resources used (such as processing power, storage capacity, or number of user accounts). A GIS example is paying for a map server only for the hours it is up and the bandwidth it uses, rather than a whole computer.
All of the GIS examples of the essential characteristics are ones that you will experience during this course.
The three service models have a definite order to them. Infrastructure as a Service (IaaS) is the most fundamental layer, followed by Platform as a Service (PaaS) and Software as a Service (SaaS). I was quite impressed with the diagram explaining IaaS, PaaS, and SaaS [3] at venturebeat.com, so I slightly changed it and re-made it below:
As you can see, in the traditional, computer-under-your-desk model, you manage everything. Moving to a Cloud Infrastructure as a Service (IaaS) means that the von Neumann troika of IO, storage, and processors is managed by the provider, and you bring everything else. In a GIS context, this would mean that you rent computing power from a cloud provider, and use it to solve GIS problems. We'll use the Amazon Elastic Compute Cloud (EC2) running ArcGIS Server to study IaaS.
A Cloud Platform as a Service means that the vendor provides the physical layer, the operating system, and the runtime. You are still able to add your own data and write software that runs on the cloud platform. Facebook, from a developer's perspective, offers a PaaS because APIs (programming tools) are available to write programs that run in Facebook. Google App Engine is another example: Google gives you the computing power and you write the apps.
Software as a Service (SaaS) is a bit easier to understand. You just use it without having to install anything or write any code. Online email is a very good example of SaaS. So is Google Maps. Many GIS and mapping companies are offering mapping and spatial data processing through a SaaS model.
Next, we will get started on the leading IaaS: Amazon's EC2.
The Amazon Elastic Compute Cloud (EC2) is an infrastructure as a service (IaaS) cloud. This means that it provides computing power and resources that you can use for a fee. You take care of running the software; Amazon EC2 provides the hardware.
To understand Amazon EC2, it’s important to understand the concept of virtualization. When you use your computer at home, it’s very likely that you have one physical “box” sitting on or below your desk, with a power button, disk drives, a video card, and so on. The relationship between the physical machine and the machine you log into is 1 to 1. Virtualization, however, is the idea of hosting multiple “virtual machines” on a single physical box. These virtual machines share some hardware resources, but they appear to the end user as distinct machines that can be logged into and administered separately.
You may have used virtual machines at your place of employment; many companies are using them in the workplace because they are more flexible and cost efficient. Most often, an IT administrator will purchase or choose a powerful machine and configure it to be a “virtual server”, which is a physical machine that hosts multiple virtual machines. Obviously, it takes a powerful computer to act as a virtual server, and it takes a fair amount of IT administration skill to set one up.
Enter Amazon EC2. When you work with Amazon EC2, you create and run virtual machines in Amazon’s data centers. You don’t have to know too much about the details of the virtual servers (nor does Amazon want to reveal this). The idea is that you can focus on the software on your server and let Amazon take care of the hardware needs.
Of course, there is a cost for using these resources. You are charged hourly fees for the computing power used, and for the amount of data that you store on Amazon EC2. Most of the things you can do or use on Amazon EC2 have some sort of fee associated with them, but unless you are running a high-traffic site with many gigabytes of data, computing power and disk space are the two biggest cost concerns.
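To make the two biggest cost concerns concrete, here is a small arithmetic sketch in Python. The rates are made up purely for illustration (real EC2 prices depend on the instance type and region), and it assumes that disk storage is billed for the whole month while compute is billed only for the hours the instance is running.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Hypothetical prices; real EC2 prices vary by instance type and region."""
    compute_per_hour: float      # USD per hour while the instance is running
    storage_per_gb_month: float  # USD per GB of disk kept for a month


def monthly_cost(hours_running: float, storage_gb: float, rates: Rates) -> float:
    """Estimate one month's bill under the assumptions stated above:
    compute is charged per running hour, storage for the whole month."""
    compute = hours_running * rates.compute_per_hour
    storage = storage_gb * rates.storage_per_gb_month
    return round(compute + storage, 2)


# Made-up rates chosen only to keep the arithmetic easy to follow.
EXAMPLE_RATES = Rates(compute_per_hour=0.10, storage_per_gb_month=0.10)

# Two hours a day for 30 days, versus leaving the instance on all month.
careful = monthly_cost(hours_running=2 * 30, storage_gb=30, rates=EXAMPLE_RATES)
always_on = monthly_cost(hours_running=24 * 30, storage_gb=30, rates=EXAMPLE_RATES)
print(careful, always_on)
```

With these invented numbers, the storage charge is the same in both cases; the difference comes entirely from the hours the instance is left running.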
The benefits of Amazon EC2 can be enormous in some situations. Here are a few of the immediate advantages:
Before going forward, there are two important vocabulary terms that you should understand regarding Amazon EC2:
Esri has created an AMI that has ArcGIS Server installed and configured. You will use these AMIs to create EC2 instances, thereby getting the server software running on Amazon EC2. Once you get the instance running, you can log into it using an application called Windows Remote Desktop. This is the same way that you would remotely log in to any other computer in your network, except this time the machine is outside your network, running on Amazon EC2.
You can perform all of these steps on your own home computer as long as it has an Internet connection. In fact, it's recommended that you use your home computer because some workplace IT departments have placed restrictions on accessing computers outside the firewall (like Amazon EC2 instances) using Remote Desktop. Please note that you cannot use a personal hotspot through a mobile phone to log in to your EC2 instances.
Security is one of the biggest issues that cause organizations to hesitate when they consider cloud computing. It is a natural reaction; after all, for most modern organizations, their data is their lifeblood. How could we entrust that to people we don't know and have little control over?
It may surprise you to know that in the eyes of many security experts, using cloud computing can make your data more secure, not less secure, as long as the correct procedures are followed. This makes sense when you think about it: it's another aspect of the benefits of scale. I have a lot of confidence in the computer security folks here at Penn State; they are excellent. I'm also pretty confident that Amazon and Google employ even better computer security experts. However, no experts can protect you from yourself; if you start up a server, expose it to the world, and fail to patch it, it will get hacked. Therefore, the key is to follow the correct procedures.
This page describes how to safely complete this course and gives some general guidance and pointers to further information on how to safely administer a server in production.
In this course, we will be learning about and experimenting with a variety of cloud computing technologies. The most important method you should follow to use them safely is simply to follow the directions in the course in their entirety. Don't skip steps. Further general guidelines for security in this course can be divided into two parts. We will first use Infrastructure as a Service, by starting server instances on Amazon's compute cloud. Later in the course, we will use Platform as a Service and Software as a Service offerings like ArcGIS Online, Carto, and Mapbox. Good security practices for the second part of the course (platform and software as services) are easily described: use a strong, unique password for each service. If managing multiple strong passwords is an issue for you (I think it is for most humans), consider using software like 1Password [4]. The rest of this section gives some guidelines for safe usage of Amazon server instances for the Infrastructure as a Service portion of the course.
Even if you are not going to be sharing copies of your instances, you may need to think about how to safely put your server into production.
A production server is one that is serving your business content, live, to end users in a highly available and highly secure manner. A complete guide to production server security is unfortunately outside the scope of this course. We are not setting up servers that are ready to go into production. However, we will be covering some aspects of server security as we go along in the course. In addition, here are some general principles that you should know:
A good resource for how to safely run a server in production is the National Institute of Standards and Technology's "Guide to General Server Security [7]," accessible at csrc.nist.gov/publications/nistpubs/800-123/SP800-123.pdf. It describes the necessary planning steps, how to secure your operating system, how to secure your server software, and how to maintain your security. It also describes the multiple personnel roles that are involved in good security practices. If you will be involved in helping to administer a server in production, I suggest you read this guide and follow its recommendations.
OK, enough of the heavy stuff; it's time to start your first cloud computer in this course, an Amazon instance!
Let's create an EC2 instance that is running Windows. The purpose of this exercise is to get you familiar with the basics of Amazon EC2 using some familiar software. Before you attempt this part of the lesson, you need to make sure you've obtained an Amazon account and enabled it for use with Amazon EC2. This should have been covered during the course orientation.
If you have any doubt about the above, contact the course instructor.
Here are the steps for getting Windows running on Amazon EC2. Since Amazon can potentially update their site at any given time, some minor adjustments may be required for these steps. Contact the instructor if you have questions, or, if you find an issue that you are able to work around, please mention it in a comment in the Technical Discussion forum.
Once you launch an instance, the instance starts automatically and your Amazon bill begins accruing. It's very important to understand that you begin amassing charges right away; Amazon does not wait until you log in to your instance to begin charging you. In order to control costs, you need to stop your instance whenever you aren't using it. Before you take a break, please immediately continue reading the next section of the lesson to understand how to properly stop and start your instance.
Fortunately, you don't have to repeat all the previous steps to complete the Launch Instance wizard every time you want to use Amazon EC2. Once you have an instance created, it's fairly easy to log in, start, and stop it. Before we talk about logging in, let's cover the basics of how to stop and start the instance. You'll need to begin using these techniques immediately, every time you use your instance, in order to keep costs down.
When people begin using Amazon EC2, they often ask about the difference between logging out, stopping, or terminating an instance.
If you fail to stop your instances after you have finished working, you will quickly use up the Amazon Free-Tier credits and start seeing charges to your credit card.
Below are some reference instructions that you can use to stop and start your instance. (Do not stop your instance for at least 10 minutes after you first launch it. It needs time to configure Windows for the first time.)
You can return to this page throughout the course if you need help remembering how to stop and start your instance.
Use the instructions below to stop a Windows instance like the one you created in Lesson 1. Do not use these instructions for ArcGIS Server instances.
This stops the clock on the charges for running your instance. When an instance is stopped, no one can use your server and you cannot log in.
Use the instructions below to start a Windows instance like the one you created in Lesson 1. When you start your instance, it takes a few minutes to boot up, but you shouldn't have to wait the full 10 minutes that you waited when you first launched the instance. Always follow the instructions below when you start your instance:
After a few minutes, your instance will be ready to use with its Elastic IP. After repeating these steps a few times, you should have them memorized.
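The course has you stop and start instances in the AWS Management Console, but the same actions can also be scripted. The sketch below is one possible way to do that, not part of the course steps. It assumes the third-party boto3 library, whose EC2 client provides `describe_instances`, `stop_instances`, and `start_instances`; the instance ID and region shown are placeholders. A tiny in-memory stand-in client is included so the logic can be tried without an AWS account.

```python
def instance_state(ec2_client, instance_id):
    """Return the instance's state name, for example "running" or "stopped"."""
    response = ec2_client.describe_instances(InstanceIds=[instance_id])
    return response["Reservations"][0]["Instances"][0]["State"]["Name"]


def stop_if_running(ec2_client, instance_id):
    """Request a stop only if the instance is running; return True if requested."""
    if instance_state(ec2_client, instance_id) != "running":
        return False
    ec2_client.stop_instances(InstanceIds=[instance_id])
    return True


def start_if_stopped(ec2_client, instance_id):
    """Request a start only if the instance is stopped; return True if requested."""
    if instance_state(ec2_client, instance_id) != "stopped":
        return False
    ec2_client.start_instances(InstanceIds=[instance_id])
    return True


class FakeEC2:
    """In-memory stand-in for a boto3 EC2 client, for trying the logic offline."""

    def __init__(self, state):
        self.state = state

    def describe_instances(self, InstanceIds):
        return {"Reservations": [{"Instances": [
            {"InstanceId": InstanceIds[0], "State": {"Name": self.state}}]}]}

    def stop_instances(self, InstanceIds):
        self.state = "stopped"  # a real instance passes through "stopping" first

    def start_instances(self, InstanceIds):
        self.state = "running"  # a real instance passes through "pending" first


demo = FakeEC2("running")
print(stop_if_running(demo, "i-0123456789abcdef0"))  # placeholder instance ID
print(instance_state(demo, "i-0123456789abcdef0"))

# With a real account (requires `pip install boto3` and configured credentials):
#   import boto3
#   ec2 = boto3.client("ec2", region_name="us-east-1")  # region is an assumption
#   stop_if_running(ec2, "i-0123456789abcdef0")
```

Checking the state before acting keeps the script safe to run repeatedly, which is handy if you make it the last thing you do at the end of every work session.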
To view your accrued charges at any time, go to the AWS Management Console Billing page [11]. You can see detailed reports of your usage of each part of the service by clicking the Bill Details button at the upper right of the main Billing page.
I recommend that you view your credits after every lesson so that you understand whether you are in danger of excessive charges. If you consistently stop your instances after you are finished working, your costs will remain small.
After you've given your instance about 10 minutes to configure Windows, you can get ready to log in to the instance and start working with your software. The first thing you need to do is get the Administrator password so you can log in.
If something goes wrong with your instance, you can terminate it and create a new instance.
Typically when you log in to your instance, you'll open Remote Desktop Connection and type the user name Administrator, followed by the new password that you set above. To end your session, you can just close the remote desktop window. If you are going away for more than an hour, also make sure to stop your instance in the AWS Management Console.
The first few lessons of this course have relatively lengthy technical walkthroughs with lots of moving parts; therefore, your assignments will largely consist of showing evidence via screen captures that you were able to complete these steps. I'll also ask one or two reflection questions that you will answer to accompany these images. In Lesson 4, you'll have the opportunity to complete a more complex walkthrough and make a video demonstration of your work.
Please create a new document and put the following in it:
Submit your document to the lesson drop box on Canvas.
Cloud GIS is an emerging set of technologies, concepts, and work practices, rather than a settled set of services. This course is intended to provide you with both practical skills that you need to use cloud computing in your everyday work, and also the background and critical thinking you need to make decisions about how cloud computing should be used in your organization.
Most lessons feature a cloud computing discussion page that presents an important aspect of cloud computing and encourages you to envision its potential impact on GIS systems. Many of these will be accompanied by required readings that will guide you toward the discussion. We'll use several free online chapters of Rosenberg and Mateos's The Cloud at Your Service. Buying the entire book is optional, but I will list other chapters from it that will be helpful in some of the discussions.
The trends we will cover this term are:
I'll ask you to participate in threaded discussions with your classmates based on several prompts that I will provide on the topics above. These constitute an important part of your participation grade.
On the next page, you'll find your first cloud computing assignment. In this assignment, we will examine what cloud computing is, how we are using it, and how we plan to use it in GIS.
First, please read Chapter 1 in Rosenberg and Mateos's The Cloud at Your Service [13]. The book is available as a free preview from the publisher's website. Purchasing the entire book is optional for this class, and not required.
Also, please read this paper:
and this one:
Cloud Computing: A Solution to Geographical Information Systems (GIS) [15].
Please note that I am not endorsing the opinions expressed in these papers; instead, I am hopeful that these papers will stimulate a good discussion. Please feel free to agree or disagree with the authors.
In the previous lesson, you learned the basics about servers and clouds, and you got some experience setting up an EC2 instance. In this lesson, you’ll learn a little bit more about how a server can augment your GIS. You will also set up a new instance running Esri ArcGIS Server, which you will use in Lessons 2 through 4.
ArcGIS Server is just one part of Esri's ArcGIS Enterprise product suite that they market for sharing GIS across the web and internal organizational environments. We'll discuss ArcGIS Enterprise and Portal from time to time in later lessons; however, we are going to focus on ArcGIS Server here. Fortunately, the ArcGIS Server piece is relatively easy to get running in the cloud, and we'll concentrate on ArcGIS Server to understand how maps and GIS datasets you make on your desktop are exposed across the web.
Although you could potentially install ArcGIS Server on your own home or work computer, in this course, you will run ArcGIS Server on the Amazon Elastic Compute Cloud (EC2). Basically, you pay Amazon an hourly fee to run ArcGIS Server on their machines. This is an easy way to practice with a real server without compromising or adjusting your own machine. Running ArcGIS Server on Amazon EC2 also helps you learn about the cloud by using it.
At the successful completion of this lesson, you should be able to:
ArcGIS Server is Esri software that allows you to expose your GIS as a set of web services. It is just one component in a larger software suite called ArcGIS Enterprise that enables organizations to deploy their GIS onto the web. In Lesson 5, we'll talk more about the different parts of ArcGIS Enterprise.
Web services are software code or components that run on a specialized machine called a server. Web services receive requests from other apps and machines, called clients. The request might be to send some information or process some data. GIS web services do things related to sending and processing geographic information. Here are some examples of types of web services offered by ArcGIS Server.
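A map service is one such type. As a rough illustration of what a client request to a map service can look like, the sketch below builds the URL a client might send to ask for a map image. It assumes the usual ArcGIS REST pattern of a services root followed by the service name, `MapServer`, and an `export` operation with `bbox`, `size`, `format`, and `f` parameters. The host name is hypothetical, and your own server's root URL may differ.

```python
from urllib.parse import urlencode


def export_map_url(services_root, service_name, bbox, size=(800, 400)):
    """Build a request URL asking a map service to draw a PNG of a bounding box.

    bbox is (xmin, ymin, xmax, ymax); size is (width, height) in pixels.
    """
    params = {
        "bbox": ",".join(str(value) for value in bbox),
        "size": f"{size[0]},{size[1]}",
        "format": "png",
        "f": "image",  # ask for the image itself rather than a JSON description
    }
    # safe="," keeps the commas readable instead of percent-encoding them.
    query = urlencode(params, safe=",")
    return f"{services_root}/{service_name}/MapServer/export?{query}"


# Hypothetical host; SampleWorldCities is used here only as an example name.
url = export_map_url(
    "https://gis.example.com/arcgis/rest/services",
    "SampleWorldCities",
    bbox=(-180, -90, 180, 90),
)
print(url)
```

The client only needs to know the URL and its parameters; the drawing work happens on the server, which is the essence of the client and server relationship described above.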
ArcGIS Server works through the concept of distributed computing, in which you can increase the power of your server by adding more physical machines. For this reason, ArcGIS Server is made up of several different components that you can either install all on one machine or spread out among many machines. We won’t examine these components in much detail in this course, because you will have ArcGIS Server installed for you, and you will only be using one machine. However, below is a brief introduction of the most common components.
When you run a program on the Windows operating system, it runs as a specific user account and can only do things that the account can do. This is why you sometimes see Windows popping up messages that the program needs Administrator permissions to continue. That type of message means that the account running the program is not an administrator, so you need to manually confirm that it should temporarily be allowed to do something that only an administrator would ordinarily be allowed to do.
ArcGIS Server uses an account to run the GIS server, called the ArcGIS Server account [18]. This account is specified during the ArcGIS Server installation. You won't do much with the ArcGIS Server account in this course because it comes preconfigured when you run ArcGIS Server on Amazon EC2.
If you run ArcGIS Server in your own organization, you need to remember to give the ArcGIS Server account permission to read any GIS data used by the server. The account also needs permission to write to any datasets you will edit.
In this class, you’ll work with your own ArcGIS Server that runs on Amazon EC2. It has a GIS server and the ArcGIS Server account already configured. After logging in to your server, you’ll publish some map services and use them in a web app that you create. You’ll also learn techniques for speeding up your map services, using a tile cache, and how to use a map service for web editing.
This lesson gets you to the point of setting up a server, publishing a service, and making a simple web map on ArcGIS Online.
In the previous lesson, you used the AWS Management Console to set up an EC2 instance. When you build an ArcGIS Server site on Amazon EC2, you typically use a different approach; in our case, a resource called Cloud Formation [19]. This consists of a text file (a template) that pre-defines all of the parameters of the site you intend to build on the AWS platform and that can be deployed to install everything in an unattended manner. Existing Cloud Formation templates can be customized to deploy the precise system you need. Esri has developed Cloud Formation Templates that are already set up to do the heavy lifting of installing ArcGIS Enterprise in AWS, leaving us to provide only a few parameters.
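To give a feel for what such a template looks like, here is a deliberately tiny sketch built as a Python dictionary and printed as JSON. It is a toy, not Esri's template: the parameter names `AmiId` and `InstanceType`, the default instance type, and the AMI ID are invented for illustration. The top-level `Parameters` and `Resources` sections and the `Ref` function are standard template features. The helper converts your chosen values into the key/value list form that stack-creation tools generally expect, and fails early on a misspelled name.

```python
import json


def toy_template():
    """A minimal template: one EC2 instance whose AMI and size are parameters."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Toy example: one EC2 instance from a chosen AMI",
        "Parameters": {
            "AmiId": {"Type": "AWS::EC2::Image::Id"},
            "InstanceType": {"Type": "String", "Default": "t3.micro"},
        },
        "Resources": {
            "Server": {
                "Type": "AWS::EC2::Instance",
                "Properties": {
                    "ImageId": {"Ref": "AmiId"},
                    "InstanceType": {"Ref": "InstanceType"},
                },
            }
        },
    }


def stack_parameters(template, values):
    """Turn {name: value} into a ParameterKey/ParameterValue list, checking names."""
    declared = template["Parameters"]
    unknown = sorted(set(values) - set(declared))
    if unknown:
        raise KeyError(f"not declared in the template: {unknown}")
    missing = sorted(name for name, spec in declared.items()
                     if "Default" not in spec and name not in values)
    if missing:
        raise KeyError(f"no value supplied for: {missing}")
    return [{"ParameterKey": name, "ParameterValue": str(value)}
            for name, value in values.items()]


template = toy_template()
print(json.dumps(template, indent=2))
print(stack_parameters(template, {"AmiId": "ami-0123456789abcdef0"}))  # placeholder ID
```

Checking parameter names before deployment is a design choice worth copying: a mistyped value is a common cause of a failed stack, and it is far cheaper to catch on your own machine.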
It's possible to build simple one-machine ArcGIS Server sites manually with the AWS Management Console. You can even put several of these "siloed" sites under a load balancer to get more computing power. However, to get the full benefit of the ArcGIS Server architecture, in which multiple GIS servers process and balance loads in a peer-to-peer fashion, Cloud Formation Templates are the way to go.
Getting access to the ArcGIS Enterprise AMIs
Cloud Formation uses some Esri-created Amazon Machine Images (AMIs) behind the scenes to create your ArcGIS site. These AMIs have ArcGIS Server, ArcGIS Pro, and, in some cases, a database installed on them.
The AMIs require that you "bring your own license" and apply it to any Esri software that you run on the EC2 instances. In other words, Esri pricing is not built into the hourly fees for the instance, like it is with Windows. The Esri AMIs are accessible by anyone in the AWS Marketplace, but you must log in with your Amazon account and accept the terms and conditions for using them.
This doesn't actually launch anything right now; it simply establishes that you agree to the terms of using the particular AMI. However, if you don't perform this step and accept the terms, Cloud Formation will fail when you try to create a site. In fact, if you ever experience Cloud Formation failures in the future, you should check to make sure you have accepted the software terms for the exact AMIs that you are trying to use. There's nothing else you need to do on the AMI Marketplace page.
Security Requirements for ArcGIS Enterprise
Recent versions of ArcGIS Enterprise and Server require that all communications be performed over a secure channel. This means that anyone making a request for a map service or web app from your ArcGIS Enterprise/Server machine must do so using the https protocol rather than traditional http. You may have noticed that many websites you visit now appear with an https URL. HTTPS uses something called Secure Sockets Layer (SSL), or its modern successor Transport Layer Security (TLS), to encrypt all traffic that is sent between clients and the web server. In this way, any text that's sent, including passwords, usernames, and other content, is protected from hackers who might try to intercept or monitor it. Implementing SSL on a web server is good practice, which is why many websites and web services are utilizing it.
Enabling SSL on a web server isn't a trivial process, however, and it requires that an SSL Certificate be obtained and installed. SSL Certificates are issued by authoritative providers that verify the identity of your web server and provide an assurance that the communication channel clients establish with the server is properly encrypted. It makes sense that only authorized providers issue SSL Certificates; otherwise, anyone could generate them and deploy them improperly. Further complicating this process is that SSL Certificates are attached to the fully-qualified domain name rather than the IP address of a web server.
Every web server has an IP address, which has the form xxx.xxx.xxx.xxx, that uniquely identifies it on the Internet, but clients typically don't use that number to communicate with it. Instead, clients (like you in your web browser) use a fully-qualified domain name to call a server. A fully-qualified domain name is the host name you would enter in a URL to visit a website, for example, www.pasda.psu.edu [21] or www.arcgis.com [22]. Domain names are linked to IP addresses using a registry called DNS (Domain Name System). Anyone wanting to attach a domain name to their server's IP must make a request to a DNS server. This request is performed by authorized Internet service providers.
So, to enable SSL on our ArcGIS Enterprise/Server machines, we need to do two things: (1) assign a unique, fully-qualified domain name to our Elastic IP in DNS, and (2) generate and install an SSL Certificate that refers to our domain name. To facilitate the setup of our ArcGIS machines in AWS, I have performed these steps for you. I assigned you a domain name in the form namegeog865####.e-education.psu.edu, and registered it in DNS by linking it to the Elastic IP you created in Lesson 1. I also generated SSL Certificates for you using the same domain name I assigned you. That being completed, the process of installing and configuring these on your ArcGIS machines is trivial using the Cloud Formation Template; all you need to do is reference your domain name and SSL Certificate in the template and Cloud Formation does the rest.
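To see the link between a certificate and a domain name for yourself, the following sketch uses only Python's standard library. `fetch_certificate` connects to a server and returns its certificate as a dictionary (it needs network access and is not called here), while the two helpers inspect the DNS names listed in such a dictionary. The sample certificate is made up, and the simple comparison deliberately ignores wildcard names such as `*.example.com`.

```python
import socket
import ssl


def fetch_certificate(hostname, port=443, timeout=10.0):
    """Connect over TLS and return the server's certificate as a dict.

    The default context verifies both the certificate chain and that the
    certificate was issued for the host name we asked for.
    """
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()


def dns_names(cert):
    """List the DNS names a certificate dict says it is valid for."""
    return [value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"]


def covers(cert, hostname):
    """Exact, case-insensitive match only; wildcard names are not handled."""
    return hostname.lower() in {name.lower() for name in dns_names(cert)}


# A made-up certificate dict in the shape returned by getpeercert().
sample_cert = {
    "subjectAltName": (("DNS", "gis.example.com"), ("IP Address", "192.0.2.10")),
}
print(dns_names(sample_cert))
print(covers(sample_cert, "GIS.example.com"))
print(covers(sample_cert, "192.0.2.10"))
```

The last check shows the complication in practice: a client that connects by raw IP address will not find that address among the certificate's DNS names.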
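The distinction between an IP address and a domain name can be checked with a few lines of standard-library Python. In this sketch, `host_of` pulls the host part out of a URL, `is_ip_address` tests whether that host is a literal address, and `resolve` asks DNS for the address behind a name. `resolve` needs network access, so it is defined but not called here. The address 192.0.2.10 is a reserved documentation address, not a real server.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def host_of(url):
    """Return the host portion of a URL (a domain name or a literal IP address)."""
    return urlsplit(url).hostname


def is_ip_address(host):
    """True if host is a literal IPv4 or IPv6 address rather than a name."""
    try:
        ipaddress.ip_address(host)
    except ValueError:
        return False
    return True


def resolve(domain_name):
    """Ask DNS for the IPv4 address behind a domain name (requires network)."""
    return socket.gethostbyname(domain_name)


for url in ("https://www.pasda.psu.edu/", "https://192.0.2.10/arcgis"):
    host = host_of(url)
    kind = "IP address" if is_ip_address(host) else "domain name"
    print(host, "is a", kind)
```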
In this part of the lesson, you'll use Cloud Formation to create an ArcGIS Enterprise site on Amazon EC2.
Before we proceed to create a new EC2 machine instance for Enterprise, I recommend that we terminate the instance and storage you created in Lesson 1. We won't use that machine or its storage subsequently, so we may as well remove it and not incur any more potential costs.
To simplify the Cloud Formation installation, we will upload a few config files to an S3 Bucket, from which the template can access them. You will refer to them later as you customize the template parameters.
Your new machine instance is now set up and ready for you to log into and start working with ArcGIS Server.
Debugging Resources:
If you receive an error in the CloudFormation Event page, you may see information about which step in the process caused the issue; the error may appear in red text on the stack page. However, the Event logs in CloudFormation sometimes aren't too helpful. This is because the error often occurs after CloudFormation has successfully created your EC2 Instance, while the ArcGIS software is being configured on the machine instance itself. Errors reported by CloudFormation don't include specifics about problems encountered on the EC2 Instance; rather, those problems are logged in files saved on your EC2 instance. To view those logs, check to see if your EC2 Instance was created and still appears in your AWS Management Console. (If it is not there, repeat the CloudFormation process, being sure that the "preserve successfully provisioned resources" option is set to True.) If it is there, proceed to create your Windows username and password and use Remote Desktop to log into it. On your EC2 virtual machine, open a File Explorer and use the View - Options - Change Folder and Search Options settings to be sure you can see protected operating system files, file extensions, hidden folders, etc. The log folder that ArcGIS generates is hidden by default.
Browse to C:\cinc and open arcgis-enterprise-primary.log in a text editor. You'll see entries with their respective timestamps as they occurred during the install. Scroll through the entries in chronological order until you encounter one with a Warning or Error indicator. That should indicate what the issue was. It is very common for us to enter the name of a license file, domain name, or anything else incorrectly in the CloudFormation template. The log file in C:\cinc usually provides information we can use to deduce where the error/typo occurred. If you are unable to interpret the error logs and find the culprit, feel free to send the log file to me and we will get to the bottom of it.
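If the log is long, scrolling for the first problem can be tedious. The sketch below automates the same search: it reads lines in order and reports the first one containing a warning, error, or fatal marker. The exact format of the log is an assumption here; the sample lines are invented, and you may need to adjust the pattern to match the wording your log actually uses.

```python
import re

# Assumed markers; adjust if your log labels problems differently.
PROBLEM = re.compile(r"\b(WARN(?:ING)?|ERROR|FATAL)\b", re.IGNORECASE)


def first_problem(lines):
    """Return (line_number, text) for the first warning or error, else None."""
    for number, line in enumerate(lines, start=1):
        if PROBLEM.search(line):
            return number, line.rstrip("\n")
    return None


def first_problem_in_file(path):
    """Same search, reading from a log file on disk."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return first_problem(handle)


# Invented sample entries, only to show the function at work.
sample = [
    "[2024-01-01T10:00:00] INFO: Starting installation",
    "[2024-01-01T10:05:00] INFO: Authorizing ArcGIS Server",
    "[2024-01-01T10:06:00] ERROR: License file not found",
    "[2024-01-01T10:06:01] FATAL: Stopping",
]
print(first_problem(sample))

# On the EC2 instance itself, you could try:
#   first_problem_in_file(r"C:\cinc\arcgis-enterprise-primary.log")
```

Stopping at the first match mirrors the advice above: later errors are often just consequences of the first one.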
Now that you have an ArcGIS Server site running, let's take a quick tour to give you a feel for what's there.
You should now have a good feel for what's running on your ArcGIS Server site and the settings available there. The next item of business is to log into the EC2 instance itself and move some data there. This will allow you to publish your own web services on the ArcGIS Server site.
Now that your site has been created and started, you can get ready to log in to the instance and start working with your software. Some of these steps will be similar to what you did in Lesson 1, but please follow them closely.
The password rules are fairly stringent; please see them in the image in Figure 2.1, below.
The following paragraph talks about disabling IE enhanced security on your EC2 machine. An alternative to doing that is to simply install the Google Chrome browser on your EC2 machine and use it instead of Internet Explorer. You may use Internet Explorer to browse to the Google site to download and install Chrome.
As a security precaution, it's usually not a good idea to go around browsing the web from your production server machine. To do so is to invite malware intrusions onto one of your most sensitive computers. The operating system on your instance, Windows Server 2012, enforces this by blocking Internet Explorer from accessing most sites. This is called IE Enhanced Security Configuration (ESC).
IE ESC gets burdensome when you're using the server solely for development or testing purposes like we are. To smooth out the workflows in this course, you'll disable IE ESC right now and leave it off for the duration of the course.
Remember that if you are going away for more than an hour, you should stop your instance in the AWS Management Console. (Only stop your machine instance. Leave your storage volume(s) and Elastic IPs as they are. Deleting them may require that you completely rebuild your virtual machine.)
ArcGIS Server on Amazon EC2 comes preconfigured with some running services and data. These can help you understand how the server works and they're also a good way to verify that your server is running correctly. Let's take a few minutes to look at these items.
Now that you've seen what's preconfigured on your server, you'll learn a little more about how you can copy your own data onto the instance and start your own mapping web service.
One of the most challenging aspects of moving to a cloud deployment is transferring data from your local (on-premises) environment onto the cloud. In this section of the lesson, we'll look at special problems that arise in data transfer scenarios. We'll also discuss ways data can be moved to Amazon EC2, and you'll copy some GIS data to your own instance in preparation for publishing a web service.
For your data to go from your machine to commercial cloud services such as Amazon EC2 or Amazon S3, it must go "across the wire", meaning it is transferred through the Internet onto the cloud-based server. This can pose the following issues:
Let's examine these problems one at a time.
GIS data collections can be very large: up to terabytes in size. This is often the case when imagery is involved, but even vector datasets with a broad amount of coverage or detail can prove unwieldy for an Internet transfer.
When moving large datasets to the cloud, you have to plan for enough time to move the dataset and, if possible, increase your bandwidth. After doing a test transfer of a few hours or days, you should be able to get an idea of the rate of data transfer, and you can thereby extrapolate how long it would take to transfer the entire dataset.
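The extrapolation described above is simple arithmetic. The sketch below uses made-up numbers and a hypothetical function name, and it assumes the transfer rate you measured in the test stays roughly constant for the full transfer, which may not hold on a shared connection.

```python
def estimate_transfer_days(total_gb, sample_gb, sample_hours):
    """Extrapolate total transfer time (in days) from a timed test transfer."""
    if total_gb < 0 or sample_gb <= 0 or sample_hours <= 0:
        raise ValueError("sizes and durations must be positive")
    rate_gb_per_hour = sample_gb / sample_hours
    return total_gb / rate_gb_per_hour / 24


# Hypothetical numbers: a 2,000 GB collection, and a test that moved 50 GB in 5 hours.
days = estimate_transfer_days(total_gb=2000, sample_gb=50, sample_hours=5)
print(f"Estimated transfer time: {days:.1f} days")
```

With these numbers the rate is 10 GB per hour, so the full transfer would take 200 hours, or a little over eight days.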
If this amount of time is unreasonable (say, months) you may consider shipping the data directly to the cloud provider on a piece of hard media. The cloud provider can then load the data directly onto the cloud much faster than you could send it over the Internet. Amazon provides such a service called AWS Snowball [23]. You load up your data on a ruggedized secure device called a "Snowball" and ship it to Amazon. In the old days of computing this technique was called "sneakernet", since you could sometimes put your data on a floppy disk and walk it across the office to another computer faster than you could send it electronically.
Cloud-based data centers like Amazon's are built to handle high levels of data traffic coming in and out. However, your own transfer to the cloud may be limited by a slow connection or lack of available bandwidth. Some IT departments and internet service providers (ISPs) throttle or cap the amount of data that can be transferred from any one machine or node in the network. These types of policies are sometimes put in place to prevent the use of file-sharing services such as BitTorrent that violate company policy or simply monopolize the organization's available bandwidth. However, sometimes these policies can negatively affect legitimate business needs such as transferring data to the cloud. If you find yourself in a situation with low bandwidth, it might be helpful to visit with your IT department to understand if your machines are being throttled and could be granted an exception. If an exception is not possible due to other bandwidth needs within the company, you might explore whether your data transfer could occur during off-hours such as nights or weekends.
Confidential or proprietary datasets, such as health records, may require extra security measures for transfer to the cloud. When dealing with sensitive data, the first question to answer is whether it is legal or feasible for the data to be hosted in the cloud in the first place. For example, some government organizations responsible for national security may possess classified or secret data that could never be uploaded to Amazon's data centers no matter the measures taken to ensure secure data transfer. Also, some organizations may not have the desire or permission to host datasets on servers that are physically located in a different country.
Other types of datasets may be okay to host on the cloud but must be encrypted during transfer, to prevent a malicious party from using any data that may be stolen en route to the cloud server. Secure socket layer (SSL) connections (HTTPS) and secure FTP are two techniques for encrypting data for Internet transfer.
Sometimes the ability for one computer to directly "see" or communicate with another computer is hindered by firewalls or network architectures. For example, your computer at work is probably allowed to only access the file systems of other computers on your internal network. You could potentially open up a folder on your Amazon EC2 instance for access by anyone but this opens a security risk that malicious parties could find the folder and copy items into it.
There are a number of strategies that people use to get around these limitations when transferring data into Amazon EC2 and other cloud environments. These include:
The ArcGIS Server on Amazon EC2 help has an overview of data transfer techniques. Please take some time right now to read Strategies for data transfer to Amazon Web Services [24].
In this part of the lesson, you'll copy some data to your EC2 instance in preparation for publishing a web service. Before you attempt these steps, you should be logged in to your EC2 instance through Windows Remote Desktop Connection. If you followed the steps earlier in the lesson for connecting via Remote Desktop then your local disk drives should be available to the instance.
For simplicity in this course, you'll follow the workflow of transferring all data to your EC2 instance, working with ArcGIS Desktop on your EC2 instance, and publishing to ArcGIS Server on your EC2 instance. Theoretically, you could do most of the desktop work on your own computer and then publish up to the server when you were ready. However, any time you introduce separate computers into the architecture, especially on different networks (in the case of your home computer and your EC2 instance), things can get more complicated. Because you have a limited time available to learn about ArcGIS Server, I want you to spend the time experimenting with the capabilities of the server, not worrying about network issues or which machine contains the data.
However, in large organizations, these challenges of distributed architectures are inevitable. Some GIS shops might have a GIS server administrator who controls access to ArcGIS Server, and a number of cartographers and desktop GIS users who just prepare the maps for publishing. This latter group of "publishers" work on machines that are separate from the server and may even reside on a different subnet than the server. In some cases, the publisher machines and the server machines use different copies of the data that are kept in sync by an automated process, and the paths to the data used by the publishers may be different than the paths used by the server.
To help manage these scenarios, ArcGIS has the ability to "register" a data location, meaning that you provide ArcGIS Server with a list of data locations you typically use. If the publishers use a different path to the data than the server uses, you can provide both the paths. Then, when you publish a service, the map is copied to the server and all the paths in the map are switched to use the server's path instead of the publisher's path.
Data registration can be a difficult concept to grasp with just a verbal explanation, so please take a few minutes to read the help topic registering data on ArcGIS Server [26]. This has some diagrams of different situations where data registration can be particularly useful. It is one of the most important help topics for ArcGIS Server.
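As a rough illustration of the path switching that registration makes possible, the sketch below models a registration as a pair of folder paths and rewrites a publisher's path into the server's path. This is a toy model, not ArcGIS code: the function name, the drive letters, and the folder names are all hypothetical.

```python
from pathlib import PureWindowsPath


def to_server_path(publisher_path, registrations):
    """Translate a publisher's data path into the server's path.

    `registrations` is a list of (publisher_folder, server_folder) pairs.
    Returns None when the path is not under any registered folder.
    """
    path = PureWindowsPath(publisher_path)
    for publisher_folder, server_folder in registrations:
        try:
            relative = path.relative_to(PureWindowsPath(publisher_folder))
        except ValueError:
            continue  # not under this registered folder; try the next one
        return str(PureWindowsPath(server_folder) / relative)
    return None


# Hypothetical registration: publishers see the data on Z:\gis, the server on C:\data.
registered = [(r"Z:\gis", r"C:\data")]

print(to_server_path(r"Z:\gis\trails\at_centerline.shp", registered))
print(to_server_path(r"D:\scratch\temp.shp", registered))
```

In this model, a result of None stands for the case where no registered location matches the data path.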
Please note that if you try to publish a service and ArcGIS Server does not find any of the data paths in your map in its list of registered folders and databases, the data will be packaged up and copied to the server [17] at the time you publish. The copying ensures that no data paths will be broken in the published service. This automatic data copying is an interesting feature in some scenarios where the publishers do not have the rights to log in to the server machine, but it is not an appropriate workflow for managing large amounts of data. The best approach is to make sure you set up workable data locations on the publisher's machine and the server machines, and then carefully register those locations with ArcGIS Server. In some cases, like ours, the publisher's machine and the server machine will be viewing the same path to the data.
Follow the steps below to register your C:\data folder with ArcGIS Server:
Now you're ready to publish a map web service using your Appalachian Trail dataset that you placed in C:\data. You'll do this in the next section of the lesson.
In the previous part of this lesson, you copied a map document to your EC2 instance. However, that map is still only available inside ArcMap on your instance. Now you'll take the step of publishing the map as a web service so that it can be used by anyone.
Whenever you publish a service, you begin the process in ArcMap, having opened the map document that you would like to publish. You run an analysis process on the map to find anything that might prevent it from being drawn by ArcGIS Server's drawing engine. You then set service properties and publish the service.
When you publish a service, you are giving the server a set of things that it can do with a particular map. In order for this to be useful to anyone, the client application and the server need to be able to communicate with each other in a way that both understand. There are several ways that an ArcGIS Server map service can allow itself to communicate with client applications.
Representational State Transfer (REST) allows a client to discover information about a service or invoke operations on a service using a known structure of URLs. REST is not really a communication protocol, but rather an architecture; a way of building a web service so that it has a hierarchy of resources and operations that can be accessed by formulating the correct URL.
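To make the idea of a "known structure of URLs" concrete, the sketch below builds URLs for a service and for one of its layers by walking down the hierarchy. The host name and service name are hypothetical, and the helper function is an illustration rather than part of any Esri library. The `f=json` parameter asks for the response in JSON format.

```python
from urllib.parse import urlencode


def rest_url(root, *resources, **params):
    """Join a services root, a chain of resources, and optional query parameters."""
    parts = [root.rstrip("/")] + [str(r).strip("/") for r in resources]
    url = "/".join(parts)
    return f"{url}?{urlencode(params)}" if params else url


# Hypothetical server and service name.
ROOT = "https://gis.example.com/server/rest/services"

service = rest_url(ROOT, "Trails", "MapServer", f="json")   # describe the service
layer = rest_url(ROOT, "Trails", "MapServer", 0, f="json")  # describe layer 0
print(service)
print(layer)
```

Each step down the path (services root, service name, service type, layer number) names a more specific resource, which is why a client can navigate a REST service knowing only the pattern.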
The actual bits of information sent "across the wire" can vary in format, but JavaScript Object Notation (JSON) is often used. JSON is desirable because of its well-known structured format and the fact that it can compact information into a minimal number of characters.
Here's an example of some JSON that describes a Pennsylvania municipalities map service [28]. Take a few moments to examine all the properties exposed in this JSON. This is actually an easy-to-read format of JSON with extra line breaks and spaces called "pretty JSON." Removing the spaces to get pure JSON makes it more difficult [29] for you to read, but reduces the information that the computer has to read and can, therefore, make your web service more efficient.
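The difference between the two forms is easy to see with a small example. The sketch below uses a made-up, heavily simplified service description (not the actual response from the service linked above) and writes it out both ways with Python's standard json module.

```python
import json

# Made-up, simplified stand-in for a service description.
service_info = {
    "mapName": "Layers",
    "layers": [{"id": 0, "name": "Municipalities"}],
    "capabilities": "Map,Query,Data",
}

pretty = json.dumps(service_info, indent=2)                # easy for people to read
compact = json.dumps(service_info, separators=(",", ":"))  # no extra whitespace

print(pretty)
print(compact)
print(len(pretty), "characters versus", len(compact))
```

Both strings decode to exactly the same information; only the whitespace differs, so the compact form is smaller to send.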
REST is stateless, meaning that any one request cannot depend on information sent in a previous or future request. All requests are independent of each other. This requirement can make for some interesting architectural considerations. For example, to support an interactive web editing session with REST, you must send an entire digitized feature to the database at once; you cannot send the feature vertex by vertex as it is digitized.
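The editing example can be sketched in code. In the illustration below, vertices are collected locally as the user sketches, and a single request body containing the whole feature is built only when the sketch is finished. The class is hypothetical. The `features` and `f` parameter names and the geometry layout follow my understanding of the ArcGIS feature service "add features" operation, so treat them as assumptions and check the REST API documentation before relying on them.

```python
import json
from urllib.parse import urlencode


class LineSketch:
    """Holds a line being digitized until it is complete (hypothetical helper)."""

    def __init__(self, wkid=4326):
        self.vertices = []
        self.wkid = wkid  # spatial reference ID; 4326 is WGS 84

    def add_vertex(self, x, y):
        # Stored locally only: no request is sent for an individual vertex.
        self.vertices.append([x, y])

    def to_request_body(self, attributes):
        """Build one self-contained, form-encoded request body for the whole feature."""
        if len(self.vertices) < 2:
            raise ValueError("a line needs at least two vertices")
        feature = {
            "geometry": {
                "paths": [self.vertices],
                "spatialReference": {"wkid": self.wkid},
            },
            "attributes": attributes,
        }
        return urlencode({"f": "json", "features": json.dumps([feature])})


sketch = LineSketch()
sketch.add_vertex(-77.86, 40.79)
sketch.add_vertex(-77.85, 40.80)
body = sketch.to_request_body({"name": "New trail segment"})
print(body)
```

Because the server keeps no memory of earlier requests, everything it needs to create the feature (geometry, spatial reference, and attributes) travels together in this one body.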
Because of REST's simplicity and efficiency, the Esri web mapping APIs for JavaScript, Flex, and Silverlight communicate with ArcGIS Server web services using REST.
Each GIS web service has its own specific purpose. It may support analysis performed inside an organization, or it may be intended to be used by anyone on the web. In this lesson, we'll assume that the Appalachian Trail service you just published is intended to be used by anyone on the web to explore and use in their own maps.
So, how could someone use your trails service in their own web map? A programmer could put the URL of your service directly into web app code and then write appropriate code to display the map. That's a topic for a different course, and ultimately writing code is something that many people cannot or will not do. In this part of the lesson, you'll use the ArcGIS Online map viewer, an interactive web map designing tool, to see how you can put together several services into a web map.
You might say that the ArcGIS Online map viewer is "running on the cloud". It is software as a service (SaaS), meaning you don't have to install any software in order to use it. When you save maps on ArcGIS Online, they are not saved to your computer; rather, they are saved on an Esri server. You can come back and work with your maps from any computer as long as you tell the application who you are by logging in.
To perform this exercise, your Amazon EC2 instance must be running, but you can do the steps on your local computer.
https://namegeog865.e-education.psu.edu/server/rest/services/<your service name>/MapServer
https://namegeog865.e-education.psu.edu/server/rest/services. You should see a hyperlink on the REST services page for your map service. Click it, and note the URL in the browser. This is what you can copy and paste into the ArcGIS Online map above.
So, what good is this map that you've made? As mentioned above, if you have a permanently running server with a permanent address, you might choose to save your map and share it with the public. People could then search for and view the map in ArcGIS.com. Another way the map can be used is by web app developers. Each map saved on ArcGIS.com is assigned an ID. Esri has designed their web programming frameworks (APIs) for JavaScript, Flex, and Silverlight such that a developer can just reference a map ID in the code, rather than building the map "from scratch".
For this week's assignment, create a new document and insert the following:
When you are finished working on this lesson, remember to stop your Instance in the AWS Management Console.
How cloud computing services are defined and used is a key part of understanding cloud computing and foundational knowledge in this course. However, cloud computing definitions vary from source to source. For this week's discussion assignment, I'd like you to look back at the NIST definition of cloud computing [31] we are using in this course, and also read Chapter 2 of The Cloud at Your Service (available here as a preview from the publisher [13]). Then, compare Rosenberg and Mateos' definition of cloud computing with the NIST-based definition.
Web services have the potential to expose your data to a much wider audience than may have previously seen it. But beyond allowing simple visualization of the data, web services can also permit editing and creation of data over the web. This type of "web editing" can allow field workers and people who typically don't use GIS to contribute valuable information to your database, information that you might not otherwise get.
For all its benefits, exposing a database on the web comes with some challenges. How do you protect your data from becoming corrupted? If you put a database on the cloud, how do you keep it in sync with the database in your office? And what happens if multiple users edit a feature at the same time?
This lesson explores some of the requirements and challenges related to making a GIS database available for editing on the web. You'll put some data in a SQL Server Express database on your EC2 instance, and you will use that data to design a map for web editing. You'll learn about how ArcGIS Server provides a special type of "feature service" that is engineered to allow editing through a web service. Finally, you'll make a web application that allows others to edit your data over the Internet.
Throughout this lesson, you'll be guided with step by step instructions. At the end of the lesson, you'll post a screenshot of your work. Pay close attention to what you are doing, because next week you will be assigned a project in which you will have to think through these processes on your own.
At the successful completion of this lesson you should be able to:
The ability to expose a GIS dataset to Internet users and allow them to modify it presents some enormous opportunities and challenges. A GIS professional needs to carefully understand and weigh these considerations before making decisions about how to make data available for web editing.
Web apps for editing GIS vector geometries were cumbersome and somewhat rare until about 2005. Attributes could be sent to a database through a web service fairly easily, but sketching geographic features on a screen posed some different problems. How could vertices be drawn in the web browser in real time as the user sketched them, without the entire page refreshing? Or how could a user view a snapping threshold on the screen while making a sketch? These problems were somewhat alleviated when AJAX came on the scene.
The bane of web developers up to this point had been the necessity of doing a full send and retrieval of information to the server in order to accomplish anything, with the ubiquitous "page blink" occurring in between. AJAX was not a particular product or feature, but rather a technique that web developers devised to work with existing technologies, with the goal of making their apps more interactive.
JavaScript is a language web developers use to program actions on web pages (in contrast to HTML, which is markup language used to lay out the static elements on the page). AJAX stands for Asynchronous JavaScript and XML. Web developers discovered that they could use JavaScript to send and retrieve XML packets of information from the server to create certain effects in their applications, without doing a full refresh of the page or requiring any type of browser plug-in. This revolutionized the interactivity of web applications.
Perhaps you remember the first time you saw Google Maps. This was actually one of the first programs, mapping-related or otherwise, to really give people an idea of the power of AJAX. Google Maps used AJAX requests to request pregenerated map images as the user panned and zoomed, creating a smoother web map navigation experience than most people had ever seen. Virtually all major commercial web mapping sites now use this approach.
AJAX techniques helped open the door for interactive editing of GIS geometries through web applications. Users could now sketch edits on their maps and see each vertex of the sketch drawn in real time without being interrupted by page blinks or waiting for the browser to respond. They could press a key and immediately see a snapping threshold that would be applied to a vertex. People began to think about the ways web browser-based editing could improve their GIS. In more recent years, people have also begun to consider the benefits of smartphone and tablet-based editing.
There are often many individuals within an organization who lack extensive GIS training, but could still contribute valuable information to the GIS. These include receptionists, field technicians, planners, and managers. Web editing comes with several advantages for these types of professionals.
First, virtually everyone has used a web browser, so an intuitively designed web application can be much less intimidating for them than a full-featured desktop GIS program like ArcMap.
Another advantage is that the web app can be specifically tailored to certain editing tasks that are within the audience's realm of expertise - no more, no less. If you need field technicians to sketch the locations of telephone poles, you can design a web app that allows sketching of telephone poles and nothing else. You can make the symbols big and round and even tappable on a smartphone by someone wearing large gloves. You can make the app as simple as needed, remembering that a simple app can still collect very valuable information.
Because everyone knows how to use a web browser, web editing gives you the potential to allow everyone to contribute to your GIS. When might this be a good thing? When everyone knows something that you don't! Or when the power of everyone can create something more complete, useful, or accurate than you can create on your own.
Enter two buzz terms that have crept into the GIS world in recent years with the advent of web editing: crowdsourcing and volunteered geographic information (VGI). Crowdsourcing is the idea of allowing anyone to edit an information repository, with the faith that this will make the repository more complete and accurate over time. Wikipedia is an example of a crowdsourced online encyclopedia. In the GIS realm, OpenStreetMap is an example of a crowdsourced street map of the world. People hold mapping parties for OpenStreetMap where they ride and walk around a town collecting data and then enter it all into a database. Regardless of your feelings on using crowdsourced data, this type of activity undoubtedly increases the quantity and accuracy of the data already in the database (especially if the database previously contained nothing).
Similarly, VGI allows individuals to enhance a dataset with information that they alone may possess. Whereas the term crowdsourcing evokes images of mass participation in the creation of a publicly available dataset, VGI is more versatile in that it can contribute to private or temporary datasets. For example, a city might set up a VGI application to allow citizens to report broken streetlights, graffiti, overgrown trees, and so on. This information may remain proprietary to the city and it may go away over time, unlike a crowdsourced database that is expected to remain more or less available.
You've learned that the beauty of web services is that they communicate by common architectures and protocols like REST and SOAP that can be invoked by any device with a connection to the Internet. These devices include the numerous tablets and smartphones that have hit the market in recent years. The ArcGIS Server feature service that you will create in this lesson could potentially be used in editing apps for iOS, Android, or Windows Phone devices.
Smartphone and tablet-based editing greatly facilitates the field work and crowdsourcing scenarios that you learned about above. It's a lot easier to record something you see on a street, such as a pothole, if you can send it to the database right away from your mobile device. The editing app may even allow you to attach a picture to your feature that you captured just seconds before with your phone!
Along with the benefits of web editing come a number of challenges. These need to be well understood and dealt with appropriately by anyone planning a web editing implementation.
When thinking about web security, it helps to consider all the different tiers, or levels, at which someone can access your system. Then consider how each tier might be vulnerable and how it might be secured. With web editing security, you need to at least consider the application tier, the web service tier, and the data tier.
If this sounds confusing to you, let's talk through them one at a time. First, consider the application tier. You need to decide which people will have access to your web editing application. Is it open to everyone on the Internet? This is the easiest to set up, but it also results in the most vulnerability. Your organization's existing firewalls might also make it fairly easy to set up an application that's only visible to members of your internal network, in other words, people who work for your organization. A trickier kind of application security to set up is one that has a login page where only certain members of your organization are allowed to log in, but not others. Setting up this type of security is certainly doable, but is beyond the scope of this course.
The next level of security to think about is the web service tier. An organization can set up its server such that certain services require a login to access. If the application itself also requires a login, the application developer must figure out how to get that name and password to be applied to the services accessed therein.
Regardless of whether you decide to require a login for your services, you should consider which layers should have editing allowed. There are some datasets that you'll want to expose for web editing, and others that you will not. You need to design your web services such that they only allow editing of those particular datasets that you want to have modified. Several days before writing this lesson, I came across a web map showing election results in a certain country. The authors of this map had used a feature service with popup balloons to display the election results. This was not a bad plan; however, the web service authors had inadvertently left the editing capability exposed on the feature service. I discovered that I could potentially click a popup balloon and literally rewrite the election statistics for any particular province!
Your software will give you controls over which features may be edited, and you must carefully understand and use these controls. Sometimes you may be required to group editable layers in one web service and non-editable layers in a separate web service that has different security settings.
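One way to catch the kind of mistake described in the election map story is to inspect a service's JSON description and look at which capabilities it advertises. The sketch below assumes the description has a `capabilities` property holding a comma-separated string, and that editing shows up as tokens such as Create, Update, Delete, or Editing. Those token names reflect my understanding of ArcGIS feature services and should be confirmed against your own server's output.

```python
# Assumed names of the capability tokens that allow changes to the data.
EDIT_TOKENS = {"Create", "Update", "Delete", "Editing"}


def exposed_edit_capabilities(service_info):
    """Return the edit-related capabilities advertised in a service description."""
    raw = service_info.get("capabilities", "")
    tokens = {token.strip() for token in raw.split(",") if token.strip()}
    return tokens & EDIT_TOKENS


# Made-up service descriptions for illustration.
read_only = {"name": "ElectionResults", "capabilities": "Query"}
editable = {"name": "Sightings", "capabilities": "Create,Delete,Query,Update,Editing"}

for info in (read_only, editable):
    found = exposed_edit_capabilities(info)
    status = "editable: " + ", ".join(sorted(found)) if found else "read-only"
    print(info["name"], "->", status)
```

A check like this could be run against every service intended for display only, so that an accidentally editable service is noticed before the public finds it.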
The final tier of security to consider is the data tier. By exposing your dataset for web editing, you are opening your database to many more people than would otherwise have access to it. You need to plan for the scenario where a malicious party could corrupt or delete your data. Keeping a backup or replica of your data is recommended in web editing scenarios.
In addition to the threat of a malicious party corrupting your data, it's possible that a well-intentioned user could make a mistake and negatively affect your database. Before you take all the changes submitted by your web editors and push them into your on-premises database, you might choose to have a GIS analyst examine the edits and perform a quality check. This type of scenario is possible if you maintain separate replicas or copies of the database for web editing and for on-premises work.
You can reduce the possibility of data corruption by carefully limiting the types of features and attributes that web editors can access and create. Later in this lesson, you'll use feature templates that ArcGIS provides for editing. The feature templates help you give the web editor a palette of approved features that can be created, while making it impossible or difficult to create other types of features. For example, if you want your web editors to add only 8", 12", or 16" pipes to the database (no 20" pipes, or fire hydrants), you can create a feature template with only those three types of pipes, with the size attribute preset for each one.
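The pipe example can be modeled in a few lines. The sketch below is a toy stand-in for the idea of a feature template, not the ArcGIS implementation: each template presets the attributes of one approved pipe type, and a feature can only be created by naming one of those templates. The attribute names are hypothetical.

```python
# Hypothetical templates: each one presets the attributes of an approved pipe type.
PIPE_TEMPLATES = {
    '8" pipe': {"type": "pipe", "diameter_in": 8},
    '12" pipe': {"type": "pipe", "diameter_in": 12},
    '16" pipe': {"type": "pipe", "diameter_in": 16},
}


def create_feature(template_name, geometry, templates=PIPE_TEMPLATES):
    """Create a feature from an approved template; reject anything else."""
    if template_name not in templates:
        raise ValueError(f"No template named {template_name!r}")
    # Copy the preset attributes so editing one feature never alters the template.
    return {"geometry": geometry, "attributes": dict(templates[template_name])}


feature = create_feature('12" pipe', [[0, 0], [25, 10]])
print(feature["attributes"])

try:
    create_feature('20" pipe', [[0, 0], [5, 5]])
except ValueError as error:
    print("Rejected:", error)
```

The editor never types the diameter, so there is no opportunity to enter a value outside the approved set.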
You've learned in this section that keeping a copy of your data for web editing and a separate copy for your on-premises work can be a good practice for maintaining security and data integrity. The tricky part is synchronizing the two copies at the appropriate times, with only the appropriate changes. ArcGIS contains a feature called geodatabase replication that can help with this.
The simplest option for replication is to make a one-way replica of the geodatabase, which is essentially a one-off copy made in a desired projection and format. This is useful for exposing a read-only database for web use. Some sites use one-way replication into the Mercator projection for their web database, since they need to use Mercator on the web but not in their office.
A more complex, but more useful action is to make a two-way replica of your database. This creates a copy of the database for web editing that can be synchronized with the original (or "production") database at intervals that you choose. A GIS analyst can potentially examine the web edits before synchronizing them with the production database. If the two databases are separated by firewalls or reside on different networks, an ArcGIS Server geodata service can be used to synchronize the two. This is beyond the scope of the course, but it's important for you to have a basic knowledge of these architectures in case you ever need to implement them.
If replication is a new concept to you, or you would like to learn more about it, you can read the first several topics in the ArcGIS help section Managing distributed data [32].
GIS vector datasets come in many formats. Some of these are better suited to web editing than others. Since we are working with ArcGIS Server in this exercise, we'll talk about some of the data formats that Esri offers and which ones are required for web editing. You'll then load some GIS data into a database on your EC2 instance.
Whether you're working with Esri software or not, one of the most ubiquitous formats for exchanging GIS datasets is the shapefile. This is a data format developed and openly documented by Esri, meaning that other software companies are allowed to use, create, and share shapefiles. A shapefile actually consists of multiple files with the same root name and different suffixes (.shp, .dbf, .prj, etc.) that store the data's geometry, attributes, projection information, and so on. You'll often see shapefiles available on GIS data warehouse sites that allow you to browse and download geographic datasets.
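Because a shapefile is really a set of files, it is easy to leave one behind when copying data to another machine. The sketch below lists every file in a folder that shares a root name with a given .shp file. It uses empty placeholder files in a temporary folder purely for demonstration, and the file names are made up. The .shx file in the example is the shapefile's index file.

```python
import tempfile
from pathlib import Path


def shapefile_parts(shp_path):
    """List the files in the same folder that share the .shp file's root name."""
    shp = Path(shp_path)
    return sorted(p.name for p in shp.parent.iterdir()
                  if p.is_file() and p.stem == shp.stem)


# Demonstration with empty placeholder files in a temporary folder.
with tempfile.TemporaryDirectory() as folder:
    for name in ("trails.shp", "trails.shx", "trails.dbf", "trails.prj", "roads.shp"):
        (Path(folder) / name).touch()
    parts = shapefile_parts(Path(folder) / "trails.shp")

print(parts)
```

Note that this simple match on the root name would miss companion files with a double suffix, such as a metadata file named trails.shp.xml.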
A shapefile is handy for exchanging data, but it's not very useful for web editing. Because the shapefile is an openly documented file format, it may be possible for a web developer to write an application that edits shapefiles. However, this would be a significant amount of work and ArcGIS does not supply out-of-the-box web editing functionality for shapefiles. Nor does ArcGIS support web editing with the shapefile's more advanced (but less openly documented) cousin, the file geodatabase.
In order to perform web editing with ArcGIS Server, your data must be stored in the ArcGIS Data Store or in a geodatabase hosted in a relational database management system (RDBMS). Here's what those terms mean:
Why are these things required for web editing with ArcGIS Server? One thing you have to consider is that when you configure editing on the web, you may not want to expose your main production database to everyone on the network. Your data is valuable. You may have spent thousands of dollars collecting it. It may be required to meet certain quality standards. To protect your data, you'll probably choose to expose a copy, or replica, of it for web editing. This replica goes on your EC2 instance. You'll keep a separate replica of the data in your on-premises environment. This on-premises replica can be protected by your firewall, data quality checks, and so on.
From time to time, you can synchronize the two replicas using ArcGIS software tools. This means that one replica gets sent the changes that were made to the other replica, and vice versa. ArcGIS Server even provides a special type of web service for synchronizing two replicas, called a geodata service.
Terms you may see during this lesson include geodatabase and feature class. Geodatabase is an Esri-coined term to describe a database containing related GIS datasets, tables, relationship classes, topologies, and so on. A feature class is a vector dataset within a geodatabase.
Let's load some data onto your EC2 instance and prepare it for web editing. Your whole goal is to make a map on your instance and expose it through an ArcGIS feature service, which is the type of service that you can edit over the web. The first step is to get the data onto your instance and load it into SQL Server Express.
The first part of this process for us is to install SQL Server Express and its required licensing and system components.
You learned in the previous lesson how publishing a web service requires some extra thought beyond just taking your existing map document and putting it on the server. It requires that you think about basemaps and business layers and separate those out into different services. It requires that you think about the coordinate systems of your data and the services you will overlay. Throughout this course, you'll learn of even more things to prepare for as you design a web service. In this section, we'll cover some considerations for web editing.
There are perhaps some layers in your map that you will want users to edit, and other layers that you will not want anyone to modify. For the most fine-grained control, the editable layers should be isolated into their own web service, with the non-editable layers being published in a separate service.
Once you isolate the editable layers into their own ArcGIS map document, you can set up feature templates that determine the types of items users will be allowed to create. You can predefine the symbology and some of the attributes of these items to make the job of your editors (and, as you will see, your web application developers) as simple as possible.
When the map is ready, you publish it to ArcGIS Server with the Feature Access capability enabled.
Let's take a look at how you can design a map document (.mxd file) for web editing. You will start with an MXD that was included in the BighornSheep lesson data that you downloaded previously.
In the next section, you'll publish both your maps to ArcGIS Server so that they can be used in a web editing app.
In order to get your maps into a web editing app, you need to publish them as services. Specifically, in ArcGIS, you publish your map as a map service, with the Feature Access capability enabled. This creates a feature service that can be used for web editing.
The terminology here can be confusing. Is it a map service or a feature service? The answer is...both. When you look at your GIS server as an administrator, you'll only see one service, a map service with the Feature Access capability enabled. However, when you look at the GIS server as a consumer of the service, for example when you are developing a web app with the service, you will see two ways that you can access the service. You'll see the map service URL and the feature service URL. You need to use the feature service URL in order to access web editing functions. The feature service provides methods (or REST "operations") for editing. These operations include Add Features, Update Features, Delete Features, and Apply Edits. You have to enable the Feature Access capability and use the feature service URL (it ends with "FeatureServer") in order to get these methods. They don't come with a regular old map service.
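To make the idea of a REST editing operation more concrete, here is a minimal Python sketch that assembles (but does not send) an Apply Edits request against a layer in a feature service. The host name, the layer index 0, the coordinates, and the Comments field are all hypothetical placeholders; only the "FeatureServer" ending and the operation itself come from the discussion above. Treat it as an illustration of the request shape rather than a complete client.

```python
import json
from urllib.parse import urlencode


def build_apply_edits_request(feature_service_url, layer_id,
                              adds=None, updates=None, deletes=None):
    """Return (url, form_body) for an Apply Edits call. Nothing is sent."""
    url = f"{feature_service_url.rstrip('/')}/{layer_id}/applyEdits"
    params = {"f": "json"}  # ask the server for a JSON response
    if adds:
        params["adds"] = json.dumps(adds)        # features to create
    if updates:
        params["updates"] = json.dumps(updates)  # features to modify
    if deletes:
        # Object IDs of features to remove, as a comma-separated list
        params["deletes"] = ",".join(str(oid) for oid in deletes)
    return url, urlencode(params)


# Hypothetical service URL; note that it ends with "FeatureServer".
SERVICE_URL = "https://example.com/arcgis/rest/services/BighornHabitat/FeatureServer"

# Hypothetical new point feature with one made-up attribute.
new_sighting = {
    "geometry": {"x": -107.2, "y": 44.6, "spatialReference": {"wkid": 4326}},
    "attributes": {"Comments": "Two ewes near the ridge"},
}

url, body = build_apply_edits_request(SERVICE_URL, 0,
                                      adds=[new_sighting], deletes=[12, 15])
print(url)
print(body)
```

A web editing app does essentially this work for you each time a user saves an edit, which is why the app must be pointed at the feature service URL rather than the map service URL.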
In the previous section, you created two maps: BighornHabitat and BighornReferenceLayers. You'll publish the BighornHabitat map as a feature service. The reference layers map also needs to be published, but it doesn't need to have the Feature Access capability enabled.
Try these steps:
Taking into account the service you published in the previous lesson, your Services Directory should now contain the following services:
In the previous sections of this lesson, you have laid all the groundwork for allowing web-based editing of your GIS datasets. You've set up the database, prepared the maps, and published a web service that allows editing. Your final step is to make a web application that allows editing.
Web editing can be a successful or frustrating experience for users depending on how the web services and app are designed. You already did some work with your web services to make them easy to visualize and understand. For example, you made only a few layers editable and verified that a feature template was available so that users could create only certain types of features and have some of the attributes pre-set.
In the same way that you made simple, focused services for editing, you also need to make a simple, focused application. An application that has too many buttons, functions, or GIS lingo can seem over-complicated to field workers and other professionals in your organization who may need to perform web editing. Fortunately, it's a lot easier to make a simple web app than a complex web app.
In this lesson, you'll build a web editing application using the ArcGIS [30] development tools, which will let you build in a what-you-see-is-what-you-get (WYSIWYG) environment so you can quickly and easily create an app. For those of you who are interested in going beyond the simple functionality, ArcGIS has an "extensible framework," which means you can build your own custom widgets and themes if you have some programming knowledge. We won't cover those extensions in this class, but you should know that you're not limited to the tools and templates that Esri provides.
Before we start creating the app, let's assemble the web map that we want to display inside of it. You'll do this using the same ArcGIS.com map viewer that you used in the previous lesson.
Now that you've got a web map set up, you can get down to the business of creating your web app.
In your ArcGIS Online website, click the Content link in the main menu.
Click the Create App button, and select Web AppBuilder.
Choose 2D, enter a title (e.g., BighornSheepEditingApp), tags (again at least one is required), and a summary (e.g., Bighorn Sheep editing app using ArcGIS Online), and then click OK.
You will see the Web AppBuilder for ArcGIS screen, and then you'll be redirected to a webpage displaying a theme and other graphical styles.
Choose a style and color scheme that you like. I'll leave it up to you to be creative.
The important elements live under the Map and Widget tabs. We'll start with the Map tab.
Click the Map tab, and click Choose Web Map.
Select the BighornWebMap you created above, and click OK.
This will bring in all the layers that you configured in the ArcGIS.com map viewer. That's all you need to do for the map design. Now, let's add some widgets.
Click the Widget tab and click Set the widgets in this controller and then the +.
From the list of widgets that appears, add the Edit and Measurement widgets. Also feel free to choose a couple of others.
Click Save in the bottom of the pane and then Launch. The web app will open in a new browser tab (it might take a few seconds to start). Take a quick look around.
Now, go to your own local computer (not your EC2 instance), and log in to arcgis.com. Click Content, then click your BighornSheepEditingApp, and click View Application.
From a look at the URL, you will see that your web app is, in fact, running on ArcGIS Online servers via the Penn State URL (pennstate.maps.arcgis.com); however, it still depends on services that are running on your EC2 instance. When you stop your instance, this app will not work as expected.
If you wanted to download the source code for this app and host it on your own web server, you could easily do that using the Download link that appears by each app in your ArcGIS Online content.
Test out your Web App thoroughly by reading and following along below.
Now that you have your app created, you can use your widgets to edit some of the underlying data. If you click the Edit widget, a sidebar window will appear with the list of editable layers within your feature layer. If you select the Sightings, you can add in some new sightings and some attributes. Try it!
You can also add a habitat area by drawing a polygon on the map (follow the on-screen instructions). You can use the controls at the bottom of the edit window to modify those changes (just like you might in the Edit window of ArcMap).
When you're done editing, click the X in the upper right of the window. These changes will be saved back to the server version. You could open your database and look at it in ArcMap on the EC2 instance to prove that this is the case.
Please create a new document or PowerPoint slide show. Paste your screenshots that you took from your app. Label each with a description of what is happening in the screenshot.
Also, answer the following question in a thoughtful paragraph: What things would you change about this walkthrough or the app design if this were a real-world deployment looking at animal sightings? If you're having trouble coming up with ideas, think about this question across several layers, or tiers, of the architecture, starting at the database tier and working your way up to the GIS server tier and the web application tier.
Upload this document to Canvas in the lesson drop box.
Cloud computing advocates often cite cost savings as a reason to adopt cloud computing. But there is no guarantee that any given project can be executed more cheaply using cloud computing than using traditional IT provisioning. This week, we will discuss how cloud computing economics might apply to your organization or situation.
First, please read this white paper from Amazon: The Well-Architected Framework - Cost Optimization Pillar [43]. This document is part of a series available here [44] on best practices for using AWS's infrastructure services. Also, if you have purchased the optional textbook The Cloud at Your Service, please read chapter 3, "The business case for cloud computing". Despite the title, it gives a reasonably objective view of how cloud computing costs break out for different kinds of users (start-ups, small and medium businesses, large businesses).
Second, please post your reaction in the lesson discussion in Canvas on the topic below:
There's so much to learn about ArcGIS Server and GIS server technology in general that it's impossible to cover it all in this course. Instead, we've chosen to focus on some of the issues most commonly faced by people setting up and running a GIS server. In Lesson 2, you learned how to set up a server and a web service, and you viewed that service on the web. In Lesson 3, you took that a step further and learned how to prepare data for editing over the web. You also made a fully-featured web application.
In Lesson 4, you will learn how to build rasterized tile caches to improve the speed of your map services. This is a practice used by major web mapping services such as Google Maps, Bing Maps, MapQuest, and the ArcGIS Online services that you have already used in this course.
Building and maintaining tile caches requires careful strategy and planning, far beyond just knowing how to push the buttons to make tiles. For this reason, map tiling can be a fun and intriguing subject to study.
At the successful completion of this lesson you should be able to:
By this point in the course, you may have observed that there's more than one way to take raw GIS data from your server and put it into a map in someone's web browser. Recall some of the map services you used in the previous two lessons:
Like the two other choices above, tiled maps also have their unique drawbacks. The biggest one is the time investment and server power needed to generate the cache, along with the disk space necessary to store it. Also, because a cache represents a snapshot of your data at one point in time, it requires maintenance. If your source data or your map symbology is edited, you have to update the corresponding tiles in order for people to see the changes.
In this lesson, you'll learn about designing a map with the goal of building a tile cache. You'll get a chance to make some tiles and use them on the web. Since the number of tiles in a cache can multiply with each scale level added and become unmanageably large, you'll also learn about strategies for building and updating very big caches.
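To see how quickly a full cache grows, consider this short Python sketch. It assumes a pyramid in which the smallest scale is covered by a single tile and each additional level halves the scale denominator, so every tile is replaced by four tiles at the next level. That layout is an assumption made for illustration; your own tiling scheme may use different scale steps.

```python
def tiles_at_level(level):
    """Tiles needed to cover the whole map at one level (assumes 4x growth per level)."""
    return 4 ** level


def tiles_through_level(max_level):
    """Total tiles in a complete cache from level 0 through max_level."""
    return sum(tiles_at_level(z) for z in range(max_level + 1))


for level in (0, 5, 10, 15, 19):
    print(f"level {level:2d}: {tiles_at_level(level):>15,} tiles at this level, "
          f"{tiles_through_level(level):>15,} in total")
```

Notice that the largest level always holds roughly three quarters of all the tiles, which is why the strategies for big caches concentrate on the largest scales.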
A word about different tile types before we begin: There are two main types of tiles commonly used in web maps today. The kinds of tiles we've been talking about above can be thought of as rasterized tiles; in other words, they are images made up of grids of pixels. Rasterized tiles are easy for clients to draw because most apps and all web browsers know how to display an image like a JPG or a PNG; however, the server has to construct the image and, after that, you're stuck with the colors and symbols you chose.
To get around issues with rasterized tiles, another type of tile, the vector tile, has been increasing in popularity. Vector tiles are similar in concept to rasterized tiles in the sense that they are square packets of information structured in a pyramid motif and sent by the server; however, they contain vector coordinates instead of a picture of the data. This allows the styling to be easily changed. Vector tiles are displayed as client-side graphics, so the client software needs to understand what a vector tile is and how to deal with it. Older mapping software and APIs may not be able to consume vector tiles.
We will talk more about vector tiles in Lesson 5 when we work with Mapbox software, since Mapbox pioneered this format and based their company on it. Esri vector tile support [45] is growing, although it has lagged behind that of Mapbox.
Be aware that all the remaining content in Lesson 4 refers to rasterized tiles, and some of the design and performance considerations discussed may be very different when thinking about vector tiles.
Building rasterized cache tiles is CPU and memory-intensive. Your server is making thousands of repetitive map draws, sometimes with a very complex MXD in the background. You can build a cache a lot faster if you assign the tile creation to a powerful machine.
This short-term need for high computing power is a perfect use case for cloud computing. A lot of offices don't have a powerful machine to spare for building tiles (usually their beefiest machine is the server that's already hosting their live apps and web services). In this situation, a server administrator could launch a high memory and/or high CPU instance for just a few hours for the purpose of building tiles. The extra cost is often worth the time saved in building the cache. Once the tiles are created, the machine can be shut down or scaled down.
For this lesson only, you'll change your ArcGIS Server site to run on a memory-optimized instance [46]. This costs significantly more than the general purpose instance [47] type that you've been using, but it will allow you to work with a complex map document and build cache tiles much faster.
Now that you are running on an instance that costs (at the time of this writing) $1.064/hour as opposed to 40 cents/hour, it's more important than ever that you remember to stop your site when you are done working on your lesson materials for the day. Also, be sure to set your instance type back to m4.2xlarge after building all your tiles in Lesson 4.
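A quick bit of arithmetic shows why stopping the instance matters. The sketch below uses the two hourly rates quoted above, which are only valid as of the time of writing; the three-hour work session and the forgotten week are made-up scenarios.

```python
MEMORY_OPTIMIZED_RATE = 1.064  # USD per hour, rate quoted in this lesson
GENERAL_PURPOSE_RATE = 0.40    # USD per hour, rate quoted in this lesson


def running_cost(rate_per_hour, hours):
    """Cost in USD of leaving an instance running for the given number of hours."""
    return round(rate_per_hour * hours, 2)


work_session = running_cost(MEMORY_OPTIMIZED_RATE, 3)          # hypothetical 3-hour session
forgotten_week = running_cost(MEMORY_OPTIMIZED_RATE, 24 * 7)   # instance left on for a week
forgotten_week_small = running_cost(GENERAL_PURPOSE_RATE, 24 * 7)

print(f"3-hour caching session:          ${work_session:.2f}")
print(f"Week left running (large type):  ${forgotten_week:.2f}")
print(f"Week left running (general type): ${forgotten_week_small:.2f}")
```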
You'll find during this lesson that a rasterized tiled map service takes a lot of planning. Let's look at a few of the considerations needed to get a map ready for publishing as a tiled service. You'll download and examine a predesigned map and publish it as a service in preparation for making some tiles yourself.
The first question to settle is whether or not to make a tile cache at all. If the map is going to put strain on your server or take a noticeable amount of time to draw (these two often go together), then you need to consider making a tile cache. Most vector basemaps that give geographic context to your web map contain a lot of layers and fall into this category. This is one reason that splitting up your layers into basemap services and business layer services is a good idea; you can potentially cache the basemap while leaving the business layers uncached.
Is it necessary to cache the business layers, since that kind of data changes more frequently? Google used to do it with the Wikipedia layer in Google Maps [50]. With so many features (Wikipedia articles) to show, and with the amount of traffic Google Maps receives, it was burdensome on the servers to draw those points on the fly. (Sadly, the Wikipedia layer is no longer offered.)
In addition to high traffic scenarios, you can also consider caching business layers when the map covers a relatively small extent, the data doesn't change very often, or the data is displayed at small scales only. Layers like weather radar need to be updated frequently, but are rarely viewed at large scales and require relatively few tiles in the cache, thus the update can be performed in a reasonable amount of time.
There are a lot of decisions you need to make about how to set up your tile cache, but the first choice is the set of scales at which you are going to generate tiles. These scales represent the snapshots at which web users will see your map. They also determine how long it's going to take to create the cache, and which other web services the cache will be able to overlay. Ideally, you'll decide on your set of cache scales before you start designing your map.
Keep these things in mind when choosing a set of scales:
Creating detailed vector basemaps of the type that are typically cached presents a grand cartographic challenge. In contrast to paper cartography, in which the map has to be designed at just one scale, the web basemap has to be designed to look good at every scale in your tiling scheme.
Designing this type of multilevel basemap can require you to include varying symbols at different levels of your map. For example, a road might be represented with a 3-point line width at a large scale, a 1-point width at a medium scale, and may not be visible at all at a small scale. Since ArcMap does not allow scale-dependent symbols, you'll sometimes need to add multiple copies of the same layer into your map, set different scale ranges on them, then assign appropriate symbols for each scale range.
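The layer-copy approach amounts to a lookup table from scale range to symbol. The Python sketch below models that table for the road example; the scale breakpoints (1:25,000 and 1:250,000) are hypothetical values chosen only to illustrate the idea, and the code is a conceptual model rather than anything that ArcMap itself runs.

```python
# One entry per copy of the roads layer:
# (scale denominator lower bound, exclusive; upper bound, inclusive; line width in points)
ROAD_LAYER_COPIES = [
    (0, 25_000, 3.0),         # large scales: thick line
    (25_000, 250_000, 1.0),   # medium scales: thin line
    # no copy covers smaller scales, so roads are not drawn there
]


def road_width_at(scale_denominator):
    """Return the road line width at a map scale, or None if roads are hidden."""
    for lower, upper, width in ROAD_LAYER_COPIES:
        if lower < scale_denominator <= upper:
            return width
    return None


for denominator in (10_000, 100_000, 1_000_000):
    print(f"1:{denominator:,} -> {road_width_at(denominator)}")
```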
It's also important to choose muted colors for the base map that look good, but do not overwhelm other layers placed on top. Go to Google Maps: Designing the Modern Atlas [51] to see some examples of how the Google Map design has toned itself down over time to be more accommodating to overlays. The Esri Light Gray Canvas basemap is another study of designing a basemap specifically as a backdrop for more important thematic or operational layers.
When web mapping exploded during the past two decades, some cartographers expressed their chagrin at the simple, uniform maps churned out by websites. Some may have thought their very jobs and livelihood were threatened. However, the years have shown that cartography holds a critical place in web mapping. Projects like the OpenStreetMap terrain layer [52] and the Esri World Topographic Map [53] incorporate very advanced cartographic techniques. In a sense, map tiling gave cartographers a ticket to ride in the web world, since these detailed maps would be too slow to serve dynamically.
No wonder some GIS professionals shrink at the thought of trying to design such a map on their own. Some organizations that lack an in-house cartographer have just limped along with the same symbols they used when more primitive map server technology was available. Others have imitated the colors and symbols of the ubiquitous Google Maps in their own basemaps (perhaps in response to a manager's demand, "Make our maps look like that!").
In response to queries about how the ArcGIS Online basemaps were constructed, Esri has released sample ArcMap documents using all the ArcGIS Online base map symbols. People can insert their own data into the map or simply copy the symbol settings into their own maps. Examining one of these maps provides a good lesson in multilayer basemap design.
In this part of the lesson, you'll download and examine a map template that Esri has provided for the ArcGIS Online street map. This sample map covers the Little Rock, Arkansas region. You'll then publish the map as a service and get it ready for creating tiles in the next section of the lesson.
Now that you've finished designing your map, you're ready to start creating the cache of map tiles. As an advance notice, you should plan at least one continuous hour to work on this page of the lesson.
In this lesson, you'll learn how to create tiles using ArcGIS Server. However, tiles can be created using many other types of GIS and mapping utilities. Mapnik [55] is an example, which is used to create the tiles for OpenStreetMap.
Map tiling has become so popular that the Open Geospatial Consortium (OGC) has even released the Web Map Tile Service (WMTS) standard, an open specification for how mapping web services should expose their tile sets. ArcGIS Server services that have a tile cache can respond to WMTS-formatted requests.
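For a sense of what a WMTS-formatted request looks like, this Python sketch assembles a key-value GetTile URL using the parameter names defined by the WMTS specification. The endpoint, layer name, and tile matrix set name are hypothetical; in practice, you would read the real values from the service's capabilities document.

```python
from urllib.parse import urlencode


def wmts_get_tile_url(endpoint, layer, matrix_set, level, row, col,
                      style="default", image_format="image/png"):
    """Build a WMTS key-value GetTile URL. Nothing is requested from the server."""
    params = {
        "SERVICE": "WMTS",
        "REQUEST": "GetTile",
        "VERSION": "1.0.0",
        "LAYER": layer,
        "STYLE": style,
        "FORMAT": image_format,
        "TILEMATRIXSET": matrix_set,  # which tiling scheme to use
        "TILEMATRIX": level,          # which scale level within that scheme
        "TILEROW": row,
        "TILECOL": col,
    }
    return f"{endpoint}?{urlencode(params)}"


# Hypothetical endpoint and identifiers, for illustration only.
tile_url = wmts_get_tile_url(
    "https://example.com/arcgis/rest/services/LittleRock/MapServer/WMTS",
    layer="LittleRock", matrix_set="default", level=12, row=1620, col=990,
)
print(tile_url)
```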
When you publish a map service or image service in ArcGIS Server, you can define whether it will have a cache and what the cache properties will be. You can either build the tiles right at the time the service is published, or you can instigate the tile building later using geoprocessing tools like Manage Map Server Cache Tiles. Building the tiles at publish time is appropriate for smaller cache jobs, and that's what we'll do in this lesson.
The tile cache you just built was pretty straightforward. You just gave the tool a map with symbology defined for each scale level, it created tiles, and within a few minutes, you had your cache. In this case, you were fortunate that you just needed a cache of Little Rock, Arkansas. But what if you needed a cache of the entire United States, or world, down to a large scale like 1:4,500? This could take days or weeks to build, and could require terabytes of disk space. Even if you were successful at building such a cache, would you be able to do it again if the source data were updated?
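You can get a rough feel for those numbers with a back-of-the-envelope calculation. The Python sketch below assumes 256-pixel tiles at 96 DPI, an area of about 8 million square kilometers for the contiguous United States, and an average tile size of 15 KB. All three figures are assumptions for illustration, and the estimate ignores projection distortion, so take the result as an order of magnitude only.

```python
import math

METERS_PER_INCH = 0.0254


def tile_ground_width_m(scale_denominator, tile_px=256, dpi=96):
    """Ground distance in meters covered by one side of a tile at the given scale."""
    return tile_px * scale_denominator * METERS_PER_INCH / dpi


def estimate_tiles(area_km2, scale_denominator, tile_px=256, dpi=96):
    """Approximate number of tiles needed to cover an area at a single scale."""
    width_km = tile_ground_width_m(scale_denominator, tile_px, dpi) / 1000
    return math.ceil(area_km2 / width_km ** 2)


def estimate_storage_gb(tile_count, avg_tile_kb):
    """Approximate disk space in GB (decimal) for a number of tiles."""
    return tile_count * avg_tile_kb / 1_000_000


ASSUMED_AREA_KM2 = 8_000_000   # rough area of the contiguous United States
ASSUMED_TILE_KB = 15           # hypothetical average tile size

tiles = estimate_tiles(ASSUMED_AREA_KM2, 4500)
print(f"Tile width at 1:4,500: {tile_ground_width_m(4500):.1f} m")
print(f"Tiles at 1:4,500 alone: {tiles:,}")
print(f"Storage at 1:4,500 alone: {estimate_storage_gb(tiles, ASSUMED_TILE_KB):,.0f} GB")
```

Under these assumptions, the single 1:4,500 level needs tens of millions of tiles and more than a terabyte of disk space, before counting any other scale level.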
This section of the lesson discusses strategic approaches for building large caches. These are presented in the order that they should be considered, meaning that if you skip down and implement one of the later strategies first, you still may end up doing things inefficiently.
If you need a tile cache that covers an enormous area at large scales, it would be worth your while to consider using one that someone else has built. Why go to the trouble if someone else has done it already? You've seen these types of worldwide tiled map services already throughout this course. They include ArcGIS Online, Bing Maps, and Google Maps. The companies who have built these caches have spent many thousands of dollars and hours collecting the data (often competing against each other for the best quality), building the tiles, and purchasing the hardware to serve them out in a rapid way. If you can get away with using them, you may save much time and resources.
The disadvantage of using someone else's tiles is that you cannot guarantee the accuracy or currency of the data. You don't get to choose the symbology or projection of the data either. Usually, you have to work in the Mercator projection.
Finally, if the tiled service goes offline for some reason or you lose your connection, you may have no control over when it will reappear. No server, whether it's maintained by Microsoft, Google, or Esri, can guarantee 100% uptime; however, this applies to your own servers as well. It's likely that these third-party services have better hardware infrastructure than your own when it comes to serving tiles; however, those tiles must still cross the Internet to get to your app, and that opens the door to potential connectivity problems.
Some organizations, especially those in the military and intelligence communities, have much of their network blocked from Internet access. Recognizing this, some tiled map service providers sell an appliance, basically a big server containing all the map tiles that can be plugged into your network. This eliminates the Internet access requirements, but still requires you to load periodic updates to the appliance. The Esri Data Appliance for ArcGIS [56] is an example of this type of appliance.
Some areas of a web map generate a lot more attention than other areas. Someone looking for directions to a particular house may zoom in down to the largest available scale in an urban area. However, in the middle of the desert where there are few geographic features to see, it's unlikely that someone would ever zoom to a very large scale such as 1:1100 (the largest scale offered by ArcGIS Online/Bing Maps/Google Maps).
Creating tiles at small scales isn't a problem since it takes relatively few tiles to cover the map, but if you are limited on time or disk space, it pays to be selective about which tiles you cache at the largest scales.
Some GIS professionals have a hard time accepting the fact that they don't need to create every tile at every scale. They feel that all places are created equal, and shudder at the idea that someone might zoom to an area of their map and see a "Data not available" image. In fact, such an experience is now commonplace among laypeople who use web maps, who tend to blame themselves when they see a "Data not available" tile ("Oh, I zoomed in too far") as opposed to blaming the server administrator ("Why isn't there a map here!?").
A useful website for countering the idea that "all places are created equal" was Microsoft Hotmap, an old project by Microsoft researchers to visualize tile usage in Virtual Earth (now Bing Maps). This site is no longer functioning, but a screenshot below will give you an idea of its appearance. You could open Hotmap and zoom into your town, then use the Select Data Level dropdown to visualize tile usage at different levels. At the zoomed out data levels, most of the tiles are requested fairly often. But when you get down to the zoomed in data levels (17 - 19), some clear patterns begin to emerge regarding where people want to see tiles: urban areas, major roads, coastlines, and other areas of interest. There are also some places where people never or rarely view tiles: wilderness areas, bodies of water, and so on. These are the tiles you don't want to spend your resources creating and storing (for more images and analysis see Fisher D 2007 Hotmap: Looking at geographic attention. IEEE Transactions on Visualization and Computer Graphics 13: 1184-91 [57] and Fisher D 2009 The Impact of Hotmap. WWW Document [58]).
A few years ago, one of the authors of this course undertook a project to selectively cache the state of California using the observed usage patterns in Hotmap. He and his colleague combined urban areas, roads, coastlines, and places of interest into a single vector dataset that covered about 25% of the land area of California, but included about 97% of its population. The use of this dataset to define tile creation, as opposed to the entire state boundary, saved nearly 1 million tiles when caching down to the 1:4500 scale (see Quinn S and Gahegan M 2010 A predictive model for frequently viewed tiles in a web map. Transactions in GIS 14: 193-216).
When using ArcGIS Server to create tiles, there are a couple of settings on the Manage Map Server Cache Tiles tool that allow you to be strategic about which tiles you create. These are the ability to check on and off the scales you want to create, and the ability to pass in a feature class boundary that will define the area of tile creation. For a large caching job, you'll probably run the tool at least twice. The first time, you'll have only the small scales checked, and you won't pass in a feature class; you'll just create all the tiles. The second time, you'll have only the large scales checked, and you will pass in a feature class constraining the area where you want to create tiles, just like you did in the previous section of the lesson where you passed in the urban Little Rock feature class.
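The two-pass approach can be written down as a simple plan. The Python sketch below does not call ArcGIS at all; it only splits a list of scales into the two runs described above, so that each resulting dictionary corresponds to one run of the Manage Map Server Cache Tiles tool. The scale values, the 1:50,000 cutoff, and the feature class path are hypothetical.

```python
def plan_cache_jobs(scales, selective_cutoff, area_of_interest):
    """Split cache scales into a full-extent pass and an area-limited pass.

    Scales whose denominator is at or below selective_cutoff are treated as
    large scales and are restricted to the area of interest.
    """
    small_scales = sorted((s for s in scales if s > selective_cutoff), reverse=True)
    large_scales = sorted((s for s in scales if s <= selective_cutoff), reverse=True)
    return [
        # Pass 1: small scales, no feature class, so all tiles are created
        {"scales": small_scales, "area_of_interest": None},
        # Pass 2: large scales, limited to the feature class boundary
        {"scales": large_scales, "area_of_interest": area_of_interest},
    ]


# Hypothetical scale list and feature class path, for illustration only.
jobs = plan_cache_jobs(
    scales=[577791, 288895, 144448, 72224, 36112, 18056, 9028, 4514],
    selective_cutoff=50_000,
    area_of_interest=r"C:\data\LittleRock.gdb\UrbanArea",
)
for number, job in enumerate(jobs, start=1):
    print(f"Run {number}: scales={job['scales']}, area={job['area_of_interest']}")
```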
The faster a map draws dynamically, the faster it will create cache tiles. All GIS software has its potential tweaks that can be made to increase performance, and ArcGIS is no exception. You've already learned, for example, that you can analyze your map using the Analyze button and see a list of potential performance issues.
Anything you can do to reduce computation will help your map draw faster. Matching the coordinate system of your source data, your data frame, and your web map will eliminate any costly projection on the fly. Saving out your labels to annotation (a way of storing labels in a database) will relieve the server from having to make label placement decisions while it is drawing your map. Spatial indexes [59] can help your map more quickly find the features that it needs to draw for each requested tile.
The more computing power you can put behind creating tiles, the faster you can build your cache. CPU and memory constraints are often more of a problem than having enough disk space to store the tiles.
There are two ways you can increase your server computing power, scaling up or scaling out. Scaling up means you replace your existing machine with something more powerful, like we did in this lesson. Scaling out means that you add more servers to your architecture, with these servers possibly all having the same size and spec.
The concept of having more than one server working on one job is called distributed computing. Although distributed computing can allow you to do great things, it comes with some unique challenges. All machines have to be able to see the data and access it, which may require some adjustment of paths used in your maps. For example, in a distributed setup, you want to use network paths like \\server\data, instead of local paths like c:\data. Cloud Formation sets up your site so that if you put your data in C:\data on the site server instance (one named SITEHOST, for example), you can reference it through the path \\SITEHOST\data from any machine in your site.
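If you have many local paths to adjust, the translation is mechanical. Here is a minimal Python sketch that rewrites a local path into its network equivalent, using the SITEHOST example from above. The geodatabase name in the usage example is hypothetical, and the function assumes that the local folder really is shared under the network path you supply.

```python
from pathlib import PureWindowsPath


def to_network_path(local_path, local_root, network_root):
    """Rewrite a path under local_root so that it starts at network_root instead."""
    # relative_to raises ValueError if local_path is not inside local_root
    relative = PureWindowsPath(local_path).relative_to(PureWindowsPath(local_root))
    return str(PureWindowsPath(network_root) / relative)


# Hypothetical dataset stored in the shared data folder on the site server.
print(to_network_path(r"C:\data\BighornSheep\sheep.gdb",
                      local_root=r"C:\data",
                      network_root=r"\\SITEHOST\data"))
```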
Distributed computing may also require some adjustment of security settings so that the tile creation software has permissions to access the data from any machine. In ArcGIS Server, this is accomplished by giving the ArcGIS Server account permissions to your data folder (Cloud Builder does this for you), and registering the data folder with ArcGIS Server (you did this earlier in the course).
Cloud computing can be an attractive environment for building caches, because you can access a higher level of computing power than you might typically have in your office. Usually, you only need it for a short period (a few hours or days to create all the tiles), so the prospect of renting a server by the hour becomes very attractive.
One challenge with building tiles in the cloud is moving them around. First, you have to get your data onto the cloud so that your caching software can quickly get it as the tiles are being drawn. Then you have to move the tiles back to their final home, which may be on premises. Both of these transactions involve moving data across the Internet and can be influenced by your organization's bandwidth and security policies.
When creating tiles with ArcGIS Server on Amazon EC2, it's a lot easier to scale up than to scale out. As you have seen, Amazon offers the option to change the instance type (in other words, CPU, memory, etc.) without terminating the instance. This is very handy when you start doing something and realize you need a bigger machine, although you are required to stop the instance before you change size. Some of the largest instance types on Amazon EC2 have an enormous degree of CPU power and may negate the need to scale out. Scaling out ArcGIS Server on Amazon EC2 is accomplished by adding more GIS server machines to your site.
Think back over the above strategies and consider why the techniques at the beginning should be employed before those at the end. It can be exciting to think about how many tiles you can build with distributed computing and all the computing horsepower that's available through the cloud. You may actually save the most time and resources by carefully planning which scales you want to create and selectively generating tiles at the largest scales. If the cache is still going to be overwhelmingly large, consider using an existing cache or a data appliance. By using a combination of the above strategies, you can usually find a way to build the cache you need, whatever the size.
A tile cache is just a picture of your data at one point in time. If that data ever changes, you need to update the cache. This final section of the lesson gives some practical considerations for updating and maintaining a cache over time.
Your update strategy probably should have come into consideration before you even decided you were going to create a cache. If you need to see data in real time, or you have frequent changes occurring over broad extents of the map, then creating a tile cache may not be appropriate.
For each map, there's a threshold of acceptable data currency. For a neighborhood street map available in your handheld GPS, you may find it acceptable if the street data is updated once every three months. For a tax assessor looking at land parcels, it may be acceptable to have the data current to within the past day or two. For a 911 operator tracking a vehicle's progress, a delay of more than a few seconds may not be acceptable.
If the cache update can be performed within the threshold of acceptable data currency, then it may make sense to create a cache. If the cache cannot be updated that quickly, then caching should not be used.
There are two approaches to cache updates: regenerate the entire cache, or focus the updates on places where the data has changed. If your entire cache can be rebuilt within the threshold of acceptable data currency, then the first option may be easier: you can just kick off a rebuild of all the tiles and be done.
If your cache is very large and it is undesirable to rebuild the entire thing, then you need some way to track places that have been edited (for the sake of this discussion, we'll call these "dirty areas"). You can then pass the dirty area polygons into your caching tools to define where tile updates should occur.
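The decision described above can be sketched as a small Python function. This is only an illustration of the reasoning, not part of any ArcGIS tool: the function name, the argument names, and the idea of estimating each duration in hours are all assumptions made for the example.

```python
def choose_update_strategy(full_rebuild_hours, dirty_update_hours, currency_threshold_hours):
    """Pick a cache update approach from rough time estimates (all in hours).

    All three inputs are estimates that you supply; nothing here is measured.
    """
    # Simplest case: the whole cache can be rebuilt within the threshold.
    if full_rebuild_hours <= currency_threshold_hours:
        return "full_rebuild"
    # Otherwise, see whether updating only the edited ("dirty") areas is fast enough.
    if dirty_update_hours <= currency_threshold_hours:
        return "dirty_areas"
    # Neither approach keeps the tiles current enough, so caching is a poor fit.
    return "no_cache"


# Hypothetical numbers: a 40-hour full rebuild, a 3-hour targeted update,
# and data that must be current to within one day.
print(choose_update_strategy(40, 3, 24))
```

The order of the checks mirrors the text: prefer the simple full rebuild when it fits, fall back to dirty areas, and reconsider caching altogether when neither fits.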
So how do you find the dirty areas? One approach is to track them as edits are being made. Each transaction can be logged to a database and, at the end of the edit session, the spatial extents of all the transactions can be exported to create a vector dataset of dirty areas.
If real-time tracking of the dataset editing is not an option, you can attempt to compare two datasets directly for attributes or spatial features that do not match. This type of strategy is required when you receive a dataset update without any record of how it was created (such as from a data vendor every six months). It requires that features have at least one key field in common between the two datasets. Comparing attributes is necessary if map symbolization or labeling could change based on a field value.
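To make the comparison idea concrete, here is a minimal pure-Python sketch. It assumes each dataset has already been read into a dictionary that maps the shared key field to a pair of (bounding box, attributes); that in-memory structure and the function name are assumptions for illustration, not how the tools mentioned in this lesson store their data.

```python
def find_dirty_extents(old_features, new_features):
    """Compare two datasets that share a key field and return dirty bounding boxes.

    Each dataset is a dict: key -> (bbox, attributes), where bbox is
    (xmin, ymin, xmax, ymax) and attributes is a dict of field values.
    """
    dirty = []
    for key in old_features.keys() | new_features.keys():
        old = old_features.get(key)
        new = new_features.get(key)
        if old == new:
            continue  # geometry extent and attributes both match
        if old is not None:
            # Deleted or changed feature: tiles at its old location are stale.
            dirty.append(old[0])
        if new is not None and (old is None or new[0] != old[0]):
            # Added or moved feature: tiles at its new location are stale.
            dirty.append(new[0])
    return sorted(dirty)


old = {
    1: ((0, 0, 1, 1), {"name": "A"}),
    2: ((5, 5, 6, 6), {"name": "B"}),
    3: ((9, 9, 10, 10), {"name": "C"}),
}
new = {
    1: ((0, 0, 1, 1), {"name": "A"}),         # unchanged
    2: ((5, 5, 6, 6), {"name": "B2"}),        # attribute changed
    4: ((20, 20, 21, 21), {"name": "D"}),     # newly added
}
print(find_dirty_extents(old, new))
```

Notice that an attribute-only change (feature 2) still produces a dirty area, which matters whenever symbology or labels depend on a field value.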
Accomplishing either of the above solutions in ArcGIS requires custom programming. Fortunately, this problem is common enough that people have posted some scripts and tools online that help address it. The Show Edits Since Reconcile [60] tool, written by Tom Brenneman, compares two versions of an ArcSDE geodatabase and outputs a feature class of spatial discrepancies. It can be installed into your list of toolboxes in ArcGIS. A similar tool, Compare two feature classes in a file geodatabase [61], written by Sterling Quinn, is designed for those who do not have their data in ArcSDE.
Basing an ArcGIS tile cache update on dirty areas requires some degree of caution. A feature class full of small, adjacent polygons can cause the Manage Map Server Cache Tiles tool to work slowly and inefficiently. If there are a lot of small dirty areas in close proximity, they should be merged before the dirty areas feature class is used to define a caching job.
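The merging step can be illustrated with a short sketch that works on bounding boxes rather than true polygons. This is a simplification and not the method used by any ArcGIS tool: the function names and the `tolerance` parameter are hypothetical, and merging by envelope can cover somewhat more area than the original polygons did.

```python
def boxes_are_close(a, b, tolerance):
    """True if boxes a and b overlap or lie within `tolerance` of each other."""
    return (a[0] - tolerance <= b[2] and b[0] - tolerance <= a[2] and
            a[1] - tolerance <= b[3] and b[1] - tolerance <= a[3])


def union(a, b):
    """Smallest box (xmin, ymin, xmax, ymax) that contains both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def merge_dirty_boxes(boxes, tolerance=0.0):
    """Combine nearby dirty-area boxes into fewer, larger boxes."""
    merged = []
    for box in boxes:
        box = tuple(box)
        absorbed = True
        while absorbed:
            absorbed = False
            for other in merged:
                if boxes_are_close(box, other, tolerance):
                    # Absorb the neighbor, then re-check against the rest,
                    # because the grown box may now touch other boxes.
                    merged.remove(other)
                    box = union(box, other)
                    absorbed = True
                    break
        merged.append(box)
    return sorted(merged)


dirty = [(0, 0, 1, 1), (1.5, 0, 2.5, 1), (10, 10, 11, 11), (2.6, 0.5, 3, 2)]
print(merge_dirty_boxes(dirty, tolerance=1.0))
```

With a tolerance of 1.0 map unit, the three boxes near the origin collapse into one, while the distant box stays separate. Choosing the tolerance is a trade-off: a larger value gives the caching job fewer polygons to process but rebuilds more tiles that did not actually change.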
It's common to perform tile cache updates on a regular basis, such as every three months, every week, or every evening. Because caching is so resource-intensive, many server administrators like to build the updated tiles on a staging server and then copy them to their production server. This avoids disruption to those who are viewing tiles on the live website.
Whether you use a staging server or not, it's wise to perform the update during times when the fewest possible individuals will be using your site. For most sites, this is during the early morning hours or the weekend. Since you probably do not want to log in at 2 AM Sunday morning to run your caching tools, it's worth exploring whether your tile caching software can be automated and scheduled to run at given times.
The ArcGIS tools, for example, can be automated using a Python script. Python is a relatively simple programming language to learn, and it can be used to run any ArcGIS tool, including Manage Map Server Cache Tiles. For a full update process, you might decide to chain several tools and functions together in one script, such as:
Once you have a script that does everything you need, you can use your operating system to schedule it to run on a regular basis. Task Scheduler, included with Windows, is an example of a program that can run scripts on a repeated basis at any time you specify (such as nights or weekends).
Python scripting with ArcGIS is taught in Penn State's Geog 485: GIS Programming and Software Development [62]. If you're curious to see an example of a Python script that updates a cache, check out the ArcGIS help topic Automating cache creation and updates with geoprocessing [63].
In this assignment, you will put together all of the ArcGIS Server skills that you learned in Lessons 2 - 4. Starting with a folder of raw GIS datasets, you will compose maps, publish them as web services, and assemble those services into a web application. You will create a video tour of your web application so that you don't have to leave your server running as the project is graded.
The data for this assignment consists of vector feature classes covering an area around a town. I downloaded these from the State of California Geoportal [64] (formerly the California Spatial Information Library - CaSIL) and did some post-processing on them so that they cover the same extent. Don't worry too much about what town this really is; for this assignment, consider that it could be Anytown, USA.
Download the data for this assignment [65]
Pretend you work for a town that up until now has only done GIS in the desktop realm (maybe there is no pretending needed). You are moving to ArcGIS Server for the first time. You want to take your GIS data and make it available in a series of highly-focused web applications.
Your first application will focus on your urban flooding dataset. This is a point feature class that shows areas in the city that tend to pool with water and flood during a storm event. Your web app will allow "non-GIS-trained" personnel in other city departments to add and remove points from this layer.
You've been asked to create a basemap web service that will be used as a backdrop in this web application and other apps your town will create in the future. You must design this basemap yourself and create a tile cache for it. An existing basemap from ArcGIS Online, Bing Maps, or Google Maps cannot be used because the map needs to show your town's own data. However, you can imitate design principles and techniques used in those maps.
You are also to create a separate web service containing only the urban flooding layer. This layer should be exposed as a feature service and should be editable. This involves loading the source data into SQL Server Express as shown in Lesson 3.
Once you have created these two web services, you must overlay them in a web application that allows the urban flooding service to be edited by the application user. Do this using the ArcGIS Web AppBuilder unless you already have extensive coding experience with another API such as the ArcGIS API for JavaScript.
Because this assignment takes a fair amount of time, there is no cloud computing discussion assignment this week.
To minimize the amount of time your cloud-based server is left running, this project will be graded based on a short video tour of your app. You should record this using Zoom, Screencastomatic, or a comparable screen recording utility of your choice. Your video must demonstrate the following features in your ArcGIS Services Directory and your flooding application. Each item is worth 3 points, resulting in a total of 30 points available for this project (making it three times the value of a typical weekly assignment):
I recommend you use your video recording software to export an .MP4 file or some other easily shareable format. You can host the file on YouTube, your PSU Microsoft OneDrive space, or some other online repository and provide a link (make sure it is viewable by the faculty). Zoom [66] is a tool available to PSU faculty, staff, and students that allows you to share and record your screen easily. Zoom recordings are saved as .MP4 files. If you don't want to put the video online or can't get that to work, you can upload it to Canvas. Contact your instructor if these options don't work.
Do not host the video on your EC2 instance. Your instance should be stopped when you are not working on this course.
This lesson marks a shift in the course where we will move away from talking about ArcGIS Server running on Amazon EC2 infrastructure and begin discussing various online software as a service (SaaS) options that enable GIS in the cloud. We'll begin by exploring services offered by the company Mapbox. You'll have a chance to restyle some online basemaps and see how these new styles can take effect immediately using vector tiles. You'll also learn how to create and load data into Mapbox for thematic mapping.
Before you begin this lesson, please make sure that all your Cloud Builder sites and all your Amazon EC2 instances are stopped. You wouldn't want to leave them running and accruing charges during the next few weeks while we are working with other technology.
At the successful completion of this lesson you should be able to:
In Lesson 1, you learned about the software as a service (SaaS) model of cloud computing. With SaaS, the end user doesn’t have to install, configure, or code anything: the software is accessed directly from the cloud, usually through a web browser. The cloud hardware itself is maintained or leased by the service provider, with all the details of the back end architecture hidden from the end user. In Lesson 1, you used Google Fusion Tables as an example of SaaS. Other examples include Google Docs, Gmail, and the ArcGIS.com map viewer that you used in the previous lessons.
Although you may be accustomed to using free SaaS such as online email, there is also much SaaS that is sold through upfront or metered fees. In fact, the free SaaS that you encounter is usually a gateway to more services that are available on a subscription basis. For example, you’ve already seen a little bit about how the ArcGIS.com map viewer is free to use, but you’ve also seen that Esri has a for-purchase credit system used for other services (which you’ll learn about in a later lesson). In a similar fashion, Mapbox offers a free tier of services but requires a subscription for certain volumes or usages.
SaaS is gaining popularity in the GIS industry because it saves people the hassle of installing and administering complex software. This is a boon for industries that want to use maps and spatial processing, but may not have the hardware or personnel to fully deploy a GIS onsite. It also allows them to give GIS a trial or pilot run for a relatively low cost and setup effort.
Because SaaS runs in a web page and needs to be accessible on many devices, its design is also usually streamlined compared to more complex desktop GIS software interfaces. SaaS generally lowers the bar for getting started with GIS. It is an excellent way for beginners to learn GIS, mapping, and design techniques, although it should be kept in mind that the features offered by SaaS may be limited compared to locally installed software.
SaaS is also an attractive way to do GIS because certain elements of functionality can be purchased on an as-needed basis. For example, companies that need to host just one or two spatial datasets as web services can do so without having to spend lots of money upfront on their own GIS server. Organizations that use GIS SaaS should assess the cost of services on a periodic basis. If a company needs to host many datasets and perform constant data processing or geocomputation operations, the cost of SaaS may actually exceed the cost of an in-house GIS server. In other words, although SaaS is convenient, it may not always be the most economical option.
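One way to make this periodic assessment concrete is a simple break-even calculation. The sketch below is a deliberately rough model: the function name is hypothetical, the dollar figures in the usage example are invented for illustration, and real comparisons would also need to account for staff time, licensing, and hardware replacement.

```python
def months_to_break_even(server_upfront_cost, server_monthly_cost, saas_monthly_cost):
    """Months after which cumulative SaaS fees exceed the cost of an in-house server.

    Returns None when SaaS costs no more per month than running the server,
    in which case SaaS never becomes the more expensive option in this model.
    """
    monthly_difference = saas_monthly_cost - server_monthly_cost
    if monthly_difference <= 0:
        return None
    return server_upfront_cost / monthly_difference


# Invented figures: a 12,000 upfront server that costs 200 per month to run,
# compared with a SaaS bill of 700 per month.
print(months_to_break_even(12000, 200, 700))
```

A light user whose SaaS bill stays below the server's running cost gets `None`, which matches the point above that SaaS suits organizations hosting only a dataset or two.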
This lesson begins a series on SaaS GIS offerings. We’ll first learn about Mapbox and its services for web map design and delivery. Then we’ll look at services from Carto, which are focused primarily on thematic mapping and analysis. Finally, we’ll spend two lessons looking at ArcGIS Online, covering its web map assembly tools in more depth and exploring its geoprocessing services.
Headquartered in Washington, DC, Mapbox is a company that provides location and mapping services such as online basemap hosting, geocoding, routing, image processing, and web mapping APIs. Mapbox is a young company, but it has made waves in the geospatial software industry by offering a unique blend of cloud technologies, map delivery and styling innovations, and open source utilities.
Mapbox markets itself as “a mapping platform for developers [67]”. It does not offer desktop-based GIS software; rather, its services seem to be aimed at journalists, full time web and mobile app developers, and organizations looking for an alternative to other cloud-based GIS products. Some of these might not have the equipment, funding, personnel, or business need to implement a full onsite GIS.
The vector data in Mapbox maps comes largely from OpenStreetMap [68], a free geographic database open to editing by anyone on the Internet. Mapbox did not invent OpenStreetMap, but it is one of the first companies to aggressively build a business model around the project. Using OpenStreetMap lowers the price point for Mapbox maps and increases the flexibility of the map (because you theoretically have some control over OpenStreetMap quality and content in your area of interest). Because unintentional errors and vandalism do occur in OpenStreetMap, Mapbox uses employees and software tools to monitor incoming OpenStreetMap edits and improve the map. This investment offers benefits to both Mapbox and OpenStreetMap, although its effect on the community dynamics of the OpenStreetMap project is only beginning to be understood.
Mapbox mapping services rely heavily on a vector tile approach wherein packets of vector coordinates are sent to client devices to be drawn. The tiles use a pyramid motif similar to what you saw with the rasterized tiles you created with ArcGIS Server, but they contain vector coordinate information rather than images. An advantage of this approach is that vector tiles can be restyled quickly without having to re-make all the tiles, since the data is decoupled from the drawing rules. Vectors also facilitate visual effects for map rotation and zooming.
One disadvantage of vector tiles is that more computing logic is needed to display vector tiles than rasterized ones (since displaying an image is one of the most basic tasks a computer can do). Also, the symbol set and visual effects available with vector tiles may be more limited compared to what can be drawn with rasterized tiles. Finally, although it seems obvious, vector tiles can only display vectors; satellite imagery, shaded relief, and some field-based phenomena must still be drawn with rasterized tiles.
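The pyramid motif mentioned above can be illustrated with a little arithmetic. The sketch below assumes the common Web Mercator "XYZ" tiling scheme used by OpenStreetMap-style web maps, in which each zoom level quadruples the number of tiles; whether a particular service uses exactly this scheme is an assumption you should check against its documentation. The function names are made up for this example.

```python
import math


def tiles_at_zoom(zoom):
    """Total number of tiles covering the world at a zoom level (XYZ scheme)."""
    return 4 ** zoom


def lonlat_to_tile(lon, lat, zoom):
    """Column (x) and row (y) of the tile containing a longitude/latitude."""
    n = 2 ** zoom  # tiles per side at this zoom level
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that points on the far edges still fall inside the grid.
    x = min(max(x, 0), n - 1)
    y = min(max(y, 0), n - 1)
    return x, y


for zoom in (0, 1, 2, 13):
    print(zoom, tiles_at_zoom(zoom))

# Tile containing the intersection of the equator and the prime meridian at zoom 1.
print(lonlat_to_tile(0.0, 0.0, 1))
```

The same addressing works whether a tile holds an image or a packet of vector coordinates; only the payload differs. The rapid growth in tile counts is also why building every tile at the largest scales takes so much computing time.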
Mapbox has offered several cartographic products for designing map styles and making tiles. Their legacy TileMill tool created rasterized tiles with the aim of hosting them on Mapbox servers, although utilities existed for unpacking the tiles and hosting them on your own website (see Geog 585: Open Web Mapping [69]). Their current tools are aimed toward creating vector tiles to be hosted on Mapbox servers.
Unlike ArcGIS Online, Mapbox does not offer web services for performing vector spatial operations such as buffering, intersections, etc.; instead, Mapbox created a free and open source JavaScript library called turf.js [70] that developers can use to perform these operations on the client side. As with many of Mapbox’s services, using turf.js requires some programming ability; but it comes with the benefit of not having to pay for a cloud service to perform these operations. Some kinds of batch operations, complex calculations, multi-step models, or large datasets may still be better suited for sending to a server.
Mapbox offers a light amount of usage for free, allowing us to experiment with their services. On the Mapbox [71] website, click Pricing and look over the plans. Then go ahead and sign up for a user name and move on to the next section of the lesson.
As you saw with ArcGIS Server, online base maps can have dozens of layers, with all kinds of rules about what zoom levels they are hidden and displayed; therefore, we’re not going to start from scratch. Instead, we’ll start with existing Mapbox basemaps (which are pretty well designed to begin with) and make small modifications to fit our taste.
First, please download the data for Lesson 5 exercises [72]. After you download the data, unzip it.
We’ll start out simple by infusing some of our own data into one of the Mapbox basemaps. We’ll then view our creation in ArcMap, where you already know how to add more layers on top.
Suppose you’re examining nighttime safety in the Washington DC area. You want to understand where activities are occurring at night relative to existing street lighting. The “Dark” basemap offered by Mapbox looks appealing for your purposes, but you want to integrate a layer showing areas that are illuminated. You plan on eventually doing some visualization of street vendor activity, pedestrian patterns, crimes, and other happenings in relation to the street lighting.
For the best user experience, you should complete the following steps on a desktop computer (not a tablet or phone):
To recap, you took an existing Mapbox-designed basemap and fused in some of your own data. You then brought this into ArcGIS Pro so you could work with it as a basemap. The next walkthrough will go beyond this, showing you how to modify the Mapbox design, create your own thematic data, and view your creations on the web.
Now that you’ve gotten a feel for the Mapbox environment, let’s try something a little more complex that involves modifying the Mapbox style, creating thematic data from scratch, and viewing the result in a web browser environment rather than ArcMap.
Suppose you’re in charge of making a website to show the five best restaurants in your town (with you and only you as the judge). You want to make a map quickly that you can embed in a website, but since you’re somewhat of a picky cartographer, you want to have full control over the map style. Let’s do this with Mapbox, first designing a basemap, then adding data to represent the five restaurants of interest.
You’re done editing your basemap for now. You don’t have to save your work; Mapbox Studio has been doing this as you go along. Now let’s get the restaurants entered.
Next, you’ll apply your own style to the icons and add some labels. You’ll then preview your map in a web browser.
There are several ways you could symbolize these restaurant points. One way might be with a little icon in the form of a scalable vector graphics (SVG) file. Mapbox provides a nice set of these SVG icons called Maki.
Another way is to just use a basic marker like a circle. We’ll take this approach, but we’ll also add a label from some of the information we entered in the restaurant fields. The restaurant points and the labels will be treated as separate layers in the map. Follow these steps:
Mapbox is geared toward developers: people who write code to embed maps in websites and apps. Interactive web pages are typically scripted in JavaScript, with the maps being embedded through special programming libraries (APIs) that offer functions for working with tiles, markers, etc. One of the more popular of these APIs is Leaflet. Follow the instructions below to make a very simple web page that embeds your Mapbox map via Leaflet.
<!DOCTYPE html>
<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
    <title>Leaflet + Mapbox test</title>
    <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/leaflet/1.0.3/leaflet.css" type="text/css" crossorigin="">
    <script src="https://cdnjs.cloudflare.com/ajax/libs/leaflet/1.0.3/leaflet.js" crossorigin=""></script>
    <style>
      #mapid {
        width: 512px;
        height: 512px;
        border: 1px solid #ccc;
      }
      .leaflet-container {
        background: #fff;
      }
    </style>
    <script type="text/javascript">
      function init() {
        // create map and set center and zoom level
        var map = L.map('mapid');
        map.setView([47.000,-120.554],13);
        var mapboxTileUrl = 'PASTE YOUR URL INSIDE THESE SINGLE QUOTES';
        L.tileLayer(mapboxTileUrl, {
          attribution: 'Background map data © <a href="http://openstreetmap.org">OpenStreetMap</a> contributors'
        }).addTo(map);
      }
    </script>
  </head>
  <body onload="init()">
    <h1 id="title">Favorite restaurants</h1>
    <div id="mapid"></div>
  </body>
</html>
This is a pretty basic example, but hopefully it helps you see how a map like this could be embedded anywhere in a web page by an able JavaScript developer. This could be a useful supplement to a blog, news article, corporate web page, etc.
For your assignment this week, you’ll practice the things you did in the above walkthroughs, this time using your own data.
Security is one of the biggest concerns for organizations considering using cloud computing. I have mixed feelings about this. On the one hand, giving up physical control is a big step. On the other hand, data is not 100% secure on-site either, and leading cloud providers have security teams that are second to none.
First, read the AWS Security Center website [75] and some of its subsidiary pages, especially this overview of Security Processes [76]. In the optional textbook, you can read The Cloud at Your Service chapter 4, Security and the Private Cloud.
There are many interesting offerings of cloud GIS SaaS. This week, we'll try CARTO [77] (formerly CartoDB). Worth noting about CARTO is that it is an open-source project. You could completely replicate what they have, using your own Linux server. At the same time, CARTO is able to operate a business by selling their services to the many folks who would rather focus on simple mapping on the cloud instead of deploying the entire software themselves on their own hardware.
The source code is available at the GitHub website [78]. They are using a fantastic set of technologies, although it might be quite a job keeping up with all the dependent projects if you wanted to work on the source. Fortunately, there's no need, as we can use their free pricing tier to get a feel for their cloud offerings.
At the successful completion of this lesson you should be able to:
CartoDB was officially launched in 2012 as a web mapping front end to a PostgreSQL + PostGIS back end database. The software was open source and could run on one's own hardware, but at the same time CartoDB offered an online subscription service wherein customers could upload datasets and make maps without having to touch the source code or configure anything themselves. In 2015, one of the best-known PostGIS masterminds, Paul Ramsey, joined CartoDB [79]. In 2016, the company changed its name to CARTO [80] and repositioned itself as a "location intelligence" tool rather than just a basic web mapping interface and online database. As such, it now also offers geodemographic analysis, routing, proximity, and address finding services.
Some of the services, you will note, are similar to those offered by the other SaaS offerings we are studying in this course: Mapbox and ArcGIS Online. This is inevitable because these companies have found an eager market for the kinds of services they offer, and competition is a byproduct. In these course lessons, we have tried to focus the walkthroughs on some of the unique strengths of each platform or the technologies that they pioneered. One of CARTO's unique points is its variety of thematic mapping options and its appealing basemap and thematic styling options. The color schemes use the ColorBrewer [81] ramps which were developed at Penn State and are based on scientific color theory. Cartographers using CARTO can aggregate point data to tessellated regions such as hexbins or their own boundary files that they upload. They can also make time series maps, rasterized heatmap-style density surfaces, proportional symbol maps, etc.
CARTO offers a "Builder" app for web-based design and an "Engine" piece consisting of APIs. CARTO services can perhaps be considered as either PaaS or SaaS. How can we distinguish between them? One way is to consider how they will be used. If the service is being used as a source and combined with others, then it is probably a platform service. If it is being consumed directly by the end user, then it's a software service. Along the same lines, if you access the service programmatically, it's more likely to be a platform service than if you access it with a GUI.
So, if we use CARTO as a source of web maps that we pass along to end users, then it's a software service. If we use CARTO as a "table in the cloud", then we are using it as a platform. CARTO's provision of spatial data tables on the Internet, along with both GUI and programmatic access for users and programmers, makes it a good example of a cloud GIS.
Let's go through the steps of uploading a basic dataset to CARTO and making a web map.
First, download the data for this lesson [82]. This folder contains datasets that I derived from Portland Maps Open Data [83] and OpenDataPhilly [84]. They are stored in GeoJSON, a popular format based on JavaScript syntax that is used for interchanging vector data on the web.
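If you have not looked inside a GeoJSON file before, the short Python sketch below builds one from scratch using only the standard library. The helper function name, the coordinates, and the attribute values are invented for illustration and are not taken from the lesson data. Note that GeoJSON lists coordinates as longitude first, then latitude.

```python
import json


def make_point_feature(lon, lat, properties):
    """Build a GeoJSON Feature for a single point (longitude first, then latitude)."""
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": properties,
    }


# A FeatureCollection is the usual top-level wrapper for a set of features.
collection = {
    "type": "FeatureCollection",
    "features": [
        make_point_feature(-122.6765, 45.5231, {"name": "Hypothetical point A"}),
        make_point_feature(-122.6500, 45.5100, {"name": "Hypothetical point B"}),
    ],
}

# Serialize to text, as it would appear in a .geojson file.
geojson_text = json.dumps(collection, indent=2)
print(geojson_text)
```

Because the format is plain JSON text, you can open the lesson files in any text editor and recognize the same structure of features, geometries, and properties.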
Then follow these steps:
Let's try one more kind of map that CARTO does very well: the animated time series map. This type of map is used when your data has a date and/or time field representing when an event occurred. The data we'll use represent incidents of gun violence in Philadelphia. Each point is a shooting with a field noting when the event took place. Animating these events over time within a map can show temporal and spatial patterns of violence throughout the city.
In this week's assignment, you'll continue getting some experience with CARTO's thematic mapping services. Please assemble a document with all of the following:
This week's cloud computing discussion covers Service Oriented Architecture (SOA) and Hadoop-style massively parallel data processing systems. SOA is interesting because this is how new Internet services are being developed. It is also a huge engineering challenge.
An epic blog post that helped me understand the importance of this was written by software engineer Steve Yegge [86]; it is known as Stevey's Google Platforms Rant [87]. Yegge used to work for Amazon and now works for Google. Apparently, the post was meant to be internal to Google, but it was accidentally published to great acclaim. Please read it for his passionate advocacy of a service oriented architecture and developer tools, and for his rather humorous, if somewhat salty and irreverent, description of life while working at these software companies.
Hadoop is an amazing system that was started by Doug Cutting, who wanted to provide the means to index the entire Internet overnight, which, at the time, only Google was doing effectively. Please read the Wikipedia entry on Apache Hadoop [88] for background. Hadoop is quite powerful, but also notoriously tricky to get working. Amazon has an interesting service called Elastic MapReduce [89] which claims to take away a lot of the pain of setting up and maintaining such systems.
In this lesson, we begin exploring Esri's offering of SaaS cloud resources, ArcGIS Online. In review, SaaS represents the end of the cloud spectrum where more components of a system are handled by the service provider and the user is responsible for none of the hardware, software and data infrastructure. In most SaaS cases, all client interaction occurs via a Web browser, which ideally offers a user-friendly and rapid development experience.
In this lesson, you will use ArcGIS Online to assemble and share online maps that combine various web services. You'll also see how you can upload your own data to ArcGIS Online and have it run as a live web service similar to the services you published with ArcGIS Server.
At the successful completion of this lesson, you should be able to:
The definition of SaaS suggests that all components of a computing system are provided and managed by the cloud service provider, freeing the client to focus only on utilizing or consuming the resources. ArcGIS Online is an example of this level of service. As you observed in Lesson 2, ArcGIS Online can be used as a canvas for creating web map mashups, combining services from multiple sources. You'll do some more of this in this lesson. But you'll also go a bit further and see how ArcGIS Online can be used as a hosting site for your own web services and applications.
ArcGIS Online can host web services in very much the way that ArcGIS Server can host web services. This means that you can make a map in ArcMap, choose File > Share As > Service like you have always done, and choose to host the service using ArcGIS Online servers instead of your own ArcGIS Server. In fact, there are other entry points into publishing a service that don't require ArcMap, such as uploading a CSV file or a shapefile and publishing it.
Because Esri is marketing ArcGIS Online to individuals and groups who may not be familiar with ArcGIS Server or GIS technical parlance, they don't use the term web services in the ArcGIS Online documentation; instead, they use the term "hosted web layers". Nevertheless, these hosted web layers use the same Esri GeoServices [90] specification that is used by ArcGIS Server services. Therefore, code you write to interact with these services looks very similar to code you would write for ArcGIS Server.
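As a rough illustration of what such code looks like, the sketch below builds the URL for a query against one layer of a feature service. It only constructs the URL and does not contact any server. The service address is a made-up placeholder on example.com and the function name is hypothetical; the `where`, `outFields`, `returnGeometry`, and `f` parameters follow the GeoServices REST query operation as I understand it, so verify them against the specification before relying on them.

```python
from urllib.parse import urlencode


def build_query_url(layer_url, where="1=1", out_fields="*", return_geometry=True):
    """Build a GeoServices-style query URL for one layer of a feature service."""
    params = {
        "where": where,                 # SQL-like filter; 1=1 matches every feature
        "outFields": out_fields,        # comma-separated field names, or * for all
        "returnGeometry": "true" if return_geometry else "false",
        "f": "json",                    # ask for a JSON response
    }
    return layer_url.rstrip("/") + "/query?" + urlencode(params)


# Placeholder address: substitute the URL of a real hosted layer or server layer.
layer = "https://example.com/arcgis/rest/services/Flooding/FeatureServer/0"
print(build_query_url(layer, where="STATUS = 'Open'"))
```

The point is that only the base address changes: the same function could be aimed at a layer hosted in ArcGIS Online or at one published from your own ArcGIS Server.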
Hosting web layers in ArcGIS Online costs money. You buy a block of "credits" from Esri, and these credits are consumed as you consume various resources in ArcGIS Online, such as uploading data and hosting services. The Service Credits Overview [91] page shows the cost in credits for various actions.
The types of web layers that ArcGIS Online can host are limited compared with ArcGIS Server. Originally, ArcGIS Online could only host (rasterized) tiled map services and feature services. Recently, layers for supporting 3D views have been added (scene layers and elevation layers). Vector tile layers are also new and can only be published through ArcGIS Pro.
There are several workflows you can take to prepare hosted layers, depending on how much GIS software you have installed onsite. A good example is with rasterized tiles. You can optionally build the cache tiles using ArcGIS Online (which costs credits) or you can build them yourself in ArcGIS Desktop and upload them as a "tile package" to ArcGIS Online where they can reside as a hosted layer (which saves credits but takes more work). See the article Workflows for building and hosting cached map tiles in ArcGIS [92] for a comparison of options for building rasterized tile layers in ArcGIS Online.
Esri has developed a number of apps for getting data into ArcGIS Online and viewing it once it is there. One of the most widely used is Collector for ArcGIS, which is used for data collection in the field, sometimes in disconnected environments. You install Collector on smartphones or tablets from the device's app store. When you open up Collector, you connect to a web map that you've saved on ArcGIS Online. You can then download base map data to your device so that you maintain geographic context if or when you become disconnected from the Internet while gathering data.
When you go out into the field, Collector uses your device's GPS to place you on the map. You can then take data points at any location and optionally supply attributes and/or attach a photo from the device's camera. When you return to a connected environment, you can "sync" the device's data into your ArcGIS Online service, where it is then available to other client applications.
Other apps such as the Esri Operations Dashboard are used for visualizing data from ArcGIS Online, whether it was put there by Collector or other means. This video series from the Esri Federal User Conference shows how Collector, ArcGIS Online, and the Operations Dashboard can work together in real time. Although this demonstration was conducted several years ago early in Collector's history, it does a nice job of showing the fundamental purpose of the app and how it can be used for data acquisition in the field.
As you will see on YouTube's "Up Next" list, there are a number of follow-up videos in this Operation Gold series that you can continue watching to see how the data is used further down the line after it is collected.
When you were viewing the credit cost page, perhaps you noticed that ArcGIS Online offers geocoding and place finding services (which are free up to a point), as well as things like routing and network analysis services. Along with the ability to create mashups from third-party services, these capabilities may conceptually shift ArcGIS Online out of the SaaS category and toward the PaaS category, as the data component becomes more managed and handled by the client. This is perhaps a purely philosophical conversation rather than a practical one, but given the breadth of functionality provided by ArcGIS Online, its users may consider it to fall within SaaS or PaaS depending on the specific manner in which the site is used.
Esri has recently productized a version of ArcGIS Online that can be run on premises, named Portal for ArcGIS. This is aimed at organizations that are disconnected from the Internet (such as the intelligence community), organizations that need a higher SLA (uptime percentage) than ArcGIS Online can offer, or organizations that simply do not feel comfortable moving to the cloud yet.
Portal for ArcGIS looks and feels the same as ArcGIS Online, but uses ArcGIS Server on the back end for hosting any services published by portal users. The administrator of Portal for ArcGIS is responsible for making sure the portal and server have enough hardware to accommodate requests and uploads by portal users. You will learn more about Portal for ArcGIS in Lesson 9.
The ArcGIS.com website provides a view into ArcGIS Online. Sometimes you might hear the terms ArcGIS.com and ArcGIS Online used interchangeably, but ArcGIS Online can be accessed through other Esri clients such as ArcMap and programmatically through any client using the ArcGIS REST API [95].
Take a tour of the ArcGIS.com website using the steps below.
Think about how the physical location of the source data, the manner in which you are using it, and the extent to which all of this is transparent to you fit the SaaS model of cloud computing. Have the underlying technical details of how and where the data are published been adequately hidden from you? You've seen that you can discover some of that information via the services directory, but it isn't always necessary to know it when using the client provided by ArcGIS Online.
This site is an example of a SaaS resource because all components of the infrastructure are managed by the cloud, from the underlying hardware and operating system, to the software and data. In fact, user-created maps in the ArcGIS Online Gallery are primarily generated by compiling and overlaying existing map layers already published to ArcGIS Online or other mapping servers. It is possible to create new data in this environment by drawing features as graphics (Add > Sketch Layer). But this is the extent to which data can be directly managed; the ArcGIS Online cloud manages the underlying data storage details.
In this exercise, we will create a new web map using the ArcGIS Online map viewer. The SaaS service model specifically enables access to resources from a thin client (e.g., Web browser) and conceals the underlying cloud infrastructure, including network, servers, operating systems, and storage. The following example of building a map in ArcGIS Online takes this service model one step further by integrating not just the services of one cloud computing infrastructure (Esri), but also the underlying infrastructures of other cloud services as well (US Census Bureau, NOAA). You'll see how ArcGIS Server services can be mixed with other types of services, such as WMS, during this process.
Let's take this opportunity to review the Essential Characteristics of Cloud Computing and identify how services like ArcGIS Online achieve them:
To illustrate the flexibility and interoperability of cloud GIS, we will consume three map services from various service providers via different protocols (Esri GeoServices REST Specification and WMS). You can then add more services to this map if you wish, including any services you have published in earlier exercises. For now, let’s consume an Esri basemap service via REST, a glaciers layer via WMS, and a snow depth layer via REST. We’ll imagine that we are planning a hiking trip in Mount Rainier National Park, and we want to get an understanding of conditions.
First, download the data for this lesson [97]. Then, do the following:
You just assembled a web map by combining web services from multiple sources. In the next section of the lesson, you'll add in some of your own data.
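Behind the scenes, a client that consumes a WMS layer sends GetMap requests to the server. The following Python sketch shows roughly what such a request looks like. The endpoint and layer name are hypothetical placeholders, and the sketch assumes the server supports WMS 1.3.0 and the CRS:84 coordinate reference system (longitude, latitude order); the bounding box is an approximate area around Mount Rainier.

```python
from urllib.parse import urlencode


def build_getmap_url(endpoint, layer, bbox, width=800, height=600):
    """Build a WMS 1.3.0 GetMap URL.

    bbox is (min_lon, min_lat, max_lon, max_lat) in CRS:84.
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",            # empty value asks for the default style
        "CRS": "CRS:84",         # assumption: the server offers this CRS
        "BBOX": f"{min_lon},{min_lat},{max_lon},{max_lat}",
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
        "TRANSPARENT": "TRUE",   # lets the layer overlay a basemap
    }
    return f"{endpoint}?{urlencode(params)}"


# Hypothetical endpoint and layer name, used only for illustration.
print(build_getmap_url("https://example.com/wms", "glaciers",
                       (-122.0, 46.7, -121.5, 47.0)))
```

The map viewer builds requests like this for you each time you pan or zoom, which is why you only needed to supply the service URL.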
So far, we’ve brought in web services from a few different servers in order to get a multidimensional picture of conditions around Mount Rainier. In many cases, you might want to add your own data to supplement whatever web services you find. Suppose you’re going to be hiking on a section of the Wonderland Trail, which encircles Mount Rainier. Let’s add this trail to our map by uploading a shapefile directly to our map in ArcGIS Online. This dataset was adapted from a file geodatabase feature class downloaded from the Washington State Recreation and Conservation Office public download page [101].
The functionality to add a shapefile directly from a .zip file is not yet supported in the new (current) version of the ArcGIS Online Map Viewer, so to perform this function we need to temporarily switch back to the Map Viewer Classic.
Your data is saved inside the web map rather than being published as a regular, individual layer in your ArcGIS Online content. This works fine for small datasets that need to be used on a limited basis. However, there may be other situations where you want the data to be available to multiple web maps, or at multiple offices. In such a situation, you can publish the data as a service running directly on ArcGIS Online (with no ArcGIS Server needed). We will do this in the next section of the lesson.
When you saved your map in the previous section of the lesson, your trail features got saved with it. If you want your uploaded dataset to be accessible outside the map as a service to anyone who uses your ArcGIS Online organization, or the public in general, you must publish it separately as a “hosted feature service”. Let’s do that with a different shapefile.
Switch your mentality now to that of a park ranger who wants to share information with co-workers deployed in other stations around the park. You have a point dataset representing maintenance issues reported on the trail. You want your colleagues to be able to load this whenever they’re logged in to your organization.
We've done quite a bit of work in this walkthrough to construct a useful and nice looking web map with intuitive layer names and pop-ups. As a final step, it will be helpful for you to practice pulling this web map into an app. We'll do this using ArcGIS Online templates, a slightly different way from Lesson 3 that doesn't require the Web AppBuilder.
Because we didn't make the trail issues feature service public, I am not confident that I will be able to view your web maps live. Therefore, I want you to take a series of screen captures demonstrating this app. I also have some questions for you to reflect on.
Please create a new document and insert the following things:
This week's assignment is a little different, involving a bit less reading and a bit more research. I would like each of you to identify a cloud computing product and produce a short report. Please tell us:
1. Who is offering it
2. What essential cloud characteristics it exemplifies (remember NO-REM: Network availability, On-demand access, Resource pooling, Elasticity, Metered service), and
3. Which cloud service models (or mix of models) the product uses. Recall that these include SaaS (Software as a Service), PaaS (Platform as a Service), and IaaS (Infrastructure as a Service).
4. Please comment on the service's suitability for GIS use. A GIS consists of a spatial data store, spatial data analysis, and spatial visualization (or mapping).
5. Finally, see if you can apply the concepts of SLAs and measuring operations to these providers. If you can, please explain how they apply; if not, explain why not.
You must choose a cloud computing product that no one else is doing. So, if you are concerned that someone else might be interested in the same product, please post a short note when you have picked a product that identifies it as yours ("I'm reviewing XYZ.cloud.com" or something similar).
If you want to comment on other people's reviews, that would be good, and would help you if you had any deficiencies in your review. However, commenting on other people's reviews is not needed for full credit (unlike in other weeks).
If you have purchased the optional textbook, the chapter "Practical considerations" from The Cloud at Your Service may help you think of some discussion points.
Here's a list of cloud products to choose from if you don't have one in mind: ArcGIS Online, Mapbox, GIS Cloud, CARTO, Amazon EC2, Microsoft Azure, Google Maps, Google Earth, Heroku, CloudBees, Google AppEngine, GMail, Dropbox, OpenStreetMap, GitHub, SourceForge.
So far in our exploration of software as a service (SaaS) providers, we have focused largely on map design and construction. We’ve also seen how datasets can be uploaded and stored on the cloud. In this lesson, we’ll move forward and look at how GIS tools and algorithms can be invoked in a SaaS environment.
You got a taste of GIS as a service back when you used CARTO to aggregate farm dropoff points to neighborhoods. This required an algorithm to run that determined the neighborhood where each point was located. The neighborhoods layer was then updated with a field showing the count of all points inside. If this were run locally, it would require you to install GIS or other spatial data processing software. Offloading this operation to the cloud lets you focus solely on the input and output data.
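To give a feel for the kind of server logic involved, here is a minimal Python sketch of a point-in-polygon count using the classic ray casting test. This is not CARTO's implementation; it is a simplified illustration that assumes simple polygons without holes and planar coordinates, and the neighborhood names and coordinates are made up.

```python
def point_in_polygon(x, y, ring):
    """Ray casting test: count how many polygon edges a ray cast to the
    right of (x, y) crosses. An odd count means the point is inside."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def count_points_per_polygon(polygons, points):
    """Return {polygon name: number of points that fall inside it}."""
    counts = {name: 0 for name in polygons}
    for x, y in points:
        for name, ring in polygons.items():
            if point_in_polygon(x, y, ring):
                counts[name] += 1
                break  # assume neighborhoods do not overlap
    return counts


# Made-up neighborhoods (unit squares side by side) and dropoff points.
neighborhoods = {
    "A": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "B": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
dropoffs = [(1, 1), (0.5, 1.5), (3, 1), (5, 5)]
print(count_points_per_polygon(neighborhoods, dropoffs))
```

A production service would also use a spatial index so that each point is only tested against nearby polygons, but the input and output are the same: points and polygons in, counts out.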
Many other GIS operations are possible in the cloud; all that’s needed are some known input/output formats and some server logic that can then process the data. A popular input/output format is vector features. You’ve seen how there are lots of known formats for that, such as GeoJSON, CSV, KML, etc. Once the server receives these, it can perform operations such as buffering, intersection, routing, drive time analysis, etc., and send back the result in the form of more vectors, an image, or perhaps even textual reports. These analyses might incorporate sophisticated datasets from the cloud provider, such as road networks, address databases, or demographic information. Cloud providers can charge a metered fee, deducting money or credits for each operation performed, or they can charge flat monthly fees for different tiers of capabilities.
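As a small illustration of these input formats, the following Python sketch converts a CSV table of point locations into a GeoJSON FeatureCollection, the kind of payload a client might send to an analysis service. The column names (name, lon, lat) and the sample row are assumptions made up for this example. Note that GeoJSON stores coordinates in longitude, latitude order.

```python
import csv
import io
import json


def csv_to_geojson(csv_text, lon_field="lon", lat_field="lat"):
    """Convert CSV text with longitude/latitude columns into a GeoJSON
    FeatureCollection of points. Remaining columns become properties."""
    features = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        lon = float(row.pop(lon_field))
        lat = float(row.pop(lat_field))
        features.append({
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": row,
        })
    return {"type": "FeatureCollection", "features": features}


# Made-up sample data, used only for illustration.
sample = "name,lon,lat\nStore A,-120.5,46.6\n"
print(json.dumps(csv_to_geojson(sample), indent=2))
```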
Although Esri is not the only company that offers GIS operations as a cloud service, it is clearly an area where they specialize. Esri ArcGIS Desktop software has hundreds of tools running all kinds of GIS operations. The challenge for Esri (and other cloud service providers) is to expose these kinds of tools online through an interface that’s intuitive to people who may have never used any GIS before. These users may know exactly what they want to accomplish, but would not be familiar with GIS terms like clip, union, buffer, etc. Companies offering GIS as a service must clearly define these terms or simplify them. Pause and spend a few minutes looking over the Perform analysis [103] page to see how Esri uses a combination of graphical icons and simplified terms to explain the spatial analysis capabilities in ArcGIS Online.
In this lesson, you'll use GIS services on ArcGIS Online to derive service areas, join demographic variables to those, and export data for further analysis outside the cloud. These are just a few of the many possible operations offered by ArcGIS Online, but they should give you a taste of how to invoke the analysis and manage Esri service credits.
At the successful completion of this lesson, you should be able to:
Let’s get some practice with ArcGIS Online GIS analysis services. Imagine you’re working for a sushi delivery company that has made its way to fame via an app accessed from people’s smartphones. Customers use the app to order fresh-made sushi to be delivered to their home. Your company makes the sushi in small “stores” (similar to pizza delivery outlets) and delivers it from those locations.
Unfortunately, business isn’t doing too well, and the company has been forced to cut its number of stores. You are tasked with determining which of the four stores in the Yakima metro area to shut down. You’ve just learned about some geographic analysis that you could perform online to help you with your decision. You already know that you want to consider the area your delivery cars can reasonably reach from each store. You also want to find out about the customer base of each store, including how many people live near each store, how much they tend to spend on restaurants, and how many own smartphones.
The only data you have at this point is a spreadsheet containing the locations of your stores and the amount you pay in rent each month for the commercial space. Let’s start by getting that data into ArcGIS Online.
First download the data for this lesson [104]. Then do the following:
The first thing you want to do is find out what area is served by each store. The company has learned that customers demand their sushi within 20 minutes of ordering it. Twelve minutes are typically required to make the food, and eight minutes are allotted for delivery. Let’s find out the areas that lie within an eight-minute drive of each store.
The service area polygons we calculated were interesting and somewhat useful, but the raw area of the polygon alone is not enough to help us get a feel for the underlying population served. Parts of the city are much more densely populated than others. Also, people in some neighborhoods tend to eat out more than others. Some neighborhoods might also have a higher density of smartphone usage where people would be inclined to order using your app. We’ll explore these variables in the next section of the lesson.
In this part of the walkthrough, we’ll try to learn a bit more about the customer base that lives within each 8-minute service area polygon we derived. We’ll accomplish this using what Esri calls “Data Enrichment”, in other words, joining and summarizing attributes from extensive demographic databases.
Since your boss only speaks Excel, it might be nice to get your stores spreadsheet back with all these enriching variables added. This can be accomplished via a simple table join from the enriched service areas back onto the delivery points.
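Conceptually, this table join matches records on a shared key and copies the extra attributes across. Here is a minimal Python sketch of that idea. The field names (store_id, rent, population) and the values are hypothetical and do not reflect the actual field names produced by ArcGIS Online.

```python
def join_attributes(stores, enriched_areas, key="store_id"):
    """Copy attributes from each enriched service area onto the store
    record that shares the same key value (a one-to-one table join)."""
    lookup = {area[key]: area for area in enriched_areas}
    joined = []
    for store in stores:
        match = lookup.get(store[key], {})  # empty if no matching area
        extra = {k: v for k, v in match.items() if k != key}
        joined.append({**store, **extra})
    return joined


# Hypothetical records, used only for illustration.
stores = [{"store_id": 1, "rent": 1200}, {"store_id": 2, "rent": 900}]
areas = [{"store_id": 1, "population": 5000}]
print(join_attributes(stores, areas))
```

Stores without a matching service area simply keep their original attributes, which mirrors how an outer-style join behaves.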
For this week's assignment, please create a single document containing all of the following:
1. Unfortunately, there is no button on any GIS, cloud-based or otherwise, that says “Give me the answer”. All the same, you were able to use ArcGIS Online services to learn quite a bit more about the potential customers of your stores. Given what you learned, make a decision about which store should be cut that would minimize the overall financial impact on the franchise.
Write a justification for your manager of about 500 words detailing your decision. This should contain evidence using the enriched variables you derived and any maps you want to make with ArcGIS Online. Discuss the impact and usefulness (or lack thereof) of each variable. If you’re at a loss about what else to include in your report, try adding a map of what the service areas would look like with your selected store cut out.
There is no “right” answer to this question (although there may be questionable or unsupported answers). I am mainly looking for evidence that you’ve thought about the data and the analysis performed in the walkthroughs, and that you can use the output to address a spatial problem.
As you perform any additional analysis and make your maps, keep an eye on your credit usage. You want to leave enough ArcGIS Online credits in your account for your final project (if you are going to use ArcGIS Online in that project).
2. In this lesson, you observed how Esri has tried to put a very user-friendly face on some complex analysis tools in order to make them approachable to people without formal GIS training. What is gained and/or lost under this approach? Are there dangers that the tools might be misused if they are overly "dumbed down", or is the simplification of the tools helpful for everyone?
Study at least two (2) of the ArcGIS Online analysis operations and find their corresponding tools in ArcToolbox. Paste screenshots of both in your report here, and provide some commentary on (1) how the user interfaces have been changed for an ArcGIS Online audience, and (2) how their user interfaces are helped or hindered through this simplification.
The final week of this course will be dedicated to a term project that each of you will complete to integrate and apply your understanding of Cloud and Server GIS in the context of an application scenario you choose. You will select ONE project option from the list below and submit an abstract in week 8 describing your project idea. To a large degree, you will have the freedom to shape the specifics of your term project around a cloud GIS project that interests you. I hope that this allows you to either focus on a topic related to your day-to-day work or choose an area that sparks your curiosity.
Here are the options you have for your term project. You should choose ONE of these options:
OPTION 1: Set up an ArcGIS Server-based website using your own data, using EC2 as a hosting service.
OPTION 2: Solve a GIS problem using multiple cloud machines.
OPTION 3: Use ArcGIS Online, Carto, or Mapbox technology to solve geographic data handling problems.
OPTION 4: Design a cloud-based infrastructure that complements an existing GIS. Since this option will deliver a design rather than an example of a working system, the written component will need to be much larger than under the other options.
OPTION 5: Develop your own topic using Cloud GIS. You will need to receive approval from your instructor.
The term project includes the following deliverables:
Please use the Term Project Rubric [105] as a guide for implementing your project. It describes all the pieces that need to be present in order to earn an A, such as discussions of cost and security considerations.
By the end of week 8, you need to submit an abstract (summary) of your project idea. Submit this to the corresponding drop box on Canvas in the form of a document of 200 - 300 words. The instructor will review immediately and post any concerns or needed modifications.
In this course, you’ve seen how specialized server-based software such as ArcGIS Server can be used to distribute GIS resources and processing throughout an organization and, more broadly, to the public. This software is powerful, but requires advanced administration and usage skills. You’ve also become familiar with a number of providers who offer mapping and GIS services on the public cloud. The simplified and often browser-based interfaces of these SaaS providers are very attractive to organizations that want to put spatial data and analysis in the hands of users who aren’t trained in GIS. At the same time, some organizations may feel hesitant about how much of their data and operations they want to transfer onto a third-party cloud service. Concerns can include security of data, control over service uptime, and the amount of fees paid to the cloud provider.
For these reasons, organizations sometimes desire to build a cloud locally (i.e., “in house” or “on-premises”), so they can offer the simplified SaaS user experience while maintaining complete control over hardware, software, security, infrastructure, and related costs. Some SaaS cloud providers make their software available to install locally for this purpose. You’ve already seen how CARTO is an open-source project that can be installed in a local environment [106]. Most people would rather pay CARTO for a subscription than go to the trouble of setting up and maintaining a local instance, so CARTO continues to operate successfully as a business; however, the option for an on-premises deployment exists.
In this lesson, we’ll discuss how a local implementation of ArcGIS Online can be deployed using an Esri product called Portal for ArcGIS. We'll also take a deeper look at Esri's means for organizing maps and data through its ArcGIS Online organizational subscription services.
At the successful completion of this lesson, you should be able to:
To understand Portal for ArcGIS, it’s helpful to examine how Esri server-based GIS products evolved. Years ago, Esri customers had to deploy ArcGIS Server onsite in order to publish web services. Eventually, ArcGIS Online was released with an interface that allowed people to publish feature services and (rasterized) tiled map services in the cloud without owning ArcGIS Server.
These ArcGIS Online hosted services were popular with customers that needed to make basic mashups with basemaps and thematic overlays but didn’t want to implement a full-blown ArcGIS Server. Other useful features included the ability to create, save, and share web maps using the map viewer tools you’ve been exercising in the past few lessons. This was done within the umbrella of an ArcGIS Online “organization” that Esri customers could create and administer.
In order to allow their customers the option to run such a solution on premises, Esri introduced Portal for ArcGIS. This gave organizations a basic browser-based interface where employees could upload data, make GIS web services, create maps, and share them with others at their workplace. It had the same features as an ArcGIS Online organization, but a connection to the Internet was not required.
This new Portal for ArcGIS product could be connected or “federated” to an ArcGIS Server site to give greater exposure to ArcGIS Server web services throughout the organization. The ArcGIS Server could further be configured as a “hosting server” in order to power the feature services and tiled map services published by portal users. Thus, the ArcGIS Online and ArcGIS Server functionalities were brought together. At version 10.5, Esri rebranded the ArcGIS Server + Portal for ArcGIS and their supporting components as ArcGIS Enterprise and developed a more integrated installation experience.
Esri now encourages customers to install Portal for ArcGIS as a user-friendly interface to their ArcGIS Server deployment. Think about the way you have been looking at your own ArcGIS Server site so far: because you are an administrator, you have access to ArcGIS Server Manager. That's easy enough to navigate, but your server users would just see the REST Services Directory, a very minimalist application that was built with developers (i.e., programmers) in mind. Portal for ArcGIS gives a nicer looking face to these services and can also function as a collaborative tool for internally sharing GIS services, maps, and data.
At this point, stop and read the following article very carefully, paying attention to the graphical figures. It describes in detail the different levels of integration you can configure between a portal and an ArcGIS Server site.
About using your portal with ArcGIS Server [107]
When learning about Portal for ArcGIS, be aware that the term “portal” is a term broadly used across the web that can mean several different things. Even in GIS contexts, a portal is traditionally a site where a person can go to find data downloads. Indeed, Esri still makes available software called GeoportalServer for building these types of sites. Portal for ArcGIS, however, is broader than these traditional portals in the sense that people can publish items to a back-end server. They can also use interactive tools on the portal to make and share maps. In this way, the portal goes beyond being a data catalog to acting as a multi-purpose GIS platform.
This lesson provides a tour of some public facing ArcGIS Online organizational pages while also describing how Portal for ArcGIS is configured and used.
Organizations wanting to share access to maps and data links with the public will often do this on ArcGIS Online using “organizational” pages that are similar in look and function to Portal for ArcGIS. You will rarely find a Portal for ArcGIS implementation open to the public because, in most cases, portals are isolated to internal environments for security and resource management purposes; however, looking at these organizational pages on ArcGIS Online can give you an idea of how a portal interface feels and behaves. Esri sometimes even refers to ArcGIS Online as a type of portal (lower-case "p"), not to be confused with the Portal for ArcGIS (upper-case "P") software product meant to be installed on internal infrastructure. We will keep this distinction between a lower-case portal and upper-case Portal in mind and use it throughout the lesson.
See this article for Esri’s official take on the difference between ArcGIS Online organizational subscriptions and Portal for ArcGIS deployments: Understand the relationship between Portal for ArcGIS and an ArcGIS Online subscription [108].
The first page we'll explore is a portal for City of Aurora, Colorado Maps [109]. The page looks somewhat like the default ArcGIS Online site, but it’s been customized with the city’s logo image and some local maps. Click the Gallery link, and you’ll be taken to some web maps that the city has shared with the public. Try a few of them. If you’re a small government, this is a really simple way to get some maps online without having someone with a ton of JavaScript experience on staff.
The Groups link is a place where collaborative groups can be configured for different purposes. Later, we’ll take a look at an organization with some extensive groups. Aurora is not heavily using this feature.
Now, take a look at this portal for City of Rio de Janeiro, Brazil [110]. It uses the same sort of layout and concept, except everything is in Portuguese. Explore around a little bit with a few of the maps in the gallery.
Here’s one more example from the International Joint Commission [111]. Go to this page and explore. Then click the Groups link. The International Joint Commission is a large governmental organization made up of US and Canadian offices. The groups page allows maps and other resources to be organized around local sub-jurisdictions. Click a group name, and then click the Content tab to see some of the maps shared in each group.
The three pages we’ve looked at all have a similar look and feel, as they have just undergone some minor customization from the default style. An example of a page with a bit more customization is Boston Maps [112]. Navigate around this page for a while and you’ll see that although the style on the surface looks a bit different, underneath you have the same core links and structure.
Finally, take a closer look at our own organizational ArcGIS Online instance at Penn State [113]. In this case, you can sign into the site to get access to more content and functionality than you had in the other cases. In the Penn State organization, you can create content (maps, apps, etc.) and upload data, all of which are hosted by Esri's servers in the cloud (likely running on AWS or Azure infrastructure). As the sites are utilized, apps developed, and data uploaded, credits are consumed. Credits cost real money, and the amount can add up very quickly, particularly when uploading large quantities of data (imagery can be a culprit) or running geoprocessing tasks repeatedly (think of running a geocoding operation on addresses across the country). As we discussed earlier, these personalized ArcGIS Online organizations are a quick and easy way for you to create your own portal, but they aren't free. It is a good idea to think carefully about how they will be used and whether restrictions should be put in place to prevent users from consuming excessive credits (intentionally or accidentally).
Because most Portal for ArcGIS deployments are not public facing, this lesson does not offer an interactive tour; however, please watch this video segment [114] from the 2016 Esri International User Conference where product evangelist Derek Law demonstrates an example portal. This link starts at about 28 minutes in, and you should watch it until at least minute 32.
Notice that the user experience of Portal for ArcGIS is nearly the same as with an ArcGIS Online organizational page. The main difference is that the back end hardware is managed by your organization, not Esri. The name and password that you use when you log into the portal are also managed by your organization; Esri does not store or do anything with those credentials, and something like your ArcGIS Online developer account would not work for logging into someone's portal.
If you are still not entirely certain of the purpose or functionality of the portal, or if you are confused about the difference between Portal for ArcGIS and ArcGIS Online, I recommend watching the entire presentation in the above video link. The beginning part of the video is introductory chatter, and the technical material starts at about 7 minutes in.
Back in Lesson 2, we installed ArcGIS Enterprise. Per the Esri help topic What is ArcGIS Enterprise [115], the product comes with:
Up to this point, we've only really interacted with the ArcGIS Server portion of the Enterprise suite of products. And that's perfectly reasonable, because Server is the backbone of Enterprise, and is the component that does the heavy lifting of publishing your data and services. There are many use cases in which only an ArcGIS Server is utilized in a production setting. Portal is an optional component and one that may be very useful in some cases. A very common setting for a Portal installation is an organization that has a collection of datasets to manage and some number of users that need to interact with the data with varying levels of access and editing privileges. Portal provides a way to interact with Server through a GUI that presents functionality, like users, groups, permissions, and sharing, in a perhaps more user-friendly manner. Read more about Portal on the Esri website [116].
As we saw earlier, installing and configuring ArcGIS Enterprise requires close collaboration with IT staff in your organization. In particular, if you recall, there were a couple things I needed to set up for you before you could run the CloudFormation installation. The installation requires a fully-qualified domain name and an SSL certificate that will allow for encrypted connections. These are things that we typically don't acquire on our own; instead, we work with our local IT folks or other organizations to set them up for us. Let's revisit these items and talk about why they are necessary for an Enterprise installation.
Every computer that's on the Internet, whether a physical machine like your desktop or laptop computer, a physical computer server in a server farm somewhere in the world, or a virtual machine like the ones we created in AWS, has a unique number that identifies it on the network. This is its IP number (or address). IP numbers typically have the form of four values separated by periods, where each value is a number between 0 and 255 (one to three digits). For example, 123.4.56.78 is a possible IP address.
(In order to expand the range of possible IP numbers, a new style of IP addresses with much longer values has been developed. This is called IPv6, and you may see computers with such numbers, particularly when connecting to wi-fi networks hosted by large Internet Service Providers (ISPs) like Verizon or Comcast. But we won't get into that here and just focus on IPv4.)
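As a small illustration of the address formats described above, here is a minimal sketch using Python's standard `ipaddress` module. The function name `describe_ip` and the sample addresses are my own choices for illustration (the addresses come from ranges reserved for documentation), not anything used in the course setup.

```python
import ipaddress


def describe_ip(text):
    """Return 'IPv4', 'IPv6', or 'invalid' for a candidate address string."""
    try:
        address = ipaddress.ip_address(text)
    except ValueError:
        # Raised for anything that is not a well-formed IPv4 or IPv6 address,
        # including IPv4 values outside the 0-255 range.
        return "invalid"
    return f"IPv{address.version}"


# Illustrative addresses only (reserved documentation ranges).
for candidate in ["192.0.2.10", "192.0.2.300", "2001:db8::1"]:
    print(candidate, "->", describe_ip(candidate))
```

The second candidate is rejected because 300 is larger than the maximum value (255) that any one of the four parts of an IPv4 address can hold.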
When we created our EC2 Instances in AWS, they were assigned a local IP number that's only unique within the Amazon ecosystem. So, we created an Elastic IP number and attached it to our Instance so that our machine is now uniquely identified on the Internet. Organizations, like Penn State and Amazon, are allocated specific ranges of IP numbers that they are allowed to use for their computers, and those IP numbers are unique and do not exist in any other place on the Internet. When we created an Elastic IP (and paid a fee to reserve it for ourselves), Amazon assigned each of us one of its allotted IP numbers, which assures us that our IP address is, in fact, unique.
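The difference between a local (private) address and a publicly unique one can be checked in code. The following is a rough sketch, again with Python's `ipaddress` module; the two sample addresses are hypothetical stand-ins for an instance's local address and a public address, not values from anyone's actual instance.

```python
import ipaddress


def is_publicly_routable(text):
    """Return True if the address lies outside the private/reserved ranges."""
    return ipaddress.ip_address(text).is_global


# Hypothetical example values, chosen only for illustration.
local_ip = "172.31.20.5"   # the kind of address used inside a private network
public_ip = "8.8.8.8"      # a well-known public address

print(local_ip, is_publicly_routable(local_ip))
print(public_ip, is_publicly_routable(public_ip))
```

An address of the first kind can be reused inside many separate private networks, which is why it cannot identify a machine on the Internet as a whole.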
At this point, our virtual machine (EC2 Instance) is uniquely identifiable on the Internet. You could open a web browser and type the IP number into the address bar and connect to your computer's web server. But, as you know, it's rarely the case that you enter an IP number to visit a website. Rather, we use a more friendly-looking address to reference a server. These fully-qualified domain names (FQDN) consist of a specific server name, like baxtergeog865xxxx, and a domain, like e-education.psu.edu. In Geog865, we all have addresses on the same domain (e-education.psu.edu), but we each have our own individual name in front of it. Like IP numbers, these FQDNs are unique on the Internet and are a more convenient way to specify a web address. However, for that to work, the FQDN must be associated with the IP number of the machine it's intended for.
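To make the two parts of a FQDN concrete, here is a minimal sketch that separates the server name from the domain. The helper `split_fqdn` is a hypothetical name of my own; it simply treats everything before the first period as the server name, which matches the example above but is a simplification of how domains can be structured in general.

```python
def split_fqdn(fqdn):
    """Split a fully-qualified domain name into (server name, domain)."""
    # A trailing dot is legal in a FQDN, so drop it before splitting.
    host, _, domain = fqdn.rstrip(".").partition(".")
    if not host or not domain:
        raise ValueError(f"{fqdn!r} is not a fully-qualified domain name")
    return host, domain


host, domain = split_fqdn("baxtergeog865xxxx.e-education.psu.edu")
print(host)    # baxtergeog865xxxx
print(domain)  # e-education.psu.edu
```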
DNS (the Domain Name System) is the resource that registers domain names and their corresponding IP addresses on the Internet. DNS entries must be made by an authoritative provider to be sure that the information is properly registered on the Internet, so that anyone typing the name into their browser will be directed to the IP address of the correct server. In Geog865, I asked the IT department to register our names in DNS, since they have authoritative access and ownership over the e-education.psu.edu domain. Amazon has its own mechanism called Route53 [117], which may be used for some domain names [118]. When we began this semester, I asked you to send me your Elastic IP. I then created a FQDN for you (using your last name and semester with geog865xxxx.e-education.psu.edu). Finally, I provided your domain name and corresponding IP address to the Penn State IT folks to register them in DNS.
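Once a name has been registered, you can confirm that it points at the intended IP address. The sketch below uses the standard library's `socket.gethostbyname`. The function `resolves_to` and its `lookup` parameter are my own illustrative additions; the parameter exists only so the check can be tried without network access.

```python
import socket


def resolves_to(fqdn, expected_ip, lookup=socket.gethostbyname):
    """Return True if the name currently resolves to the expected IPv4 address.

    `lookup` defaults to the standard library resolver; it is a parameter
    only so the check can be exercised without a network connection.
    """
    try:
        return lookup(fqdn) == expected_ip
    except socket.gaierror:
        # The name is not registered in DNS, or the resolver is unreachable.
        return False
```

In practice you would call it with your own FQDN and your Elastic IP as the two arguments. Note that a new DNS entry can take some time to become visible everywhere, so a `False` result shortly after registration does not necessarily mean something is wrong.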
Another reason it is important for us to utilize a FQDN (and why it is required by ArcGIS Enterprise) is that we need to enable Secure Sockets Layer (SSL) on our servers. SSL (today implemented by its successor, TLS, although the older name is still widely used) encrypts all traffic to and from our web server to make it more secure and harder for hackers to intercept. You know that SSL is enabled on a website when you see the https prefix on its URL instead of http. Most web servers, ISPs, and software products (like ArcGIS Enterprise) now require SSL to be enabled. Similar to DNS, SSL is enabled by generating a certificate from an authoritative provider that is specific to a particular domain name. SSL certificates are normally issued for domain names rather than IP addresses, which is one reason why it is necessary for us to utilize FQDNs on our ArcGIS Enterprise installs.
The SSL certificate verifies your web address’s identity and is usually obtained for a fee from a certificate authority. IT departments typically manage the acquisition and distribution of these certificates throughout their organizations. In the case of our Geog865 installations, I asked the Penn State IT department to request an SSL certificate containing all of our domain names from an authoritative provider, in our case, an organization called InCommon. I provided this certificate, in the form of a .pfx file, to everyone to supply to the CloudFormation template. You can inspect your SSL certificate by visiting your ArcGIS Server or Portal website and clicking the lock icon next to the https url and browsing its contents.
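The same inspection you can do through the browser's lock icon can also be sketched in code. The example below uses Python's standard `ssl` and `socket` modules (despite its name, the `ssl` module negotiates TLS). The function names `fetch_certificate` and `certificate_dns_names` are hypothetical helpers of my own. The second one lists the domain names a certificate covers, which is how a single certificate can contain several names.

```python
import socket
import ssl


def fetch_certificate(host, port=443, timeout=10):
    """Connect over TLS and return the server's certificate as a dict."""
    # The default context verifies the certificate chain and the host name.
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


def certificate_dns_names(cert):
    """List the DNS names found in a certificate's subjectAltName field."""
    return [value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"]
```

Calling `certificate_dns_names(fetch_certificate(your_fqdn))`, where `your_fqdn` is your own domain name, should return a list that includes that name. If the connection raises `ssl.SSLCertVerificationError` instead, the certificate did not validate for the name you used.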
Deploying ArcGIS Enterprise on clouds like AWS or Microsoft Azure might be simpler in some ways than doing it on-premises because Esri has automated parts of the configuration process with tools like Cloud Formation [119]. This is possible because all the software and configuration on the AMIs are well known. Installation in your on-premises environment could become complex if you are running some kind of software, scan, or policy that doesn't "play nicely" with one of the ArcGIS Enterprise components. Furthermore, if you're not on the IT staff, you might have greater control over cloud accounts and environments than you typically do in your on-premises environment. Tools, like Enterprise Builder [120], exist to facilitate the installation of Enterprise on an existing machine.
Since we used the Cloud Formation template to install Enterprise on our AWS machines, Portal was installed as well. You should be able to connect to your Portal with a URL like namegeog865####.e-education.psu.edu/portal. You should see a default-looking ArcGIS Online page, which illustrates essentially what Portal is: your own local, stand-alone instance of ArcGIS Online.
Sign in using the ArcGIS Site Admin username and password you created in the Cloud Formation template. You will see options to manage Members (users), view your software licenses (esri software like ArcGIS Pro and other extensions have the option to be licensed through Portal in some cases), monitor the usage of your Enterprise installation, and configure the Settings of your Portal. Explore the Settings options that are available and check out esri documentation to learn more about options like configuring your home page [121] with a custom look and feel, managing your Servers, and specifying default settings.
For this week's assignment, we're going to perform a few administrative tasks to be sure our Server and Portal sites are running smoothly. Return to your AWS Console and start the EC2 instance you used in Lessons 2 - 4.
There are a number of ways to access configuration options for ArcGIS Enterprise. Two of these options are via a web browser. Depending on how your Enterprise installation is configured, you may need to use a browser on the EC2 instance itself through a Remote Desktop connection rather than from your local computer. In these cases, administrative access has been disabled from remote client machines. This is a setting you could change on your server, after also confirming that the appropriate firewall ports are open. For now, visit these sites from a browser on your EC2 machine:
Let's explore the ArcGIS Server Manager site. Visit your Manager site with a URL like baxtergeog865####.e-education.psu.edu/server/manager.
Under the Services tab, you should see the various services you've created so far in the course lessons. Click the pencil icon next to one of your services to see the options you have to administer them. Explore the various sections by clicking the tabs along the left of the window. A few things to look for in particular:
Under the Site tab in the ArcGIS Server Manager, you'll see a few sub-sections that contain many of the properties of your Server's configuration. Among these are:
Another useful page on this site is the Software Authorization sub-section. Click that heading and you'll see the licensing information for your installation. This can be useful for determining when you need to renew licenses or for remembering which extensions you have access to.
Finally, click on the Logs tab of the ArcGIS Server Manager site.
The View Logs sub-section is a place you can go to view error logs generated by your ArcGIS Server. This can be a very useful place to look when services aren't working properly. You can change the level of log detail to view by changing the Log Filter dropdown; the Debug option will show you the most information. You can also change the way logs are generated and stored on your server by clicking the Settings button. The Debug option will result in the most comprehensive log files, which you can filter any way you'd like when viewing, but it's not recommended to leave your logs configured to Debug for very long because the log files stored on your server will get very large and take up a lot of space. When troubleshooting a problem, however, it's good practice to set the log setting to Debug temporarily while you investigate and then to revert it to Warning or Severe afterwards to save space.
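The way a log filter level works can be sketched in a few lines. This is only a simulation, not a call to ArcGIS Server. The ordered list of level names reflects my understanding of Esri's documentation (treat the intermediate names as an assumption and compare them with the dropdown in your own Manager), and the sample records are invented for illustration.

```python
# Log levels ordered from least to most detailed (assumed from Esri's docs).
LOG_LEVELS = ["SEVERE", "WARNING", "INFO", "FINE", "VERBOSE", "DEBUG"]


def visible_at(filter_level, records):
    """Return the records shown when the log filter is set to filter_level."""
    cutoff = LOG_LEVELS.index(filter_level.upper())
    # A record is shown if its level is at or above the filter's severity.
    return [r for r in records if LOG_LEVELS.index(r["level"]) <= cutoff]


# Hypothetical log records, invented for illustration only.
records = [
    {"level": "SEVERE", "message": "Service failed to start"},
    {"level": "WARNING", "message": "Request timed out"},
    {"level": "DEBUG", "message": "Entering draw routine"},
]

print(len(visible_at("Warning", records)))  # 2
print(len(visible_at("Debug", records)))    # 3
```

The same ordering explains the storage concern: a Debug setting records everything at every level, so the files grow much faster than they do at Warning or Severe.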
The Statistics sub-section is a very useful resource for monitoring the client usage of your server. You will see graphs of a few default reports on the statistics page that you can click and interact with. Click on the Total Requests for the Last 7 Days graph. You will see all of the services running on your server listed along the left. You can toggle the visibility of them individually to see their usage on the graph. You also have the option to specify the timeframe of the statistics report. Often, when running your own ArcGIS Server installation you will want to understand how your services are being utilized by clients, or you may need to generate numbers for other people in your organization to demonstrate the value of the services you provide. These dynamic graphs are a useful tool, and you may export the data as a .csv spreadsheet and extract information using a tool like Excel. Back on the main Statistics page, you can click the New Report button to create a custom view and save it as a thumbnail. You might create a custom report of a handful of your services and a relevant timeframe for your organization, maybe the last month, and export a report regularly to monitor usage over time.
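If you would rather summarize an exported .csv file with a script than with Excel, something like the following minimal sketch could work. The column names (`service`, `date`, `requests`) and the sample rows are hypothetical; the layout of a real export may differ, so adjust the names to match the file you actually download.

```python
import csv
import io
from collections import Counter


def total_requests_by_service(csv_text):
    """Sum the request counts for each service in an exported usage report."""
    totals = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["service"]] += int(row["requests"])
    return dict(totals)


# Hypothetical export: the real column names and layout may differ.
sample = """service,date,requests
AppalachianTrail,2022-06-01,120
AppalachianTrail,2022-06-02,95
BighornSheep,2022-06-01,40
"""

print(total_requests_by_service(sample))
# {'AppalachianTrail': 215, 'BighornSheep': 40}
```

To use it on a real file, read the file's contents into a string first, for example with `open(path, newline="").read()`.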
You can also generate reports using a custom toolbox in ArcGIS Pro. This can be useful if you need to create a report that the web-based interface won't support. For example, the Statistics page in the ArcGIS Server Manager will only list a limited number of services in the toggle list. If you need to generate a report of more services, you'll need to run a custom tool in ArcGIS Pro to create and save the report. Below, we will see where custom reports are stored in the administrative section of ArcGIS Server.
Open a new browser tab and visit the ArcGIS Server Administrator Directory (baxtergeog865####.e-education.psu.edu/server/admin). Log in with your siteadmin credentials. I don't recommend making any changes here, but feel free to explore the various sections to see the types of information that's available.
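The browser entry points mentioned so far (the Portal home page, the ArcGIS Server Manager, and the Administrator Directory) all share the same fully-qualified domain name and differ only in their paths. As a quick sketch, the hypothetical helper below assembles them from the FQDN and the two site names. The `https` scheme and the default names "server" and "portal" are assumptions based on the course setup; a different installation may use other names.

```python
def enterprise_urls(fqdn, server_name="server", portal_name="portal"):
    """Build the common ArcGIS Enterprise URLs for a given domain name."""
    base = f"https://{fqdn}"
    return {
        "portal": f"{base}/{portal_name}",
        "server_manager": f"{base}/{server_name}/manager",
        "server_admin": f"{base}/{server_name}/admin",
    }


# The #### placeholder stands for your own semester code, as in the text.
for label, url in enterprise_urls("baxtergeog865####.e-education.psu.edu").items():
    print(f"{label}: {url}")
```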
From the root page, click on the usagereports link. You will see a list of some default reports; if you create custom reports using ArcGIS Pro, they will appear on this page. You have the option to export the data from any of these to an .html, .json, or .csv file.
Back on the root page, click on the System link. From here you can view the licensing information of your installation, web adaptor configurations, and the directories where logs, tile caches, and other files are stored, among other things. Click on the webadaptors link. You will probably only see one web adaptor, with a long alpha-numeric name, listed. Click on the web adaptor name and you'll see that it specifies the name of the machine, its IP address, and the port (80 or 443) that it uses. In a production setting, the web adaptor will specify the fully-qualified domain name (e.g., baxtergeog865####.e-education.psu.edu) of your server and its public IP address (your Elastic IP). Recall that the web adaptors link our ArcGIS sites with the machine's web server, which in our case is IIS (Internet Information Services). There will be a separate web adaptor configured for the server and portal portions of your site. In the cloud formation template, we specified a name for our server site ("server") and our portal site ("portal"). The cloud formation template didn't do that for us here (although our sites still work), but in a production setting, you will have web adaptors listed here that link both the server and portal urls to your installation.
Finally, let's open our web server to see that both the server and portal folders have been created for us. From the desktop of your EC2 instance, click the Start button and type IIS. Click on Internet Information Services Manager when it appears in the list. Expand your server to view the contents of the Default Web Site. You should see two virtual directories listed: server and portal.
Virtual directories link a url folder name to a physical folder location on our server. The urls for these two directories take the form:
Enter each of these in a new browser window and you will see that they take you to your server and portal sites. Back in IIS, right-click on either the portal or server virtual directory, choose Manage Application, and click Advanced Settings. You will see a path on your server's C: drive that contains the web content for each site. You can use Windows Explorer to browse to those folders and see their contents. In summary, the web adaptors link the two urls above to the virtual directories in the web server. When installing ArcGIS Enterprise in a production setting or using tools other than cloud formation, there is a post-install setup procedure to get this all configured. Esri provides documentation [123] detailing how that process works. That's not something we want to mess with here, but it is something you'll need to do when configuring ArcGIS Enterprise in your production environment.
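The idea of a virtual directory, a URL folder name that stands in for a physical folder, can be sketched as a simple lookup. This illustrates the general concept only: the folder paths below are hypothetical (check Advanced Settings in IIS for the real ones), and the web adaptor itself does more than serve files, since it forwards requests on to ArcGIS.

```python
from pathlib import PureWindowsPath

# Hypothetical mapping; the real paths are shown in IIS Advanced Settings.
VIRTUAL_DIRECTORIES = {
    "server": PureWindowsPath(r"C:\inetpub\wwwroot\server"),
    "portal": PureWindowsPath(r"C:\inetpub\wwwroot\portal"),
}


def physical_path(url_path):
    """Map a URL path such as '/portal/home/index.html' to a folder path."""
    first, *rest = url_path.strip("/").split("/")
    root = VIRTUAL_DIRECTORIES[first]  # KeyError if no such virtual directory
    return root.joinpath(*rest)


print(physical_path("/portal/home/index.html"))
# C:\inetpub\wwwroot\portal\home\index.html
```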
For this week's assignment, please create a single document containing all of the following:
For this week’s discussion, we will think together about the future of cloud computing, and by extension, of cloud GIS. Please read The Cloud as a Tectonic Shift in IT: The Irrelevance of Infrastructure as a Service [124]. This blog post by the CTO of CloudBees contains some interesting predictions about the future of IaaS, PaaS, and SaaS.
If you have the optional textbook, you can supplement this with the chapter titled “Cloud 9: The future of the cloud.”
Please pick one of the predictions from the article or book chapter that you find interesting, and write about why you found it interesting. For example, you could find one of the predictions thought-provoking, or you might disagree with the authors. Also, please make your own prediction about how the advent of cloud computing will affect GIS. One way to approach this would be to extend the prediction you reacted to into GIS. Then, respond to one of your classmates' predictions.
This is the final week for our course in Cloud and Server GIS. There is no new content from me this week; instead, you will spend the week working on your term projects and producing a written report and video demonstration.
At the successful completion of this lesson, you should be able to:
As part of your term project, you're required to submit a video demonstration that you record using screen capturing software such as Zoom, Screencastomatic, Adobe Captivate, etc.
Please review the Term Project Rubric [105] to get an idea of what elements are required in the video.
Please attempt to host this online somewhere in a location such as a Box folder, blog, or website so that the instructor can view it using a simple URL. If all else fails, you can send the instructor a compiled .mp4 or other similar video file compatible with common media playing software.
If you are really happy with your video, I encourage you to make it public on YouTube or a similar video sharing site. A demonstration of what you can do with cloud and server GIS can be an excellent part of your portfolio. One of its advantages is that it remains accessible even when the server instance is not.
If you have questions about how to make videos, please post them on the Technical Discussion Forum in Canvas.
This week is dedicated to working on the term project. When you have finished, please post your writeup to the appropriate drop box on Canvas. Your writeup should contain a hyperlink to your video demonstration. If that is not possible, you can make a separate upload of a video file onto your term project submission. Canvas allows for multiple uploaded files.
The writeup should be about 500 - 1000 words. If it is any shorter, you'll have trouble covering all the required elements in enough depth.
Please review the Term Project Rubric [105] to understand what is required, and post to the forum or use email if you need any clarification.
Links
[1] http://csrc.nist.gov/publications/nistpubs/800-145/SP800-145.pdf
[2] https://en.wikipedia.org/wiki/Von_Neumann_architecture
[3] http://venturebeat.com/2011/11/14/cloud-iaas-paas-saas/
[4] http://1password.com
[5] http://alestic.com/2009/09/ec2-public-ebs-danger
[6] https://docs.rightscale.com/cm/dashboard/clouds/generic/security_groups_concepts.html
[7] https://csrc.nist.gov/publications/detail/sp/800-123/final
[8] http://aws.amazon.com/console
[9] http://aws.amazon.com/pricing/ec2
[10] https://docs.aws.amazon.com/vpc/latest/userguide/vpc-ip-addressing.html
[11] https://console.aws.amazon.com/billing/home
[12] http://windows.microsoft.com/en-US/windows-vista/Remote-Desktop-Connection-frequently-asked-questions
[13] https://www.manning.com/books/the-cloud-at-your-service
[14] https://doi.org/10.1080/17538947.2011.587547
[15] http://www.enggjournals.com/ijcse/doc/IJCSE11-03-02-006.pdf
[16] http://server.arcgis.com/en/server/latest/publish-services/windows/overview-register-data-with-arcgis-server.htm
[17] http://server.arcgis.com/en/server/latest/publish-services/windows/copying-data-to-the-server-automatically-when-publishing.htm
[18] http://server.arcgis.com/en/server/latest/administer/windows/the-arcgis-server-account.htm
[19] https://aws.amazon.com/cloudformation/
[20] https://aws.amazon.com/marketplace/pp/prodview-rh32a6tw3ju4a?ref_=srh_res_product_title
[21] http://www.pasda.psu.edu
[22] http://www.arcgis.com
[23] http://aws.amazon.com/importexport/
[24] http://server.arcgis.com/en/server/latest/cloud/amazon/strategies-for-data-transfer-to-aws.htm
[25] https://www.e-education.psu.edu/geog865/sites/www.e-education.psu.edu.geog865/files/data/AppalachianTrail.zip
[26] https://enterprise.arcgis.com/en/server/latest/manage-data/windows/overview-register-data-with-arcgis-server.htm
[27] https://baxtergeog865su22.e-education.psu.edu/server
[28] https://mapservices.pasda.psu.edu/server/rest/services/pasda/PennDOT/MapServer/10?f=pjson
[29] https://mapservices.pasda.psu.edu/server/rest/services/pasda/PennDOT/MapServer/10?f=json
[30] http://www.arcgis.com/features/index.html
[31] https://www.e-education.psu.edu/geog865/cloud_introduction
[32] http://desktop.arcgis.com/en/arcmap/latest/manage-data/geodatabases/understanding-distributed-data.htm
[33] https://enterprise.arcgis.com/en/data-store/latest/install/windows/what-is-arcgis-data-store.htm
[34] http://desktop.arcgis.com/en/arcmap/latest/manage-data/geodatabases/an-overview-of-versioning.htm
[35] http://desktop.arcgis.com/en/arcmap/latest/manage-data/geodatabases/replicas-and-geodatabases.htm
[36] https://www.e-education.psu.edu/geog865/sites/www.e-education.psu.edu.geog865/files/data/BighornSheep.zip
[37] http://gis.utah.gov/
[38] https://www.e-education.psu.edu/spatialdb/l5_p3.html
[39] https://pro.arcgis.com/en/pro-app/latest/help/data/geodatabases/overview/versioning-types.htm
[40] https://www.e-education.psu.edu/spatialdb/node/2032
[41] https://baxtergeog865su22.e-education.psu.edu/server/manager
[42] http://pennstate.maps.arcgis.com
[43] https://d0.awsstatic.com/whitepapers/architecture/AWS-Cost-Optimization-Pillar.pdf
[44] http://aws.amazon.com/whitepapers/
[45] http://pro.arcgis.com/en/pro-app/help/mapping/map-authoring/author-a-map-for-vector-tile-creation.htm
[46] https://aws.amazon.com/ec2/instance-types/#Memory_Optimized
[47] https://aws.amazon.com/ec2/instance-types/#General_Purpose
[48] http://aws.amazon.com/ec2/instance-types/
[49] https://aws.amazon.com/ec2/pricing/on-demand/
[50] http://maps.google.com
[51] http://www.core77.com/blog/case_study/google_maps_designing_the_modern_atlas_21486.asp
[52] http://mike.teczno.com/notes/osm-us-terrain-layer.html
[53] http://www.arcgis.com/home/item.html?id=6e850093c837475e8c23d905ac43b7d0
[54] http://www.arcgis.com/home/webmap/viewer.html
[55] http://mapnik.org/
[56] https://doc.arcgis.com/en/data-appliance/
[57] https://www.microsoft.com/en-us/research/publication/hotmap-looking-at-geographic-attention/
[58] http://research.microsoft.com/apps/pubs/default.aspx?id=81244
[59] http://desktop.arcgis.com/en/arcmap/latest/manage-data/geodatabases/an-overview-of-spatial-indexes-in-the-geodatabase.htm
[60] http://www.arcgis.com/home/item.html?id=b75fc9edf166438c82d66f4982e4e031
[61] http://esriurl.com/compare
[62] http://www.e-education.psu.edu/geog485/
[63] http://server.arcgis.com/en/server/latest/publish-services/windows/automating-cache-creation-and-updates-with-geoprocessing.htm
[64] https://gis.data.ca.gov/
[65] https://www.e-education.psu.edu/geog865/sites/www.e-education.psu.edu.geog865/files/Town.zip
[66] https://psu.zoom.us/
[67] https://www.mapbox.com/about/
[68] http://www.openstreetmap.org
[69] https://www.e-education.psu.edu/geog585/node/705
[70] http://turfjs.org/
[71] http://www.mapbox.com
[72] https://www.e-education.psu.edu/geog865/sites/www.e-education.psu.edu.geog865/files/data/mapbox_lesson_data.zip
[73] https://www.mapbox.com/
[74] http://opendata.dc.gov
[75] http://aws.amazon.com/security/
[76] https://docs.aws.amazon.com/whitepapers/latest/introduction-aws-security/introduction-aws-security.pdf
[77] http://www.carto.com
[78] https://github.com/CartoDB/cartodb
[79] https://carto.com/blog/paul-ramsey
[80] https://carto.com/blog/from-cartodb-to-carto
[81] http://colorbrewer2.org
[82] https://www.e-education.psu.edu/geog865/sites/www.e-education.psu.edu.geog865/files/data/carto_lesson_data.zip
[83] http://gis-pdx.opendata.arcgis.com/
[84] https://www.opendataphilly.org/
[85] https://carto.com/signin/
[86] http://en.wikipedia.org/wiki/Steve_Yegge
[87] https://gist.github.com/1281611
[88] http://en.wikipedia.org/wiki/Apache_Hadoop
[89] http://aws.amazon.com/elasticmapreduce/
[90] http://geoservices.github.io/
[91] http://www.esri.com/software/arcgis/arcgisonline/credits
[92] http://blogs.esri.com/esri/arcgis/2013/02/06/workflows-for-building-and-hosting-cached-map-tiles-in-arcgis/
[93] https://www.youtube.com/watch?v=jo6x1dfalFM
[94] https://www.youtube.com/watch?v=Or1QQ_lW00c&t=208s
[95] https://developers.arcgis.com/rest/
[96] http://www.arcgis.com/home/index.html
[97] https://www.e-education.psu.edu/geog865/sites/www.e-education.psu.edu.geog865/files/data/arcgis_online_map_lesson_data.zip
[98] https://tigerweb.geo.census.gov/tigerwebmain/tigerweb_wms.html
[99] https://tigerweb.geo.census.gov/arcgis/services/TIGERweb/tigerWMS_PhysicalFeatures/MapServer/WMSServer
[100] https://mapservices.weather.noaa.gov/raster/rest/services/snow/NOHRSC_Snow_Analysis/MapServer
[101] https://rco.wa.gov/
[102] http://doc.arcgis.com/en/arcgis-online/share-maps/create-hosted-views.htm
[103] http://doc.arcgis.com/en/arcgis-online/analyze/perform-analysis.htm
[104] https://www.e-education.psu.edu/geog865/sites/www.e-education.psu.edu.geog865/files/data/arcgis_online_gis_lesson_data.zip
[105] https://www.e-education.psu.edu/geog865/sites/www.e-education.psu.edu.geog865/files/data/Geog%20865%20Term%20Project%20Rubric.docx
[106] http://cartodb.readthedocs.io/en/latest/install.html
[107] http://server.arcgis.com/en/portal/latest/administer/windows/about-using-your-server-with-portal-for-arcgis.htm
[108] http://server.arcgis.com/en/portal/latest/administer/windows/choosing-between-an-arcgis-online-subscription-and-portal-for-arcgis.htm
[109] https://auroraco.maps.arcgis.com/home/index.html
[110] https://pcrj.maps.arcgis.com/home/index.html
[111] https://ijc.maps.arcgis.com/home/index.html
[112] https://boston.maps.arcgis.com/home/index.html
[113] https://pennstate.maps.arcgis.com
[114] https://youtu.be/vS5EJeAFmqU?t=27m56s
[115] http://server.arcgis.com/en/server/latest/get-started/windows/what-is-arcgis-enterprise-.htm
[116] https://enterprise.arcgis.com/en/portal/latest/administer/windows/what-is-portal-for-arcgis-.htm
[117] https://aws.amazon.com/route53/what-is-dns/
[118] https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/registrar-tld-list.html
[119] https://enterprise.arcgis.com/en/server/latest/cloud/amazon/aws-cloud-formation-and-arcgis-server.htm
[120] https://enterprise.arcgis.com/en/get-started/latest/windows/arcgis-enterprise-builder.htm
[121] https://server.arcgis.com/en/portal/latest/administer/windows/configure-home.htm
[122] https://enterprise.arcgis.com/en/server/latest/administer/windows/configure-service-instance-settings.htm
[123] https://enterprise.arcgis.com/en/web-adaptor/latest/install/iis/welcome-arcgis-web-adaptor-install-guide.htm
[124] https://www.cloudbees.com/blog/cloud-tectonic-shift-it-irrelevance-infrastructure-service-iaas