The links below provide an outline of the material for this lesson. Be sure to carefully read through the entire lesson before returning to Canvas to submit your assignments.
Welcome to Geography 485. Over the next ten weeks, you'll work through four lessons and a final project dealing with ArcGIS automation in Python. Each lesson will contain readings, examples, and projects. Since the lessons are two weeks long, you should plan between 20 - 30 hours of work to complete them, although this number may vary depending on your prior programming experience. See the Course Schedule section of this syllabus, below, for a schedule of the lessons and course projects.
As with GEOG 483 and GEOG 484, the lessons in this course are project-based with key concepts embedded within. However, because of the nature of computer programming, there is no way this course can follow the step-by-step instructional design of the previous courses. You will probably find the course to be more challenging than our courses on GIS fundamentals. For that reason, it is more important than ever that you stay on schedule and take advantage of the course message boards and private email. It's quite likely that you will get stuck somewhere during the course, so before getting hopelessly frustrated, please seek help from me or your classmates!
I hope that by now you have reviewed our Orientation and Syllabus for an important course site overview. Before we begin our first project, let me share some important information about the textbook and a related Esri course.
The textbook for this course is Python Scripting for ArcGIS Pro by Paul A. Zandbergen. As you read through Zandbergen's book, you'll see material that closely parallels what is in the Geog 485 lessons. This isn't necessarily a bad thing; when you are learning a subject like programming, it can be helpful to have the same concept explained from two angles.
My advice about the readings is this: Read the material on the Geog 485 lesson pages first. If you feel like you have a good understanding from the lesson pages, you can skim through some of the more lengthy Zandbergen readings. If you struggled with understanding the lesson pages, you should pay close attention to the Zandbergen readings and try some of the related code snippets and exercises. I suggest you plan about 1 - 2 hours per week of reading if you are going to study the chapters in detail.
In all cases, you should get a copy of the textbook because it is a relevant and helpful reference.
The Zandbergen textbook is up to the 3rd Edition as of Summer 2024. The free copy of the book available through the PSU library is the 2nd Edition. Differences between the two editions are relatively minor and you may assume that the section numbers referenced here in the lessons are applicable to both editions unless otherwise noted.
You may see that in Esri's documentation, shapefiles are also referred to as "feature classes." When you see the term "feature class," consider it to mean a vector dataset that can be used in ArcGIS.
Another type of standalone dataset dating back to the early days of ArcGIS is the ArcInfo coverage. Like the shapefile, the coverage consists of several files that work together. Coverages are definitely an endangered species, but you might encounter them if your organization used ArcInfo Workstation in the past.
There is a free Esri Virtual Campus course, Python for Everyone [1], that introduces a lot of the same things you'll learn this term in Geog 485. Python for Everyone consists of a series of short videos and exercises, some of which might help toward the projects. If you want to get a head start, or you want some reinforcement of what we're learning from a different point of view, it would be worth your time to complete that Virtual Campus course.
All you need in order to access the course is an Esri Global Account, which you can create for free. You do not need to obtain an access code from Penn State.
The course moves through ideas very quickly and covers a range of concepts that we'll spend 10 weeks studying in depth, so don't worry if you don't understand it all immediately or if it seems overwhelming. You might find it helpful to revisit the course near the end of Geog 485 to reinforce what you've learned.
If you have any questions now or at any point during this week, please feel free to post them to the Lesson 1 Discussion Forum. (To access the forums, return to Canvas via the Canvas link. Once in Canvas, you can navigate to the Modules tab, and then scroll to the Lesson 1 Discussion Forum.) While you are there, feel free to post your own responses if you are able to help a classmate.
Now, let's begin Lesson 1.
This lesson is two weeks in length. (See the Calendar in Canvas for specific due dates.) To finish this lesson, you must complete the activities listed below. You may find it useful to print this page so that you can follow along with the directions.
Do items 1 - 3 (including any of the practice exercises you want to attempt) during the first week of the lesson. You will need the second week to concentrate on the project and quiz.
By the end of this lesson, you should:
A geographic information system (GIS) can manipulate and analyze spatial datasets with the purpose of solving geographic problems. GIS analysts perform all kinds of operations on data to make it useful for solving a focused problem. This includes clipping, reprojecting, buffering, merging, mosaicking, extracting subsets of the data, and hundreds of other operations. In the ArcGIS software used in this course, these operations are known as geoprocessing and they are performed using tools.
Successful GIS analysis requires selecting the most appropriate tools to operate on your data. ArcGIS uses a toolbox metaphor to organize its suite of tools. You pick the tools you need and run them in the proper order to make your finished product.
Suppose you’re responsible for selecting sites for a chain restaurant. You might use one tool to select land parcels along a major thoroughfare, another tool to select parcels no smaller than 0.25 acres, and other tools for other selection criteria. If this selection process were limited to a small area, it would probably make sense to perform the work manually.
However, let’s suppose you’re responsible for carrying out the same analysis for several areas around the country. Because this scenario involves running the same sequence of tools for several areas, it is one that lends itself well to automation. There are several major benefits to automating tasks like this:
The ArcGIS platform provides several ways for users to automate their geoprocessing tasks. These options differ in the amount of skill required to produce the automated solution and in the range of scenarios that each can address. The text below touches briefly on these automation options, in order from requiring the least coding skill to the most.
The first option is to construct a model using ModelBuilder. ModelBuilder is an interactive program that allows the user to “chain” tools together, using the output of one tool as input in another. Perhaps the most attractive feature of ModelBuilder is that users can automate rather complex GIS workflows without the need for programming. You will learn how to use ModelBuilder early in this course.
Some automation tasks require greater flexibility than is offered by ModelBuilder, and for these scenarios it's recommended that you write short computer programs, or scripts. The bulk of this course is concerned with script writing.
A script typically executes some sequential procedure of steps. Within a script, you can run GIS tools individually or chain them together. You can insert conditional logic in your script to handle cases where different tools should be run depending on the output of the previous operation. You can also include iteration, or loops, in a script to repeat a single action as many times as needed to accomplish a task.
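The three building blocks just described (a sequence of steps, conditional logic, and iteration) can be sketched in plain Python. This is only an illustration: the region names and parcel counts are made-up values, the function name summarize_regions is hypothetical, and no ArcGIS tools are run.

```python
# Hypothetical study areas and the number of candidate parcels found in each.
# These values are invented for illustration only.
parcel_counts = {"Northeast": 12, "Midwest": 0, "Southwest": 7}


def summarize_regions(counts):
    """Return one message per region, in the order the regions are listed."""
    messages = []
    # Iteration: repeat the same steps for every region.
    for region, count in counts.items():
        # Conditional logic: choose an action based on an earlier result.
        if count == 0:
            messages.append(f"{region}: no parcels found, skipping analysis")
        else:
            messages.append(f"{region}: analyzing {count} parcels")
    return messages


# Sequence: statements run from top to bottom.
for line in summarize_regions(parcel_counts):
    print(line)
```

In a real script, the lines that build the messages would be replaced by calls to geoprocessing tools.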
There are special scripting languages for writing scripts, including Python, JScript, and Perl. Often these languages have more basic syntax and are easier to learn than other languages such as C, Java, or Visual Basic.
Although ArcGIS supports various scripting languages for working with its tools, Esri emphasizes Python in its documentation and includes Python with the ArcGIS Desktop and Pro installations. In this course, we’ll be working strictly with Python for this reason, as well as the fact that Python can be used for many other file and data manipulation tasks outside of ArcGIS. You’ll learn the basics of the Python language, how to write a script, and how to manipulate and analyze GIS data using scripts. Finally, you’ll apply your new Python knowledge to a final project, where you write a script of your choosing that you may be able to apply directly to your work.
A more recently developed automation option available on the ArcGIS platform is the ArcGIS API (Application Programming Interface) for Python. This is an environment in which Python scripting is better integrated with Esri's cloud- and server-based technologies (ArcGIS Online, Portal for ArcGIS, ArcGIS Enterprise). Code written to interact with the Python API is often written in a "notebook" environment, such as Jupyter Notebook. In a notebook environment, code can be executed in a stepwise fashion, with intermediate results displayed in between the Python statements. The use of the Python API in a Jupyter Notebook environment is a topic in our Advanced Python class, GEOG 489.
For geoprocessing tasks that require support for user interaction with the map or other UI elements, the ArcGIS Pro SDK (Software Development Kit) offers the ability to add custom tools to the Pro interface. The Pro SDK requires programming in the .NET framework using a .NET language such as Visual Basic .NET or C#. Working with this SDK's object model provides greater flexibility in terms of what can be built, as compared to writing Python scripts around their geoprocessing framework. The tradeoff is a higher level of complexity involved in the coding.
Finally, developers who want to create their own custom GIS applications, typically focused on delivering much narrower functionality than the one-size-fits-all ArcGIS Pro, can develop apps using the ArcGIS Maps SDKs (previously called Runtime SDKs). The Maps SDKs make it possible to author apps for Windows, Mac, or Linux desktop machines, as well as for iOS and Android mobile devices, again involving a greater level of effort than your typical Python geoprocessing script. In the past, there was a native version for macOS, but that has been retired. Instead, programmers can use the Maps SDK for Java to develop for macOS.
This first lesson will introduce you to concepts in both model building and script writing. We’ll start by just getting familiar with how tools run in ArcGIS and how you can use those tools in the ModelBuilder interface. Then, we’ll cover some of the basics of Python and see how the tools can be run within scripts.
The ArcGIS software that you use in this course contains hundreds of tools that you can use to manipulate and analyze GIS data. Back before ArcGIS had a graphical user interface (GUI), people would access these tools by typing commands. Nowadays, you can point and click your way through a whole hierarchy of toolboxes using the Catalog window in ArcGIS Pro.
Although you may have seen them before, let’s take a quick look at the toolboxes:
Let’s examine a tool. Expand Analysis Tools > Proximity > Buffer, and double-click the Buffer tool to open it.
You've probably seen this tool in past courses, but this time, really pay attention to the components that make up the user interface. Specifically, you’re looking at a dialog with many fields. Each geoprocessing tool has required inputs and outputs. Those are indicated by the red asterisks. They represent the minimum amount of information you need to supply in order to run a tool. For the Buffer tool, you’re required to supply an input features location (the features that will be buffered) and a buffer distance. You’re also required to indicate an output feature class location (for the new buffered features).
Many tools also have optional parameters. You can modify these if you want, but if you don’t supply them, the tool will still run using default values. For the Buffer tool, optional parameters are the Side Type, End Type, Method, and Dissolve Type. Optional parameters are typically specified after required parameters.
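The difference between required and optional parameters can be sketched in Python. The code below is a stand-in, not the real Buffer tool: the function resolve_parameters and the parameter names are hypothetical, and the default values shown (FULL, ROUND, PLANAR, NONE) are assumed to match the Buffer tool's documented defaults, so confirm them on the tool's help page.

```python
# Parameters that must be supplied (the ones marked with a red asterisk).
REQUIRED = ("in_features", "out_feature_class", "buffer_distance")

# Optional parameters and the values assumed when they are left blank.
OPTIONAL_DEFAULTS = {
    "side_type": "FULL",
    "end_type": "ROUND",
    "method": "PLANAR",
    "dissolve_type": "NONE",
}


def resolve_parameters(supplied):
    """Check the required parameters, then fill in defaults for the rest."""
    missing = [name for name in REQUIRED if not supplied.get(name)]
    if missing:
        raise ValueError("Missing required parameter(s): " + ", ".join(missing))
    resolved = dict(OPTIONAL_DEFAULTS)  # start from the defaults
    resolved.update(supplied)           # supplied values override defaults
    return resolved


resolved = resolve_parameters({
    "in_features": "us_cities.shp",
    "out_feature_class": "us_cities_buffered.shp",
    "buffer_distance": "10 miles",
    "dissolve_type": "ALL",
})
print(resolved)
```

Leaving out any of the three required values raises an error, which is roughly what the tool dialog does when a required field is left empty.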
Hover your mouse over any of the tool parameters. You should see a blue "info" icon to the left of the parameter. Moving your mouse over that icon will show a brief description of the parameter in a pop-out window.
If you’re not sure what a parameter means, this is a good way to learn. For example, viewing the pop-out documentation for the End Type parameter will show you an explanation of what this parameter means and list the two options: Round and Flat.
If you need even more help, each tool is more expansively documented in the ArcGIS Pro web-based help system. You can access a tool's documentation in this system by clicking on the blue ? icon in the upper-right of the tool dialog, which will open the help page in your default web browser.
You can access ArcGIS geoprocessing tools in several different ways:
We’ll start with the simplest of these cases, running a tool from its GUI, and work our way up to scripting.
Let’s start by opening a tool from the Catalog pane and running it using its graphical user interface (GUI).
Examine the first required parameter: Input Features. Click the Browse button and browse to the path of your cities dataset C:\PSU\Geog485\Lesson1\us_cities.shp. Notice that once you do this, a name is automatically supplied for the Output Feature Class (and the output path is the same as the input features). The software does this for your convenience only, and you can change the name/path if you want.
A more convenient way to supply the Input Features is to just select the cities map layer from the dropdown menu. This dropdown automatically contains all the layers in your map. However, in this example, we browsed to the path of the data because it’s conceptually similar to how we’ll provide the paths in the command line and scripting environments.
Hover over the Buffer tool entry in this list to see a pop-out window. This window lists the tool parameters, the time of completion, and any problems that occurred when running the tool (see Figure 1.1). These messages can be a big help later when you troubleshoot your Python scripts. The text of these messages is available whether you run the tool from the GUI, from the Python window in Pro, or from scripts.
When you work with geoprocessing, you’ll frequently want to use the output of one tool as the input into another tool. For example, suppose you want to find all fire hydrants within 200 meters of a building. You would first buffer the building, then use the output buffer as a spatial constraint for selecting fire hydrants. The output from the Buffer tool would be used as an input to the Select by Location tool.
A set of tools chained together in this way is called a model. Models can be simple, consisting of just a few tools, or complex, consisting of many tools and parameters and occasionally some iterative logic. Whether big or small, the benefit of a model is that it solves a unique geographic problem that cannot be addressed by one of the “out-of-the-box” tools.
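The idea of feeding one tool's output into the next can be sketched with two stand-in Python functions. Neither function is a real ArcGIS tool. They are hypothetical placeholders that only return text describing what a real tool would produce, and the dataset names are made up.

```python
def buffer(in_features, distance):
    """Stand-in for a buffer tool: returns the name of its output dataset."""
    return f"{in_features}_buffer"


def select_by_location(target_features, selecting_features):
    """Stand-in for a spatial selection: describes what would be selected."""
    return f"{target_features} within {selecting_features}"


# Step 1: buffer the buildings by 200 meters.
building_buffer = buffer("buildings", "200 meters")

# Step 2: the output of step 1 becomes an input to step 2.
nearby_hydrants = select_by_location("hydrants", building_buffer)
print(nearby_hydrants)
```

The variable building_buffer plays the same role as the connecting line between two tools in a model.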
In ArcGIS, modeling can be done either through the ModelBuilder graphical user interface (GUI) or through code, using Python. To keep our terms clear, we’ll refer to anything built in ModelBuilder as a “model” and anything built through Python as a “script.” However, it’s important to remember that both things are doing modeling.
ModelBuilder is Esri’s graphical interface for making models. You can drag and drop tools from the Catalog pane into the model and “connect” them, specifying the order in which they should run.
Although this is primarily a programming course, we’ll spend some time in ModelBuilder during the first lesson for two reasons:
Let’s get some practice with ModelBuilder to solve a real scenario. Suppose you are working on a site selection problem where you need to select all areas that fall within 10 miles of a major highway and 10 miles of a major city. The selected area cannot lie in the ocean or outside the United States. Solving the problem requires that you make buffers around both the roads and the cities, intersect the buffers, then clip to the US outline. Instead of manually opening the Buffer tool twice, followed by the Intersect tool, then the Clip tool, you can set this up in ModelBuilder to run as one process.
Click OK to dismiss the model Properties dialog.
You now have a blank canvas on which you can drag and drop the tools. When creating a model (and when writing Python scripts), it’s best to break your problem into manageable pieces. The simple site selection problem here can be thought of as four steps: buffer the cities, buffer the roads, intersect the two sets of buffers, and clip the result to the US boundary.
Let’s tackle these items one at a time, starting with buffering the cities.
Click the Buffer tool and drag it onto the ModelBuilder canvas. You’ll see a gray rectangular box representing the buffer tool and a gray oval representing the output buffers. These are connected with a line, showing that the Buffer tool will always produce an output data set.
In ModelBuilder, tools are represented with boxes and variables are represented with ovals. Right now, the Buffer tool, at center, is gray because you have not yet supplied the required parameters. Once you do this, the tool and the variable will fill in with color.
An important part of working with ModelBuilder is supplying clear labels for all the elements. This way, if you share your model, others can easily understand what will happen when it runs. Supplying clear labels also helps you remember what the model does, especially if you haven’t worked with the model for a while.
In ModelBuilder, right-click the us_cities.shp element (blue oval, at far left) and click Rename. Name this element "US Cities."
Right-click the buffer output element (green oval, at far right) and click Rename. Name this “Buffered cities.” Your model should look like this.
Practice what you just learned by adding another Buffer tool to your model. This time, configure the tool so that it buffers the us_roads shapefile by 10 miles. Remember to set the Dissolve type to Dissolve all output features... and to add meaningful labels. Your model should now look like this.
Rename the output of the Intersect operation "Intersected buffers." If the text runs onto multiple lines, you can click and drag the edges of the element to resize it. You can also rearrange the elements on the page however you like. Because models can get large, ModelBuilder contains several navigation buttons for zooming in and zooming to the full extent of the model in the View button group on the ribbon. Your model should now look like this:
Set meaningful labels for the remaining tools as shown below. Below is an example of how you can label and arrange the model elements.
When the model has finished running (it may take a while), examine the output on the map. Zoom into Washington state to verify that the Clip worked on the coastal areas. The output should look similar to this.
That’s it! You’ve just used ModelBuilder to chain together several tools and solve a GIS problem.
You can double-click this model anytime in the Catalog pane and run it just as you would a tool. If you do this, you’ll notice that the model has no parameters; you can’t change the buffer distance or input features. The truth is, our model is useful for solving this particular site-selection problem with these particular datasets, but it’s not very flexible. In the next section of the lesson, we’ll make this model more versatile by configuring some of the variables as input and output parameters.
Most tools, models, and scripts that you create with ArcGIS have parameters. Input parameters are values with which the tool (or model or script) starts its work, and output parameters represent what the tool gives you after its work is finished.
A tool, model, or script without parameters is only good in one scenario. Consider the model you just built that used the Buffer, Intersect, and Clip tools. This model was hard-coded to use the us_cities, us_roads, and us_boundaries shapefiles and output a shapefile called suitable_land. In other words, if you wanted to run the model with other datasets, you would have to open ModelBuilder, double-click each element (US Cities, US Roads, US Boundaries, and Suitable land), and change the paths that were written directly into the model. You would have to follow a similar process if you wanted to change the buffer distances, too, since those were hard-coded to 10 miles.
Let’s modify that model to use some parameters, so that you can easily run it with different datasets and buffer distances.
Even though you "parameterized" the cities, your model still defaults to using the C:\PSU\Geog485\Lesson1\us_cities.shp dataset. This isn't going to make much sense if you share your model or toolbox with other people because they may not have the same us_cities shapefile, and even if they do, it probably won't be sitting at the same path on their machines.
To remove the default dataset, double-click the Cities element and delete the path, then click OK. Some of the elements in your model may turn gray. This signifies that a value has to be provided before the model can successfully run.
Double-click your model Lesson 1 > Find Suitable Land With Parameters and examine the tool dialog. It should look similar to this:
People who run this model will be able to browse to any cities, roads, and boundaries datasets, and will be able to control the buffer distance. The red asterisks indicate parameters that must be supplied with valid values before the model can run.
The above exercise demonstrated how you can expose values as parameters using ModelBuilder. You need to decide which values you want the user to be able to change and designate those as parameters. When you write Python scripts, you'll also need to identify and expose parameters in a similar way.
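As a preview, here is a minimal sketch of what exposing parameters can look like in Python. The function read_parameters and the default distance are hypothetical. In a real script the list of values might come from the command line (sys.argv[1:]) or, in an ArcGIS script tool, from arcpy.GetParameterAsText, which is covered later in the course.

```python
DEFAULT_DISTANCE = "10 miles"  # assumed default, matching the model above


def read_parameters(values):
    """Return (cities, roads, boundary, distance) from a list of values.

    The three dataset paths are required; the buffer distance is optional.
    """
    if len(values) < 3:
        raise ValueError("Expected at least: cities roads boundary [distance]")
    cities, roads, boundary = values[0], values[1], values[2]
    distance = values[3] if len(values) > 3 else DEFAULT_DISTANCE
    return cities, roads, boundary, distance


# Nothing is hard-coded inside the function, so the same code works for
# any datasets and any distance the user supplies.
print(read_parameters(["us_cities.shp", "us_roads.shp", "us_boundaries.shp"]))
print(read_parameters(["cities.shp", "roads.shp", "outline.shp", "25 miles"]))
```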
By now, you've had some practice with ModelBuilder, and you're about ready to get started with Python. This page of the lesson contains some optional advanced material that you can read about ModelBuilder. This is particularly helpful if you anticipate using ModelBuilder frequently in your employment. Some of the items are common to the ArcGIS geoprocessing framework, meaning that they also apply when writing Python scripts with ArcGIS.
GIS analysis sometimes gets messy. Most of the tools that you run produce an output dataset, and when you chain many tools together, those datasets start piling up on disk. Esri has programmed ModelBuilder's default behavior such that when a model is run from a GUI, all datasets besides the final output -- referred to as intermediate data -- are automatically deleted. If, on the other hand, the model is run from ModelBuilder, intermediate datasets are left in their specified locations.
When running a model on another file system, specifying paths as we did above can be problematic, since the folder structure is not likely to be the same. This is where the concept of the scratch geodatabase (or scratch folder for file-based data like shapefiles) environment variable can come in handy. A scratch geodatabase is one that is guaranteed to exist on all ArcGIS installations. Unless the user has changed it, the scratch geodatabase will be found at C:\Users\<user>\Documents\ArcGIS\scratch.gdb on Windows 7/8. You can specify that a tool write to the scratch geodatabase by using the %scratchgdb% variable in the path. For example, %scratchgdb%\myOutput.
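In a Python script, the same idea is to build the output path from the scratch location rather than typing a fixed folder. arcpy exposes that location as arcpy.env.scratchGDB. The sketch below does not import arcpy; it uses a made-up example path (with a hypothetical user name, student) and a hypothetical helper, scratch_output_path, to show only the path-building step.

```python
import ntpath  # Windows-style path handling that works on any operating system


def scratch_output_path(scratch_gdb, dataset_name):
    """Join a scratch geodatabase location and an output dataset name."""
    return ntpath.join(scratch_gdb, dataset_name)


# Example value only. In ArcGIS, this would come from arcpy.env.scratchGDB.
example_gdb = r"C:\Users\student\Documents\ArcGIS\scratch.gdb"

print(scratch_output_path(example_gdb, "myOutput"))
```

Because the geodatabase location is passed in rather than written into the function, the same code can run on a machine with a different folder structure.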
The following topics from Esri go into more detail on intermediate data and are important to understand as you work with the geoprocessing framework. I suggest reading them once now and returning to them occasionally throughout the course. Some of the concepts in them are easier to understand once you've worked with geoprocessing for a while.
Looping, or iteration, is the act of repeating a process. A main benefit of computers is their ability to quickly repeat tasks that would otherwise be mundane, cumbersome, or error-prone for a human to repeat and record. Looping is a key concept in computer programming, and you will use it often as you write Python scripts for this course.
ModelBuilder contains a number of elements called Iterators that can do looping in various ways. The names of these iterators, such as For and While, actually mimic the types of looping that you can program in Python and other languages. In this course, we'll focus on learning iteration in Python, which may actually be just as easy as learning how to use a ModelBuilder iterator.
To take a peek at how iteration works in ModelBuilder, you can visit the ArcGIS Pro ModelBuilder help book for model iteration [6]. If you're having trouble understanding looping in later lessons, ModelBuilder might be a good environment to visualize what a loop does. You can come back and visit this book as needed.
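For comparison, here is a small sketch of the two loop types in Python. The distances and the counter are made-up values for illustration, and no geoprocessing is performed.

```python
# A "for" loop repeats once for each item in a collection.
distances = [5, 10, 15]
labels = []
for miles in distances:
    labels.append(f"{miles} miles")
print(labels)

# A "while" loop repeats as long as a condition remains true.
remaining = 3
passes = 0
while remaining > 0:
    remaining -= 1  # one fewer item left to process
    passes += 1     # count how many times the loop body ran
print(passes)
```

A for loop suits cases where you already know the items to process, such as a list of buffer distances. A while loop suits cases where you keep going until some condition changes.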
Read Zandbergen Chapter 3.1 - 3.6, and 3.8 to reinforce what you learned about geoprocessing and ModelBuilder.
The best way to introduce Python may be to look at a little bit of code. Let’s take the Buffer tool which you recently ran from the Geoprocessing pane and run it in the ArcGIS Python window. This window allows you to type a simple series of Python commands without writing full permanent scripts. The Python Window is a great way to get a taste of Python.
This time, we’ll make buffers of 15 miles around the cities.
Type the following in the Python window (Don't type the >>>. These are just included to show you where the new lines begin in the Python window.)
>>> import arcpy
>>> arcpy.Buffer_analysis("us_cities", "us_cities_buffered", "15 miles", "", "", "ALL")
Zoom in and confirm that the buffers were created.
You’ve just run your first bit of Python. You don’t have to understand everything about the code you wrote in this window, but here are a few important things to note.
The first line of the script -- import arcpy -- tells the Python interpreter (which was installed when you installed ArcGIS) that you’re going to work with some special scripting functions and tools included with ArcGIS. Without this line of code, Python knows nothing about ArcGIS, so you'll put it at the top of all ArcGIS-related code that you write in this class. You technically don't need this line when you work with the Python window in ArcGIS Pro because arcpy is already imported, but I wanted to show you this pattern early; you'll use it in all the scripts you write outside the Python window.
The second line of the script actually runs the tool. You can type arcpy, plus a dot, plus any tool name to run a tool in Python. Notice here that you also put an underscore followed by the name of the toolbox that includes the Buffer tool. This is necessary because some tools in different toolboxes actually have the same name (like Clip, which is a tool for clipping vectors in the Analysis toolbox or a tool for clipping rasters in the Data Management toolbox).
After you typed arcpy.Buffer_analysis, you typed all the parameters for the tool. Each parameter was separated by a comma, and the whole list of parameters was enclosed in parentheses. Get used to this pattern, since you'll follow it with every tool you run in this course.
In this code, we also supplied some optional parameters, leaving empty quotes where we wanted to take the default values, and truncating the parameter list at the final optional parameter we wanted to set.
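Counting empty quotes gets error-prone when a tool has many optional parameters. Python's keyword arguments let you name only the parameters you want to set. The sketch below uses a hypothetical stand-in function, record_buffer_call, whose parameter names are modeled on the Buffer tool's documented syntax. It only records its arguments and runs no geoprocessing. arcpy tool functions are expected to accept keyword arguments in the same way, but treat that as an assumption to confirm in the tool's help topic.

```python
def record_buffer_call(in_features, out_feature_class, buffer_distance_or_field,
                       line_side="", line_end_type="", dissolve_option=""):
    """Hypothetical stand-in: return the arguments it received as a dictionary."""
    return {
        "in_features": in_features,
        "out_feature_class": out_feature_class,
        "buffer_distance_or_field": buffer_distance_or_field,
        "line_side": line_side,
        "line_end_type": line_end_type,
        "dissolve_option": dissolve_option,
    }


# Positional style: empty quotes hold the place of skipped optional parameters.
positional = record_buffer_call("us_cities", "us_cities_buffered", "15 miles",
                                "", "", "ALL")

# Keyword style: name the one optional parameter you want to set.
keyword = record_buffer_call("us_cities", "us_cities_buffered", "15 miles",
                             dissolve_option="ALL")

print(positional == keyword)
```

Both calls pass the same values. The keyword form is longer to type but makes it harder to put a value in the wrong position.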
How do you know the syntax, or structure, of the parameters to enter? For example, for the buffer distance, should you enter 15MILES, '15MILES', 15 Miles, or '15 Miles'? The best way to answer questions like these is to return to the Geoprocessing tool reference help topic for the Buffer tool [7]. All of the topics in this reference section have Usage and Code Sample sections to help you understand how to structure the parameters. Optional parameters are enclosed in braces, while the required parameters are not. From the example in this topic, you can see that the buffer distance should be specified as '15 miles'. Because this value is text, or a string, you need to surround it with quotes. Python accepts either single or double quotes, which is why "15 miles" also worked in the Python window.
You might have noticed that the Python window helps you by popping up different options you can type for each parameter. This is called autocompletion, and it can be very helpful if you're trying to run a tool for the first time, and you don't know exactly how to type the parameters. You might have also noticed that some code pops up like (Buffer() analysis and Buffer3D() 3d) when you were typing in the function name. You can use your up/down arrows to highlight the alternatives. If you selected Buffer() analysis, it will appear in your Python window.
Please note that if you use code completion, your code will sometimes look slightly different. Esri has reorganized how the functions are arranged within arcpy; they work the same, but they're in a slightly different place. The "old" way still works, though, so you might see inconsistencies in this class, online forums, Esri's documentation, etc. In this example, arcpy.Buffer_analysis(...) has changed to arcpy.analysis.Buffer(...), reflecting that the Buffer tool is located within the Analysis toolbox in Pro.
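The relationship between the two naming styles can be sketched with a few lines of string handling. The helper below, newer_style_name, is hypothetical and is not part of arcpy. It only rearranges text to show the pattern and does not check whether a tool actually exists.

```python
def newer_style_name(older_style_name):
    """Convert a name like 'Buffer_analysis' to the form 'analysis.Buffer'."""
    # Split at the last underscore: tool name on the left, toolbox alias on the right.
    tool, _, toolbox_alias = older_style_name.rpartition("_")
    if not tool or not toolbox_alias:
        raise ValueError(f"Expected a name like Tool_toolbox, got {older_style_name!r}")
    return f"{toolbox_alias}.{tool}"


for name in ["Buffer_analysis", "Clip_analysis", "Clip_management"]:
    print("arcpy." + name, "->", "arcpy." + newer_style_name(name))
```

In both styles, the toolbox alias is what distinguishes the two Clip tools from each other.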
There are a couple of differences between writing code in the Python window and writing code in some other program, such as Notepad or PyScripter. In the Python window, you can reference layers in the map document by their names only, instead of their file paths. Thus, we were able to type "us_cities" instead of something like "C:\\data\\us_cities.shp". We were also able to make up the name of a new layer "us_cities_buffered" and get it added to the map by default after the code ran. If you're going to use your code outside the Python window, make sure you use the full paths.
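Windows paths need some care in Python because the backslash is also Python's escape character. The sketch below shows three common ways to write the same example path. The path itself is only an illustration, and whether a given tool accepts the forward-slash form is an assumption you should test in your own environment.

```python
# 1. Double each backslash so Python treats it as a literal character.
escaped = "C:\\data\\us_cities.shp"

# 2. Prefix the string with r (a "raw" string) and type the path normally.
raw = r"C:\data\us_cities.shp"

# 3. Use forward slashes, which need no escaping in Python.
forward = "C:/data/us_cities.shp"

# The first two spellings produce exactly the same text.
print(escaped == raw)

# The third differs only in the direction of the slashes.
print(forward.replace("/", "\\") == raw)
```

Writing "C:\data\us_cities.shp" with single backslashes and no r prefix does not work, because Python tries to interpret \u as the start of a Unicode escape sequence.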
When you write more complex scripts, it will be helpful to use an integrated development environment (IDE), meaning a program specifically designed to help you write and test Python code. Later in this course, we’ll explore the PyScripter IDE.
Earlier in this lesson, you saw how tools can be chained together to solve a problem using ModelBuilder. The same can be done in Python, but it’s going to take a little groundwork to get to that point. For this reason, we’ll spend the rest of Lesson 1 covering some of the basics of Python.
Zandbergen covers the Python window and some things you can do with it in Chapter 2.
2nd Edition: 2.7, and 2.9-2.13
3rd Edition: 2.8, and 2.10-2.14
Python is a language that is used to automate computing tasks through programs called scripts. In the introduction to this lesson, you learned that automation makes work easier, faster, and more accurate. This applies to GIS and many other areas of computer science. Learning Python will make you a more effective GIS analyst, but Python programming is a technical skill that can be beneficial to you even outside the field of GIS.
Python is a good language for beginning programming. Python is a high-level language, meaning you don’t have to understand the “nuts and bolts” of how computers work in order to use it. Python syntax (how the code statements are constructed) is relatively simple to read and understand. Finally, Python requires very little overhead to get a program up and running.
Python is an open-source language, and there is no fee to use it or deploy programs with it. Python can run on Windows, Linux, Unix, and Mac operating systems.
In ArcGIS, Python can be used for coarse-grained programming, meaning that you can use it to easily run geoprocessing tools such as the Buffer tool that we just worked with. You could code all the buffer logic yourself, using more detailed, fine-grained programming with the ArcGIS Pro SDK, but this would be time consuming and unnecessary in most scenarios; it’s easier just to call the Buffer tool from a Python script using one line of code.
In addition to the Esri help, which describes all of the parameters of a function and how to access them from Python, you can also get the Python syntax (the structure of the language) for a tool like this:
PyScripter is an easy IDE to install for ArcGIS Pro development. If you are using ArcGIS Pro version 2.2 or newer, you will first have to create and activate a clone of the ArcGIS default Python environment (see here [8] for details on this issue). ArcGIS Pro 3.0 changed the way that Pro manages the environments and gives more control to the user. These steps below are written using Pro 3.1 as a reference. If you have an earlier version of Pro, these steps are similar in process but the output destination (step 4.1) will be set for you. To do this, follow these steps below and please let the instructor know if you run into any trouble.
Now perform the following steps to install PyScripter:
If you are familiar with another IDE you're welcome to use it instead of PyScripter (just verify that it is using Python 3!) but we recommend that you still install PyScripter to be able to work through the following sections and the sections on debugging in Lesson 2.
Here’s a brief explanation of the main parts of PyScripter. Before you begin reading, be sure to have PyScripter open, so you can follow along.
When PyScripter opens, you’ll see a large text editor in the right side of the window. We'll come back to this part of the PyScripter interface in a moment. For now, focus on the pane at the bottom called the Python Interpreter. If this window is not open and is not listed as a tab along the bottom, you can open it from the top menu by selecting View > IDE Windows > Python Interpreter. This console is much like the Python interactive window we saw earlier in the lesson. You can type a line of Python at the >>> prompt, and it will immediately execute and print the result if there is a printable result. This console can be a good place to practice with Python in this course, and whenever you see some Python code next to the >>> prompt in the lesson materials, this means you can type it in the console to follow along.
We can experiment here by importing arcpy or running a print statement:
>>> import arcpy
>>> print ("Hello World")
Hello World
You might have noticed while typing in that second example a useful function of the Python Interpreter: code completion. This is where PyScripter, like Pro's Python window, is smart enough to recognize that you're entering a function name, and it provides you with information about the parameters that function takes. If you missed it the first time, enter print( in the Python Interpreter window and wait for a second (or less) and the print function's parameters will appear. This also works for arcpy functions (or those from any library that you import). Try it out with arcpy.Buffer_analysis.
Now let's return to the right side of the window, the Editor pane. It will contain a blank script file by default (module1.py). I say it's a blank script file because, while there is text in the file, that text is delimited by special characters that cause it to be ignored when the script is executed. We'll discuss these special characters further later, but for now, it's sufficient to note that PyScripter automatically inserts the character encoding of the file, the time it was created, and the login name of the user running PyScripter. You can add the actual Python statements that you'd like to be executed beneath these bits of documentation. (You can also remove the function and if statement along with the documentation, if you like.)
Among the nice features of PyScripter's editor (and other Python IDEs) is its color coding of different Python language constructs. Spacing and indentation, which are important in Python, are also easy to keep track of in this interface. Lastly, note that the Editor pane is a tabbed environment; additional script files can be loaded using File > New or File > Open.
Above the Editor pane, a number of toolbars are visible by default. The File, Run, and Debug toolbars provide access to many commonly used operations through a set of buttons. The File toolbar contains tools for loading, running, and saving scripts. The Debug toolbar contains tools for carefully reviewing your code line-by-line to help you detect errors. The Debug toolbar is extremely valuable to you as a programmer, and you’ll learn how to use it later in this course. This toolbar is one of the main reasons to use an Integrated Development Environment (IDE) instead of writing your code in a simple text editor like Notepad.
It’s time to get some practice with some beginning programming concepts that will help you write some simple scripts in Python by the end of Lesson 1. We’ll start by looking at variables.
Remember your first introductory algebra class where you learned that a letter could represent any number, like in the statement x + 3? This may have been your first exposure to variables. (Sorry if the memory is traumatic!) In computer science, variables represent values or objects you want the computer to store in its memory for use later in the program.
Variables are frequently used to represent not only numbers, but also text and “Boolean” values (‘true’ or ‘false’). A variable might be used to store input from the program’s user, to store values returned from another program, to represent constant values, and so on.
Variables make your code readable and flexible. If you hard-code your values, meaning that you always use the literal value, your code is useful only in one particular scenario. You could manually change the values in your code to fit a different scenario, but this is tedious and exposes you to a greater risk of making a mistake (suppose you forget to change a value). Variables, on the other hand, allow your code to be useful in many scenarios and are easy to parameterize, meaning you can let users change the values to whatever they need.
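As a small illustration of the difference, here is a minimal sketch that converts a buffer distance from miles to meters. The variable names and the 5-mile distance are made up for this example and do not come from any course dataset.

```python
# Hard-coded: the literal values are buried in the expression.
# To try a different distance, you would have to find and edit every 5.
print(5 * 1609.344)

# With variables: each value is named once and can be changed in one place.
metersPerMile = 1609.344   # conversion factor (1 mile = 1609.344 meters)
bufferMiles = 5            # hypothetical buffer distance
bufferMeters = bufferMiles * metersPerMile
print(bufferMeters)
```

Changing bufferMiles to 10 updates every later calculation that depends on it, which is the flexibility the paragraph above describes.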
To see some variables in action, open PyScripter and type this in the Python Interpreter:
>>> x = 2
You’ve just created, or declared, a variable, x, and set its value to 2. In some statically typed programming languages, such as Java, you would be required to tell the program that you were creating a numerical variable, but Python infers this when it sees the 2.
When you hit Enter, nothing happens, but the program now has this variable in memory. To prove this, type:
>>> x + 3
You see the answer of this mathematical expression, 5, appear immediately in the console, proving that your variable was remembered and used.
You can also use the print function to write the results of operations. We’ll use this a lot when practicing and testing code.
>>> print (x + 3)
5
Variables can also represent words, or strings, as they are referred to by programmers. Try typing this in the console:
>>> myTeam = "Nittany Lions"
>>> print (myTeam)
Nittany Lions
In this example, the quotation marks tell Python that you are declaring a string variable. Python is a powerful language for working with strings. A very simple example of string manipulation is to add, or concatenate, two strings, like this:
>>> string1 = "We are "
>>> string2 = "Penn State!"
>>> print (string1 + string2)
We are Penn State!
You can include a number in a string variable by putting it in quotes, but you must thereafter treat it like a string; you cannot treat it like a number. For example, this results in an error:
>>> myValue = "3"
>>> print (myValue + 2)
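The sketch below shows one way to see that error safely and two ways around it. It uses a try/except block, a structure that appears in the Buffer script example later in this lesson, so that the error is reported instead of stopping the code. The names numericResult and textResult are made up for illustration.

```python
myValue = "3"

# Adding a string and a number raises a TypeError
try:
    print(myValue + 2)
except TypeError as error:
    print("Error:", error)

# Convert the string to a number if you want arithmetic
numericResult = int(myValue) + 2
print(numericResult)   # 5

# Or convert the number to a string if you want concatenation
textResult = myValue + str(2)
print(textResult)   # 32
```

Which conversion is appropriate depends on whether you mean to do math with the value or build a piece of text from it.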
In these examples, you’ve seen the use of the = sign to assign the value of the variable. You can always reassign the variable. For example:
>>> x = 5
>>> x = x - 2
>>> print (x)
3
When naming your variables, the following tips will help you avoid errors.
Make variable names meaningful so that others can easily read your code. This will also help you read your code and avoid making mistakes.
You’ll get plenty of experience working with variables throughout this course and will learn more in future lessons.
Read Zandbergen section 4.5 on variables and naming.
The number and string variables that we worked with above represent data types that are built into Python. Variables can also represent other things, such as GIS datasets, tables, rows, and the geoprocessor that we saw earlier that can run tools. All of these things are objects that you use when you work with ArcGIS in Python.
In Python, everything is an object. All objects have:
One way to understand objects is to compare performing an operation in a procedural language (like FORTRAN) to performing the same operation in an object-oriented language. We'll pretend that we are writing a program to make a peanut butter and jelly sandwich. If we were to write the program in a procedural language, it would flow something like this:
If we were to write the program in an object-oriented language, it might look like this:
In the object-oriented example, the bulk of the steps have been eliminated. The sandwich object "knows how" to build itself, given just a few pieces of information. This is an important feature of object-oriented languages known as encapsulation.
Notice that you can define the properties of the sandwich (like the bread type) and perform methods (remember that these are actions) on the sandwich, such as adding the peanut butter and jelly.
The reason it’s so easy to "make a sandwich" in an object-oriented language is that some programmer, somewhere, already did the work to define what a sandwich is and what you can do with it. He or she did this using a class. A class defines how to create an object, the properties and methods available to that object, how the properties are set and used, and what each method does.
A class may be thought of as a blueprint for creating objects. The blueprint determines what properties and methods an object of that class will have. A common analogy is that of a car factory. A car factory produces thousands of cars of the same model that are all built on the same basic blueprint. In the same way, a class produces objects that have the same predefined properties and methods.
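To make the blueprint idea concrete, here is a minimal sketch of what a sandwich class could look like in Python. The Sandwich class, its breadType property, and its addFilling and describe methods are all hypothetical names invented for this illustration. You won't need to write classes yourself in this lesson.

```python
class Sandwich:
    """Hypothetical blueprint for a sandwich (illustration only)."""

    def __init__(self, breadType):
        # Properties: data that each sandwich object stores about itself
        self.breadType = breadType
        self.fillings = []

    def addFilling(self, filling):
        # Method: an action the object knows how to perform on itself
        self.fillings.append(filling)

    def describe(self):
        # Method that builds a text description from the properties
        return self.breadType + " bread with " + " and ".join(self.fillings)


# Create an object from the class, then use its methods
mySandwich = Sandwich("wheat")
mySandwich.addFilling("peanut butter")
mySandwich.addFilling("jelly")
print(mySandwich.describe())   # wheat bread with peanut butter and jelly
```

Every object created from Sandwich gets the same properties and methods, just as every car from the factory follows the same blueprint.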
In Python, classes are grouped together into modules. You import modules into your code to tell your program what objects you’ll be working with. You can write modules yourself, but most likely you'll bring them in from other parties or software packages. For example, the first line of most scripts you write in this course will be:
import arcpy
Here, you're using the import keyword to tell your script that you’ll be working with the arcpy module, which is provided as part of ArcGIS. After importing this module, you can create objects that leverage ArcGIS in your scripts.
Other modules that you may import in this course are os (allows you to work with the operating system), random (allows for generation of random numbers), csv (allows for reading and writing of spreadsheet files in comma-separated value format), and math (allows you to work with advanced math operations). These modules are included with Python, but they aren't imported by default. A best practice for keeping your scripts fast is to import only the modules that you need for that particular script. For example, although it might not cause any errors in your script, you wouldn't include import arcpy in a script not requiring any ArcGIS functions.
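Here is a short sketch of importing a few of these standard modules and using one function from each. The specific calls are chosen only as examples, and the path is the one used in this lesson's first example script.

```python
import math
import os
import random

# math: advanced math operations
hypotenuse = math.sqrt(3**2 + 4**2)
print(hypotenuse)   # 5.0

# os: work with file paths (here, get the last part of a path)
featureClass = "C:/Data/USA/USA.gdb/Boundaries"
datasetName = os.path.basename(featureClass)
print(datasetName)   # Boundaries

# random: generate a random whole number from 1 to 6
roll = random.randint(1, 6)
print(roll)
```

Notice that each function is called with the module name in front of it (math.sqrt, not just sqrt), which is the same pattern you use with arcpy.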
Read Zandbergen section 5.9 (Classes) for more information about classes.
Another important feature of object-oriented languages is inheritance. Classes are arranged in a hierarchical relationship, such that each class inherits its properties and methods from the class above it in the hierarchy (its parent class or superclass). A class also passes along its properties and methods to the class below it (its child class or subclass). A real-world analogy involves the classification of animal species. As a species, we have many characteristics that are unique to humans. However, we also inherit many characteristics from classes higher in the class hierarchy. We have some characteristics as a result of being vertebrates. We have other characteristics as a result of being mammals. To illustrate the point, think of the ability of humans to run. Our bodies respond to our command to run not because we belong to the "human" class, but because we inherit that trait from some class higher in the class hierarchy.
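The animal analogy can be sketched in a few lines of Python. The class names and the choice to define run on the Vertebrate class are assumptions made purely for illustration.

```python
class Vertebrate:
    """Hypothetical parent class (illustration only)."""

    def run(self):
        return "running"


class Mammal(Vertebrate):
    """Child of Vertebrate; inherits run() and adds a property."""

    warmBlooded = True


class Human(Mammal):
    """Child of Mammal; inherits everything above and adds speak()."""

    def speak(self):
        return "speaking"


person = Human()
print(person.run())          # defined on Vertebrate, available to Human
print(person.warmBlooded)    # defined on Mammal
print(person.speak())        # defined on Human itself
print(isinstance(person, Vertebrate))   # True
```

The Human class never defines run, yet a Human object can run because the method is inherited from higher in the hierarchy.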
Back in the programming context, the lesson to be learned is that it pays to know where a class fits into the class hierarchy. Without that piece of information, you will be unaware of all of the operations available to you. This information about inheritance can often be found in informational posters called object model diagrams.
Here's an example of an object model diagram for the ArcGIS Python Geoprocessor at 10.x [12]. Take a look at the green(ish) box titled FeatureClass Properties and notice that in the middle column, second from the top, it says Dataset Properties. This is because FeatureClass inherits all properties from Dataset. Therefore, any properties on a Dataset object, such as Extent or SpatialReference, can also be obtained if you create a FeatureClass object. Apart from all the properties it inherits from Dataset, the FeatureClass has its own specialized properties such as FeatureType and ShapeType (in the top box in the left column).
Every programming language has rules about capitalization, white space, how to set apart lines of code and procedures, and so on. Here are some basic syntax rules to remember for Python:
Let’s look at a few example scripts to see how these rules are applied. The first example script is accompanied by a walkthrough video that explains what happens in each line of the code. You can also review the main points about each script after reading the code.
This first example script reports the spatial reference (coordinate system) of a feature class stored in a geodatabase. If you want to use the USA.gdb referenced in this example, you can run the code [13] yourself.
# Opens a feature class from a geodatabase and prints the spatial reference

import arcpy

featureClass = "C:/Data/USA/USA.gdb/Boundaries"

# Describe the feature class and get its spatial reference
desc = arcpy.Describe(featureClass)
spatialRef = desc.spatialReference

# Print the spatial reference name
print (spatialRef.Name)
This may look intimidating at first, so let’s go through what’s happening in this script, line by line. Watch this video (5:54) to get a visual walkthrough of the code.
Again, notice that:
The best way to get familiar with a new programming language is to look at example code and practice with it yourself. See if you can modify the script above to report the spatial reference of a feature class on your computer. In my example, the feature class is in a file geodatabase; you’ll need to modify the structure of the featureClass path if you are using a shapefile (for example, you'll put .shp at the end of the file name, and you won't have .gdb in your path).
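As a rough guide to the path change described above, the sketch below compares the two kinds of paths as plain Python strings. The shapefile path is hypothetical, so substitute a location that exists on your computer. The raw string form (r"...") is one way to write a Windows path with backslashes, because a backslash inside an ordinary Python string starts an escape sequence.

```python
# Feature class inside a file geodatabase (the path from the example script)
gdbFeatureClass = "C:/Data/USA/USA.gdb/Boundaries"

# Shapefile in an ordinary folder (hypothetical path): no .gdb, ends in .shp
shapefilePath = "C:/Data/USA/Boundaries.shp"

# If you prefer backslashes, use a raw string so that Python does not
# treat each backslash as the start of an escape sequence
rawShapefilePath = r"C:\Data\USA\Boundaries.shp"

print(gdbFeatureClass)
print(shapefilePath)
print(rawShapefilePath)
```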
Follow this pattern to try the example:
We'll take a short break and do some reading from another source. If you are new to Python scripting, it can be helpful to see the concepts from another point of view.
Read parts of Zandbergen chapters 4 & 5. This will be a valuable introduction to Python in ArcGIS, on how to work with tools and toolboxes (very useful for Project 1), and also on some concepts which we'll revisit later in Lesson 2 (don't worry if the bits we skip over seem daunting - we'll explain those in Lesson 2).
Here’s another simple script that finds all cells over 3500 meters in an elevation raster and makes a new raster that codes all those cells as 1. Remaining values in the new raster are coded as 0. This type of “map algebra” operation is common in site selection and other GIS scenarios.
Something you may not recognize below is the expression Raster(inRaster). This function just tells ArcGIS that it needs to treat your inRaster variable as a raster dataset so that you can perform map algebra on it. If you didn't do this, the script would treat inRaster as just a literal string of characters (the path) instead of a raster dataset.
# This script uses map algebra to find values in an
# elevation raster greater than 3500 (meters).

import arcpy
from arcpy.sa import *

# Specify the input raster
inRaster = "C:/Data/Elevation/foxlake"
cutoffElevation = 3500

# Check out the Spatial Analyst extension
arcpy.CheckOutExtension("Spatial")

# Make a map algebra expression and save the resulting raster
outRaster = Raster(inRaster) > cutoffElevation
outRaster.save("C:/Data/Elevation/foxlake_hi_10")

# Check in the Spatial Analyst extension now that you're done
arcpy.CheckInExtension("Spatial")
Begin by examining this script and trying to figure out as much as you can based on what you remember from the previous scripts you’ve seen.
The main points to remember on this script are:
Now try to run the script yourself using the FoxLake digital elevation model (DEM) in your Lesson 1 data folder. If it doesn’t work the first time, verify that:
You can experiment with this script using different values in the map algebra expression (try 3000 for example).
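If you'd like some intuition for what the map algebra expression produces, the following plain-Python analogy applies the same comparison to a tiny, made-up grid of elevation values. It does not use arcpy and is not how Spatial Analyst works internally. It only mimics the 1/0 coding of the output cells.

```python
cutoffElevation = 3500

# A made-up 3 x 3 "raster" of elevations in meters
elevations = [
    [3400, 3510, 3620],
    [3300, 3500, 3550],
    [3200, 3350, 3499],
]

# Code each cell as 1 if it is above the cutoff, otherwise 0
highCells = []
for row in elevations:
    codedRow = []
    for cell in row:
        if cell > cutoffElevation:
            codedRow.append(1)
        else:
            codedRow.append(0)
    highCells.append(codedRow)

for codedRow in highCells:
    print(codedRow)
```

Note that the cell with a value of exactly 3500 is coded as 0, because the expression tests for values greater than the cutoff, not greater than or equal to it.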
ArcGIS Pro edition:
Read the sections of Chapter 5 that talk about environment variables and licenses (5.11 & 5.13) which we covered in this part of the lesson.
ArcMap edition:
Read the sections of Chapter 5 that talk about environment variables and licenses (5.9 & 5.11) which we covered in this part of the lesson. The discussion of ArcGIS products in 5.11 does not apply to Pro. The useful content in this section begins at the bottom of page 117 with "Licenses for extensions..."
Think about the previous example where you ran some map algebra on an elevation raster. If you wanted to change the value of your cutoff elevation to 2500 instead of 3500, you had to open the script itself and change the value of the cutoffElevation variable in the code.
This third example is a little different. Instead of hard-coding the values needed for the tool (in other words, literally including the values in the script) we’ll use some user input variables, or parameters. This allows people to try different values in the script without altering the code itself. Just like in ModelBuilder, parameters make your script available to a wider audience.
The simple example below just runs the Buffer tool, but it allows the user to enter the path of the input and output datasets as well as the distance of the buffer. The user-supplied parameters make their way into the script with the arcpy.GetParameterAsText() function.
Examine the script below carefully, but don't try to run it yet. You'll do that in the next part of the lesson.
# This script runs the Buffer tool. The user supplies the input
# and output paths, and the buffer distance.

import arcpy
arcpy.env.overwriteOutput = True

try:
    # Get the input parameters for the Buffer tool
    inPath = arcpy.GetParameterAsText(0)
    outPath = arcpy.GetParameterAsText(1)
    bufferDistance = arcpy.GetParameterAsText(2)

    # Run the Buffer tool
    arcpy.Buffer_analysis(inPath, outPath, bufferDistance)

    # Report a success message
    arcpy.AddMessage("All done!")

except:
    # Report an error message
    arcpy.AddError("Could not complete the buffer")

    # Report any error messages that the Buffer tool might have generated
    arcpy.AddMessage(arcpy.GetMessages())
Again, examine the above code line by line and figure out as much as you can about what the code does. If necessary, print the code and write notes next to each line. Here are some of the main points to understand:
ArcGIS Pro edition:
Read the section of Chapter 5 that talks about working with tool messages (5.12) for another perspective on handling tool output.
ArcMap edition:
Read the section of Chapter 5 that talks about working with tool messages (5.10) for another perspective on handling tool output. This section discusses tool messages that appear in ArcMap's Results window. These messages are accessed in Pro by going to the Geoprocessing History, right-clicking on the desired tool, and selecting View details.
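The try/except structure used in the Buffer script can be practiced without ArcGIS. The sketch below is a stand-in rather than part of the course script: it tries to convert a piece of user-supplied text to a number and reports a message either way. The variable names and the sample input are made up, and it catches the specific ValueError exception, whereas the Buffer script uses a bare except.

```python
userInput = "five hundred"   # hypothetical text supplied by a user

try:
    # This line fails because the text is not a valid number
    distance = float(userInput)
    status = "All done!"
except ValueError:
    # Code here runs only if the try block raised a ValueError
    distance = None
    status = "Could not read the distance"

print(status)   # Could not read the distance
```

Changing userInput to "500" lets the try block finish, so the except block is skipped and the success message is printed instead.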
User input variables that you retrieve through GetParameterAsText() make your script very easy to convert into a tool in ArcGIS. A few people know how to alter Python code, a few more can run a Python script and supply user input variables, but almost all ArcGIS users know how to run a tool. To finish off this lesson, we’ll take the previous script and make it into a tool that can easily be run in ArcGIS.
Before you begin this exercise, I strongly recommend that you scan the ArcGIS help topic Adding a script tool [16]. You likely will not understand all the parts of this topic yet, but it will give you some familiarity with script tools that will be helpful during the exercise.
Follow these steps to make a script tool:
This is a very simple example, and obviously, you could just run the out-of-the-box Buffer tool with similar results. Normally, when you create a script tool, it will be backed with a script that runs a combination of tools and applies some logic that makes those tools uniquely useful.
There’s another benefit to this example, though. Notice the simplicity of our script tool dialog compared to the main Buffer tool:
At some point, you may need to design a set of tools for beginning GIS users where only the most necessary parameters are exposed. You may also do this to enforce quality control if you know that some of the parameters must always be set to certain defaults, and you want to avoid the scenario where a beginning user (or a rogue user) might change the required values. A simple script tool is effective for simplifying the tool dialog in this way.
ArcGIS Pro edition:
Read Zandbergen 3.9-3.10 to reinforce what you learned during this lesson about scripts and script tools.
Each lesson in this course includes some simple practice exercises with Python. These are not submitted or graded, but they are highly recommended if you are new to programming or if the project initially looks challenging. Lessons 1 and 2 contain shorter exercises, while Lessons 3 and 4 contain longer, more holistic exercises. Each practice exercise has an accompanying solution that you should carefully study. If you want to use the USA.gdb referenced in some of the solutions you can find it here. [13]
Remember to choose File > New in PyScripter to create a new script (or click the empty page icon). You can name the scripts something like Practice1, Practice2, etc. To execute a script in PyScripter, click the "play" icon.
Suppose you're working on a project for the Nebraska Department of Agriculture and you are tasked with making some maps of precipitation in the state. Members of the department want to see which parts of the state were relatively dry and wet in the past year, classified in zones. All you have is a series of weather station readings of cumulative rainfall for 2008 that you've obtained from within Nebraska and surrounding areas. This is a shapefile of points called Precip2008Readings.shp. It is in your Lesson 1 data folder.
Precip2008Readings.shp is a fictional dataset created for this project. The locations do not correspond to actual weather stations. However, the measurements are derived from real 2008 precipitation data created by the PRISM Climate Group [23] at Oregon State University.
You need to do several tasks in order to get this data ready for mapping:
It's very possible that you'll want to repeat the above process in order to test different IDW interpolation parameters or make similar maps with other datasets (such as next year's precipitation data). Therefore, the above series of tasks is well-suited to ModelBuilder. Your job is to create a model that can complete the above series of steps without you having to manually open four different tools.
Your model should have these (and only these) parameters:
As you build your model, you will need to configure some settings that will not be exposed as parameters. These include the clip feature, which is the state of Nebraska outline Nebraska.shp in your Lesson 1 data folder. There are many other settings such as "Z Value field" and "Input barrier polyline features" (for IDW) or "Reclass field" (for Reclassify) that should not be exposed as parameters. You should just set these values once when you build your model. If you ever ask someone else to run this model, you don't want them to be overwhelmed with choices stemming from every tool in the model; you should just expose the essential things they might want to change.
For this particular model, you should assume that any input dataset will conform to the same schema as your Precip2008Readings.shp feature class. For example, an analyst should be able to submit similar datasets Precip2009Readings, Precip2010Readings, etc. for more recent years with the same fields, field names, and data types. However, he or she should not expect to provide any feature class with a different set of fields and field names, etc. As you might discover, handling all types of feature class schemas would make your model more complex than we want for this assignment.
Important: Given the scenario of wishing to re-run the model for other years of data, it would be a good idea to set default values for the exposed model parameters. Therefore, we are asking you to set default values for all parameters that are exposed as model parameters including the Power value, Search radius value, and Zone boundaries classification table. When you double-click the model to run it, the interface should look like the following:
Running the model with the exact parameters listed above should result in the following (I have symbolized the zones in Pro with different colors to help distinguish them). This is one way you can check your work:
Once you are done, take a screenshot of the layout of your final model in ModelBuilder (similar to Figure 1.5 in Section 1.3.2) to include in your homework submission.
The following tips may help you as you build your model:
The second part of Project 1 will help you get some practice with Python. At the end of Lesson 1, you saw three simple scripting examples; now your task is to write your own script. This script will create vector contour lines from a raster elevation dataset. Don't forget that the ArcGIS Pro Help [25] can indeed be helpful if you need to figure out the syntax for a particular command.
Earlier in the lesson, you were introduced to the Fox Lake DEM in your Lesson 1 data folder. It represents elevation in the Fox Lake Quadrangle, Utah. Write a script that uses the Contour tool in the Spatial Analyst toolbox to create contour lines for the quadrangle. The contour interval should be 25 meters, and the base contour should be 0. Remember that the native units of the DEM are meters, so no unit conversions are required.
Running the script should immediately create a shapefile of contour lines on disk.
Follow these guidelines when writing the script:
The deliverables for Project 1 are:
Important: Successful delivery of the above requirements is sufficient to earn 90% on the project. The remaining 10% is reserved for efforts that go "over and above" the minimum requirements. For Part I, this could include (but is not limited to) analysis of how different input values affect the output, substitution of some other interpolation method instead of IDW (for example Kriging), documentation for your model parameters that guides the end user in what to input, or demonstration of how your model was successfully run on a different input dataset. For Part II, in addition to the Contour tool, you could run some other tool that also takes a DEM as an input.
As a general rule throughout the course, full credit in the "over and above" category requires the implementation of 2-4 different ideas, with more complex ideas earning more credit. Note that for future projects, we won't be listing off ideas as we've done here. Otherwise, it wouldn't really be an over and above requirement.
To complete Lesson 1, please zip all your Project 1 deliverables (for parts I and II) into one file and submit them to the Project 1 Drop Box in Canvas. Then take the Lesson 1 Quiz if you haven't taken it already.
Links
[1] https://www.esri.com/training/catalog/57630436851d31e02a43f13c/python-for-everyone/
[2] https://www.e-education.psu.edu/geog485/sites/www.e-education.psu.edu.geog485/files/data/Lesson1.zip
[3] https://pro.arcgis.com/en/pro-app/help/analysis/geoprocessing/modelbuilder/create-a-model-tool.htm#ESRI_SECTION2_9F92183899BB40679F6C56E786F09992
[4] https://pro.arcgis.com/en/pro-app/help/analysis/geoprocessing/modelbuilder/modelbuilder-vocabulary.htm#ESRI_SECTION2_FCF8A4512F0E4429A10EDEA3593EB9E1
[5] https://pro.arcgis.com/en/pro-app/tool-reference/environment-settings/scratch-gdb.htm
[6] https://pro.arcgis.com/en/pro-app/help/analysis/geoprocessing/modelbuilder/iterators-for-looping.htm
[7] https://pro.arcgis.com/en/pro-app/tool-reference/analysis/buffer.htm
[8] https://community.esri.com/docs/DOC-12021-python-at-arcgispro-22
[9] http://sourceforge.net/projects/pyscripter/
[10] https://www.e-education.psu.edu/geog485/sites/www.e-education.psu.edu.geog485/files/ide_files/PyScripter-4.2.5-x64-Setup.zip
[11] https://www.e-education.psu.edu/geog485/sites/www.e-education.psu.edu.geog485/files/ide_files/PyScripter-4.2.5-x86.zip
[12] https://www.e-education.psu.edu/geog485/sites/www.e-education.psu.edu.geog485/files/file/Geoprocessor10.pdf
[13] https://www.e-education.psu.edu/geog485/sites/www.e-education.psu.edu.geog485/files/data/USA.gdb.zip
[14] https://creativecommons.org/licenses/by-nc-sa/4.0/
[15] https://pro.arcgis.com/en/pro-app/help/analysis/spatial-analyst/mapalgebra/working-with-raster-objects.htm
[16] https://pro.arcgis.com/en/pro-app/arcpy/geoprocessing_and_python/adding-a-script-tool.htm
[17] https://www.e-education.psu.edu/geog485/node/227
[18] https://www.e-education.psu.edu/geog485/node/228
[19] https://www.e-education.psu.edu/geog485/node/229
[20] https://pro.arcgis.com/en/pro-app/arcpy/functions/describe.htm
[21] https://www.e-education.psu.edu/geog485/node/230
[22] https://www.e-education.psu.edu/geog485/node/295
[23] http://www.prismclimate.org
[24] https://pro.arcgis.com/en/pro-app/tool-reference/3d-analyst/how-idw-works.htm
[25] https://pro.arcgis.com/en/pro-app/help/main/welcome-to-the-arcgis-pro-app-help.htm