GEOG 489
Advanced Python Programming for GIS

Lesson 1 Assignment

Part 1 – Multiprocessing Script

We are going to use the arcpy vector data processing code from Section 1.6.6.2 (download Lesson1_Assignment_initial_code.py) as the basis for our Lesson 1 programming project. The code is already in multiprocessing mode, so you will not have to write multiprocessing code from scratch, but you will still need a good understanding of how the script works. If you are unclear about anything the script does, please ask on the course forums. This part of the assignment is for getting back into the rhythm of writing arcpy-based Python code and for practicing creating a script tool with ArcGIS Pro. Your task is to extend our vector data clipping script by doing the following:

  • Modify the code to handle a parameterized output folder path (still using unique output filenames for each shapefile) defined in a third input variable at the beginning of the main script file. One way to achieve this task is by adding another (5th) parameter to the worker() function to pass the output folder information along with the other data.
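As a rough illustration of the bullet above, the output path can be assembled from the folder and the per-feature file name. This is only a sketch: the function name build_output_path and the parameter name out_folder are hypothetical, and the "clip_<oid>.shp" naming is assumed to match what the initial script produces.

```python
import os


def build_output_path(out_folder, oid):
    """Return the full path of the shapefile for one clipper feature.

    out_folder is the (hypothetical) third input variable holding the
    output folder; oid identifies the clipper feature, which keeps the
    file name unique.
    """
    file_name = "clip_" + str(oid) + ".shp"
    return os.path.join(out_folder, file_name)


# Example: every oid yields a different file inside the same folder.
for oid in range(3):
    print(build_output_path("output", oid))
```

Using os.path.join() instead of string concatenation avoids problems with a missing or doubled path separator at the end of the folder name.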

For this part, the main modifications have to be made to the input variables and to the code of the worker() and mp_handler() functions. Of course, we will also look at code quality, so make sure the code is readable and well documented. A few hints that may be helpful follow the description of Part 2.

Part 2 – Single File Multiprocessing Script Tool

In a single script file (combining the mp_handler code and the worker function into one script), expand the code so that it can handle multiple input feature classes to be clipped (still using a single polygon clipping feature class).

  1. The input variable tobeclipped should now take a list of feature class names rather than a single name.
  2. The worker function should, as before, perform the operation of clipping a single input file (not all of them!) to one of the features in the clipper feature class.
  3. The main change you will have to make here will be in the main code where the jobs are created.
  4. The names of the output files produced should have the format:

    clip_<oid>_<name of the input feature class>.shp

    For instance, clip_0_Roads.shp is produced by clipping the Roads feature class (found in the USA.gdb file geodatabase) to the State with oid '0'.
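The job creation described in the list above could look roughly like the following sketch, which produces one job per combination of input feature class and clipper feature. The function name build_jobs and the argument order are assumptions. In the real script the list of oids would come from reading the clipper feature class, which is replaced here by a plain list so the example runs without arcpy.

```python
def build_jobs(clipper, tobeclipped, field, oids, out_folder):
    """Create one argument tuple per (input feature class, oid) pair.

    tobeclipped is a list of feature class names; oids is a list of
    object IDs of the clipper features (hard-coded below for illustration).
    """
    jobs = []
    for fc in tobeclipped:
        for oid in oids:
            jobs.append((clipper, fc, field, oid, out_folder))
    return jobs


jobs = build_jobs("States", ["Roads", "Rivers"], "OBJECTID", [0, 1], "output")
for job in jobs:
    print(job)
```

With two input feature classes and two clipper features this yields four jobs, each of which clips a single input to a single feature.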

To realize the modified code versions in this part, it is important to remember how to avoid infinite recursions and the purpose of if __name__ == '__main__':

Successful delivery of the above requirements is sufficient to earn 95% on the project. The remaining 5% is reserved for efforts that go "over and above" the minimum requirements. Over and above points may be earned by adding further geoprocessing operations (e.g., reprojection) to the worker() function, or by other enhancements as you see fit, such as returning a dictionary of results from the workers and parsing them to print success/failure messages.
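One possible shape for the result dictionaries mentioned above is sketched here. The key names ("oid", "fc", "success", "message") and both function names are hypothetical. In a real worker the dictionary would be filled in inside a try/except block around the geoprocessing calls.

```python
def make_result(oid, fc_name, success, message):
    """Build the dictionary a worker could return for one clip job."""
    return {"oid": oid, "fc": fc_name, "success": success, "message": message}


def report(results):
    """Turn a list of result dictionaries into printable message lines."""
    lines = []
    for res in results:
        status = "OK" if res["success"] else "FAILED"
        lines.append(f"[{status}] {res['fc']} / oid {res['oid']}: {res['message']}")
    failures = sum(1 for res in results if not res["success"])
    lines.append(f"{len(results) - failures} succeeded, {failures} failed")
    return lines


# Results as they might come back from the pool (made-up values).
example = [
    make_result(0, "Roads", True, "clipped"),
    make_result(1, "Roads", False, "layer name already in use"),
]
for line in report(example):
    print(line)
```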

You will have to submit several versions of the modified script for this assignment:

  • (A) The modified single-input-file script version from Part 1.
  • (B) The single-file, multiple-input-files script tool version from Part 2 as an .atbx file.
  • (C) Potentially a third version if you made substantial modifications to the code for "over and above" points. If you created a new script tool for this, make sure to include the .atbx file as well.

Hint 1:

When you adapt the worker() function, I strongly recommend that you do some tests with individual calls of that function first before you run the full multiprocessing version. For this, you can, for instance, utilize what we learned about the if __name__ == '__main__': conditional for the multicode script, or comment out the pool code and instead call worker() directly from the loop that produces the job list, meaning all calls will be made sequentially rather than in parallel. This makes it easier to detect errors compared to running everything in multiprocessing mode right away. Similarly, it could be a good idea to view the variables in the debugger or add print statements that output the job list to make sure that the correct values will be passed to the worker function.
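The sequential-first testing strategy from this hint might be sketched as follows. The worker() here is a stand-in that only returns a description of the job instead of calling arcpy, and its parameter names as well as the run_jobs() helper and its parallel switch are assumptions made for illustration.

```python
import multiprocessing


def worker(clipper, tobeclipped, field, oid, out_folder):
    """Stand-in for the real worker: reports the job instead of clipping."""
    return f"clip {tobeclipped} by {clipper} where {field} = {oid} into {out_folder}"


def run_jobs(jobs, parallel=False):
    """Run all jobs, either one after another or in a process pool."""
    if not parallel:
        # Sequential calls: errors surface directly and are easy to debug.
        return [worker(*job) for job in jobs]
    with multiprocessing.Pool() as pool:
        return pool.starmap(worker, jobs)


if __name__ == "__main__":
    jobs = [("States", "Roads", "OBJECTID", oid, "output") for oid in range(3)]
    for job in jobs:
        print(job)  # check the values before they reach the worker
    for result in run_jobs(jobs, parallel=False):
        print(result)
```

Once the sequential run produces the expected output, switching to parallel=True exercises the same worker through the pool.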

Hint 2 (concerns Part 2.A):

When changing to the multiple-input-files version, you will not only have to change the code that produces the name of the output files in variable outFC by incorporating the name of the input feature class, but you will also have to do the same for the name of the temporary layer that is being created by MakeFeatureLayer_management() to make sure that the layer names remain unique. Otherwise, some worker calls will fail because they try to create a layer with a name that is already in use.

To get the basename of a feature class without file extension, you can use a combination of the os.path.basename() and os.path.splitext() functions defined in the os module of the Python standard library. The basename() function will remove the leading path (so e.g., turn "C:\489\data\Roads.shp" into just "Roads.shp"). The expression os.path.splitext(filename)[0] will give you the filename without file extension. So for instance "Roads.shp" will become just "Roads". (Using [1] instead of [0] will give you just the file extension but you won't need this here.)
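A minimal sketch of this combination is shown below, together with one way of deriving unique output and layer names from the result. The helper names and the "clipLayer_" prefix are hypothetical. The paths are built with os.path.join() so that the example behaves the same on any operating system.

```python
import os


def fc_basename(fc_path):
    """Return the feature class name without leading path or extension."""
    filename = os.path.basename(fc_path)   # e.g. "Roads.shp"
    return os.path.splitext(filename)[0]   # e.g. "Roads"


def unique_names(fc_path, oid):
    """Derive the output file name and a temporary layer name for one job."""
    name = fc_basename(fc_path)
    out_name = "clip_" + str(oid) + "_" + name + ".shp"
    layer_name = "clipLayer_" + str(oid) + "_" + name  # hypothetical prefix
    return out_name, layer_name


shapefile = os.path.join("data", "Roads.shp")
gdb_feature_class = os.path.join("data", "USA.gdb", "Roads")
print(fc_basename(shapefile))
print(unique_names(gdb_feature_class, 0))
```

A feature class inside a file geodatabase has no file extension, in which case splitext() simply returns the name unchanged.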

Hint 3 (concerns Part 2.B):

There is an additional Python module action that you will have to take in your if __name__ == "__main__": conditional in order to ensure that each process in the pool has an exclusive worker function.

Hint 4 (concerns Part 2.B):

You will also have to use the "Multiple value" option for the input parameter you create for the to-be-clipped feature class list in the script tool interface. If you then use GetParameterAsText(...) for this parameter in your code, you will get a single string(!) with the names of the feature classes the user picked separated by semicolons, not a list of name strings. You can then either use the string method .split(...) to turn this string into a list of feature class names, or use GetParameter(...) instead of GetParameterAsText(...), which will directly give you the feature class names as a list. Be sure to verify the results!
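The .split(...) approach could be sketched as below. The function name split_multivalue is hypothetical, and the input string is hard-coded here in place of the value returned by GetParameterAsText(...). Stripping single quotes is an assumption about how values containing spaces may be delivered, so check it against what your tool actually receives.

```python
def split_multivalue(param_text):
    """Turn a semicolon-separated parameter string into a list of names."""
    names = []
    for item in param_text.split(";"):
        # Assumption: values with spaces may arrive wrapped in single quotes.
        item = item.strip().strip("'")
        if item:  # skip empty pieces, e.g. from an empty input string
            names.append(item)
    return names


# Stand-in for the string the script tool would hand to the script.
print(split_multivalue("Roads;'Major Cities';Rivers"))
```

Printing the resulting list (or writing it out with arcpy.AddMessage() when running as a script tool) is a quick way to verify the values before any jobs are created.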

Part 2 Deliverable

Submit a single .zip file to the corresponding drop box on Canvas; the zip file should contain:

  • Your modified code files and ArcGIS Pro toolbox files (up to three different versions as described above). Please organize the files cleanly, e.g., using a separate subfolder for each version.
  • A 400-word write-up of what you have learned during this exercise. This write-up should also include:
    • Think back to the beginning of Section 1.6.6 and include a brief discussion of any changes to the processing workflow and/or the code that might be necessary if we wanted to write our output data to geodatabases and briefly comment on possible issues (using pseudocode or a simple flowchart if you wish).
    • A description of what you did for "over and above" points (if anything).