NGA Advanced Python Programming for GIS, GLGI 3001-1

Generators and Yield


Python generators retrieve data from a source only when it is needed instead of retrieving all of the data at once. They behave like iterators and are useful when you need to manage how much memory your code uses. Further definitions and explanation can be found in the Python documentation. Let's look at an example:

def first_n(n):
    '''Build and return a list'''
    num, nums = 0, []
    while num < n:
        nums.append(num)
        num += 1
    return nums

sum_of_first_n = sum(first_n(1000000))

The function builds the whole list and holds it in memory before returning it. This can take valuable resources away from other geoprocessing tasks, such as dissolve. The benefit of a generator is that it produces only the item requested and does not compute the next value until it is asked for. Turning a function like this into a generator can be as simple as using the keyword yield to hand back each item instead of building a list and returning it.

# a generator that yields items instead of returning a list
def firstn(n):
    num = 0
    while num < n:
        yield num
        # the increment must sit inside the loop, or the loop never ends
        num += 1

sum_of_first_n = sum(firstn(1000000))
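
To see the "one item at a time" behavior directly, a generator can be stepped by hand with the built-in next(). The following is a minimal sketch rather than part of the lesson's original code: firstn is repeated so the snippet runs on its own, and the names gen, first, second, remaining and exhausted are illustrative only.

```python
def firstn(n):
    """Yield 0, 1, ..., n-1 one value at a time."""
    num = 0
    while num < n:
        yield num
        num += 1

gen = firstn(3)         # creates the generator; the function body has not run yet
first = next(gen)       # runs the body up to the first yield
second = next(gen)      # resumes right after the previous yield
remaining = list(gen)   # consumes whatever values are left

# Once a generator is exhausted, next() raises StopIteration
try:
    next(gen)
    exhausted = False
except StopIteration:
    exhausted = True
```

Here first is 0, second is 1, and remaining holds the single leftover value 2. A for loop or a call such as sum() handles the StopIteration signal automatically, which is why the example above can pass the generator straight to sum().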

This can be useful when dealing with a lot of data where a process is performed on one result before the same is done to the next item. It is especially beneficial if the PC you are using has a small amount of RAM or if multiple people are competing for resources in a networked environment.
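
As a rough illustration of the memory difference and of processing one item before moving to the next, the sketch below compares the size of a list object with the size of a generator object, then chains a second generator onto the first. The helper names first_n_list, first_n_gen and squared are hypothetical and exist only for this example. Note that sys.getsizeof reports only the size of the container object itself, not the integers it refers to, so it understates the true cost of the list.

```python
import sys

def first_n_list(n):
    """Build the full list in memory (hypothetical helper for comparison)."""
    return list(range(n))

def first_n_gen(n):
    """Yield the same values lazily, one at a time."""
    num = 0
    while num < n:
        yield num
        num += 1

# Size of the container objects themselves, in bytes
list_size = sys.getsizeof(first_n_list(1000000))
gen_size = sys.getsizeof(first_n_gen(1000000))

def squared(values):
    """Process each incoming item before asking for the next one."""
    for value in values:
        yield value * value

# Only one value is in flight at any moment: 0 + 1 + 4 + 9 + 16
total = sum(squared(first_n_gen(5)))
```

The exact byte counts vary by Python version and platform, but the generator object stays small no matter how large n is, while the list grows with n. In this run total is 30.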