GEOG 489
Advanced Python Programming for GIS

2.9.2 Lesson 2 Practice Exercise 2 Solution

import requests
from bs4 import BeautifulSoup

url = '...'  # placeholder: the URL of the page named in the exercise is not shown here

response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')
 
divElement = soup.select('article#node-book-2269 > div > div')[0]
 
wordLengths = [ len(word) for word in divElement.text.split() ]
print(wordLengths)
 

After loading the HTML page and creating the BeautifulSoup structure for it, as in the examples you already saw in this lesson, the select(…) method is used in the statement that assigns divElement. The selector 'article#node-book-2269 > div > div' matches every <div> element that is a direct child of a <div> element that is itself a direct child of the <article> element with the id node-book-2269. select(…) always returns a list, and since we know there will be only one such element, we use the index [0] to get that element from the list and store it in the variable divElement.
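To see how such a selector behaves without fetching the real page, here is a minimal sketch that runs the same kind of query against a small inline HTML string. The markup and the id node-book-1 are made up for illustration and are not taken from the lesson page.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the lesson page; ids and structure are invented.
html = """
<article id="node-book-1">
  <div>
    <div>first inner block</div>
    <p>not a div</p>
  </div>
</article>
<article id="other">
  <div><div>wrong article</div></div>
</article>
"""

soup = BeautifulSoup(html, 'html.parser')

# '>' is the child combinator: each step only matches direct children.
matches = soup.select('article#node-book-1 > div > div')
print(len(matches))

# select(...) returns a list, so [0] picks the first (here: only) match.
first = matches[0]
print(first.text)

# select_one(...) returns the first match directly, or None if there is none.
same = soup.select_one('article#node-book-1 > div > div')
print(same.text)
```

If the selector could match nothing, indexing with [0] would raise an IndexError, so select_one(…) followed by a check for None may be the safer choice in that case.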

With divElement.text.split() we create a list of all the words in the text by splitting it at whitespace. The list comprehension that builds wordLengths then converts this word list into a list of word lengths by applying the len(…) function to each word.
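The split-then-measure step can be tried on its own with a plain string. The sketch below uses a made-up sample text in place of divElement.text.

```python
# Made-up sample text standing in for divElement.text.
text = "Web  scraping with\nBeautifulSoup is fun"

# split() without arguments splits at any run of whitespace
# (spaces, tabs, newlines) and never produces empty strings.
words = text.split()
print(words)

# Apply len(...) to each word to get the list of word lengths.
wordLengths = [len(word) for word in words]
print(wordLengths)
```

Note that split() does not strip punctuation, so a word such as "fun." would be counted with a length of 4.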