Reading Web Data From Python: Using Python to Access Web Data (Python for Everybody Specialization) Answers 2025
Question 1
Which of the following Python data structures is most similar to the value returned in this line of Python: x = urllib.request.urlopen('http://data.pr4e.org/romeo.txt')
✅ file handle
❌ regular expression
❌ dictionary
❌ list
❌ socket
Explanation: urlopen() returns a file-like object, so you can call .read() on it or iterate over it line by line, just as you would with a file handle.
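To make the file-handle comparison concrete, here is a minimal sketch that loops over the object returned by urlopen() the same way one would loop over an open file. The helper name read_lines and the data: URL are assumptions made for illustration; the data: URL carries its own content, so the example runs without network access. Passing 'http://data.pr4e.org/romeo.txt' instead should behave the same way when that site is reachable.

```python
import urllib.request


def read_lines(url):
    """Open a URL and iterate over the response line by line, like a file."""
    lines = []
    with urllib.request.urlopen(url) as handle:
        for raw_line in handle:  # each item is a bytes object, not a str
            lines.append(raw_line.decode().rstrip())
    return lines


# Hypothetical stand-in URL: the text after the comma is the "file" content,
# with %20 for spaces and %0A for newlines.
SAMPLE_URL = "data:text/plain,But%20soft%0AWhat%20light%0A"

lines = read_lines(SAMPLE_URL)
print(lines)
```

The for loop and the with statement are the same ones used with open(), which is why "file handle" is the closest match among the answer choices.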
Question 2
In this Python code, which line actually reads the data?
✅ mysock.recv()
❌ socket.socket()
❌ mysock.close()
❌ mysock.connect()
❌ mysock.send()
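Explanation: recv(512) reads up to 512 bytes of data from the socket; that is the actual "reading" operation. It may return fewer bytes than requested, and it returns an empty bytes object once the other side has closed the connection.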
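Because a single recv() call returns at most the requested number of bytes, reading a whole response usually means calling it in a loop. The sketch below shows that loop under a few assumptions: the helper name read_all is made up for this example, and socket.socketpair() stands in for a real connection to a web server so the code runs locally.

```python
import socket


def read_all(sock, chunk_size=512):
    """Call recv() repeatedly until the peer closes the connection."""
    chunks = []
    while True:
        data = sock.recv(chunk_size)  # returns at most chunk_size bytes
        if len(data) < 1:             # empty bytes: the peer has closed
            break
        chunks.append(data)
    return b"".join(chunks)


# socketpair() returns two already-connected sockets on the local machine;
# here they play the roles of a remote server and the client reading from it.
server_side, client_side = socket.socketpair()
server_side.sendall(b"x" * 1000)      # more than one 512-byte chunk
server_side.close()

received = read_all(client_side)
client_side.close()
print(len(received))
```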
Question 3
Which of the following regular expressions would extract the URL from this line of HTML? <p>Please click <a href="http://www.dr-chuck.com">here</a></p>
✅ href="(.+)"
❌ href=".+"
❌ http://.*
❌ <.*>
Explanation: href="(.+)" matches the whole href attribute, but the parentheses capture only what is inside the quotation marks, i.e., the URL.
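The difference between the first two answer choices is easy to see by running both patterns against the line from the question. This is a small sketch using re.findall(); the variable names are chosen only for illustration.

```python
import re

line = '<p>Please click <a href="http://www.dr-chuck.com">here</a></p>'

# With parentheses, findall() returns only the captured group (the URL).
with_group = re.findall('href="(.+)"', line)

# Without parentheses, findall() returns the entire match,
# including the attribute name and the quotation marks.
without_group = re.findall('href=".+"', line)

print(with_group)
print(without_group)
```

Note that .+ is greedy: on a line with further quoted attributes after the href, it would run on to the last quotation mark, which is why the non-greedy form (.+?) is often preferred for real pages.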
Question 4
In this Python code, which line is most like the open() call to read a file?
✅ socket.socket()
❌ mysock.connect()
❌ import socket
❌ mysock.recv()
❌ mysock.send()
Explanation: socket.socket() creates a socket object (just like open() creates a file handle).
Question 5
Which HTTP header tells the browser the kind of document that is being returned?
✅ Content-Type:
❌ ETag:
❌ Document-Type:
❌ HTML-Document:
❌ Metadata:
Explanation: Content-Type specifies the MIME type of the document, e.g., text/html or application/json.
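To show where Content-Type sits in a response, the following sketch splits a raw HTTP response into its status line, headers, and body. The response text is a made-up example rather than captured output, and parse_response is a hypothetical helper written for this illustration, not part of any library.

```python
# Hypothetical raw response, shaped like what a web server sends back.
RAW_RESPONSE = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html; charset=UTF-8\r\n"
    "Content-Length: 14\r\n"
    "\r\n"
    "<h1>Hello</h1>"
)


def parse_response(raw):
    """Split a raw HTTP response into (status line, header dict, body)."""
    head, _, body = raw.partition("\r\n\r\n")  # blank line ends the headers
    lines = head.split("\r\n")
    status = lines[0]
    headers = {}
    for header_line in lines[1:]:
        name, _, value = header_line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status, headers, body


status, headers, body = parse_response(RAW_RESPONSE)
# The MIME type is the part of Content-Type before any ";" parameters.
media_type = headers["content-type"].split(";")[0].strip()
print(media_type)
```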
Question 6
What should you check before scraping a website?
✅ That the website allows scraping
❌ That it supports GET
❌ That it only has internal links
❌ That it returns HTML
Explanation:
Always check the site’s robots.txt or terms of service to confirm that scraping is permitted.
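The robots.txt check can be automated with the standard library's urllib.robotparser module. The sketch below parses an invented robots.txt and asks whether two URLs may be fetched; the rules, the example.com URLs, and the helper name is_allowed are all assumptions for illustration. Against a real site one would typically call set_url() and read() on the parser instead of passing the text in directly.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: everything under /private/ is off limits.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt rules permit fetching the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


public_ok = is_allowed(ROBOTS_TXT, "*", "http://www.example.com/index.html")
private_ok = is_allowed(ROBOTS_TXT, "*", "http://www.example.com/private/page.html")
print(public_ok, private_ok)
```

A robots.txt file only expresses the site's crawling preferences; the terms of service may add further restrictions that no parser can check.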
Question 7
What is the purpose of the BeautifulSoup Python library?
✅ It repairs and parses HTML to make it easier for a program to understand
❌ It optimizes file retrieval
❌ It animates web operations
❌ It builds word clouds
❌ It chooses attractive skins
Explanation:
BeautifulSoup parses broken or messy HTML into a structured form that Python can navigate easily.
Question 8
What ends up in the variable x?
✅ A list of all the anchor tags (<a ...>) in the HTML
❌ True if any anchor tags exist
❌ All CSS files
❌ All paragraphs
Explanation: soup('a') finds all <a> tags and returns them as a list of BeautifulSoup tag objects.
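BeautifulSoup is a third-party package, so as a rough standard-library stand-in for what soup('a') collects, the sketch below uses html.parser.HTMLParser to gather the href of every anchor tag. This is a different technique from BeautifulSoup, shown only to illustrate the idea; the AnchorCollector class and the sample HTML (including page2.htm) are made up for this example.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect the href attribute of every <a> start tag that is fed in."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag's attributes.
        if tag == "a":
            self.hrefs.append(dict(attrs).get("href"))


html = (
    '<p><a href="http://www.dr-chuck.com">here</a> '
    'and <a href="page2.htm">there</a></p>'
)
collector = AnchorCollector()
collector.feed(html)
print(collector.hrefs)
```

With BeautifulSoup installed, looping over soup('a') and calling tag.get('href', None) on each tag retrieves the same values, and it copes with malformed HTML far better than this sketch.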
Question 9
What is the most common Unicode encoding when moving data between systems?
✅ UTF-8
❌ UTF-32
❌ UTF-128
❌ UTF-16
❌ UTF-64
Explanation:
UTF-8 is the standard for web and network data transmission — compact and backward compatible with ASCII.
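The "compact and backward compatible" claim can be checked directly by encoding the same text several ways. This is a small sketch; the -le codec variants are used so that no byte-order mark is added to the counts.

```python
text = "hello"

# Number of bytes needed for the same five characters in each encoding.
sizes = {
    name: len(text.encode(name))
    for name in ("utf-8", "utf-16-le", "utf-32-le")
}

# For pure ASCII text, the UTF-8 bytes are identical to the ASCII bytes.
ascii_compatible = text.encode("utf-8") == text.encode("ascii")

# Characters outside ASCII take more than one byte in UTF-8.
accented = "é".encode("utf-8")

print(sizes, ascii_compatible, accented)
```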
Question 10
What is the ASCII character with decimal value 42?
✅ *
❌ +
❌ !
❌ /
❌ ^
Explanation:
In ASCII, decimal 42 corresponds to the asterisk (*).
Question 11
What word does this sequence represent in ASCII?
108, 105, 110, 101
✅ line
❌ tree
❌ func
❌ ping
❌ lost
Explanation:
108→l, 105→i, 110→n, 101→e ⇒ line.
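Both ASCII questions can be verified with Python's built-in chr() and ord(), which convert between a code number and its character. A minimal sketch, with variable names chosen for illustration:

```python
codes = [108, 105, 110, 101]

# chr() maps each number to its character; join them to form the word.
word = "".join(chr(code) for code in codes)

# The same lookup answers the previous question about decimal 42.
star = chr(42)

# ord() goes the other way, from character back to number.
round_trip = [ord(ch) for ch in word]

print(word, star, round_trip)
```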
Question 12
How are strings stored internally in Python 3?
✅ Unicode
❌ ASCII
❌ EBCDIC
❌ UTF-8
❌ Byte Code
Explanation:
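In Python 3, every str is a sequence of Unicode code points. How those are laid out in memory is an implementation detail; an encoding such as UTF-8 only comes into play when the string is converted to bytes.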
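One way to see that a Python 3 string is made of Unicode characters rather than bytes is to compare its length with the length of its encoded form. The sample text below is an arbitrary choice for illustration.

```python
text = "naïve €5"

char_count = len(text)                  # counts Unicode characters
byte_count = len(text.encode("utf-8"))  # counts bytes after encoding

euro = text[6]                          # indexing yields whole characters
euro_code_point = ord(euro)             # the character's Unicode code point

print(char_count, byte_count, euro, euro_code_point)
```

The character count stays the same regardless of which encoding is later chosen for storage or transmission; only the byte count changes.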
Question 13
When reading data across the network in Python 3, what must be used to convert it to the internal string format?
✅ decode()
❌ find()
❌ upper()
❌ trim()
❌ encode()
Explanation:
Data read from a network is bytes — .decode() converts it into a string (Unicode).
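The round trip between bytes and strings can be sketched as follows. The byte string stands in for data returned by recv() or by urlopen().read(); it is a made-up sample rather than real network output.

```python
# Hypothetical bytes as they might arrive from the network (UTF-8 encoded).
raw = b"caf\xc3\xa9 au lait\n"

# decode() turns bytes into a str; it assumes UTF-8 unless told otherwise.
text = raw.decode()

# encode() is the reverse step, needed before sending a str over a socket.
request = "GET / HTTP/1.0\r\n\r\n".encode()

print(type(raw), type(text), type(request))
```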
🧾 Summary Table
| # | ✅ Correct Answer | Key Concept |
|---|---|---|
| 1 | file handle | urlopen() returns a file-like object |
| 2 | mysock.recv() | Reads bytes from socket |
| 3 | href="(.+)" | Regex capture group for URL |
| 4 | socket.socket() | Equivalent to open() |
| 5 | Content-Type | MIME type header |
| 6 | Check scraping allowed | Respect robots.txt |
| 7 | Parses & cleans HTML | Purpose of BeautifulSoup |
| 8 | List of <a> tags | soup('a') returns anchor tags |
| 9 | UTF-8 | Universal web encoding |
| 10 | * | ASCII 42 |
| 11 | line | ASCII translation |
| 12 | Unicode | Python 3 strings |
| 13 | decode() | Convert bytes → string |