Regular Expressions: Using Python to Access Web Data (Python for Everybody Specialization) Answers 2025


Q1

Which regex would extract uct.ac.za from the string using re.findall?

  • ❌ F.+:

  • ❌ @\S+

  • ✅ @(\S+)

  • ❌ ..@\S+..

Explanation: @\S+ matches @uct.ac.za, including the @. With a capturing group, @(\S+), re.findall returns only the captured part, uct.ac.za, which is what the question asks for.
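
The difference can be checked with a short sketch. The question above does not show its string, so the sample line here is an assumption borrowed from Q10:

```python
import re

# Assumed sample line (taken from Q10; Q1 does not show its string).
line = 'From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008'

# No capturing group: findall returns the whole match, including the @.
with_at = re.findall(r'@\S+', line)

# One capturing group: findall returns only what the group captured.
host_only = re.findall(r'@(\S+)', line)

print(with_at)    # ['@uct.ac.za']
print(host_only)  # ['uct.ac.za']
```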


Q2

Which is the way to match the start of a line in a regex?

  • ✅ ^

  • ❌ str.startswith()

  • ❌ \linestart

  • ❌ String.startsWith()

  • ❌ variable[0:1]

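Explanation: The caret ^ anchors the pattern to the start of the line. In Python it matches at the start of the string by default, and at the start of every line when the re.MULTILINE flag is set.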
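
A small sketch of the anchor in Python, using a made-up three-line string (the text and names are illustrative, not from the quiz):

```python
import re

# Hypothetical multi-line text for illustration.
text = 'From: alice\nTo: bob\nFrom: carol'

# Without flags, ^ matches only at the very start of the string.
start_only = re.findall(r'^From: \S+', text)

# With re.MULTILINE, ^ also matches right after each newline.
every_line = re.findall(r'^From: \S+', text, flags=re.MULTILINE)

print(start_only)  # ['From: alice']
print(every_line)  # ['From: alice', 'From: carol']
```

When a file is read line by line, as in the course examples, each line is its own string, so the default behavior is usually enough.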


Q3

What does [a-z0-9] mean in a regex?

  • ❌ Match anything but a lowercase letter or digit

  • ❌ Match any text that is surrounded by square braces

  • ❌ Match an entire line as long as it is lowercase letters or digits

  • ✅ Match a lowercase letter or a digit

  • ❌ Match any number of lowercase letters followed by any number of digits

Explanation: Bracket expressions list possible single characters to match. [a-z0-9] matches one character that is either a lowercase letter a–z or a digit 0–9.
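
As a quick illustration on a made-up sample string, the class matches one character at a time, and a leading ^ inside the brackets negates it (which is what the first wrong option describes):

```python
import re

sample = 'aB3-x'  # hypothetical input

# Each match is exactly one character from the set a-z or 0-9.
in_class = re.findall(r'[a-z0-9]', sample)

# A leading ^ inside the brackets negates the set.
not_in_class = re.findall(r'[^a-z0-9]', sample)

print(in_class)      # ['a', '3', 'x']
print(not_in_class)  # ['B', '-']
```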


Q4

What type does re.findall() return?

  • ❌ A boolean

  • ✅ A list of strings

  • ❌ A single character

  • ❌ An integer

  • ❌ A string

Explanation: re.findall() returns a list of all non-overlapping matches as strings, or an empty list if nothing matches. If the pattern contains one capturing group, the list holds that group's contents; with several groups, it holds tuples of strings.
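
The return shapes can be seen side by side in a short sketch; the addresses are invented for illustration:

```python
import re

text = 'alice@example.com bob@example.org'  # hypothetical input

# No groups: a list of whole matches.
no_group = re.findall(r'\S+@\S+', text)

# One group: a list of that group's contents.
one_group = re.findall(r'@(\S+)', text)

# Two groups: a list of tuples, one tuple per match.
two_groups = re.findall(r'(\S+)@(\S+)', text)

# Nothing matches: an empty list, not None.
no_match = re.findall(r'\d+', text)

print(no_group)    # ['alice@example.com', 'bob@example.org']
print(one_group)   # ['example.com', 'example.org']
print(two_groups)  # [('alice', 'example.com'), ('bob', 'example.org')]
print(no_match)    # []
```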


Q5

What is the regex “wild card” (matches any character)?

  • ❌ +

  • ❌ ^

  • ❌ $

  • ❌ *

  • ❌ ?

  • ✅ .

Explanation: The dot . matches any single character except a newline. With the re.DOTALL flag it matches newlines as well.
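
A minimal sketch of the dot on made-up input, including the newline case:

```python
import re

# The dot stands for exactly one character, so 'ct' does not match c.t.
words = re.findall(r'c.t', 'cat cot c-t ct')

# By default the dot does not match a newline.
default_dot = re.findall(r'a.b', 'a\nb')

# With re.DOTALL it does.
dotall_dot = re.findall(r'a.b', 'a\nb', flags=re.DOTALL)

print(words)        # ['cat', 'cot', 'c-t']
print(default_dot)  # []
print(dotall_dot)   # ['a\nb']
```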


Q6

Difference between + and * in regex:

  • ✅ + matches at least one and * matches zero or more

  • ❌ + matches upper case etc.

  • ❌ other incorrect options

Explanation: a+ requires one or more a; a* allows zero or more a.
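
The two quantifiers can be compared on the same made-up input. Note that a bare b only matches when zero repetitions of a are allowed:

```python
import re

text = 'b ba baa'  # hypothetical input

# a+ needs at least one a after the b, so the lone 'b' is skipped.
one_or_more = re.findall(r'ba+', text)

# a* accepts zero a's, so the lone 'b' matches too.
zero_or_more = re.findall(r'ba*', text)

print(one_or_more)   # ['ba', 'baa']
print(zero_or_more)  # ['b', 'ba', 'baa']
```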


Q7

What does [0-9]+ match?

  • ❌ Several digits followed by a plus sign

  • ❌ Any mathematical expression

  • ✅ One or more digits

  • ❌ Zero or more digits

  • ❌ Any number of digits at the beginning of a line

Explanation: + means one or more, so [0-9]+ matches a sequence of one or more digits.
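
A short sketch, reusing the mail header line from Q10 as sample input. The second pattern is an added contrast showing that a literal plus sign has to be escaped:

```python
import re

line = 'From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008'

# Each match is a run of one or more consecutive digits.
digit_runs = re.findall(r'[0-9]+', line)

# To match digits followed by an actual plus sign, escape the second +.
literal_plus = re.findall(r'[0-9]+\+', '12+ 7')

print(digit_runs)    # ['5', '09', '14', '16', '2008']
print(literal_plus)  # ['12+']
```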


Q8

What does this print?

import re
x = 'From: Using the : character'
y = re.findall('^F.+:', x)
print(y)

  • ✅ ['From: Using the :']

  • ❌ ^F.+:

  • ❌ From:

  • ❌ ['From:']

  • ❌ :

Explanation: ^F.+: starts at the F, then .+ is greedy and stretches to the last : in the string, so the match is 'From: Using the :'. findall returns a list with that string.


Q9

What do you add to + or * to make the match non-greedy?

  • ❌ **

  • ❌ \g

  • ❌ ^

  • ❌ $

  • ✅ ?

  • ❌ ++

Explanation: +? or *? makes the quantifier non-greedy (match as little as possible).
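
Running the string from Q8 through both forms shows the effect. This is a small sketch; the variable names are illustrative:

```python
import re

x = 'From: Using the : character'

# Greedy: .+ stretches as far as possible, up to the last colon.
greedy = re.findall(r'^F.+:', x)

# Non-greedy: .+? stops at the first colon that lets the match succeed.
lazy = re.findall(r'^F.+?:', x)

print(greedy)  # ['From: Using the :']
print(lazy)    # ['From:']
```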


Q10

Given the line
From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008
what would '\S+?@\S+' match?

  • ❌ \@\

  • ✅ stephen.marquard@uct.ac.za

  • ❌ d@uct.ac.za

  • ❌ From

  • ❌ marquard@uct

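Explanation: The engine tries each starting position from left to right. The first one where a match can succeed is the s of stephen, because every earlier start runs into whitespace before reaching an @. From there the non-greedy \S+? expands only until an @ follows, and the greedy \S+ after the @ consumes the rest of the non-whitespace host. The match is the full address, stephen.marquard@uct.ac.za.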
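
The behavior can be verified with a short sketch on the line from the question. The single-character pattern at the end is an added contrast, not part of the quiz:

```python
import re

line = 'From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008'

# Non-greedy \S+? still has to reach the @, so it covers the whole local part.
full_address = re.findall(r'\S+?@\S+', line)

# The match begins at index 5, the 's' of 'stephen'.
match_start = re.search(r'\S+?@\S+', line).start()

# Contrast: a single \S before the @ yields one of the wrong options.
one_char = re.findall(r'\S@\S+', line)

print(full_address)  # ['stephen.marquard@uct.ac.za']
print(match_start)   # 5
print(one_char)      # ['d@uct.ac.za']
```

Non-greediness limits how far a quantifier extends to the right; it does not move the start of the match to a later position.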


🧾 Summary Table

Q  | Correct answer                        | Key concept
1  | @(\S+)                                | Capturing group extracts the substring without the @
2  | ^                                     | Start-of-line anchor
3  | A lowercase letter or a digit         | Character class [a-z0-9] matches one character
4  | A list of strings                     | re.findall() returns a list
5  | .                                     | Wildcard (any character)
6  | + matches 1 or more, * matches 0 or more | Quantifier difference
7  | One or more digits                    | [0-9]+ combines a digit class with +
8  | ['From: Using the :']                 | Greedy .+ matches to the last :
9  | ?                                     | Makes quantifiers non-greedy (+?, *?)
10 | stephen.marquard@uct.ac.za            | \S+?@\S+ matches email-like non-whitespace