Character repetition operators in regular expressions
There are situations when you need to indicate that a symbol is repeated a certain number of times. If the exact number of repetitions is known, you can simply write the symbol that many times: 'aaaa'. But what if you need to say something like "repeat one or more times"?
For this purpose, there are repetition operators (quantifiers): plus + (one or more times), asterisk * (zero or more times), and question mark ? (zero or one time). Each operator acts on the symbol that comes immediately before it.
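As a minimal sketch of that last point, the snippet below shows that a quantifier repeats only the single symbol before it, not everything written to its left. The input string is illustrative and not taken from this lesson, and re.findall is used here only to list the matched substrings:

```python
import re

# Illustrative input, chosen only for this sketch.
txt = 'ab abb abab'

# + applies only to 'b', so the pattern reads: one 'a', then one or more 'b'.
matches = re.findall(r'ab+', txt)
print(matches)  # ['ab', 'abb', 'ab', 'ab']
```

Note that 'abab' yields two separate 'ab' matches rather than one, because the repetition covers only the letter 'b'.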
Let's look at how these operators work using examples.
Example
Let's replace with '!' every substring that matches the pattern: letter 'x', letter 'a' one or more times, letter 'x':
import re
txt = 'xx xax xaax xaaax xbx'
res = re.sub(r'xa+x', '!', txt)  # + requires at least one 'a' between the 'x' letters
print(res)
Result of code execution:
xx ! ! ! xbx
Example
Let's replace with '!' every substring that matches the pattern: letter 'x', letter 'a' zero or more times, letter 'x':
import re
txt = 'xx xax xaax xaaax xbx'
res = re.sub(r'xa*x', '!', txt)  # * also allows no 'a' at all, so 'xx' matches
print(res)
Result of code execution:
! ! ! ! xbx
Example
Let's replace with '!' every substring that matches the pattern: letter 'x', letter 'a' zero or one time, letter 'x':
import re
txt = 'xx xax xaax xbx'
res = re.sub(r'xa?x', '!', txt)  # ? allows at most one 'a', so 'xaax' is left untouched
print(res)
Result of code execution:
! ! xaax xbx
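The examples above replace the matches, which hides what exactly was matched. As a rough side-by-side sketch, the three patterns can be run against one string with re.findall, which returns the matched substrings themselves. The variable names patterns and results are chosen for this illustration only:

```python
import re

txt = 'xx xax xaax xaaax xbx'

# The three patterns from the examples, differing only in the quantifier.
patterns = [r'xa+x', r'xa*x', r'xa?x']

# Collect the substrings each pattern matches in the same input string.
results = {p: re.findall(p, txt) for p in patterns}

for p, found in results.items():
    print(p, found)
# xa+x ['xax', 'xaax', 'xaaax']
# xa*x ['xx', 'xax', 'xaax', 'xaaax']
# xa?x ['xx', 'xax']
```

In every case 'xbx' is skipped, since none of the patterns allows a letter other than 'a' between the two 'x' letters.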
Practical tasks
Given a string:
txt = 'aa aba abba abbba abca abea'
Write a regular expression that will find the strings 'aba', 'abba', 'abbba' according to the pattern: letter 'a', letter 'b' one or more times, letter 'a'.
Given a string:
txt = 'aa aba abba abbba abca abea'
Write a regular expression that will find the strings 'aa', 'aba', 'abba', 'abbba' according to the pattern: letter 'a', letter 'b' any number of times (including zero), letter 'a'.
Given a string:
txt = 'aa aba abba abbba abca abea'
Write a regular expression that will find the strings 'aa', 'aba' according to the pattern: letter 'a', letter 'b' once or not at all, letter 'a'.
Given a string:
txt = 'aa aba abba abbba abca abea'
Write a regular expression that will find the strings 'aa', 'aba', 'abba', 'abbba' without matching 'abca' and 'abea'.