The 'or' command in Python regular expressions
The '|' command is a more powerful version of 'or' than the [ ] command. It splits a regular expression into several parts, and the text being searched for can match either one part or another. Let's look at some examples.
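To make the contrast with [ ] concrete, here is a minimal sketch. The sample string and the variable names are made up for illustration; re.findall is used only to list what each pattern matches.

```python
import re

txt = 'ab cd abcd'

# A character class matches exactly one character from the set.
single_chars = re.findall(r'[abcd]', txt)
print(single_chars)   # ['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd']

# The vertical bar chooses between whole sub-patterns: 'ab' or 'cd'.
alternatives = re.findall(r'ab|cd', txt)
print(alternatives)   # ['ab', 'cd', 'ab', 'cd']
```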
Example
In this example, the search pattern is: three letters 'a' or three letters 'b':
import re
txt = 'aaa bbb abb'
res = re.sub('a{3}|b{3}', '!', txt)
print(res)
Result of code execution:
'! ! abb'
Example
In this example, the search pattern is: three letters 'a' or one or more letters 'b':
import re
txt = 'aaa bbb bbbb bbbbb axx'
res = re.sub('a{3}|b+', '!', txt)
print(res)
Result of code execution:
'! ! ! ! axx'
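To see which text each replacement above actually consumed, the same pattern can be passed to re.findall. This is only an illustrative sketch; the variable name matches is an assumption, not part of the lesson.

```python
import re

txt = 'aaa bbb bbbb bbbbb axx'

# Each match comes from either the 'a{3}' part or the 'b+' part.
matches = re.findall(r'a{3}|b+', txt)
print(matches)   # ['aaa', 'bbb', 'bbbb', 'bbbbb']
```

The substring 'axx' is absent from the list because it contains neither three letters 'a' in a row nor any letter 'b'.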
Example
In this example, the search pattern is: one or more lowercase letters or three digits:
import re
txt = 'a ab abc 1 12 123'
# A raw string keeps the backslash in \d from being treated as a string escape
res = re.sub(r'[a-z]+|\d{3}', '!', txt)
print(res)
Result of code execution:
'! ! ! 1 12 !'
Example
The vertical bar can divide a regular expression into any number of parts, not just two:
import re
txt = 'aaa bbb ccc ddd'
res = re.sub('a+|b+|c+', '!', txt)
print(res)
Result of code execution:
'! ! ! ddd'
Example
If the vertical bar is inside parentheses, then 'or' only works inside those parentheses.
As an example, let's find substrings with the following pattern: at the beginning there is either a single 'a' or one or more letters 'b', and then two letters 'x':
import re
txt = 'axx bxx bbxx exx'
res = re.sub('(a|b+)xx', '!', txt)
print(res)
Result of code execution:
'! ! ! exx'
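For comparison, here is a short sketch of what happens when the parentheses are dropped. The variable names grouped and ungrouped are chosen for illustration only.

```python
import re

txt = 'axx bxx bbxx exx'

# With parentheses, 'or' is limited to the group, and 'xx' is required in both cases.
grouped = re.sub(r'(a|b+)xx', '!', txt)
print(grouped)     # ! ! ! exx

# Without parentheses, the whole pattern splits into 'a' or 'b+xx'.
ungrouped = re.sub(r'a|b+xx', '!', txt)
print(ungrouped)   # !xx ! ! exx
```

In the second call only the letter 'a' of 'axx' is replaced, because the left alternative is just 'a' and no longer includes 'xx'.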
Practical tasks
Given a string:
txt = 'aeeea aeea aea axa axxa axxxa'
Write a regular expression that will find substrings matching the pattern: the letter 'a' at each edge, and between them either the letter 'e' any number of times or the letter 'x' any number of times.
Given a string:
txt = 'aeeea aeea aea axa axxa axxxa'
Write a regular expression that will find substrings matching the pattern: the letter 'a' at each edge, and between them either the letter 'e' twice or the letter 'x' any number of times.