
Introduction to Regular Expressions in Python

Regular expressions are patterns for complex search and replace (or just search). They let you do some very interesting things, but unfortunately they are quite difficult to master.

To get started with regular expressions in Python, we need to import the standard-library module re:

import re

The re module provides functions for working with regular expressions. A regular expression is a pattern made up of two kinds of characters: those that represent themselves, and command characters, which are called special characters.

It is best to start getting acquainted with regular expressions through the sub function, which performs substitutions in a string. Its first parameter is the pattern to search for, and the second is the replacement. The third parameter is the string in which the substitution is made. The optional fourth parameter limits the number of substitutions; by default, all matches are replaced. The simplest substitution looks like this:

res = re.sub('a', '!', 'bab')
print(res) # 'b!b'
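The optional fourth parameter is easier to see on a string with several matches. The following is a minimal sketch: the sample string 'banana' is an assumption chosen for illustration, and the limit is passed by keyword as count, the name used in the re documentation.

```python
import re

# Sample string chosen for illustration only.
txt = 'banana'

# No limit: every 'a' is replaced.
all_replaced = re.sub('a', '!', txt)
print(all_replaced)  # 'b!n!n!'

# count=1: only the first (leftmost) match is replaced.
first_only = re.sub('a', '!', txt, count=1)
print(first_only)  # 'b!nana'
```

Passing count by keyword keeps the call readable, since the fourth positional slot is easy to confuse with other optional arguments.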

As you may have noticed, letters in a regular expression represent themselves. The same applies to digits. Let's replace the digit 2 with !:

res = re.sub('2', '!', '12abc3')
print(res) # '1!abc3'

The dot, however, is a special character: it stands for any single character (by default, any character except a newline). In the following example, let's find a substring matching this pattern: the letter 'x', then any character, then the letter 'x' again:

res = re.sub('x.x', '!', 'xax eee')
print(res) # '! eee'
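One detail worth checking by hand is that, by default, the dot matches any character except a newline. The sketch below uses a made-up sample string (an assumption for illustration) with three candidates; in the last one, the two 'x' letters are separated by a newline, so it is left unchanged.

```python
import re

# Sample string for illustration: 'xax', 'x-x', and 'x' + newline + 'x'.
txt = 'xax x-x x\nx'

# The dot matches 'a' and '-', but not the newline character.
res = re.sub('x.x', '!', txt)
print(repr(res))  # '! ! x\nx'
```

Here repr is used only so that the newline is visible in the printed result.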

Given a string:

txt = 'ahb acb aeb aeeb adcb axeb'

Write a regular expression that will find the substrings 'ahb', 'acb', 'aeb' using the pattern: letter 'a', any character, letter 'b'.

Given a string:

txt = 'aba aca aea abba adca abea'

Write a regular expression that will find the substrings 'abba', 'adca', 'abea' using the pattern: letter 'a', any two characters, letter 'a'.

Given a string:

txt = 'aba aca aea abba adca abea'

Write a regular expression that will find the substrings 'abba' and 'abea', without matching 'adca'.
