
Hyphen inside sets in Python regular expressions

The hyphen is also a special character inside [ ] (but not outside): between two characters it defines a range. If you need the hyphen itself as a character, put it where it cannot be read as a range separator.

Why this matters: you can create a range of characters without realizing it. For example, with '[:-@]' you might think you are matching a colon, a hyphen, and an at sign, but you are actually matching every character from : to @. The characters in this range are: : ; < = > ? @

Where do they come from? From the ASCII table: the colon has a lower code than the at sign, so the hyphen between them produces a range. All ranges are built from character codes in this way (you can use this deliberately if you want).
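To see which characters a set like [:-@] really covers, you can list the characters whose codes lie between its two endpoints. The following is a minimal sketch; the variable names are illustrative and do not come from the lesson itself:

```python
import re

# Every character whose code lies between ':' (58) and '@' (64), inclusive
in_range = [chr(code) for code in range(ord(':'), ord('@') + 1)]
print(in_range)

# The set [:-@] matches exactly those characters; the hyphen (code 45) is not among them
txt = ': ; < = > ? @ -'
res = re.findall('[:-@]', txt)
print(res)
```

Both lists contain the same seven characters, and the hyphen at the end of txt is not matched.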

How to deal with this: place the hyphen where it definitely cannot be read as a range separator, for example at the beginning or end of the set (i.e. right after [ or right before ]).

You can also escape the hyphen; then it stands for itself regardless of its position. For example, instead of [:-@] write [:\-@]. There is no longer a range, just three characters: a colon, a hyphen, and an at sign @.
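The difference between a range and a literal hyphen can be checked directly. The sketch below uses a made-up test string and made-up variable names, and compares the unescaped set with the three ways of making the hyphen literal:

```python
import re

txt = 'a:b a-b a@b a;b'

# Unescaped hyphen between ':' and '@' forms a range: ';' matches, '-' does not
as_range = re.findall(r'a[:-@]b', txt)

# Three ways to make the hyphen literal: escaped, first in the set, last in the set
escaped = re.findall(r'a[:\-@]b', txt)
first = re.findall(r'a[-:@]b', txt)
last = re.findall(r'a[:@-]b', txt)

print(as_range)
print(escaped)
print(first == escaped == last)
```

The patterns are written as raw strings (r'...') so that the backslash in \- reaches the regex engine unchanged.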

Example

In the following example, the search pattern is: digit 1, then a letter from 'a' to 'z', then a digit 2:

import re
txt = '1a2 1-2 1c2 1z2'
# [a-z] is a range: any lowercase letter from 'a' to 'z'
res = re.sub('1[a-z]2', '!', txt)
print(res)

Result of code execution:

'! 1-2 ! !'

Example

Let's now escape the hyphen. The resulting search pattern is: the number 1, then the letter 'a', or a hyphen, or the letter 'z', then the number 2:

import re
txt = '1a2 1-2 1c2 1z2'
# Raw string keeps the backslash intact; \- is a literal hyphen
res = re.sub(r'1[a\-z]2', '!', txt)
print(res)

Result of code execution:

'! ! 1c2 !'

Example

You can simply move the hyphen without escaping it:

import re
txt = '1a2 1-2 1c2 1z2'
# A hyphen right before ] is a literal hyphen
res = re.sub('1[az-]2', '!', txt)
print(res)

Result of code execution:

'! ! 1c2 !'

Example

In the following example, the search pattern is: a lowercase letter or a hyphen '-', then two letters 'x':

import re
txt = 'axx Axx -xx @xx'
# Range a-z followed by a literal hyphen at the end of the set
res = re.sub('[a-z-]xx', '!', txt)
print(res)

Result of code execution:

'! Axx ! @xx'

Example

In the following example, the search pattern is: a lowercase letter, an uppercase letter, or a hyphen '-', then two letters 'x':

import re
txt = 'axx Axx -xx @xx'
# Ranges a-z and A-Z followed by a literal hyphen at the end of the set
res = re.sub('[a-zA-Z-]xx', '!', txt)
print(res)

Result of code execution:

'! ! ! @xx'

Example

You can place a hyphen between two ranges; there it stands for itself and does not create another range:

import re
txt = 'axx 9xx -xx @xx'
# Range a-z, then a literal hyphen, then range 0-9
res = re.sub('[a-z-0-9]xx', '!', txt)
print(res)

Result of code execution:

'! ! ! @xx'
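When the characters of a set are not known in advance, placing the hyphen by hand is error-prone. One possible approach, shown here only as a sketch, is to escape every character with the standard re.escape function. The helper name make_set is hypothetical and not part of the lesson or of the re module, and it assumes a non-empty argument:

```python
import re

def make_set(chars):
    """Build a set that matches any one of the given characters literally."""
    # re.escape puts a backslash before '-', ']', '^' and '\', so none of them
    # can act as a range separator, set terminator, or negation sign
    return '[' + ''.join(re.escape(ch) for ch in chars) + ']'

# 'a-z' is treated here as three separate characters, not as a range
pattern = '1' + make_set('a-z') + '2'
print(pattern)

txt = '1a2 1-2 1c2 1z2'
res = re.sub(pattern, '!', txt)
print(res)
```

The generated pattern is the same as the escaped-hyphen pattern from the earlier example, so the substitution gives the same result.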

Practical tasks

Given a string:

txt = 'xaz xBz xcz x-z x@z'

Find all substrings that match the following pattern: the letter 'x', then an uppercase or lowercase letter or a hyphen, then the letter 'z'.

Given a string:

txt = 'xaz x$z x-z xcz x+z x%z x*z'

Find all substrings that match the following pattern: the letter 'x', then a dollar sign, a hyphen, or a plus sign, then the letter 'z'.
