[Day 2 of learning Python's re module] Greedy mode and lazy mode in regular expressions

1 Mode overview

  1. Greedy mode: The default. A repetition metacharacter matches as many characters as possible, extending forward through the string.
  2. Lazy mode: Also known as non-greedy mode. A repetition metacharacter matches as few characters as possible.
  3. What they have in common: Both apply only on the condition that the overall match still succeeds. A greedy match gives characters back, and a lazy match takes more, whenever the rest of the pattern requires it.
  4. Related metacharacters: *, +, ?, {m}, {m,n}
  5. Converting from greedy mode to lazy mode: append ? to the relevant metacharacter (*?, +?, ??, {m}?, {m,n}?) to request lazy matching.
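
Point 3 is the easiest one to overlook, so here is a minimal sketch of it. The test strings "abbbc" and "aXbYb" are made up for illustration and do not come from the tutorial. The first pair shows a lazy quantifier being forced to expand; the second pair shows a greedy quantifier giving characters back.

```python
import re

# Lazy b*? prefers zero b's, but the trailing 'c' in the pattern forces it
# to keep expanding until the whole pattern can succeed.
lazy_forced = re.search(r"ab*?c", "abbbc").group()
# Greedy b* takes every b first; only one overall match exists, so the
# result is identical.
greedy_same = re.search(r"ab*c", "abbbc").group()
print(lazy_forced, greedy_same)   # abbbc abbbc

# Greedy .* first runs to the end of the string, then backtracks until
# the final 'b' of the pattern can match.
greedy_span = re.search(r"a.*b", "aXbYb").group()
# Lazy .*? stops at the first 'b' that completes the match.
lazy_span = re.search(r"a.*?b", "aXbYb").group()
print(greedy_span, lazy_span)     # aXbYb aXb
```

In other words, greedy and lazy only decide which successful match is preferred; neither mode will cause a match to fail that would otherwise succeed.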

2 Examples

2.1 Greedy mode example

import re
re.findall(r'ab*',"abbbbbbbbbcd")           # ['abbbbbbbbb']  * takes every b
re.findall(r'ab+',"abbbbbbbbbcd")           # ['abbbbbbbbb']  + takes every b
re.findall(r'ab?',"abbbbbbbbbcd")           # ['ab']          ? takes the one b it is allowed
re.findall(r'ab{3}',"abbbbbbbbbcd")         # ['abbb']        exactly three b's
re.findall(r'ab{3,5}',"abbbbbbbbbcd")       # ['abbbbb']      the upper bound, five b's

2.2 Lazy mode example

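import re
re.findall(r'ab*?',"abbbbbbbbbcd")          # ['a']     *? takes zero b's
re.findall(r'ab+?',"abbbbbbbbbcd")          # ['ab']    +? takes the minimum, one b
re.findall(r'ab??',"abbbbbbbbbcd")          # ['a']     ?? takes zero b's
re.findall(r'ab{3}?',"abbbbbbbbbcd")        # ['abbb']  exact count, so lazy changes nothing
re.findall(r'ab{3,5}?',"abbbbbbbbbcd")      # ['abbb']  the lower bound, three b's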
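
The examples in 2.1 and 2.2 can also be run side by side. The following is only a sketch: the loop, the `results` dictionary, and the test string built with `"b" * 9` are illustrative choices, not part of the original tutorial.

```python
import re

# One 'a' followed by a run of b's, the same shape as the string used above.
text = "a" + "b" * 9 + "cd"
quantifiers = ["*", "+", "?", "{3}", "{3,5}"]

results = {}
for q in quantifiers:
    greedy = re.findall("ab" + q, text)       # default (greedy) form
    lazy = re.findall("ab" + q + "?", text)   # same quantifier with ? appended
    results[q] = (greedy, lazy)
    print(f"{q:<6} greedy={greedy}  lazy={lazy}")
```

Only {3} produces the same result in both columns, because an exact count leaves the quantifier no choice to make.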

2.3 Comparison

  1. The examples above make the behavior of greedy and lazy mode easy to see. For more complex strings, it is harder to predict what each mode will return.
  2. In the following example, greedy mode matches as much as possible, so it pairs the first opening bracket with the last closing bracket and returns the whole string as a single match. Lazy mode stops at the nearest closing bracket, so each bracketed title is returned separately.

import re
s = '[Hua Qiangu], [Legend of Lu Zhen], [New Fair Princess], [Chu Qiao Biography]' 
re.findall(r'\[.+\]', s)     # greedy mode output: ['[Hua Qiangu], [Legend of Lu Zhen], [New Fair Princess], [Chu Qiao Biography]']
re.findall(r'\[.+?\]', s)    # lazy mode output: ['[Hua Qiangu]', '[Legend of Lu Zhen]', '[New Fair Princess]', '[Chu Qiao Biography]']
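
As a possible follow-up to the bracket example, the sketch below extracts only the titles, without the brackets. It uses two standard re features that the tutorial itself does not cover, so treat it as an optional variation: a capturing group, and a negated character class as an alternative to the lazy quantifier. The names `titles` and `titles_alt` are illustrative.

```python
import re

s = '[Hua Qiangu], [Legend of Lu Zhen], [New Fair Princess], [Chu Qiao Biography]'

# With one capturing group, findall returns only the text inside the parentheses.
titles = re.findall(r'\[(.+?)\]', s)
print(titles)
# ['Hua Qiangu', 'Legend of Lu Zhen', 'New Fair Princess', 'Chu Qiao Biography']

# Alternative: a greedy quantifier over [^\]] cannot cross a closing bracket,
# so it stops at the same place the lazy version does.
titles_alt = re.findall(r'\[([^\]]+)\]', s)
print(titles_alt == titles)  # True
```

Both patterns give the same result here; the negated class states the stopping condition explicitly instead of relying on lazy matching.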

3 Summary

  1. Understand the difference between greedy mode and lazy mode.
  2. Master the usage of both modes.
  3. Reading the result of an existing pattern is easy, but it takes practice to write a concise regular expression when a similar requirement comes up.
