CS 1110 Lecture 1: November 17 – Re Building

November 17 Re Building
Sunday, November 5, 2017
11:19 AM
Guidelines
1. Create good and bad example strings first; mark them off manually
2. Use r'' strings not '' strings
3. Go left to right
4. Pick the right kind of "or": [ab] for single character, (ab|cd) for longer
5. Optional: ()?; several: ()*
[^oe] matches the complement of oe: any single character except o or e
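The guidelines above can be tried out with a small check. This is only a sketch: the target phrase, the pattern, and the good/bad example lists are made up for illustration and are not from the lecture.

```python
import re

# Hypothetical target: "gray" or "grey", a space, then "cat" or "dog", optionally plural.
# [ae] chooses between single characters; (cat|dog) chooses between longer options.
pattern = re.compile(r'gr[ae]y (cat|dog)s?')

good = ['gray cat', 'grey dogs', 'gray dog']   # examples that should match
bad = ['groy cat', 'gray cow', 'grey']         # examples that should not match

def matches(s):
    """Return True when the whole string fits the pattern."""
    return pattern.fullmatch(s) is not None

good_results = [matches(s) for s in good]
bad_results = [matches(s) for s in bad]
print(good_results, bad_results)

# [^oe] matches any one character that is neither 'o' nor 'e'.
not_oe = re.findall(r'[^oe]', 'hello')
print(not_oe)
```

Writing the good and bad lists before the pattern gives something concrete to cross off as the expression grows from left to right.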
Three rules of Regex:
1. Left to right: you want to build leftmost to rightmost
o Write down some examples and cross them off as you go
2. Kinds of "one of": any one of the listed alternatives may appear ([ab] for single characters, (ab|cd) for longer options)
3. Maybe, many, one
o Question mark means "maybe": the item before it is optional >> ( )?
On its own, the question mark applies only to the single preceding character; wrap anything longer in parentheses
o May be one of them (0 or 1): ?
o Might be 0 or more: *
o Might be 1 or more: +
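A short sketch of the three quantifiers may help here. The sample words and the helper name full are assumptions chosen for illustration, not examples from the lecture.

```python
import re

def full(pattern, s):
    """True when the entire string s matches pattern."""
    return re.fullmatch(pattern, s) is not None

# ? : zero or one copy of the preceding item
maybe = [full(r'colou?r', s) for s in ['color', 'colour', 'colouur']]

# * : zero or more copies of the preceding item
star = [full(r'ha*', s) for s in ['h', 'ha', 'haaa']]

# + : one or more copies of the preceding item
plus = [full(r'ha+', s) for s in ['h', 'ha', 'haaa']]

# Parentheses make ? apply to a longer sequence instead of one character.
group = [full(r'(very )?good', s) for s in ['good', 'very good', 'very very good']]

print(maybe, star, plus, group)
```

The only difference between the star and plus rows is the bare 'h': * accepts zero copies of 'a', while + requires at least one.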
You should pretty much always write a regular expression as an 'r' (raw) string so backslashes reach the regex engine exactly as typed
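The effect of the r prefix can be seen in a few lines. This is a minimal sketch; the sample sentence and variable names are assumptions for illustration.

```python
import re

plain = '\\d+'   # a normal string needs the backslash doubled
raw = r'\d+'     # a raw string keeps the backslash as typed
same = (plain == raw)

# In a normal string '\b' is one backspace character;
# in a raw string it is a backslash plus 'b', which the regex engine reads as a word boundary.
backspace_len = len('\b')
raw_len = len(r'\b')

# Only the standalone words match, not the "cat" inside "concat".
words = re.findall(r'\bcat\b', 'cat concat cat.')
print(same, backspace_len, raw_len, words)
```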
How to handle back\slashes: 16:30
Write a single copy of the thing to be repeated, wrap it in parentheses, and put * after it
" ( [^"\\]*(\\.)? )* "
Is the same as
" ( [^"\\] | \\.)* "
Each repeated unit is either one character that is not a quote or backslash ([^"\\]), or a backslash followed by any one character (\\.)
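import re
# Readability spaces are dropped: without re.VERBOSE they would have to match literally
strings = re.compile(r'"([^"\\]|\\.)*"')
for match in strings.finditer(text):  # text: the string being searched
    print(match.group())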
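To check that the two forms of the quoted-string pattern behave the same, both can be run on one sample. This is a sketch under assumptions: the sample text and the names nested and flat are invented here, and the patterns are written without the readability spaces.

```python
import re

# Both forms from the notes, written without the readability spaces.
nested = re.compile(r'"([^"\\]*(\\.)?)*"')
flat = re.compile(r'"([^"\\]|\\.)*"')

# Hypothetical sample: two quoted strings, the second containing escaped quotes.
text = r'say "hi" then "a \"quoted\" word" end'

nested_hits = [m.group() for m in nested.finditer(text)]
flat_hits = [m.group() for m in flat.finditer(text)]

for hit in flat_hits:
    print(hit)
```

On this sample both patterns find the same two strings, and the escaped quotes do not end the second match early because \\. consumes the backslash together with the character after it.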
Example: one