Writing a Regex Engine in Python

10

Description:

We write a super simple Regex Engine in python, illustrating along the way how and why regular expressions work.
We also mention why the default re is so nice in some ways.

Topics covered:

Finite State Automatons (DFA, NFA)

Internally what is happening when you write regex patterns

Algorithms needed to write a regex engine

How to extend your engine in the future

Did I mention this is in pure Python? Yep, this is pure Python.

Motivation:

What would you get out of this?

Learn how to compose regular expressions.

Get a look behind the scenes of almost any regular expression engine.

Understand which expressions would end up being expensive ones.

This is not a programming tutorial.

To keep things simple we only allow |, *, concatenation
operators since others can be made up using them

We only demonstrate the simplest of regular expressions and show
how other complex ones may be made.

This is not a talk setting up the case for ditching re; instead offering explanations as to how re and other regex engines basically take a string (expression) and turn that into a program which either accepts or rejects candidate strings.

Bonuses:

A mini regex tutorial if time allows.

Prerequisites:

How to write a function in Python

Python data types

Content URLs:

Speaker Info:

I'm a Masters student at IIITM-K currently working on my Masters' Thesis.
I love building things with Python, blogging, writing articles for OSFY and so on.
I've been swimming in Python for about 5 years now, and absolutely love it.