On my travels I’ve come across a few different functions, and I’m never sure which is the right one to use, so I thought I’d document what I’ve tried for future me.

Check if regex matches

The first regex I wrote was while scraping the Champions League results from the Rec.Sport.Soccer Statistics Foundation, where I wanted to determine which spans contained a match result and which didn’t.

A matching line would look like this:

Real Madrid-Juventus Turijn 2 - 1

And a non-matching one like this:

53’Nedved 0-1, 66'Xavi Hernández 1-1, 114’Zalayeta 1-2

I wrote the following regex to detect match results:

[a-zA-Z\s]+-[a-zA-Z\s]+ [0-9][\s]?.[\s]?[0-9]

I then wrote the following function using re-matches which would return true or false depending on the input:
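
A minimal sketch of such a function, assuming the pattern above is used unchanged. The names result-pattern and match-result? are placeholders for illustration and may not be what the original code used:

```clojure
;; Hypothetical names; the pattern itself is the one given above.
(def result-pattern #"[a-zA-Z\s]+-[a-zA-Z\s]+ [0-9][\s]?.[\s]?[0-9]")

(defn match-result?
  "Returns true if the whole line matches result-pattern, false otherwise."
  [line]
  ;; re-matches only succeeds when the entire string matches. With no
  ;; capture groups it returns the matched string, or nil on failure,
  ;; so boolean turns that into true/false.
  (boolean (re-matches result-pattern line)))

(match-result? "Real Madrid-Juventus Turijn 2 - 1")
;; => true

(match-result? "53’Nedved 0-1, 66'Xavi Hernández 1-1, 114’Zalayeta 1-2")
;; => false
```

Because re-matches has to consume the whole string, the goal-scorer line fails straight away: it starts with a digit, which the first character class does not allow.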

re-seq returns a lazy sequence of the successive matches of the regex in a string. Each element is a string if the pattern has no capture groups; otherwise it is a vector containing the whole match followed by each of the capture groups.

For example, if we keep only the part of the pattern that matches sequences of letters or whitespace and drop the rest, we’d get the following results:
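
As an illustration, here is what re-seq gives back for the matching line from earlier. The input string is an assumption (the original example may have used a different line), and the second call adds a capture group to show the vector form:

```clojure
;; Assumed input: the matching line shown earlier in the post.
(def line "Real Madrid-Juventus Turijn 2 - 1")

;; No capture groups: each element is the matched string.
(re-seq #"[a-zA-Z\s]+" line)
;; => ("Real Madrid" "Juventus Turijn " " " " ")

;; One capture group: each element is a vector of [whole-match group-1].
(re-seq #"([a-zA-Z\s]+)" line)
;; => (["Real Madrid" "Real Madrid"] ["Juventus Turijn " "Juventus Turijn "] [" " " "] [" " " "])
```

Note the trailing space in "Juventus Turijn " and the two lone " " entries: \s is part of the character class, so the spaces around the score are picked up as matches of their own.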