Writing regular expressions involves more than learning the mechanics.
You not only have
to learn how to describe patterns, you also have to recognize
the context in which they appear.
You have to think through how much detail
a regular expression needs, based
on the context in which the pattern will be applied.

The same thing that makes writing regular expressions difficult is
what makes writing them interesting:
the variety of occurrences or contexts in which a pattern appears.
This complexity is inherent in language itself:
you can't always understand an
expression (26.1)
by looking up each word in the dictionary.

The process of writing a regular expression involves
three steps:

1. Knowing what it is you want to match and how it might appear in the text.

2. Writing a pattern to describe what you want to match.

3. Testing the pattern to see what it matches.

This is virtually the same process that a programmer
follows to develop a program.
Step 1 might be considered the specification,
which should reflect an understanding of the problem to be solved as well
as how to solve it.
Step 2 is analogous to the actual coding of the program,
and step 3 involves running the program and testing it against the
specification.
Steps 2 and 3 form a loop that is repeated until the program
works satisfactorily.

Testing your description of what you want to match
ensures that the description
works as expected.
It usually uncovers a few surprises.
Carefully examining the results of a test, comparing the output
against the input, will greatly improve your
understanding of regular expressions.
You might consider evaluating the
results of a pattern-matching operation as follows:

Hits: The lines that I wanted to match.

Misses: The lines that I didn't want to match.

Misses that should be hits: The lines that I didn't match but wanted to match.

Hits that should be misses: The lines that I matched but didn't want to match.

Trying to perfect your description of a pattern
is something that you work at from opposite ends: you try to
eliminate the "hits that should be misses"
by limiting the possible matches and you try to
capture the "misses that should be hits" by expanding
the possible matches.
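
One way to keep track of these four categories while testing is to compare what a pattern matches against a list of the lines you wanted. The following is a minimal Python sketch, not a prescribed tool: the function name evaluate, the sample lines, and the wanted set are all invented for illustration, and Python's re module stands in for whatever program you use to search.

```python
import re


def evaluate(pattern, lines, wanted):
    """Sort each line into one of the four result categories.

    `wanted` is the set of lines that should match (hypothetical test data).
    """
    regex = re.compile(pattern)
    report = {
        "hits": [],
        "misses": [],
        "misses that should be hits": [],
        "hits that should be misses": [],
    }
    for line in lines:
        matched = regex.search(line) is not None
        if matched and line in wanted:
            report["hits"].append(line)
        elif not matched and line not in wanted:
            report["misses"].append(line)
        elif not matched:
            # Wanted, but the pattern failed to match it.
            report["misses that should be hits"].append(line)
        else:
            # Matched, but not wanted.
            report["hits that should be misses"].append(line)
    return report


# Invented sample input for illustration only.
lines = [
    "what time is it?",
    "What did you say?",
    "Take off your hat.",
    "Nothing to see here.",
]
wanted = {"what time is it?", "What did you say?"}

too_narrow = evaluate("what", lines, wanted)
too_broad = evaluate("hat", lines, wanted)

for category, found in too_broad.items():
    print(category, found)
```

With this sample input, the pattern what is too narrow (it leaves "What did you say?" among the misses that should be hits), while hat is too broad (it puts "Take off your hat." among the hits that should be misses).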

The difficulty is especially apparent when you must
describe patterns using fixed strings.
Each character you
remove from a fixed-string pattern increases the number of possible matches.
For instance, while searching for the string "what",
you determine that you'd like to match "What" as well.
The longest fixed-string pattern that will
match both "What" and "what" is "hat",
the string common to both.
It is obvious, though, that searching for "hat" will
produce unwanted matches, such as "that" and "chat".
Conversely, each character you add to a fixed-string pattern decreases
the number of possible matches.
The string "them" is going to produce fewer matches than the string "the".
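
The effect of adding or removing characters can be seen by counting matching lines. This is a small Python sketch under stated assumptions: the sample lines and the helper count_lines are invented for illustration, and Python's `in` operator is used as a plain fixed-string (substring) search.

```python
# Invented sample input for illustration only.
text_lines = [
    "what is the plan?",
    "What did they tell them?",
    "That is the other hat.",
    "Give them the theme.",
]


def count_lines(fixed, lines):
    """Count the lines that contain the fixed string `fixed`."""
    return sum(1 for line in lines if fixed in line)


# Shorter fixed strings match at least as many lines as longer ones
# that contain them.
for fixed in ["what", "What", "hat", "the", "them"]:
    print(fixed, count_lines(fixed, text_lines))
```

In this sample, "what" and "What" each match one line, while "hat" matches three, including the unwanted "That is the other hat." Likewise "the" matches all four lines and "them" only two.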

Using metacharacters in patterns provides
greater flexibility in extending or narrowing the range of matches.
Metacharacters, combined with literals
or other metacharacters,
can expand the range of matches
while still eliminating the matches that you do not want.
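
As a hedged illustration, a bracket expression can match both capitalizations of a word without falling back to a shorter fixed string. The sketch below uses Python's re module; the sample lines and the helper matching are invented for illustration, and the \b word-boundary anchor is Python syntax that other tools may spell differently or not support.

```python
import re

# Invented sample input for illustration only.
lines = [
    "what time is it?",
    "What did you say?",
    "Take off your hat.",
    "That is somewhat odd.",
]


def matching(pattern, lines):
    """Return the lines in which the regular expression is found."""
    return [line for line in lines if re.search(pattern, line)]


# [Ww] expands the match to either case of the first letter.
bracket = matching(r"[Ww]hat", lines)

# \b anchors narrow the match to the whole word only.
bounded = matching(r"\b[Ww]hat\b", lines)

print(bracket)
print(bounded)
```

Here [Ww]hat matches both "what" and "What" and skips the line containing only "hat", but it still picks up "somewhat". Adding the word-boundary anchors removes that last unwanted match.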