
String Manipulation

Programmers need to know how to manipulate strings for a variety of purposes, regardless of the programming language they are working in. This article will explain the various methods used to manipulate strings in Python.

Introduction

This article will take a look at the various methods of manipulating strings,
covering things from basic methods to regular expressions in Python. String
manipulation is a skill that every Python programmer should be familiar
with.

String Methods

The most basic way to manipulate strings is through the methods that are
built into them. We can perform a limited number of tasks on strings through
these methods. Open up the Python interactive interpreter. Let’s create a string
and play around with it a bit.

>>> test = 'This is just a simple string.'

Let’s take a quick detour and use the len function. It can be used
to find the length of a string. I’m not sure why it’s a function rather than a
method, but that’s a separate issue:

>>> len(test)
29

All right, now let’s get back to those methods I was talking about. Let’s
take our string and replace a word using the replace method:
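A minimal sketch of replace, reusing the test string from above. The replacement word 'short' is an assumption chosen only for illustration. Strings are immutable, so replace returns a new string and leaves the original untouched:

```python
test = 'This is just a simple string.'

# replace returns a new string with every occurrence substituted
replaced = test.replace('simple', 'short')
print(replaced)  # This is just a short string.

# The original string is unchanged
print(test)      # This is just a simple string.
```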

Regular Expressions

Regular expressions are a very powerful tool in any language. They allow
patterns to be matched against strings. Actions such as replacement can be
performed on the string if the regular expression pattern matches. Python’s
module for regular expressions is the re module. Open the Python
interactive interpreter, and let’s take a closer look at regular expressions and
the re module:

>>> import re

Let’s create a simple string we can use to play around with:

>>> test = 'This is for testing regular expressions in Python.'

I spoke of matching special patterns with regular expressions, but let’s
start with matching a simple string just to get used to regular expressions.
There are two functions for matching patterns in strings in the re module:
search and match. Let’s take a look at search first. It works like this:

>>> result = re.search('This', test)

We can extract the results using the group method:

>>> result.group(0)
'This'

You’re probably wondering about the group method right now and why we
pass zero to it. It’s simple, and I’ll explain. You see, patterns can be
organized into groups, like this:

>>> result = re.search('(Th)(is)', test)

There are two groups surrounded by parentheses. We can extract them using the group method:

>>> result.group(1)
'Th'
>>> result.group(2)
'is'

Passing zero to the method returns the entire match, which here spans both groups:

>>> result.group(0)
'This'

The benefit of groups will become more clear once we work our way into actual
patterns. First, though, let’s take a look at the match function. It works
similarly, but there is a crucial difference:
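A small sketch of that difference, assuming the same test string as before. Here match is asked for “regular” and comes back empty, while search finds it:

```python
import re

test = 'This is for testing regular expressions in Python.'

# match only tries the pattern at the very start of the string
match_result = re.match('regular', test)
print(match_result)            # None

# search scans the whole string for the first occurrence
search_result = re.search('regular', test)
print(search_result.group(0))  # regular
```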

Notice that None was returned, even though “regular” is in the string.
If you haven’t figured it out, the match function matches patterns only at the
beginning of the string, and the search function examines the whole
string. You might be wondering if it’s possible, then, to make the match
function match “regular,” since it’s not at the beginning of the string. The
answer is yes. It’s possible to match it, and that brings us into patterns.

The character “.” will match any character except, by default, a newline. We
can get the match function to match “regular” by putting a period for every
character before it. Let’s split this up into two groups as well. One will
contain the periods, and one will contain “regular”:
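A sketch of that pattern, assuming the test string defined earlier. The twenty periods are built with '.' * 20 only so the count is easy to verify; typed out by hand, the first group is simply twenty consecutive periods:

```python
import re

test = 'This is for testing regular expressions in Python.'

# First group: twenty periods, one per character before "regular"
# Second group: the literal word itself
pattern = '(' + '.' * 20 + ')(regular)'
result = re.match(pattern, test)

print(repr(result.group(1)))  # 'This is for testing '
print(result.group(2))        # regular
```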

Aha! We matched it! However, it’s ridiculous to have to type in all those
periods. The good news is that we don’t have to do that. Take a look at this and
remember that there are twenty characters before “regular”:
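A minimal sketch of the shorter form, assuming the standard repetition syntax: a number in braces after a pattern repeats it that many times, and two numbers separated by a comma give a minimum and a maximum. The variable names are illustrative only:

```python
import re

test = 'This is for testing regular expressions in Python.'

# .{20} means "any character, exactly twenty times"
exact = re.match('(.{20})(regular)', test)
print(repr(exact.group(1)))   # 'This is for testing '

# .{10,20} means "any character, between ten and twenty times"
ranged = re.match('(.{10,20})(regular)', test)
print(repr(ranged.group(1)))  # 'This is for testing '
print(ranged.group(2))        # regular
```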

By giving two numbers inside the braces, as in {10,20}, you can match any
number of characters in a range. In this case, that range is 10-20. Sometimes,
however, this can cause undesired behavior. Take a look at this string:
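One way a range can surprise you is greedy matching: the pattern takes as many characters as the range allows before giving any back. The sketch below uses a hypothetical string, not one from the article, and assumes the standard “?” suffix that makes a repetition match as few characters as possible:

```python
import re

# Hypothetical string containing "regular" twice
text = 'one regular two regular three'

# Greedy: the range grabs as much as it can, so the match
# runs up to the second "regular"
greedy = re.match('(.{1,20})(regular)', text)
print(repr(greedy.group(1)))  # 'one regular two '

# Non-greedy: the trailing ? stops at the first "regular"
lazy = re.match('(.{1,20}?)(regular)', text)
print(repr(lazy.group(1)))    # 'one '
```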

Finally, there are a number of special sequences, each introduced by a
backslash. “\A” matches at the start of a string. “\Z” matches at the end of a
string. “\d” matches a digit. “\D” matches anything but a digit. “\s” matches
whitespace. “\S” matches anything but whitespace.
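A short sketch of these sequences in use. Each is written with a leading backslash (\A, \Z, \d, \D, \s, \S), and raw strings (the r prefix) keep Python from interpreting the backslashes itself. The sample string and variable names are assumptions made for illustration:

```python
import re

# Hypothetical sample string
text = 'Order 66 shipped'

# \d matches a digit; + repeats it one or more times
digits = re.search(r'\d+', text).group(0)
print(digits)      # 66

# \D matches anything but a digit
non_digits = re.search(r'\D+', text).group(0)
print(repr(non_digits))  # 'Order '

# \A anchors at the start of the string, \Z at the end
at_start = re.search(r'\AOrder', text).group(0)
at_end = re.search(r'shipped\Z', text).group(0)
print(at_start, at_end)  # Order shipped

# \S matches non-whitespace, \s matches whitespace
words = re.findall(r'\S+', text)
spaces = re.findall(r'\s', text)
print(words)       # ['Order', '66', 'shipped']
print(len(spaces)) # 2
```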

On a final note, you should not use regular expressions to match or replace
simple strings. The string methods are simpler and easier to read for that job.
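A brief sketch of why, using a hypothetical version string. For fixed text, the in operator and the replace method do the job directly, while a regular expression can misfire because characters such as “.” have a special meaning in a pattern:

```python
import re

# Hypothetical sample string
version = 'v1.0'

# Plain string tools treat the period literally
has_dot = '.' in version
plain = version.replace('.', '!')
print(has_dot)  # True
print(plain)    # v1!0

# As a regular expression, "." matches every character
regex = re.sub('.', '!', version)
print(regex)    # !!!!
```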

Conclusion

You now have a basic knowledge of string manipulation in Python.
As I explained at the very beginning of the article, string manipulation is
necessary to many applications, both large and small. It is used frequently, and
a basic knowledge of it is critical.