File Locking in Perl

What happens when children lunge for the same action figure? Similar bad experiences can occur when programs attempt to use the same resources. Learn how to ensure data integrity through locking files.

Accessing resources such as files and programs is one of the most common
tasks when writing software. Often, multiple programs or multiple instances of
one program want to access the same resource. Locking files is a good way to
ensure that only one process uses a resource at a time. This article explains
file locking, common mistakes, and an idiomatic way to lock files. Before using
the techniques described here, make sure that you fully understand the Perl
flock() function, or are familiar with the UNIX flock(2) system call from the
appropriate documentation.

Reasons for Locking Files

When many processes want to access the same resource, sooner or later more
than one of them will try to alter it at the same time. When this happens, bad
things can occur. Data integrity is important. No one wants a file containing
incorrect data, or worse, a file that has become corrupted. This is why we want
to lock files: to ensure that only a single process at a time can access a
given resource.

Let's look at a very common application that uses text files as a data
store: access counters for web sites. Here's a common way that people
implement access counters with Perl:
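The listing below is a minimal sketch of such a counter, not a definitive version. It assumes a hypothetical data file named counter.txt and a hypothetical helper, increment_counter(); both names are illustrative. It also assumes the file is reopened with '>' for the write, which truncates it. No locking is done anywhere.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical counter file; the name is an assumption for this sketch.
my $file = 'counter.txt';

# Naive, unlocked counter: read the count, then reopen the file and write it back.
sub increment_counter {
    my ($path) = @_;
    my $count = 0;

    # Open file for reading, read data from file, close file.
    if (open my $in, '<', $path) {
        my $line = <$in>;
        $count = $1 if defined $line && $line =~ /^(\d+)/;
        close $in;
    }

    $count++;

    # Open file for update (assumed here to truncate it), write, close file.
    open my $out, '>', $path or die "Cannot write $path: $!";
    print {$out} "$count\n";
    close $out or die "Cannot close $path: $!";

    return $count;
}

print "Content-type: text/plain\n\n";
print "You are visitor number ", increment_counter($file), "\n";
```

Run by one process at a time, this behaves as expected: a missing or empty file yields 1, and each later call adds one.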

Basic as this script may be, it has a few problems. To explain what is wrong, let's begin with the flow this program follows during execution.

1. Open file for reading
2. Read data from file
3. Close file
4. Open file for update
5. Write updated data to file
6. Close file

Do you see what's wrong? If not, try thinking of multiple instances of this program running all at the same time. The problem here is that each instance opens, reads, and writes the same file whenever it feels like it. This means that multiple instances can read the same count at the same time, and then update the file concurrently. The count can therefore become inaccurate, and it can even be reset to 1 (for example, if one instance reads the file just after another has emptied it in preparation for writing).

This problem is commonly called a race condition, because a condition exists where multiple instances are racing to use a resource. As with any race, there will be only one winner, and that winner may not have the desired information.
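To make the race concrete, here is a sketch that replays one unlucky interleaving by hand inside a single process, so the result is deterministic. The helpers read_count() and write_count() and the starting value 41 are hypothetical and chosen only for illustration. Real instances would be separate processes whose timing cannot be predicted.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return the stored count, or 0 if the file is empty.
sub read_count {
    my ($path) = @_;
    open my $in, '<', $path or return 0;
    my $line = <$in>;
    close $in;
    return 0 unless defined $line && $line =~ /^(\d+)/;
    return $1;
}

# Hypothetical helper: overwrite the file with a new count (no locking).
sub write_count {
    my ($path, $count) = @_;
    open my $out, '>', $path or die "Cannot write $path: $!";
    print {$out} "$count\n";
    close $out or die "Cannot close $path: $!";
}

my ($fh, $path) = tempfile(UNLINK => 1);
close $fh;
write_count($path, 41);

# Lost update: instances A and B both read before either one writes.
my $seen_by_a = read_count($path);
my $seen_by_b = read_count($path);
write_count($path, $seen_by_a + 1);
write_count($path, $seen_by_b + 1);
my $final = read_count($path);
print "Two visits recorded, but the counter went from 41 to $final\n";

# Reset: a writer has opened the file with '>' (truncating it) but has
# not written yet, so a reader that arrives now sees an empty file.
open my $writer, '>', $path or die "Cannot write $path: $!";
my $seen_by_c = read_count($path);
close $writer;
print "A reader arriving mid-update saw $seen_by_c and would store ",
      $seen_by_c + 1, "\n";
```

Two visits move the counter forward by only one, and a reader that arrives between the truncation and the write starts the count over.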