The Coda Distributed File System

Carnegie Mellon University has developed an exciting file system. Mr. Braam, one of the developers, tells us all about it.

The Coda distributed file system is a
state-of-the-art experimental file system developed in the group of
M. Satyanarayanan at Carnegie Mellon University (CMU). Numerous
people contributed to Coda, which now incorporates many features
not found in other systems:

Mobile Computing:

- disconnected operation for mobile clients
- reintegration of data from disconnected clients
- bandwidth adaptation

Failure Resilience:

- read/write replication servers
- resolution of server/server conflicts
- handles network failures which partition the servers
- handles disconnection of clients

Performance and scalability:

- client-side persistent caching of files, directories and attributes for high performance

A distributed file system stores files on one or more
computers called servers and makes them accessible to other
computers called clients, where they appear as normal files. There
are several advantages to using file servers: the files are more
widely available since many computers can access the servers, and
sharing the files from a single location is easier than
distributing copies of files to individual clients. Backups and
safety of the information are easier to arrange since only the
servers need to be backed up. The servers can provide large storage
space, which might be costly or impractical to supply to every
client. The usefulness of a distributed file system becomes clear
when considering a group of employees sharing documents; however,
more is possible. For example, sharing application software is an
equally good candidate. In both cases system administration becomes
easier.

There are many problems facing the design of a good
distributed file system. Transporting many files over the Net can
easily create sluggish performance and latency; network bottlenecks
and server overload can result. The security of data is another
important issue: how can we be sure that a client is really
authorized to have access to information and how can we prevent
data being sniffed off the network? Two further problems facing the
design are related to failures. Often, client computers are more
reliable than the network connecting them, and network failures can
render a client useless. Similarly, a server failure can be very
unpleasant, since it can disable all clients from accessing crucial
information. The Coda project has paid attention to many of these
issues and implemented solutions to them in a research prototype.

Coda was originally implemented on Mach 2.6 and has recently
been ported to Linux, NetBSD and FreeBSD. Michael Callahan ported a
large portion of Coda to Windows 95, and we are studying Windows NT
to understand the feasibility of porting Coda to NT. Currently, our
efforts are on ports and on making the system more robust. A few
new features are being implemented (write-back caching and cells
for example), and in several areas, components of Coda are being
reorganized. We have already received very generous help from users
on the Net, and we hope that this will continue. Perhaps Coda can
become a popular, widely used and freely available distributed file
system.

Coda on a Client

If Coda is running on a client, which we shall take to be a
Linux workstation, typing mount will show a file
system of type “Coda” mounted under /coda. All the files that
any of the servers may provide to the client are available under
this directory, and all clients see the same name space. A client
connects to “Coda” and not to individual servers, which come into
play invisibly. This is quite different from mounting NFS file
systems, which is done on a per-server, per-export basis. In the
most common Windows systems (Novell and Microsoft's CIFS) as well
as with Appleshare on the Macintosh, files are also mounted per
volume. Yet the global name space is not new. The Andrew file
system, Coda's predecessor, pioneered the idea and stored all files
under /afs. Similarly, the distributed file system DFS/DCE from OSF
mounts its files under one directory. Microsoft's new distributed
file system (dfs) provides glue to put all server shares in a
single file tree, similar to the glue provided by auto-mount
daemons and yellow pages on UNIX. Why is a single mount point
advantageous? It means that all clients can be configured
identically, and users will always see the same file tree. For
large installations this is essential. With NFS, the client needs
an up-to-date list of servers and their exported directories in
/etc/fstab, while in Coda a client merely needs to know where to
find the Coda root directory /coda. When new servers or shares are
added, the client will discover these automatically in the /coda
tree.
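The single mount point can be illustrated with a short sketch. The dictionary below stands in for Coda's volume location database, which clients query at run time; the server names and the `locate` function are invented for illustration and are not Coda's actual data structures or API.

```python
# Hypothetical sketch of path resolution under a single mount point.
# The volume database and server names below are illustrative only.

CODA_ROOT = "/coda"

# In Coda, clients learn volume locations from the servers at run time;
# here a dict stands in for that volume location database.
volume_db = {
    "projects": ["server1.example.org", "server2.example.org"],
    "tmp": ["server3.example.org"],
}

def locate(path):
    """Return the replica servers holding a path under /coda."""
    if not path.startswith(CODA_ROOT + "/"):
        raise ValueError("not a Coda path: " + path)
    volume = path[len(CODA_ROOT) + 1:].split("/")[0]
    # A new volume added on the servers shows up here automatically;
    # no /etc/fstab entry on the client needs updating.
    return volume_db[volume]

print(locate("/coda/tmp/foo"))   # ['server3.example.org']
```

The point of the sketch: the client is configured with nothing but the root /coda, and everything below it is discovered dynamically, unlike an NFS client's static per-server mount table.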

To understand how Coda can operate when the network
connections to the server have been severed, let's analyze a simple
file system operation. Suppose we type:

cat /coda/tmp/foo

to display the contents of a Coda file. What actually
happens? The cat program will make
a few system calls in relation to the file. A system call is an
operation through which a program asks the kernel for service. For
example, when opening the file the kernel will want to do a lookup
operation to find the inode of the file and return a file handle
associated with the file to the program. The inode contains the
information to access the data in the file and is used by the
kernel; the file handle is for the opening program. The open call
enters the virtual file system (VFS) in the kernel, and when it is
realized that the request is for a file in the /coda file system,
it is handed to the Coda file system module in the kernel. Coda is
a fairly minimalistic file-system module: it keeps a cache of
recently answered requests from the VFS, but otherwise passes the
request on to the Coda cache manager, called Venus. Venus will
check the client disk cache for tmp/foo, and in case of a cache
miss, it contacts the servers to ask for tmp/foo. When the file has
been located, Venus responds to the kernel, which in turn returns
control to the calling program, completing the system call.
Schematically we have the
image shown in Figure 3.
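The layering described above can be sketched in a few lines. This is an illustrative model only: the class and method names are invented, and the real kernel module is C code that reaches Venus through /dev/cfs0 rather than a direct call.

```python
# Illustrative sketch of the open/lookup path: a minimal kernel-side
# module caches recently answered requests and otherwise upcalls to
# the user-level cache manager, Venus. All names here are invented.

class CodaKernelModule:
    """Minimal VFS layer: answer from a small cache, else upcall."""
    def __init__(self, venus):
        self.venus = venus
        self.answered = {}          # cache of recently answered requests

    def lookup(self, path):
        if path in self.answered:   # recently answered by Venus
            return self.answered[path]
        handle = self.venus.handle_lookup(path)   # upcall to Venus
        self.answered[path] = handle
        return handle

class Venus:
    """User-level cache manager: check disk cache, else ask servers."""
    def __init__(self):
        self.disk_cache = {}        # container files in venus.cache/

    def handle_lookup(self, path):
        if path not in self.disk_cache:           # cache miss
            self.disk_cache[path] = self.fetch_from_servers(path)
        return self.disk_cache[path]

    def fetch_from_servers(self, path):
        return "container-file-for:" + path       # stand-in for an RPC

venus = Venus()
kernel = CodaKernelModule(venus)
print(kernel.lookup("/coda/tmp/foo"))
```

A second `kernel.lookup("/coda/tmp/foo")` would be answered from the kernel module's own cache without waking Venus at all, which is exactly why the module can stay minimalistic.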

The figure shows how a user program asks for service from the
kernel through a system call. The kernel passes the request up to
Venus by allowing Venus to read it from the character device
/dev/cfs0. Venus tries to answer the request by looking in its
cache, by asking servers, or possibly by declaring disconnection and
servicing it in disconnected mode. Disconnected mode kicks in when
there is no network connection to any server which has the files.
Typically this happens for laptops taken off the network or
during network failures. If servers fail, disconnected operation
can also come into action.
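The decision Venus makes for each request can be sketched as follows. This is a simplified model of the behavior just described, not Venus's real logic; the exception type and function names are invented.

```python
# Sketch of per-request servicing: answer from cache, try the
# servers, or declare disconnection. All names are illustrative.

class Disconnected(Exception):
    """Raised when no server is reachable and the file is not cached."""

def service_request(path, cache, servers):
    """Return file data, falling back to disconnected operation."""
    if path in cache:                  # cache hit: no network needed,
        return cache[path]             # works even when disconnected
    for rpc in servers:                # try each replica server
        try:
            data = rpc(path)
            cache[path] = data
            return data
        except ConnectionError:
            continue                   # this server is unreachable
    # No server reachable and no cached copy: the laptop is off the
    # network (or the servers are down) and the file was never cached.
    raise Disconnected(path)

def dead_server(path):
    raise ConnectionError

cache = {"/coda/tmp/foo": b"hello"}
# Served from the cache even though every server is unreachable:
print(service_request("/coda/tmp/foo", cache, [dead_server]))
```

The sketch makes the key property visible: a cached file remains usable through a network failure, and only an uncached file forces Venus to give up.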

When the kernel passes the open request to Venus for the
first time, Venus fetches the entire file from the servers, using
remote procedure calls to reach them. It then stores the
file as a container file in the cache area (currently
/usr/coda/venus.cache/). The file is now an ordinary file on the
local disk, and read/write operations to the file do not reach
Venus but are (almost) entirely handled by the local file system
(EXT2 for Linux). Coda read/write operations take place at the same
speed as those to local files. If the file is opened a second time,
it will not be fetched from the servers again, but the local copy
will be available for use immediately. Directory files (remember, a
directory is just a file) as well as all the attributes (ownership,
permissions and size) are all cached by Venus, and Venus allows
operations to proceed without contacting the server if the files
are present in the cache. If the file has been modified and it is
closed, Venus updates the servers by sending the new file. Other
operations which modify the file system, such as making
directories, removing files or directories and creating or removing
(symbolic) links are propagated to the servers also.
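The whole-file caching scheme in the paragraph above can be condensed into a sketch: open fetches the whole file into a local container once, reads and writes hit the local copy at local speed, and a modified file is sent back when it is closed. The class and helper names are invented for illustration and do not reflect Venus's actual code.

```python
# Sketch of whole-file caching with update-on-close. The fetch/store
# callables stand in for the RPCs Venus makes; everything else is an
# invented, simplified model.

class WholeFileCache:
    def __init__(self, fetch, store):
        self.fetch = fetch          # RPC: get whole file from servers
        self.store = store          # RPC: send whole file to servers
        self.containers = {}        # local container files, by path

    def open(self, path):
        if path not in self.containers:     # first open: fetch once
            self.containers[path] = {"data": self.fetch(path),
                                     "dirty": False}
        return self.containers[path]        # later opens served locally

    def write(self, path, data):
        f = self.open(path)
        f["data"] = data                    # handled by the local disk
        f["dirty"] = True                   # remember to update servers

    def close(self, path):
        f = self.containers[path]
        if f["dirty"]:                      # propagate only on close
            self.store(path, f["data"])
            f["dirty"] = False

fetched = []
def fetch(path):
    fetched.append(path)
    return b"old contents"

stored = {}
def store(path, data):
    stored[path] = data

c = WholeFileCache(fetch, store)
c.open("/coda/tmp/foo")
c.open("/coda/tmp/foo")             # second open: no server contact
c.write("/coda/tmp/foo", b"new contents")
c.close("/coda/tmp/foo")
```

Running this, the file is fetched exactly once despite two opens, and the servers see the new contents only at close time, which is the behavior the paragraph describes.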

So we see that Coda caches all the information it needs on
the client, and only informs the server of updates made to the file
system. Studies have confirmed that modifications are quite rare
compared to “read only” access to files, hence we have gone a
long way towards eliminating client-server communication. These
mechanisms to aggressively cache data were implemented in AFS and
DFS, but most other systems have more rudimentary caching. We will
see later how Coda keeps files consistent, but first pursue what
else one needs to support disconnected operation.
