Linux-HA is the oldest, best-known, best-tested, and most widely
written-about open source high-availability suite for Linux. For many
years it was limited to two nodes and did not monitor services for
correct operation.

This talk will give an overview of release 2, explaining its new
features and how to configure them, and provide an outlook on features
coming in the near future.

About the author:

Alan Robertson founded the High-Availability for Linux project where
he has been an active developer, architect and project leader since
about 1997. He maintains the Linux-HA project web site at
http://linux-ha.org/, and has been a
key developer for the open source heartbeat program. In the open
source world, he worked for SuSE for a year, then joined IBM's Linux
Technology Center in March 2001. Alan also jointly leads the Open
Cluster Framework effort to define standard APIs for clustering and
to provide an open source reference implementation of these APIs.

Before joining SuSE, he was a Distinguished Member of Technical Staff
for Bell Labs. He worked for Bell Labs 21 years in a variety of
roles. These included developing telecommunication products, designing
communication controllers and providing leading-edge computing
support.

He obtained an MS in Computer Science from Oklahoma State University
in 1978 and a BS in Electrical Engineering from OSU in 1976.

The netpoll API provides a framework for implementing kernel UDP
clients and servers that operate outside of the Linux kernel's network
stack. Because it does not use the network stack, netpoll is able to
send and receive packets in situations where normal packet delivery
would not be possible. An example of this is when the system is
quiesced for debugging or when taking a crash dump.

Netpoll requires each underlying device driver to implement a
poll_controller hook. The contents of this hook are essentially the
same across all drivers, but will vary slightly depending on whether a
particular driver is written to use the New API (NAPI). The one
exception to this rule is the bonding driver. Because the bonding
driver is a virtual device driver, ushering traffic to real devices
based on policy, it requires further hooks into the netpoll code to
send and receive packets.
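The shape of such a hook can be sketched in plain C. In the sketch below the kernel types and helpers (struct net_device, disable_irq(), the interrupt handler) are replaced by minimal stand-ins so the example is self-contained; in a real non-NAPI driver the poll_controller body is essentially just the three lines of the last function, written against the actual kernel API.

```c
/* Minimal stand-ins for kernel types and helpers (illustration only). */
struct net_device { int irq; int rx_pending; int rx_reaped; };

static int irq_enabled = 1;
static void disable_irq(int irq) { (void)irq; irq_enabled = 0; }
static void enable_irq(int irq)  { (void)irq; irq_enabled = 1; }

/* Stand-in for the driver's real interrupt handler: reap pending RX. */
static void mydrv_interrupt(int irq, struct net_device *dev)
{
    (void)irq;
    dev->rx_reaped += dev->rx_pending;
    dev->rx_pending = 0;
}

/*
 * The poll_controller hook: netpoll calls this when normal interrupt
 * delivery is unavailable, so the driver masks its IRQ and runs the
 * interrupt handler by hand to push and pull packets.
 */
static void mydrv_poll_controller(struct net_device *dev)
{
    disable_irq(dev->irq);
    mydrv_interrupt(dev->irq, dev);
    enable_irq(dev->irq);
}
```

A NAPI driver's hook looks much the same, except that it schedules or directly invokes the driver's poll routine instead of the interrupt handler.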

This paper explores the design and implementation of the netpoll API.
A necessary primer on relevant portions of the network driver
interface is presented. A comparison is made between netpoll and the
low level networking code that it emulates. The changes made to the
core network stack to accommodate netpoll are also explained. Using
the information presented, the netconsole implementation is extended
to support reading input from a remote server.

About the author:

Jeff Moyer is a senior software engineer at
Red Hat, Inc., who has been
using Linux since 1995. In his formative years, he worked on high
performance cluster computing infrastructure at Worcester Polytechnic
Institute. He then went on to implement high availability cluster
software such as Kimberlite, Convolo Dataguard, Convolo Netguard, and
other solutions in the embedded device space. Jeff has since moved on to a
mixed bag of hacking, including the Linux automounter, the netpoll API, Red
Hat's netdump utility, and the AIO subsystem.

DRBD is a well-established Linux software component for building HA
(high availability) clusters out of common off-the-shelf hardware.
DRBD's key point is that it replaces a shared storage system with
online mirroring. In the presentation and the paper we will describe
DRBD 8's new features, which are a major step forward for
shared-nothing HA systems.

The most outstanding new feature is certainly our new "shared disk"
mode, i.e. support for shared-disk file systems and other shared
storage-aware applications.

Provided that you use one of OpenGFS, GFS, or OCFS2, this means that
applications have read and write access to the mirrored dataset on
both nodes at the same time, since you can mount the storage on both
peers.
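In DRBD 8 this mode is enabled per resource in the configuration file; a rough sketch (host names, devices, and addresses are invented, and the option names should be checked against the DRBD 8 documentation):

```
resource r0 {
    protocol C;
    net {
        allow-two-primaries;    # the new "shared disk" mode
    }
    on alpha {
        device    /dev/drbd0;
        disk      /dev/sda7;
        address   10.0.0.1:7788;
        meta-disk internal;
    }
    on bravo {
        device    /dev/drbd0;
        disk      /dev/sda7;
        address   10.0.0.2:7788;
        meta-disk internal;
    }
}
```

With both peers primary, a cluster file system such as OCFS2 can then be mounted on both nodes at once.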

Although DRBD has been widely used and under active development for
five years now, it turned out that the method of determining the node
with the up-to-date data was suboptimal. A new scheme based on UUIDs
will be presented. This new scheme also works correctly in multi-node
setups.

About the author:

Philipp graduated from the Vienna University of Technology in computer
science in 2000. Since November 2001 he has been managing director at
LINBIT, a provider of professional
Linux services with a focus on high availability clustering.

Lars-Gunnar Ellenberg joined DRBD development in 2002. Since then he
has become a co-author, and is now employed at LINBIT.

This talk will give an overview of the status of IPv6 implementation
in the Linux kernel, the C libraries, and client/server networking
programs.

The kernel status will be shown with information about the history and
ongoing work. In addition, the status of IPv6 firewalling, which is
important nowadays, will be given.

The status of the C libraries concerns DNS resolver and RPC support.

The status of client/server networking programs will be illustrated
with examples of applications that already support IPv6, either
natively or through additional patches.

Next, some examples will be given of how to enable IPv6 in Linux, get
IPv6 connectivity, and set up a permanent configuration. Afterwards,
some examples will show how easy it is to enable IPv6 in applications
(clients and servers).
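As a hint of what such application-side changes look like, here is a small C sketch using the protocol-independent getaddrinfo(3) interface, which handles IPv4 and IPv6 through a single code path (the helper function and its name are invented for illustration):

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/*
 * Resolve host/port into a list of socket addresses. With ai_family
 * set to AF_UNSPEC the same code handles IPv4 and IPv6 transparently.
 * Returns the number of addresses found, or -1 on error.
 */
int count_addresses(const char *host, const char *port)
{
    struct addrinfo hints, *res, *rp;
    int n = 0;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;      /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_NUMERICHOST;  /* no DNS needed for this demo */

    if (getaddrinfo(host, port, &hints, &res) != 0)
        return -1;
    for (rp = res; rp != NULL; rp = rp->ai_next)
        n++;
    freeaddrinfo(res);
    return n;
}
```

A real client would iterate over the returned list, calling socket() and connect() with each entry's ai_family, ai_socktype, and ai_addr until one succeeds, so the same binary works over IPv4 and IPv6 without any address-family-specific code.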

Last but not least, a summary and outlook will give additional
information on what to expect in the future, along with pointers to
further informational resources.

Csync2 is a cluster synchronization tool. It can be used to keep files
on multiple hosts in a cluster in sync. Csync2 can handle complex
setups with far more than just two hosts, handle file deletions, and
detect conflicts. It is well suited for HA clusters, HPC clusters,
COWs, and server farms.

Usually, the job csync2 does is seen as a trivial task, and so it is
most often solved with little shell scripts and tools such as rsync
or scp. But those solutions do not address the three most difficult
issues in synchronizing files in a cluster:

1. Conflict detection

The trivial rsync-based method of syncing files does not detect a
conflict when a file has been modified on both hosts. Usually there is
also no simple way of replicating file removals.

2. Complex setups

The two-node scenario is very simple but not very realistic. Usually
there are various intersecting groups of servers, with some files
replicated between all hosts and other files only between a smaller
set of servers.

3. Reacting to updates

In many cases it is not sufficient to simply replicate files. Instead,
it may be necessary to execute arbitrary commands when files matching
a pattern are updated. For example, in a web server cluster the
Apache configuration files should be synchronized between the servers,
and an 'apachectl graceful' should be executed whenever the Apache
configuration has changed.
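A csync2 configuration for this web-server example might look roughly like the following sketch (host names and paths are invented; see the csync2 documentation for the exact syntax):

```
group websrv
{
        host web1 web2 web3;
        key /etc/csync2.key;

        include /etc/apache;

        action
        {
                pattern /etc/apache/*;
                exec "/usr/sbin/apachectl graceful";
        }
}
```

The group lists the participating hosts and replicated paths, and the action block ties the pattern to the command to run after a matching file changes.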

Csync2 addresses these (and many other) issues and so provides an important
foundation for professional Linux clustering.

I have developed csync2 as a complement to DRBD (which has also been
developed at LINBIT Information Technologies). While DRBD does
synchronous replication of block devices between two nodes for
fail-over clusters and is typically used for, e.g., databases, csync2
does asynchronous replication of files between many nodes for all
kinds of clusters and is typically used for configuration files and
application images.

Every new computer generation increases the amount of hardware that can
be connected and disconnected at any time on a running system.
This requires a migration from static system setups to an event-driven
model that dynamically adapts to the changing system environment.

The "Hotplug" Subsystem

The traditional Linux hotplug subsystem consists of a set of shell
scripts, called agents. If the kernel detects a new device, or if a
device is removed, it forks an event-handling program which dispatches
the event to one of the agents. The agent triggers the loading of the
appropriate kernel module and possibly configures the device.

Restrictions

However, with full hotplug event support for all registered devices,
this mechanism shows its age and has some drawbacks. With the ever
growing number of dynamic devices, the execution of the hotplug
dispatcher can lead to serious problems. Events may arrive
out of order, and the system may be left in the state the last event
signified instead of representing the actual state of the kernel.
Modern power management and suspend/resume support interacts very
badly with kernel-forked helper processes while it tries to change the
system's power state. A very large number of event processes can lead
to memory shortage or even a machine freeze. The script-based system
is known to be time and resource consuming, resulting in long
processing times for a single event.

Integrated Device Management System

This situation can be vastly improved by integrating the hotplug event
processing into the udev event handling.

The primary goal of udev is to keep the device nodes in /dev in sync
with the currently known kernel-devices. To create a device node udev
receives an uevent from the kernel and matches the event properties
with a set of rules to decide if and how a resulting device node is
created and named. As udev already has the capability to execute
arbitrary programs for any kernel event, it is possible to design a
hotplug subsystem integrated into udev. Such a system has far
finer-grained control over the actions to be taken for a specific
device than the traditional Linux hotplug subsystem ever could.
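Such rules pair match keys against uevent properties with actions to take; a single rule line can then replace an entire hotplug agent. A sketch (the rules file name and script path are invented, and the exact key syntax should be checked against the udev version in use):

```
# e.g. /etc/udev/rules.d/90-net-hotplug.rules
ACTION=="add", SUBSYSTEM=="net", KERNEL=="eth*", RUN+="/sbin/net-hotplug.sh %k"
```

Here udev matches the "add" uevent for any eth* network interface and queues the named script, with %k substituted by the kernel device name.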

This paper presents a fully integrated udev-based device management
system; a performance comparison against the traditional hotplug subsystem
is given.

About the author:

Studied physics, with a main focus on image processing, in Heidelberg
from 1990 until 1997, followed by a PhD at Edinburgh's Heriot-Watt
University. Worked as a sysadmin during the studies, mainly at the
Mathematical Institute in Heidelberg.

Linux addict since the earliest days (0.95); various patches to get
Linux up and running. Now working for SUSE
Linux Products GmbH to support IBM's S/390 architecture on Linux. Main
points of interest are SCSI, multipathing, udev and device configuration.
And S/390, naturally. Plus the odd obscure hardware like DEC Alpha, for an ever
decreasing number of machines.

High Availability and Load Balancing Cluster for Linux Terminal Services by Wolfgang Büch

Three years ago the Regional Computer Center (RRZ) of the University
of Hamburg started a project to migrate Windows-based computers to
diskless Linux clients. Today over 250 diskless Linux clients, some
of them several kilometers away, are managed in a WAN from a central
point of administration in the RRZ. The deployment of Linux thin
clients saved over 70% of primary investment compared to traditional
technical solutions.

The concept and all modifications developed at the RRZ are mainly
derived from three linux projects:

The implementation of the "Linux Terminal Server Project".

Linux cluster nodes based on the "Linux Virtual Server (LVS)"
project act as load balancer and HA solution for all the services
needed by thin clients.

The OpenLDAP project provides a centralized, redundantly replicated
database for all network information used by the terminal servers
and/or thin clients.

In this paper we will describe the modification of programs like LTSP
and the general cluster configuration.

Clustering of terminal services can be achieved by eliminating all
dependencies among the cluster nodes. Therefore all services such as
DHCP, DNS, and LTSP were modified to share a unique, replicated LDAP
database, which provides information such as the IP address and
hostname needed by a service, e.g. DHCP.

The load balancing is implemented according to LVS. One redundant node
acts as an LVS director which distributes the requests for services to
the available cluster nodes acting as terminal servers. As all
relevant network information is stored in a replicated LDAP database
on each cluster node, these terminal servers are able to reply to
every service request from any thin client. This means that the
terminal system as a whole acts as a load balancing and high
availability Linux cluster for terminal services.
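On the director, such an LVS service is typically set up with the ipvsadm tool; a rough sketch (the virtual and real IP addresses, port, and scheduler choice are invented for illustration, and the commands require root):

```
# Virtual terminal service on the director, weighted least-connection
ipvsadm -A -t 10.0.0.100:22 -s wlc
# Real servers (the terminal server cluster nodes), direct routing
ipvsadm -a -t 10.0.0.100:22 -r 10.0.0.11 -g
ipvsadm -a -t 10.0.0.100:22 -r 10.0.0.12 -g
```

Thin clients connect to the virtual address, and the director spreads their sessions across whichever terminal servers are currently available.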

Device-mapper, the new Linux 2.6 kernel generic device-mapping
facility, is capable of mapping block devices in various ways (e.g.
linear, striped, mirrored). The mappings are implemented in runtime
loadable plugins called mapping targets.

These mappings can be used to support arbitrary software RAID
solutions on Linux 2.6, such as ATARAID, without the need for a
special low-level driver, as was required with Linux 2.4. This avoids
code redundancy and reduces error rates.

The dmraid application is capable of creating these mappings for a
variety of ATARAID solutions (e.g. Highpoint, NVidia, Promise,
VIA). It uses an
abstracted representation of RAID devices and RAID sets internally to
keep properties such as paths, sizes, offsets into devices and layout
types (e.g. RAID0). RAID sets can be of arbitrary hierarchical depth
in order to reflect more complex RAID configurations such as RAID10.

Because the vendor-specific metadata formats stored on ATA devices by
the various ATARAID BIOSes all differ, metadata format handlers are
used to translate between the on-disk representation and the internal
abstracted format.

The mapping tables which need to be loaded into device-mapper managed
devices are derived from the internal abstracted format.
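For example, a two-disk RAID0 set discovered by dmraid boils down to a single "striped" table line handed to device-mapper (the sizes, chunk size, and device paths below are invented for illustration):

```
# start length    target  #stripes chunk dev       offset dev       offset
  0     312499968 striped 2        128   /dev/sda  0      /dev/sdb  0
# loaded with: dmsetup create <setname>   (table read from stdin)
```

All sizes are in 512-byte sectors; device-mapper then exposes the striped set as a single block device under /dev/mapper.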

My talk will give a device-mapper architecture/feature overview and
elaborate on the dmraid architecture and how it uses the device-mapper
features to enable access to ATARAID devices.

About the author:

Heinz Mauelshagen is the original author of the Linux Logical Volume
Manager. He develops and consults in various areas of storage
management and has been employed by Red Hat GmbH since the
acquisition of Sistina Inc. in early 2004.

Xen is an open source virtualization project, maybe the most important
one these days. Initially it was created by the Computer Laboratory of
the University of Cambridge, but development is now done by a much
larger community.

This talk will cover the technical aspects of Xen. After giving a
short overview of the history of the project and Xen's features,
I'll introduce the concept of paravirtualization used by Xen, talk
about memory management, and explain how hardware device access
and virtual devices are handled in Xen.

About the author:

Gerd Knorr has worked in various areas of the Linux kernel, including
but not limited to maintenance of the video4linux subsystem and some
v4l drivers over the last years. His current main focus is
virtualization, mostly Xen but also UML. He is a member of the SUSE
Labs.

Numerous countries around the globe are in the process of introducing
passports with biometric information stored on RFID chips, so-called
ICAO MRTDs (Machine Readable Travel Documents). The German
authorities coincidentally plan to issue the first such passports at
the time of LK2005 in October 2005.

As part of the CCC (Chaos Computer Club) working group on biometric
passports, the author of this paper has followed the technical
development and standardization process very closely. In order to
gather first-hand experience with this new technology, he has
implemented a GPL-licensed, Linux-based RFID stack.

The stack includes a device driver for the common Philips CL RC632
reader chipset, an implementation of the ISO 14443-1, 2, 3 and 4
protocols, as well as an example "border control application" that is
able to read and verify information stored on an ICAO MRTD compliant
passport.

The paper gives a high-level introduction to the technical standards,
as well as a description of the "libmrtd" and "librfid" projects, and
includes a live demonstration with some passport samples.

About the author:

Harald Welte is the chairman of the netfilter/iptables core team.

His main interest in computing has always been networking. In the
little time left besides netfilter/iptables-related work, he writes
obscure documents like the "UUCP over SSL HOWTO" or "A packet's
journey through the Linux network stack". Other kernel-related
projects he has contributed to are random networking hacks, some
device driver work, and the neighbour cache.

He has been working as an independent IT consultant on projects for
various companies, ranging from banks to manufacturers of networking
gear. During 2001 he lived in Curitiba (Brazil), where his
Linux-related work was sponsored by Conectiva Inc.

Since February 2002, Harald has been contracted part-time by Astaro
AG, who sponsor his current netfilter/iptables work. Aside from the
Astaro sponsorship, he continues to work as a freelance kernel
developer and network security consultant.

He licenses his software under the terms of the GNU GPL. Sometimes
users of his software are not compliant with the license, so he
started enforcing the GPL with his
gpl-violations.org project.

During the last year, Harald has started development of a free,
GPL-licensed Linux RFID and electronic passport software suite.

While consolidating physical machines to virtual machines using Xen,
we wanted to be able to deploy and manage virtual machines in the same
way we manage and deploy physical machines. For operators and support
people there should be no difference between virtual and physical
installations.

Integrating virtual machines with the rest of the infrastructure
should have a low impact on the existing infrastructure. Typically,
virtual machine vendors have their own tools to deploy and manage
virtual machines. Apart from the vendor lock-in to that specific
virtual machine platform, this requires administrators to learn yet
another platform, something we wanted to prevent.

This paper discusses how we integrated SystemImager with Xen, thus
creating a totally open source deployment framework for the popular
open source virtual machine monitor. We'll document the development of
our tools and go into more depth on other infrastructure-related
issues encountered when using Xen.

System imaging environments in combination with virtual machines can
also be used to ensure safe production deployments. By saving your
current production image before updating to your new production image,
you have a highly reliable contingency mechanism: if the new
production environment is found to be flawed, you simply roll back to
the last production image on the virtual machines with a single update
command.

Xen has become one of the most popular virtualisation platforms during
the last nine months. Although not such a young project, it is now
gaining acceptance in the corporate world as a valuable alternative to
VMware.

About the author:

Kris Buytaert is a Linux and Open Source Consultant operating in the
Benelux. He has consulting and development experience with multiple
enterprise level clients and government agencies. He is a contributor
to the Linux Documentation Project and author of different technical
publications. Kris is maintainer of the openMosix HOWTO.

Getting printing to work properly is a rather complicated task in the
administration of a GNU/Linux or Unix system, especially when one
wants to make use of all the capabilities of a modern printer. One
needs a printer spooler which collects the print jobs from
applications and network clients, filters to convert non-PostScript
jobs to PostScript, and a printer driver which translates PostScript
into the printer's native language.

All this is not trivial: first, one needs a printer for which a
driver, or enough knowledge about its language, exists; then one has
to make the printing system call the correct filters with their long,
cryptic, and often poorly documented command lines, and give the user
the possibility to control the capabilities of the printer.

To improve this situation, Grant Taylor, the author of the former
Printing-HOWTO, set up a database of information about free software
printer drivers, as well as about printers and how they are supported
with free software. This database, called Foomatic, is located at
http://www.linuxprinting.org/ and is currently maintained by Till
Kamppeter. The database now lists nearly 250 free software printer
drivers and more than 1600 printers.

The database is implemented in XML and is accompanied by a universal,
PPD (PostScript Printer Description)-based print filter
("foomatic-rip") and Perl scripts which automatically create
Adobe-compliant PPD files and even complete print queues for all known
free spoolers: CUPS, LPRng, LPD, GNUlpr, PPR, CPS, PDQ, and
spooler-less printing. With these queues the user has access to
the full functionality of the printer driver in use, and thanks to the
PPD files he can even use all the printer options from applications
(such as OpenOffice) or from Windows/Mac clients.
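For instance, a PPD for a given printer/driver pair can be generated from the database with the foomatic tools; a sketch (the printer and driver identifiers and queue details are example values, to be replaced with entries from the database):

```
# Generate an Adobe-compliant PPD from the Foomatic database
foomatic-ppdfile -p HP-LaserJet_4 -d ljet4 > laserjet4.ppd
# Or let foomatic set up a complete CUPS queue directly
foomatic-configure -s cups -n lj4 -c /dev/lp0 -p HP-LaserJet_4 -d ljet4
```

The generated PPD carries all the driver's options, so any PPD-aware client can expose them to the user.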

The system is used by the printer setup tools of most GNU/Linux
distributions, such as Mandriva, Red Hat/Fedora, and SuSE, and several
printer manufacturers are contributing to the database. For PostScript
printers some manufacturers (HP, Ricoh, Epson, Kyocera, ...) even
release the official PPD files which are part of their Windows/Mac
software as free software and post them on linuxprinting.org. So it
has turned into an unofficial standard, and around 10,000 people visit
the linuxprinting.org web site every day.

With its database and its static pages, linuxprinting.org is the
biggest knowledge base about printing with free software. To make it
easier for users and printer manufacturers to add even more knowledge,
it is planned to manage the site content with MediaWiki, the wiki
system successfully used by Wikipedia. Static pages and discussion
forums will then be replaced by the wiki system, and the wiki will
also serve as the input frontend for new printers.

The talk will cover

how linuxprinting.org evolved

what linuxprinting.org provides

how the database is structured

how PPD files and printer queues are generated with it

how the development of linuxprinting.org will go on

perhaps first experiences with the linuxprinting.org wiki

This talk is aimed at system administrators and technically interested
users who want to know what happens "behind the scenes".

About the author:

Till Kamppeter holds a PhD in theoretical physics. While doing his
PhD he was the system administrator for Unix and GNU/Linux in the
physics department. As a system administrator he came to free
software, with contributions to X-CD-Roast and later XPP as his first
own project. XPP led him to Mandriva in Paris in August 2000, where he
is responsible for printing and digital imaging in Mandriva Linux.

His main project now is maintaining the linuxprinting.org web site
with its printer database and the Foomatic software. He has improved
this system substantially, and currently it is the standard for
printer driver integration in most major GNU/Linux distributions. He
is also a member of the Open Printing group of FreeStandards.org.

He has given many talks and tutorials and organized booths at
free-software-related events. In addition, he organized the Printing
Summits at the Libre Software Meetings 2004 and 2005 in France.

He has also written several articles about free software in German and
Brazilian magazines.

Security is the fastest growing segment of the IT industry. On the one
hand, companies spend a lot of money on firewalls, virus scanners and
spam filters. On the other hand, confidential information in emails is
still sent in clear text, and therefore unprotected, 99% of the time.

Although it is obvious to everyone today that emails should be
encrypted and signed in order to safeguard privacy and confidential
information, client-side encryption and signature have still not caught
on.

This is certainly due in part to the subject matter (cryptography), but
also to usability aspects.

That was the motivation for freenigma,
a joint research project of g10 Code and
freiheit.com technologies, to begin
developing a central proxy server for encrypting and signing emails
based on GnuPG/GPGME and the OpenPGP and S/MIME standards. The goal
was to create a server system that is under a free licence (GPL) and
can be used in both private and company settings.

The presentation will provide an insight into the architecture and
functionality of freenigma and, using practical examples, will
demonstrate setup, configuration and use.

About the authors:

Stefan Richter, founder and managing partner of freiheit.com
technologies, was born in 1966. He holds degrees in computer science
(Dipl.-Inf.) and engineering (Dipl.-Ing.) and has been programming
computers for more than 22 years. After various positions, for
example in the development of scientific software in the fields of
meteorology and oceanography at the Alfred Wegener Institute for Polar
and Marine Research, and in applied research for the aviation and
aerospace industries as well as the military at the Institute for
Applied Systems Technology Bremen GmbH (ATB), he has now been working
in commercial software development for 15 years. In his free time, he
is a volunteer for the Free Software Foundation Europe, where he is
involved with free software and digital civil rights.

Werner Koch, born 1961, has been a radio amateur since the late
seventies and became interested in software development at about the
same time. He has worked on systems ranging from CP/M systems to
mainframes, languages from assembler to Smalltalk, and applications
from drivers to financial analysis systems. He is a long-time
GNU/Linux user, the principal author of the GNU Privacy Guard, and a
founding member of the FSF Europe. In 2001 he founded g10 Code, a
company specialized in the development of Free Software based security
applications.

Tick-less Idle CPUs for Virtualization and Power Management by Srivatsa Vaddagiri

Traditionally, operating systems have used a periodic timer as a heart
beat to keep track of time, which is needed for activities like
scheduling and accounting, as well as for implementing timers required
by user applications and the OS itself. The Linux kernel uses a
periodic timer on every CPU as this heart beat. The frequency of this
timer tick varies from implementation to implementation. While the
Linux kernel earlier used a frequency of 100 ticks/second, more recent
distributions of the kernel use a frequency of 1000 ticks/second.

The overhead of such a housekeeping timer, however, becomes prominent
when a CPU is idle and has no immediate housekeeping needs. In a
virtualized environment, such a timer can reduce the amount of
physical CPU time available to a busy partition by consuming physical
CPU cycles in an idle partition. Such a timer also prevents idle CPUs
from going into low-power states for long periods. It has, therefore,
become necessary to find a way to avoid the periodic ticks under some
circumstances.

Avoiding the ticks like this, however, poses a number of
challenges. Since many kernel subsystems (like RCU, the scheduler,
timers, and accounting) rely on these ticks, those subsystems need to
be modified to deal with the lack of a periodic timer tick. The wall
time can stop getting updated when all CPUs become idle, and
consequently it has to be recovered upon the resumption of any
CPU. Also, there are various short timers (like the slab reap timer)
in use by the kernel, which can restrict how long idle CPUs can switch
off timer ticks. This paper will look at the implications of tick-less
idle CPUs in various kernel subsystems and how these subsystems need
to be modified to deal with them. The paper also presents the results
of a number of tests on virtualized platforms with tick-less idle
CPUs.
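The core idea can be pictured with a toy function: instead of waking HZ times a second, an idle CPU computes the distance to its earliest pending timer and sleeps that long. This is illustrative C only, not kernel code (the function and its interface are invented; it ignores jiffy wrap-around and all the subsystem interactions the paper discusses):

```c
#include <limits.h>

/*
 * Toy model of a tick-less idle CPU: given the current time in
 * jiffies and the pending timer expiries, return how many jiffies the
 * CPU may sleep before the next event, or ULONG_MAX if no timer is
 * pending (i.e. it may sleep indefinitely).
 */
unsigned long next_wakeup_delta(unsigned long now,
                                const unsigned long *expiry, int n)
{
    unsigned long best = ULONG_MAX;
    int i;

    for (i = 0; i < n; i++) {
        if (expiry[i] > now && expiry[i] - now < best)
            best = expiry[i] - now;
    }
    return best;
}
```

In the real kernel the "pending expiries" come from the per-CPU timer wheel, and subsystems like RCU must additionally be taught that some CPUs may not tick at all.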

About the author:

Srivatsa Vaddagiri is a Linux kernel hacker working at IBM's Linux
Technology Center. He has been with IBM for nine years now, focusing
mainly on Unix-related technologies. Some of his most important
contributions are AIX on IA64, Linux on a handheld, CPU hotplug in
Linux, and lock-free socket hash table lookup. Currently he is looking
at making the Linux kernel go tickless on idle CPUs.

Standardizing the Penguin: a Progress Report to the Community by Mats Wichmann

The Linux Standard Base (LSB) project has been evolving for a number
of years as an open-source effort to standardise the core
functionality of GNU/Linux systems. The concept is to remove
incompatibility from parts of the system where there's no real
value-add by being different, leaving the rest of the space for
innovation; and to make life easier for developers by providing a
dependable base they can code to. Community-driven consensus standards
are slow to evolve, but the LSB core is now very stable (and is in
fact a pending ISO standard) and usable. The next major release will
pull in many new capabilities, including desktop support, some work
towards manageability and security interfaces, better developer tools,
a new edition of the LSB Book, and more. This paper and talk serve as
a report to the community on its standardization project: looking at
the road that lies ahead and some of the tools and tests that support
the standard; providing a forum for input into future directions that
would help make the LSB an even more useful standard for developers;
and reviewing how the community can contribute.

About the author:

Mats Wichmann has been kicking around first the UNIX, then the Linux /
open source world for rather a long time, until finding a home with
the LSB project at Intel. At Intel Corporation he is the Linux
Standards Architect with the Open Source Technology Center. Mats has
been a developer with the LSB project since 2001 and was elected LSB
Chairman in January 2004, a role he still holds. He has also worked as
a consultant, trainer, and courseware developer. He has past
standards/ABI experience from the MIPS ABI Group, where he worked as
technical director, and is an Austin Group and IEEE Standards
Association member. Mats is co-author of the book Building
Applications with the Linux Standard Base.

Compared to the "consoles" found on traditional Unix workstations and
mini-computers, the Linux boot process is feature-poor, and the
addition of new functionality to boot loaders often results in massive
code duplication. With the availability of kexec, this situation can
be improved.

kboot is a proof-of-concept implementation of a Linux boot loader
based on kexec. kboot uses a boot loader like LILO or GRUB to load a
regular Linux kernel as its first stage. Then, the full capabilities
of the kernel can be used to locate and to access the kernel to be
booted, perform limited diagnostics and repair, etc.
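From the first-stage kernel, handing control to the target kernel is a matter of two kexec invocations; a sketch (the kernel paths and command line are invented, and the commands require root):

```
# Load the target kernel and initrd into memory...
kexec -l /boot/vmlinuz --initrd=/boot/initrd.img --append="root=/dev/sda1 ro"
# ...then jump into it, skipping firmware and first-stage boot loader
kexec -e
```

Everything between loading and executing, such as picking the kernel, running diagnostics, or repairing a file system, can use the full userland of the first-stage system.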

kboot integrates the various components needed for a fully featured
boot loader, and demonstrates their use. While the main focus is on
core technical functionality, kboot can serve as a starting point for
customized boot environments offering additional features. kboot can
also be used as a platform for exploring architectural enhancements,
such as pre-loading of device scan results to accelerate the boot
process.

About the author:

Werner Almesberger got hooked on Linux in the days of the 0.12 kernel,
when studying computer science at ETH Zurich, and has been hacking the
kernel and related infrastructure components ever since, both as a
recreational activity, and as part of his work, first during his PhD
in communications at EPF Lausanne, and later also in industry. A
true Linux devotee, he moved closer to the home of the penguins in
2002, and now lives in Buenos Aires, Argentina.

Contributions to Linux include the LILO boot loader, the initial RAM
disk (initrd), the MS-DOS file system, much of the ATM code, the tcng
traffic control configurator, and the UML-based simulator umlsim.

Hacking the Linux Automounter: Current Limitations and Future Directions by Jeffrey Moyer

The IT industry is experiencing a great move from proprietary
operating systems to Linux. As a result, the features and
functionality that customers have come to expect of these systems now
must be provided for on Linux.

Many large scale enterprise deployments include an automounter
implementation. The automounter provides a mechanism for
automatically mounting file systems upon access, and unmounting them
when they are no longer referenced. It turns out that the Linux
automounter is not feature-complete, and there are cases where it is
simply incompatible with the implementations of other, proprietary
vendors.

This paper takes a look at the common problems in large scale Linux
autofs deployments, and offers solutions to these problems.

In order to solve the current automounter limitations, we must start
with an understanding of how things work today. To this end, we will
explain some basic information about the automounter, such as how to
configure an autofs client machine. We will walk through the code for
basic operations such as the mounting, or lookup, of a directory and
the unmounting, or expiry, of a directory. Through this, we will see
where autofs fits into the VFS layer.
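As a rough illustration of such a client configuration, an indirect map might look like the following (the paths and the server name are made up for the example):

```
# /etc/auto.master: mount points and the maps that serve them
/misc   /etc/auto.misc  --timeout=60

# /etc/auto.misc: key -> mount options and location
cd      -fstype=iso9660,ro    :/dev/cdrom
data    -rw,soft              fileserver:/export/data
```

Accessing /misc/data then triggers an NFS mount of fileserver:/export/data, which is expired again after 60 idle seconds.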

With a picture of the landscape in place, we take a look at major
issues facing customer deployments. Currently, there are two main
pain points. The first is that the Linux automounter implements
direct maps in a way which is incompatible with that of every other
implementation. We will discuss the desired behavior and compare it
with that of the Linux automounter. We will then look at ways to
overcome this incompatibility by extending the autofs kernel
interface.

The second major pain point surrounds the use of multi-mount entries
for the /net, or -hosts maps. Because of the nature of multi-mount
maps, the Linux implementation mounts and unmounts these directory
hierarchies as a single unit. As such, clients mounting several big
filers can experience resource starvation, causing failed mounts. We
will look at this problem from several different levels. We start at
the root of the problem, and show how the kernel, glibc, and automount
can be modified to address the issue.

We conclude with future directions for the automounter.

About the authors:

Jeff Moyer is a senior software engineer at Red Hat, Inc., who has
been using Linux since 1995. In his formative years, he worked on
high performance cluster computing infrastructure at Worcester
Polytechnic Institute. He then went on to implement high availability
cluster software such as Kimberlite, Convolo Dataguard, Convolo
Netguard, and other solutions in the embedded device space. Jeff has
since moved on to a mixed bag of hacking, including the Linux
automounter, the netpoll API, Red Hat's netdump utility, and the AIO
subsystem.

The paper is co-authored by Ian Kent, who obtained a degree in
mathematics and computer science in 1983 and has worked in the
computing industry ever since. He spent the first five years doing
software development and infrastructure work. Since then he has worked
mostly in infrastructure, although he has always had software
development projects of some sort as a sideline.

He has been using Linux, in one way or another, since 1994. Needing an
automounter in most of the environments he worked in led him to work
on autofs in Linux. After customising it for one site, he took over
maintenance of the version 4 code base, which he has now been
maintaining for about three years.

Extending Kprobes to Support User-Space Application Instrumentation by Prasanna Panchamukhi

Extensive use of Linux in the enterprise world creates the need for
tools that can analyse production systems non-disruptively. Kprobes
provides such a simple, lightweight interface for analysing the Linux
kernel with minimal disruption, preserving 24x7 availability. One can
write a loadable kernel module that uses the kprobes facilities to
trace the kernel. User-space tools are also being developed that use
the kprobes feature underneath to instrument the Linux kernel
non-disruptively. SystemTap is one such user-space utility built on
top of the kprobes interface: with it, users write simple scripts,
insert probes into any kernel routine and get formatted trace data.
Trace data can be function arguments, function return values, stack
traces, global variables, etc. Now, with the kprobes feature in the
mainline kernel and with the development of tools like SystemTap, the
next step is to provide a user-space probe mechanism that can readily
be used to instrument user-space applications non-disruptively.
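As a flavour of the kernel-module interface mentioned above, a minimal kprobes module might look like the following sketch. The probed symbol is just an example, and the exact struct fields vary between kernel versions, so treat this as illustrative rather than authoritative:

```c
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/kprobes.h>

/* Probe point: an example symbol; any kernel text address works. */
static struct kprobe kp = {
    .symbol_name = "do_fork",
};

/* Runs just before the probed instruction executes. */
static int handler_pre(struct kprobe *p, struct pt_regs *regs)
{
    printk(KERN_INFO "probe hit at %p\n", p->addr);
    return 0;
}

static int __init probe_init(void)
{
    kp.pre_handler = handler_pre;
    return register_kprobe(&kp);
}

static void __exit probe_exit(void)
{
    unregister_kprobe(&kp);
}

module_init(probe_init);
module_exit(probe_exit);
MODULE_LICENSE("GPL");
```

Once the module is loaded, every call to the probed routine logs a line, without any modification or rebuild of the running kernel.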

This paper makes an extensive comparison of existing user-space
application instrumentation mechanisms, designs new interfaces to meet
the requirements, and demonstrates the use of the user-space probe
mechanism in the real world.

The paper describes the user-space instrumentation mechanism that can
be used to insert probes into user-space applications and collect
tracing data non-disruptively. It discusses the following topics in
detail:

A comparison of existing user-space instrumentation mechanisms such as
Dprobes and tools using Dyninst.

The current user-space probe mechanism, which may miss probes in a
symmetric multiprocessor (SMP) environment, and the modifications
required to make it efficient.

An improved user-space probe mechanism that avoids missed probes in
SMP environments, together with the various features user-space probes
provide, such as tracing function entry and exit points, multiple
handlers at the same address, etc.

Real-life examples showing the use of user-space instrumentation.

Prasanna currently works in the Reliability, Availability and
Serviceability tools group of IBM's Linux Technology Center. He is one
of the developers of Kprobes in Linux and of the SystemTap tool for
Linux, and is also involved in improving various probe and trace tools
for Linux. You can reach Prasanna at prasanna@in.ibm.com.

Unlike a traditional mount, which hides the contents of the mount
point, a union mount presents a single view as if the file systems
were merged together. Although only the top layer file system of the union
stack can be altered, it appears as if it is possible to delete or
modify anything. Files in the lower layers may be deleted with
whiteouts in the topmost layer. Modified files are automatically
copied into the topmost layer first. For a virtual file system (VFS)
based implementation, heavy changes to the VFS and some of the
low-level file systems are necessary. This includes modification of
lookup and directory reading operations as well as the introduction of
a persistent whiteout file type.

Union mounts make the implementation of some applications easier,
e.g. live CDs or source-tree management. In combination with an
execute-in-place file system, union mounts are used for efficient
software management on read-only file systems shared between Linux
z/VM guests.

About the author:

Jan Blunck studied electrical engineering at the Technische
Universität Hamburg-Harburg, specializing in computer engineering. In
2003, he studied abroad at the Nanyang Technological University,
Singapore. He wrote his diploma thesis, on a VFS-based implementation
of transparent file system mounts, also referred to as union mounts,
at the IBM Lab in Böblingen, Germany. His contributions to Linux
development range from a USENET newsreader through device drivers to
Linux VFS work.

This paper explores the topic of page migration within the Linux
kernel. Page migration is the act of moving data from one physical
page to another. This action should be transparent to users of the
data. The paper will explore an implementation of page migration. As
one would expect, this will touch on modifications and enhancements to
various pieces of the virtual memory subsystem.

After explaining the implementation of page migration, two known uses
of page migration are discussed. These are memory hotplug and process
migration on NUMA architectures. The paper will show how page
migration is used in each of these projects.

About the author:

Mike is a member of IBM's Linux Technology Center. He has been working
in OS design and development for the past 20+ years. Mike has made
numerous contributions to UNIX OSs in the areas of process management,
memory management, NUMA enablement, loadable kernel modules, shared
library support and multi-system shared device accessibility. He
started hacking on Linux in 2000.

Manufacturers like Samsung are positioning flash media as a
replacement for hard disks as mass storage media, and flash is gaining
strength in the marketplace, especially in the embedded area. While
bringing advantages in price, power consumption and reliability, flash
technology is distinctly different from hard disks and requires
special support.

Three differences are relevant to the design of filesystems. Flash
requires updates to occur out of place, while hard disks work well
with in-place updates. The lifetime of flash blocks is limited by the
number of write accesses to them. And flash blocks are substantially
larger than hard disk sectors, requiring blocks to be shared by
several filesystem blocks and to be garbage collected under space
pressure.

Two approaches exist to deal with these differences. One is to add an
abstraction layer that emulates hard disk behaviour, commonly known as
Flash Translation Layer (FTL, NFTL, INFTL, etc.). Any existing file
system can be used on top of this abstraction, with FAT being the most
common choice. The second approach is to create specialized file
systems for flashes, like JFFS2 or YAFFS.

While Flash Translation Layers facilitate the integration of flash
devices into existing systems, they also come with disadvantages.
File systems working above the translation layer are usually built
for hard disk drives. Empty space is never explicitly cleared, as that
would waste IO cycles and complicate the undelete operations that most
filesystems support. In combination with garbage collection (GC) in
the translation layer, this causes empty space to be recognized as
valid data by the translation layer. During GC, this "data" gets
written into empty blocks, reducing performance and medium lifetime.

Current flash filesystems, combining filesystem and translation
layers, have efficient garbage collection, as they know the state of
their content. But they are based on a log-structured design, which
does not easily map to the filesystem tree presented to users. JFFS2
and YAFFS both have to scan the medium at mount time and build a
partial filesystem tree in memory. Their drawback, hence, is increased
memory usage and long mount times. One of the authors has already seen
an empty JFFS2 filesystem take 15 minutes to mount.

To free ourselves from this uncomfortable position between a rock and
a hard place, a new filesystem design is presented. Requirements for
the new design are out-of-place updates of data combined with a tree
structure on the medium. Design work started around Easter 2005 and
has since convinced most MTD developers of its necessity.

About the author:

Jörn Engel has been working on embedded systems - most of them running
Linux - since 2001. Since then, he has written several MTD drivers,
added support for new hardware to JFFS2 and become an MTD fellow.

He is currently working for IBM in the development lab in Böblingen,
Germany, where he already completed his diploma thesis on Linux kernel
code quality. The "make checkstack" build target emerged from this
thesis and has since become a standard tool.

Robert Mertens is currently working on his PhD in computer science at
the University of Osnabrück, Germany. His primary interests are
eLearning and lecture recording systems. These days, he is primarily
occupied with his new-born daughter, Alexandra Maria.

First steps towards the next generation netfilter subsystem by Harald Welte

Until 2.6, every new kernel version came with its own incarnation of a
packet filter: ipfw, ipfwadm, ipchains, iptables. 2.6.x still had
iptables. What was wrong? Or was iptables good enough to last even
two generations?

In reality the netfilter project is working on gradually transforming
the existing framework into something new. Some of those changes are
transparent to the user, so they slip into a kernel release almost
unnoticed. However, for expert users and developers those changes are
noteworthy anyway.

Some other changes just extend the existing framework, so most users
again won't even notice them - they just don't take advantage of those
new features.

The 2.6.14 kernel release will mark a milestone, since it is scheduled
to contain nfnetlink, ctnetlink, nfnetlink_queue and nfnetlink_log -
basically a totally new netlink-based kernel/userspace interface for
most parts of the netfilter subsystem.
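To give a flavour of what the new interface enables: with the companion libnetfilter_queue library, packet verdicts can be taken entirely in userspace over nfnetlink_queue. The sketch below assumes queue number 0, an iptables rule such as "-j NFQUEUE" feeding it, and root privileges; error handling is omitted:

```c
#include <stdint.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <linux/netfilter.h>                      /* NF_ACCEPT */
#include <libnetfilter_queue/libnetfilter_queue.h>

/* Verdict callback: accept every packet after inspecting it. */
static int cb(struct nfq_q_handle *qh, struct nfgenmsg *nfmsg,
              struct nfq_data *nfa, void *data)
{
    struct nfqnl_msg_packet_hdr *ph = nfq_get_msg_packet_hdr(nfa);
    uint32_t id = ph ? ntohl(ph->packet_id) : 0;
    return nfq_set_verdict(qh, id, NF_ACCEPT, 0, NULL);
}

int main(void)
{
    char buf[4096];
    struct nfq_handle *h = nfq_open();
    struct nfq_q_handle *qh = nfq_create_queue(h, 0, &cb, NULL);

    nfq_set_mode(qh, NFQNL_COPY_PACKET, 0xffff);  /* full packet copies */

    int fd = nfq_fd(h);
    ssize_t n;
    while ((n = recv(fd, buf, sizeof(buf), 0)) >= 0)
        nfq_handle_packet(h, buf, n);             /* dispatches to cb() */

    nfq_destroy_queue(qh);
    nfq_close(h);
    return 0;
}
```

The same netlink transport underlies nfnetlink_log and ctnetlink, replacing the older ip_queue and /proc interfaces.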

nf_conntrack, a generic layer-3 independent connection tracking
subsystem, initially supporting IPv4 and IPv6, is also in the queue of
pending patches. Chances are high that it will be included in the
mainline kernel at the time this paper is presented at Linux Kongress.

Another new subsystem within the framework is the "ipset" filter,
basically an alternative to using iptables in certain areas.

The presentation will cover a timeline of recent advances in the
netfilter world, and describe each of the new features in detail. It
will also summarize the results of the annual netfilter development
workshop, which is scheduled just the week before Linux Kongress.

About the author:

Harald Welte is the chairman of the netfilter/iptables core team.

His main interest in computing has always been networking. In the
little time left besides netfilter/iptables-related work, he writes
obscure documents like the "UUCP over SSL HOWTO" or "A packet's
journey through the Linux network stack". Other kernel-related
projects he has contributed to include random networking hacks, some
device driver work and the neighbour cache.

He works as an independent IT consultant on projects for various
companies ranging from banks to manufacturers of networking gear.
During 2001 he lived in Curitiba (Brazil), where his Linux-related
work was sponsored by Conectiva Inc.

Since February 2002, Harald has been contracted part-time by Astaro
AG, who sponsor his current netfilter/iptables work. Aside from the
Astaro sponsorship, he continues to work as a freelance kernel
developer and network security consultant.

He licenses his software under the terms of the GNU GPL. Sometimes
users of his software are not compliant with the license, so he
started enforcing the GPL with his
gpl-violations.org project.

During the last year, Harald has started development of a free,
GPL-licensed Linux RFID and electronic passport software suite.

The libferris virtual filesystem has always sought to push the
boundaries of what a filesystem should do, in terms of both what can
be mounted and what metadata can be shown for files. Over the past 5
years it has extended from mounting more traditional things such as
tar.gz files, ssh, digital cameras and IPC primitives to mounting
various ISAM files (db4, tdb, edb, eet, gdbm), various relational
databases (ODBC, MySQL, PostgreSQL), various servers (HTTP, FTP, LDAP,
Evolution), and RDF graphs, as well as XML files and Sleepycat's
dbXML.

Recently, support for indexing filesystem data using any combination
of Lucene, ODBC, TSearch2, Xapian, LDAP and PostgreSQL has been added,
with the ability to query these backends for matching files. Matches
are naturally presented as a virtual filesystem.

To enable legacy clients to take advantage of libferris, a Samba VFS
module has been created, allowing parts of a libferris system to be
exported as Samba shares.

The talk will be about the things libferris can mount, the metadata it
offers, searching with libferris and finally how to export things as
Samba shares.

About the author:

I've been working on filesystem related code for the past 10+ years,
libferris for the last 5. I have collected various degrees including a
Bachelors and Masters in InfoTech.

I am currently undertaking a PhD on the application of Formal Concept
Analysis to Semantic File Systems to give a superior search and
interaction interface to one's filesystem.

The communities developing in the areas of virtualization and
simulation are in a state of flux; in the case of virtualization, a
renaissance is even being proclaimed. Yet there are only a few
approaches combining simulation at the network layer with operating
system virtualization, at least in the field of open source.

We developed a system called the Network Simulation Environment (NoSE)
to simulate arbitrary network environments on a single Linux machine.
NoSE represents a high-interaction honeypot and has special
support for honeynet applications and forensics.

The system uses different Open Source emulators, like User-Mode-Linux,
Qemu, and Xen, to provide support for a broad range of guest operating
systems. We have tested Linux, BSD, and Windows. Other emulators can
be added easily.

The Linux kernel's bridging facilities are used to build a virtual
ethernet to connect the virtual machines. NoSE provides a GUI and a
management daemon that is capable of generating the whole network
infrastructure with just a few clicks. Different virtual machines and
network configurations can be archived in a library for later reuse.
Starting and stopping of whole networks thus becomes a simple task.
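The virtual Ethernet NoSE builds can be sketched with the standard bridging tools (interface names are illustrative; NoSE automates these steps through its GUI and management daemon):

```shell
brctl addbr virtnet0          # create a software bridge = one segment
brctl addif virtnet0 tap0     # attach e.g. a qemu guest's interface
brctl addif virtnet0 tap1     # attach e.g. a UML guest's interface
ip link set virtnet0 up
```

Monitoring and data control hook into the bridge from the host side, invisible to the guests.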

As the emulators run full-fledged operating systems, there are almost
no restrictions for applications and services that can be installed
within the simulated network. Possible applications for our system
include network simulation, testing, training, distributed application
development, and analysis of security issues. Security tools for
monitoring, intrusion detection, and sniffing are already
integrated. Data capture (i.e. logging) and data control (preventing
attackers from harming other machines) take place outside the honeynet.

About the author:

Andreas Görlach is a PhD student at the IT Transfer Office (ITO), a
third-party funded research unit of the department of computer science
at the TU Darmstadt.

His work focuses on network security. With his colleagues he regularly
teaches the aspects of IT network security to computer professionals
in a course called "Hacker Contest". Within that course a virtual lab
built by means of NoSE forms the basis for the security training.

In practicals for students of computer science Andreas supervises the
analysis of current technologies such as WLAN or VoIP.

Other fields of his work include privacy and security in ubiquitous
computing.

Samba works fine as a file, print and authentication server on a
single host. With the rise of distributed file systems like GFS, OCFS,
Lustre and others, the wish arises to share the same file space via
different Samba nodes.

Right now this almost inevitably leads to data corruption, as Samba
needs to present very special locking semantics to the Windows
client. These locking semantics have absolutely nothing to do with
anything POSIX can deliver, and thus GFS and the others cannot
coordinate Samba access across nodes.

In response to a particular customer request I'm in the process of
fixing that. As the underlying file system cannot deliver the locking
semantics Samba needs, coordination on a single host is done via
shared databases. Ported 1:1 to a distributed environment, this would
either not work at all or be *very* inefficient.

This talk will present the locking semantics Samba needs:

Oplocks are a way to reliably allow a client to cache files.

Share modes are complete-file locks with very peculiar semantics; in
particular, locking violations may not be answered immediately.

Byte range locks also have to be taken care of by Samba, as POSIX has
very weird semantics here as well.

The second part of the talk will be a description of the current state
of development. If I happen to have something to show at the time of
the talk, a live demonstration is inevitable.

About the author:

Volker Lendecke is a long-time member of the core Samba Team. Volker
is co-founder of the Göttingen, Germany based
SerNet Service Network GmbH and
does consulting, training and development for Samba and other
Open Source products.