Rusty Russell announced that he had rewritten his in-kernel module loader
code to be much less invasive than earlier versions. He said it still needed
work, but was basically functional, and would not interfere with preemption
or CPU hotplugging. However, Roman Zippel felt that Rusty's solution added a
lot of complexity in order to solve a relatively simple problem. He suggested
moving a large part of the code into user-space, and added:

I can only refer to my own patch again, which has most of the basic things
needed to sanely get out of this mess:

Allow module exit to fail. This gives modules far more control over
module management and the generic module code can be simplified.

The new module layout simplifies module loading; not much more than
relocating is necessary, but it keeps backward compatibility as long as
necessary. This means new modules can be loaded with old modutils and
modules using the old interface can be kept working for a while.

Rusty argued that his own implementation was simple and beautiful, and
made the kernel smaller than before. He said a user-space solution would be
much more complex. Roman replied,
"Compared to the
complexity of the current insmod I can I agree. On the other hand with my
module layout, I could load a module with ld and a few lines of shell script
(only the system calls are a bit tricky)."
Rusty said,
"Do that on sparc64, x86_64 or ppc64 and I'll be really impressed.
And of course, good luck fitting libbfd into busybox!"

Part of Rusty's solution was to simplify things by making modules impossible
to unload, while Roman felt that all modules should be unloadable, and
should include an API to report on whether they were free to be unloaded at a
given time. But Greg KH replied,
"And with a LSM module,
how can it answer that? There's no way, unless we count every time someone
calls into our module. And if you do that, no one will even want to use
your module, given the number of hooks, and the paths those hooks are on
(the speed hit would be horrible.) I'm with Rusty, just don't let people
unload modules, unless you are running a development kernel, and "obviously"
know what you are doing."
But Alan Cox replied,
"So
the LSM module always says no. Don't make other modules suffer."

Roman and Rusty went back and forth on the whole issue for a while, but Rusty
had to go get married, and that was that.

I'd like to announce the publication of Linux Kernel State Tracer (LKST) 1.3,
a tracer for the Linux kernel.

LKST's main purpose is debugging, fault analysis, and performance analysis
of enterprise systems.
For these purposes, LKST has the following features:

It is possible to change dynamically which events are recorded.

Users can obtain information about only the events that interest them.

This also reduces the overhead in components not related to a fault.

It is possible to change the function invoked by each event. The default
function simply records the occurrence of the event.

If necessary, this function can be replaced with another one.

LKST supports installing such functions via a kernel module. LKST also
supports a maskset, which controls what kinds of events are recorded and
can be changed dynamically. For example, LKST usually traces only a few
events for good performance; when the kernel enters a particular state,
LKST can switch masksets to gather more detailed information.

It is possible to create new buffers and switch between them. By
switching to another buffer, users can preserve the information they
want in the previous one.

LKST binaries, source code and documents are available in the following
site,

If you have any comments, please send them to one of the mailing
lists below.

lkst-develop@lists.sourceforge.net
lkst-develop@lists.sourceforge.jp

Robert Schwebel asked what the difference was between this project and the Linux Trace Toolkit. Yumiko replied:

Let me first explain the background of our development work.

We began development of the Linux Kernel State Tracer (LKST)
in response to a domestic need to improve Reliability, Availability,
and Serviceability (RAS) with respect to enterprise systems.
The following requirements were applied to LKST:

Capable of handling a variety of information about system errors.

Little trace overhead (or the ability to control trace overhead)

Short development time

As we had to achieve a short development time, we elected to
develop LKST using our own methodology, based on know-how from
tracer development that we carried out for other OSes, rather
than building on other known tools such as LTT.
# This is not to say that we developed all functions on our own.
#LKST at present connects with Kernel Hooks (GKHI) and LKCD.

Consequently, LKST, which is oriented to enterprise systems,
has the following features different from those of LTT.
# These LKST features are also being enhanced at this time.

Little overhead and good scalability when tracing on a large-scale
SMP system

To keep locking overhead as low as possible, we designed the buffers
so that they are not shared among CPUs.

Easy to extend/expand the function (User-based extendibility)

Without recompiling the kernel, users can change/add/modify the kinds
of events and the information to be recorded at any time.
For example, LKST usually traces very few events for the sake
of good performance. Once the kernel gets into a particular state
that the user specified, LKST will trace and record more detailed information.

Preservation of trace information

Recovery of trace information collected at the time of a system crash
in connection with LKCD.

Saving of specific event information during tracing.
For example, switching to another buffer after the occurrence of
a specific event enables the information on that event to be left
in the previous buffer.

Collection of even more kernel event information

Information on more than 50 kernel events can be collected for
kernel debugging.

The demand for RAS functions in Linux should grow in the years to come.
It is our hope that LKST becomes one means of implementing such functions.

Karim Yaghmour had some comments. To the locking issues of item 1, he
said,
"Clearly this is not a problem for LTT
since we don't use any form of locking whatsoever anymore. IBM's work on
the lockless scheme has solved this problem and their current work on the
per-CPU buffering solves the rest of the issue."
To the usability issues
of item 2, he said that the same features were available with LTT. To item 3, he
said:

Connection with LKCD is really not a problem, but this points to the main
purpose of the tool, which in the case of LKCD is kernel debugging. LTT isn't
aimed as a kernel debugger, so although LKCD is on our to-do list, it's
certainly not our priority.

As for handling multiple output streams (which LKCD can be one of them), we
already have very detailed plans on how LTT is going to integrate this (as I've
mentioned a number of times before on this list). However, before we go down
this road we need to make sure that the core tracing functionality is
lightweight and fits the general requirements set for kernel code. Once this
core lightweight functionality is there, we can build a rich and solid feature
set around it.

To item 4, Karim agreed that this did differentiate the two projects. He said:

this is where LTT and LKST cannot be compared. If LKST is
a kernel debugging tool, as it has always been advertised, then any comparison
of LKST should be made with the other tracing tools which are used for
kernel debugging, such as the ones mentioned by Ingo and Andi earlier on
this list.

LTT was built from the ground up to help users understand the dynamic
behavior of the system. As such, it cannot be compared to any kernel
debugging tool since it isn't one.

Finally, Karim remarked,
"There
was a RAS BoF at the OLS this year where tracing was intensively
discussed. All the attendees agreed to unify their efforts around
LTT. At this meeting, Richard Moore of IBM presented a tracing to-do list (http://opersys.com/LTT/ltt-to-do-list.txt)
which we are using as a basic checklist for our ongoing work. Instead of
implementing yet another tracing system, I think that the LKST team would
benefit much from contributing to LTT, which already has a proven track record
and has been adopted by the community as well as the industry."

Several days later, Yumiko replied,
"After
future, we'll join community actively. We'll use LTT and want to concern LTT,
so we'll join the discussion of you and other LTT developers about Linux RAS.
We hope to co-operate you and other developers about Linux RAS."

We are pleased to announce the first publicly available source
release of a new POSIX thread library for Linux. As part of the
continuous effort to improve Linux's capabilities as a client, server,
and computing platform Red Hat sponsored the development of this
completely new implementation of a POSIX thread library, called Native
POSIX Thread Library, NPTL.

Unless major flaws in the design are found, this code is intended to
become the standard POSIX thread library on Linux systems, and it will
be included in the GNU C library distribution.

The work visible here is the result of close collaboration of kernel
and runtime developers. The collaboration proceeded by developing the
kernel changes while writing the appropriate parts of the thread
library. Whenever something couldn't be implemented optimally some
interface was changed to eliminate the issue. The result is this
thread library which is, unlike previous attempts, a very thin layer
on top of the kernel. This helps to achieve a maximum of performance
for a minimal price.

A white paper (still in its draft stage, though) describing the design
is available at

It provides a large number of details on the design and insight into
the design process. At this point we want to repeat only a few
important points:

the new library is based on a 1-on-1 model. Earlier design
documents stated that an M-on-N implementation was necessary to
support a scalable thread library. This was especially true for
the IA-32 and x86-64 platforms since the ABI with respect to threads
forces the use of segment registers and the only way to use those
registers was with the Local Descriptor Table (LDT) data structure
of the processor.

The kernel limitations the earlier designs were based on have been
eliminated as part of this project, opening the road to a 1-on-1
implementation which has many advantages such as

less complex implementation;

avoidance of two-level scheduling, enabling the kernel to make all
scheduling decisions;

It is not generally accepted that a 1-on-1 model is superior but our
tests showed the viability of this approach and by comparing it with
the overhead added by existing M-on-N implementations we became
convinced that 1-on-1 is the right approach.

Initial confirmations were test runs with huge numbers of threads.
Even on IA-32 with its limited address space and memory handling
running 100,000 concurrent threads was no problem at all; creating
and destroying the threads did not take more than two seconds. This
all was made possible by the kernel work performed as part of this
project.

The only limiting factors on the number of threads today are
resource availability (RAM and processor resources) and architecture
limitations. Since every thread needs at least a stack and data
structures describing the thread the number is capped. On 64-bit
machines the architecture does not add any limitations anymore (at
least for the moment) and with enough resources the number of
threads can be grown arbitrarily.

This does not mean that using hundreds of thousands of threads is a
desirable design for the majority of applications. At least not
unless the number of processors matches the number of threads. But
it is important to note that the design of the library does not have
a fixed limit.

The kernel work to optimize for a high thread count is still
ongoing. Some places in which the kernel iterates over processes and
threads remain, and other places need to be cleaned up. But it has
already been shown that given sufficient resources and a reasonable
architecture an order of magnitude more threads can be created than
in our tests on IA-32.

The futex system call is used extensively in all synchronization
primitives and other places which need some kind of
synchronization. The futex mechanism is generic enough to support
the standard POSIX synchronization mechanisms with very little
effort.

The fact that this is possible is also essential for the selection
of the 1-on-1 model, since only if the kernel sees all the waiters,
and knows that they are blocked for synchronization purposes, can the
scheduler make decisions as good as those a thread library could make
in an M-on-N model implementation.

Futexes also allow the implementation of inter-process
synchronization primitives, a sorely missed feature in the old
LinuxThreads implementation (Hi jbj!).

Substantial effort went into making the thread creation and
destruction as fast as possible. Extensions to the clone(2) system
call were introduced to eliminate the need for a helper thread in
either creation or destruction. The exit process in the kernel was
optimized (previously not a high priority). The library itself
optimizes the memory allocation so that in many cases the creation
of a new thread can be achieved with one single system call.

On an old IA-32 dual 450MHz PII Xeon system 100,000 threads can be
created and destroyed in 2.3 secs (with up to 50 threads running at
any one time).

Programs indirectly linked against the thread library had problems
with the old implementation because of the way symbols are looked
up. This should not be a problem anymore.

The thread library is designed to be binary compatible with the old
LinuxThreads implementation. This compatibility obviously has some
limitations. In places where the LinuxThreads implementation diverged
from the POSIX standard incompatibilities exist. Users of the old
library have been warned from day one that this day will come and code
which added work-arounds for the POSIX non-compliance better be
prepared to remove that code. The visible changes of the library
include:

The signal handling changes from per-thread signal handling to the
POSIX process signal handling. This change will require changes in
programs which exploit the non-conformance of the old implementation.

One consequence of this is that SIGSTOP works on the process. Job
control
in the shell and stopping the whole process in a debugger work now.

getpid() now returns the same value in all threads

the exec functions are implemented correctly: the exec'ed process gets
the PID of the process. The parent of the multi-threaded application
is only notified when the exec'ed process terminates.

thread handlers registered with pthread_atfork are no longer run
if vfork is used. This isn't required by the standard (which does
not define vfork), and all that is allowed in the child is calling
exit() or an exec function. A user of vfork had better know what s/he
is doing.

libpthread should now be much more resistant to linking problems: even
if the application doesn't list libpthread as a direct dependency
functions which are extended by libpthread should work correctly.

the pthread_kill_other_threads_np function is not available. It was
needed to work around the broken signal handling. If somebody shows
some existing code which makes legitimate use of this function we
might add it back.

The current sources contain support only for IA-32 but this will
change very quickly. The thread library is built as part of glibc so
the complete set of glibc sources is available as well. The current
snapshot for glibc 2.3 (or glibc 2.3 when released) is necessary. You
can find it at

Building glibc with the new thread library is demanding on the
compilation environment.

The 2.5.36 kernel or above must be installed and used. To compile
glibc it is necessary to create the symbolic link

/lib/modules/$(uname -r)/build

to point to the build directory.

The general compiler requirement for glibc is at least gcc 3.2. For
the new thread code it is even necessary to have working support for
the __thread keyword.

Similarly, binutils with functioning TLS support are needed.

The (Null) beta release of the upcoming Red Hat Linux product is
known to have the necessary tools available after updating from the
latest binaries on the FTP site. This is no ploy to force everybody
to use Red Hat Linux, it's just the only environment known to date
which works. If alternatives are known they can be announced on the
mailing list.

To configure glibc it is necessary to run in the build directory
(which always should be separate from the source directory):

The --enable-kernel parameter requires that the 2.5.36+ kernel is
running. It is not strictly necessary but helps to avoid mistakes.
It might also be a good idea to add --disable-profile, just to speed
up the compilation.

When configured as above the library must not be installed since it
would overwrite the system's library. If you want to install the
resulting library choose a different --prefix parameter value.
Otherwise the new code can be used without installation. Running
existing binaries is possible with

Alternatively the binary could be built to find the dynamic linker
and DSO by itself. This is a much easier way to debug the code
since gdb can start the binary. Compiling is a bit more complicated
in this case:

This command assumes that it is run in the build directory. Correct
the paths if necessary. The compilation will use the system's
headers which is a good test but might lead to strange effects if
there are compatibility bugs left.

Once all these prerequisites are met compiling glibc should be easy.
But there are some tests which will flunk. For good reasons we aren't
officially releasing the code yet. The bugs are either in the TLS
code which is not enabled in the standard glibc build, or obviously in
the thread library itself. To run the tests for the thread library
run

make subdirs=linuxthreads2 check

One word on the name 'linuxthreads2' of the directory. This is only a
convenience so that the glibc configure scripts don't complain
about missing thread support. It will be changed to reflect the real
name of the library ASAP.

What can you expect?

This is a very early version of the code so the obvious answer is:
some problems. The test suite for the new thread code should pass, but
besides that and some performance measurement tools we haven't run much
code. Ideally we would get people to write many more of these small
test programs which are included in the sources. Compiling big
programs would mean not being able to locate problems easily. But I
certainly won't object to people running and debugging bigger
applications. Please report successes and failures to the mailing
list.

People who are interested in contributing must be aware that for any
non-trivial change we need an assignment of the code to the FSF. The
process is unfortunately necessary in today's world.

People who are contaminated by having worked on proprietary thread
library implementations should not participate in discussions on the
mailing list unless they willfully disclose the information. Every
bit of information is publically available from the mailing list
archive.

Which brings us to the final point: the mailing list for *all*
discussions related to this thread library implementation is

There was some confusion over whether the new code really did manage to
achieve 100,000 concurrent threads. People couldn't believe their eyes. Even
Linus Torvalds said at one point:

You didn't read the post carefully.

They started and waited for 100,000 threads.

They did not have them all running at the same time. I think the
original post said something like "up to 50 at a time".

Basically, the benchmark was how _fast_ thread creation is, not how many
you can run at the same time. 100k threads at once is crazy, but you can
do it now on 64-bit architectures if you really want to.

But no, as Ingo Molnar corrected, the code really did manage to get
100,000 threads running all at once. He explained:

on the dual-P4 testbox i have started and stopped 100,000
*parallel* threads in less than 2 seconds. Ie. starting up 100,000 threads
without any throttling, waiting for all of them to start up, then killing
them all. It needs roughly 1 GB of RAM to do this test on the default x86
kernel, and it needs roughly 500 MB of RAM to do this test with the IRQ-stacks
patch applied.

with 2.5.31 this test would have taken roughly 15 minutes, on the same
box, provided the NMI watchdog is turned off.

with 100,000 threads started up and idling silently the system is
completely usable - all the critical for_each_task loops have been fixed.

Trying to be a bit more timely about releases, especially since some
people couldn't use 2.5.37 due to the X lockup that should hopefully be fixed
(no idea _why_ that old bug only started to matter recently, the bug itself
was several months old).

Karim Yaghmour posted a patch to add the Adeos
nanokernel to the Linux source tree. He referred to an earlier
post for explanation. Pavel Machek asked,
"Maybe
adding Docs/adeos.txt is good idea... (sorry can't access web right now) --
so this is aimed at being free rtlinux replacement?"
Karim promised
to get Docs/adeos.txt going, and added:

I'm not sure "replacement" is the appropriate description for this.
The scheme used by rtlinux and rtai is a master-slave scheme where Linux is a
slave to the rt executive. Adeos makes the entire scheme obsolete by making
all the OSes running on the same hardware clients of the same nanokernel,
regardless of whether the client OSes provide hard RT or not. None of these
OSes need to have an "other OS" task, as rtlinux and rtai clearly do. Rather,
when an OS is done using the machine, it tells Adeos that it's done and Adeos
returns control to whichever other OS is next in the interrupt pipeline.

To be honest, nothing in Adeos is "new". Adeos is built on classic
early-'90s nanokernel research. I've listed a number of nanokernel papers
in the paper I wrote on Adeos. A complete list of nanokernel papers would
probably have hundreds of entries. Some of these nanokernels even had OS
schedulers (exokernel for instance). All Adeos implements is a scheme for
sharing the interrupts among the various OSes using an interrupt pipeline.

Jacob Gorm Hansen asked,
"are you planning
to add spaces & portals, like in Space or Pebble?"
And Karim
explained:

I'm not sure whether what we plan to offer actually fits Space's definition
of spaces, but domains already exist and portals should be trivial to
implement over what we already have. For details on what we plan to offer in
terms of spaces, take a look at the paper I wrote describing how to implement
Linux SMP clusters:

Basically, Adeos would hand over RAM regions according to each OS
instance's requests. In such a case, each kernel would have its own virtual
memory and communication would be possible using "bridges", shared physical
RAM regions. Many OSes can coexist in the same virtual address space, but
the mechanisms for managing the virtual address space are not up to Adeos.

Close by, Pavel continued his inquiry, asking,
"Are you actually able to use Adeos for something reasonable? You
can't run two copies of linux, because they would fight over memory; right?
Do you have something that can run alongside linux?"
As far as
running two concurrent invocations of Linux, Karim agreed that currently
this was impossible. But he added,
"I've already
detailed how to do this in a paper I wrote last july on how to obtain Linux
SMP clusters with as few modifications to the kernel as possible."
He
gave the
same link as above. To the question of whether there was anything that
could run alongside Linux, he went on:

Certainly. According to some reports it's already used in some commercial
systems and, as today's RTAI announcement reads, it will be the basis for
the next release of RTAI.

What we need now is ports to other architectures than the i386. This
should be fairly simple for anyone familiar enough with the Linux interrupt
layer for any other arch.

Bob Tracy reported some breakage with IDE when compiled as a module,
and in the course of discussion Alan Cox replied,
"Let
me give a simple clear explanation here. I don't give a flying ***k about
modular IDE until the IDE works. Cleaning up the modular IDE after it all
works is relatively easy and gets easier the more IDE is cleaned up. Until
then its not even on the radar unless someone else wants to do all the
work for 2.4/2.5 and verify/test them."
Bob said,
"Understood. My position is simply that I noted something broken,
and I reported it during the development cycle. Would you prefer that I had
waited until after 2.5.X became 2.6?"
Alan agreed that it was better
to report such things than not, but added,
"its just
I've had lots of equally helpful reports."

Greg KH was very happy with this, and suggested providing a
patch against the current Linux Security Module code snapshot at lsm.immunix.org, in which case Greg said
he'd be happy to accept the patch into LSM. Olaf was happy to oblige, and
soon posted an updated patch; and some folks discussed the implementation.

And new this time around I have broken this up into a number
of smaller self-contained patches. Each is a nice logical unit
(like a driver, or framebuffer, etc). This should greatly
simplify any merging into the mainline code :-)

My name is YOSHIFUJI Hideaki. I'm from USAGI Project.
Our project is trying to improve IPv6 implementation in Linux, and
we'd like to continue contributing our efforts. Please see <http://www.linux-ipv6.org> for
further information.

Well...

Linux happened to process invalid ND messages with invalid options,
such as:

the length of an ND option is 0

the length of an ND option is not enough

The specification says that such messages must be silently discarded.
This patch parses/checks ND options before changing the state of a
neighbour, address, etc., and ignores such messages.

The following patch is against linux-2.4.19.

David S. Miller applied the patch, thanked YOSHIFUJI, and remarked,
"Let us hope more patches like this one are
coming :-)"

module symbols are correctly decoded as well. Ie. all the userspace
oops decoding mismatches are solved, which can arise if a kernel
crashes and another kernel (with different module symbols) is booted.
How do you find out the symbols that a particular crashed kernel had?

the list of modules is printed upon oopsing - this clearly puts every
crash into perspective - exactly which modules were loaded ...

He went on:

i believe it's all for the better, much of the above featureset is also
based on distributors' daily experience of how users report crashes and
how it can be made sense of post-mortem. Tester feedback is often a scarce
resource for distributors, so improving the quality of individual reports
is of high importance. Even here on lkml the quality of oops reporting is
often surprisingly low, especially taking the many years of education into
account.

the cost of the feature is an in-kernel copy of the symbol table - most
testers will not care, and it's default-disabled in the .config. This patch
has proven to be very useful in my daily kernel development activities,
hopefully others will find this just as useful.

I've tested the patch on x86, building and oopsing works both with
kksymoops enabled and disabled.

The line of credit for kksymoops goes like this: Arjan took Keith's original
kallsyms work and extended it to the area of kernel oopsing and stack trace
printing - this was the 2.4 kksymoops patch. Which i ported to 2.5 and added
some minor fixes, which Kai improved significantly - essentially Kai rewrote
much of the original patch - it's now a nice patch that fits into the 2.5
build system properly.

Linus Torvalds had some cosmetic objections to do with output formatting, and
folks went back and forth on that for a while; and J.A. Magallon suggested the
patch would be great to see in 2.4 as well.

Marcelo Tosatti announced 2.4.20-pre8 but gave only the ChangeLog, no
summary. Axel Siebenwirth asked if Marcelo could give a brief summary of
changes as Linus Torvalds did in his releases; and Christer Nilsson agreed.
Marcelo thanked them for the feedback and replied,
"Indeed I should stop being lazy on that. Will remind myself next
time I release a kernel."

YOSHIFUJI Hideaki posted a patch against 2.4.19 and explained,
"Current IPv6 address validation timer is rough and
timing of address validation is not precise. This patch refines timing of
address validation timer."
David S. Miller and Alexey Kuznetsov had
some slight problems with the implementation, and YOSHIFUJI posted corrections.
Eventually David said,
"I've applied the patch with
the time_after() debugging check removed to both 2.4.x and 2.5.x"

Procps is the package containing various system monitoring tools, like
ps, top, vmstat, free, kill, sysctl, uptime and more. After a long
period of inactivity procps maintenance is active again and suggestions,
bugreports and patches are always welcome on the procps list.

The plan is to release a procps 2.1.0 around the time the 2.6.0 kernel
comes out, with maybe one extra intermediary release between now and
then. Various features and code cleanups are planned, the /proc changes
in 2.5 are also sure to keep the procps maintainers busy...

If you have feedback (or patches) for the procps team, feel free to
mail us at:

procps-list@redhat.com

NEWS for version 2.0.8 of procps

Integrate bugfixes and enhancements from all the vendor RPMs (Rik van
Riel)

Support new /proc layout, up to 2.5.39 or so. (Andrew Morton)

Scheduler policy display in top and ps (Robert Love)

Lots of compile cleanups and warning fixes (Robert Love)

Support for understanding threads (Robert Love)

Realtime priority and scheduling policy display for ps (Robert Love)

Change meminfo() from an array into an actual struct, remove 60 lines
of no longer needed code from free.c (Rik van Riel)

Display active and inactive memory statistics from 2.4 and 2.5 kernels,
in vmstat and top (Rik van Riel)

A bug introduced by locale support was fixed; locales with , as the
decimal point will work again.

Libproc supports new process-migrating beowulf systems.

He replied to himself a few hours later with a warning about a trivial
bug that had crept into the code. The VERSION string had been omitted,
so anyone using the -V option would not see the current version number. He
said he'd release 2.0.9 shortly.

Drop 61 on September 27, 2002 (jfs-2.4-1.0.23.tar.gz
and jfsutils-1.0.23.tar.gz) includes fixes to the file
system and utilities.

Utilities changes

print fsck.jfs start timestamp correctly in fsck.jfs log

allow xchklog to run on a JFS file system with an external journal

initialize program name in logdump properly

code cleanup

File System changes

Detect and fix invalid directory index values
The directory index values are the unique cookies used to resume
a readdir at the proper place. These are stored with each entry
in a directory. fsck.jfs does not currently validate these entries,
nor even create them when populating the lost+found directory.
This fix causes readdir to detect the invalid cookies, and generate
new ones, if possible.

Fix problems with NFS
Don't complain when read_inode is called with a deleted inode. This
is normally done by revalidate.
readdir: Don't hold metadata page while calling filldir(). NFS's
filldir may call lookup() which could result in a hang.

Fix off-by-one error in dbNextAG
In certain situations, dbNextAG set db_agpref to db_numag, which is
one higher than the last valid value. This will eventually result
in a trap.

Avoid parallel allocations within the same allocation group
When large files are being written in parallel, allocating the space for
these files within the same allocation group can cause severe
fragmentation of the files. By keeping track of open, growing files
within an allocation group, we can force other new allocations into
a different allocation group to avoid causing fragmentation.
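The idea reads roughly as follows in sketch form. All names and the data structure are illustrative, not the actual JFS implementation: the point is only that allocation groups with open, growing files are skipped when placing new allocations.

```python
# Hedged sketch: track which allocation groups (AGs) have open, growing
# files and steer new allocations to a different AG to limit
# fragmentation.

active_writers = {}  # AG index -> count of open, growing files

def open_growing(ag):
    active_writers[ag] = active_writers.get(ag, 0) + 1

def close_file(ag):
    active_writers[ag] -= 1
    if active_writers[ag] == 0:
        del active_writers[ag]

def pick_ag(preferred, numag):
    """Prefer the requested AG, but skip AGs with active writers."""
    for i in range(numag):
        ag = (preferred + i) % numag
        if ag not in active_writers:
            return ag
    return preferred  # every AG is busy: fall back rather than fail

open_growing(0)
assert pick_ag(0, 4) == 1   # AG 0 has a growing file, go elsewhere
close_file(0)
assert pick_ag(0, 4) == 0   # AG 0 is free again
```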

Fix test in lmLogFileSystem

Remove assert(i < MAX_ACTIVE) when the external log can't be found.

Remove excessive typedefs

For more details about JFS, please see the patch instructions or
changelog.jfs files.

The following patches against 2.5.39 clean up the RNG support
substantially. Please pay special attention to the first patch, which fixes
two major bugs in the reseeding logic. They can be easily demonstrated by
running 'cat /dev/random | hexdump' on a quiescent system. When it blocks,
lightly tapping the mouse generates a large stream of additional output,
despite very little entropy being added.

The second and third patches introduce my fixes for the more theoretical
issues and should address all the issues that have been raised.

The fourth and fifth make the pool and reseeding logic much clearer
and create a new pool for /dev/urandom that avoids starving /dev/random
readers.

Six and seven propagate the new API to the rest of the kernel and remove
dead code.

In subsequent posts containing the individual patches, he explained each more
thoroughly. For patch 1:

This fixes a bug where entropy transfer takes more from the primary pool than
is there and credits the secondary with 1000 extra bits.

This also makes this code properly handle catastrophic reseeding by
raising the wakeup threshold from 8 to 64.

You can test for both of these bugs by doing 'cat /dev/random |
hexdump' and observing that the slightest tap of the mouse generates a
large stream of output.

Consider the situation where the state of both pools is compromised
and is known at time T1. If 8 bits of entropy appear in the primary
pool, unblocking random_read, this function would transfer most of the
primary pool to the secondary, then give a byte of data to the user at
time T2. Given that byte and the known state at T1, the user can test
the possible 256 input bits to the primary pool, generate the 256
possible outputs from the secondary, and reduce the possible known
states at time T2 to a handful. This is dependent solely on the wakeup
threshold and not on the transfer size. Raising the wakeup threshold
to 64 means calculating 2^64 possible pool states, making state
extension unreasonably hard.

The second clause of the xfer function was intended to handle this
catastrophic reseeding, but given the weakness in the first clause, it
added nothing.
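The counting argument behind the raised wakeup threshold can be demonstrated with a toy model. Here the "pools" are single integers and the mixing function is a trivial SHA-1 wrapper; the real driver hashes over large pools, but the brute-force arithmetic is the same: with only 8 fresh bits, an attacker who knew the state at T1 needs at most 256 guesses.

```python
# Toy model of the state-extension attack: both pool states are known
# at time T1, 8 bits of entropy enter the primary pool, and one output
# byte leaks. Enumerating 2^8 candidates recovers the new state.
import hashlib

def mix(pool, data):
    """Stand-in for the pool mixing function (illustrative only)."""
    h = hashlib.sha1(pool.to_bytes(8, "big") + data.to_bytes(1, "big"))
    return int.from_bytes(h.digest()[:8], "big")

known_primary, known_secondary = 1234, 5678   # state known at T1
secret_input = 173                            # 8 fresh bits of entropy

# Victim: mix entropy into primary, transfer into secondary, emit a byte.
primary = mix(known_primary, secret_input)
secondary = mix(known_secondary, primary & 0xff)
output_byte = secondary & 0xff

# Attacker: only 2^8 candidate inputs to test against the leaked byte.
candidates = [s for guess in range(256)
              for s in [mix(known_secondary, mix(known_primary, guess) & 0xff)]
              if s & 0xff == output_byte]

assert secondary in candidates       # true state survives the filter
assert len(candidates) < 256         # reduced to a handful of states
```

With a 64-bit wakeup threshold the same attack needs on the order of 2^64 trials, which is the point of the fix.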

For patch 2:

This makes irq and blkdev interrupts untrusted and allows adding a bit
of entropy for a configurable percentage of untrusted samples,
controlled by a new sysctl. This defaults to 0 for safety, but can be
used on headless machines without a hardware RNG to continue to use
/dev/random with some confidence.

This also smartens up and simplifies the batch entropy pool to allow
unlimited amounts of untrusted mixing without blocking out trusted
samples.

For patch 3:

This adds improved entropy estimation based on source timing
granularity and a new API for registering entropy sources.

This also detects potential polling or back-to-back interrupt attacks
that could be used to observe or force event timing. If a context
switch doesn't occur between events, one of these two attacks might be
occurring. We can rule out a polling attack by checking if the CPU is
sleeping and we can rule out an interrupt flood if jiffies has
changed since the last event.
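The trust heuristic just described can be sketched as a small decision function. This is one reading of the description, with invented names: an event is trusted if a context switch separated it from the previous one, and otherwise only if both attacks are ruled out (a sleeping CPU excludes polling, an advanced jiffies count excludes an interrupt flood).

```python
# Hedged sketch of the timing-trust decision; all names illustrative.

def sample_trusted(ctx_switched, cpu_was_idle, jiffies, last_jiffies):
    if ctx_switched:
        return True                       # a context switch separates events
    not_polling = cpu_was_idle            # a sleeping CPU can't be polled from
    not_flood = jiffies != last_jiffies   # the timer ticked between interrupts
    return not_polling and not_flood      # trust only if both are ruled out

assert sample_trusted(True, False, 5, 5)        # normal case
assert sample_trusted(False, True, 6, 5)        # both attacks excluded
assert not sample_trusted(False, True, 5, 5)    # flood still possible
assert not sample_trusted(False, False, 6, 5)   # polling still possible
```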

This removes the improperly named "ln" function and replaces it with a
call to the potentially arch-optimized fls. This also adjusts the
entropy count appropriately taking into consideration the expected
entropy in sections of a scale-invariant distribution (see "Benford's
Law"). Thanks to Arend Bayer for additional help with this analysis.

For patch 4:

more meaningful names for pools (predicting next patch)

cleanup of pool structure

store name

point to poolinfo table rather than copying entry

alloc pool together with structure

refactor pool creation and initialization

kill pointless (double!) pool zeroing

For patch 5:

Stop /dev/urandom readers from starving /dev/random for entropy by
creating a separate pool and not reseeding if doing so would prevent
/dev/random from reseeding.

This factors pool reseeding out of normal entropy transfer. This allows
different pools to have different policy on how to reseed.

This patch also makes random_read actually use the entropy count in
the secondary pool rather than tracking off the primary.
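The reseed policy for the new /dev/urandom pool can be sketched as a single guard: transfer entropy to the urandom pool only when the primary pool would still have enough left over for /dev/random. The threshold and names below are illustrative assumptions, not the patch's actual values.

```python
# Hedged sketch of the starvation-avoiding reseed policy.

def maybe_reseed_urandom(primary_bits, want_bits, reserve_bits=64):
    """Return bits to transfer to the urandom pool, or 0 to skip.

    Skipping keeps enough entropy in the primary pool for
    /dev/random readers to reseed.
    """
    if primary_bits - want_bits < reserve_bits:
        return 0            # reseeding now would starve /dev/random
    return want_bits

assert maybe_reseed_urandom(200, 64) == 64  # plenty left for /dev/random
assert maybe_reseed_urandom(100, 64) == 0   # would dip below the reserve
```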

For patch 6:
"This removes the old API and
updates users to the new one. This also allows different input devices of
the same class (eg mice) to have their entropy state tracked independently
and removes hardwired source classes from the core."

At http://www.xs4all.nl/~zippel/lc/
you can find the latest version of the new config system. Besides the usual
archive there is also now a patch against a 2.5.39 kernel and finally some
documentation. I also consider this patch my first release candidate, so
please test it carefully; this release contains pretty much everything
I want in the first release to be integrated into the kernel.

Other changes:

update to 2.5.39

separate kernel Makefile (by Sam Ravnborg/Kai Germaschewski)

lots of qconf fine tuning

the generated config files are now named Build.conf and use tabs
instead of two spaces, which makes them easier to read.

An issue (which was also mentioned by Jeff Garzik) is the help text
format. Jeff likes to have an endhelp, whereas I think it's redundant. The parser
currently checks the amount of indentation to find the end of the help text,
which makes the help text quite easy to read and parse. If someone prefers an
endhelp (or has an even better idea), please speak up now; if enough people
complain, I have no problem changing it.
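The indentation rule Roman describes is easy to sketch: the help text ends at the first non-blank line indented less than its first line, so no closing keyword is needed. A minimal sketch, with a hypothetical input format:

```python
# Sketch of indentation-terminated help text parsing: the first line's
# indentation sets the baseline, and any dedent ends the block.

def parse_help(lines):
    """Collect help text lines; stop when indentation drops."""
    body, base = [], None
    for line in lines:
        if not line.strip():
            body.append("")            # blank lines stay inside the help
            continue
        indent = len(line) - len(line.lstrip())
        if base is None:
            base = indent              # first line sets the baseline
        if indent < base:
            break                      # a dedent ends the help text
        body.append(line[base:])
    return "\n".join(body).rstrip()

text = ["    This option enables foo.",
        "    Say Y if unsure.",
        "config BAR"]
assert parse_help(text) == "This option enables foo.\nSay Y if unsure."
```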

After a report of a trivial bug, Roman put out 0.7.1, and after some testing
Sam Ravnborg said it looked pretty good and offered some implementation
suggestions and feature requests.

The RivaTV project is trying to produce Linux drivers for graphics boards
with nVidia chips that have a video-in feature.

Changes in this release:

This release includes support for GeForce 4, a lot of new cards and several
bugfixes. Also, tuner support has been improved and RivaTV now comes with the
relevant BTTV modules to get your tuner going - simply and easily. Finally,
the installation process tries to detect pitfalls preventing the use of
RivaTV on your machine.

The EVMS team is announcing the next stable release of the Enterprise Volume
Management System, which will eventually become EVMS 2.0. Package 1.2.0 is
now available for download at the project web site: http://www.sf.net/projects/evms

EVMS 1.2.0 has full support for the 2.4 kernel, and includes patches for most
kernels up to 2.4.19. It also has nearly full support for the 2.5 kernel
(only the OS/2 and S/390 plugins have not been ported yet), and includes a
patch for kernels 2.5.38 and 2.5.39.