Description of problem:
/var/log/lastlog is too large
See bug 156809
Version-Release number of selected component (if applicable):
FC4
How reproducible:
See bug 156809
Additional info:
Bug 156809 reports a problem with "rsync" in that it is unable to copy
"/var/log/lastlog". The problem is in the database for "lastlog". Having a
1.2TB file (sparse or not) is a problem for most copy utilities. Reading the
documentation, even long distances between UIDs cause delays in the "lastlog"
program (clearly, it must be incrementing for each UID). The program and
database should be redesigned so that (1) the files aren't gratuitously large,
and (2) the program ("lastlog") isn't so poorly designed that it appears to
"hang" (see "lastlog" documentation).
The only workaround seems to be to turn off "lastlogging" altogether, which is a
terrible workaround.

Well, lastlog uses struct lastlog. In any case, the database design is poor and
old-BSD-ish. Unfortunately, many apps access this file directly, not through
some library's accessor functions, so the internal /var/log/lastlog layout
is sadly a part of the interface.
The situation is similar for /var/run/utmp and /var/log/wtmp, though for those
there are at least accessor functions defined in <utmp.h> and <utmpx.h>,
some of them standardized by POSIX.
To fix this, we'd need to change all the apps that touch these 3 files so that
they never touch the files directly and use accessor functions instead. For
lastlog, where no accessor functions exist at all, we'd first have to write them
and decide which library they belong in (glibc, -llastlog, or something else).
Only once this step is done can we work on moving the content of these files
to different paths and changing their internal format.
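
To make the idea of an accessor function concrete, here is a minimal sketch in
Python. It is not an existing API: read_lastlog_entry is a hypothetical name,
and the record layout (32-bit time, 32-byte line, 256-byte host) is an
assumption based on glibc's struct lastlog that should be checked against the
system headers.

```python
import struct

# Assumed on-disk layout of one struct lastlog record:
# int32 ll_time, char ll_line[32], char ll_host[256]  (292 bytes total).
LASTLOG_RECORD = struct.Struct("=i32s256s")


def _c_string(raw):
    """Decode a fixed-size, NUL-padded C char array."""
    return raw.split(b"\0", 1)[0].decode(errors="replace")


def read_lastlog_entry(path, uid):
    """Hypothetical accessor: return (time, line, host) for uid, or None.

    Callers never see the file layout; only this function knows that
    records are indexed by uid.
    """
    with open(path, "rb") as f:
        f.seek(uid * LASTLOG_RECORD.size)
        data = f.read(LASTLOG_RECORD.size)
    if len(data) < LASTLOG_RECORD.size:
        return None  # the file ends before this uid's slot
    ll_time, line, host = LASTLOG_RECORD.unpack(data)
    if ll_time == 0:
        return None  # empty slot (a hole in the file): never logged in
    return ll_time, _c_string(line), _c_string(host)
```

If every app went through something like this, the storage behind it could be
changed later without touching the callers.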

The file is large in the sense that every tool that copies it needs to
understand how to copy sparse files.
For me, the main operational issue is: I have these directories accessible
through a variety of mechanisms, e.g., NFS mount, SMB mount, RSYNC source, and
the files are accessed via a variety of tools, some of which aren't UNIX tools.
I have to back up, copy, synchronize, restore, etc. these files via scripts,
webmin, cpio, rsync, scp, and so on -- and there are some Windows-based tools
that I need to access the files via SMB share or CD-ROM. Why? Because I
support two separate facilities, each as a backup for the other. Why Windows
tools? For disaster recovery using readily available media, tools, and hardware.
Now which makes more sense? Fixing every tool from now until kingdom come to
specially handle these everyday-UNIX files? Or, is it better to fix the 3
existing programs to use a common API (i.e., an implementation-hiding technique
that has two decades of good software engineering experience) and localize the
knowledge of the UTMP file interface to just its APIs?
Finally, what requires the lastlog to have such a large size? It seems that
there is no requirement that UTMP entries be placed 1.2TB into the file, right?
Why can't they just be appended to the end of the file (with an atomic
write-append)?
Presumably, appending to the end of the file would work just as well, right?
Regardless, the main point is the trade-off: on one side, most scripts/apps
mishandle these everyday-UNIX operation/admin files, and every one of them needs
changes/hacks to back up /var/log properly (note: an Oracle database is not an
everyday UNIX admin file); on the other side, fixing a couple of programs that
were poorly designed to begin with because they made their implementation
visible (said differently: they didn't hide their implementation by abstracting
the service interface). Not exactly true: the API is there, they just don't use it.
And finally, why do x86_64 systems (and their administrators) have to worry
about this problem, but x86 systems don't? Clearly, this becomes an odd
portability problem when the x86_64 systems require different backup scripts
than the x86 systems (again, I reiterate: for *everyday* UNIX operation/admin files).

(In reply to comment #4)
> Now which makes more sense? Fixing every tool from now until kingdom come to
> specially handle these everyday-UNIX files? Or, is it better to fix the 3
> existing programs to use a common API (i.e., an implementation-hiding technique
> that has two decades of good software engineering experience) and localize the
> knowledge of the UTMP file interface to just its APIs?
As stated above, the tools that access lastlog *don't* use an API; they
access the file directly.
> Finally, what requires the lastlog to have such a large size? It seems that
> there is no requirement that UTMP entries be placed 1.2TB into the file, right?
> Why can't they just be appended to the end of the file (with a atomic
> write-append)?
It's a sparse file indexed by login id. The nfsnobody user has a userid of (-2).
When an x86-64 system uses 32-bit UIDs, that (-2) is a very large number.
It's done as a sparse file so that any user of the file who wants to look at the
lastlog record for a particular uid can just seek to that userid's record, as
opposed to parsing the whole file.
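
The arithmetic behind the file size can be sketched as follows. The 292-byte
record size is an assumption (4-byte time + 32-byte line + 256-byte host, as in
glibc's struct lastlog), and the 16-bit case is included only for comparison
with a system where (-2) wraps to a much smaller UID.

```python
RECORD_SIZE = 292  # assumed sizeof(struct lastlog): 4 + 32 + 256 bytes


def uid_as_unsigned(value, bits):
    """Interpret a signed value such as -2 as an unsigned UID of the given width."""
    return value % (1 << bits)


def lastlog_offset(uid, record_size=RECORD_SIZE):
    """Byte offset of a user's record: the file is a flat array indexed by uid."""
    return uid * record_size


def lastlog_apparent_size(max_uid, record_size=RECORD_SIZE):
    """Apparent file size once the record for max_uid has been written."""
    return lastlog_offset(max_uid, record_size) + record_size


for bits in (16, 32):
    uid = uid_as_unsigned(-2, bits)
    size = lastlog_apparent_size(uid)
    print(f"{bits}-bit: uid={uid}, apparent size={size} bytes")
```

With these assumptions the 32-bit case comes out at roughly 1.25 * 10^12 bytes,
which is consistent with the 1.2TB reported above, while the 16-bit case stays
under 20 MB.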

> It's a sparse file indexed by login id. The nfsnobody user has a userid of (-2).
> When a x86-64 uses 32-bit UIDs, that (-2) is a very large number.
...which is the real problem, not lastlog's structure, which would probably
be very difficult to change without breaking stuff.
Frank said:
> The file is large in that: every tool that copies it needs to have an
> understanding of copying sparse files.
Understanding or not, processing a 1.2TB file still takes forever. For example,
tar --sparse works OK (i.e., it produces a small tar file) but takes about an
hour on my 64-bit system.
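
For anyone who wants to see the difference between apparent and allocated size
without touching the real lastlog, here is a small sketch. The function names
and the 64 MiB test size are made up for illustration, and it assumes a POSIX
system whose filesystem supports sparse files (st_blocks is not available on
Windows).

```python
import os
import tempfile


def make_sparse_file(path, apparent_size):
    """Create a file where only the final byte is actually written."""
    with open(path, "wb") as f:
        f.seek(apparent_size - 1)
        f.write(b"\0")


def file_sizes(path):
    """Return (apparent size, allocated bytes) for path."""
    st = os.stat(path)
    # On POSIX, st_blocks is counted in 512-byte units.
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sparse-demo")
        make_sparse_file(path, 64 * 1024 * 1024)
        apparent, allocated = file_sizes(path)
        print(f"apparent={apparent} bytes, allocated={allocated} bytes")
```

A tool that simply reads the file from start to end gets zeros back for the
holes, so it still has to walk the whole apparent size even when almost
nothing is allocated on disk.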

Based on the date this bug was created, it appears to have been reported
against rawhide during the development of a Fedora release that is no
longer maintained. In order to refocus our efforts as a project we are
flagging all of the open bugs for releases which are no longer
maintained. If this bug remains in NEEDINFO thirty (30) days from now,
we will automatically close it.
If you can reproduce this bug in a maintained Fedora version (7, 8, or
rawhide), please change this bug to the respective version and change
the status to ASSIGNED. (If you're unable to change the bug's version
or status, add a comment to the bug and someone will change it for you.)
Thanks for your help, and we apologize again that we haven't handled
these issues to this point.
The process we're following is outlined here:
http://fedoraproject.org/wiki/BugZappers/F9CleanUp
We will be following the process here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping to ensure this
doesn't happen again.

This bug has been in NEEDINFO for more than 30 days since feedback was
first requested. As a result we are closing it.
If you can reproduce this bug in the future against a maintained Fedora
version please feel free to reopen it against that version.
The process we're following is outlined here:
http://fedoraproject.org/wiki/BugZappers/F9CleanUp
