lmb@... said:
> has anyone already created an updated diff for the latest and greatest
> ac patch?
UML is in -ac, so it shouldn't need a patch. Or does the ac22 UML not build?
If that's the case, and the problem is that linux_boots_ok is still there,
then just get rid of it. I sent Alan a patch that implements it and he
rejected it because linux_boots_ok is temporary.
I've checked it out on ac17, and it's fine.
Jeff

Jeff Dike <jdike@...> writes:
> listreader@... said:
> > If it wants a tty, it should be looking at numeric names in /dev/pts
>
> OK, that's nice.
>
> Could you tell us exactly what you're talking about?
When I issued the ifconfig commands, I got the "invalid device" reply.
I saw an equal number of messages on the UML console saying that it was trying to use a device in /dev/ptys.
No devices with names like the one being used exist, so I figure you're trying to get a pseudotty and failing.
Linux 2.4 kernels with MY configuration put pseudottys in /dev/ptys with numeric names. The standard RHL kernel did something different, though I don't recall precisely what.
Some product, which I think is xemacs, has a problem with the RH way of doing it; I surmise that in some circumstances UML has a problem with the way I do it.
--
Cheers
JS

Henrik Nordstrom wrote:
> b: It needs to be figured out why the normal userspace flow control does
> not work. It should work out of the box, but apparently does not (or else
> output from userspace applications would not ever be lost).
Got it. The fault is in write_room. The generic code assumes that write_room
properly reports the amount of data the driver is ready to accept on small
writes, while the UML stdio_console write_room function always returns 1024.
How to fix:
Do what the other tty drivers do: Use the tty write queue, and a write
interrupt handler to push the data out. Make write_room return the amount of
space available in the write queue.
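
A minimal sketch of that idea in plain C, outside the kernel. The names (struct txq, txq_write_room, txq_put, txq_drain) and the 1024-byte queue size are assumptions made up for illustration, not the UML or tty-layer API. The point is only that write_room reports the free space actually left in the queue, so the caller never hands over more than can be accepted.

```c
#include <assert.h>
#include <stddef.h>

#define TXQ_SIZE 1024  /* hypothetical queue size */

/* Hypothetical driver write queue (a simple ring buffer). */
struct txq {
    unsigned char buf[TXQ_SIZE];
    size_t head;   /* next slot to fill */
    size_t tail;   /* next byte to send */
    size_t count;  /* bytes currently queued */
};

/* What write_room should report: the free space, not a constant. */
static size_t txq_write_room(const struct txq *q)
{
    return TXQ_SIZE - q->count;
}

/* ->write path: queue as much as fits, return the number accepted. */
static size_t txq_put(struct txq *q, const unsigned char *data, size_t len)
{
    size_t n = 0;

    while (n < len && q->count < TXQ_SIZE) {
        q->buf[q->head] = data[n++];
        q->head = (q->head + 1) % TXQ_SIZE;
        q->count++;
    }
    return n;
}

/* Write-interrupt path: move up to max queued bytes into out. */
static size_t txq_drain(struct txq *q, unsigned char *out, size_t max)
{
    size_t n = 0;

    while (n < max && q->count > 0) {
        out[n++] = q->buf[q->tail];
        q->tail = (q->tail + 1) % TXQ_SIZE;
        q->count--;
    }
    assert(q->count <= TXQ_SIZE);
    return n;
}
```

In a real driver the drain step would run from the write interrupt handler and wake the writer once enough room is free.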
Things to fix first:
The UML interrupt layer needs to support both read and write interrupts. Exactly
how to signal this down to the interrupt handler needs to be decided upon, but
I guess we could use two fdset_t fields (one for read, one for write)
emulating the hardware status bits. Or the interrupt handlers could simply not
care and try to perform both operations to figure out what the interrupt source
was.
There also needs to be an API whereby the drivers can tell the virtual IRQ source
that they are waiting for a write interrupt on the file descriptor.
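
One possible shape for such an API, sketched in userspace C with the standard fd_set type and select(). Everything named here (struct irq_source, irq_want_read, irq_want_write, irq_done_write, irq_poll) is hypothetical and not taken from the UML source; it only illustrates keeping one set per direction and letting drivers register and drop write interest.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical virtual IRQ source: one fd set per direction. */
struct irq_source {
    fd_set want_read;   /* fds whose drivers wait for input */
    fd_set want_write;  /* fds whose drivers wait for room to write */
    int max_fd;
};

void irq_source_init(struct irq_source *s)
{
    FD_ZERO(&s->want_read);
    FD_ZERO(&s->want_write);
    s->max_fd = -1;
}

/* A driver asks to be told when fd becomes readable. */
void irq_want_read(struct irq_source *s, int fd)
{
    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_SET(fd, &s->want_read);
    if (fd > s->max_fd)
        s->max_fd = fd;
}

/* A driver with queued output asks to be told when fd becomes writable. */
void irq_want_write(struct irq_source *s, int fd)
{
    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_SET(fd, &s->want_write);
    if (fd > s->max_fd)
        s->max_fd = fd;
}

/* The driver's queue is empty again: stop asking for write events. */
void irq_done_write(struct irq_source *s, int fd)
{
    FD_CLR(fd, &s->want_write);
}

/* Non-blocking check. On return, rd and wr hold the ready fds, playing
 * the role of the status bits. Returns what select() returns. */
int irq_poll(const struct irq_source *s, fd_set *rd, fd_set *wr)
{
    struct timeval tv = { 0, 0 };

    *rd = s->want_read;
    *wr = s->want_write;
    return select(s->max_fd + 1, rd, wr, NULL, &tv);
}
```

Dropping write interest when the queue is empty matters because an idle output fd is almost always writable and would otherwise trigger continuously.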
Question: How does the UML IRQ source operate in SMP mode? Is it still only
one, or may there be more than one thread acting as IRQ source?
--
Henrik Nordstrom
MARA Systems

Jeff Dike wrote:
> What I meant was that sigio_handler only selects for read. And the irq
> mechanism doesn't seem to have a way to say that something is writable
> (hardware interrupts are always reads). So, the driver can test for both
> reading and writing when it gets called, or we can have separate read and
> write irqs for drivers that want them.
Serial hardware does raise an interrupt when the write queue has drained
below the low water mark.
Hardware interrupts are just interrupts. What the interrupt signals is up to
the receiving driver to work out by reading its hardware registers. The shared
parts of the kernel do not care what the interrupt represents.
As UML does not have hardware registers, but uses host resources, the
situation is a bit different. I have not studied the UML interrupt
architecture yet, but from my understanding you are using signal driven I/O
combined with select().
The signal indicates some I/O can be performed on the file descriptor.
select() indicates what kind of I/O.
As select() is done in the generic interrupt code and not within each
driver, what you lack is a layer where the drivers can read why the
interrupt has triggered, similar to interrupt status hardware registers of
more traditional hardware interrupt sources.
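
As a rough illustration of such a layer, here is a userspace C sketch that asks the host why a file descriptor is ready and returns the answer as status bits. The function name fd_irq_status and the FD_IRQ_* bit values are invented for this example; poll() is used here instead of select() only because it is shorter for a single descriptor.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical status bits, standing in for an interrupt status register. */
#define FD_IRQ_READ  0x1u
#define FD_IRQ_WRITE 0x2u

/* Ask the host, without blocking, which kinds of I/O fd is ready for. */
unsigned fd_irq_status(int fd)
{
    struct pollfd p = { .fd = fd, .events = POLLIN | POLLOUT, .revents = 0 };
    unsigned status = 0;

    assert(fd >= 0);
    if (poll(&p, 1, 0) <= 0)
        return 0;               /* nothing ready, or poll() failed */
    if (p.revents & POLLIN)
        status |= FD_IRQ_READ;
    if (p.revents & POLLOUT)
        status |= FD_IRQ_WRITE;
    return status;
}
```

A driver's handler could call this once and then branch on the bits, much as a serial driver branches on its status register.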
Notes based on the serial driver:
* terminal and console use is almost completely separated from each other,
mainly to not disturb normal serial operations by console output.
* console writes use a busy poll method. The kernel may not be allowed to
reschedule within console output or wait for interrupts.
* terminal writes use tty->stopped, tty->hw_stopped, a write queue and
interrupts. Actual writes seem to be done from interrupts only.
* hw_stopped is controlled by the "hardware" driver, set to 1 when waiting
on the hardware
* stopped is controlled by the shared tty_io.c file which implements the
flow control signalling to userspace.
Flow of the serial tty ->write function (rs_write):
    basic checks
    copy data to driver transmit buffer
    signal the hardware to generate interrupts if not stopped
    return amount of data copied to the transmit buffer
Flow of serial interrupt handler (rs_interrupt):
    foreach port
        if receive pending
            receive_chars()
                receive into internal buffer
                schedule waiting task
        if transmit possible
            transmit_chars()
                send from internal queue
                if queue is sufficiently drained, schedule waiting task
Flow of serial console ->write function (serial_console_write):
1. Save hardware state
2. Busy-wait for the serial hardware to become available (max 1000000 I/O
   polls per character)
3. Output one character
4. Loop to 2 if there is more to send
5. Restore hardware state
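
Steps 2 to 4 can be sketched in plain C as below. This is not the serial driver's code: struct console_dev, its two hooks and the fake_uart test device are all invented for illustration, saving and restoring hardware state is left out, and giving up when the poll limit is hit is a choice made for this sketch.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_POLLS 1000000  /* poll limit per character, as in step 2 */

/* Hypothetical device hooks standing in for hardware register access. */
struct console_dev {
    int  (*tx_ready)(void *ctx);         /* nonzero when a char can be sent */
    void (*tx_char)(void *ctx, char c);  /* emit one character */
    void *ctx;
};

/* Busy-wait console write: never sleeps. Returns the number of characters
 * emitted; stops early if the device stays busy for MAX_POLLS polls. */
size_t console_write_polled(struct console_dev *dev, const char *s, size_t len)
{
    size_t sent = 0;

    for (size_t i = 0; i < len; i++) {
        long polls = 0;

        while (!dev->tx_ready(dev->ctx)) {
            if (++polls >= MAX_POLLS)
                return sent;     /* device stuck: give up, do not hang */
        }
        dev->tx_char(dev->ctx, s[i]);
        sent++;
    }
    return sent;
}

/* A fake device for trying the loop out: busy for a few polls per char. */
struct fake_uart {
    int busy;        /* polls left before the device is ready again */
    char out[64];
    size_t n;
};

static int fake_ready(void *ctx)
{
    struct fake_uart *u = ctx;

    if (u->busy > 0) {
        u->busy--;
        return 0;
    }
    return 1;
}

static void fake_tx(void *ctx, char c)
{
    struct fake_uart *u = ctx;

    if (u->n < sizeof u->out)
        u->out[u->n++] = c;
    u->busy = 3;     /* pretend the transmitter needs time per character */
}
```

Because the loop never sleeps, it is only suitable for kernel console output, which is the distinction drawn in the notes above.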
The actual userspace flow control seems to be managed by n_tty.c:write_chan,
shared by all tty implementations. Beats me why it is not working. Perhaps
mostly a matter of restructuring code.
So how should it be done:
a: The hack I made with looping, rescheduling (on the host) and friends
should be done in console_write(), nowhere else.
b: It needs to be figured out why the normal userspace flow control does
not work. It should work out of the box, but apparently does not (or else
output from userspace applications would not ever be lost).
--
Henrik Nordstrom
MARA Systems

Livio Baldini Soares wrote:
> [Henrik, I'm not sure about this, but if you `sched_yield()' on a
> threaded process, won't _all_ threads yield? So this is a mechanism to
> keep the host kernel "healthier", rather than UML kernel, right?]
The current process only, not other threads or clones. (Remember, the Linux
kernel does not have threads, only processes sharing different amounts of data.)
Intentional.
> I've made a patch, which is similar to Henrik's patch, but it's much
> simpler (but less robust, I guess) and works correctly (Henrik's patch
> still seems to lose a few chars, but certainly much less than vanilla
> UML). I've tried to lose chars with my patch, but was unable ;-(
Mine loses chars for the reasons I indicated. It puts a limit on how
long it will block UML on the write (100 times write(), sched_yield).
We do not want to wait indefinitely here. Flow control is not about waiting in
the write function, it is about suspending the current task until writes can be
done. This "blocking" mode is only intended for kernel output.
A sched_yield helps by giving the host CPU time to process the output. It is
required if you are running in an xterm, a network shell (telnet/ssh) or any
other communication where processes on the host are involved. If you do busy
waits then UML will hog the host.
The patch is incomplete because it only addresses the printk situation,
not data from applications. (Well, data from applications is helped also, but
that is not intentional. The code path needs to be split into console and
application writes.)
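
The bounded retry described above might look roughly like this in userspace C. It is a sketch, not the patch itself: the function name write_with_yield is invented, and counting the limit of 100 as failed attempts (rather than total write() calls) is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <stddef.h>
#include <unistd.h>

#define MAX_TRIES 100  /* retry limit; counted per failed attempt here */

/* Write to a non-blocking fd, yielding the host CPU after each attempt
 * that makes no progress. Returns the number of bytes written, which may
 * be less than len: whatever is left after MAX_TRIES is dropped. */
size_t write_with_yield(int fd, const char *buf, size_t len)
{
    size_t done = 0;
    int tries = 0;

    while (done < len && tries < MAX_TRIES) {
        ssize_t n = write(fd, buf + done, len - done);

        if (n > 0) {
            done += (size_t)n;
            continue;
        }
        if (n < 0 && errno != EAGAIN && errno != EWOULDBLOCK && errno != EINTR)
            break;               /* a real error, retrying will not help */
        tries++;
        sched_yield();           /* let the host-side reader run */
    }
    return done;
}
```

The short return is where characters get lost: the caller is never suspended, so once the limit is hit the remaining data has nowhere to go.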
--
Henrik

On Sun, Jul 01, 2001 at 01:38:19 +0200, Henrik Nordstrom wrote:
>> Not that four layers of nested hostfs will ever be usual, but it
>> still looks better even in the first layer.
>
> Almost. You also have to account for meta files in each layer of
> each file, including meta...
Yes. We use your compressed suggestion.
> Regarding your concern about symlinks in the discussion of
> pipes... using symlinks as a storage model is only an optimization for
> EXT2 host filesystems, not a requirement. The translation layer
> could just as well use files.
Yes, I wasn't thinking about the meta files here. We need to store
normal symlinks too, symlinks that look like symlinks inside UML.
We definitely want them to be symlinks on the host too.
If I understand you correctly, you wanted to pass the umask as a mount
time option, let hostfs store on the host what is allowed within that
umask, and use meta files with the mode if we can't store the whole
mode on the host? We could do the same with FIFOs and symlinks. If you
need to back up to a filesystem that cannot handle FIFOs or symlinks,
simply disallow them at mount time, and they will both be stored as
normal files+meta files. Since symlinks are not allowed, the meta
files will be normal files too. This way, hostfs could be used on
e.g. ext2 with good performance and verbosity, and it would still run
on top of e.g. FAT32. The drawback, of course, is complexity.
If we decide to do this, we need to extend the meta file format:
  m  st_mode in octal, excluding char/block/pipe/socket/directory mode bits
  u  st_uid in decimal
  g  st_gid in decimal
  c  st_rdev character device major,minor decimal
  b  st_rdev block device major,minor decimal
  s  socket
  p  fifo
  l  symbolic link (new)

  lrwxrwxrwx 1 root root 3 Jul 2 12:19 foo -> bar
    host permissions: 644
    host content: bar
    encoded data: l
Or maybe:
  lrwxrwxrwx 1 root root 3 Jul 2 12:19 foo -> bar
    host permissions: 644
    encoded data: l:bar
The second method should be faster.
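
To make the difference between the two encodings concrete, here is a small C sketch that recovers the link target from either form. The function name symlink_target and its signature are made up for illustration and nothing here is existing hostfs code. With the bare "l" form the target has to come from the host file's content, which is presumably the extra read that makes the first method slower.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Resolve a symlink target from the proposed meta data.
 *   meta          the encoded data: "l" or "l:<target>"
 *   host_content  content of the host file, used only for the bare "l" form
 * Returns 0 on success, -1 if meta is not a symlink record, the target is
 * missing, or out is too small. */
int symlink_target(const char *meta, const char *host_content,
                   char *out, size_t outlen)
{
    const char *target;

    if (strcmp(meta, "l") == 0)
        target = host_content;   /* first variant: target is in the file */
    else if (strncmp(meta, "l:", 2) == 0)
        target = meta + 2;       /* second variant: target is in the meta data */
    else
        return -1;

    if (target == NULL || strlen(target) >= outlen)
        return -1;
    strcpy(out, target);
    return 0;
}
```

For the example above, both symlink_target("l", "bar", ...) and symlink_target("l:bar", NULL, ...) yield "bar".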
> If the host filesystem is for example reiserfs, then using files is
> almost as efficient as using symlinks (a little more CPU overhead
> due to having to perform open/close to access the data instead of
> only a lstat which does all in one but...)
(readlink, not lstat)
Maybe, but if we use symlinks we get all information we need by typing
ls -al on the host.
Also, it might be easier to write symlinks atomically.
But on filesystems like FAT32, we have no choice. Or are there any
other problems with FAT32 we need to think about? In that case, maybe
we should use UMSDOS for FAT and let hostfs be a pure and simple
Unix-on-top-of-Unix filesystem?
Jörgen

livio@... said:
> [BTW: what did you mean when you said that there doesn't exist a
> notification when a fd becomes writeable? Isn't that why there are
> syscalls like poll() and select()? Or have I misunderstood something?]
What I meant was that sigio_handler only selects for read. And the irq
mechanism doesn't seem to have a way to say that something is writable
(hardware interrupts are always reads). So, the driver can test for both
reading and writing when it gets called, or we can have separate read and
write irqs for drivers that want them.
Neither sounds very appealing to me offhand.
Jeff

Howdy!
Jeff Dike writes:
> hno@... said:
(...)
> > b) Data sent by user space applications SHOULD be flow-controlled by
> > the normal device flow controls, but currently the UML "console"
> > device driver does not signal flow control status.
>
> Livio sent me a ^S/^Q patch which I integrated a while ago and sent to Alan
> yesterday. Maybe that same mechanism can be used when a write fails. We'd
> need a notification when the fd becomes writable, which doesn't currently
> exist.
I'm not sure about this... It seems that in this case, each
tty_driver does its own flow control, so it seems to be ok to make
the proposed control in chan_kern.c, instead of stdio_console.c, for
instance. The major advantage of putting it in chan_kern.c is that
all the callers of write_chan() won't have to worry about errors on
write. Since we have at least 2 calls in UML for write_chan()
(con_write() and console_write() in stdio_console.c), we would otherwise
have to duplicate the write-fail mechanism.
On the other hand, it is the only way (I think) that UML can
properly implement a schedule() to wait for write() to go
non-blocking.
[Henrik, I'm not sure about this, but if you `sched_yield()' on a
threaded process, won't _all_ threads yield? So this is a mechanism to
keep the host kernel "healthier", rather than UML kernel, right?]
I've made a patch, which is similar to Henrik's patch, but it's much
simpler (but less robust, I guess) and works correctly (Henrik's patch
still seems to lose a few chars, but certainly much less than vanilla
UML). I've tried to lose chars with my patch, but was unable ;-(
About my patch not scheduling, I really don't think that's a
problem. Here all we need is to give the host kernel some time to
clear up space in the fd, and usually that's pretty darn fast :-) So,
in the practical world, the while in write_chan won't loop more than
once.
Jeff, if you still feel that it would be better to implement a
write-fail mechanism, please say so, and when I get some more spare
time I can try to think up a way to do it. [BTW: what did you mean
when you said that there doesn't exist a notification when a fd
becomes writeable? Isn't that why there are syscalls like poll() and
select()? Or have I misunderstood something?]
The patch follows.
Best regards to all,
--
Livio <livio@...>