ken@... (Ken Yap) writes:
> >> Hmm, this points to the ELF header from the first block of the boot file
> >> which is loaded at where mkelf-linux instructs Etherboot to, usually low
> >> memory. But this theory is easy to test. Could someone change first32.c
> >> to make a copy of seg[S_SETUP]->p_paddr before the ramdisk move and jump
> >> to that? If so I'll make the patch.
> >
> >I don't happen to have a machine handy I can run this on but the code
> >builds ok.
> >
> >It turns out we already had a copy of seg[S_SETUP]->p_paddr
>
> I suspect you are right about this. I was thinking of tagged images,
> where the header also specifies where the header block is to go in
> memory. mknbi specifies that it goes into low memory. In ELF images,
> there is no such constraint so the header block is located in
> Etherboot's memory.
You can easily and deliberately load the header of an ELF image,
so there is no need to depend on an Etherboot extension to get it.
> So it looks like I'll have to put out 1.4.3 soon. Of
> course the real mystery is why it didn't break sooner.
Yes....
Anyway I have been up mucking with things much too long and it is off
to bed with me.
Eric
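
One way to read the remark about deliberately loading the ELF header is that
a loadable segment starting at file offset 0 carries the ELF header and the
program header table into memory along with its data. The sketch below checks
for such a segment. It is only an illustration: the struct layouts are
trimmed-down, hypothetical stand-ins (not the real Elf32_Ehdr/Elf32_Phdr), and
segment_covering_headers() is a made-up helper, not something from mknbi or
Etherboot.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PT_LOAD 1u  /* PT_LOAD has the value 1 in the ELF specification */

/* Hypothetical, trimmed-down views of the ELF header and program header. */
struct hdr_demo_ehdr {
    uint32_t e_phoff;      /* file offset of the program header table */
    uint16_t e_phentsize;  /* size of one program header entry */
    uint16_t e_phnum;      /* number of program header entries */
};

struct hdr_demo_phdr {
    uint32_t p_type;
    uint32_t p_offset;     /* file offset where the segment starts */
    uint32_t p_filesz;     /* bytes of the file covered by the segment */
    uint32_t p_paddr;      /* physical load address */
};

/*
 * Return the index of the first PT_LOAD segment whose file range starts at
 * offset 0 and extends past the end of the program header table, or -1 if
 * no segment brings the headers into memory.
 */
int segment_covering_headers(const struct hdr_demo_ehdr *eh,
                             const struct hdr_demo_phdr *ph)
{
    uint32_t hdr_end;
    int i;

    assert(eh != 0 && ph != 0);
    hdr_end = eh->e_phoff + (uint32_t)eh->e_phentsize * eh->e_phnum;
    for (i = 0; i < eh->e_phnum; i++) {
        if (ph[i].p_type == DEMO_PT_LOAD &&
            ph[i].p_offset == 0 &&
            ph[i].p_filesz >= hdr_end)
            return i;
    }
    return -1;
}
```

If such a segment exists, the headers end up at that segment's p_paddr, a
location chosen by whoever built the image rather than by the loader.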

>> Hmm, this points to the ELF header from the first block of the boot file
>> which is loaded at where mkelf-linux instructs Etherboot to, usually low
>> memory. But this theory is easy to test. Could someone change first32.c
>> to make a copy of seg[S_SETUP]->p_paddr before the ramdisk move and jump
>> to that? If so I'll make the patch.
>
>I don't happen to have a machine handy I can run this on but the code
>builds ok.
>
>It turns out we already had a copy of seg[S_SETUP]->p_paddr
I suspect you are right about this. I was thinking of tagged images,
where the header also specifies where the header block is to go in
memory. mknbi specifies that it goes into low memory. In ELF images,
there is no such constraint so the header block is located in
Etherboot's memory. So it looks like I'll have to put out 1.4.3 soon. Of
course the real mystery is why it didn't break sooner.

ken@... (Ken Yap) writes:
> >It looks to me like seg[S_SETUP]->p_paddr is not safe. This is not
> >quite as bad as I thought. There is only the one reference, but that
> >is enough to stop us.
>
> Hmm, this points to the ELF header from the first block of the boot file
> which is loaded at where mkelf-linux instructs Etherboot to, usually low
> memory. But this theory is easy to test. Could someone change first32.c
> to make a copy of seg[S_SETUP]->p_paddr before the ramdisk move and jump
> to that? If so I'll make the patch.
>
>
I don't happen to have a machine handy I can run this on but the code
builds ok.
It turns out we already had a copy of seg[S_SETUP]->p_paddr
Eric
Index: first32.c
===================================================================
RCS file: /cvsroot/etherboot/mknbi/mknbi-1.4/first32.c,v
retrieving revision 1.2
diff -u -r1.2 first32.c
--- first32.c 13 Jan 2003 14:47:21 -0000 1.2
+++ first32.c 2 Dec 2003 09:19:23 -0000
@@ -631,6 +631,6 @@
     for (i = 0; i < 0x7ffffff; i++)
         ;
 #endif
-    xstart(seg[S_SETUP]->p_paddr);
+    xstart((unsigned long)setup);
     return (0);
 }

>It looks to me like seg[S_SETUP]->p_paddr is not safe. This is not
>quite as bad as I thought. There is only the one reference, but that
>is enough to stop us.
Hmm, this points to the ELF header from the first block of the boot file
which is loaded at where mkelf-linux instructs Etherboot to, usually low
memory. But this theory is easy to test. Could someone change first32.c
to make a copy of seg[S_SETUP]->p_paddr before the ramdisk move and jump
to that? If so I'll make the patch.

ken@... (Ken Yap) writes:
> >Ok this looks pretty conclusive. I have looked at first32.c from
> >mkelf-linux/mknbi-linux and it is referencing data from etherboot after
> >it smashes it with the initrd.
>
> Could you point out the line please? The move of the ramdisk is done
> last thing before jumping to the image. I must be missing something.
first32.c goes roughly like:
> ......
>
> header = < someplace in etherboots bss>
>
> .....
>
> static inline void locate_segs(union infoblock *header)
> {
>         int i;
>         Elf32_Phdr *s;
>
>         s = (Elf32_Phdr *)((char *)header + header->ehdr.e_phoff);
>         for (i = 0; i < S_END && i < header->ehdr.e_phnum; i++, s++) {
>                 seg[i] = s;
>         }
> }
>
> ......
>         /* Copy to destination by longwords, tail first */
>         while (sp > (long *)p)
>                 *--dp = *--sp;
> ......
>
>         xstart(seg[S_SETUP]->p_paddr);
>         return (0);
It looks to me like seg[S_SETUP]->p_paddr is not safe. This is not
quite as bad as I thought. There is only the one reference, but that
is enough to stop us.
> >Ken, do you want to have a look at it? I suspect we just want to reserve
> >enough bss space and copy the parameters we care about into the bss.
>
> But the parameters are in low memory already.
Right the kernel parameters should not be an issue.
> What I think is more likely is this unfinished business:
>
> /* look for highest E820_RAM that is under ramdisk_max
> strictly speaking we should also check that
> we have room for the ramdisk in the memory segment */
>
> What I think is happening is that first32.c has latched upon a high
> segment that is not quite large enough for the ramdisk. I made the
> unwarranted assumption there that the top segment would be the largest.
Hmm. Possibly. I think seg[S_SETUP]->p_paddr needs to reside in a local
variable before we can start pointing fingers in that direction.
Eric
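
The ordering problem discussed in this message can be shown in isolation. The
sketch below is not first32.c: the arena buffer stands in for Etherboot's
memory, the payload for the initrd, and demo_phdr is a hypothetical one-field
stand-in for Elf32_Phdr. All names here are made up for illustration. It
contrasts reading p_paddr after the move with copying it into a local first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the single Elf32_Phdr field under discussion. */
struct demo_phdr {
    uint32_t p_paddr;
};

/* Read the header stored at arena + hdr_off without alignment assumptions. */
static uint32_t read_paddr(const unsigned char *arena, size_t hdr_off)
{
    struct demo_phdr h;

    memcpy(&h, arena + hdr_off, sizeof h);
    return h.p_paddr;
}

/* Unsafe order: the "ramdisk move" overwrites the header, then we read it. */
uint32_t entry_read_after_move(unsigned char *arena, size_t arena_len,
                               size_t hdr_off, const unsigned char *payload)
{
    assert(hdr_off + sizeof(struct demo_phdr) <= arena_len);
    memcpy(arena, payload, arena_len);   /* header bytes are gone now */
    return read_paddr(arena, hdr_off);   /* returns payload bytes instead */
}

/* Safe order: take a copy of the entry address before the move. */
uint32_t entry_saved_before_move(unsigned char *arena, size_t arena_len,
                                 size_t hdr_off, const unsigned char *payload)
{
    uint32_t entry;

    assert(hdr_off + sizeof(struct demo_phdr) <= arena_len);
    entry = read_paddr(arena, hdr_off);  /* local copy survives the move */
    memcpy(arena, payload, arena_len);
    return entry;
}
```

With the header placed somewhere in the arena and a payload of arbitrary
bytes, the first function returns whatever the payload left at that offset,
while the second still returns the original address.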

>Ok this looks pretty conclusive. I have looked at first32.c from
>mkelf-linux/mknbi-linux and it is referencing data from etherboot after
>it smashes it with the initrd.
Could you point out the line please? The move of the ramdisk is done
last thing before jumping to the image. I must be missing something.
>Ken, do you want to have a look at it? I suspect we just want to reserve
>enough bss space and copy the parameters we care about into the bss.
But the parameters are in low memory already.
What I think is more likely is this unfinished business:
/* look for highest E820_RAM that is under ramdisk_max
strictly speaking we should also check that
we have room for the ramdisk in the memory segment */
What I think is happening is that first32.c has latched upon a high
segment that is not quite large enough for the ramdisk. I made the
unwarranted assumption there that the top segment would be the largest.
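
The missing check mentioned in that comment could look roughly like the
sketch below. This is not the code in first32.c: demo_e820 and
pick_ramdisk_addr() are hypothetical names, alignment is ignored, and
addr + size is assumed not to overflow. It picks the highest address under
ramdisk_max at which the whole ramdisk fits inside a single RAM region,
instead of assuming the top region is large enough.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_E820_RAM 1u  /* E820 type 1 marks usable RAM */

/* Hypothetical view of one E820 memory map entry. */
struct demo_e820 {
    uint64_t addr;
    uint64_t size;
    uint32_t type;
};

/*
 * Return the highest load address at which rd_size bytes fit entirely
 * inside one RAM region and end at or below ramdisk_max.
 * Returns 0 if no region is large enough (0 doubles as "not found" here).
 */
uint64_t pick_ramdisk_addr(const struct demo_e820 *map, size_t n,
                           uint64_t rd_size, uint64_t ramdisk_max)
{
    uint64_t best = 0;
    size_t i;

    assert(rd_size > 0);
    for (i = 0; i < n; i++) {
        uint64_t start, end, cand;

        if (map[i].type != DEMO_E820_RAM)
            continue;
        start = map[i].addr;
        end = map[i].addr + map[i].size;  /* one past the last byte */
        if (end > ramdisk_max)
            end = ramdisk_max;            /* clamp to the limit */
        if (end <= start || end - start < rd_size)
            continue;                     /* too small for the ramdisk */
        cand = end - rd_size;             /* place the ramdisk at the top */
        if (cand > best)
            best = cand;
    }
    return best;
}
```

With a map whose topmost RAM region is only 64 KB, a 2 MB ramdisk is placed
at the top of the next lower region that can hold it rather than in the top
region.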

"Robert G. Jakabosky" <bobby@...> writes:
> Etherboot drivers:
> lancepci [0x1022, 0x2000]
> eepro100 [0x8086, 0x1229]
> centaur-p [0x1317, 0x0985]
> rtl8139 [0x10ec, 0x8139]
>
> Relocate is disabled for all four drivers. I have booted 5 SSI nodes with
> those drivers.
Ok this looks pretty conclusive. I have looked at first32.c from mkelf-linux/mknbi-linux
and it is referencing data from etherboot after it smashes it with the initrd.
I wonder why no one noticed this before?
Now someone just needs to get up the energy to patch first32.c.
Ken, do you want to have a look at it? I suspect we just want to reserve
enough bss space and copy the parameters we care about into the bss.
Eric

Etherboot drivers:
lancepci [0x1022, 0x2000]
eepro100 [0x8086, 0x1229]
centaur-p [0x1317, 0x0985]
rtl8139 [0x10ec, 0x8139]
Relocate is disabled for all four drivers. I have booted 5 SSI nodes with
those drivers.
On Monday 01 December 2003 10:49 pm, you wrote:
> "Robert G. Jakabosky" <bobby@...> writes:
> > I had the same problem booting nodes with etherboot on a floppy disk.
> > The ramdisk was downloaded then it would just stop. I got it working by
> > disabling "Relocate" when creating the etherboot image. I used
> > ROM-o-matic to create an etherboot version 5.2.2 floppy image. I hope
> > that helps.
>
> Hmm. How very odd.
>
> Which etherboot driver were you using?
>
> Although I can just about imagine this is mkelf-linux referring back to
> the dhcp options after smacking etherboot with the initrd.
>
> Eric
--
Robert G. Jakabosky
AlphaTrade.com
Phone: (604) 681-7503
Fax: (604) 681-7710
Cell: (604) 721-5283