Building the AROS and AROS-contrib trees for i386 takes a long time and often requires fairly recent versions of the build software. Trying to do it under slow emulation with obsolete versions of the build tools, or on real m68k hardware, isn't going to make the job any less painful.

It makes more sense to create a GNU m68k-amiga cross-compiler environment under GNU/Linux, FreeBSD or Windows and try building AROS for the m68k-amiga target. Transfer the files over and test compatibility with UAE or Amiga Forever. It is recommended to create a 68k-Linux-hosted AROS and test it within Aranym. That way you don't need to worry about the custom chips in the beginning. You don't need to write any new drivers; the existing ones for other Linux-hosted AROS flavours should work.

From here, if you remove this completely and add the g++ files from the link I posted, then G++ should build with the C++ lib.

m68k-amiga-aros-ar is a link pointing to /home/weissms/media/data/aros/build/crosstools/m68k-aros-ar

After about 4 hours of building under my VirtualBox Linux machine it finally aborted with this:

Make sure you have an up-to-date tree. That file was broken a few days ago (r46006). By the way, there is no need to compile everything; a bootable setup only needs the workbench and kernel-link-amiga-m68k targets. AFAIK non-m68k drivers are not disabled because no one has bothered, and at least in theory most of them can work via an Amiga Zorro PCI bridge. Intel GMA of course can't work :)

Has anyone tried to make a g++ cross-compiler for 68k? Or are there simpler ways, maybe, to compile OWB for AROS 68k? This is Fab's Odyssey. The build is not coupled with the AROS build system. It does require cmake, though, and some of the libs that are not built in contrib by default (and it takes 1 hour to build on a 3 GHz machine with -j 3).

I made a 680x0 cross-compiler for the MegaDrive (it also includes an SH2 cross-compiler, but that's easy enough to cut). It's a generic build of 4.5.2, however, it may not have the flags wanted. Here's the version info

Note things like the thread model, disabled thread local storage, etc. Those would need to be changed in the makefile that builds the toolchain. The process and files I use to build the toolchain can be found at GenDev

It can be altered fairly easily to accommodate whatever flags AROS needs as well as a different default target CPU (currently set to 68000).

cd ~
mkdir build
cd build
(It is very important to have separate build tree, do not compile inside source tree)
../AROS/configure --target=amiga-m68k --with-optimization="-Os" --with-serial-debug --disable-mmu
(do not change optimization setting, it is not going to work)
make
(wait until you get some error, it is some package that is not yet m68k compatible, not important)
make kernel-link-amiga-m68k
(ROM images are now in distfiles)
make workbench
(creates the full Workbench setup)

Hmm. Looks like .text alignment is not allowed on your cross compiler. What version of m68k gcc are you using? If you used the arch/m68k-amiga/doc/build-toolchain.sh script, this is the version you should have:

The i386 and x86_64 builds use a "fake crosscompiler" (host's gcc + configuration). I think you should look at how the mingw32 port is built: it uses a real, prebuilt crosscompiler for the complete chain AFAIK (ask Pavel Fedin for more info). If you followed this approach, we could even set up nightly builds on a VPS for your port, something I think would be very important so that people working on other ports do not introduce regressions into m68k.

A lot of stuff gets added to "generic" targets (e.g. linklibs) and is then used via that same target, which causes a lot of unrelated stuff to get built when you try to use a specific target. It would be nice (IMHO) if these targets were only used to group the main system objects and not just everything, and if the mmakefiles correctly used the dependency targets for the modules they actually use.

The target should be to get rid of all includes and linklibs as metadependencies, not to extend them.

Core-linklibs is the metatarget for the linklibs added by default by gcc. Programs using some linklibs need to explicitly provide the proper dependency in their mmakefile.

It's my binutils' 'strip' that is eliminating the .rela.text and .rela.rodata sections (zero sized). From what I know we should not use strip as a separate command on AROS binaries, because it removes some symbols that AROS programs need in order to work. We can however add -s to the linker command line.

--strip-unneeded --remove-section .comment

is OK but pure "strip" creates corrupt binaries.

I had to change the Graphics/*LayerRom family to use A4 instead of A5, since GCC seems to want to believe that the A5 frame pointer is immutable. How about the "-fomit-frame-pointer -fcall-saved-reg-a5" options? Could you add an ugly wrapper in the layer jump table that switches registers (similar to Disable() and friends)? Did this also cause the unexplained boot menu crash?

but:

GCC treats -fomit-frame-pointer as a hint, not a requirement. The reload1.c algorithm in GCC can force the use of a frame pointer, regardless of the 'omit-frame-pointer' setting.

When the frame pointer (%a5) is in use, for the reason explained above, GCC gives me the following error:

cc1: error: can't use 'a5' as a call-saved register

The biggest problem is not how pervasive this issue is, but how *rare*. If it was a *huge* problem, then I could just suck it up and deal with the changes that would cause huge bloat.

As it is, the following are all the calls in AROS (that are defined in *.conf files) that could have issues, which means I can't justify huge bloat:

Make a stack-parm wrapper for all library functions, that then calls the real regcall functions via jsr NN(%a6).

+ In this model, the AROS_LH* macros generate _Exec_Foo (stackcall) wrappers for Exec_Foo (regcall). AROS_LC* macros would call the _Exec_Foo stackcall wrappers
+ AmigaOS 3.x apps would work unchanged against the ROM
- Unnecessary bloat for the ROM. This bloats all calls in the ROM to fix only 10 calls.
- Only works for the ROM. Things get out of hand and messy for building the AROS user-space libraries and apps.

Make an additional library call for each of the affected libcalls that use A4 instead of A5 (i.e. Exec/Supervisor_A4), and fix up the callers to have a '#define Supervisor Supervisor_A4' somewhere appropriate in their headers. Then define an asm stub for each function that saves A5 and moves A4 into A5, calls the real routine, then restores A5.

+ Relatively low impact, only affects the problem APIs
+ Will work with existing AmigaOS 3.x apps
+ No problems with compiling AROS apps
- May have to get genmodule in on the act to get this to work cleanly.
- Will *permanently* affect the AROS ABI for m68k, and consume library call slots for *ALL OTHER ARCHITECTURES*.

Fix gcc to give diagnostics when it feels 'forced' to use a frame pointer, sufficient to allow a programmer to make the correct changes so that gcc would not make a frame pointer in that routine.

- This code is very convoluted and ugly. I tried this once, and ran away screaming.

Fix gcc to never need a frame pointer on m68k

- This may be impossible. reload1.c is an impenetrable morass of evil.

Fix gcc to recognize when the FP is going to be clobbered, and move the FP to an alternate register.

+ Would be a fantastic fix, and should be portable to other archs.
- Looks like this may be a huge task, or at least one outside my skill set.

Make a M68K target backend for LLVM

- Would distract from the AROS KS Phase I task for at least a month.
- May run into intractable issues (fixed ABI definition per arch?, impossible to debug LLVM code generation issues?, etc)
+ HUGE long term benefits if we get it to work!

gcc 2.95 is incredibly obsolete - AROS uses a number of GCC 3.x or higher features both at compile and link time.

gcc 2.95 code generation is horribly bloated on m68k - far worse than gcc 4.5.1. For example, the 400K ROM text space on 4.5.1 with my regcall (broken) inline hacks bloated up to 1.2M on gcc 2.95. Unacceptable! - One of the worst cases was Exec/Forbid - it bloated from 2 instructions to 35!

The code generation bug I was hoping to avoid - it's evidently ancient, as 2.95.3 still generates the same bug in the same place. (it uses a lea %sp(#n),%a3 instead of move.l %sp(#n),%a3, due to an optimization issue in some corner cases). So, what I'm going to do now is focus on the 2.95.3 code optimization bug (which, sadly, shows up in PrepareExecBase), fix it, then port that patch up to gcc 4.5.1. It's -fpic optimization, that gets corrupted by the register usage. -fno-pic squashes that bug for now, but I'll need to re-investigate this when I need to make 'real' PIC libraries.

GCC 2.95 (maybe not all versions of it, but IIRC 2.95.3 did) can produce wrong code for vararg macros like this:

Both on x86 and 68k. IIRC that was one of the reasons for the switch to gcc 3.

That's from the binutils assembler. To find the error you can use a switch that keeps the temp file from being deleted (GCC's -save-temps does this).

To avoid the startup code, use the linker options

-nostartfiles
-noixemul

GCC 3.4 is too broken to compile ffmpeg with the optimizer on. ffmpeg is built with GCC 4.5.0 (not the AROS version), but it also compiles well with GCC 4.3.2 (which was used before GCC 4.5 was out).

Only Cygwin-hosted GCC 4.3.2 and GCC 4.5.0 are available here. Many Amiga ports compile with those compilers.

These compilers also come with a text file describing how to build GCC. 68k is officially supported in GCC, so it's easy to build a GCC for whatever Linux you like.

I also recommend using the new math-68881.h from the includes or from ixemul. It is specially modified to work with newer compilers. The problem is that newer GCC versions need a \n\t added after each asm line, and the newer asm code has this.

I think the optimizers are switched off in that AROS build. Also, by default GCC creates only 68000 code.

I can't see how they will fit everything in from just a 3.1-level Kickstart without using tricks that the 4000T used, like moving workbench.library to disk. That compiled C code will be bigger in almost every case. Most of AmigaOS 3.1 is written in C. dos.library was written in BCPL up to 1.3 and converted to C for 2.0 (or even the 1.4 beta?). IIRC they used several compilers, one being Lattice, which turned into SAS/C. Back in the gcc 2.95 days I did some benchmarks, and the software I tried was quicker with gcc than with SAS/C. There are many factors that will influence how large and how fast AROS68K runs. It's far too early to even contemplate what it will be like. However, CD32s and A1200s can have a 1 MB Kickstart; A500s and A2000s need wires added. No idea about the A3000.

It appears that we have locked ourselves into the GCC family of compilers, due to the use of the idioms in compiler/include/asm.c

These idioms are used to get the target's GCC C compiler to determine the offsets of structure elements and emit them as inline assembly. The inline assembly is then used to generate the assembly header files.

The reason it's done this way is that the host compiler can't properly compute the offsets, say a Linux host (little endian, long aligned) compiling for m68k (big endian, short aligned). Was there any reason, in the history of AROS, for not using the #pragma pack(n) preprocessor directives? I think the reason is that pragma pack() gets you an approximation of how the cross-compiler would generate the structure offsets, but the asm.c method gets the offsets *precisely*. For most cases, pragma pack() will probably get you what you want (especially for PPC vs m68k, which have similar packing rules and the same endianness), but it may not always get the right answers for, say, cross-compiling for m68k on x86. Using pragma pack should guarantee the same structure alignments and offsets, independent of the host architecture and compiler being used. OS4 and MorphOS use it, and vbcc supports it too. I think OS4 only uses it for structures that were already defined in OS3.x. New structures don't use that packing. On 64 bit this would not be the case, as pointers (and IPTRs, which replace ULONGs) will take double the space. PPC and m68k are similar in their packing rules; x86 can be very weird. We decided to use the native packing of the CPU for maximum speed.
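The packing difference is easy to see on any hosted compiler. The following is a minimal sketch, not the asm.c mechanism itself: the two structures are made up for illustration (they are not AROS structures), and the helper names are hypothetical. The same fields are declared once with the host's native alignment and once under #pragma pack(2), and offsetof() reports where the compiler really put them.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure for illustration only; not from the AROS headers. */
struct demo_native {
    uint8_t  type;
    uint32_t value;   /* placed on the host's natural alignment */
};

#pragma pack(push, 2)
/* Same fields, but with 2-byte alignment, similar to the m68k GCC layout. */
struct demo_packed2 {
    uint8_t  type;
    uint32_t value;   /* placed on a 2-byte boundary */
};
#pragma pack(pop)

/* Offset of 'value' as the host compiler lays it out by default. */
size_t native_value_offset(void)
{
    return offsetof(struct demo_native, value);
}

/* Offset of 'value' under pack(2). */
size_t packed2_value_offset(void)
{
    return offsetof(struct demo_packed2, value);
}
```

On a typical x86 host the two functions return different values, which is exactly why offsets computed by the host compiler cannot simply be reused for the m68k target.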

Unfortunately, VBCC's inline assembly method doesn't allow for replacement operations in the assembly (it's stuffed as-is into the output file), so the asm.c method can't be used. Such a feature is not easily added. And even if it existed, it wouldn't help you with vbcc/vasm. I had a glance over asm.c, and it looks like it will insert #define directives into the assembler source. That cannot work with vasm, which doesn't run a C preprocessor over its input source (after all, it is an assembler and knows nothing about C). For the AROS/VBCC toolchain, I'm using only vbcc; cpp, as, ld, etc. come from GCC/binutils. I didn't want to rewrite all my assembler, and I know the intricacies of the GNU toolchain pretty well. But you are aware that vasm does all the peephole-optimization work for the compiler? Without an optimizing assembler the code will be noticeably slower and larger. Easy. I use the '-gas' option to vbccm68k, and it emits AT&T syntax.

For VBCC (or SAS/C, or any other 'Amiga style' compiler), the asm.c technique is not usable. There have been discussions about that before. Some people proposed extracting the needed information from the object (.o) files with some trickery. Another possibility is to hard-code a cpu.i file for each CPU instead of generating it. If I was to ask for *one* vbcc change, it would be supporting this idiom:

Basically, be able to treat the inline assembly like a sprintf() string. If we had that, I could get asm.c to work under vbcc. Either you have an argument to the function, then you would write:

int func(__reg("a0") char *array) = "...";

Or you will use a global variable:

int func(void) = "lea _array,a0 ...";

But you probably need something like:

int func(void) = "movl #%d,d0\nrts",offsetof(struct Task, tc_State);

It looks like we're stuck with GCC (and possibly LLVM) for all architectures. Yes, unfortunately not only for compiling the AROS system, but also for compiling AROS programs in general. I had already given up years ago trying to support AROS with vbcc, because the SDK seems to be designed for gcc only, and I don't want to patch it with every new release. :(

arch/m68k-all: Flesh out the m68k common kernel routines. Common routines that would be needed by all ports. Of special interest is the arch/m68k-all/include/gencall.c helper, which generates the GCC glue macros that (hopefully) will get us a working native library API without too many GCC patches.

arch/m68k-amiga: Native Amiga support. Provides support for the stack-based and bincompat amiga-m68k builds. The register based amiga-m68k did have compiler issues for some functions (like Exec/Forbid())

Use the generic exec/preparecontext.c? A lot of time was taken to make it usable on all systems. Use of sigcore.h must be avoided; use arch/m68k-amiga/kernel_cpu.h instead and define the needed macros there.

There are plans to unify CPU context structures per CPU family and open them to the public. Or invent some API to manipulate contexts. So private context is a temporary thing.

It is not known if we can safely use Disable()/Enable() in gallium. Better implement atomics for m68k. Generic atomics can be considered temporary hacks. Unfortunately, m68k CAS and TAS instructions are illegal on ChipRAM (see the Amiga HRM). We have to use Enable()/Disable() on m68k-amiga for read/modify/write.
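With CAS and TAS ruled out, a read/modify/write falls back to an interrupt-disabled critical section. A minimal host-side sketch of that pattern follows. Disable() and Enable() are the real Exec call names, but the bodies here are stand-in stubs that only count nesting so the example can be compiled outside AROS; atomic_add_sketch and critical_section_depth are hypothetical helper names.

```c
#include <stdint.h>

/* Stand-in stubs: on AROS these would be the Exec Disable()/Enable() calls.
 * Here they only track nesting so the sketch can run on a host. */
static int disable_nesting = 0;
static void Disable(void) { disable_nesting++; }
static void Enable(void)  { disable_nesting--; }

/* Hypothetical helper: add 'delta' to '*value' inside a Disable()/Enable()
 * pair and return the new value. No CAS or TAS instruction is involved. */
int32_t atomic_add_sketch(volatile int32_t *value, int32_t delta)
{
    int32_t result;

    Disable();                 /* enter the critical section */
    result = *value + delta;   /* read and modify */
    *value = result;           /* write back */
    Enable();                  /* leave the critical section */

    return result;
}

/* Reports how many Disable() calls are still outstanding (0 when balanced). */
int critical_section_depth(void)
{
    return disable_nesting;
}
```

The cost is that interrupts stay off for the whole update, so the protected region should be kept as short as in this sketch.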

Where is the m68k AROS_LIBFUNC_INIT defined? In compiler/arossupport/include/libcall.h

Make a ROM that loads at 0xF00000 instead of 0xF80000 ('cartridge' space)

The Amiga Kickstart at 0xF80000 will see the AROS ROM, and jump to it instead of its own Exec (happens automagically with KS 1.3 or later)

The AROS Exec will scan *both* its ROM space *and* the KickStart ROM space, and use the KS ROM libraries to 'fill in the gaps' for hardware drivers.
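Scanning a ROM area for modules comes down to looking for resident tags. A minimal sketch, assuming the standard AmigaOS rule that a Resident structure starts with the match word 0x4AFC followed by a 32-bit pointer back to itself: the function name and the flat-buffer interface are assumptions for illustration, not the AROS implementation. It could be run once over the AROS ROM area and once over the Kickstart ROM area.

```c
#include <stddef.h>
#include <stdint.h>

#define RTC_MATCHWORD 0x4AFCu  /* marks the start of a Resident structure */

/* Read big-endian values, since m68k ROM images are big endian. */
static uint16_t be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Count resident tags in a ROM image of 'size' bytes that is mapped at
 * address 'base'. A tag is the match word followed by a 32-bit pointer
 * back to the tag itself; a stray 0x4AFC without that pointer is ignored. */
size_t count_residents(const uint8_t *rom, size_t size, uint32_t base)
{
    size_t found = 0;

    for (size_t off = 0; off + 6 <= size; off += 2) {
        if (be16(rom + off) == RTC_MATCHWORD &&
            be32(rom + off + 2) == base + (uint32_t)off)
            found++;
    }
    return found;
}
```

Because the back pointer depends on the mapping address, the same image only yields matches when scanned with the base address it was built or relocated for.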

This should let us get a 'Frankenrom' that will let more developers get a booting AROS that can load AROS m68k ELF programs and work on validation and KS library replacements. I will be FrankenROMing it with Amiga Kickstart 3.1's graphics, dos, mouse, keyboard, and trackdisk drivers to make sure I can get up to an AROS desktop. After that (which will validate exec.library and kernel.resource), I will continue to add AROS libraries (and remove KS 3.1 libs) until it's all AROS.

Romsplit extracts KS ROM modules and re-creates the relocation data. Remus can be used to build a custom ROM, relocated to any address you want.

This is fully UAE compatible (no need for 0xf00000 hacks/UAE boot rom modifications), just configure extended rom image, also it is more real Amiga compatible too, 1M EPROM in place of original ROM maps 0xE0 and 0xF8 areas automatically. (except on Fat Gary based Amigas that lost 1M ROM capability for some reason, it works fine with A500/A600 and A1200). Real CD32 also does it this way. CDTV is the only "weird" Amiga that uses 0xF0 space. (0xF0 is also used by Blizzard 060 card). Note that 0xE0 normally mirrors 0xF8 (if normal 512K rom used), do not accidentally load modules twice :), also make sure 0xE0 ROM header is installed because PC at reset is loaded from 0xE0 ROM if installed.

Where to scan for UAE ROM extensions (UAE HDF, UAE SCSI, etc.)?

That ROM is normally located at 0xf00000 but it also includes fake autoconfig board, it mounts automagically. (at least in WinUAE it is PC-relative and can be moved to other locations, not sure about other UAE versions)

"Filesystem Autoconfig Area" in memory map dump

"UAE Boot ROM" contains the code.

(again, not exactly sure if this is reported in other UAE versions)

Can you add a TCP port based serial emulation to WinUAE? That will make debugging much easier on your end. I have already patched my e-uae to do that. Or, if you are really into pain, you could implement a GDB stub TCP port on WinUAE.... I always use the separate WinUAE debug log window when debugging; the serial to write_log() conversion is usually more than enough.

In WinUAE this works by having your Amiga program call a function. Because uae.resource can be at different addresses, newer WinUAE versions need some additional code to read the vector before switching it on.

Vector 20 enables the enforcer and the serial-to-WinUAE-console output. When d1 & 1 != 0, the enforcer is enabled. When d1 & 2 != 0, serial output at 9600 baud is sent to the WinUAE log file or console. In some cases faster serial output may be wanted; maybe Toni can change the code so that a baud rate of, say, 600000 also writes to the console and no switch is needed.

d1 & 4 != 0 selects the illegal-address stop-program mode. That's similar to the Grim Reaper, but in WinUAE you can jump on an illegal address access into any of the graphical debuggers AOS has.

All modes can be switched at runtime. The serial mode is sticky: once set, serial output is still written to the WinUAE log console or window after a reboot.
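The three mode bits described above can be written down as flags. This is only a sketch: the bit values come from the description, while the enum and function names are hypothetical and nothing here actually calls into uae.resource.

```c
/* Bit values as described in the text; the names are hypothetical. */
enum uae_debug_flags {
    UAE_DBG_ENFORCER      = 1, /* d1 & 1: enforcer enabled */
    UAE_DBG_SERIAL_TO_LOG = 2, /* d1 & 2: serial output goes to the log/console */
    UAE_DBG_ILLEGAL_STOP  = 4  /* d1 & 4: stop the program on illegal address access */
};

/* Build the value that would be passed in d1 from three on/off choices. */
unsigned uae_debug_mask(int enforcer, int serial_to_log, int illegal_stop)
{
    unsigned d1 = 0;

    if (enforcer)      d1 |= UAE_DBG_ENFORCER;
    if (serial_to_log) d1 |= UAE_DBG_SERIAL_TO_LOG;
    if (illegal_stop)  d1 |= UAE_DBG_ILLEGAL_STOP;
    return d1;
}

/* Test whether a given mode bit is set in a d1 value. */
int uae_debug_has(unsigned d1, unsigned flag)
{
    return (d1 & flag) != 0;
}
```

Since the modes are independent bits, enabling the enforcer together with serial logging is simply the value 3.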

Drop NATIVE altogether. This flag enabled some Amiga-specific quirks in the code (grep for it, you'll see). This can be handled by arch-specific code in kernel.resource and CPU-specific code somewhere. In the rare cases where an #ifdef is really needed (like returning something in the a3 register) you can better depend on BINCOMPAT; IMHO it fits better here.

One of the goals for the BINCOMPAT flavour is to eliminate the need for a BSS section for the ROM. This patch removes the BSS data for rom/kernel for the BINCOMPAT flavour, and puts KernelBase at M68K Vector 0 (right below SysBase). There were more BINCOMPAT changes recently. Note that there are other ports that are BINCOMPAT at the moment, though they are not really, because they lack compatible structure packing. But they can become really binary compatible in the future; every port running on a big endian CPU can. What about your port, does it use some pragma pack(2) at some place? I guess we should add this for the BINCOMPAT ports now.

But even then, changes similar to the above will break those other ports because they might need those globals. Here we need another solution. I came across all this because the sam port wasn't booting anymore. For the moment I've switched it from ppcnative to standalone and it works again.

All structs for gcc-m68k are implicitly pack(2).

I would not introduce too many AROS_FLAVOUR defines, as the target is to get rid of them after ABI V1 and for SDK V1. I would like to make modules with globals rommable. I would do this by linking these globals to an absolute address that is known to be in RAM. The first ROM initialization code would then reserve the right amount at the place where the globals are linked and initialize them.

AROS_FLAVOUR_NATIVE

Native OS's syscalls are used instead of Kernel

AROS_FLAVOUR_STANDALONE

Kernel is used to directly control scheduling

AROS_FLAVOUR_EMULATION

???

AROS_FLAVOUR_LINKLIB

???

AROS_FLAVOUR_BINCOMPAT

Binary compatible with AmigaOS 3.x APIs (both PPC and M68K)

I would like to add something to the effect of:

AROS_FLAVOUR_ROM

No .bss nor .data segment allowed in rom/* libraries

If we keep the current flags until the ABI V1 switch nothing should get broken. For sam and efika port and lines like

if (AROS_FLAVOUR & AROS_FLAVOUR_BINCOMPAT)

I need additional information if this is for amiga hardware or just binary compatible. At the moment 16 byte stack alignment is not working or there is some direct hardware banging. This could be rewritten to use some more preprocessor symbols (which ones by the way?) separated into arch dependent macros or inlines or overridden with build_arch_specific.

I don't know how the flavour stuff is going to be changed for ABI V1, but I assume that information about AROS_TARGET_ARCH and AROS_TARGET_CPU is needed, so what about placing those into our generated config.h already now? Then these could help in solving the above problem.

OK, what I want to have is config.h to contain something like:

#define AROS_TARGET_ARCH_AMIGA

#define AROS_TARGET_CPU_M68K

or

#define AROS_TARGET_ARCH_SAM440

#define AROS_TARGET_CPU_PPC

or

<something else>

depending on how configure was called.
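Code could then test these symbols instead of BINCOMPAT. A minimal sketch, assuming an amiga-m68k configure run: the two defines at the top stand in for the generated config.h, and the function names are hypothetical.

```c
/* Hypothetical excerpt of a generated config.h for an amiga-m68k build. */
#define AROS_TARGET_ARCH_AMIGA 1
#define AROS_TARGET_CPU_M68K   1

/* Name the configured target from the arch and CPU symbols. */
const char *target_description(void)
{
#if defined(AROS_TARGET_ARCH_AMIGA) && defined(AROS_TARGET_CPU_M68K)
    return "amiga-m68k";
#elif defined(AROS_TARGET_ARCH_SAM440) && defined(AROS_TARGET_CPU_PPC)
    return "sam440-ppc";
#else
    return "unknown";
#endif
}

/* Amiga hardware quirks depend on the architecture, not on being
 * binary compatible, so they key off the ARCH symbol alone. */
int needs_amiga_hardware_quirks(void)
{
#ifdef AROS_TARGET_ARCH_AMIGA
    return 1;
#else
    return 0;
#endif
}
```

This separates "runs on Amiga hardware" from "is binary compatible", which is the distinction the sam and efika ports are missing with a plain BINCOMPAT test.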

Preprocessor symbols like HOST_OS_linux, HOST_OS_darwin or AROS_ARCHITECTURE that are used in some hosted modules can be generated and placed into config.h the same way. Could you make the configure.in changes, then I'll 'catch up' and fix the code to use those instead of BINCOMPAT where needed.

Or should it be a separate 'CONFIG_ROMMABLE' or something?

One of the fundamental beauties of the AmigaOS Exec is that (with a few added hardware mutexes) SMP support with a MMU is pretty trivial. Each CPU would have its own memory space, with its own SysBase, and you can create a special inter-cpu message pipe to send message to other CPUs.

One way to do it would be to have (for example) CPU0 control all hardware, and all other CPUs would have proxy libraries that use GetMsg() to 'upcall' to CPU0 for hardware access.

Of course, you could also divvy up hardware access among the CPUs, or any number of other possibilities. All CPUs would use the same read-only memory for holding the Exec/Intuition/etc. libraries, but each library's LibBase would be private per CPU.

I don't see how these two things conflict. It may be true that SMP aware drivers may not use global variables for efficiency but that does not mean that it should be forbidden for all drivers. I would like to be able to put any driver or library into ROM. For example for mobile devices with several MB of non-volatile memory. Giving each CPU/core its own memory space will give problems when you want to move processes between the CPUs/cores. I would have a copy of SysBase for each CPU but I would use the MMU to locate it at the same virtual address space. This would allow to move processes in between CPUs/cores.

The old-style Amiga interrupt handlers had their condition code flag Z in %sr to determine whether the handler handled the IRQ. Unfortunately, there's really no clean way to support both old-style and new-style interrupt handlers, so for BINCOMPAT, we're just going to ignore the return code, and process all the handlers.
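The two dispatch policies can be compared in plain C. This is a sketch only: the handler type, the node structure and the two sample handlers are hypothetical, and a non-zero return value stands in for the condition-code result of an old-style handler (the polarity is an assumption for illustration).

```c
#include <stddef.h>

/* Hypothetical handler type: returns non-zero if it handled the IRQ. */
typedef int (*irq_handler_t)(void *data);

struct irq_node {
    irq_handler_t    handler;
    void            *data;
    struct irq_node *next;
};

/* Sample handlers for trying out the dispatchers. */
int claims_irq(void *data)  { (void)data; return 1; }
int ignores_irq(void *data) { (void)data; return 0; }

/* Old-style dispatch: stop at the first handler that claims the IRQ.
 * Returns the number of handlers that were called. */
int dispatch_old_style(struct irq_node *chain)
{
    int called = 0;

    for (struct irq_node *n = chain; n != NULL; n = n->next) {
        called++;
        if (n->handler(n->data))
            break;
    }
    return called;
}

/* BINCOMPAT dispatch as described above: ignore the result and call
 * every handler in the chain. */
int dispatch_all(struct irq_node *chain)
{
    int called = 0;

    for (struct irq_node *n = chain; n != NULL; n = n->next) {
        (void)n->handler(n->data);
        called++;
    }
    return called;
}
```

With a chain of three handlers where the second one claims the IRQ, the old-style loop calls two handlers and the BINCOMPAT loop calls all three.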

Why not have TARGET_CC & TARGET_CFLAGS and HOST_CC & HOST_CFLAGS as optional vars when configuring AROS ? And just use well known default values for the usual case. This will be more flexible if we like to use some other compiler.

The AROS_LC (Library Call) macros should use the AROS_LD (Library Declaration) family for casting - the AROS_LP (Library Prototype) is for the C stubs and prototypes, not for the Library ABI.

In the __AROS_LP_BASE(basetype,basename) macro, it is unsafe to ever use or return the 'basetype' element, since it is frequently corrupted by #defines, for example:
struct foo_data {
    struct GfxBase *GfxBase;
    ...;
};
#define GfxBase data->GfxBase
...
int blah_blah(struct foo_data *data, ...)
{
    AddDisplayDriverA(gfxhidd, &tags);
    ...
}
In the above example, AddDisplayDriverA is defined as a call to __AddDisplayDriverA_WB(GfxBase, (arg1), (arg2)), which uses the AROS_LC2() macro, which uses the __AROS_LP_BASE() macro to cast the type. If __AROS_LP_BASE() returns the basetype, the macro expands to the following for the cast of the function pointer:
(LONG (*)(APTR, struct TagItem *,struct GfxBase *))
Which is all fine, until GfxBase gets expanded:
(LONG (*)(APTR, struct TagItem *,struct data->GfxBase *))
So, if '#defining' lib bases is an allowed idiom, then
__AROS_LP_BASE() must never return basetype, correct?
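The corruption can be reproduced with the preprocessor alone. A minimal sketch: STR, XSTR and the three function names are hypothetical helpers, not AROS macros. The type text is turned into a string before and after the libbase #define, so the damage can be inspected at run time.

```c
#include <string.h>

/* Two-level stringification so the argument is macro-expanded first. */
#define STR(x)  #x
#define XSTR(x) STR(x)

/* The type text a macro would paste into a cast while GfxBase is
 * still an ordinary identifier. */
const char *cast_before_define(void)
{
    return XSTR(struct GfxBase *);
}

/* The same text once the libbase has been #defined, as in foo_data. */
#define GfxBase data->GfxBase
const char *cast_after_define(void)
{
    return XSTR(struct GfxBase *);
}
#undef GfxBase

/* Returns 1 if the #define leaked into the type name. */
int cast_was_corrupted(void)
{
    return strstr(cast_before_define(), "data->") == NULL &&
           strstr(cast_after_define(), "data->GfxBase") != NULL;
}
```

Printing cast_after_define() shows the struct tag replaced by the member access, which is the same broken cast as in the expansion shown above.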

You do realize that that made all your C stubs regcall instead of stack call, right? I don't know if these C stubs are even used at all. They did not exist at all years ago. I don't think the AROS_LP macros are/were intended to be used for the prototypes of C stubs. They are meant to be used for the prototypes of the real libcalls (even if mostly unused because of the way the libcalls are done). It's also illogical that the prototypes of the C stub would use the AROS_LP macros, while in the actual implementation (generated file #?_stubs.c) no macros at all are used. Because that means AROS_LP macros would always need to "expand" to normal C calling convention -> why have AROS_LP macro in the first place?

The correct behaviour is for them to be used in the prototypes of real library calls and not be used for the stubs. There may be cases of odd compilers from the past where you needed to have separate prototypes and definitions, or it was just extra compatibility. They've been around probably since rev 1 or something. So my preference would certainly be to keep them, and fix the stubs.

#defines are considered a hack and should be avoided. When they are used, people should take care that they don't cause problems. It's not up to the AROS_LP macros to fix it. In the ABI V1 branch there is a mechanism that lets you get rid of all these #define statements.

AROS_UFCx() should *only* be used for regcall routines. If VNewRawDoFmt() wants a stackcall routine, do NOT use AROS_UFCx(). Just declare a normal C routine. Yes, this is how it is done in MorphOS. I'm sorry for the breakage. I implemented VNewRawDoFmt() before the complete docs were out. I was pretty sure nobody had used it with a custom callback routine. In fact the magic constants declared in exec/rawfmt.h cover 99% of the function's usage.

I notice that AROS program loading is a lot slower than on AOS. I looked at the AROS source, and it looks like the file access in load_block goes around several corners, and it looks slow when fgetc is used. Why can't the load_block function just do an AOS Read? Is the ELF file endianness different from the 68k platform endianness? Is a block not larger than 4 KB? If so, it is always a lot faster to use Read directly instead of the fgetc that is called from the ELF loader.
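The two access patterns being compared look roughly like this in portable C. It is a sketch with hypothetical function names, using stdio's fread() as a stand-in for a direct dos.library Read(); it is not the AROS ELF loader. Host stdio is buffered, so the speed gap on a host is smaller than on a per-byte path that is not buffered.

```c
#include <stdio.h>

/* Read 'len' bytes one at a time, as a loader built on fgetc() would.
 * Returns the number of bytes actually read. */
size_t load_block_bytewise(FILE *f, unsigned char *buf, size_t len)
{
    size_t n = 0;
    int c;

    while (n < len && (c = fgetc(f)) != EOF)
        buf[n++] = (unsigned char)c;
    return n;
}

/* Read the same block with a single call.
 * Returns the number of bytes actually read. */
size_t load_block_direct(FILE *f, unsigned char *buf, size_t len)
{
    return fread(buf, 1, len, f);
}
```

Both functions fill the buffer with identical data; the difference is one library call per byte versus one call per block.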

Optimisation is a lot like bug-fixing, in that, though you would like to create the perfect product, reality is that you have a limited lifespan and you take care of the biggest problems first. Small bugs in non-essential parts of the program may remain unfixed if they don't make the product unusable. Likewise, slow routines that are used seldom may never get replaced, as the overall gain in speed may be negligible.

The difference between the two is that a bug is absolute: Something either does what it is supposed and allowed to do under all circumstances, or it does not. If it does not, then the cause of that is called a bug. Optimisation on the other hand is relative. For optimisation for speed there's a trade-off. You can always get even better speed than your current solution, but at the cost of one or more other factors, like memory usage, disk space or maintainability.

In development, especially that maintainability cost is a problem: If you're still developing, but you've made your code unreadable for the sake of speed, then it'll be impossible to find the remaining bugs. Generally, this is expressed as "It's easier to optimise a working program than it is to make an optimised program work." Thus, in development, for the same piece of code, bug-fixing is given precedence over speed-optimisation.

This would be a very strange phenomenon. Speed and clarity are usually trade-offs. What other factor would have been optimised that these both had suffered from?

No, a good compiler can create fairly good code for most forms of abstract typing, object oriented programming included. There may indeed be some loss of speed, as that's the trade-off for having clearer source code. That clearer source code makes the debugging easier, of course, not harder.

What Knuth writes is that some programmers do try to write very speed-optimised code, and that this causes huge problems for debugging and maintenance. He says that for 97% of the time, it's a waste to do optimisation. Then he says that optimising too early is bad, but that we should optimise that 3%. (The implication is that you shouldn't optimise before you know you're in that 3% of the code.) On AROS there are only a few performance bottlenecks: loading of executables and the icon library. If those are fixed and nothing else is optimised, AROS gets a lot more responsive for the user. This is indeed in line with what Knuth says: if you know that the load of exe, icon lib or OO are in that three percent, whatever they may be, then they should be optimised for speed.

Discovered that the attribute list that's passed to DoSuperNew() is invalid by the time it gets to the superclass's OM_NEW method (only the first ti_Tag value is above 0x80000000 for example). This can be worked around by using -O0 for NList_mcc.c and NListview.c.

I suspect GCC decides to inline the DoSuperNew() method in those files when optimisation is enabled, and hence doesn't bother putting the varargs on the stack, with the result that the &tag1 (AROS_SLOWSTACKTAGS_ARG(tag1)) that's passed to DoSuperMethod() as the attribute list isn't necessarily followed by the remaining tags. This probably means that we can no longer get away without using AROS_SLOWSTACKTAGS on i386, since we shouldn't assume varargs are on the stack (but use va_list etc.). Would introducing SLOWSTACKTAGS for ABIv0 break binary compatibility?
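The safe alternative to taking &tag1 is to walk the arguments with va_list and copy them into a real array. The following is a minimal sketch of that idea, not the actual AROS_SLOWSTACKTAGS macros: struct TagItemSketch is a simplified stand-in for struct TagItem, MAX_TAGS is an arbitrary limit, and collect_tags is a hypothetical name. TAG_DONE being 0 matches the AmigaOS tag list convention.

```c
#include <stdarg.h>
#include <stddef.h>

#define TAG_DONE 0UL   /* terminates a tag list */
#define MAX_TAGS 16    /* arbitrary limit for this sketch */

struct TagItemSketch {     /* simplified stand-in for struct TagItem */
    unsigned long ti_Tag;
    unsigned long ti_Data;
};

/* Copy the variadic tag/data pairs into 'out' (room for MAX_TAGS entries)
 * using va_list, rather than assuming they sit on the stack right behind
 * &tag1. Returns the number of entries written, including the TAG_DONE
 * terminator. All variadic arguments must be passed as unsigned long. */
size_t collect_tags(struct TagItemSketch *out, unsigned long tag1, ...)
{
    va_list ap;
    size_t n = 0;
    unsigned long tag = tag1;

    va_start(ap, tag1);
    while (tag != TAG_DONE && n < MAX_TAGS - 1) {
        out[n].ti_Tag  = tag;
        out[n].ti_Data = va_arg(ap, unsigned long);
        n++;
        tag = va_arg(ap, unsigned long);
    }
    va_end(ap);

    out[n].ti_Tag  = TAG_DONE;   /* always terminate the copied list */
    out[n].ti_Data = 0;
    return n + 1;
}
```

The copied array can then be handed to the superclass as the attribute list, and it stays valid whether or not the compiler inlines the variadic function.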

My GCC 4.5.1 patches are not at all 'ready' for general consumption yet. I still have list corruption on the task ready and wait lists, and I am beginning to fear that there's a GCC bug hiding somewhere.

The function addresses are put in the LVO table of the library by listing them in the .conf file of the library. At any point in the library, no matter how deep in the call chain, you can access the library base with the AROS_GET_LIBBASE macro (which actually just retrieves the value at the address pointed to by %ebx) (x86 code). So what advantage is there to this in existing code (which already passes the library base to subfunctions that need it), aside from burning the %ebx register?

In that case there is no real benefit because that code already works around shortcomings of the current amiga shared library implementation. For new code you can save the stack space used by all the libbases that are pushed on the stack for each function call.

Personally I don't see a way on i386 to implement it without reserving %ebx for this. I chose %ebx as that is documented in the SysV ABI as the global offset table base register for position independent code; I found it somewhat related.

If you plan to use the %ebx register, maybe it would be a better plan to use -fPIC compiled code, and use %ebx to point to the ELF GOT for that OpenLibrary() instance. This would make it easier for all ports, since -fPIC is well defined for ARM, x86, x86_64, m68k, and PPC.

I did have a quick look at this, but it seems that on i386 the generated code treats __GLOBAL_OFFSET_TABLE__ as an address, so the accesses are done through addresses and not through a register. This causes problems on AROS, where there is only one single address space and shared libraries are only loaded into memory once.

The second problem I had with -fPIC is that it is an all-or-nothing approach: when you specify it, all accesses to global variables are done through __GLOBAL_OFFSET_TABLE__. I did want to keep supporting the current case, where only the fields explicitly put in the library base are per opener (per task) and the rest of the global symbols are shared.

In general I found -fPIC too much oriented to UNIX systems with virtual memory and separate address spaces per process. I am no expert in this, so I am open to correction on any false conclusions I have made.

But it is true that I need something like -fPIC, adapted to the AROS architecture:

- On all CPUs, done through a register.
- Possibility to set the register in stub code.
- Possibility to remember the previous value of the register and restore it after the function call. I don't think this is solved yet with -fPIC, as I assume it is resolved during linking when a .so is encountered. I certainly don't want to copy the .so shared object concept to AROS (as AOS4 has done).
- Possibility to reserve the register only for libbase and, by default, not do all addressing of the global variables through the GOT.

For m68k, I believe the compile options we'll want for -fPIC are:

-fPIC -msep-data

And what is the effect on m68k if this option is specified? How would I pass libbase through this mechanism?

And I forgot to mention that I do plan to use libbase in %ebx to make it possible to have modules without .bss or .data section without the need for these ugly #define xxxBase hacks.

At the end of the day, I think we're going to have to stick our hands deep into the slimy guts of some compiler. Be it LLVM or GCC, I think the time is coming to pick an AROS compiler, and start whacking on it to support our needs.

I would go even a step further: would using variadic lists properly break that much? AROS seems to be a research OS, and the incorrect use of variadics is one of the places which could be improved quite a lot... Variadic lists are especially problematic for x86_64, since sizeof(void*) != sizeof(int).

As much as I would like to use them correctly, implicit variadics are heavily used (and abused) in almost all Amiga source code that uses BOOPSI.

Breaking this assumption would break code like this (which is STUPIDLY common in my experience with porting to x86_64):

Remember to undo the rom.ld change. We are not responsible if you forget the modification and later report a so-called crashing "bug" caused by illegal 0xdeadxxxx addresses in ROM :)

Oh, I plan to make the m68k exceptions *identical* to AmigaOS on the stack when I call a process's tc_TrapCode, but I need to know what the AROS kernel that sonic has been working on expects.

By using the register declaration for %a6, gcc assumed all uses of %a6 were references to the AROS_GET_FP global variable.

To get the frame pointer, you can't rely on %fp (or %bp, or anything like that), as GCC can clobber that if -fomit-frame-pointer is used.

__builtin_frame_address(1) should be what is wanted in this context anyway (frame pointer of the calling routine)

In any case, %a5 is the frame pointer under AROS/m68k, not %a6.
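As a small sketch of the __builtin_frame_address() approach discussed above, the helper below asks the compiler for the caller's frame instead of reading a fixed register. The function names are hypothetical. This relies on a GCC/Clang builtin, and the GCC documentation warns that a nonzero level argument can behave unpredictably, so treat it as an illustration on a hosted compiler rather than a statement about what AROS does.

```c
#include <assert.h>

/* Returns the frame address of whoever called this function.
 * noinline keeps it a real call so that "the caller" is well defined. */
__attribute__((noinline))
static void *frame_of_caller(void)
{
    return __builtin_frame_address(1);
}

/* Compares this function's own frame (level 0) with what the callee
 * reports as its caller's frame (level 1). Using the builtin here makes
 * the compiler set up a frame for this function even when
 * -fomit-frame-pointer is in effect. */
__attribute__((noinline))
static int caller_frame_matches(void)
{
    void *mine = __builtin_frame_address(0);
    void *seen = frame_of_caller();

    assert(mine != 0);
    return mine == seen;
}
```

The point of the design is that no register name appears in the source, so the same code can be used whether the target keeps its frame pointer in %a5, %ebp or elsewhere.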

I'm compiling and testing this right now, but this is the smoking gun according to 'git bisect'.

Fix the build. exec/types.h doesn't define size_t, sys/types.h does

stddef.h is even better as it is ANSI-C. sys/types.h is POSIX.
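A minimal sketch of that fix, assuming a hypothetical helper `buffer_len`: the only header needed for size_t is the ISO C stddef.h, so the code does not depend on POSIX.

```c
#include <assert.h>
#include <stddef.h>   /* ISO C header that defines size_t and NULL */

/* Hypothetical helper: counts the bytes before the terminating NUL.
 * size_t comes from stddef.h; no sys/types.h is required. */
static size_t buffer_len(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    while (s[n] != '\0')
        n++;
    return n;
}
```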

I searched the AROS code for the text bestcmodeid, but I don't see Wanderer using it. So I think my mode ID does not match the installation. Using BestCModeIDTags() is best, because the P96 mode numbers are different on all WinUAE systems. Maybe Wanderer can be changed?

Here is code from SDL. If a screen resolution is not available, this call automatically returns the next possible mode ID.

Maybe it's one of my WinUAE settings. Is there a WinUAE config file that is known to work?

Does anybody know an ABI reference for m68k? I am interested in the argument passing for a function call. All C language arguments are passed via the stack (for non-LVO calls). The return value from a C function is in D0. Registers D0/D1 and A0/A1 may be used as scratch by the called function, and the caller must expect them to have been trashed. Registers D2-D7 and A2-A7 must be preserved by the called function. (Some compilers allowed per-argument overrides, but they explicitly map a register to an argument.)

More concretely: who owns A6 during a function call? The caller. Is it preserved by a function call? Yes, the callee must preserve it *if* it modifies it.

Register A6 is LibBase, A5 is the frame pointer (optional), and A4 is the pointer to the .data/.bss area (for some compilers).

A6 is that Library's libbase, and most of the intra-library calls in graphics.library are LVO (register) calls, so this is not surprising.

GfxBase is an LVO call.

There are three or four main calling conventions in AmigaOS depending on who you ask. I'm writing from memory so don't quote me exactly on this.

C calling convention: Slow as frozen molasses and passes everything on the stack. Supports variadic arguments but other than that is practically worthless and is usually optimized to another calling convention. Returns the result in D0. Should not be used with shared libraries except for variadic argument stub calls.

LVO calling convention: LibCalls always allow D0/D1/A0/A1 to be used as scratch and are not preserved by the subroutine. The remaining registers must be preserved by the subroutine. The A6 register holds the library base always during this calling convention. May preserve globals and locals heap in A4 unless large data model is used. May have optional frame pointer in A5 used by debugger.

Standard calling convention: Just like LVO Calling convention except no library base is required. Requires a stub function written in C calling convention to implement variadic arguments as they are not supported from here.

Any benchmarks confirming this slowness? If it is really that slow, how come the C calling convention is used for non-library calls? Of course the effect will be most noticeable on a CPU without data cache and only chip mem. Should we make the m68k AROS compiler use register argument passing for all function calls?

That said, stack cache is only present on an '060 so register passing is definitely preferred. The reason the C calling convention is used is only for variadic arguments. There is no other reason to use the C calling convention. The standard calling convention that uses register passing is MUCH more common.

Also, regarding LLVM's calling conventions, there is another one supported by that compiler called the "cold" calling convention for subroutines that are not called often and when they are called, they need to leave as few registers touched as possible. On a 68k that would mean using only scratch registers and stack without overwriting any of the other registers. It is only needed for optimization and for compatibility with LLVM though.

An MMU is not required for AROS-m68k, though. The 68030 and 68040 use different supervisor instructions to access registers and (some) different supervisor registers; the register bit mappings are different, as are the page descriptors. The 68030 MMU indeed supports more page sizes, and in general is more flexible. The MMU design was simplified for the 68040 to gain speed and reduce complexity. The 68040 and 68060 MMUs are almost identical, the 68060 just adding some more restrictions on caching of MMU tables and renaming a few cache modes. MMU exception processing is different on the 68060, too.

For the record, I wrote the m68k part of the kernel elf loader in Haiku: (though not much tested yet, kernel doesn't build anymore atm)

Btw, I found an old binutils patch to add HUNK support to BFD in the archives of my Ti92 times (1997); surely this can be ported to newer binutils. It's odd that no one kept this around. I also recall writing some code to read hunks to convert to the TI format. (The first TI 68k developments were done using a68k, and some people used Amiga cross compilers before the times of tigcc.)