Because in my opinion the amazing truth is: It is in fact UNNECESSARY to recompile any package, including the new GCC, more than exactly once when rebuilding your entire system! (For those who cannot believe this, see my argumentation in the related article https://forums.gentoo.org/viewtopic-p-3548628.html#3548628.)

Well, if that were true it would be a great time saver. So, having read your very clear and well-argued explanation, I thought I would give it a try.

Since you admitted somewhere that there were things like binutils that you were wrong about, I took a couple of extra steps before taking your "only once" approach.

To be precise:

I already had gcc-4.1.1, which had been compiled twice before I switched. It seems this is now taken care of in the ebuild anyway.

I then cloned the entire system to another partition, rebooted into the new system and started the gcc-4 rebuild.

First I did

Code:

emerge binutils glibc

and then a full

Code:

emerge -e world

Since I wanted to follow exactly what was happening, this was all done by hand, not using your script. I had to deal with a few packages that failed, but after a couple of updates and a bit of rebuilding I got to the end of the world rebuild.

Now when I try to run kdevelop I get this:

Code:

bash-3.1# kdevelop
kdevelop: /usr/lib/gcc/i686-pc-linux-gnu/3.4.6/libstdc++.so.6: version `CXXABI_1.3.1' not found (required by /usr/kde/3.5/lib/libkhtml.so.4)

I have previously used the tiresome multi-emerge approach and not had any of this sort of issue.

What do you think is the cause of this breakage?
_________________Linux, because I'd rather own a free OS than steal one that's not worth paying for.
Gentoo because I'm a masochist
AthlonXP-M on A7N8X. Portage ~x86

I had the same problem with other parts of KDE. For me it was caused by gcc-config-1.3.14 not fully setting the default compiler to gcc-4.1.1. To fix this I upgraded gcc-config and eselect-compiler-2.0.0_rc2-r1 and used eselect compiler to set the compiler to gcc-4.1. (When I ran it, it reported that gcc-3.4.6 was the default, even though I had set 4.1 as the default a few days back with gcc-config-1.3.14.)

After that step the broken KDE packages were fixed, with no emerging necessary._________________John

Don't bother trying; if that could be done, Gentoo would have been bought up by a large mining conglomerate and we'd have to pay to use it.

Incredibly helpful and well written. I dup'd my machine to another one I have handy and ran your guide on one and the standard rebuild, rebuild, rebuild stuff on the other. Yours just finished, and so far I see no problems at all other than libidn, which forced me to learn that Java now has generations. The other one is still going, and three times now I've had to manually intervene to fix a problem.

I think your guide should go up for posterity._________________Linux: More fun than porn.

Posted: Tue Nov 21, 2006 8:09 am Post subject: What to do if emerge -auDN world or revdep-rebuild fails

When following my guide, sometimes the problem arises that "emerge --update --newuse --deep world" or "revdep-rebuild" fails, and the question arises whether my guide can still be followed in those cases.

Well, it depends on the packages which are failing.

The main reason why I want the "--update" is that I want stable, up-to-date build tools at the time my script will be started.

The reason for the "--deep" is that I also want those build tools to be based on up-to-date libraries, i.e. glibc and the other important system libraries.

"--newuse" is there for the case that the new profile which the user might have selected enables a different set of build options for the compiler or libraries.

So, "--update --deep --newuse" together will take care of all that and make sure we start our rebuild in a sane environment and with an up-to-date and working set of tools.

The "revdep-rebuild" is there in order to fix things the "emerge --update --newuse --deep world" might have "forgotten".

So, each package which fails in either "emerge --update --newuse --deep world" or "revdep-rebuild" is a package which cannot be trusted to operate correctly during the initial phases of the system rebuild.

Whether it is wise to continue using my script in such cases depends on the relative importance of the failing packages: Will any of them be required by the build system in order to rebuild the basic build tools themselves?

And if not, this will save the day: As my script rebuilds everything from scratch (except for the C compiler, which already has to be up-to-date when the script starts), it does not matter whether any of the remaining packages are also up-to-date when the script starts. They will be rebuilt by my script anyway. Only the build tools themselves, and everything they depend on for successful operation, *must* be up-to-date.

And if that minimal approach does not work that way: Don't blame me!

I have never tested rebuilding the system without a successful "emerge --update --newuse --deep world" and "revdep-rebuild", so it may work or it may not.

You have been warned! I'll take no responsibility for what might happen.

So make backups. Hope for the best. Pray! Or trust in your Live-CD if all your prayers won't help...

If someone is willing to risk trying out the minimal approach suggested above, please post here whether it worked, so that others can benefit from your experience.

Interesting. But I'd recommend installing ccache (make sure it's enabled in FEATURES in make.conf) and doing it the usual way.

Without question, ccache is a great tool. I'm using it all the time.

And it works perfectly with my script. (At least I did not encounter any problems. However, I manually cleared ccache's cache after using gcc-config to set the new compiler, just to be sure there were no cached object files left over from the old compiler.)
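For completeness, clearing the cache is a one-liner; `-C` wipes the whole cache and `-s` prints the statistics afterwards:

```shell
ccache -C   # clear the entire cache
ccache -s   # show cache statistics (hits, misses, current size)
```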

Maybe you should add this step to your how-to. I was several hours into the rebuild when I read this. Now I'm restarting... _________________I could be arguing on my own time.

But it is not as if I were certain that ccache would actually cause trouble.

I just don't know how it works internally, and that worries me, because I cannot rule out the possibility that ccache might return compiled files generated by an outdated GCC.

The question is: How is it implemented?

I can only say: If I were to implement ccache's functionality, it would work as follows:

I would run the preprocessor unaltered, passing through any command-line options which influence the preprocessor. This would be done for every source file specified on the command line.

The preprocessor returns a list of C/C++ source files with all preprocessor macros replaced by their contents. Conditional compilation (#ifdef etc.) and #include files will also have been taken care of already.

Now a loop follows over each already-preprocessed source file:

The complete contents of the command line would be concatenated with the string representation of the exact version numbers of the compiler, linker and assembler, and finally with the contents of the already-preprocessed current source file.

An MD5-hash (or some other hash) is calculated over the contents of that concatenated text stream.

Then the resulting hash value has to be looked up in the cache database: Is it already there?

If yes, then return the already-compiled object file represented by the cache entry, bypassing an actual invocation of the compiler. Also touch the cache entry to make it the newest one.

If no, then run the compiler normally first. But before returning the compiled object file, make a copy of it and store it in the cache database with the previously calculated hash value as the access key.

If the maximum cache size has been exceeded by the new entry, delete the oldest cache entry file in order to keep the cache size within its defined limits.

This continues until all source files have been processed.
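The loop described above can be sketched in a few lines of Python. This is a toy model of the scheme I just described, not ccache's real implementation; the class, the version strings and the fake "compiler" callback are all my own inventions for illustration:

```python
import hashlib

class ToyCcache:
    """Toy model of the caching scheme sketched above (not real ccache)."""

    def __init__(self, max_entries=2):
        self.cache = {}      # hash -> cached object file
        self.order = []      # access order, oldest entry first
        self.max_entries = max_entries
        self.misses = 0

    def compile(self, preprocessed, cmdline, tool_versions, real_compile):
        # Concatenate the command line, the tool version string and the
        # already-preprocessed source, then hash the whole thing:
        key = hashlib.md5(
            (cmdline + tool_versions + preprocessed).encode()
        ).hexdigest()
        if key in self.cache:
            # Cache hit: bypass the compiler and "touch" the entry.
            self.order.remove(key)
            self.order.append(key)
            return self.cache[key]
        # Cache miss: run the real compiler and store a copy of its output.
        self.misses += 1
        obj = real_compile(preprocessed)
        self.cache[key] = obj
        self.order.append(key)
        if len(self.cache) > self.max_entries:
            # Evict the oldest entry to stay within the size limit.
            del self.cache[self.order.pop(0)]
        return obj

cc = ToyCcache()
# The same source and flags, "compiled" by two different GCC versions:
out_old = cc.compile("int main(){}", "-O2", "gcc-3.4.6", lambda s: s + " [3.4.6]")
out_new = cc.compile("int main(){}", "-O2", "gcc-4.1.1", lambda s: s + " [4.1.1]")
# Because the version is part of the hash, the second call is a miss,
# and the outdated object file is never returned.
```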

If the real ccache actually works like that, there is indeed no need to worry: The changed version number of the compiler will result in a different hash value for each source file, and so object files compiled with the old compiler will never be returned.

But if ccache doesn't include the version numbers of the build tools in the hash value it calculates, then ccache will not detect that an object file was compiled by a different GCC than the current one, and an outdated object file might be returned.

So, it depends.

If you clear the cache, you are on the safe side.

drescherjm wrote:

I rebuilt my system without clearing and did not have a problem.

But have you also tested every single executable since then?

The problem is, ccache maintains a cache. Which means it will not store every object file ever compiled, but just the latest 2 GB or so of them. Older ones will be purged to make room for newer ones.

Which essentially means: Whether or not an outdated object file has a chance to "survive" a full system rebuild depends on the build order.

If enough source files that only produce cache misses are compiled first, the resulting object files will replace all the existing (outdated) object files as the new ones are saved into the cache.

But if a package is recompiled which produces cache hits for outdated object files in the cache, those outdated object files will be returned.

So it might very well be that everything went OK for you - but you might just have been lucky...

If anyone reading this knows how ccache actually works in this regard - please share your knowledge with us!

Quote:

If the real ccache actually works like that, there is indeed no need to worry: The changed version number of the compiler will result in a different hash value for each source file, and so object files compiled with the old compiler will never be returned.

I remember reading the ccache docs, and they do mention that it works like this. I mean, the new gcc version will generate a new hash.

Quote:

I remember reading the ccache docs, and they do mention that it works like this.

Thanks for that hint! I RTFM'd and finally found:

Quote:

HOW IT WORKS

The basic idea is to detect when you are compiling exactly the same code a 2nd time and use
the previously compiled output. You detect that it is the same code by forming a hash of:

o the pre-processor output from running the compiler with -E

o the command line options

o the real compiler's size and modification time

o any stderr output generated by the compiler

These are hashed using md4 (a strong hash) and a cache file is formed based on that hash
result.

This looks very good indeed, because it includes a hash of the whole compiler, and not just the version number of it.

However, the question remains what the author of the man page had in mind when referring to "the compiler": only the driver program, or all the many executables the GNU Compiler Collection consists of as well?
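To make the quoted scheme concrete, here is a small sketch hashing those four inputs. It is my own toy, not ccache's code; I use md5 because Python's hashlib does not always ship md4, and a temporary file stands in for the compiler binary:

```python
import hashlib
import os
import tempfile

def ccache_key(preprocessed, cmdline, compiler_path, compiler_stderr):
    """Hash the four inputs listed in the quoted man page section."""
    st = os.stat(compiler_path)
    h = hashlib.md5()
    for part in (preprocessed,                       # preprocessor output (-E)
                 cmdline,                            # command-line options
                 f"{st.st_size}:{st.st_mtime}",      # compiler size + mtime
                 compiler_stderr):                   # stderr from the compiler
        h.update(part.encode())
    return h.hexdigest()

# Installing a different compiler changes the binary's size (and usually its
# modification time), so the cache key changes even for identical source:
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"old compiler")
old = ccache_key("int x;", "-O2", f.name, "")
with open(f.name, "wb") as g:
    g.write(b"a new, bigger compiler binary")
new = ccache_key("int x;", "-O2", f.name, "")
assert old != new   # the stale object file can no longer be hit
```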

And what about changes to the assembler?

For instance, assemblers typically can perform "jump optimizations" of various types which cannot reliably be detected at the assembler source level (where the GCC code generator backend operates).

Some CPU architectures feature "long" and "short" jump instructions. Depending on the jump distance from a given assembler source line to some other instruction specified by a symbolic label, the assembler can use either the longer or the shorter version of the jump instruction. And the jump distance depends on the actual encoded size of the machine code emitted by the assembler, which only the assembler knows, not the compiler.

On the other hand, "shorter" means not always "better", because changes to the code alignment in memory can also trigger changes to the CPU cache usage.

Also, the superscalar architecture of today's CPUs, with its pipelines and jump prediction heuristics, can make the optimal decision about "what is the best way to encode, potentially reorder and align an assembler statement" a really complicated problem to solve.

Of course, the compiler will take care of most of those optimization matters.

But the GCC backend (as well as the backends of most traditional UNIX C compilers) is restricted by the fact that it generates assembler source code, not binary machine code.

That means that although GCC might output the optimal assembler statements, it does not know how the assembler will transform those statements into actual machine instructions.

This also means GCC cannot really optimize for CPU cache usage, because it cannot predict the alignment.

If any such optimization is performed at all, it can only be done by the assembler itself (unless GCC will be completely rewritten to directly emit machine code).

And if a new version of the assembler changes the way it optimizes those things, a different object file will result - even though GCC might emit exactly the same assembler statements.

If ccache only checksums the compiler executables, and not the assembler, it cannot detect such changes.

On the other hand: Who cares...

But even if we ignore those (possible) differences: What about environment variables?

There are a couple of environment variables which can change the way the compiler works.

Just think about CFLAGS!

And I cannot remember having read that ccache also checksums the contents of such variables. Which could lead to totally different object files for different compiler runs, even though all the command line options as well as all the source files are identical.

Only the fact that most packages use automake protects against this, because automake-generated makefiles set up their own CFLAGS.

But... what will happen for packages that do not use automake? There are some of those in the portage tree.

What if the CFLAGS in /etc/make.conf have been changed manually by the user? Will ccache detect that?
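Whether such a change is detected comes down to whether the variable's contents make it into the hash key at all. A toy illustration (my own key function, not ccache's actual behavior):

```python
import hashlib

def key(source, cmdline, cflags=None):
    """Toy cache key; CFLAGS is included only if the caller passes it."""
    parts = source + cmdline + (cflags or "")
    return hashlib.md5(parts.encode()).hexdigest()

# If CFLAGS is ignored, builds with different flags collide on one key,
# so a stale object file could be returned:
same_a = key("int x;", "cc -c x.c")
same_b = key("int x;", "cc -c x.c")
# Including CFLAGS in the hash separates the two builds:
opt_o2 = key("int x;", "cc -c x.c", "-O2")
opt_os = key("int x;", "cc -c x.c", "-Os")
```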

When I was halfway through the recompile, I noticed that modules would no longer work. It took me a while to figure out that the modules (nvidia-drivers) had been recompiled with the new gcc, while the kernel was still compiled with the old one.

My last note: For some reason I could not discern, after the script was almost finished, GRUB would no longer work. I had to boot from CD and reinstall GRUB on the MBR. After that everything seems to be fine. (I can't compile app-misc/beagle anymore - that stopped your script from finishing.)

Thanks for your great script and guide. My system is now up to date and compiled with the gcc 4.1.1

Seems Guenther's post has been updated since that 20060926 date, so I went ahead and used that version. It's been running for a while now, and all seems well -- but v.e.r.y s.l.o.w. These emerge commands are taking several times longer than they would with a normal emerge run on this system. Is this typical? Since this is such a small system resource-wise, I'm wondering if the script is attempting too much parallelism for the architecture?

Anyway, thanks again for automating and simplifying a complex process.

I was running this as a background task, and then disowned it & logged out. Can't see how, but could that be the issue? Otherwise, it seems there might be something in the script that is breaking these builds right before they complete. This is a small, slow system.

OK, so you are suggesting doing the gcc-config before the emerge --update, then.

It should certainly not be a problem changing this in my guide, but first I have to understand what exactly the problem was.

From what I've seen in the link you provided, they were actually getting a "gcc too old"-sort of error message.

But in our case, GCC will be up to date when glibc is finally compiled by my script, so I am uncertain at which step of my guide the problem actually occurred in your case!

So far, the logic of my guide is:

The initial emerge --update brings all packages up-to-date, including glibc and the old GCC as well as all the tools required for rebuilding the system.

Then the new GCC is emerged. Now both GCCs are installed, and still all packages are up-to-date.

Now the user runs gcc-config and sets the new GCC as the system default compiler. But the user does not compile anything with it yet.

Instead, the user starts my script, which will generate a list of all packages in the system and determine the correct order in which they should be recompiled. This includes glibc.

The user starts the generated script. glibc should be emerged as the second package of all, and the first one to be actually compiled. This will already be done with the new GCC, which is now the system C compiler.
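In command form, the flow above is roughly the following (a summary only; the script name and the exact gcc-config profile are placeholders of mine, not literal values from the guide):

```shell
emerge --update --newuse --deep world   # 1: bring everything up to date with the old GCC
emerge gcc                              # 2: install the new GCC alongside the old one
gcc-config i686-pc-linux-gnu-4.1.1      # 3: select the new GCC (profile name is an example)
source /etc/profile                     # make the shell pick up the new compiler
./rebuild-everything.sh                 # 4+5: generate the ordered list and rebuild, glibc first
```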

If there is an error in this flow of logic, please be more specific and tell me at what point of the above list the problem occurs.

I may then be able to realize the origin of the problem. (Let's hope!)

I tried to do emerge --ask --update --deep --newuse world, and when it got to compiling glibc-2.4, I got the error "These critical programs are missing or too old: gcc" and it got kicked out of the emerge.

emerge --ask --update --deep --newuse portage linux-headers glibc gcc binutils results in the same problem. Not sure how to work around this.

Quote:

I tried to do emerge --ask --update --deep --newuse world and when it got to compiling glibc 2.4, i got the error These critical programs are missing or too old: gcc and it got kicked out of the emerge.

emerge --ask --update --deep --newuse portage linux-headers glibc gcc binutils results in the same problem. not sure how to work around this.

Although you can ask here, I think you will be better served if you post this as a new thread, along with a few bits of info.