Posted: Sun Mar 20, 2011 10:19 am Post subject: How to recover a lost world file

Hi everybody,
over the years, the Gentoo forum has helped me out quite a few times, so I think it's time to give something back. Here's my first post - hopefully it proves useful to others. If I chose the wrong category for this post, please move it to where it belongs.

How your world file may get lost
I guess there are countless ways to lose the portage world file. In my case, two simple but serious mistakes led to an empty world file. Like many others, I use rpm on Gentoo for the sole purpose of installing drivers for printers and scanners made by Brother (e.g. the MFC series). Yesterday, I wanted to update those drivers, so I issued rpm -e brscan[...] to remove the old drivers before installing the new ones.

This was the first serious mistake, since you must NEVER ever use rpm without the --nodeps option if rpm is not the main package manager on your system. Anyway, issuing said command left me with a completely messed up system: several important files like 'libstdc++.so' and 'stdio.h' were missing, gcc was unable to create executables and so on. I wasn't able to emerge anything, basic programs like eix and locate stopped working - long story short, the system was almost unusable. However, I was able to use links and wget to download a current stage3 tarball which I planned to use to get the basic system functions back online.

I knew that extracting the tarball would probably overwrite a lot of important files, so I created a backup of /etc. That was the second serious mistake I made - before extracting a stage3 tarball in order to recover a broken system, one should back up EVERYTHING, not only /etc. After extracting the tarball, most parts of the system like portage, gcc and most other applications were working again, but the portage world file located at '/var/lib/portage/world' had been overwritten and was now empty.

All of this took place after 2am. As it is mentioned in How I met your mother, "No good things happen after 2am. After 2am, just go to bed". They were totally right... Gentooish high five...

Recovering a lost world file
There is a well-known command called regenworld, which attempts to recover lost world files. In my case, that did not help at all, since regenworld tries to find out about past package installations by parsing portage's log files. Those logs were lost as well when the tarball was extracted, so regenworld did not report a single package on my system.

After trying several administrative commands, I noticed there are at least two ways of getting a list of currently installed packages: equery and eix. Both of them use their own databases and seem to be unaffected by a missing world file. Sadly, the package lists provided by equery and eix contain ALL packages currently installed on the system, i.e. programs, libraries, fonts, codecs etc. The world file, however, should only contain packages that were installed on demand.

Code:

# equery list -i
# eix-installed -a
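
Both commands typically print versioned entries, while the world file stores plain category/name atoms. Here is a minimal sketch of how the version suffix could be stripped, assuming one "category/name-version" entry per line on stdin. The name strip_versions is made up for illustration and is not part of gentoolkit or eix.

Code:

```shell
#!/bin/sh
# strip_versions: hypothetical helper, not part of gentoolkit or eix.
# Reads "category/name-version" lines on stdin, prints unique "category/name".
# Assumption: the version is the trailing "-<digit>..." component, optionally
# followed by a "-rN" revision suffix.
strip_versions() {
    sed -E 's/-[0-9][^-]*(-r[0-9]+)?$//' | sort -u
}

# Example with two sample entries
printf '%s\n' 'media-video/vlc-1.1.8' 'x11-libs/gtk+-2.22.1-r1' | strip_versions
```

Slotted or otherwise unusual entries may need extra handling, so the result should be checked by eye.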

But then I came up with an idea: packages that were installed on demand most likely have no other installed packages depending on them. If something did depend on them, they would have been pulled in as dependencies during the emerge process anyway. I verified that thought by running equery depends <package> on packages like vlc, amarok, kdebase-meta etc., which I knew had been installed on demand. The results were very promising.

Using a script to automate the recovery process
Based on the idea described above, I created a small but effective recovery script. It collects a list of all currently installed packages and then checks, for each package, whether any other installed package depends on it. The script creates a text file containing all packages that nothing else depends on, which may be used directly as a replacement for the lost world file. On my system, the script recovered over 90% of my world file (80 entries out of 85 total). The recovery process takes some time (about 30 minutes in my case) since equery depends queries are very slow. If you have an idea how to speed things up, feel free to comment.
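
One possible way to shorten the run time is to run several lookups at once. The following is only a sketch and not part of the original script: the names world_candidates_parallel, RDEPS_CMD and JOBS are made up, it relies on xargs -P (available in GNU and BSD xargs, not in POSIX), and it assumes that equery in quiet mode prints nothing when no installed package depends on the given one. Whether it actually helps depends on whether the lookups are limited by CPU or by disk access.

Code:

```shell
#!/bin/sh
# world_candidates_parallel: hypothetical helper for illustration only.
# Reads category/name atoms on stdin and prints those for which the lookup
# command produces no output, i.e. nothing installed depends on them.
# RDEPS_CMD and JOBS are made-up knobs; the default lookup is an assumption.
world_candidates_parallel() {
    RDEPS_CMD=${RDEPS_CMD:-equery --quiet depends} \
    xargs -P "${JOBS:-4}" -I{} sh -c '
        # $1 is the package atom; RDEPS_CMD comes from the environment
        if [ -z "$($RDEPS_CMD "$1" 2>/dev/null)" ]; then
            printf "%s\n" "$1"
        fi
    ' sh {} | sort
}

# Usage (needs gentoolkit; packages.txt holds one category/name per line):
#   JOBS=8 world_candidates_parallel < packages.txt > world.recovered
```

The final sort only makes the output order stable, since parallel jobs finish in no particular order.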

Some packages won't be reported by the script because of their special dependency structure. One example is 'sun-jre-bin', which 'virtual/jre' depends on, so the script omits it. Those few missing packages are easy to spot by checking the output of emerge -pv --depclean after copying the recovered world file to '/var/lib/portage/world'. In my case, I only had to add five packages reported by depclean in order to have my world file recovered to 100%.
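
Adding the packages spotted via depclean can be done with a text editor. As a sketch, a small helper could also merge them into the recovered list while keeping it sorted and free of duplicates. The name merge_world_entries is hypothetical and the file name in the usage comment is only an example.

Code:

```shell
#!/bin/sh
# merge_world_entries: hypothetical helper for illustration only.
# Usage: merge_world_entries RECOVERED_FILE ATOM...
# Adds the given category/name atoms to RECOVERED_FILE and rewrites the file
# sorted and without duplicates.
merge_world_entries() {
    file=$1
    shift
    [ "$#" -gt 0 ] || return 0          # nothing to add
    tmp=$(mktemp) || return 1
    { cat "$file"; printf '%s\n' "$@"; } | sort -u > "$tmp" || return 1
    cat "$tmp" > "$file"                # keeps the file's owner and mode
    rm -f "$tmp"
}

# Usage (file name is just an example):
#   merge_world_entries world.recovered dev-java/sun-jre-bin
```

On a system where emerge works again, emerge --noreplace <package> is another way to record an already installed package in the world file without rebuilding it.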

All in all, my system was up and running again after about three hours (including the time it took to write the script) which is well below the amount of time it would have taken to install Gentoo and all my applications from scratch. By using the script, other users with a similar problem might be able to recover their system in less than an hour, I guess.

Find the recovery script I created attached below. Feel free to comment if you have any suggestions on how to improve it.

Best,
fonic

Last edited by fonic on Sun Mar 20, 2011 11:43 am; edited 4 times in total

# Step 3:
# Process the package list and count, for each package, how many installed
# packages depend on it. Packages WITHOUT such reverse dependencies are most
# likely packages that were installed on demand (and therefore need to be put
# in the world file).
while read -r package; do

    # Count the packages that depend on this one. NOTE: this line is missing
    # from the excerpt and is reconstructed from the description in the post;
    # it assumes quiet mode prints one line per depending package.
    numdeps=$(equery --quiet depends "$package" 2>/dev/null < /dev/null | wc -l)

    # If nothing depends on the package, a world entry is found
    if [ "$numdeps" -eq 0 ]; then
        echo "-> Package was added to the list of recovered world entries."
        echo "$package" >> "$fstep3"
    fi

done < "$fstep2"

# Done, show results and depclean notice
numpackages=$(wc -l < "$fstep3")
echo -e "\nThe recovery process is completed. $numpackages world entries were identified and stored in '$fstep3'.\n"
echo -e "Make sure to check 'emerge -pv --depclean' for packages that were not found by this script.\n"

For those that didn't actually bother to read the entire original post:

Quote:

Recovering a lost world file
There is a well-known command called regenworld, which attempts to recover lost world files. In my case, that did not help at all, since regenworld tries to find out about past package installations by parsing portage's log files. Those logs were lost as well when the tarball was extracted, so regenworld did not report a single package on my system.

I do wonder whether extracting the tarball over a live system would really have overwritten the log files, though... is that the default portage /var/log/build.log rather than the individual logs written to /var/log/portage/<package-group>:<package-name>-<packageversion><datetime>.log?

Quote:

For those that didn't actually bother to read the entire original post:

Thank you! I can't stand people commenting on something they actually haven't read...

Quote:

I do wonder whether extracting the tarball over a live system would really have overwritten the log files, though... is that the default portage /var/log/build.log rather than the individual logs written to /var/log/portage/<package-group>:<package-name>-<packageversion><datetime>.log?

Well, I can't tell exactly what happened or how it happened. My focus that night was to get the system running again, so I didn't spend much time finding out what led to the data loss. It might also have been rpm that caused the missing portage log files, since it destroyed a lot on the system. The fact is, regenworld could not find anything in that situation while my way (or: my script) did. And even if regenworld helps in 90% of those cases, I think it never hurts to have a backup solution.

I trimmed your script, and just used the last stage to clean up my world file. I saw I could delete 25 of the 99 entries from my world file.
Then I uncovered a problem with it. If a package has a conditional dependency, the script will flag it as a dependency. For instance, I have gimp installed from world, and gtkam installed. gtkam has the following line in its RDEPEND: "gimp? ( >=media-gfx/gimp-2 )". Since I don't have the gimp USE flag set, it doesn't actually depend on gimp, but equery sees that it does. So your script flags gimp as non-world-worthy.
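
To illustrate the problem, here is a simplified sketch of how USE-conditional groups in a dependency string could be evaluated before a dependency is counted. It is not how equery or Portage work internally. The function name active_deps is made up, the flag list has to be supplied by hand, and everything in the sample string except the gimp group quoted above is invented for the example.

Code:

```shell
#!/bin/sh
# active_deps: hypothetical helper for illustration only.
# Usage: active_deps "DEPENDENCY STRING" "ENABLED USE FLAGS"
# Prints the atoms that remain once groups guarded by disabled flags
# ("flag? ( ... )") or by enabled negated flags ("!flag? ( ... )") are dropped.
# Simplification: "||" any-of groups are treated as if all members applied.
active_deps() (
    set -f                      # no globbing while splitting into tokens
    use=" $2 "
    depth=0                     # current parenthesis nesting level
    skip_from=0                 # nesting level at which skipping started
    pending=keep                # verdict of the most recent "flag?" token
    for tok in $1; do
        case $tok in
            *\?)
                flag=${tok%\?}
                case $flag in
                    !*) case $use in *" ${flag#!} "*) pending=skip ;; *) pending=keep ;; esac ;;
                    *)  case $use in *" $flag "*) pending=keep ;; *) pending=skip ;; esac ;;
                esac ;;
            "(")
                depth=$((depth + 1))
                if [ "$skip_from" -eq 0 ] && [ "$pending" = skip ]; then
                    skip_from=$depth
                fi
                pending=keep ;;
            ")")
                if [ "$skip_from" -eq "$depth" ]; then skip_from=0; fi
                depth=$((depth - 1)) ;;
            "||") ;;
            *)  if [ "$skip_from" -eq 0 ]; then printf '%s\n' "$tok"; fi ;;
        esac
    done
)

# Example: with only "nls" enabled, the gimp group is dropped
active_deps 'x11-libs/gtk+ gimp? ( >=media-gfx/gimp-2 ) nls? ( sys-devel/gettext )' 'nls'
```

With a check like this, gimp would only count as a dependency of gtkam when the gimp flag is actually enabled for gtkam.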

After doing more work, I'd be very cautious about using this. I had 99 packages in my world file, and the script said I only needed 74. Upon further inspection, I actually need 92. So it missed 18 of the 92 packages (about 20%) that should be in my world file.

Having said that, it's still better than nothing, if you have no choice.

Thank you for this post and script! I did the same thing (unpacked a stage3 tarball) one evening when everything was going wrong. I should have gone to bed instead. I figured a solution like this should work and thought of writing a script, but then realised it must already exist somewhere.

!!! Default action for this module has changed in Gentoolkit 0.3.
!!! Use globbing to simulate the old behavior (see man equery).
!!! Use '*' to check all installed packages.
!!! Use 'foo-bar/*' to filter by category.
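
This looks like the notice Gentoolkit 0.3 prints when equery list is called without a pattern. Following its hints, a small wrapper could always pass a quoted glob. This is a sketch only: the name list_installed is made up, and it assumes the --quiet global option is acceptable for the list module.

Code:

```shell
#!/bin/sh
# list_installed: hypothetical wrapper around "equery list".
# Usage: list_installed [PATTERN]
# Without a pattern it asks for all installed packages ('*'). The pattern is
# quoted so that the shell passes the glob to equery instead of expanding it.
list_installed() {
    equery --quiet list "${1:-*}"
}

# Usage (needs gentoolkit):
#   list_installed                  # all installed packages
#   list_installed 'app-editors/*'  # only one category
```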