I have left Gentoo and have not updated this script for a while, though it probably still works. mv has taken over and proposed new features and improvements; aufs support is one good thing among others. His script can also be used on any directory (I would recommend /usr/src/linux and the docs). You should probably have a look at the thread and check mv's version now. Last version is available here. I was sure I had left a notice like this one here, but I only realize now that I must have forgotten...

Abstract

The portage tree typically takes 600MB on /usr/portage. Some people have chosen to keep it on a separate partition with a small block size to

decrease its global size
decrease fragmentation
increase the speed during searches and updates

However, that also means that the free space on that partition is lost. Another, better, solution is to use a file containing a filesystem with a small block size, which avoids the shortcomings of the separate partition. Following the discussion in this post, the best solution found is to compress the portage tree using the compressed, read-only squashfs filesystem and to mount a read-write layer on top of it using unionfs. For convenience, the portage tree should also be mounted on boot-up and updated on shutdown, hence I have written an initscript to automate this process. I will publish and update this program here.
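As a rough illustration of the stacking described above, here is a dry-run sketch that only prints the commands involved. The mount points and the writable directory are hypothetical names, and the unionfs option string follows the unionfs 1.x `dirs=` syntax as far as I know; check it against your unionfs version before running anything as root.

```shell
#!/bin/bash
# Dry-run sketch: print the commands instead of running them (they need root).
# All paths below are illustrative assumptions, not the initscript's values.
SQFS_IMAGE="/var/tmp/portage.sqfs"   # compressed, read-only image
RO_DIR="/mnt/portage_ro"             # hypothetical mount point for the image
RW_DIR="/dev/shm/portage_rw"         # hypothetical writable branch
PORTDIR="/usr/portage"

show_stack_commands() {
    # 1. Compress the existing tree into a squashfs image.
    echo "mksquashfs $PORTDIR $SQFS_IMAGE"
    # 2. Loop-mount the image read-only.
    echo "mount -t squashfs -o loop,ro $SQFS_IMAGE $RO_DIR"
    # 3. Stack the writable branch over the read-only one (unionfs 1.x syntax).
    echo "mount -t unionfs -o dirs=$RW_DIR=rw:$RO_DIR=ro unionfs $PORTDIR"
}

show_stack_commands
```

Writes to /usr/portage then land in the writable branch, while unchanged files are read from the compressed image.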

Assumptions

I will assume that you have

read the original idea as it details what you need to get started and that you are familiar with the bootup/shutdown sequence of your computer,
read the Gentoo doc on initscripting.
moved DISTDIR out of the tree, etc.

Please just read the original post as everything is well documented there and come back once you have done that, unless you do understand what you are doing!

Introduction

In brief, you will need to load the relevant kernel modules and emerge the userland utilities:

Code:

emerge -avt sys-fs/unionfs sys-fs/squashfs-tools

Unmask them if needed (I am now successfully using sys-fs/unionfs-1.2 on sys-kernel/suspend2-sources-2.6.16-r8 and sys-fs/squashfs-tools-3.0), then load the modules:

Code:

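# modprobe treats extra arguments as module parameters, so load one module per call
for m in loop squashfs unionfs; do modprobe "$m"; done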

Then add the modules to /etc/modules.autoload.d/kernel-2.6.
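A small sketch of adding the modules to the autoload file without creating duplicate entries. The function name is made up for this example; the file path is the one named above.

```shell
#!/bin/bash
# Append each module to the autoload file unless it is already listed.
# AUTOLOAD_FILE defaults to the path named in the text; override it to test.
AUTOLOAD_FILE="${AUTOLOAD_FILE:-/etc/modules.autoload.d/kernel-2.6}"

add_autoload_modules() {
    local file="$1" module
    shift
    for module in "$@"; do
        # -x: whole-line match, -F: fixed string, -q: quiet
        grep -qxF "$module" "$file" 2>/dev/null || echo "$module" >> "$file"
    done
}

# Usage (as root): add_autoload_modules "$AUTOLOAD_FILE" loop squashfs unionfs
```

Running it twice leaves the file unchanged the second time.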
The script

/etc/conf.d/squash_portage:

# /etc/conf.d/squash_portage

# SQFS_DIRNAME points to the directory that will contain the sqfs
# images, recommended value is /var/tmp
SQFS_DIRNAME="/var/tmp"

# Leave PORTAGE_RW empty to use tmpfs, a RAM-based filesystem.
# This is recommended unless you are short of RAM.
PORTAGE_RW=""
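The two settings above might be consumed roughly as follows. This is a sketch under assumptions and not the initscript itself: the function names are invented, and the /dev/shm/portage fallback mirrors the default location mentioned in this thread.

```shell
#!/bin/bash
# Sketch of how the two settings might be consumed. The function names and
# the fallback path are assumptions, not taken from the initscript.
load_squash_config() {
    local conf="$1"
    # Defaults mirror the recommended values in the config file.
    SQFS_DIRNAME="/var/tmp"
    PORTAGE_RW=""
    [ -r "$conf" ] && . "$conf"
    # Empty PORTAGE_RW means "use tmpfs": fall back to a directory in /dev/shm.
    : "${PORTAGE_RW:=/dev/shm/portage}"
    PORTAGE_SQFS="$SQFS_DIRNAME/portage.sqfs"
}

check_image() {
    # An existing image is expected; fail loudly if it is missing.
    [ -f "$PORTAGE_SQFS" ] || { echo "missing image: $PORTAGE_SQFS" >&2; return 1; }
}

# Usage: load_squash_config /etc/conf.d/squash_portage && check_image
```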

The script assumes that you already have an sqfs image on your disk, ready to be mounted and called $SQFS_DIRNAME/portage.sqfs. It will save a new image if you synced the tree. The rest is pretty obvious: the rw layer is mounted on /dev/shm by default, for speed and simplicity. But beware that this means you will lose your updates in case of a power failure or if you are careless. I would recommend restarting the script upon hibernation if you use it and do not always reboot into your RAM image.
_________________
Compress portage tree | Elog viewer | Autodetect swap

Last edited by synss on Fri Sep 07, 2007 6:33 pm; edited 19 times in total

th...this is great! i would never think of implementing it via initscript

nice job.

although there is one drawback with this overlay - when you upgrade your kernel you might get stuck with no portage tree at all, since not every kernel might have the squashfs and unionfs modules integrated. there could be a solution in the form of a mini-overlay that would contain only the ebuilds for squashfs and unionfs (and the required eclasses) for forgetful folks. (would it be difficult to make such a backup overlay from an existing portage tree via a bash/sh script?)
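A rough sketch of such a backup overlay script. The function name and the overlay path in the usage line are hypothetical, and copying every eclass is a simplification. Whether portage can actually build from such an overlay without the main tree (profiles and so on) is not verified here.

```shell
#!/bin/bash
# Copy the ebuild directories for the two packages, plus all eclasses,
# from a portage tree into a small rescue overlay.
# Copying every eclass is a simplification; only a few are really needed.
make_rescue_overlay() {
    local portdir="$1" overlay="$2" pkg
    for pkg in sys-fs/unionfs sys-fs/squashfs-tools; do
        [ -d "$portdir/$pkg" ] || { echo "not found: $portdir/$pkg" >&2; return 1; }
        # ${pkg%/*} strips the package name and leaves the category (sys-fs).
        mkdir -p "$overlay/${pkg%/*}" || return 1
        cp -a "$portdir/$pkg" "$overlay/${pkg%/*}/" || return 1
    done
    mkdir -p "$overlay" && cp -a "$portdir/eclass" "$overlay/"
}

# Usage: make_rescue_overlay /usr/portage /usr/local/portage-rescue
```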

1. squashfs is a standard kernel module, which does not require the userland utilities. If you have a recent kernel, you just enable squashfs and loop, and that is enough for mounting the squashed image. Even "forgetful folks" can compile the necessary modules and modprobe them (no need to reboot into the new kernel).
2. unionfs and sys-fs/squashfs-tools are only needed for the updates, because the squashed image is read-only. Hence, they are not required at all as long as you do not try to sync your tree.

I have had a problem with unionfs acting weird during unmount, but that was a mistake on my side (a non-existent locale when compiling glibc, I think). So you might be unable to sync, but you can then compile and load the modules and go on. In any case, it is always possible to download an image of the tree from a nearby FTP server. Yes, it is exactly to avoid that that I implemented the incremental timestamped backups while debugging. (I then left them in because it is a good idea, and you can easily clean the backups via cron if you want to.) So the only risk you are taking, AFAIK, is that you might have to mount the squashed image by hand after recompiling the modules.

with squashfs you don't need the free space to unpack it, and these do not work as one compressed tar file; they work as many small compressed files (or were working this way some two months ago), which IMHO makes the whole idea of compressing the tree futile.
_________________
"I knew when an angel whispered into my ear,
You gotta get him away, yeah
Hey little bitch!
Be glad you finally walked away or you may have not lived another day."
Godsmack

these do not work as one compressed tar file they work as many small compressed files (or have been working in this way some two months ago), which IMHO makes the whole idea to compress the tree futile.

hmm i must have misread something then. i assumed they would create a single compressed file from a dir tree.

Well, I thought the same in the beginning. Even with such an implementation things would have been just OK if there were some sort of file container à la tar around the files. Of course the compression ratio would not have been very good, but there would have been virtually not that much fragmentation and loss of free space.

What about moving the squashed tree to, e.g., /usr/portage_tree.sqfs and creating a symlink to it: ln -s /usr/portage_tree.sqfs /usr/portage.sqfs. The second one should only be a symlink, as it is overwritten on updates to point to the actual tree.

I do not know whether that helps... Could you make sure it is the first mount that is broken? and check that something is mounted by echoing the variables for example. I am sorry for the inconvenience, it works at home...

I am in a cybercafe on XP with a French kbd so debugging is not easy... But I do not see what else but the first mount can be a problem.

Well, I don't know if it wasn't a problem with my computer. I had a few days of "fun" with kernel panics and stuff...
But maybe it would be better to move the previous portage image to something like "portage-old.sqfs" and name the new one "portage-current.sqfs". It would help the "forgetful folks" to auto-delete and keep only one portage image (the one you use) and one as a backup. Otherwise, after a few days you have 10 squashed files... or am I wrong? The script deletes only PORTAGE_SQFS, and that is a symlink.
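The two-file rotation suggested above could look like the following sketch. The file names are the ones proposed in the post; the function name is made up, and this is not how the initscript currently behaves.

```shell
#!/bin/bash
# Keep exactly two images: the one in use and one backup.
# File names follow the suggestion above; the function name is made up.
rotate_sqfs() {
    local dir="$1" new_image="$2"
    local current="$dir/portage-current.sqfs" old="$dir/portage-old.sqfs"
    # Overwrite the previous backup with the image that was in use so far.
    if [ -f "$current" ]; then
        mv -f "$current" "$old" || return 1
    fi
    # Promote the freshly built image.
    mv -f "$new_image" "$current"
}

# Usage: rotate_sqfs /var/tmp /var/tmp/portage-new.sqfs
```

However often it runs, the directory never holds more than the current image and one backup.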

Oh, and I had a problem when I didn't sync (IIRC ;-): during a reboot, the script created "portage" (4096 bytes) and a symlink to it...

synss wrote:

I do not think there should be any problem:

1. squashfs is a standard kernel module, which does not require the userland utilities, if you have a recent kernel, you just enable squashfs and loop and that is enough for mounting the squashed image. Even "forgetful folks" :) can compile the necessary modules and modprobe them (no need to reboot into the new kernel).
2. unionfs and sys-fs/squashfs-tools are only needed for the updates because the squashed image is read only. Hence, it is not required at all as long as you do not try to sync your tree.

What about a kernel change? You have to rebuild the unionfs module, but how do you do it without a portage tree? When you boot the new kernel, you don't have a unionfs module. Or am I missing something here?...
Again IIRC, when I was having the kernel panics, a kernel change and then a "brand new kernel", the output for the unionfs module at boot was

Code:

Loading module unionfs [!!]
No module called `unionfs` [!!]

Then the startup script printed "no module called unionfs" and i had no portage.

PS. Sorry for all those "IIRC" but this was a long weekend with "fight with hardware". /-:
_________________
roslin uberlay | grubelek

I'm glad to see you've found a way to make unionfs work with it reliably. I'll give it a try shortly.
(My current solution is still to have my main server make the squashed image for me on my weekly emerge --sync)

By the way, it is good to combine the new metadata system with squashfs, so you don't just duplicate the data yet again.

To do this, add "-metadata-transfer" to FEATURES and add the line "portdbapi.auxdbmodule = cache.metadata_overlay.database" to /etc/portage/modules.

This prevents the /var/cache/edb metadata cache from being created, as that data already lives in /usr/portage/metadata in the squashed image.
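For illustration, the two settings could be written like this. The ROOT-style prefix argument is a hypothetical convenience for trying the sketch on a scratch directory, and /etc/make.conf is assumed to be where FEATURES is set on your system.

```shell
#!/bin/bash
# Write the two settings described above. The optional first argument is a
# hypothetical prefix so the sketch can be tried outside the live /etc.
enable_metadata_overlay() {
    local root="${1:-}"
    mkdir -p "$root/etc/portage" || return 1
    # Keep any FEATURES set earlier in the file and subtract metadata-transfer.
    echo 'FEATURES="${FEATURES} -metadata-transfer"' >> "$root/etc/make.conf"
    echo 'portdbapi.auxdbmodule = cache.metadata_overlay.database' \
        >> "$root/etc/portage/modules"
}

# Usage (scratch directory): enable_metadata_overlay /tmp/scratch
```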

I am considering this at the moment, but even creating the initial portage.sqfs image takes more than 10 minutes. Am I right to think that every time I shut down my system and the initscript resyncs the squashfs, it will take just as long?

Just wanted to say that this works very well for me. Now I'm wondering why Gentoo does not also provide portage as squashfs files. Downloading 40MB to 50MB and mounting it is much faster than a standard sync!

gerardo wrote:

I've created a BASH script to cleanup the portage squash files older than one month:
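A cleanup of that kind can be sketched with find. This is not gerardo's script: the portage*.sqfs name pattern, the default directory and the 30-day cutoff are all assumptions made for the example.

```shell
#!/bin/bash
# Delete squashed portage images not modified for more than the given
# number of days. Pattern and defaults are assumptions for this sketch.
clean_old_sqfs() {
    local dir="${1:-/var/tmp}" days="${2:-30}"
    # -type f skips symlinks, so a symlink pointing at the live image survives.
    find "$dir" -maxdepth 1 -type f -name 'portage*.sqfs' -mtime +"$days" -exec rm -f {} +
}

# Usage: clean_old_sqfs /var/tmp 30
```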

I am considering this at the moment, but even creating the initial portage.sqfs images takes more than 10 minutes. Am I right to think that every time I shutdown my system and the initscript resyncs the squashfs, it will take just as long?

Only the first time is time consuming. Thereafter, everything is cached and faster; that is another advantage of having the tree compressed: the tree is in RAM. See the other thread for discussion.

I like the idea to realize the task as an init-script very much. However, the current script has a security issue, and several enhancements are possible:

synss wrote:

Code:

PORTAGE_RW="/dev/shm/portage"

Using a fixed name in a world-writable directory is a serious security issue. In this case it might be an acceptable risk, because the initscript is hopefully started before any local attack has a chance to get started, but even then you had better not even think about calling the script later on with a restart option.
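One common way around the fixed-name problem is to let mktemp pick an unpredictable directory name. The sketch below is only an illustration: the portage_rw prefix and the function name are made up, and the chmod rests on the assumption that the union root takes its mode from this branch.

```shell
#!/bin/bash
# Create the writable branch with an unpredictable name instead of a fixed
# path in a world-writable directory.
make_private_rw_dir() {
    local base="${1:-/dev/shm}" dir
    # mktemp -d fails rather than reusing an existing name, so nobody can
    # pre-create the directory or plant a symlink in its place.
    dir="$(mktemp -d "$base/portage_rw.XXXXXX")" || return 1
    # Assumption: the union root takes its mode from this branch, so make it
    # readable by others while keeping it writable by the owner only.
    chmod 755 "$dir" || return 1
    echo "$dir"
}

# Usage: PORTAGE_RW="$(make_private_rw_dir /dev/shm)"
```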

There are two other disadvantages of keeping PORTAGE_RW in a RAM disk. First, your previous changes to the portage tree are lost when your system crashes and the script cannot shut down properly. The second disadvantage is that - surprise - it costs valuable RAM; the point is that you get practically no speed improvement from this ramdisk here (at least in normal cases), because newly read data from the portage tree will remain in the disk cache in RAM anyway. So, it seems better not to use a ramdisk at all for portage_rw. In contrast, during shutdown, if you do not want to keep a backup of your .sqfs file or are short of disk space, it might be a good idea to build the new .sqfs file in a ramdisk.

It was already mentioned in this thread that keeping backups of older .sqfs files without limiting their number automatically is not necessarily the best idea. Moreover, it seems better to use .tar.bz2 as a backup anyway, for two reasons: First, these files need less space than .sqfs and, more importantly, if you are forced to boot from a kernel without squashfs support (e.g. from some rescue disk), you can still unpack the .tar.bz2 file if necessary. On the other hand, the disadvantage of using .tar.bz2 files as backup is of course the double packing time.

If you build a new kernel, you can simply compile in squashfs and loop support (or build the modules). However, to build in the unionfs support, you need access to the portage tree (unless you made a copy of the ebuild in advance). It was already mentioned that in this case you still can mount the portage tree at least readonly. However, the current script does not mount the tree in this case to the "usual" place readonly.
In the initscript of http://www.mathematik.uni-wuerzburg.de/~vaeth/gentoo/index.html there is a different attempt of a corresponding squashfs_portage with the above fixes/enhancements. Concerning the backups, you may choose in the configuration between keeping .sqfs or .tar.bz2, and you may pass options to mv to define the method and number of backups.
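The read-only fallback mentioned above can be sketched as a simple decision on what the running kernel supports. This is not mv's implementation; the function name and the three mode names are invented for the example, and it assumes the modules have already been modprobed so that they show up in /proc/filesystems.

```shell
#!/bin/bash
# Decide how to mount the tree from the list of filesystems the kernel
# supports. The first argument is normally /proc/filesystems.
choose_mount_mode() {
    local fs_list="${1:-/proc/filesystems}"
    if ! grep -qw squashfs "$fs_list" 2>/dev/null; then
        echo "none"        # cannot even read the image
    elif grep -qw unionfs "$fs_list" 2>/dev/null; then
        echo "union"       # squashfs below, writable branch on top
    else
        echo "readonly"    # mount the image directly on the usual place
    fi
}

# In the "readonly" case the image could be mounted by hand, e.g.:
#   mount -t squashfs -o loop,ro /var/tmp/portage.sqfs /usr/portage
```

With a read-only mount you can still emerge the unionfs package for the new kernel, but not sync.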

Edit: Fix some typos.

Last edited by mv on Fri Oct 20, 2006 10:07 am; edited 1 time in total

I like the idea to realize the task as an init-script very much. However, the current script has a security issue, and several enhancements are possible:
etc.

Interesting comments. I will see what you have done. I also have a small bash script which takes care of making the first image; I wanted to incorporate it into my initscript. Having only one backup, following what has been posted here by somebody else, is probably a good idea. I also realized recently that my overlay directory is getting bigger. It does not make much sense to have a compressed portage tree and a large overlay; that is another issue that I (or you) may take into account.

in your functions, adapted slightly to fit in. I used that when I did not have direct access to the Internet and had to download the tree, then unpack and squash it. It is very possible that the output of

Code:

tar -x

should be piped into

Code:

mksquashfs

to save some time, but I wrote that one in 2 min and left it as it is. I may update my script when I have time. Take it if you want.
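For reference, the unpack-then-squash route can be sketched as below. It is not the function discussed here, and the names are invented. As far as I know the mksquashfs of that time takes a source directory and not a tar stream, so this sketch unpacks into a temporary directory first; it also assumes the snapshot unpacks into a top-level portage directory.

```shell
#!/bin/bash
# Turn a downloaded portage snapshot into a squashfs image.
# Set DRY_RUN=1 to print the commands instead of running them.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then echo "$*"; else "$@"; fi
}

snapshot_to_sqfs() {
    local tarball="$1" image="$2" workdir
    workdir="$(mktemp -d)" || return 1
    # Assumption: the snapshot contains a top-level "portage" directory.
    run tar -xjf "$tarball" -C "$workdir" &&
    run mksquashfs "$workdir/portage" "$image" -noappend
    local status=$?
    # Always remove the unpacked copy, whether or not squashing worked.
    rm -rf "$workdir"
    return $status
}

# Usage: snapshot_to_sqfs portage-latest.tar.bz2 /var/tmp/portage.sqfs
```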

You are right, of course. I had done so in the beginning, but at some point, I decided to do a rewrite from scratch and copied the header just from somewhere else. This bug is repaired now.

Quote:

if you want to make the script more complex, you definitely should have something like [...]

I don't get the point of this function: As I guess from the last command, you mean that you already have mounted some portage-tree, but it is just not up-to-date, and you want to update it from a .tar.bz2 file? Then why don't you just execute

They are difficult to read, but this is the way shell programming requires; there is not much one can do to make it more readable. The quotation marks are all necessary to make sure that file names with spaces also work correctly. The "&&" syntax can hardly be rewritten with "if" if the exit code should be correct. OK, one might split the last line into two and/or change its indentation. Not sure whether this is more readable...
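A tiny illustration of the exit-code point, with made-up function and file names. Both functions run a second test only if the first one succeeds, but they report a failing first step differently.

```shell
#!/bin/bash
# Two ways to run "second step only if the first succeeded".
with_and() {
    # The exit status of the chain is that of the last command run,
    # so a failing first step is reported to the caller.
    test -d "$1" && test -r "$1/some file"
}

with_if() {
    # An "if" without "else" returns 0 when the condition is false,
    # so the failure of the first step is silently lost.
    if test -d "$1"; then
        test -r "$1/some file"
    fi
}
```

Called on a directory that does not exist, with_and returns non-zero while with_if returns 0. The quotes around "$1/some file" are what keeps the name with a space in one piece.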

Since you mentioned overlays in an earlier posting: If you keep your overlays in /usr/portage/local (which is e.g. the default of layman), then you need no special treatment for them... [only the rm-command in my above suggestion should then be modified, of course, although it is not really a mess if you executed it by accident, as long as you didn't call "/etc/init.d/squash_portage restart"].