I am a teacher at a vocational school and, besides that, responsible for maintaining about 140 workstations. To make this possible without going crazy, a former student of mine, Manuel Mommertz, and I developed a framework between 2005 and 2010 to mass-install and maintain the whole installation (well, in fact, my part was mainly requesting features and testing). Since 2010 I have been mostly on my own, maintaining the whole system and slowly increasing my insight into how it works.
Unfortunately the headmaster has now decided to migrate the entire school to Microsoft software within the next two years. So Manuel and I decided to hand the system over to the interested public while I still have a running system at hand to help get it started elsewhere.

How it works

Maintaining 140 workstations as boxes installed from source is not feasible. Instead, we run a chroot environment on one box which builds binary packages, and these then get installed on the workstations. In effect we create a binary distribution, sort of, tailored to our specific needs. For the base installation we boot the bare new box over the network and run a script which automatically does everything you would do by hand when installing gentoo: partitioning the hard drive, creating swap and the filesystems, installing the stage3, fetching a homebrew package with the kernel and installing grub. Finally a package with the update script gets installed, which later does the regular updates on the running system. This script is then run to bring our system into its final state. The whole thing can be configured such that a new box gets installed just by choosing PXE boot as the boot device on startup.

The whole emerge world thing and providing the binary packages in an ftp archive is done automatically. Manual intervention is needed, though, when upstream does some fundamental work on portage, when packages need new USE flags, or things like that.
The creation of our "distribution" is done by a local overlay whose ebuilds pull in the upstream ebuilds as dependencies. All local customization is done by local ebuilds or by modify-scripts for base ebuilds.
I am able to install different kinds of boxes: ordinary workstations for students, workstations for teachers, terminalservers, notebooks and so on. The decision what gets installed is based on the DNS-name of the box in question.
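To illustrate the idea of deriving the machine type from the DNS name, here is a minimal sketch in shell. The prefixes (ts-, nb-, teacher-) and the function name role_for_host are assumptions made for this example, not the naming convention used at the school.

```shell
#!/bin/bash
# Hypothetical mapping from host name to machine type.
# The prefixes below are invented for illustration only.
role_for_host() {
    local short="${1%%.*}"        # strip the domain part, if any
    case "$short" in
        ts-*)      echo terminalserver ;;
        nb-*)      echo notebook ;;
        teacher-*) echo teacher ;;
        *)         echo student ;;
    esac
}
```

A script on the client could then call something like role_for_host "$(hostname)" and select the matching meta ebuild from the result.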

I hope you got the picture so far. The whole thing was not created with publication in mind, so it will not be an easy task to hand it over to someone else. First of all, I will have to remove some things specific to our installation, password hashes for instance. Then there are a lot of leftovers from past experiments which have to be cleaned up, and so forth.
Therefore it will not make sense to put it all in a tarball on an ftp server and forget about it. I am looking for knowledgeable developers who are willing to replicate our environment, clean things up with my help and eventually put the whole thing into one or two ebuilds. There will be quite an amount of documentation to be written too.
The first thing to do will be to discuss how the work can be organized. I have a mailing list in mind and a place to put the software. Later on maybe a cvs repository or the like will be needed. I don't know, I am not used to collaborative development.
Although I have had the whole system running for some years now, there is still quite a lot of work to be done. But I think it would be worth it, as, to my understanding, a system of this kind is the only chance to use gentoo efficiently in larger installations.

As you may have noticed already, English is not my native language. So bear with me if I am occasionally not very responsive. I will need some spare time for proper answers when things get demanding.

I'm sorry to hear they decided to switch over to Windows. It seems the framework you guys built received a lot of love over the years.
I certainly would not mind deployment software tailored for Gentoo. There are a lot of different tools out there (like chef) but they are not quite Gentoo specific. There are already feature requests popping into my mind, but that is a concern for later.
Be sure to edit out all hardcoded passwords, hostnames and the like.
I guess the first thing is to be able to correctly replicate your setup. (Get all "hacks" back to proper ebuilds, patches, conventions etc.)

At first, I just planned to put everything into a tarball, hand it over and help others get it running. Meanwhile I have realised that this would raise quite some security concerns nowadays. Besides that, there is at least the remote boot system, which would be better created from scratch anyway, with me just providing the essential shell scripts needed for installing new boxes. Unfortunately the remote boot environment is the part of which I have the least knowledge.
The script which installs new boxes is an init script. I have a rewrite of this script which can be run by hand from any live system. But you definitely will get the most out of the build system with the remote boot system at hand. One of our usual IT classrooms has 25 client boxes. Reinstalling them by hand is no fun. With the remote boot system, after registering them in DNS and DHCP, it is a matter of ten minutes' work (though the installation itself will run for about an hour or so, but there is nothing interesting to be seen).

The response to my post hasn't been too enthusiastic so far. Maybe I should say something about the prerequisites for replicating the build system. To do that you will need:
- DHCP server: All boxes get identified by their MAC address.
- DNS server: All boxes have individual names following a naming convention, which serve to decide what should be installed and how the boxes should get configured.
- NFS server: Provides the root filesystem of the remote boot system and the place where the stage3 archive can be found.
- TFTP server: Provides the remote boot kernel.
- A box with some space to hold the build system in a chroot environment. An FTP server, which provides the binary packages to the client boxes, will run inside this environment too. I just have an old server with a Pentium 4 processor. A rebuild of the whole system with about 900 packages needs about three or four days to finish. Fortunately I had to do this only once.
Of course, you will need at least one spare client box for testing.

Last edited by AgBr on Thu Dec 12, 2013 3:56 pm; edited 1 time in total

Considering the lack of interest in this topic, it came to my mind that maintaining large installations of Linux desktops may usually be better done with some binary distribution which provides the necessary maintenance tools, tools that do not exist for gentoo. So, regarding gentoo in large installations, I may have some kind of chicken-and-egg problem here, for which I am about to provide the egg. Therefore, regardless of the meager interest in my topic, with your courtesy and patience I will proceed to describe what we have done, just in case someone, some day, is looking for such a solution.

I will start by describing the setup of our remote boot environment. This may have been done somewhere else already, but we did not find anything which fully suited our needs, so we brewed our own. I can keep things short here as most of it is quite straightforward and follows a standard gentoo installation.
Instead of installing gentoo into a fresh partition you unpack a stage3 into a directory which will later get exported via NFS. Chroot into this directory as described in the handbook and follow the handbook's instructions to get a bootable system with the network running. There are just a few minor differences:

The kernel must be built with support for a root filesystem residing on NFS: "CONFIG_ROOT_NFS=y"
Options for tmpfs and devtmpfs need to be set too:
CONFIG_DEVTMPFS=y
CONFIG_DEVTMPFS_MOUNT=y
CONFIG_TMPFS=y
All support for your network hardware must of course be built into the kernel (not as modules), as the kernel needs to configure the network before it gets access to the filesystem.
Build the kernel and modules as usual and install the kernel modules. The kernel gets copied to /tftpboot on your TFTP-server.
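Before copying the kernel, the options listed above can be checked against the kernel's .config. The following is a small sketch; the function name check_netboot_config is an assumption, and it only tests the four options named in this post, not the network driver.

```shell
#!/bin/bash
# Report which of the options named above are not built in (=y) in a kernel .config.
# check_netboot_config is a hypothetical helper name.
check_netboot_config() {
    local config="$1" opt missing=0
    for opt in CONFIG_ROOT_NFS CONFIG_DEVTMPFS CONFIG_DEVTMPFS_MOUNT CONFIG_TMPFS; do
        # anchored match, so CONFIG_TMPFS does not match CONFIG_TMPFS_POSIX_ACL
        if ! grep -q "^${opt}=y\$" "$config"; then
            echo "missing: ${opt}=y"
            missing=1
        fi
    done
    return "$missing"
}
```

Inside the netsys chroot it could be called as check_netboot_config /usr/src/linux/.config; it prints nothing and returns 0 when all four options are set.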

You don't need grub obviously but you need a pxe bootloader. The one I use is pxelinux.0. It is part of syslinux as provided by kernel.org:
https://www.kernel.org/pub/linux/utils/boot/syslinux/
Download the tarball, extract it somewhere and call make. You can do a lot of fancy things with syslinux but we just need bios/core/pxelinux.0 (note, the last character is a zero). Copy the file to /tftpboot too.
Further, in /tftpboot you need a directory(!) pxelinux.cfg and within that directory a file named 'default' with contents like this:
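A minimal sketch of such a file, written by a small shell function. This is not the original listing: the label netsys, the function name write_pxe_default and the example export path are assumptions. DEFAULT, LABEL, KERNEL and APPEND are standard PXELINUX directives, and root=/dev/nfs, nfsroot= and ip=dhcp are standard kernel parameters for an NFS root. Note that ip=dhcp additionally requires kernel IP autoconfiguration (CONFIG_IP_PNP with CONFIG_IP_PNP_DHCP).

```shell
#!/bin/bash
# Write a minimal pxelinux.cfg/default below the given TFTP root.
# Arguments: TFTP root, kernel file name (relative to the TFTP root),
#            NFS root as server:/path. All values are examples.
write_pxe_default() {
    local tftproot="$1" kernel="$2" nfsroot="$3"
    mkdir -p "$tftproot/pxelinux.cfg"
    cat > "$tftproot/pxelinux.cfg/default" <<EOF
DEFAULT netsys
LABEL netsys
  KERNEL $kernel
  APPEND root=/dev/nfs nfsroot=$nfsroot ip=dhcp ro
EOF
}
```

For example: write_pxe_default /tftpboot vmlinuz 192.168.0.1:/export/netsys. If the system is started through a wrapper script rather than /sbin/init directly, an init= parameter pointing at that script would be appended to the APPEND line.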

The root filesystem gets mounted ro. To make sure the system can write to certain parts of the filesystem, we create /tmp in memory and bind all parts of the root filesystem which need to be writable to directories therein. Then we link the desired runlevel target to default, based on the DNS name of the starting machine, and finally start init with the initial kernel command line as parameter.
In order to get this working you will have to set 'wipe_tmp="NO"' within /etc/conf.d/bootmisc. Otherwise /etc/init.d/bootmisc would delete your carefully created virtual system within /tmp.
The 'rescue' target is the usual 'default' runlevel, which just starts a machine without graphical login. The 'terminal' target will start an X display manager. The 'setup' target starts an init script which installs a new machine onto the hard disk. You can easily define other targets to your liking. Just chroot into your netsystem on the NFS server and install the needed software.
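The steps described above (tmpfs on /tmp, bind mounts for writable directories, linking the runlevel) can be sketched roughly as follows. This is not the original wrapper: the list in RW_DIRS, the /tmp/rw prefix and the function name are assumptions, and the sketch assumes /etc/runlevels/default is a symlink rather than a real directory. By default it only prints the commands.

```shell
#!/bin/bash
# Sketch of an init wrapper for a read-only NFS root. Not the original script:
# RW_DIRS, /tmp/rw and the function name are assumptions for illustration.

RW_DIRS="/etc /var /root"   # hypothetical list of directories that must be writable
RUN=${RUN-echo}             # default: only print the commands; set RUN= to execute them

setup_writable_root() {
    local target="$1" dir   # target: runlevel to boot into, e.g. rescue, terminal, setup
    $RUN mount -t tmpfs tmpfs /tmp
    for dir in $RW_DIRS; do
        $RUN mkdir -p "/tmp/rw$dir"
        $RUN cp -a "$dir/." "/tmp/rw$dir/"      # seed the copy from the read-only root
        $RUN mount --bind "/tmp/rw$dir" "$dir"  # make the writable copy visible in place
    done
    # /etc is writable by now, so the link ends up in the in-memory copy
    $RUN ln -sfn "/etc/runlevels/$target" /etc/runlevels/default
}
```

A real wrapper would call setup_writable_root with the target chosen for the host and then finish with exec /sbin/init "$@" so that init receives the original kernel command line arguments.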

You will have to make sure the client machines get their hostname via DHCP and keep their addresses.
In /etc/conf.d/net you should set: 'dhcpcd_eth0=" -n -p "'

In runlevel 'setup' the init script 'install' gets started, which in its first part does all the steps you would do manually when installing a fresh machine from a live system.

Code:

#!/sbin/runscript
# /etc/init.d/install

DRIVE=""
# leave STAGE empty to use the newest available
STAGE="stage3-i686-20130827.tar.bz2"
#STAGE=""
DISTDIR="/home/.newinstall" # nfs-directory exported rw, mounted at /home
WINDOWS_IMAGE="$DISTDIR/windows.img"
# to create and install windows images you will need ntfsclone on netsys

ROOTPW='<put the password hash for the root password of the new machine here>'
FTPHOST="ftp://packages/stable/"
WINDOWS_INSTALL=false
LINUX_INSTALL=true
MAX_SIZE=76 # largest disk size for system partitions in gigabytes
MAX_SIZE=$((MAX_SIZE*1024*1024*2)) # same as above in blocks of 512 bytes
FREE_PART="" # just declaration for util partition

depend() {
    need net netmount
}

getDrive() {

    DRIVES_SCSI=$(ls /sys/bus/scsi/drivers/sd/*/block) # new kernels
    DRIVE=""
    SIZE=0
    # pick the largest drive found
    for cur in $DRIVES_SCSI; do
        cursize=$(cat /sys/block/$cur/size)
        if [ "$cursize" -gt "$SIZE" ]; then
            DRIVE=/dev/$cur
            SIZE=$cursize
        fi
    done
    # if disk is larger than MAX_SIZE only use MAX_SIZE
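
The drive selection with the size cap can be sketched as a standalone function. This is a simplified illustration, not the rest of the listing: it iterates over /sys/block/sd* with a glob instead of parsing ls output, and the function name and its two parameters are assumptions. The size files under /sys/block are given in units of 512 bytes, which matches the converted MAX_SIZE above.

```shell
#!/bin/bash
# Print the largest sd* drive and the number of 512-byte blocks to use on it.
# Arguments (both hypothetical): sysfs block directory, maximum size in blocks (0 = no cap).
largest_disk() {
    local sysfs="${1:-/sys/block}" max="${2:-0}" path size best="" bestsize=0
    for path in "$sysfs"/sd*; do
        [ -r "$path/size" ] || continue      # also skips the unexpanded glob
        size=$(cat "$path/size")
        if [ "$size" -gt "$bestsize" ]; then
            best="${path##*/}"
            bestsize="$size"
        fi
    done
    [ -n "$best" ] || return 1               # no drive found
    # if the disk is larger than the cap, only use the cap
    if [ "$max" -gt 0 ] && [ "$bestsize" -gt "$max" ]; then
        bestsize="$max"
    fi
    echo "/dev/$best $bestsize"
}
```

With the values from the listing it would be called as largest_disk /sys/block "$MAX_SIZE". The first parameter exists mainly so that the function can be tried against a fake directory tree.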

In the second part of the install three special packages get installed:

portage.tar.gz is an archive of our own /usr/portage which contains the information for the profile 'binclient' and local meta ebuilds, which later on pull in all other packages to be installed on this particular machine, based on its DNS name.

gentoo-sources is sys-kernel/gentoo-sources enhanced with the binary kernel (a modify script does this on the build server).

update-client is a package which installs shell scripts to do the regular updates on the machine automatically. As the machine being installed is at this point a fully capable linux environment, configured by its profile as a 'binclient', running the script update-install in the chroot environment will pull in all software needed for this particular machine.

You may notice that grub is still used here. I have tried to convert the script to grub2, but grub2 and I do not make friends easily. I have given up for now.
We stay with python 2.7, as updating to 3 on the build server is a hassle and python 2.7 works.

If no one stops me, I will proceed to present the installation environment. I will show you the update-install script as soon as I find some spare time. But eventually I will have to offer a tarball with the essential parts of the build server, as the environment is too complex to present in full length here. Please pm me and give me an email address if you are interested.

Last edited by AgBr on Tue Dec 10, 2013 11:47 pm; edited 1 time in total

Just a note on installing Windows in a second partition. Maintaining Windows boxes in large numbers without proper tools is a PITA. These tools are expensive, and German schools at least are mostly not lying on a bed of roses regarding their financial situation.
If there were a need to (we, until now, do not use Windows) I could maintain Windows boxes cheaply by image-based updates, just by creating an ebuild which uses ntfsclone to copy a fresh image onto its partition. If you know what you are doing, you may even be able to install all the necessary registry hacks to individualize these clones afterwards. You can copy *.reg files into the partition and have them installed on bootup by some autoexec script which looks for *.reg files and has regedit import them.
If you have the financial resources, you may be better off with some commercial msi-package based tool though.

Last edited by AgBr on Wed Dec 11, 2013 12:14 pm; edited 1 time in total

bbs1-meta/base is what makes the update-client pull in all necessary packages. On the build server there are some other ebuilds like 'notebook' which are not needed on the clients. (Why 'notebook' made its way onto ordinary clients which are not notebooks is one of the mysteries I am currently working on.)

This will be the last post relating to the update client. Two scripts are involved in executing the updates. The first, update-check, checks whether there are updated packages on the server. The second, update-install, pulls and installs them. Update-check is called by update-install.

Code:

rh12-04 ~ # cat /usr/local/bin/update-check
#!/bin/bash

source /etc/portage/make.conf

if [[ ! "$PORTAGE_BINHOST" ]]
then
    echo "You have to set PORTAGE_BINHOST in /etc/portage/make.conf"
    exit 255
fi

Update-install is run by cron every 10 minutes. If we have to alter the configuration of our clients, I make the changes on the build server and run packages-update there; within 10 minutes after packages-update has finished successfully, the updates get installed on the clients.
The variable PORTAGE_BINHOST usually is 'ftp://packages/stable'. All new updates go into unstable. Stable is just a link to unstable. I have one client configured to unstable to test the updates. Larger updates, for instance world updates, then go to the link testing, with some more clients configured to this target. These will run, if needed, for some days or weeks in everyday use. If all goes well I create the link to stable to update all other machines.
Before I make world updates, I copy the old successful unstable to stable.backup to make sure that I have a working copy, just in case my world update fails. In this case I can at least reinstall broken clients. Unfortunately, during such times I am not able to make smaller changes to the configuration alone, as unstable only gets updated if the packages-update run finished successfully.
Up till now there is no way to roll back to an earlier state after a world update. Finding a solution for this would be a large improvement of the system, as changes in the way portage works in particular occasionally break the system, and I need some time to figure out what went wrong.
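
The handling of the links on the ftp server can be sketched with two small functions. Only the names unstable, testing, stable and stable.backup come from the description above; the function names and the layout below a common root directory are assumptions, and this is not the packages-update script itself. It also does not solve the rollback problem on the clients.

```shell
#!/bin/bash
# Hypothetical helpers for the package trees below one ftp root directory.

# Copy the current unstable tree to stable.backup before a world update.
backup_unstable() {
    local root="$1"
    rm -rf -- "${root:?}/stable.backup"
    cp -a -- "$root/unstable" "$root/stable.backup"
}

# Point a channel link (testing or stable) at a package tree in the same directory.
# Assumes the channel is a symlink or does not exist yet, not a real directory.
point_channel() {
    local root="$1" channel="$2" tree="$3"
    ln -sfn -- "$tree" "$root/$channel"
}
```

For example, backup_unstable /home/ftp followed by point_channel /home/ftp stable stable.backup would keep the bulk of the clients on the last known good tree while a world update is being built and tested; point_channel /home/ftp stable unstable would release it afterwards.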

Code:

rh12-04 ~ # cat /usr/local/sbin/update-install
#!/bin/bash

# The developers at one point removed PORTDIR from make.globals. Further down, an 'rm -rf' instruction relies on
# PORTDIR being set. If it is not set, the instruction reduces to 'rm -rf /'. /home is nfs-mounted. Guess how I found out.
: ${PORTDIR:=/usr/portage}
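
As a complement to setting a default, the deletion itself can be guarded so that it refuses to run when a path component is empty. The following is a minimal sketch; safe_clean is a hypothetical helper and not part of update-install.

```shell
#!/bin/bash
# Remove base/sub, but never run when either part is empty.
# safe_clean is a hypothetical helper name.
safe_clean() {
    local base="${1-}" sub="${2-}"
    if [ -z "$base" ] || [ -z "$sub" ]; then
        echo "safe_clean: refusing to run with an empty path component" >&2
        return 1
    fi
    # ${var:?} is a second line of defence: the expansion fails instead of
    # silently yielding an empty string
    rm -rf -- "${base:?}/${sub:?}"
}
```

With an unset PORTDIR, safe_clean "$PORTDIR" packages prints an error and returns 1 instead of expanding to a path directly below the root directory.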

The following post will be dedicated to the build server. Unlike the netsys environment I described further up, I never set it up myself, and I do not have the time to make a test setup now. But I think I am able to describe it exactly enough to reproduce it. My environment carries a legacy of configuration ebuilds, package.use files and modify scripts which I would have to sort out first. So it will be better to present the essentials here and build up the system step by step, nearly from scratch. I will copy and paste the basic scripts needed here and outline how it works.

First you need the build server. It is a chroot environment set up just like the netsys environment described further up. Just mkdir build-server somewhere where you have some disk space, put the stage3 in there, bind the necessary filesystems to the appropriate directories and step in.
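
The bind and chroot steps can be sketched as follows, along the lines of the usual handbook procedure. The function name and the dry-run switch RUN are assumptions; by default the function only prints the commands, and it expects the stage3 to be unpacked already.

```shell
#!/bin/bash
# Sketch: prepare and enter a build-server chroot.
# enter_build_chroot and RUN are hypothetical names.

RUN=${RUN-echo}   # default: only print the commands; set RUN= to execute them as root

enter_build_chroot() {
    local root="$1"                           # directory holding the unpacked stage3
    $RUN cp -L /etc/resolv.conf "$root/etc/"  # name resolution inside the chroot
    $RUN mount -t proc proc "$root/proc"
    $RUN mount --rbind /sys "$root/sys"
    $RUN mount --rbind /dev "$root/dev"
    $RUN chroot "$root" /bin/bash
}
```

Called as enter_build_chroot /srv/build-server (an example path), it lists the commands; with RUN set to the empty string it runs them.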

To my understanding, the stage3 gives you all the development tools you need. app-portage/eix would be handy. There may be some other things, but you will not install them directly, and you do not want anything in your world file. My world file only contains things needed for the development environment or the build server itself, and I am not sure whether even these are really needed there. My world file contains:

sys-apps/less may not be needed here, as it will get installed later anyway as part of our bbs1-meta/base.ebuild.
vsftpd is needed because the ftp server providing the packages for the clients runs inside the chroot environment. Its home is /home/ftp. It gets started from outside the environment by a startup script:

Code:

faramir ~ # cat /etc/conf.d/local.start
# /etc/conf.d/local.start

# This is a good place to load any misc programs
# on startup (use &>/dev/null to hide output)

The build environment consists of three parts which will be explained bit by bit:
- make.conf
- /etc/portage, especially the scripts in portage.modify
- /usr/local/portage with our local ebuilds

The idea is to never touch the client machines manually. All configuration has to be done on the build server. On the client machines, all emerge does is unpack the tar files, as long as these do not try to overwrite files belonging to other packages. So configuration files belonging to regular ebuilds have to be modified before they get packed. All files belonging to openrc immediately come to mind, as these determine most of the behavior of the client.
Other configuration files do not belong to any regular ebuild. /etc/fstab, for example, is created manually during installation. These configuration files have to be created or modified by local ebuilds which live in /usr/local/portage.
My make.conf looks like this:

As far as I can tell there isn't anything fancy in there. Layman is only needed for the overlay nx, as we installed an nx-server on one of our terminal-servers. The overlay is set up in the usual way, so I don't have anything to report here.
My next post will be dedicated to /etc/portage, but I need to clarify some details for myself first before I can present it to you.

A page at wiki.gentoo.org will probably get more interest than the forums.
It's also much easier to collaborate there.

An email to pr AT gentoo.org would be good too. That may lead to something in the Gentoo Monthly Newsletter or even an interview/podcast.
I'm not a member of pr.

The forums are read by a small subset of developers, so more publicity will be better.

Thank you for the hint. It was my impression that the wiki is a place for documentation in a more definite state, and I never thought the wiki would get more traffic than the forums. My initial idea was to find one or more people interested in reproducing the installation to share the files with. I will think about transferring my little 'essay'. But maybe I should finish it here first, just to make it complete. Who is pr, by the way?

The wiki is open to all, provided they register. There are many skilled Gentoo people who are not Gentoo developers.
That does not matter. For your project to live, you only need people with an interest and the skills to contribute.

Regards,

NeddySeagoon


Just in case anyone is wondering, I am still on it. Unfortunately right now my time is occupied by some other things. If meanwhile anyone should be interested in looking into our system on his own, just pm me. I have a tarball with all the essential parts of the system, which just can be unpacked into the chroot environment of the build-server outlined above.