Provide deltaisos for development releases

Description

Background:

Fedora 13 is currently in development, which means that Test Composes, Release Candidates and final versions of the Alpha, Beta and Final milestones are released regularly. During this period there may even be several releases in one week. Each release consists of one 3.5GB DVD, six 700MB CDs, one 200MB netinst image and one 1GB LiveCD, each built for the i386 and x86_64 architectures.

This is quite a large amount of data. Anyone interested in occasional or regular testing must download a substantial part of each release. That places demands on the user's internet connection and on RelEng infrastructure (internet bandwidth, I/O).

Proposal:

The Release Engineering team will create a deltaiso file for every release in the current Fedora development series. Each deltaiso will contain the differences between that release and the previous one, so deltaisos will be created in this fashion:

Fedora 13 Alpha TC1 -> Fedora 13 Alpha TC2

Fedora 13 Alpha TC2 -> Fedora 13 Alpha RC1

Fedora 13 Alpha RC1 -> Fedora 13 Alpha

Fedora 13 Alpha -> Fedora 13 Beta TC1

Fedora 13 Beta TC1 -> Fedora 13 Beta RC1

Fedora 13 Beta RC1 -> Fedora 13 Beta RC2

Fedora 13 Beta RC2 -> Fedora 13 Beta

Fedora 13 Beta -> Fedora 13 (Final) TC1

Fedora 13 (Final) TC1 -> Fedora 13 (Final) RC1

Fedora 13 (Final) RC1 -> Fedora 13 (Final) RC2

Fedora 13 (Final) RC2 -> Fedora 13 (Final)
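
The chain above is simply each release paired with its immediate predecessor. A minimal sketch of that pairing, assuming the release order is kept as a plain list (the helper name delta_pairs is hypothetical, not part of any existing RelEng tooling):

```python
# Release order copied from the list above.
RELEASES = [
    "Fedora 13 Alpha TC1",
    "Fedora 13 Alpha TC2",
    "Fedora 13 Alpha RC1",
    "Fedora 13 Alpha",
    "Fedora 13 Beta TC1",
    "Fedora 13 Beta RC1",
    "Fedora 13 Beta RC2",
    "Fedora 13 Beta",
    "Fedora 13 (Final) TC1",
    "Fedora 13 (Final) RC1",
    "Fedora 13 (Final) RC2",
    "Fedora 13 (Final)",
]


def delta_pairs(releases):
    """Pair every release with the one that follows it (hypothetical helper)."""
    return list(zip(releases, releases[1:]))


if __name__ == "__main__":
    for old, new in delta_pairs(RELEASES):
        print(f"{old} -> {new}")
```

Twelve releases give eleven deltaisos per medium and architecture, which is the workload being proposed for each development cycle.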

Because deltaisos are typically useful for media that consist mainly of RPM files, this process would mainly involve the DVD image and would not involve the LiveCD. For other media (CD, netinst) the decision is still to be made. Deltaisos for DVDs are typically around 100MB.

Deltaisos will be stored in a single directory (e.g. deltaisos/ in /pub/alt/stage), so it is easy to convert any older ISO the user currently has into a new one, even several releases forward. They will be kept for the duration of the development cycle, beginning with Fxx Alpha TC and ending with Fxx Final (plus a short grace period after the final release so that people can upgrade their ISOs to the final release).

The deltaiso creation process can easily be automated and is described on our wiki. No additional human interaction should be needed.
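
Converting an older ISO several releases forward amounts to applying each intermediate deltaiso in turn. A rough sketch of working out that chain, assuming the deltarpm tool applydeltaiso (which takes oldiso, deltaiso and newiso, in that order); the file naming callbacks and the short release names are made up for illustration:

```python
def upgrade_path(releases, have, want):
    """List the (old, new) steps needed to get from `have` to `want`."""
    start, end = releases.index(have), releases.index(want)
    if start >= end:
        raise ValueError("target must be newer than the ISO already held")
    return list(zip(releases[start:end], releases[start + 1:end + 1]))


def apply_commands(path, iso_name, delta_name):
    """Build one applydeltaiso command line per step.

    iso_name and delta_name are caller-supplied naming functions; the
    real file names on the server are not specified in this ticket.
    """
    return [
        ["applydeltaiso", iso_name(old), delta_name(old, new), iso_name(new)]
        for old, new in path
    ]


if __name__ == "__main__":
    releases = ["Alpha-TC1", "Alpha-TC2", "Alpha-RC1", "Alpha"]
    steps = upgrade_path(releases, "Alpha-TC1", "Alpha-RC1")
    commands = apply_commands(
        steps,
        iso_name=lambda r: f"F13-{r}-DVD.iso",               # hypothetical name
        delta_name=lambda o, n: f"F13-{o}_{n}-DVD.diso",     # hypothetical name
    )
    for argv in commands:
        print(" ".join(argv))
```

Each step only needs the previous ISO and one small deltaiso, which is why keeping all deltaisos of a cycle in one directory is enough to move forward from any older image.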

Rationale:

The main reason for this proposal is to enable people with slower internet connections to participate in installation testing. This affects not only developing countries, but also highly developed countries like the USA. I believe more people would be interested in doing installation testing, but it is currently too expensive for them in terms of download time and bandwidth.

Interestingly, this also includes me as a Red Hat employee. While it is my job to perform installation testing regularly, our office's internet connection is not fast enough for me to download DVD-sized images several times a week. Regular nightly mirroring is not an option when new releases may be pushed even daily.

Until now Andre Robatino (CCed) has been performing the repetitive task of downloading every development release for every architecture, creating deltaisos and publishing them as torrents (because he doesn't have any publicly available storage). Our evidence shows that these torrents are used, which means they are useful to people (and his work is invaluable to me personally). QA has started to make official use of his work on every installation testing page.

The problem is that Andre is usually the only seeder, so downloads are really slow and connections sometimes fail, and people often complain about it. The second problem is that those deltaisos are published with a noticeable delay (Andre has to detect the new release, download, build, upload and announce). Overall this is a huge waste of Andre's time and energy. A defined and automated process on the RelEng side would simplify all of this and greatly improve the experience for our users.

Another benefit would be a lower I/O and bandwidth load on Fedora infrastructure.

I've only been creating deltaisos for the DVD images, since the current software for creating and using them is only capable of updating a single ISO at a time (so there's no way of feeding in an entire CD set as input). They could be generated for individual CDs (for example, updating CD1 only) but there might be a slight loss of efficiency when RPMs are located on different CDs in the old and new sets. I was considering generating them for CDs anyway, but then found that rsync/zsync can efficiently convert in both directions between different disc sets for a given version (for example between a DVD and the corresponding CD set). This would both eliminate the need to post deltaisos for CDs, and neatly sidestep the deltaiso software's trouble with multiple-disc sets.

Unfortunately, rsync is a resource hog on the server, so it's not currently used, and zsync is not available in Fedora yet - see

However, I think for the long term it would be better to hold out for zsync's eventual availability, and not start training people to use deltaisos for CD sets when a combination of single-disc deltaisos and zsync would be better in the long run.

Also, to the above list of deltaisos, I would add

Fedora 12 -> Fedora 13 Alpha TC1

The first jump is the biggest (this deltaiso would typically be over a gigabyte) but is still much smaller than the full TC1 ISO.

I'd much rather we invest time in zsync, as opposed to adding yet another task which can go wrong to the compose process, which is already riddled with too many by-hand processes.

If the deltaisos are limited to DVDs only, it would mean creating just two new disos each release. And if something did go wrong with their creation, people could just download the full ISO as before - the disos are small enough that people wouldn't have wasted a significant amount of time downloading them.

It's not even part of the compose, so it doesn't increase the chance of failure. It's done afterwards, to make the already created ISOs easier to obtain. It should take only a minute or two to enter this command (using autocomplete for the oldiso and newiso names, and inventing deltaiso names) and hit enter. It will probably max out the CPU for 15 minutes or so but it's not necessary to stick around for completion - as I said above, people can just download the full ISO if something goes wrong.
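
The command itself is not quoted in this thread. As a minimal sketch, assuming it is deltarpm's makedeltaiso (which takes oldiso, newiso and deltaiso, in that order), the post-compose step could be wrapped like this. The output directory and the deltaiso naming scheme are invented for illustration:

```python
import subprocess
from pathlib import Path


def makedeltaiso_argv(old_iso, new_iso, out_dir="deltaisos"):
    """Build the makedeltaiso command line for one old/new ISO pair."""
    # Hypothetical naming scheme: <old stem>_<new stem>.diso
    delta = Path(out_dir) / f"{Path(old_iso).stem}_{Path(new_iso).stem}.diso"
    return ["makedeltaiso", str(old_iso), str(new_iso), str(delta)]


def build_deltaiso(old_iso, new_iso, out_dir="deltaisos"):
    """Run makedeltaiso; raises CalledProcessError if the tool fails."""
    argv = makedeltaiso_argv(old_iso, new_iso, out_dir)
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(argv, check=True)
    return argv[-1]
```

Because this runs after the ISOs already exist, a failure here leaves the compose untouched, and testers can fall back to the full ISO.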

It requires that both ISO sets be accessible at compose time, or on a compose host, and that isn't always the case, particularly without putting a lot of I/O stress on our production mirrors. It also requires shuffling those deltaisos around, checksumming them, signing that checksum, and so on. It is not the simple task you make it out to be.

As long as the old ISO set isn't deleted immediately when the new set is created, both will be available. And there's no need to checksum the deltaisos, since only the newiso needs to be verified, and the checksum file already exists for that (and even that's not usually signed). The checksums I've been providing are only due to the fact that I can't be sure how reliable the file hosting services I've been using are.
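
To illustrate why only the new ISO needs verifying: whatever happened to the deltaiso in transit, the rebuilt image either matches the published checksum or it doesn't. A small sketch, assuming the existing checksum file carries SHA-256 digests (the function names are hypothetical):

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so a DVD-sized ISO never sits in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def iso_matches(path, expected_hex):
    """Compare the rebuilt ISO against the digest from the checksum file."""
    return sha256_of(path) == expected_hex.strip().lower()
```

If the comparison fails, the tester downloads the full ISO as before.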

We don't have the appropriate tools on serverbeach1, nor even shell access there. The mirror for that, alt.fedoraproject.org also doesn't have the tools, nor will it. It is not a shell server to run arbitrary commands on. It would have to be done on the compose host and synced to alt., then mirrored to serverbeach1. And in fact, I had to purge RC3 from alt. in order to make room for RC4.

I understand there are some difficulties, Jesse. There always are. But a little effort from our RelEng team would save much more effort for dozens or even hundreds of our users. Once this task is automated, it shouldn't make RelEng's life any harder, and it would still benefit the users. And that's what we aim for with the Fedora project, right?

I would be very glad if someone from RelEng could create some deltaisos for future composes, just for testing purposes. Maybe we will find more obstacles, maybe everything will work perfectly.

Just a side note: in my experience deltaisos are much more efficient for Fedora images than zsync, and they would save much more I/O and bandwidth. That doesn't mean we should write off zsync, but it does mean we should really try to take advantage of deltaisos.

It doesn't pull in any other DAG dependencies, so you can just import DAG's key, download the one zsync RPM, and do a yum localinstall. For people who have a fast enough connection that they disable yum-presto now (the ones who can download full RPMs faster than downloading drpms and rebuilding them), it would be faster than disos. (I'm not one of those people, but it's still better than nothing, and I could continue to provide disos with a greatly reduced delay, except for Alpha TC1, which will still be almost a full download since almost all packages will have changed.) It looks like it could be years before zsync gets into Fedora, and I don't see any reason why Fedora testers shouldn't use non-Fedora tools if they're useful.

I'm opposed to using third party repositories for pieces of Fedora infrastructure.

How about having the .zsync files posted somewhere else, then? (The files don't have to be on the same server, though it would be nicer.) If necessary, I'll do it myself, although there would be a delay of about 12 hours (the usual 6 hours for the DVD downloads, another 6 for the CDs) and it just seems silly. There's a catch-22 here in that it's unlikely zsync will ever get into Fedora unless there's more interest, and there won't be more interest until people actually get a chance to use it.

I don't think that using a tool (on the server and/or by users) that's not officially in Fedora is kosher either. Deltaisos built by the rel-eng team seem to me like the only option for now. That's the topic of this ticket, so let's keep it that way (we can open another ticket for the zsync discussion).

Jesse, could you comment on the current obstacles blocking rel-eng from providing deltaisos? Is there a timeline? Can we help somehow?

I'm willing to continue doing this, but it would be much better, both for me and for users, if there was a "real" direct download server I could upload the files to, that allows pausing and resuming both uploads (for me) and downloads (for users). Having rsync available would be great as well, in order to be able to fix broken uploads/downloads (important for low bandwidth users for whom it's hard enough to download the first time).

If there were a few dozen GiB of space available, I could kill two birds with one stone by providing an archive of all the TCs/RCs (see https://fedorahosted.org/fedora-infrastructure/ticket/2241 ). The space required for each version of Fedora should be less than 10 GiB. I could also post deltas between Alphas, Betas, and Finals for people only interested in those (which really should be on the mirrors, but this would be a start).

It appears that at present, F15/Rawhide consists of a mix of RPMs built using the old and new xz compression. (For example, F14's BackupPC-3.1.0-16.fc14.x86_64.rpm and Rawhide's BackupPC-3.1.0-16.fc15.x86_64.rpm were both built on August 2 using the old xz compression.) Unless this changes, it means that it will not be possible to produce a functional diso from F14 to F15 Alpha TC1 (since rebuilding will fail using either old or new xz). The deltas from F15 Alpha TC1 forward should still work, using the new xz. (Someone using F14 or below would have to temporarily update their xz-* packages to the F15/Rawhide version in order to use these, which is easy enough - the transaction doesn't pull in any other packages.)

For development, this is not too big a deal, since not that many people seem interested in Alpha TC1 anyway. But it also means that deltas from F14 to F15 Final, for example, would be nonfunctional. This is unfortunate since disos from (N-1) Final to N Final can save about half the download (F12->F13 was about 43% of full size and F13->F14 was about 51%) and the push to shave a few percent more off the size of full ISOs by tweaking the compression (which I support, BTW) is causing the loss of an opportunity to save a much larger amount of bandwidth by using delta compression. Since RPM signs the compressed data, this means that the only way to keep disos working when the compression is changed would be to rebuild ALL changed packages using the new compression.

I was wondering if it would be feasible to provide enough disk space in my http://robatino.fedorapeople.org account for this. The standard quota is 150 MiB, which is much too small. Currently I have a free ADrive.com account which provides 50 GB, and everything now available at https://fedoraproject.org/wiki/User:Robatino/Downloads is using about 1/3 of that. Using the fedorapeople account would allow resuming/repairing both uploads and downloads, so I wouldn't have to split the large disos into small chunks for safety, making it much more user-friendly. And having rsync means not having to baby-sit each upload and then download again to verify, which would save me a lot of time.