October 29, 2004

I’ve been thinking a good bit about what needs to happen in fedora for FC4. Not just in a technology sense but also in a community/organizational sense as well as infrastructure needed for the future.

A few of those ideas:

Communication between the users/developers and the steering committee/leadership needs to happen more often and more consistently. Right now communication seems to be fairly sparse and, when it does happen, fairly one-way. The remark most often made is that it is the same communication style that existed pre-fedora in the rhl timeframe. To be fair it’s improved in the past few months. Gafton is talking more and that’s great. But the rest of the steering and technical committees need to talk and suggest and post and recommend publicly. These people are on the steering committee and technical committees because they are experts in their fields with years of experience. However, they’re also extremely busy. Red Hat needs to recognize this conflict and either appoint different people to the steering and technical committees or free up more time for the people on the committees now.

Leadership: there needs to be more recognition of the people at red hat and outside of red hat who are leading the way on certain projects. Not only because it is heartening to people volunteering but also because it decreases confusion on the part of the new user or new developer who wants to know who to talk to about working on a particular project.

Update posting and notification infrastructure needs a complete overhaul. We need to make it impossible to post an update without an email going out explaining why the update occurred. We’ve had way too many packages get dropped in updates and updates-testing with no explanation for their presence at all. It needs to stop. (Shameless plug: in FC3 and beyond you can run ‘yum list recent’ and get a list of packages added to any repository in the last week.)

Announcements and notifications about events or things going on in the fedora world need to exist in places other than just the mailing lists. The mailing lists are great but occasionally things get lost in the sheer noise and volume. We need more of a content mgmt framework available to manage the information that should get greater prominence because of its importance and/or significance.

Fedora Extras needs to actually happen. This is obvious but it can’t be overstated. This has to happen and the only way it will happen is if people outside of redhat.com decide to work on packages. That’s all there is to it. Extras cannot be run solely from redhat.com. There is just no way to do it. Extras needs to get going for multiple reasons. First, there is no good way of packages ever leaving Core. This means a package that is slowly dying and decaying over a period of months as it either grows crufty from lack of development or becomes less and less used for various external reasons will always raise a lot of ire when/if it is removed. Second, we need a way of vetting packages and migrating important/useful/active ones into Core. Third, we need a good place to mentor and train new packagers and developers. Without Extras there is no place for these people to go to learn how to best package for compatibility with Fedora Core.

Mirroring. The mirroring infrastructure needs a lot of love. Right now consistency of mirrors is a dicey proposition. Especially in a rapidly-changing environment like rawhide. We need a few different items to deal with this:

Some way of synchronizing the mirrors of rawhide more efficient than rsync. Rsync is great but for data that is changing over ~5GB a day it just won’t work for a hundred + sites to sync to a single master. We need tiering of the mirrors to be configured and rigidly enforced and we need to consider working on new protocols to synchronize mirrors from each other rather than all from a central server or a central set of tiered servers.

We need a good way of verifying mirror content for listing the mirrors as synchronized and for handing out accurate mirrorlists for yum and up2date. This should include both pull mechanisms that crawl the mirrors and verify that the data is accurate and consistent and push mechanisms that let mirror admins submit information about the consistency of their mirror. This is going to take a lot of cooperation from the mirror admins and some coding to make this infrastructure live.
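A pull-style check could be sketched roughly like this. It assumes every mirror carries the same repodata/repomd.xml as the master and that comparing a digest of that one file is a good enough stand-in for "in sync". The function names, the example URLs and the idea of passing in a `fetch` callable are all made up for illustration; none of this is an existing tool.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return a hex SHA-1 digest of the given bytes."""
    return hashlib.sha1(data).hexdigest()


def synced_mirrors(master_repomd: bytes, mirrors, fetch):
    """Return the mirrors whose repomd.xml matches the master's.

    mirrors -- iterable of mirror base URLs (hypothetical)
    fetch   -- callable taking a URL and returning bytes, or raising
               OSError if the mirror cannot be reached
    """
    want = digest(master_repomd)
    good = []
    for base in mirrors:
        url = base.rstrip("/") + "/repodata/repomd.xml"
        try:
            have = digest(fetch(url))
        except OSError:
            continue  # unreachable mirrors are simply left off the list
        if have == want:
            good.append(base)
    return good
```

In real use `fetch` could be something like `lambda url: urllib.request.urlopen(url, timeout=10).read()`. The list that comes back is what you would hand out as the mirrorlist. A push mechanism would need the mirror admins to report the same digest themselves.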

Package Metadata: Right now fedora core 3 ships with 5 different encapsulations of the rpm package metadata:

comps.rpm

hdlist[2]

yum .hdr files

rpmdb-fedora

repodata xml-metadata

I think a lot of concerted work needs to go into reducing this to as few as possible. The yum .hdr files should be able to go away after FC3. So we’re down to four.

Comps.rpm is only used for system-config-packages, therefore in firstboot. We can do better than this. We need to work on merging yum and system-config-packages so the graphical interface of system-config-packages uses the yum modules for finding and resolving dependencies on packages. That could get rid of comps.rpm.

The hdlists are used only by anaconda. If we can get anaconda to be migrated to using the repodata files for installs then the hdlists can go away too. That will be tricky and requires a fair bit of work early on to get anaconda functioning adequately. After all, if the installer doesn’t work, then there’s no point in going on to other things. 🙂

Finally, rpmdb-fedora. This is used by ‘rpm --redhatprovides’, ‘rpm --aid’ and ‘rpm --redhatrequires’ for querying what was in the original installation tree. It’s useful for querying information from the base, stock distro, but with a distribution as rapidly changing as fedora core it might not be very valuable. It might be more useful to write the code to make it possible for rpm (or some command line tool) to easily do arbitrary queries against xml-metadata (repodata) repositories out on the network or even ones on the local disk.

If we can get there then we’d be at a place with a single metadata type in the installation media and on our way to converging a number of tools. That might save a considerable amount of disk space on the CD images and it will definitely remove some redundant code.
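To give a feel for the kind of query against repodata that could stand in for ‘rpm --redhatprovides’, here is a minimal sketch. It assumes the primary.xml has already been downloaded and gunzipped into a string, and it only looks at provides entries and the file entries listed in primary.xml. The function name `what_provides` is hypothetical.

```python
import xml.etree.ElementTree as ET

# XML namespaces used by the repodata primary.xml format
NS = {
    "c": "http://linux.duke.edu/metadata/common",
    "rpm": "http://linux.duke.edu/metadata/rpm",
}


def what_provides(primary_xml: str, capability: str):
    """Return sorted names of packages providing `capability`.

    A capability matches if it is listed as a provides entry or as
    one of the file entries carried in primary.xml.
    """
    root = ET.fromstring(primary_xml)
    found = set()
    for pkg in root.findall("c:package", NS):
        name = pkg.findtext("c:name", namespaces=NS)
        provides = {
            entry.get("name")
            for entry in pkg.findall("c:format/rpm:provides/rpm:entry", NS)
        }
        files = {f.text for f in pkg.findall("c:format/c:file", NS)}
        if capability in provides or capability in files:
            found.add(name)
    return sorted(found)
```

This ignores versions and flags on the provides entries, so it answers "who offers this name at all" rather than "who satisfies this versioned requirement".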

Technical Priority establishment: Right now technical objectives and deliverables are established in what appears to be fairly arbitrary ways. The discussions are closed off from everyone else. I’m not recommending that any and every random user’s opinion be taken into account, but I think the discussion of objectives by the technical steering committee should be made available in a read-only medium so external developers and packagers can get a feel for where fedora _should_ be going so they can adjust their packages and development with this in mind. For example: If I knew that by FC 5 the drive is to have an all-singing, all-dancing Desktop install equivalent of OS X but w/o all the bugs then I might keep that in mind as I developed new features for yum.

Specific Features/Items I’d like to see in the distro or at least under development:

An end-user graphical backup tool. Possibly something using rdiff-backup or similar that would let a user backup their data to another directory or another system w/o having them have to learn either a difficult command line option or an arcane and arbitrary config file format.

More and more items using dbus for notification of events. This might include tying in logging tools to dbus for security alerts and events.

A concerted effort to make the acpi sleep and suspend functions work out-of-the-box on more laptops and desktops.

More hardware compatibility reporting. Reports of failed and successful configurations of systems for Fedora Core installs and use.

So, what am I going to do about any of these? The projects related to yum and the package metadata I will be spending much time and focus on in hopes of achieving some of them by FC4. Time permitting I’d like to look into some of the mirroring issues and work on code to resolve and manage some of them. Over the FC3 test cycle I’ve been helping out with mirrorlist maintenance and I can personally say that maintaining those lists by hand sucks. 🙂

For the other ideas the only thing I can do right now is to push folks inside red hat to move forward during the post-FC3-final, pre-FC4-test1 period. I’m also going to encourage anyone who thinks these items would be cool to work on them and post announcements and progress on the fedora mailing lists. In addition, if anyone is working on any of the projects for fedora, or working on new packages for fedora extras, and you have a blog that you would like to see added to the Fedora People feed, please let me or Colin Charles know about it.

It’s great that so many people at red hat work on fedora. However, that doesn’t make a strong community, that makes a strong company. Fedora needs more external developers and I’ll do whatever I can to help and encourage more external folks to become involved. If anyone has some suggestions, let me know about them.

October 23, 2004

So the day starts off slow but good. I had a nice night’s sleep and woke up well. Then off to look at cars with the girl. Found some things. Then to chapel hill for lunch and some wandering around.

On the way home things were fine. I went to pull into my driveway and I didn’t see the car coming. We collided. Luckily neither of us was going any appreciable speed. The front quarterpanels on the driver side of each car were torn up. But no damage done to people inside and I doubt any serious structural damage to the cars.

Still, probably my fault, I feel like an idiot and my car is all screwed up.

It’ll be fixed before too long. I just need to shake the “I’m a dumbass” feeling.

October 22, 2004

Todo items and zany ideas that I’d love to see people play with in yum 2.1.x. I’ll say that again – I don’t want to see any of this for yum 2.0.X, only for 2.1.X 🙂

These are in no particular order.

– yum deplist packagename:

looks in the metadata for the packagename(s) and returns a list of its requirements. It should list the specific dependency and then give a short list of packages that provide that requirement from the metadata

– yum list obsoletes:

– list obsoletes of packages in the format – it needs to fit on one line (80chars) per obsolete.

package-ver-rel from repo obsoletes otherpackage-ver-rel

– leaf-node list/removal – I’d like an option to list/remove leafnodes. These are packages (especially libs) upon which nothing depends. There is some code in rpmUtils/transaction.py to return lists of them – the bit that needs to be done is figuring out what is a lib and what is an end-application.

– new logging output and possible history saving. Logs are fine, they let you list some changes, but I think it might be useful to save a much more detailed log of what occurred, including error/output messages from each package in the transaction. This could be used to implement a ‘yum history’ command that prints out what happened in a transaction, so you can look back at what you’ve done.

– the above requires that we get some code in place to dup the output file descriptors from the rpm callback and capture their output. This _should_ be doable.

– importkey – I’d like yum to be able to import keys, if anyone wants to go the extra mile and write up code for stripping off extra signatures of the gpg keys so it won’t mess up the rpmdb on import, I’d be your best friend 🙂

– transaction chunking. As Daniel described elsewhere some way of reliably breaking up a large transaction into discrete and self-complete transactions. This would make play back and error recovery easier.

– make the diskspacecheck and other rpm transaction problem filters easier to specify

– xml-rpc interaction for sending yum commands to other systems – I think this would be cool and potentially useful – but I do also see the dangers therein – if someone wanted to work on an xml-rpc server implementation that exposed the major yum commands as methods I think that’d be a cool project and a good place to work on code outside of yum. At the very least – putting together some query/read-only code for collecting data from yum-running machines would be terribly interesting (to me)

– especially for those people managing a bunch of yum-enabled machines but with limited central-server resources.

– similar to the end of the item above – if someone wanted to write an rssparser for yum generate-rss updates that consolidated the lists and made a nicer display of a set of machines and what updates they needed, I think that person would make a lot of friends. 🙂

– download package callback – right now yum downloads packages but it doesn’t give you an idea of how many it has left to download – this should be added – this is about a 10 minute job 🙂

– some thought needs to be put into yum downgrade – the depresolution is all based around the idea of updating out of dependency problems. Some of that will need to be rethought in order to implement downgrade. I’ll be honest this one is not a high priority for me, however I would like to be able to implement two things:

I’m really not interested in a global downgrade path – there lies an enormous amount of pain, I think.

– includeonlypkg – this won’t be hard to implement either – stub code is in place in yum/__init__.py – just needs to delete packages from the packagesack based on what it finds in the config.

– groupinstall should be able to take options about what types of packages to install – optional, default, mandatory – right now it just does default and mandatory but I can see someone wanting to do more/otherwise.

– logging to syslog – need a syslog object and to point filelog there – done

– repackage all erasures should work – I don’t think this one is hard either – it just needs some attention for a few minutes. 🙂

– all the exceptions I’m using need error numbers to make it easier for a gui to use and for other reasons – especially translations.

– output strings in cli.py, callback.py and output.py at level 2 or below need to have the _() function from i18n wrapped around them for translation

– yum queue! – I want this one to be implemented badly and when I’m through a few more things I’m sure I’ll look at it – if anyone gets to it first, go for it – most of it should be simple, parse xml file, look at metadata, run the update/install/erase functions as necessary.

– integrate the yum downloader patches.

– work on a yum-source program for dealing with/installing/etc src-rpms including their BuildRequirements.

– the yum webpage and documentation on the webpage needs an overhaul. I like the look of the page – snaplook is great, but I think the content needs some love.
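For the leaf-node item in the list above, the "nothing depends on it" half is easy enough to sketch. This is not the code in rpmUtils/transaction.py, just a toy version working on plain dicts, and it is written for a current Python, so it would need reworking before it could go anywhere near yum itself. The function name and the data layout are assumptions.

```python
def leaf_packages(requires, provides):
    """Return installed packages that nothing else depends on.

    requires -- dict: package name -> set of capabilities it requires
    provides -- dict: package name -> set of capabilities it provides
    """
    needed = set()
    for pkg, caps in requires.items():
        for cap in caps:
            # a package requiring its own capability does not pin itself
            if cap not in provides.get(pkg, set()):
                needed.add(cap)

    leaves = []
    for pkg, caps in provides.items():
        if not (caps & needed):
            leaves.append(pkg)
    return sorted(leaves)
```

Note that this happily reports end-user applications as leaves too. Telling a lib from an end-application is the part that still needs figuring out, and this sketch does nothing to solve it.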

Whew. I think that’s all I have for right now. A lot of these are programs outside of yum, but they’ll be using the yum modules and/or would be very exciting to see somewhere on the yum webpages.
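To make the transaction chunking idea from the list a little more concrete, here is one possible approach. Treat the packages in a transaction as a graph and split it into connected groups, so that no dependency crosses from one chunk into another. This is only a guess at a workable scheme, not necessarily what Daniel described, and the function name and input format are hypothetical.

```python
def chunk_transaction(deps):
    """Split packages into groups with no dependencies crossing groups.

    deps -- dict: package name -> set of other packages in the same
            transaction that it depends on
    Returns a list of sorted lists, one per independent chunk.
    """
    # build an undirected view of the dependency graph
    neighbours = {pkg: set() for pkg in deps}
    for pkg, wanted in deps.items():
        for other in wanted:
            if other in neighbours:
                neighbours[pkg].add(other)
                neighbours[other].add(pkg)

    seen = set()
    chunks = []
    for start in sorted(neighbours):
        if start in seen:
            continue
        # walk everything reachable from this package
        stack, chunk = [start], []
        seen.add(start)
        while stack:
            pkg = stack.pop()
            chunk.append(pkg)
            for nxt in neighbours[pkg]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        chunks.append(sorted(chunk))
    return chunks
```

Each chunk could then be run as its own rpm transaction. A real implementation would also have to order the packages inside a chunk and deal with one huge chunk that cannot be split any further.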

Now the fun part, If there is anything above that you’ve worked on that you want to patch INTO yum it MUST:

work in python 2.2 – I’m using this stuff on python 2.2 systems

work with rpm 4.2.X

not pull too many crazy requirements.

If you see something you like start working on it, If you’d like to talk about the idea some more to flesh it out – then post it to the yum-devel mailing list or bounce me a message.

October 21, 2004

I decided to write something some folks had asked me for quite a bit. It’s called repomanage.py – you pass it a dir and an option (--old or --new) and it will output either the old packages in that dir, or the newest packages in that dir (respectively)

it’s useful if you want to do things like:

repomanage.py --old /some/place | xargs rm -f

to get rid of old packages in a repository.

or

for file in `repomanage.py --new /some/place`

do

cp -a $file /my/new/distro/dir

done

to create a dir full of only the latest rpms.

It keys on name+arch so you’ll get the latest rpms of every arch and it’s dirt simple and maybe it will be useful to someone.
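The keying on name+arch can be illustrated like this. It is not repomanage.py’s actual code. It works on file names only, ignores epochs, and uses a deliberately simplified version ordering rather than rpm’s real comparison. All the function names are made up for the example.

```python
import re


def split_filename(filename):
    """Split 'name-ver-rel.arch.rpm' into (name, ver, rel, arch)."""
    base = filename[:-4] if filename.endswith(".rpm") else filename
    base, arch = base.rsplit(".", 1)
    name, ver, rel = base.rsplit("-", 2)
    return name, ver, rel, arch


def sort_key(text):
    """Simplified ordering key: digit runs compare as numbers.

    This is NOT rpm's real version comparison; it is only good
    enough to illustrate the idea.
    """
    parts = re.findall(r"\d+|[a-zA-Z]+", text)
    return [(1, int(p), "") if p.isdigit() else (0, 0, p) for p in parts]


def split_old_new(filenames):
    """Return (old, new) lists of file names, keyed on name+arch."""
    groups = {}
    for fn in filenames:
        name, ver, rel, arch = split_filename(fn)
        key = (sort_key(ver), sort_key(rel))
        groups.setdefault((name, arch), []).append((key, fn))

    old, new = [], []
    for entries in groups.values():
        entries.sort()
        new.append(entries[-1][1])          # highest version wins
        old.extend(fn for _, fn in entries[:-1])
    return sorted(old), sorted(new)
```

Because the key is (name, arch), an i386 and an x86_64 build of the same package never compete with each other, so both show up in the "new" list.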

I also added a couple of two line fixes to yum this evening. Made it so mirrorlists can handle # comments and blank lines more gracefully. It didn’t have a problem before, it just didn’t ignore them nicely, like it should have.
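The behaviour is roughly the following. This is a standalone sketch of the idea, not the actual two-line patch, and `parse_mirrorlist` is a made-up name.

```python
def parse_mirrorlist(text):
    """Return mirror URLs, skipping blank lines and # comments."""
    urls = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        urls.append(line)
    return urls
```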

Maybe it’s true, taking a nap when I get home does make me more productive. Or, maybe, it’s b/c it is almost 3am and I’m kinda awake. 🙂

October 21, 2004

For those of you who run repositories of rpms and are supporting the repomd metadata format – please download and install createrepo 0.4.1 on your systems. There was a small bug that could make it appear that a ghosted file entry was not a resolvable requirement, when it really was.

Ghosted file entries matching the regex for primary entries were not being included in the primary.xml file.
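The fixed behaviour can be pictured with a small sketch. The pattern below is an assumption about what counts as a "primary" file entry (anything in a bin directory, anything under /etc, and /usr/lib/sendmail). It is not copied from createrepo, and the function name is hypothetical.

```python
import re

# Assumed pattern for "primary" file entries; not taken from createrepo.
PRIMARY_FILE = re.compile(r"^(.*bin/.*|/etc/.*|/usr/lib/sendmail)$")


def primary_files(files):
    """Pick the file entries that belong in primary.xml.

    files -- list of (path, is_ghost) tuples for one package
    Ghost entries are kept on purpose: another package may list one
    of them as a file requirement.
    """
    return [(path, is_ghost) for path, is_ghost in files
            if PRIMARY_FILE.match(path)]
```

The point is simply that the ghost flag plays no part in the decision. A ghosted /etc file is matched by the pattern and gets listed like any other.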