Hello,
before Christmas, QA started discussing changes to the release blocker process for bugs
that are not media-related (i.e. that don't need to be fixed in the generated .iso/.img
files). The conversation is available here:
https://lists.fedoraproject.org/archives/list/test%40lists.fedoraproject....
I took interest in one specific type of such bug: release blockers in one of
the existing stable releases (i.e. Branched-1 or Branched-2). These are usually related to
the upgrade process. We later decided to mark them with the AcceptedPreviousRelease flag in
Bugzilla, so I'll refer to them by that name. This email is concerned with just these bugs,
i.e. just one type of the non-media-related blockers. (I want to split the overall
topic into several separate discussions, so that it's easier to talk about and we
don't muddy the discussion with too many things at once.)
For these AcceptedPreviousRelease blockers, we want to make sure that all of the following
happens before we announce the Branched release:
a) their updates are pushed to the Branched-1 or Branched-2 updates repo
b) the contents of those repos are available to all our users, taking into account all
caches and refresh intervals
Ensuring a) is easy and will be tracked by QA and Bodhi as usual, but for b) we will need
help from releng/infra. Since many users upgrade immediately on release day, we really
need to make sure the updates are available for everybody by that time; otherwise a large
portion of our user base will be affected by those blocker bugs. I have tried to
investigate how best to ensure the latest repo contents are available to everyone, and I
described it here:
https://lists.fedoraproject.org/archives/list/test%40lists.fedoraproject....
As the easiest way to achieve this, I suggested we create a new MirrorManager-related tool
which will strip all old metadata from the repo metalink. Dennis from RelEng agreed he
could run such a tool in these circumstances to make sure only the latest metadata are
distributed. I suppose the tool could work something like this:
1. A blocker update was pushed to f23-updates on 2016-01-07.
2. Releng will run the tool like this:
$ ./mm-strip-old-metadata --release 23 --repo updates --date 2016-01-07
(This assumes there is only one push per day; if not, maybe we can have --timestamp
instead of --date.)
3. The tool will go through all metalinks for f23-updates (all primary architectures), and
drop each <mm0:alternate> section which has <mm0:timestamp> older than the
provided date (let's say midnight UTC).
4. As a result, end-user machines will only connect to repositories which already serve
the blocker update (have that or newer repo tree).
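To make step 3 concrete, here is a rough Python sketch of the filtering I have in mind.
This is only an illustration of the idea, not real MirrorManager code: the metalink layout
below is simplified, and the function name and sample data are my own assumptions.

```python
# Rough sketch of the proposed mm-strip-old-metadata behaviour (step 3).
# NOT real MirrorManager code; the metalink layout is simplified and all
# names and sample data here are assumptions for illustration only.
import xml.etree.ElementTree as ET

MM0 = "http://fedorahosted.org/mirrormanager"

def strip_old_alternates(metalink_xml, cutoff_ts):
    """Drop every <mm0:alternate> whose <mm0:timestamp> is older than
    cutoff_ts (a Unix timestamp, e.g. midnight UTC of the push date)."""
    ET.register_namespace("mm0", MM0)
    ET.register_namespace("", "http://www.metalinker.org/")
    root = ET.fromstring(metalink_xml)
    for alternates in root.iter("{%s}alternates" % MM0):
        for alt in list(alternates.findall("{%s}alternate" % MM0)):
            ts = alt.find("{%s}timestamp" % MM0)
            if ts is not None and int(ts.text) < cutoff_ts:
                alternates.remove(alt)
    return ET.tostring(root, encoding="unicode")

# Simplified metalink with one stale and one fresh alternate repomd.xml:
sample = (
    '<metalink xmlns="http://www.metalinker.org/" '
    'xmlns:mm0="http://fedorahosted.org/mirrormanager">'
    '<files><file name="repomd.xml"><mm0:alternates>'
    '<mm0:alternate><mm0:timestamp>1420588800</mm0:timestamp></mm0:alternate>'
    '<mm0:alternate><mm0:timestamp>1452211200</mm0:timestamp></mm0:alternate>'
    '</mm0:alternates></file></files></metalink>'
)
# Cutoff = 2016-01-07 00:00 UTC; only the fresh alternate survives.
filtered = strip_old_alternates(sample, 1452124800)
```

The real metalinks carry more per-alternate data (hashes, sizes), but the core operation
would be the same: walk the alternates and drop everything older than the cutoff.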
And now finally the important question: does the infrastructure team think this is easily
doable, or is there something not accounted for? Is there someone willing to create such a
tool? In December, I talked to Adrian Reber and I got the impression he's a
MirrorManager developer, so that's the only name I know of, but I don't want to
bother anyone directly. Are there more MM developers? Is anyone willing to help out
with this?
Thanks a lot,
Kamil

> 3. The tool will go through all metalinks for f23-updates (all primary
> architectures), and drop each <mm0:alternate> section which has
> <mm0:timestamp> older than the provided date (let's say midnight UTC).

Please note that there are no metalink files: these files are generated on the fly
from a cache by the mirrorlist servers.
I have a patch for this that I'll submit upstream to add the feature itself, and will
discuss with releng how they want to fire this off.


The new script is now available on mm-backend01: mm2_emergency-expire-repo
I think I have seen that Patrick also created a playbook to let the
script run in the correct environment and configuration.
We have successfully tested the script in the staging environment, and
after it runs, only the newest repomd.xml file is listed in the metalink.
As far as I understand MirrorManager, this script only changes the number
of alternate repomd.xml files in the metalink. The number of mirrors
returned does not change. Depending on the last run of the master mirror
crawler (umdl), the state of the crawler checking the mirrors, and the
mirrors which are running report_mirror, this may lead to a situation where
mirrors are offered to clients which might not yet have the newest
files.
Adrian

In other words, some of the repos included in the metalink might not correspond to the
included repomd.xml hash, right?
Is that a problem? I believe DNF should just skip those repos and find the first one
which matches the provided metadata hash.
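The fallback I'm picturing works roughly like this; a minimal sketch, not DNF's actual
code, with made-up mirror URLs and repomd contents, assuming the metalink carries a
SHA-256 of the wanted repomd.xml:

```python
# Minimal sketch of the client-side fallback described above: try mirrors
# in metalink order and keep the first one whose repomd.xml matches the
# hash from the metalink. NOT DNF's real implementation; all URLs and
# file contents below are made up for illustration.
import hashlib

def pick_first_matching_mirror(mirrors, expected_sha256):
    """mirrors: list of (url, repomd_bytes) pairs in preference order.
    Returns the URL of the first mirror whose repomd.xml matches the
    metalink hash, or None if no mirror matches."""
    for url, repomd_bytes in mirrors:
        if hashlib.sha256(repomd_bytes).hexdigest() == expected_sha256:
            return url
    return None

fresh = b"<repomd>new</repomd>"  # content the metalink hash points at
stale = b"<repomd>old</repomd>"  # a mirror that has not synced yet

wanted = hashlib.sha256(fresh).hexdigest()
mirrors = [
    ("http://mirror-a.example/repomd.xml", stale),  # skipped: mismatch
    ("http://mirror-b.example/repomd.xml", fresh),  # first match wins
]
chosen = pick_first_matching_mirror(mirrors, wanted)
```

If that's how the client behaves, a not-yet-synced mirror in the list costs one wasted
request rather than serving stale updates.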
