RHN Satellite has an action queue with a scheduler that manages tasks,
most often packages we want to administer (install/deploy, remove,
etc.).
What happened here is that the client, when checking this queue with
rhn_check, failed silently: it reported a successful installation to
the RHN Satellite server's task scheduler, which therefore believed
that everything went fine, with no failures whatsoever.
Now, rhn_check is a python script that is very similar to up2date. The
difference is that it checks the scheduler instead of fetching packages;
what they have in common is the same python foundations and therefore
the same rpm code.
rhn_check first checks with the scheduler to see whether we have
pending tasks. Since the admin scheduled an install of apache, it then
downloads the package's header (containing this broken %pre script) and
evaluates it. If the syntax is correct, it performs an rpm transaction
against it and executes each step requested in that header. If
everything goes fine, it installs the package and reports a successful
result to the Satellite; if a failure is encountered, we send an error
code and message instead.
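The flow described above can be sketched roughly as follows. This is only an illustration of the intended behaviour: `process_queue`, `run_action`, and `report` are hypothetical names, not the real rhn_check API, and the result codes 0 and 1 are placeholders.

```python
def process_queue(pending_actions, run_action, report):
    """Run every pending action and report exactly one result per action.

    All names here are hypothetical; they only mirror the steps in the text.
    """
    for action in pending_actions:
        try:
            # Stands in for: header download, syntax check, rpm transaction.
            run_action(action)
        except Exception as exc:
            # Any failure must reach the server instead of being swallowed.
            report(action, 1, "Failed: %s" % exc)
        else:
            report(action, 0, "Success")


def demo():
    results = []

    def run_action(action):
        # Simulate the broken %pre scriptlet for one package only.
        if action == "install apache":
            raise RuntimeError("%pre scriptlet failed")

    def report(action, code, message):
        results.append((action, code, message))

    process_queue(["install apache", "install zsh"], run_action, report)
    return results
```

The point of the sketch is that a failure has to surface as an exception (or an equivalent signal) for the reporting step to see it; the bug described here is that it never did.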
The problem came from the transaction. rpm contains a section called
PSM, which is the one that executes each of the scriptlets and parts of
the header. When trying to install a package, python (rhn_check) calls
its C wrapper and tries to run a batch of transactions with:
    ts.run(), which maps to rpmts_Run() in the wrapper
This function returns a list.
What happened in our case?
When running this scriptlet batch, we did:
    if (!rc) rc = rpmpsmStage(transaction we have, PSM_PRE);
    if (!rc) ....etc.
    rpmpsmStage(transaction, PSM_FINI); /* we clear and end */
So you can see that if one of these fails, we just clear/clean/return.
Within this stage function we fork a child and execute a shell script.
If that fails, or if someone exits with a non-zero error code (BAD
PRACTICE, since that tells us the script failed), we clean up, end, and
fail.
RPM on the command line will fail, with main returning -1.
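The short-circuit behaviour of that stage chain can be modelled in a few lines of Python. This is a sketch and not rpm's code: `run_stages` is a hypothetical helper, and the stage names other than PRE and FINI are only illustrative.

```python
def run_stages(stages):
    """Run (name, callable) stages in order, mirroring the `if (!rc)` chain.

    Once a stage returns non-zero, the remaining stages are skipped,
    but the final cleanup step (FINI) always runs.
    """
    executed = []
    rc = 0
    for name, stage in stages:
        if rc == 0:
            rc = stage()
            executed.append(name)
    executed.append("FINI")  # cleanup happens regardless of rc
    return rc, executed


def demo_failing_pre():
    # The %pre scriptlet exits non-zero, so later stages never run.
    return run_stages([
        ("PRE", lambda: 1),
        ("PROCESS", lambda: 0),
        ("POST", lambda: 0),
    ])
```

With a failing PRE stage only PRE and FINI execute, and the non-zero rc is what the caller has to act on.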
However, the wrapper does something different! It just wants to return
a list, so how did Paul decide to do this? Simple:
    rc = rpmts_Run()
    ps = rpmts_Problems()
    ...
    if (rc < 0) { /* Our case... */
        create_a_list
        return this_empty_list
    }
And then in Python:
    rc = ts.run()
    if rc:
        things happen...
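To see why that test misses the failure: in Python an empty list is falsy, so `if rc:` cannot tell "the transaction failed and returned an empty list" apart from "nothing to report". A minimal sketch, where `naive_check` is a hypothetical stand-in for the original test and `None` as the success value is an assumption:

```python
def naive_check(rc):
    """Mimic the original `if rc:` test.

    Only a non-empty result is treated as a problem; None (assumed
    success value) and [] (the rc < 0 case) both fall through.
    """
    if rc:
        return "problems reported"
    return "treated as success"
```

Calling `naive_check([])` and `naive_check(None)` gives the same answer, which is exactly how the failed transaction ended up reported as a success.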
And here comes our problem! We never checked whether this list was
empty or not during the batch transaction! Therefore I added:
+ elif type(rc) == type([]) and not len(rc):
+     raise up2dateErrors.RpmError(_("Failed running rpm transaction"))
Which means:
If we got a return value *and* it is a list *and* it is empty (our
case), then raise a simple generic exception, since we cannot fetch the
name of the package (unless I write more lines) and anyway we did
exit(1).
Then I added a handler for this exception to the code so it can return:
+ except up2dateErrors.RpmError, e:
+     return (18, "Failed: packages requested raised "\
+             "dependency or transaction problems", {})
This makes the Satellite aware of the error and sends the host to the
right queue within the scheduler (failed hosts).
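Put together, the two additions behave roughly like the sketch below. It is written in current Python syntax rather than the Python 2 of the patch, `RpmError` is a local stand-in for up2dateErrors.RpmError, `check_transaction_result` and `run_install_action` are hypothetical names, and the success tuple is a placeholder. Only the empty-list case from the patch is modelled; non-empty problem lists go through the existing `if rc:` branch and are left out here.

```python
class RpmError(Exception):
    """Local stand-in for up2dateErrors.RpmError."""


def check_transaction_result(rc):
    # An empty list is what the wrapper hands back when rc < 0,
    # so treat it as a failed transaction.
    if isinstance(rc, list) and not rc:
        raise RpmError("Failed running rpm transaction")
    return rc


def run_install_action(run_transaction):
    """Return an (error_code, message, data) tuple for the scheduler."""
    try:
        check_transaction_result(run_transaction())
    except RpmError:
        return (18, "Failed: packages requested raised "
                    "dependency or transaction problems", {})
    return (0, "Success (placeholder)", {})  # assumed success shape
```

For example, `run_install_action(lambda: [])` yields the error code 18 tuple, while a transaction returning `None` falls through to the placeholder success result.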
Let me know if you need more details,
Jose

This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release. Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products. This request is not yet committed for inclusion in an Update
release.

Yes, that's what I would expect it to raise. We are raising the
exception in the up2date actions code and not in rpm itself, and the
exception handler returns:
+ except up2dateErrors.RpmError, e:
+     return (18, "Failed: packages requested raised "\
+             "dependency or transaction problems", {})
which is error code 18.

Most of the proposed patch was already existing code that had been
commented out for some unknown reason. We uncommented it so that we
actually catch the exception rather than ignore it and end up reporting
a success, as was happening previously. There is nothing much to add to
the patch except catching the exception. That's my take on it.
The only decision we were trying to make initially was whether to throw
the exception at the rpm level itself or wait until it enters the
actions code. As per Paul, doing it in rpm cannot happen until the next
major release, so we decided to do it at the actions level. As for the
message, I feel "packages requested raised dependency or transaction
problems" is appropriate (in this case a transaction problem), as this
path is also followed while doing the dep solving, and the error code
needs to be as stated.
I think I don't understand the concern here. Isn't the patch working
for you?

Yep, I see what you mean. I did consider that. As I explained earlier,
we need to throw the same exception in multiple scenarios, so I felt
what we had was appropriate, as it covers any dependency case and any
transaction scenario (which fits our case or any other transaction
issue).

(In reply to comment #33)
> An interesting finding on this issue. up2date-4.4.5-1.i386.rpm does not
> care about obsoletes list as mentioned earlier in this ticket. So , if I
> use above version of up2date and issue the command up2date apache , it just
> goes ahead and installs the package. However , up2date package that is
> attached to this ticket fails with error as apache package get's added to
> obsoletes list due to httpd being in the base channel. So it looks like
> up2date got this additonal capability somewhere between version 4.4.5-1 and
> the current patched version attached to this ticket.
>
> Regards ,
> Shashin.
>
> Internal Status set to 'Waiting on SEG'
> Status set to: Waiting on Tech
>
> This event sent from IssueTracker by jplans
> issue 107834
>
- Could you paste the exact error message it's failing with?
- Also, did you try backing out the patch in this bug and running the
  same command?

(In reply to comment #42)
> Has this bug been fixed in RHEL 3? Is so then which release of up2date contains
> the fix?
The fix for RHEL 3 will be released as part of RHEL 3.9. It is
currently "on_qa".

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.
http://rhn.redhat.com/errata/RHBA-2007-0250.html
