Description of problem:
Running "dnf upgrade" on a fresh FC 24 install also upgrades systemd-udev (updated version 229-13.fc24, x86_64). Upgrading systemd-udev restarts the running X session (the user has to log in again) and breaks the dnf transaction: the install is left in an invalid state, as many packages are marked as installed in both the old and new versions (as found by running "dnf repoquery --duplicated"), but in fact the new version is not installed (e.g. kernel).
Steps to Reproduce:
1. install FC 24 (incl. setting up wifi networking)
2. "dnf upgrade"
3. packages not updated, but marked as duplicated
Actual results:
Install broken: packages are marked as installed (and duplicated), but were not actually updated.
Expected results:
"dnf update" updates packages (ossibly without restarting the X session, or with restart, but with all packages updated)
Additional info:

Sorry, the logs are gone - I reinstalled the system.
However, after reinstall, I updated systemd-udev alone (before updating all the rest), and issues similar to those described in #1367766 are now present in the newly installed system. So if some log related to this scenario would be helpful, let me know which one to attach.

There's some more info in 1341327, which we're leaving open for the X end of this problem.
Basically, restarting systemd-udev-trigger.service causes (we think) systemd-logind to pull the graphics adapter out from under X and immediately give it back again. This service is (currently) restarted in %postun of systemd-udev . From reports we've received so far, it seems that on systems with hybrid graphics, this causes X to crash. On systems with dedicated graphics, it doesn't.
This bug is for the systemd end of the problem: the spurious graphics adapter 'replug' probably just shouldn't happen at all. The other bug is for making X not crash if it *does* happen, if X folks want to do that.
I'm proposing this bug as a Beta freeze exception. Since the restart is in %postun , if we ship F25 Beta with the current systemd package, then the first update to systemd-udev will trigger this bug - even if it's updating to a systemd-udev which takes the restart out of %postun. To ensure F25 Beta users don't encounter this bug, we have to include a systemd-udev build with the systemd-udev-trigger restart taken out of %postun in the frozen images.
http://koji.fedoraproject.org/koji/buildinfo?buildID=807101 is the build that should fix this; I'll submit an update once it's complete.

Nah, "udevadm trigger" is an operation that should always be safe. Software (be it apps or drivers) that cannot deal with such a replug is broken, and needs to be fixed.
We have been retriggering udevadm either fully or only specific subsystems since about always. If X11 is broken now with that it needs to be fixed really.
I don't see anything to change in systemd here. Sorry.

Lennart: the thing zbyszek thinks may be wrong is not the udevadm trigger operation itself, but the fact that it results in this 'hardware replug' happening. He says he's not sure that's actually intended or wanted.

Created attachment 1207377
journalctl -f logs from udevadm trigger --type=devices --action=add
The issue is not caused by udevadm trigger --type=devices --action=add directly, but through systemd-logind. If systemd-logind is SIGSTOPed, nothing happens. If systemd-logind is running normally, there is a bunch of remove/add events logged by Xorg (see attachment). Unfortunately, systemd-logind doesn't log anything, even at debug level. I think it should at least log when it adds/removes devices.
I also don't think it should remove the devices from clients, even temporarily. I would expect this to cause glitches at least.
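The experiment described above could be scripted roughly as follows. This is a sketch, not the commands that were actually run: the function names and the DRY_RUN switch are hypothetical, and it assumes pidof is available. By default it only prints the steps. With DRY_RUN=0 it needs root and freezes logind for a moment, so it is better run from a VT or over ssh, with "journalctl -f" open in another terminal.

```shell
# Print (default) or run the steps: stop logind, retrigger udev, resume logind.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: echo the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

trigger_with_logind_stopped() {
    if [ "$DRY_RUN" = 1 ]; then
        pid='<logind-pid>'                      # placeholder, nothing is looked up
    else
        pid=$(pidof systemd-logind) || return 1
    fi
    run kill -STOP "$pid"                               # freeze systemd-logind
    run udevadm trigger --type=devices --action=add     # same call as in the attachment
    run udevadm settle                                  # wait for udev to finish the events
    run kill -CONT "$pid"                               # resume logind (it may then handle queued events)
}

trigger_with_logind_stopped
```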

For the record, I checked release day F23 and F24 lives in a VM, and restarting systemd-udev-trigger.service appears to trigger the hardware 'replug' in both; I see:
Oct 04 19:13:40 localhost /usr/libexec/gdm-x-session[1542]: (II) config/udev: removing GPU device /sys/devices/pci0000:00/0000:00:02.0/drm/card0 /dev/dri/card0
Oct 04 19:13:40 localhost /usr/libexec/gdm-x-session[1542]: xf86: remove device 0 /sys/devices/pci0000:00/0000:00:02.0/drm/card0
Oct 04 19:13:40 localhost /usr/libexec/gdm-x-session[1542]: failed to find screen to remove
even on F23. However, it seems like there was a change between F23 and F24: the introduction of the systemd-udev subpackage, which did not exist in F23. This commit created it:
http://pkgs.fedoraproject.org/cgit/rpms/systemd.git/commit/?id=c16b573717a4fc657d8bac8e12f734f574b8ec42
and added the postun scriptlet:
+%postun udev
+%systemd_postun_with_restart systemd-udev-{settle,trigger}.service systemd-udevd-{control,kernel}.socket systemd-udevd.service
at least just looking at that commit diff, this wasn't simply moved from somewhere else - we actually weren't doing that before, though the systemd-udev-trigger service did exist. So I think that's why this showed up in F24.

I did have a thought about how we could further mitigate this to prevent people updating a fresh Fedora 24 install from encountering it.
Basically, have systemd-udev do something like this (pseudocode):
%pre
%if (current systemd package is older than systemd-229-16.fc24)
systemctl mask systemd-udev-trigger.service
%endif
%posttrans
systemctl unmask systemd-udev-trigger.service
the %pre will run before the old systemd-udev package's %postun and effectively negate its restart of the service, I think, then the %posttrans would restore it to normal.
We might need a few more hedges - perhaps only do this on update(?), and definitely check if systemd-udev-trigger.service was *already* masked and don't unmask it in %posttrans in that case (systemctl is-enabled can tell us if it's already masked) - but what do people think of the general idea? Too hacky?
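A rough shell sketch of the scriptlet bodies proposed above, including the two extra hedges (only act on upgrade, and leave the unit alone if it was already masked). The function names, the state-file path and the SYSTEMCTL override are hypothetical, and "sort -V" only approximates RPM's version comparison. In a spec file the two functions would be the bodies of %pre and %posttrans.

```shell
# Hypothetical helpers; names and the state-file path are illustrative only.
SYSTEMCTL=${SYSTEMCTL:-systemctl}
UNIT=systemd-udev-trigger.service
STATE_FILE=${STATE_FILE:-/run/systemd-udev-trigger.masked-by-update}

# True if version-release $1 sorts strictly before $2.
# sort -V approximates RPM's comparison; fine for "229-13" vs "229-16".
version_lt() {
    [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Body for %pre. $1 = scriptlet argument (2 on upgrade),
# $2 = version-release of the currently installed package.
pre_mask() {
    [ "$1" -ge 2 ] || return 0                   # fresh install: nothing to suppress
    version_lt "$2" "229-16.fc24" || return 0    # old package no longer restarts the unit
    # Already masked (e.g. by the admin): do not touch it and record nothing.
    [ "$($SYSTEMCTL is-enabled "$UNIT" 2>/dev/null)" = "masked" ] && return 0
    $SYSTEMCTL mask "$UNIT" >/dev/null 2>&1 && : > "$STATE_FILE"
    return 0
}

# Body for %posttrans: unmask only if pre_mask did the masking.
posttrans_unmask() {
    [ -e "$STATE_FILE" ] || return 0
    $SYSTEMCTL unmask "$UNIT" >/dev/null 2>&1
    rm -f "$STATE_FILE"
}
```

The state file is what keeps %posttrans from unmasking a unit that somebody had masked on purpose before the update.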

That's way too hacky and error prone. Instead, we could add a drop-in with '[Unit] RefuseManualStop=true' to the service and do 'systemctl daemon-reload' in %post. This is enough to prevent the subsequent 'systemctl try-restart systemd-udev-trigger.service' from doing anything.
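For illustration, the drop-in approach could look roughly like this. The function name, the drop-in file name and the optional root argument (handy for trying it against a scratch directory) are assumptions, as is the choice of /run so that the override disappears on reboot. Whether RefuseManualStop=true really turns the later try-restart into a no-op is the claim made in this comment, not something the sketch verifies.

```shell
# Write a drop-in that sets RefuseManualStop=true for systemd-udev-trigger.service.
# $1 is an optional root prefix (hypothetical, for testing); empty means the real "/".
install_refuse_stop_dropin() {
    root=${1:-}
    dir="$root/run/systemd/system/systemd-udev-trigger.service.d"
    mkdir -p "$dir" || return 1
    # File name is illustrative; systemd reads any *.conf in the .d directory.
    printf '[Unit]\nRefuseManualStop=true\n' > "$dir/10-refuse-manual-stop.conf"
}

# In %post on a real system this would be followed by a reload:
#   install_refuse_stop_dropin && systemctl daemon-reload
```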

No, it's already fixed in F24, just not the extra-fix that prevents it happening on the first update yet, but Zbigniew is still planning to do that, I think. Bugzilla / Bodhi integration has limitations in dealing with bugs that affect multiple releases.

Ray: it happened when you did the update because of the details of how the bug is triggered.
The bug is triggered by a command in systemd-udev's `%postun` script. When you do a package update from, say, foo-1.0 to foo-2.0, the `%postun` script from foo-1.0 is run as part of that transaction.
So here's how it went down: the *existing* systemd-udev package on your system had a `%postun` script that would trigger the bug. Up until systemd-229-16.fc24, all F24 systemd packages had that script.
We released systemd-229-16.fc24 as an update which *removed* that script. However, because it's the %postun of the *old* package that is run on update - not the %postun from the *new* package - when you install systemd-udev-229-16.fc24 , the bug will happen one last time, because the old systemd-udev package still has the bad %postun.
What the update ensures is that any time you update the package *after* the update to 229-16, you won't hit the bug.
We've since come up with a trick which allows the new package to suppress the old package's %postun, so that the bug will no longer happen when you first update to the 'fixed' package. That trick hasn't been built for F24 yet, though I hope Zbigniew will build it. Still, now that you've got 229-16 installed, you should be safe from this bug in future in any case.
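The ordering described above can be made concrete with a toy model. The function below only prints the scriptlet order RPM uses for a package upgrade, as documented in the Fedora packaging guidelines. It runs no scriptlets, and the function name is made up for this example.

```shell
# Print the order of scriptlets when package $1 (old) is upgraded to $2 (new).
upgrade_scriptlet_order() {
    old=$1
    new=$2
    printf '%s\n' \
        "%pretrans   from $new" \
        "%pre        from $new" \
        "(files of $new installed)" \
        "%post       from $new" \
        "%preun      from $old" \
        "(files only in $old removed)" \
        "%postun     from $old" \
        "%posttrans  from $new"
}

upgrade_scriptlet_order systemd-udev-229-15.fc24 systemd-udev-229-16.fc24
```

The %postun line is the one that matters here: it comes from 229-15, so the restart it contains still runs even though 229-16 dropped it. The new package's %pre runs before it, which is what gives the new package a chance to interfere.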

Just for the record: I had the same problem upgrading from F23->F24 - X was terminated. Now it happened again with the upgrade from F24->F25, and the systemd-udev package had been updated well before the F24->F25 upgrade, on Oct 04 2016:
/var/log/dnf.rpm.log:Nov 03 14:26:34 INFO Upgraded: systemd-udev-231-10.fc25.x86_64
/var/log/dnf.rpm.log-20161009:Oct 04 18:20:24 INFO Upgraded: systemd-udev-229-15.fc24.x86_64
/var/log/dnf.rpm.log-20161009:Oct 04 18:20:30 INFO Cleanup: systemd-udev-229-13.fc24.x86_64
/var/log/dnf.rpm.log-20161009:Oct 07 10:51:11 INFO Upgraded: systemd-udev-229-16.fc24.x86_64
/var/log/dnf.rpm.log-20161009:Oct 07 10:51:20 INFO Cleanup: systemd-udev-229-15.fc24.x86_64
It's not a big deal, but with dnf it is a bit difficult to clean up the mess after the crash. I use this:
dnf remove $(dnf repoquery --duplicated --latest-limit -1 -q)
which complains about removing systemd* and dnf packages, so I need to remove those manually with rpm. This time the new problem was the missing /usr/lib/locale/locale-archive, which leads to:
-bash: warning: setlocale: LC_CTYPE: cannot change locale (en_US.UTF-8): No such file or directory
-bash: warning: setlocale: LC_COLLATE: cannot change locale (en_US.UTF-8): No such file or directory
-bash: warning: setlocale: LC_MESSAGES: cannot change locale (en_US.UTF-8): No such file or directory
-bash: warning: setlocale: LC_NUMERIC: cannot change locale (en_US.UTF-8): No such file or directory
-bash: warning: setlocale: LC_TIME: cannot change locale (en_US.UTF-8): No such file or directory
Executing build-locale-archive fixed it.
I have one more system on which I can test the upgrade, if there is a fix for F24's systemd-udev package. I did not have a problem upgrading systemd* within F23 or F24, only when upgrading to the next FNN.

that's not a great defence. the best defences are the ones we documented: use offline updates, update from a VT, or update from a tmux/screen session.
this specific bug should now be basically fixed, yes.
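As a small illustration of the VT and tmux/screen defences, a wrapper could refuse to start an update from a terminal that would die together with X. This is a sketch under assumptions: the function name is hypothetical, and it relies on the conventions that tmux sets TMUX, screen sets STY, and a graphical terminal has DISPLAY or WAYLAND_DISPLAY set. It errs on the cautious side, for example an ssh login with X forwarding counts as unsafe.

```shell
# Succeeds if the current shell should survive a restart of the X session.
survives_x_restart() {
    # Inside tmux or screen: the session keeps running if the terminal goes away.
    [ -n "${TMUX:-}" ] && return 0
    [ -n "${STY:-}" ] && return 0
    # No display variables at all: most likely a VT or a plain remote login.
    [ -z "${DISPLAY:-}" ] && [ -z "${WAYLAND_DISPLAY:-}" ] && return 0
    return 1
}

# Usage:
#   survives_x_restart && sudo dnf upgrade
#   survives_x_restart || tmux new-session -s update 'sudo dnf upgrade'
```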
