Bug Description

Description
===========

At present, when swapping encrypted volumes, no attempt is made to attach an encryptor to the target volume. As a result the RAW underlying volume is used during the rebase, and decrypted data is copied from the original volume to the target.

Additionally, while unlikely, a malicious user could easily DoS the compute node hosting the instance by writing a corrupt LUKS header to the RAW volume before detaching and reattaching the volume, for example by setting a keyslot's iteration count (used by PBKDF2) to a very large value (kudos to mdbooth for suggesting this).
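To illustrate why a large PBKDF2 iteration count is expensive for whichever host has to unlock the volume, here is a minimal sketch using only the Python standard library. It does not touch LUKS or any volume, and it is not a reproducer for this bug. The helper name `time_pbkdf2`, the passphrase and the salt are all made up for the example. It only shows that key derivation time grows roughly in proportion to the iteration count.

```python
import hashlib
import time


def time_pbkdf2(iterations, passphrase=b"example", salt=b"0" * 32):
    """Derive a 32-byte key and return (key, elapsed_seconds).

    Hypothetical helper for illustration only; real LUKS keyslot
    handling is done by cryptsetup or QEMU, not by this function.
    """
    start = time.perf_counter()
    key = hashlib.pbkdf2_hmac("sha256", passphrase, salt, iterations, dklen=32)
    return key, time.perf_counter() - start


if __name__ == "__main__":
    # Each tenfold increase in iterations costs roughly ten times the CPU time.
    for iterations in (1_000, 10_000, 100_000):
        _, elapsed = time_pbkdf2(iterations)
        print(f"{iterations:>7} iterations: {elapsed:.4f}s")
```

The exact timings depend on the host, so none are quoted here. The point is only that the cost is controlled by a value read from the header.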

This method of DoS'ing the compute host was previously discussed in the context of bug 1724573 but dismissed, as access to the underlying volume was dependent on a host reboot, outside of a user's control. This bug differs in that a user has full control of the above volume-update/swap_volume flow that provides access to the underlying volume.

Expected result
===============

The encrypted volumes are rebased with their associated encryptors attached, leading to encrypted data being written to the underlying volumes.

Actual result
=============

Decrypted data from the source volume is written to the underlying target volume. This data will be lost with a subsequent detach / attach cycle. Access to the underlying volume could also be used by a malicious user to DoS the local compute host.

I think the priority here is that the user loses their data. The DoS potential is real, but relatively inefficient. I think we could use a CVE to communicate the problem to users, but I don't think this particular issue is important enough to jump through secrecy hoops while we work on the data-loss bug.

I would personally be in favour of early disclosure, handling this openly, and getting it done quickly. We could possibly leave disclosure until early January, though, as we're unlikely to even start work on the fix until then. I'll obviously defer to the VMT and the reporter, just my 2c.

Agreed, the data loss bug is of much higher importance than the potential DoS, just wanted to ensure VMT members were at least aware of the possibility for this to be used maliciously. If there are no further concerns or objections then I'm fine with early disclosure so we can fix this as a priority, in the open.

Since this report concerns a possible security risk, an incomplete security advisory task has been added while the core security reviewers for the affected project or projects confirm the bug and discuss the scope of any vulnerability along with potential solutions.

Given Matthew and Lee are both suggesting we switch to our public workflow in hope of getting the data loss problem solved in a more timely manner (at the expense of disclosing a somewhat inefficient DoS risk), I've gone ahead and subscribed ossg-coresec reviewers to get their input first.

Have these problems existed since the introduction of LUKS support? Or did the ability to perform a volume-update/swap_volume come later? Are all currently supported stable branches (let's say back to stable/ocata anyway since stable/newton is likely to finally get EOL'd by the time this is fixed) at risk from this?

I know it's been holiday time for a lot of people, but given it's been two weeks at this point and we've got some consensus from the reporter and another core reviewer as well as some of the VMT, I'm going ahead and switching this one to our public workflow ending the embargo.

This change introduces new utility methods for attaching and detaching
frontend volume encryptors. These methods centralise the optional
fetching of encryption metadata associated with a volume, fetching of the
required encryptor and calls to detach or attach the encryptor.

These new utility methods are called either after initially connecting
to or before disconnecting from a volume. This ensures encryptors are
correctly connected when swapping volumes for example, where previously
no attempt was made to attach an encryptor to the target volume.

The request context is provided to swap_volume and various other config
generation related methods to allow for the lookup of the relevant
encryption metadata if it is not provided.
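The ordering described above can be sketched as follows. This is a simplified illustration, not the Nova libvirt driver code: `VolumeConnector`, `RecordingEncryptor`, `lookup_encryption` and the `log` list are hypothetical stand-ins introduced here, and the real host connect/disconnect steps are reduced to log entries. The sketch shows the encryptor being attached right after connecting and detached right before disconnecting, with the encryption metadata fetched via the request context only when the caller did not supply it.

```python
class RecordingEncryptor:
    """Hypothetical stand-in for a frontend encryptor; it only records calls."""

    def __init__(self, log):
        self.log = log

    def attach_volume(self, context, **encryption):
        self.log.append("encryptor.attach")

    def detach_volume(self, **encryption):
        self.log.append("encryptor.detach")


class VolumeConnector:
    """Illustrative driver fragment; all names here are assumptions."""

    def __init__(self, lookup_encryption, log):
        # lookup_encryption(context, connection_info) -> dict, empty if unencrypted
        self._lookup_encryption = lookup_encryption
        self.log = log

    def _resolve_encryption(self, context, connection_info, encryption):
        # Optional fetch: only look the metadata up if it was not provided.
        if encryption is None:
            encryption = self._lookup_encryption(context, connection_info)
        return encryption or {}

    def _attach_encryptor(self, context, connection_info, encryption=None):
        encryption = self._resolve_encryption(context, connection_info, encryption)
        if encryption:
            RecordingEncryptor(self.log).attach_volume(context, **encryption)

    def _detach_encryptor(self, context, connection_info, encryption=None):
        encryption = self._resolve_encryption(context, connection_info, encryption)
        if encryption:
            RecordingEncryptor(self.log).detach_volume(**encryption)

    def connect_volume(self, context, connection_info, encryption=None):
        self.log.append("connect")  # placeholder for the real host connection
        self._attach_encryptor(context, connection_info, encryption)

    def disconnect_volume(self, context, connection_info, encryption=None):
        self._detach_encryptor(context, connection_info, encryption)
        self.log.append("disconnect")  # placeholder for the real disconnection
```

Because a swap connects the target volume through the same `connect_volume` path, the target gets an encryptor in this sketch without any swap-specific handling.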

I wanted to give the master change some time to bed in, and while there are conflicts there don't appear to be any show stoppers AFAICT. I'll try to post something for review in the coming week so other stable cores can review.

Rather than backport these libvirt driver changes to stable branches, which are non-trivial refactors of the code, why can't we just put something in the stable branches that explicitly fails a swap volume of encrypted volumes? If swapping encrypted volumes doesn't work until queens, and only then if you're not using native luks encryption, it seems OK to just deny the operation outright in stable branches as a resolution to this bug for pike and ocata.

So at this point I think we're just waiting for review of the stable/pike fix and then a similar fix backported to stable/ocata? Looks like we can probably get moving on CVE assignment from the VMT end in that case. Thanks!

I'm +1 now on the stable/pike backport https://review.openstack.org/#/c/543569/ - my +2 is pending discussion about a 'security' release note in the patch that mentions the CVE / OSSA which isn't yet published, and sounds like we might have a chicken-and-egg with that until the stable/ocata backport is proposed?

CVE-2017-18191 is already reserved for this and can, I suppose, be included in that release note already. You're right that OSSA numbers have historically been assigned at the time of publication so I would consider either amending the release note later and not blocking the patch on it, or we can attempt to work out some sort of coordination mechanism for earlier OSSA assignment.

Prior to Queens any attempt to swap between encrypted volumes would
result in unencrypted data being written to the new volume. This
unencrypted data would then be overwritten the next time the volume was
attached to an instance as Nova no longer identified the volume as
encrypted, resulting in the volume being reformatted.

This stable only change uses limited parts of the following changes to
block all swap_volume attempts with encrypted volumes prior to Queens
where this was resolved by Ica323b87fa85a454fca9d46ada3677f18 and also
blocked when using QEMU to decrypt LUKS volumes by
Ibfa64f18bbd2fb70db7791330ed1a64fe61c1.

Ica323b87fa85a454fca9d46ada3677f18fe50022

The request context is provided to swap_volume in order to look up the
encryption metadata of a volume.

Ibfa64f18bbd2fb70db7791330ed1a64fe61c1355

Attempts to swap from an encrypted volume are blocked with a
NotImplementedError exception raised.

I258127fdcd011ccec721d5ff62eb7f128f130336

Attempts to swap from an unencrypted volume to an encrypted volume are
also blocked with a NotImplementedError exception raised.

Ie02d298cd92d5b5ebcbbcd2b0e8be01f197bfafb

The serial of a volume is used as the id if connection_info for the
volume doesn't contain the volume_id key. Required to avoid bug #1746609.
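The behaviour described by the changes listed above could be sketched roughly as below. This is an assumption-laden illustration rather than the backported code: `check_swap_supported` and the `lookup_encryption` callable are hypothetical names, and the layout of `connection_info` (a `data` dict plus a top-level `serial`) is assumed from the description. The sketch resolves the volume id with a fallback to the serial, looks up encryption metadata using the request context, and raises NotImplementedError if either side of the swap is encrypted.

```python
def get_volume_id(connection_info):
    """Return the volume id, falling back to 'serial' when 'volume_id' is absent."""
    volume_id = connection_info.get("data", {}).get("volume_id")
    if not volume_id:
        volume_id = connection_info.get("serial")
    return volume_id


def check_swap_supported(context, old_connection_info, new_connection_info,
                         lookup_encryption):
    """Refuse a swap when the source or the target volume is encrypted.

    lookup_encryption(context, volume_id) is a hypothetical callable that
    returns the encryption metadata dict, or an empty dict if unencrypted.
    """
    volumes = (("source", old_connection_info), ("target", new_connection_info))
    for role, connection_info in volumes:
        volume_id = get_volume_id(connection_info)
        if lookup_encryption(context, volume_id):
            raise NotImplementedError(
                f"Swapping is not supported: {role} volume {volume_id} is encrypted"
            )
```

Checking both roles in one loop covers the encrypted-to-any and unencrypted-to-encrypted cases with the same exception.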
