This patch adds support for storing the TPM's persistent state in QEMU
block storage, i.e., QCoW2.
The TPM creates state of varying size, depending, for example, on how many
keys are loaded into it at a given time. The worst-case sizes of
the different blobs the TPM can write have been pre-calculated and are
used to determine the minimum size of the QCoW2 image, which is
83 KB (libtpm rev. 7). 'qemu-... -tpm ?' shows this number when this
backend driver is available.
The layout of the TPM's persistent data in the block storage is as follows:
The first sector (512 bytes) holds a primitive directory for the different
types of blobs that the TPM can write. This directory holds a revision
number, a checksum over its content, the number of entries, and the entries
themselves.
typedef struct BSDir {
    uint16_t  rev;
    uint32_t  checksum;
    uint32_t  num_entries;
    uint32_t  reserved[10];
    BSEntry   entries[BS_DIR_MAX_NUM_ENTRIES];
} __attribute__((packed)) BSDir;
Each entry records the blob's absolute offset, its maximum size, the
number of currently valid bytes (the blobs grow and shrink), and the
type of blob (see below for the types). A CRC32 over the blob
is also included.
typedef struct BSEntry {
    enum BSEntryType type;
    uint64_t         offset;
    uint32_t         space;
    uint32_t         blobsize;
    uint32_t         blobcrc32;
    uint32_t         reserved[9];
} __attribute__((packed)) BSEntry;
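To illustrate how the 'space', 'blobsize' and 'blobcrc32' fields could be
used together when reading a blob back, here is a minimal sketch. It is
not code from this patch: 'struct entry_view', 'crc32_ieee' and
'blob_is_valid' are hypothetical names, and the use of the common
IEEE 802.3 CRC-32 polynomial is an assumption, since the description does
not say which CRC32 variant is used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of the BSEntry fields used for validation. */
struct entry_view {
    uint32_t space;      /* maximum size reserved for the blob */
    uint32_t blobsize;   /* number of currently valid bytes */
    uint32_t blobcrc32;  /* CRC32 stored when the blob was written */
};

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320); assumed variant. */
static uint32_t crc32_ieee(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            /* Shift right; apply the polynomial if the low bit was set. */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

/* A blob is accepted only if it fits its slot and its CRC32 matches. */
static bool blob_is_valid(const struct entry_view *e, const uint8_t *blob)
{
    assert(e != NULL && blob != NULL);
    if (e->blobsize > e->space) {
        return false;
    }
    return crc32_ieee(blob, e->blobsize) == e->blobcrc32;
}
```

Checking 'blobsize' against 'space' before computing the CRC keeps a
corrupted directory entry from causing a read past the reserved slot.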
The worst-case sizes of the blobs have been calculated, and based on these
sizes the blobs are written at fixed offsets into the block storage. Their
offsets are all aligned to sectors (512-byte boundaries).
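The sector alignment can be sketched as follows. This is an illustration
only, not the layout code of this patch: 'round_up_to_sector' and
'layout_blobs' are hypothetical helpers, and the sketch assumes that the
blobs are placed back to back directly after the directory sector.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE 512u  /* sector size stated in the description */

/* Round a byte count up to the next multiple of the sector size. */
static uint64_t round_up_to_sector(uint64_t nbytes)
{
    return (nbytes + SECTOR_SIZE - 1) / SECTOR_SIZE * SECTOR_SIZE;
}

/*
 * Hypothetical helper: assign each blob a sector-aligned offset based on
 * its worst-case size. The first sector is reserved for the directory, so
 * the first blob starts at offset 512 (assumption). Returns the total
 * number of bytes the image would need under this layout.
 */
static uint64_t layout_blobs(const uint32_t *max_sizes, size_t n,
                             uint64_t *offsets)
{
    uint64_t next = SECTOR_SIZE;

    assert(max_sizes != NULL && offsets != NULL);
    for (size_t i = 0; i < n; i++) {
        offsets[i] = next;
        next += round_up_to_sector(max_sizes[i]);
    }
    return next;
}
```

Because every slot is sized for the worst case, a blob can grow and shrink
in place without any other blob having to move.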
The TPM provides three different blobs that are written into the storage:
- volatile state
- permanent state
- save state
The 'save state' is written when the VM suspends (ACPI S3) and read when it
resumes. This is done in concert with the BIOS where the BIOS needs to send
a command to the TPM upon resume (TPM_Startup(ST_STATE)), while the OS
issues the command TPM_SaveState() before entering ACPI S3.
The 'permanent state' is written when the TPM receives a command that alters
its permanent state, e.g., when a key is loaded into the TPM that is expected
to be there upon reboot of the machine / VM.
Volatile state is written when the frontend triggers it, i.e., when the
VM's state is written out while taking a snapshot, migrating, or
suspending to disk (as in 'virsh save'). This state serves to resume
at the point where the TPM previously stopped; it is not needed
after a machine reboot, for example.
The tricky parts relate to encrypted QCoW2 storage: certain
operations need to be deferred because the key for the storage only becomes
available via the monitor, well after the backend has been
instantiated.
The backend also checks the validity of the block storage.
If the QCoW2 image is not encrypted and the checksum is found to be
bad, the block storage directory will be initialized.
If the QCoW2 image is encrypted, initialization will only be done if
the directory is found to be all 0s. If the directory cannot be
checksummed correctly but is not all 0s, it is assumed that the user
provided a wrong key. In this case QEMU does not exit, but the TPM is put
into failure mode.
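The decision described above can be summarized in a small sketch. It
mirrors only the prose of this description and is not taken from the
patch; 'enum dir_action', 'all_zeros' and 'classify_directory' are
hypothetical names. Note that the v6 changelog entry below suggests the
actual code may be stricter and report an error for a corrupted directory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical outcomes of inspecting the directory sector. */
enum dir_action {
    DIR_USE,           /* checksum is good: use the directory as-is */
    DIR_INITIALIZE,    /* write a fresh, empty directory */
    DIR_FAILURE_MODE   /* likely wrong key: put the TPM into failure mode */
};

/* Return true if every byte of the buffer is zero. */
static bool all_zeros(const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] != 0) {
            return false;
        }
    }
    return true;
}

/* Decide what to do with a directory sector read from the image. */
static enum dir_action classify_directory(const uint8_t *sector, size_t len,
                                          bool checksum_ok, bool encrypted)
{
    assert(sector != NULL);
    if (checksum_ok) {
        return DIR_USE;
    }
    if (!encrypted) {
        /* Unencrypted image with a bad checksum: initialize. */
        return DIR_INITIALIZE;
    }
    /* Encrypted image: only an all-zero directory counts as empty. */
    return all_zeros(sector, len) ? DIR_INITIALIZE : DIR_FAILURE_MODE;
}
```

Treating a non-zero, non-verifiable directory on an encrypted image as a
wrong key avoids overwriting valid state that merely decrypted to garbage.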
v6:
- reworked parts of the error path handling: the TPM is
now used to process commands under error conditions, and the callbacks
make the TPM aware of the error conditions. Only as a last resort
are fault messages sent by the backend driver, circumventing the TPM.
- removed data layout function
- only initializing storage directory if it is found to be empty; report
error if found corrupted
- removed some assert()s
v5:
- name of drive is 'drive-vtpm0-nvram'; was 'vtpm-nvram'
v4:
- functions prefixed with tpm_builtin
- added 10 uint32_t to BSDir as being reserved for future use
- never move data in the block storage while migration is going on
- use bdrv_lock/bdrv_unlock to serialize access to the TPM's state
file, which is primarily necessary during migration and the startup
of qemu on the target host, where the content of the drive is being
read and validated
v3:
- added reserved ints for future extensions to the entries in the
directory structure
- added crc32 to every entry in the directory structure and calculating
it when writing and checking it when reading
- fixed an endianness issue related to crc calculation
- surrounding debugging output function in adjust_data_layout
with #if defined DEBUG_TPM
- probing for installed libtpms development package by test-compiling
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
---
configure | 25 +
hw/tpm_builtin.c | 706 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 731 insertions(+)