PERFINFO_CCSWAP_BUFFER

The PERFINFO_CCSWAP_BUFFER is one of many types of
fixed-size header that begin the data for an event as held in the trace buffers
or flushed to an Event Trace Log (ETL) file for an NT Kernel Logger session. The
event is specifically PERFINFO_LOG_TYPE_CONTEXTSWAP_BATCH
(0x0525).

Usage

The PERFINFO_LOG_TYPE_CONTEXTSWAP_BATCH event exists
to trace context swaps, of course, but to do so without the expense of logging an
event on every context swap. Here, context swap means a change of thread: a processor
switches from running an old thread to a new thread. For each processor, the kernel
accumulates data on successive thread switches that occur on that processor and
writes this batch as one event if the latest thread
switch satisfies any of several conditions. Presently, these are:

data has accumulated for too many threads;

too much time (more than 500 timer ticks) has passed since the batch started;

the batch is already too large to add data for this latest thread switch;

too much time has passed since the last thread switch on the same processor;

this is to be the last thread switch to trace, since tracing of this event
has stopped.

Note, however, that the implementation is a little more complicated since the
kernel not only tracks each processor separately but also may record each context
switch in multiple batches to account for different clock types that are in use
by trace sessions that are enabled for the event.
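The conditions listed above can be sketched as one predicate. This is an illustration only: it ignores the complication of multiple clock types, the batch_state structure and all the names are hypothetical, and the limits (16 threads, 500 ticks, 0x0400 bytes, 8 bytes for a full record, 30 bits for a time delta) are the ones given elsewhere on this page. What exactly the kernel counts towards the 0x0400 bytes is assumed to be whatever bytes_used holds.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Limits taken from the text; the names are hypothetical. */
enum {
    CCSWAP_MAX_THREADS      = 0x10,   /* entries in TidTable */
    CCSWAP_MAX_BATCH_TICKS  = 500,    /* timer ticks since the batch started */
    CCSWAP_MAX_BATCH_BYTES  = 0x0400, /* total size allowed for a batch */
    CCSWAP_FULL_RECORD_SIZE = 8,      /* size of a full PERFINFO_CCSWAP */
    CCSWAP_TIME_DELTA_BITS  = 30      /* widest TimeDelta */
};

/* Hypothetical per-processor bookkeeping, not the kernel's own structure. */
typedef struct {
    unsigned thread_count;    /* TidTable entries in use */
    size_t   bytes_used;      /* bytes of the batch already filled */
    uint64_t ticks_in_batch;  /* timer ticks since the batch started */
} batch_state;

/* Returns true if the batch must be written as an event at this thread switch. */
bool batch_must_flush(const batch_state *b, bool old_thread_needs_slot,
                      uint64_t time_delta, bool tracing_stopped)
{
    /* data has accumulated for too many threads */
    if (old_thread_needs_slot && b->thread_count >= CCSWAP_MAX_THREADS)
        return true;
    /* too much time has passed since the batch started */
    if (b->ticks_in_batch > CCSWAP_MAX_BATCH_TICKS)
        return true;
    /* no room remains for a full-size record */
    if (b->bytes_used + CCSWAP_FULL_RECORD_SIZE > CCSWAP_MAX_BATCH_BYTES)
        return true;
    /* the delta since the last thread switch does not fit 30 bits */
    if (time_delta >> CCSWAP_TIME_DELTA_BITS)
        return true;
    /* this is the last thread switch to trace */
    return tracing_stopped;
}
```

Here old_thread_needs_slot would be false for the idle thread and for any thread that is already in the list.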

For any particular NT Kernel Logger session to be sent this event, the
group masks PERF_CONTEXT_SWITCH (0x20000004) and
PERF_COMPACT_CSWITCH (0x20000100) must both be enabled.
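As a sketch of what it means for both group masks to be enabled, the following assumes the usual encoding of these values, in which the top three bits select one of eight 32-bit masks and the remaining bits are the flags to test within that mask. That encoding is an assumption here, not something this page establishes, and the group_masks type and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define PERF_CONTEXT_SWITCH  0x20000004u
#define PERF_COMPACT_CSWITCH 0x20000100u

/* Assumed encoding: top 3 bits index a mask, low 29 bits are flags in it. */
#define GROUP_MASK_INDEX(gm) ((gm) >> 29)
#define GROUP_MASK_BITS(gm)  ((gm) & 0x1FFFFFFFu)

/* Hypothetical stand-in for a session's set of enabled group masks. */
typedef struct {
    uint32_t masks[8];
} group_masks;

bool group_enabled(const group_masks *g, uint32_t gm)
{
    return (g->masks[GROUP_MASK_INDEX(gm)] & GROUP_MASK_BITS(gm)) != 0;
}

/* The batch event needs both group masks, not just either one. */
bool ccswap_batch_enabled(const group_masks *g)
{
    return group_enabled(g, PERF_CONTEXT_SWITCH)
        && group_enabled(g, PERF_COMPACT_CSWITCH);
}
```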

Documentation Status

The PERFINFO_CCSWAP_BUFFER is not documented. A C-language
definition is published in the NTWMI.H header from some editions of the Windows
Driver Kit (WDK) for Windows 10.

Layout

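The event as held in the trace buffers or flushed to an ETL file comprises:

a PERFINFO_TRACE_HEADER;

a PERFINFO_CCSWAP_BUFFER;

a sequence of PERFINFO_CCSWAP structures (in full
or compressed form), one for each thread switch.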

Trace Header

In the PERFINFO_TRACE_HEADER, the
Size is the total in bytes of the trace header and all
the event data. The HookId is
PERFINFO_LOG_TYPE_CONTEXTSWAP_BATCH, which identifies the event.

The Marker is, at its most basic, 0xC0100002 (32-bit)
or 0xC0110002 (64-bit). Additional flags may be set to indicate that extended data
items are inserted between the trace header and the event data. Ordinarily, however,
the event data follows as the trace header’s Data array.
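A reader of trace buffers might recognise the event from the Marker and HookId along the following lines. This is a minimal sketch: the function and enumeration names are hypothetical, and because the additional flags are not enumerated here, any Marker other than the two basic values is simply reported as not basic rather than interpreted.

```c
#include <stdbool.h>
#include <stdint.h>

#define PERFINFO_LOG_TYPE_CONTEXTSWAP_BATCH 0x0525u
#define MARKER_BASIC_32BIT 0xC0100002u
#define MARKER_BASIC_64BIT 0xC0110002u

/* Hypothetical classification of the trace header's Marker. */
typedef enum {
    HEADER_NOT_BASIC,   /* some other header type, or extra flags are set */
    HEADER_32BIT,
    HEADER_64BIT
} header_kind;

header_kind classify_marker(uint32_t marker)
{
    if (marker == MARKER_BASIC_32BIT) return HEADER_32BIT;
    if (marker == MARKER_BASIC_64BIT) return HEADER_64BIT;
    return HEADER_NOT_BASIC;
}

/* True if the header is basic and the HookId identifies the batch event,
 * in which case the event data follows as the header's Data array. */
bool is_ccswap_batch(uint32_t marker, uint16_t hook_id)
{
    return classify_marker(marker) != HEADER_NOT_BASIC
        && hook_id == PERFINFO_LOG_TYPE_CONTEXTSWAP_BATCH;
}
```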

Event Data

The event data itself begins with a fixed-size header. This
PERFINFO_CCSWAP_BUFFER is 0x58 bytes in both 32-bit
and 64-bit Windows:

Offset  Definition
0x00    LONGLONG FirstTimeStamp;
0x08    ULONG TidTable [0x10];
0x48    SCHAR ThreadBasePriority [0x10];

The FirstTimeStamp tells when this batch started.
The unit of measurement depends on the trace session’s clock type. Data for each
thread switch records only the difference in time from the preceding thread switch.
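Recovering an absolute time for each thread switch is then a running sum. The sketch below assumes that the first record's delta is measured from FirstTimeStamp, which seems natural but is not stated explicitly here, and that the deltas have already been extracted from the records. The function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Converts per-record time deltas into absolute times, in the units of the
 * trace session's clock type. Assumes the first delta is relative to
 * first_time_stamp. */
void absolute_timestamps(int64_t first_time_stamp,
                         const uint32_t *time_deltas, size_t count,
                         int64_t *out)
{
    int64_t t = first_time_stamp;
    for (size_t i = 0; i < count; i++) {
        t += time_deltas[i];
        out[i] = t;
    }
}
```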

The TidTable lists the thread ID for every thread
that has been seen as the old thread in any thread switches since this batch started.
Data for each thread switch identifies the old thread by indexing into this list.
When a thread switch occurs and the old thread is not the idle thread, it is added
to the list. If the list is full, the existing batch becomes an event and a new
batch is started.
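The handling of the list can be pictured as a find-or-add. This is a sketch of the behaviour described, not the kernel's code: how the kernel counts the entries in use is not known here, so the count member and all the names are hypothetical. The caller is expected to have excluded the idle thread already.

```c
#include <stdint.h>

#define TID_TABLE_SIZE 0x10

/* Hypothetical tracker: the table as in the header, plus a count of the
 * entries in use. */
typedef struct {
    uint32_t tid_table[TID_TABLE_SIZE];
    unsigned count;
} tid_tracker;

/* Returns the 0-based index for the thread, adding it if it is new.
 * Returns -1 if the thread is new and the list is full, meaning the
 * existing batch must become an event before this switch is recorded. */
int tid_index_for(tid_tracker *t, uint32_t tid)
{
    for (unsigned i = 0; i < t->count; i++) {
        if (t->tid_table[i] == tid)
            return (int)i;
    }
    if (t->count == TID_TABLE_SIZE)
        return -1;
    t->tid_table[t->count] = tid;
    return (int)t->count++;
}
```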

The ThreadBasePriority array gives the base priority
of each thread at the time it was first switched away from. Data for each thread
switch may indicate the old thread’s priority as an increment from this base priority.
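The layout above can be restated with fixed-width types and checked at compile time. The structure below mirrors the three members as described; the two helper functions are hypothetical and show only how a reader might resolve a record's index and priority increment against the header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed-width restatement of the header: LONGLONG, ULONG and SCHAR become
 * int64_t, uint32_t and int8_t. */
typedef struct {
    int64_t  FirstTimeStamp;
    uint32_t TidTable[0x10];
    int8_t   ThreadBasePriority[0x10];
} PERFINFO_CCSWAP_BUFFER;

static_assert(offsetof(PERFINFO_CCSWAP_BUFFER, TidTable) == 0x08,
              "TidTable is at offset 0x08");
static_assert(offsetof(PERFINFO_CCSWAP_BUFFER, ThreadBasePriority) == 0x48,
              "ThreadBasePriority is at offset 0x48");
static_assert(sizeof(PERFINFO_CCSWAP_BUFFER) == 0x58,
              "the header is 0x58 bytes");

/* Thread ID of the old thread, given its index from a thread switch record. */
uint32_t old_thread_id(const PERFINFO_CCSWAP_BUFFER *h, unsigned index)
{
    return h->TidTable[index & 0x0F];
}

/* Old thread's priority, given its index and a priority increment. */
int old_thread_priority(const PERFINFO_CCSWAP_BUFFER *h, unsigned index,
                        unsigned increment)
{
    return h->ThreadBasePriority[index & 0x0F] + (int)increment;
}
```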

Thread Switch

The fixed-size header is followed by however much data has accumulated about
thread switches since the last batch was logged as an event. The total size allowed
for a batch is 0x0400 bytes. When a thread switch occurs and there is not at least
eight bytes remaining, the existing batch becomes an event and a new batch is started.

The full form for the data that describes each thread switch is the 8-byte
PERFINFO_CCSWAP:

To save space, however, the data can instead be present in any of three reduced forms
(see below). The DataType tells which of the four forms is present:

PerfCSwapIdleShort (0) for a
PERFINFO_CCSWAP_IDLE_SHORT;

PerfCSwapIdle (1) for a
PERFINFO_CCSWAP_IDLE;

PerfCSwapLite (2) for a
PERFINFO_CCSWAP_LITE;

PerfCSwapFull (3) for a
PERFINFO_CCSWAP.
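Since the forms differ in size, a reader must look at the DataType of each record to find the next. The following sketch walks a run of records and counts them. It assumes that the DataType occupies the low two bits of each record's first byte, which is what a leading 2-bit bit field gives with Microsoft's compilers on little-endian hardware. The sizes of 2, 4, 4 and 8 bytes are those given on this page; the function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

enum {
    PerfCSwapIdleShort = 0,
    PerfCSwapIdle      = 1,
    PerfCSwapLite      = 2,
    PerfCSwapFull      = 3
};

/* Size in bytes of each form, indexed by DataType. */
static const size_t ccswap_record_size[4] = { 2, 4, 4, 8 };

/* Counts the thread switch records in len bytes that follow the fixed-size
 * header. Returns (size_t)-1 if the last record is truncated. */
size_t count_ccswap_records(const uint8_t *data, size_t len)
{
    size_t count = 0;
    size_t pos = 0;
    while (pos < len) {
        unsigned type = data[pos] & 3;   /* assumed position of DataType */
        size_t size = ccswap_record_size[type];
        if (size > len - pos)
            return (size_t)-1;
        pos += size;
        count++;
    }
    return count;
}
```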

The TimeDelta tells how much time, in the units of
the trace session’s clock type, has passed since the preceding thread switch. If
too much time passes between thread switches, such that the delta will not fit the
allowed 30 bits, the existing batch becomes an event and a new batch is started.

The OldThreadIdIndex identifies the outgoing thread
indirectly. It is the thread’s 0-based index into the header’s
TidTable. Note that a thread can come and go multiple
times in one batch.

The OldThreadStateWr is a compound of the outgoing
thread’s WaitReason and State,
as read from the KTHREAD. The former tells
why the outgoing thread is to wait. It takes its values from the documented
KWAIT_REASON enumeration, from zero up to but not including
MaximumWaitReason, which is currently 0x27. Values
of OldThreadStateWr that are not below this are instead
a biased State, specifically the
State plus MaximumWaitReason.
The State takes its values from the undocumented
KTHREAD_STATE enumeration (with a current maximum of 9).
Note that for a WaitReason to be shown, the old thread’s
State must be Waiting
(5).
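The compounding can be sketched as a pair of functions. The values 0x27 for MaximumWaitReason and 5 for Waiting are as given above and may differ between Windows versions. That the encoder chooses the WaitReason exactly when the State is Waiting is an inference from the description, and the names are hypothetical.

```c
#include <stdbool.h>

#define MAXIMUM_WAIT_REASON   0x27u   /* current value, per the text */
#define KTHREAD_STATE_WAITING 5u

/* Hypothetical decoded form of OldThreadStateWr. */
typedef struct {
    unsigned state;            /* KTHREAD_STATE value */
    bool     has_wait_reason;  /* true only if state is Waiting */
    unsigned wait_reason;      /* KWAIT_REASON value, if present */
} old_thread_state;

old_thread_state decode_state_wr(unsigned state_wr)
{
    old_thread_state s;
    if (state_wr < MAXIMUM_WAIT_REASON) {
        /* a plain WaitReason, which implies that the State is Waiting */
        s.state = KTHREAD_STATE_WAITING;
        s.has_wait_reason = true;
        s.wait_reason = state_wr;
    } else {
        /* a State that has been biased by MaximumWaitReason */
        s.state = state_wr - MAXIMUM_WAIT_REASON;
        s.has_wait_reason = false;
        s.wait_reason = 0;
    }
    return s;
}

unsigned encode_state_wr(unsigned state, unsigned wait_reason)
{
    return state == KTHREAD_STATE_WAITING
        ? wait_reason
        : state + MAXIMUM_WAIT_REASON;
}
```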

The NewThreadWaitTime tells how long the incoming
thread was waiting, in timer ticks.

Compression

When the new thread has been waiting no more than 1 tick and the
TimeDelta will fit in 17 bits and the old thread (as
will almost always be true) has not increased its priority by more than 7 from the
base priority that is recorded for it in the header’s ThreadBasePriority
array, all that might go into the 8-byte PERFINFO_CCSWAP
can instead fit in the 4-byte PERFINFO_CCSWAP_LITE:

The OldThreadPriInc is the increase in the old thread’s
priority over the recorded base priority.
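The three conditions for the compressed form can be put as one test. This sketch takes the limits exactly as stated above (1 tick, 17 bits, an increase of no more than 7) and additionally assumes that a priority below the recorded base priority rules out the compressed form, since the increment is described only as an increase. The function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if a switch away from a thread other than the idle thread can be
 * recorded as the 4-byte PERFINFO_CCSWAP_LITE. */
bool ccswap_fits_lite(uint32_t time_delta, uint32_t new_thread_wait_ticks,
                      int old_priority, int base_priority)
{
    int increment = old_priority - base_priority;
    return new_thread_wait_ticks <= 1
        && (time_delta >> 17) == 0
        && increment >= 0      /* assumption: a decrease needs the full form */
        && increment <= 7;
}

/* Size in bytes of the record that would be written: 4 if compressed,
 * else 8 for the full PERFINFO_CCSWAP. */
unsigned ccswap_nonidle_record_size(uint32_t time_delta,
                                    uint32_t new_thread_wait_ticks,
                                    int old_priority, int base_priority)
{
    return ccswap_fits_lite(time_delta, new_thread_wait_ticks,
                            old_priority, base_priority) ? 4u : 8u;
}
```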

Idle Thread

A different saving applies when the old thread is the idle thread, i.e., the
one whose thread ID is zero. This thread does not figure in the header’s
TidTable. Data for a thread switch away from the
idle thread is in general a 4-byte PERFINFO_CCSWAP_IDLE:

Offset  Definition
0x00    ULONG DataType : 2;
        ULONG TimeDelta : 30;

However, it too can be compressed, to the 2-byte PERFINFO_CCSWAP_IDLE_SHORT,
if only a little time has passed since the last thread switch:

Offset  Definition
0x00    USHORT DataType : 2;
        USHORT TimeDelta : 14;
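Both idle forms are simple enough to decode by hand. The sketch below reads the bytes explicitly as little-endian and assumes that bit fields are allocated from the low-order bits, as Microsoft's compilers do, so that the DataType is the low two bits and the TimeDelta is everything above. The function names are hypothetical.

```c
#include <stdint.h>

static uint32_t load_le16(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8;
}

static uint32_t load_le32(const uint8_t *p)
{
    return load_le16(p) | (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Decodes a record for a switch away from the idle thread. Stores the
 * TimeDelta and returns the record's size in bytes (2 or 4), or returns 0
 * if the record is not in either idle form. The caller must ensure that
 * enough bytes are available at p. */
unsigned decode_idle_record(const uint8_t *p, uint32_t *time_delta)
{
    switch (p[0] & 3) {
    case 0:     /* PerfCSwapIdleShort: 2 bits of type, 14 bits of delta */
        *time_delta = load_le16(p) >> 2;
        return 2;
    case 1:     /* PerfCSwapIdle: 2 bits of type, 30 bits of delta */
        *time_delta = load_le32(p) >> 2;
        return 4;
    default:    /* PerfCSwapLite or PerfCSwapFull */
        return 0;
    }
}
```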

This page was created on 17th
December 2016 and was last
modified on 18th December 2016.