There is an undesirable situation in SMP Linux machines when sending an
IPI via the smp_call_function_single() API:

/*
 * smp_call_function_single - Run a function on a specific CPU
 * @func: The function to run. This must be fast and non-blocking.
 * @info: An arbitrary pointer to pass to the function.
 * @wait: If true, wait until function has completed on other CPUs.
 *
 * Returns 0 on success, else a negative status code.
 */
int smp_call_function_single(int cpu, smp_call_func_t func, void *info,
			     int wait)

It runs "func" on a specific CPU and, if the "wait" flag is true,
busy-waits until "func" has completed its work before returning to the
caller. This is a useful feature for controlling when it is safe to
continue, e.g. to parse data generated by another CPU.

Unfortunately, Linux can kill the task waiting in its
smp_call_function_single() call: the NMI watchdog detects a "soft
lockup" on the CPU that called smp_call_function_single() with the
"wait" flag set. In practice this is not always the desirable behavior.
It is possible that the target CPU of the IPI was unresponsive and
would never (or not in time) pick up the IPI sent to it. From the
perspective of the task sending the IPI, the target CPU never finished
the work, so smp_call_function_single() continues its busy-wait loop.
In the meantime, the NMI watchdog can kill the busy-waiting task instead
of somehow "unlocking" the CPU that did not handle the IPI in time.
This is often visible in the logs, e.g.:

In this situation you will see two task dumps in the logs: first for the
smp_call_function_single() busy-loop task, and then for the IPI target
task, with different "stuck" times for the CPU/core, e.g.:

In that case, the problematic task on CPU1 "recovered" on its own and
resumed normal work before the watchdog could kill it. But at the same
time, the NMI watchdog killed the task that sent the IPI and was waiting
for it to finish. The logs then contain only a dump of the task that was
busy-waiting, and nothing about the original culprit.

Any other APIs wrapping smp_call_function_* are affected by the same
problem. However, functions like on_each_cpu() have additional logic,
which makes the issue less visible in their logs than when the
smp_call_function_single() API is used directly.

Another aspect of the described situation is that, for the stability of
the system as a whole, it might in fact be safer to kill the
busy-waiting task rather than the more problematic task that fails to
handle the IPI. Unfortunately, the resulting logs are very confusing:
they suggest the problem resides in the execution context that was
killed and dumped — which in fact behaved correctly — while the actually
problematic context might not be dumped at all. This makes it hard to
root-cause the problem even when one is aware of this shortcoming of the
killing and logging behavior.

A better behavior in that case might be to have the busy loop time out
cleanly, with an error return from smp_call_function_single(). At the
same time, we could send an NMI to all CPUs to print backtraces, which
would greatly help in diagnosing the problem, although on systems with
many logical CPUs this may be impractical (e.g., Knights Landing packs
up to 288 "CPUs" into one chip, and such systems may already be
surprisingly common).