My two observations relate to both code size and runtime performance.
These observations don't affect my situation, so I'm not inclined to
spend a bunch of time on them, but maybe someone else is interested.
They should be especially interesting since these inline functions are
used all over the kernel, so a change might actually make a small but
noticeable difference.

I suppose there's a reason this code is the way it is. If so, feel free
to ignore me or flame away.

1. If the first part of the if were an #ifdef instead, only one of the
two code paths would be compiled into each inlined copy. That would
reduce code size and also remove the runtime test.
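
To make point 1 concrete, here is a minimal sketch of the two styles. It is
not the actual atomic.h code: every name in it (sketch_cpu_has_llsc,
SKETCH_HAS_LLSC, the sketch_add_* helpers) is made up for illustration, and
the two "implementations" are plain C stand-ins rather than real ll/sc or
locked sequences. It also assumes the flag really is a runtime variable; if
it were a compile-time constant, the compiler could drop the dead branch by
itself.

```c
/* Hypothetical runtime flag standing in for a CPU feature test. */
static int sketch_cpu_has_llsc = 1;

static int sketch_counter;

/* Plain C stand-ins for the two implementations (not real ll/sc code). */
static void sketch_add_llsc(int i)   { sketch_counter += i; }
static void sketch_add_locked(int i) { sketch_counter += i; }

/*
 * Runtime test: every inlined copy carries the load of the flag,
 * the branch, and both code paths.
 */
static inline void sketch_add_runtime(int i)
{
    if (sketch_cpu_has_llsc)
        sketch_add_llsc(i);
    else
        sketch_add_locked(i);
}

/*
 * Preprocessor test: only one path is compiled in, selected at build
 * time (for example with -DSKETCH_HAS_LLSC, a hypothetical macro).
 */
static inline void sketch_add_ifdef(int i)
{
#ifdef SKETCH_HAS_LLSC
    sketch_add_llsc(i);
#else
    sketch_add_locked(i);
#endif
}
```

Both variants behave the same from the caller's side. The difference is only
in what ends up in the object code at each call site.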

2. In atomic.h the "C lang stuff" is wrapped with a spinlock. In the
SMP case the spinlock will result in code that contains ll and sc
instructions, so I infer that there are no SMP system configs that use
CPUs that don't have the ll and sc instructions.
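
The reasoning in point 2 can be sketched roughly as follows. This is an
assumption-laden illustration, not the kernel's code: the fallback_* names
are invented, and a C11 atomic_flag stands in for the kernel spinlock. The
point is only that taking the lock is itself an atomic read-modify-write,
which on an SMP build would need ll/sc (or an equivalent) anyway.

```c
#include <stdatomic.h>

/* Hypothetical lock standing in for the spinlock around the C fallback. */
static atomic_flag fallback_lock = ATOMIC_FLAG_INIT;
static int fallback_value;

static void fallback_lock_acquire(void)
{
    /* Atomic test-and-set: this is the step that itself needs an
     * atomic instruction sequence on a multiprocessor system. */
    while (atomic_flag_test_and_set_explicit(&fallback_lock,
                                             memory_order_acquire))
        ; /* spin until the lock is free */
}

static void fallback_lock_release(void)
{
    atomic_flag_clear_explicit(&fallback_lock, memory_order_release);
}

/* The "C lang stuff": a plain add, made safe only by the lock. */
static int fallback_add_return(int i)
{
    int result;

    fallback_lock_acquire();
    fallback_value += i;
    result = fallback_value;
    fallback_lock_release();
    return result;
}
```

So on SMP the fallback path would not avoid the atomic instructions, it
would just move them into the lock.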