"If you are considering timescales of 10 microseconds and lower, you need to architect for cores to run independently - so you need to understand the data in the shared CPU cache, and use thread affinity to keep specific operations on specific cores."
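
As a rough illustration of the thread-affinity part of that advice, here is a minimal sketch that assumes Linux, where Python exposes `os.sched_setaffinity` and passing `0` affects the calling thread. The helper names (`allowed_cores`, `pin_current_thread`, `run_pinned`) are hypothetical and not from the quote. Python itself is not suited to 10-microsecond work (the interpreter lock alone rules it out), so this only shows the shape of the API; a real system would make the equivalent calls from C, C++, Rust, or Java.

```python
import os
import threading


def allowed_cores():
    """Return the cores this process may run on (empty list if unsupported)."""
    if hasattr(os, "sched_getaffinity"):
        return sorted(os.sched_getaffinity(0))
    return []


def pin_current_thread(core):
    """Try to restrict the calling thread to one core; return True on success."""
    if not hasattr(os, "sched_setaffinity"):
        return False  # platform without this API (e.g. macOS, Windows)
    try:
        os.sched_setaffinity(0, {core})  # 0 = the calling thread on Linux
    except OSError:
        return False  # core not allowed, or the call is blocked
    return True


def run_pinned(tasks):
    """Run each (core, func) pair in its own thread, pinned to that core.

    Returns a list of (pinned, affinity, result) tuples in task order.
    """
    results = [None] * len(tasks)

    def worker(index, core, func):
        pinned = pin_current_thread(core)
        affinity = os.sched_getaffinity(0) if pinned else None
        results[index] = (pinned, affinity, func())

    threads = [
        threading.Thread(target=worker, args=(i, core, func))
        for i, (core, func) in enumerate(tasks)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    cores = allowed_cores()
    # Hypothetical split: one operation per core, if at least two are available.
    if len(cores) >= 2:
        print(run_pinned([(cores[0], lambda: "decode"), (cores[1], lambda: "match")]))
```

Pinning only controls where a thread runs. Keeping cores independent also depends on which data each pinned thread touches, so that two cores are not repeatedly writing to the same cache lines.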