Microprocessor design is both complex and time-consuming: exploring a huge design space for identifying the optimal design under a number of constraints is infeasible using detailed architectural simulation of entire benchmark executions. Statistical simulation is a recently introduced approach for efficiently culling the microprocessor design space. The basic idea of statistical simulation is to collect a number of important program characteristics and to generate a synthetic trace from it. Simulating this synthetic trace is extremely fast as it contains a million instructions only. This paper improves the statistical simulation methodology by proposing accurate memory data flow models. We propose (i) cache miss correlation, or measuring cache statistics conditionally dependent on the global cache hit/miss history, for modeling cache miss patterns and memory-level parallelism, (ii) cache line reuse distributions for modeling accesses to outstanding cache lines, and (iii) through-memory read-after-write dependency distributions for modeling load forwarding and bypassing. Our experiments using the SPEC CPU2000 benchmarks show substantial improvements compared to current state-of-the-art statistical simulation methods. For example, for our baseline configuration, we reduce the average IPC prediction error from 10.9% to 2.1%; the maximum error observed equals 5.8%.