The performance of fine-grained applications such as Parallel Discrete Event Simulation (PDES) is limited by communication overheads. Multi-core architectures, which tightly integrate cores on a single chip, can substantially reduce communication cost. However, the number of cores available on such machines remains low, so scaling a simulation requires using Clusters of Multicores (CMs) effectively. The high communication overhead between remote machines can significantly limit scalability, and it is unclear whether, in the presence of relatively slow inter-machine links, the low latency between cores on the same machine still provides a benefit. In this paper, we first extend a multithreaded PDES simulator to support CMs. We then show that remote communication is a primary challenge to scalability, due to both latency and the software overhead of message processing.