A multithreaded computer maintains multiple program counters and register files to support concurrent or overlapping execution of multiple threads of control, and to provide fast context switching for tolerating memory latency. In this paper, we apply trace-driven simulation to study the performance impact of a multithreaded architecture on the storage hierarchy. In particular, we examine the effects of different multithread scheduling techniques on cache performance. Using several program traces representing a typical server/workstation workload mix, we find that cache performance can be improved over that of the traditional round-robin scheduling method when the thread that scored the most recent cache hit (the MRU thread) is given higher priority. With a direct-mapped cache, the absolute hit ratio can be improved by more than 7%. We also study the performance effects of the multithreading degree, i.e., the number of threads resident in the processor at the same time, on cache memory. The results show that both cache size and set associativity must grow with the multithreading degree in order to maintain comparable cache performance.
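The MRU-priority policy contrasted with round-robin above can be illustrated with a small trace-driven sketch. Everything below is an assumption for illustration, not taken from the paper: a minimal direct-mapped cache model, fine-grained switching on every access, and two synthetic address streams (one thread with perfect locality, one streaming thread with no reuse). The point it shows is that MRU priority keeps the processor on a thread while its accesses are hitting, instead of switching away and letting another thread's references pollute the cache.

```python
from collections import deque
from itertools import count, cycle

class DirectMappedCache:
    """Minimal direct-mapped cache model: one tag per line, no data."""
    def __init__(self, num_lines):
        self.num_lines = num_lines
        self.tags = [None] * num_lines

    def access(self, addr):
        idx = addr % self.num_lines   # line index
        tag = addr // self.num_lines  # tag stored in that line
        hit = self.tags[idx] == tag
        self.tags[idx] = tag          # fill on miss (refresh on hit)
        return hit

def run(streams, cache, steps, mru_priority):
    """Interleave the threads' address streams; return overall hit ratio.

    mru_priority=False: plain round-robin, switch thread every access.
    mru_priority=True: the thread whose last access hit keeps the
    processor until it misses; then round-robin resumes.
    """
    order = deque(range(len(streams)))
    last_hit_tid = None
    hits = 0
    for _ in range(steps):
        if mru_priority and last_hit_tid is not None:
            tid = last_hit_tid        # keep running the MRU-hit thread
        else:
            tid = order[0]
            order.rotate(-1)          # round-robin among the threads
        if cache.access(next(streams[tid])):
            hits += 1
            last_hit_tid = tid
        else:
            last_hit_tid = None
    return hits / steps

# Hypothetical 2-thread workload: thread 0 re-reads a single address
# (perfect locality); thread 1 streams odd addresses with no reuse.
def make_streams():
    return [cycle([0]), count(1, 2)]

rr  = run(make_streams(), DirectMappedCache(2), 100, mru_priority=False)
mru = run(make_streams(), DirectMappedCache(2), 100, mru_priority=True)
```

On this contrived workload, round-robin keeps switching to the streaming thread and misses on every one of its accesses, while MRU priority latches onto the well-behaved thread after its first hit, roughly doubling the overall hit ratio. The real paper's workloads and scheduling mechanism are of course far richer than this two-stream toy.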