A CPU has a cache with block size 64 bytes. The main memory has $k$ banks, each bank being $c$ bytes wide. Consecutive $c$-byte chunks are mapped onto consecutive banks with wrap-around. All $k$ banks can be accessed in parallel, but two accesses to the same bank must be serialized. A cache block access may involve multiple iterations of parallel bank accesses, depending on the amount of data obtained by accessing all $k$ banks in parallel. Each iteration requires decoding the bank numbers to be accessed in parallel, which takes $\frac{k}{2}$ ns. The latency of one bank access is 80 ns. If $c = 2$ and $k = 24$, the latency of retrieving a cache block starting at address zero from main memory is:

The banks can be thought of as parallel RAM chips that are used in parallel and collectively called the main memory. In this type of system, more than one RAM chip constitutes the main memory. The width of each bank = width of main memory / number of banks.
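The mapping described above (consecutive $c$-byte chunks on consecutive banks, wrapping around) can be sketched as follows; this is a small illustration with the question's values, not part of the original answer:

```python
C = 2   # bytes per bank (bank width), from the question
K = 24  # number of banks, from the question

def bank_of(address):
    """Bank number holding the c-byte chunk that contains this byte address."""
    chunk = address // C   # which c-byte chunk the address falls in
    return chunk % K       # consecutive chunks wrap around the K banks

# Bytes 0-1 land in bank 0, bytes 2-3 in bank 1, ...,
# bytes 46-47 in bank 23, then byte 48 wraps around to bank 0.
assert bank_of(0) == 0
assert bank_of(2) == 1
assert bank_of(47) == 23
assert bank_of(48) == 0
```

This is why fetching a 64-byte block starting at address zero touches banks 0 through 7 twice: bytes 48 through 63 wrap back onto banks 0 through 7.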

It is because of parallel access to the banks. We can access all the banks in one iteration at the same time, so the time needed is only 80 ns, the time to access a single bank. Further, since we need to access banks 0-7 twice, those accesses must be serialized, and hence they are done in a second iteration.

I think so. The given answers either access the same data twice or, worse, access data that doesn't even exist in main memory. Think about it: the main memory size is 48 bytes, but a single cache block is 64 bytes. Let's assume we only fill a cache block with 48 bytes.

Then we have k/2 = 12 ns for selecting the 24 banks in parallel. After they have been selected, 80 ns is taken to access the first byte of each of the 24 banks, then another 80 ns to access the second byte of each bank.


+26 votes

This question is based on the concept of MEMORY INTERLEAVING, which says that instead of accessing data from memory one word at a time, it is better to divide the memory into modules (banks) and distribute consecutive data across the modules, so that the data can be accessed in parallel and the transfer rate improves. For this purpose an additional decoder is used to select the modules accessed in parallel, so we have to count the decoder latency along with the module latency.

Now I will explain the solution:

According to the original question there are k banks with k = 24, and each bank is c bytes wide with c = 2. So in one iteration we get a total of 2 × 24 = 48 bytes.

Now we calculate the latency of one iteration:

The decoding time for one iteration is k/2 ns: 24/2 = 12 ns,

and the latency of each bank is 80 ns.

Normally, when a per-bank decoder latency is given, the total iteration time is calculated as k × (decoder latency) + bank latency,

but here we are given the total decoding latency of an iteration, which is 12 ns.

Therefore one iteration requires 12 + 80 = 92 ns.

Now, as discussed above, one iteration gets us 48 bytes of data, but the question asks for a cache block (64 bytes) transfer. Therefore we require 2 iterations, that is, 2 × 92 = 184 ns.
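The arithmetic above can be written out as a small sketch (values taken from the question; variable names are my own):

```python
BLOCK = 64        # cache block size in bytes
C, K = 2, 24      # bank width (bytes) and number of banks
DECODE = K // 2   # decoding time per iteration, k/2 = 12 ns
BANK = 80         # latency of one bank access, ns

bytes_per_iteration = C * K                    # 48 bytes fetched in parallel
iterations = -(-BLOCK // bytes_per_iteration)  # ceiling of 64/48 = 2
latency = iterations * (DECODE + BANK)         # 2 * (12 + 80) = 184 ns
print(latency)  # 184
```

Note the ceiling division: a partial iteration (the last 16 bytes here) still costs a full decode-plus-access cycle.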


+3 votes

Explanation: Size of cache block = 64 B. Number of main memory banks k = 24. Size of each bank c = 2 bytes. Time taken for one parallel access T = decoding time + bank latency = (k/2) + 80 = 12 + 80 = 92 ns. Since one parallel access fetches only c × k = 48 bytes, 2 accesses are needed: 2 × 92 = 184 ns. So (D) is the correct option.