4 Memory Hierarchy
4.1 Introduction
Goal: to present the user with as much memory as is available in the cheapest technology, while providing access at the speed offered by the fastest memory.
* Temporal locality (locality in time): If an item is referenced, it will tend to be referenced again soon.
* Spatial locality (locality in space): If an item is referenced, items whose addresses are close by will tend to be referenced soon.
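Both kinds of locality show up in even the simplest loop. The following is a minimal sketch in C; the function name sum_array is hypothetical and chosen only for illustration.

```c
/* Sums the first n elements of a.
   Temporal locality: sum and i are referenced on every iteration.
   Spatial locality: a[0], a[1], ... are adjacent in memory and are
   referenced one after another. */
long sum_array(const int *a, int n)
{
    long sum = 0;
    for (int i = 0; i < n; i++)
        sum += a[i];
    return sum;
}
```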
4.2 The Basics of Caches
Read 4-10 to see how to access a cache.
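Compute the cache size, assuming a direct-mapped cache with 2^n one-word blocks and 32-bit addresses: 2^n * (63 - n) bits. Each block stores 32 data bits, a tag of 32 - n - 2 bits, and 1 valid bit, which adds up to 63 - n bits per block.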
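The size formula can be checked with a small C sketch. It assumes a direct-mapped cache with 2^n one-word blocks, 32-bit byte addresses, and one valid bit per block; the function name cache_total_bits is hypothetical.

```c
#include <stdint.h>

/* Total storage bits of a direct-mapped cache with 2^n one-word blocks
   and 32-bit byte addresses (assumes n <= 30).
   Per block: 32 data bits + (32 - n - 2) tag bits + 1 valid bit
            = 63 - n bits. */
uint64_t cache_total_bits(unsigned n)
{
    uint64_t blocks = (uint64_t)1 << n;
    return blocks * (63u - n);
}
```

For example, n = 10 (1024 blocks, 4 KiB of data) gives 1024 * 53 = 54272 bits in total.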
On a cache miss, the processor must stall while the data is fetched. A write buffer is added to improve efficiency. On a write miss, the whole block should be rewritten to maintain data consistency. (4-18)
Example: cache misses
The following C program is run (with no optimizations) on a machine with a cache that has four-word (16-byte) blocks and holds 256 bytes of data:
int i, j, c, stride, array[256];
…
for (i=0; i<10000; i++)
    for (j=0; j<256; j=j+stride)
        c=array[j]+5;
If we consider only the cache activity generated by references to the array and assume that integers are words, what is the expected miss rate when the cache is direct-mapped and stride=132?
How about if stride=131?
Would either of these change if the cache were two-way set associative?
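If stride=132 and the cache is direct-mapped:
The inner loop references only array[0] and array[132], since the next index, 264, is past the loop bound.
The number of blocks in the cache is 256/16 = 16.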
For array[0]:
* The block number = 0/4 = 0
* It maps to cache block number 0 mod 16 = 0, tag = 0/16 = 0
For array[132]:
* The block number = 132/4 = 33
* It maps to cache block number 33 mod 16 = 1, tag = 33/16 = 2
The two references use different cache blocks, so there are 2 compulsory misses on the first iteration of i and 9999*2 hits thereafter. The miss rate = 2/(2*10000) = 1/10000.
If stride=131 and the cache is direct-mapped
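The references are array[0] (block number 0) and array[131] (block number 131/4 = 32), and both map to cache block number 0, since 32 mod 16 = 0, with different tags (0 and 2). Each reference therefore evicts the other: there are 2 compulsory misses on the first iteration (the first touch of each block) and 2 conflict misses on every following iteration, so the miss rate = 1.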
If stride=132 and the cache is two-way set associative, the situation does not change. But when stride=131, both blocks map to set 0 and fit in its two ways, so only the 2 compulsory misses remain and the miss rate = 2/(2*10000) = 1/10000.
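The four cases above can be checked with a small simulation. This is a minimal sketch rather than a general cache model: it hard-codes the loop bounds of the example, assumes 4-byte integers with the array starting at a block boundary, and assumes LRU replacement within a set (the example does not name a policy). The function name count_misses is hypothetical.

```c
#include <string.h>

#define BLOCK_BYTES 16   /* four-word blocks */
#define CACHE_BYTES 256  /* data capacity of the cache */
#define NUM_BLOCKS  (CACHE_BYTES / BLOCK_BYTES)

/* Counts misses for the example loop with the given stride (> 0) on a
   cache with `ways` blocks per set (1 = direct-mapped).
   The total number of array references is stored in *accesses. */
long count_misses(int stride, int ways, long *accesses)
{
    int sets = NUM_BLOCKS / ways;
    long tag[NUM_BLOCKS];       /* entry index = set * ways + way */
    long last_use[NUM_BLOCKS];  /* time of last use, 0 = empty entry */
    long misses = 0, now = 0;

    memset(tag, 0, sizeof tag);
    memset(last_use, 0, sizeof last_use);

    for (int i = 0; i < 10000; i++) {
        for (int j = 0; j < 256; j += stride) {
            long block = (long)j * 4 / BLOCK_BYTES; /* memory block number */
            int set = (int)(block % sets);
            long t = block / sets;
            int hit = 0;
            int victim = set * ways;  /* least recently used so far */

            now++;
            for (int w = 0; w < ways; w++) {
                int e = set * ways + w;
                if (last_use[e] != 0 && tag[e] == t) {
                    hit = 1;
                    victim = e;       /* reuse the matching entry */
                    break;
                }
                if (last_use[e] < last_use[victim])
                    victim = e;
            }
            if (!hit) {
                misses++;
                tag[victim] = t;      /* replace the LRU (or empty) entry */
            }
            last_use[victim] = now;
        }
    }
    *accesses = now;
    return misses;
}
```

Under these assumptions, count_misses(132, 1, &n) and count_misses(131, 2, &n) both return 2 out of 20000 references, while count_misses(131, 1, &n) returns 20000.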
4.3 Measuring and Improving Cache Performance
Some easy calculations: Page 4-27
Reducing Cache Misses by More Flexible Placement of Blocks: Direct mapped, Set associative, Fully associative.
Read 4-32 for the whole process.
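The three placement schemes differ only in how many sets the cache is divided into. A minimal sketch of the mapping, assuming the number of blocks is a multiple of the associativity; the function name set_index is hypothetical.

```c
/* Set index of a memory block in a cache with num_blocks blocks and
   `ways` blocks per set.
   ways == 1           : direct mapped (one block per set)
   ways == num_blocks  : fully associative (one set, index always 0)
   Within the selected set, the block may go in any of the `ways` entries. */
unsigned set_index(unsigned block_number, unsigned num_blocks, unsigned ways)
{
    unsigned num_sets = num_blocks / ways;
    return block_number % num_sets;
}
```

For memory block 12 in an 8-block cache, this gives set 4 when direct mapped (12 mod 8), set 0 when two-way set associative (12 mod 4), and set 0 when fully associative, where the block can occupy any of the 8 entries.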
How to compute cache performance: Page 4-37 (EXAMPLE: Performance of Multilevel Caches)
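The usual way to set up such a calculation is to add the memory-stall cycles per instruction to the base CPI. The sketch below assumes miss rates expressed as misses per instruction and penalties expressed in clock cycles; the function names and the numbers in the note that follows are illustrative assumptions, not values taken from the referenced page.

```c
/* CPI with a single cache level: every miss pays the main-memory penalty. */
double cpi_one_level(double base_cpi, double l1_miss_rate, double mem_penalty)
{
    return base_cpi + l1_miss_rate * mem_penalty;
}

/* CPI with a second-level cache: every first-level miss pays the L2 access
   penalty, and the misses that also miss in L2 (global miss rate) pay the
   main-memory penalty on top of that. */
double cpi_two_level(double base_cpi, double l1_miss_rate, double l2_penalty,
                     double global_miss_rate, double mem_penalty)
{
    return base_cpi + l1_miss_rate * l2_penalty
                    + global_miss_rate * mem_penalty;
}
```

With illustrative values of base CPI 1.0, 2% first-level misses per instruction, a 400-cycle memory penalty, a 20-cycle L2 penalty, and a 0.5% global miss rate, the results are 1 + 0.02 * 400 = 9.0 and 1 + 0.02 * 20 + 0.005 * 400 = 3.4, so the second-level cache makes the processor about 2.6 times faster.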
4.4 Virtual Memory
Use main memory as a cache for secondary storage.
How to calculate the size of a page: Page 4-47.
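A related calculation that commonly accompanies this topic is the size of a page table. The sketch below assumes a single-level page table with one entry per virtual page; whether this matches the calculation on the referenced page is an assumption, and the function names are hypothetical.

```c
#include <stdint.h>

/* Number of entries in a single-level page table:
   2^(virtual address bits - page offset bits).
   For 4 KiB pages the page offset is 12 bits. */
uint64_t page_table_entries(unsigned va_bits, unsigned offset_bits)
{
    return (uint64_t)1 << (va_bits - offset_bits);
}

/* Total page table size in bytes, given the size of one entry. */
uint64_t page_table_bytes(unsigned va_bits, unsigned offset_bits,
                          unsigned entry_bytes)
{
    return page_table_entries(va_bits, offset_bits) * entry_bytes;
}
```

For a 32-bit virtual address, 4 KiB pages, and 4-byte entries, this gives 2^20 entries and a 4 MiB page table.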
See the example of TLB on Page 4-53.
two-way set-associative TLB: