4 Memory Hierarchy
4.1 Introduction
Goal: to present the user with as much memory as is available in the cheapest technology, while providing access at the speed offered by the fastest memory.
* Temporal locality (locality in time): If an item is referenced, it will tend to be referenced again soon.
* Spatial locality (locality in space): If an item is referenced, items whose addresses are close by will tend to be referenced soon.
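The two kinds of locality can be seen in how a loop walks an array. The following is a minimal sketch, not taken from the course material: the function names and the 64x64 size are made up for illustration. Both functions compute the same sum, but the row-major version visits adjacent addresses (spatial locality), while both reuse the accumulator `sum` on every iteration (temporal locality).

```c
#include <stddef.h>

#define ROWS 64   /* illustrative size, an assumption */
#define COLS 64

/* Row-major traversal: consecutive accesses touch adjacent addresses,
   so every word of a fetched cache block is used soon after the fetch. */
long sum_row_major(int m[ROWS][COLS]) {
    long sum = 0;  /* reused on every iteration: temporal locality */
    for (size_t r = 0; r < ROWS; r++)
        for (size_t c = 0; c < COLS; c++)
            sum += m[r][c];
    return sum;
}

/* Column-major traversal: consecutive accesses are COLS ints apart,
   so the neighbours in a fetched block are not used right away. */
long sum_col_major(int m[ROWS][COLS]) {
    long sum = 0;
    for (size_t c = 0; c < COLS; c++)
        for (size_t r = 0; r < ROWS; r++)
            sum += m[r][c];
    return sum;
}
```

On a machine with a cache, the row-major version would be expected to miss less often, although the actual difference depends on the cache and array sizes.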
4.2 The Basics of Caches
Read 4-10 to see how to access a cache.
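Compute the cache size in bits, assuming a direct-mapped cache with 2^n one-word blocks and 32-bit byte addresses: 2^n * (63 - n). Each block stores 32 data bits, a (32 - n - 2)-bit tag, and 1 valid bit.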
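The bit count can be checked with a small helper. This is a sketch under stated assumptions: a direct-mapped cache with 2^n blocks, 32-bit byte addresses, and one valid bit per block. The function name and the extra parameter `m` (2^m words per block) are introduced here for illustration; with m = 0 the result reduces to 2^n * (63 - n).

```c
#include <assert.h>
#include <stdint.h>

/* Total storage bits of a direct-mapped cache with 2^n blocks of 2^m words.
   Assumes 32-bit byte addresses and one valid bit per block. */
uint64_t cache_total_bits(unsigned n, unsigned m) {
    assert(n + m <= 30);                        /* tag width must not go negative */
    uint64_t blocks    = 1ULL << n;
    uint64_t data_bits = (1ULL << m) * 32;      /* 2^m words of 32 bits each */
    uint64_t tag_bits  = 32 - n - m - 2;        /* 2 bits select the byte in a word */
    return blocks * (data_bits + tag_bits + 1); /* +1 for the valid bit */
}
```

For instance, `cache_total_bits(10, 0)` gives 1024 * 53 = 54272 bits, which matches 2^n * (63 - n) for n = 10.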
On a cache miss, the processor must stall until the data is found in memory. A write buffer is added to improve efficiency. On a write miss, the whole block must be rewritten to keep the data consistent. (4-18)
Example: cache misses
The following C program is run (with no optimizations) on a machine with a cache that has four-word (16-byte) blocks and holds 256 bytes of data:
int i, j, c, stride, array[256];
…
for (i = 0; i < 10000; i++)
    for (j = 0; j < 256; j = j + stride)
        c = array[j] + 5;
If we consider only the cache activity generated by references to the array and we assume that integers are words, what is the expected miss rate when the cache is direct-mapped and stride=132?
How about if stride=131?
Would either of these change if the cache were two-way set associative?
If stride=132 and the cache is direct-mapped:
The number of blocks in the cache is 256/16 = 16.
For array[0]:
* The block number = 0/4 = 0
* It maps to cache index 0 mod 16 = 0, tag = 0/16 = 0
For array[132]:
* The block number = 132/4 = 33
* It maps to cache index 33 mod 16 = 1, tag = 33/16 = 2
The inner loop touches only array[0] and array[132], and they map to different cache blocks. So there are 2 compulsory misses on the first iteration of i and 9999*2 hits thereafter, and the miss rate = 2/(2*10000) = 1/10000.
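If stride=131 and the cache is direct-mapped:
array[0] is in block 0 and array[131] is in block 131/4 = 32. Both map to cache index 0 (32 mod 16 = 0) with different tags, so each reference evicts the other block. So there are 2 compulsory misses on the first iteration (each block is referenced for the first time) and 2 conflict misses on every following iteration, and the miss rate = 1.
If stride=132 and the cache is two-way set associative, the situation does not change. But when stride=131, the two blocks share set 0 (the cache has 8 sets of 2 blocks) and can reside there together, so only the 2 compulsory misses remain and the miss rate = 2/(2*10000) = 1/10000.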
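The four cases above can be checked with a small simulation. This is a minimal sketch, not part of the original exercise: the function name `count_misses`, the `ways` parameter, and the LRU replacement policy are assumptions made for illustration (the problem statement does not name a replacement policy). It models only the array references, treats the array as starting at byte address 0, and uses the 16-block, 16-byte-block cache from the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define BLOCK_BYTES 16   /* four 4-byte words per block */
#define NUM_BLOCKS  16   /* 256 bytes of data / 16-byte blocks */

/* Counts misses for the array references of the loop nest.
   ways = 1 models a direct-mapped cache, ways = 2 a two-way
   set-associative cache with LRU replacement (an assumption).
   *accesses receives the total number of array references. */
long count_misses(int stride, int ways, long *accesses) {
    assert(stride > 0);
    assert(ways > 0 && NUM_BLOCKS % ways == 0);

    int  sets = NUM_BLOCKS / ways;
    bool valid[NUM_BLOCKS];
    long tag[NUM_BLOCKS], last_use[NUM_BLOCKS];
    long misses = 0, tick = 0;
    memset(valid, 0, sizeof valid);

    for (int i = 0; i < 10000; i++) {
        for (int j = 0; j < 256; j += stride) {
            long block = (long)j * 4 / BLOCK_BYTES;  /* byte address / block size */
            int  base  = (int)(block % sets) * ways; /* first way of the set */
            long t     = block / sets;               /* tag */
            int  hit_way = -1, victim = base;
            tick++;
            for (int w = base; w < base + ways; w++) {
                if (valid[w] && tag[w] == t) { hit_way = w; break; }
                /* Replacement candidate: an invalid way if there is one,
                   otherwise the least recently used way. */
                if (!valid[w]) {
                    if (valid[victim]) victim = w;
                } else if (valid[victim] && last_use[w] < last_use[victim]) {
                    victim = w;
                }
            }
            if (hit_way >= 0) {
                last_use[hit_way] = tick;
            } else {
                misses++;
                valid[victim] = true;
                tag[victim] = t;
                last_use[victim] = tick;
            }
        }
    }
    *accesses = tick;
    return misses;
}
```

Under these assumptions, `count_misses(132, 1, &n)` and `count_misses(132, 2, &n)` both report 2 misses out of 20000 references, `count_misses(131, 1, &n)` reports 20000 misses, and `count_misses(131, 2, &n)` reports 2, in line with the miss rates derived above.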
4.3 Measuring and Improving Cache Performance
Some easy calculations: Page 4-27
Reducing Cache Misses by More Flexible Placement of Blocks: Direct mapped, Set associative, Fully associative.
Read 4-32 for the whole process.
How to compute cache performance: Page 4-37 (EXAMPLE: Performance of Multilevel Caches)
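The usual way to work such a multilevel example is to add the memory-stall cycles per instruction to the base CPI. The sketch below shows that arithmetic. The function names are hypothetical, miss rates are taken per instruction, and the numbers in the note after the code are illustrative assumptions, not necessarily those on the referenced page.

```c
/* Effective CPI with only a primary cache: every primary miss pays the
   full main-memory penalty (in clock cycles). */
double cpi_one_level(double base_cpi, double l1_miss_rate,
                     double mem_penalty_cycles) {
    return base_cpi + l1_miss_rate * mem_penalty_cycles;
}

/* Effective CPI with a secondary cache: every primary miss pays the
   secondary-cache penalty, and the fraction of instructions that also
   miss in the secondary cache (the global miss rate) additionally pays
   the main-memory penalty. */
double cpi_two_level(double base_cpi, double l1_miss_rate,
                     double l2_penalty_cycles, double global_miss_rate,
                     double mem_penalty_cycles) {
    return base_cpi
         + l1_miss_rate * l2_penalty_cycles
         + global_miss_rate * mem_penalty_cycles;
}
```

As an illustration, assume a base CPI of 1.0, a 2% primary miss rate, a 400-cycle main-memory penalty, a 20-cycle secondary-cache penalty, and a 0.5% global miss rate. Then `cpi_one_level(1.0, 0.02, 400)` is 1 + 8 = 9.0 and `cpi_two_level(1.0, 0.02, 20, 0.005, 400)` is 1 + 0.4 + 2 = 3.4, so the secondary cache makes the processor about 9.0/3.4 ≈ 2.6 times faster.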
4.4 Virtual Memory
Use main memory as a cache for secondary storage.
How to calculate the size of a page: Page 4-47.
See the example of TLB on Page 4-53.
two-way set-associative TLB: