Tools/Plugins -- CACTI: A Cache/Memory Analysis Tool
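1. Introduction
CACTI is an analytical tool that takes a set of cache/memory parameters as input and computes the access time, power, cycle time, and area. It is currently at version 7.0 and supports analysis of the following memory types: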
- direct mapped caches
- set-associative caches
- fully associative caches
- Embedded DRAM memories
- Commodity DRAM memories
It supports multi-ported uniform cache access (UCA) and multi-banked, multi-ported non-uniform cache access (NUCA). It also provides:
- A router power model.
- An interconnect model with different delay, power, and area properties, including a low-swing wire model.
- An interface for trade-off analysis involving power, delay, area, and bandwidth.
- Process-specific values obtained from ITRS; the tool currently supports the 90nm, 65nm, 45nm, and 32nm technology nodes.
- A chip IO model that calculates latency and energy for the DDR bus. Users can model different loads (fan-outs) and evaluate the impact on frequency and energy. This model can be used to study LR-DIMMs, R-DIMMs, etc.
2. Usage
Technical report: http://www.hpl.hp.com/techreports/2013/HPL-2013-79.pdf
- Download the C++ source code from the project's source download address and put it on a CentOS system.
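- Enter the source folder and build directly from the command line.
- The build produces an executable named cacti, which is run as:
./cacti -infile ***.cfg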
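Here the .cfg file configures the memory attributes and must be adapted to the DRAM being modeled. I simply ran one of the configuration files from the bundled samples: ./cacti -infile sample_config_files/ddr3_cache.cfg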
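For repeated runs it can be convenient to drive the same command from a script. The following is a minimal sketch, assuming the cacti executable sits in the source folder as described above; the helper names build_cacti_command and run_cacti are hypothetical and not part of CACTI.

```python
import subprocess


def build_cacti_command(cfg_path, binary="./cacti"):
    """Return the argument list for `./cacti -infile <cfg>`."""
    return [binary, "-infile", str(cfg_path)]


def run_cacti(cfg_path, source_dir="."):
    """Run CACTI inside its source folder; return (exit code, stdout text).

    check=True is deliberately not used, so whatever the tool printed is
    still returned when the process exits abnormally.
    """
    result = subprocess.run(
        build_cacti_command(cfg_path),
        cwd=source_dir,
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout


# Example call (requires a built cacti binary in source_dir):
#   code, report = run_cacti("sample_config_files/ddr3_cache.cfg")
print(build_cacti_command("sample_config_files/ddr3_cache.cfg"))
```

Returning the exit code alongside the text, rather than raising on failure, keeps the printed report available even if the run does not end cleanly.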
Cache size : 8388608
Block size : 64
Associativity : 8
Read only ports : 0
Write only ports : 0
Read write ports : 1
Single ended read ports : 0
Cache banks (UCA) : 1
Technology : 0.022
Temperature : 360
Tag size : 42
array type : Cache
Model as memory : 0
Model as 3D memory : 0
Access mode : 0
Data array cell type : 0
Data array peripheral type : 0
Tag array cell type : 0
Tag array peripheral type : 0
Optimization target : 2
Design objective (UCA wt) : 0 0 0 100 0
Design objective (UCA dev) : 20 100000 100000 100000 100000
Cache model : 0
Nuca bank : 0
Wire inside mat : 1
Wire outside mat : 1
Interconnect projection : 1
Wire signaling : 1
Print level : 1
ECC overhead : 1
Page size : 8192
Burst length : 8
Internal prefetch width : 8
Force cache config : 0
Subarray Driver direction : 1
iostate : READ
dram_ecc : NO_ECC
io_type : DDR3
dram_dimm : UDIMM
IO Area (sq.mm) = inf
IO Timing Margin (ps) = 35.8333
IO Votlage Margin (V) = 0.155
IO Dynamic Power (mW) = 1282.42 PHY Power (mW) = 232.752 PHY Wakeup Time (us) = 27.503
IO Termination and Bias Power (mW) = 3136.7
---------- CACTI (version 7.0.3DD Prerelease of Aug, 2012), Uniform Cache Access SRAM Model ----------
Cache Parameters:
Total cache size (bytes): 8388608
Number of banks: 1
Associativity: 8
Block size (bytes): 64
Read/write Ports: 1
Read ports: 0
Write ports: 0
Technology size (nm): 22
Access time (ns): 3.03414
Cycle time (ns): 1.84197
Total dynamic read energy per access (nJ): 0.381869
Total dynamic write energy per access (nJ): 0.446873
Total leakage power of a bank (mW): 2520.29
Total gate leakage power of a bank (mW): 4.71441
Cache height x width (mm): 3.07383 x 2.89775
Best Ndwl : 8
Best Ndbl : 8
Best Nspd : 2
Best Ndcm : 1
Best Ndsam L1 : 8
Best Ndsam L2 : 1
Best Ntwl : 16
Best Ntbl : 8
Best Ntspd : 8
Best Ntcm : 1
Best Ntsam L1 : 8
Best Ntsam L2 : 2
Data array, H-tree wire type: Global wires with 30% delay penalty
Tag array, H-tree wire type: Global wires with 30% delay penalty
Time Components:
Data side (with Output driver) (ns): 3.03414
H-tree input delay (ns): 0.860695
Decoder + wordline delay (ns): 0.607741
Bitline delay (ns): 0.473783
Sense Amplifier delay (ns): 0.00189739
H-tree output delay (ns): 1.09002
Tag side (with Output driver) (ns): 0.866708
H-tree input delay (ns): 0.250295
Decoder + wordline delay (ns): 0.0962495
Bitline delay (ns): 0.078
Sense Amplifier delay (ns): 0.00189739
Comparator delay (ns): 0.0162774
H-tree output delay (ns): 0.440265
Power Components:
Data array: Total dynamic read energy/access (nJ): 0.360657
Total energy in H-tree (that includes both address and data transfer) (nJ): 0.270396
Output Htree inside bank Energy (nJ): 0.263979
Decoder (nJ): 0.000237668
Wordline (nJ): 0.000275334
Bitline mux & associated drivers (nJ): 0
Sense amp mux & associated drivers (nJ): 0
Bitlines precharge and equalization circuit (nJ): 0.00163006
Bitlines (nJ): 0.0612354
Sense amplifier energy (nJ): 0.0018371
Sub-array output driver (nJ): 0.0249178
Total leakage power of a bank (mW): 2357.99
Total leakage power in H-tree (that includes both address and data network) ((mW)): 18.9776
Total leakage power in cells (mW): 0
Total leakage power in row logic(mW): 0
Total leakage power in column logic(mW): 0
Total gate leakage power in H-tree (that includes both address and data network) ((mW)): 0.0916133
Tag array: Total dynamic read energy/access (nJ): 0.0212128
Total leakage read/write power of a bank (mW): 162.298
Total energy in H-tree (that includes both address and data transfer) (nJ): 0.00268136
Output Htree inside a bank Energy (nJ): 0.00104879
Decoder (nJ): 0.000585105
Wordline (nJ): 0.000356972
Bitline mux & associated drivers (nJ): 0
Sense amp mux & associated drivers (nJ): 0.000288214
Bitlines precharge and equalization circuit (nJ): 0.00153419
Bitlines (nJ): 0.0132631
Sense amplifier energy (nJ): 0.00155643
Sub-array output driver (nJ): 8.13397e-05
Total leakage power of a bank (mW): 162.298
Total leakage power in H-tree (that includes both address and data network) ((mW)): 0.23223
Total leakage power in cells (mW): 0
Total leakage power in row logic(mW): 0
Total leakage power in column logic(mW): 0
Total gate leakage power in H-tree (that includes both address and data network) ((mW)): 0.00146699
Area Components:
Data array: Area (mm2): 7.28836
Height (mm): 3.07383
Width (mm): 2.3711
Area efficiency (Memory cell area/Total area) - 73.1983 %
MAT Height (mm): 0.716448
MAT Length (mm): 0.540768
Subarray Height (mm): 0.328909
Subarray Length (mm): 0.26532
Tag array: Area (mm2): 0.377107
Height (mm): 0.716051
Width (mm): 0.526648
Area efficiency (Memory cell area/Total area) - 74.9106 %
MAT Height (mm): 0.173381
MAT Length (mm): 0.063873
Subarray Height (mm): 0.0822272
Subarray Length (mm): 0.027995
Wire Properties:
Delay Optimal
Repeater size - 42.0297
Repeater spacing - 0.0329013 (mm)
Delay - 0.216837 (ns/mm)
PowerD - 0.000279845 (nJ/mm)
PowerL - 0.0215298 (mW/mm)
PowerLgate - 9.15623e-05 (mW/mm)
Wire width - 0.022 microns
Wire spacing - 0.022 microns
5% Overhead
Repeater size - 17.0297
Repeater spacing - 0.0329013 (mm)
Delay - 0.226875 (ns/mm)
PowerD - 0.0001818 (nJ/mm)
PowerL - 0.00872349 (mW/mm)
PowerLgate - 3.70994e-05 (mW/mm)
Wire width - 0.022 microns
Wire spacing - 0.022 microns
10% Overhead
Repeater size - 15.0297
Repeater spacing - 0.0329013 (mm)
Delay - 0.235988 (ns/mm)
PowerD - 0.000174237 (nJ/mm)
PowerL - 0.00769899 (mW/mm)
PowerLgate - 3.27424e-05 (mW/mm)
Wire width - 0.022 microns
Wire spacing - 0.022 microns
20% Overhead
Repeater size - 12.0297
Repeater spacing - 0.0329013 (mm)
Delay - 0.257722 (ns/mm)
PowerD - 0.00016297 (nJ/mm)
PowerL - 0.00616223 (mW/mm)
PowerLgate - 2.62069e-05 (mW/mm)
Wire width - 0.022 microns
Wire spacing - 0.022 microns
30% Overhead
Repeater size - 10.0297
Repeater spacing - 0.0329013 (mm)
Delay - 0.28134 (ns/mm)
PowerD - 0.000155511 (nJ/mm)
PowerL - 0.00513773 (mW/mm)
PowerLgate - 2.18498e-05 (mW/mm)
Wire width - 0.022 microns
Wire spacing - 0.022 microns
Low-swing wire (1 mm) - Note: Unlike repeated wires,
delay and power values of low-swing wires do not
have a linear relationship with length.
delay - 0.0902442 (ns)
powerD - 2.8399e-06 (nJ)
PowerL - 1.71796e-07 (mW)
PowerLgate - 1.29017e-09 (mW)
Wire width - 4.4e-08 microns
Wire spacing - 4.4e-08 microns
Segmentation fault
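The component breakdown in the report above can be cross-checked against the headline totals. The sketch below uses only figures copied from this particular run; that the totals are plain sums of the listed components is an observation from these numbers, not a documented guarantee of the tool.

```python
import math

# Data-side delay components (ns), from "Time Components".
data_side_parts_ns = [0.860695, 0.607741, 0.473783, 0.00189739, 1.09002]
access_time_ns = 3.03414

# Dynamic read energy per access (nJ): data array, tag array, reported total.
data_read_nj, tag_read_nj, total_read_nj = 0.360657, 0.0212128, 0.381869

# Leakage power of a bank (mW): data array, tag array, reported total.
data_leak_mw, tag_leak_mw, total_leak_mw = 2357.99, 162.298, 2520.29

# Data array geometry from "Area Components".
data_height_mm, data_width_mm, data_area_mm2 = 3.07383, 2.3711, 7.28836


def close(a, b):
    """Compare within a tolerance that absorbs the report's rounding."""
    return math.isclose(a, b, rel_tol=1e-4)


checks = {
    "access time = sum of data-side delays":
        close(sum(data_side_parts_ns), access_time_ns),
    "read energy = data array + tag array":
        close(data_read_nj + tag_read_nj, total_read_nj),
    "leakage = data array + tag array":
        close(data_leak_mw + tag_leak_mw, total_leak_mw),
    "data array area = height x width":
        close(data_height_mm * data_width_mm, data_area_mm2),
}

for name, ok in checks.items():
    print(f"{name}: {'ok' if ok else 'MISMATCH'}")
```

In this run the access time equals the data-side delay, which is much larger than the tag-side delay (3.03414 ns versus 0.866708 ns).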
Cache Parameters:
Total dynamic read energy per access (nJ): 0.381869
Total dynamic write energy per access (nJ): 0.446873
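Headline figures like these can be pulled out of a saved report automatically. This is a minimal sketch, assuming the report text keeps the "name: number" layout shown above; parse_metrics is a hypothetical helper. Lines with a second colon or a non-numeric value (for example the per-array breakdown or "Cache height x width") are skipped.

```python
import re

# Matches lines such as "Access time (ns): 3.03414".
METRIC_LINE = re.compile(r"^\s*(?P<name>[^:]+?)\s*:\s*(?P<value>[-+0-9.eE]+)\s*$")


def parse_metrics(report):
    """Collect 'name: number' lines; the first occurrence of a name wins."""
    metrics = {}
    for line in report.splitlines():
        match = METRIC_LINE.match(line)
        if match is None:
            continue
        try:
            value = float(match.group("value"))
        except ValueError:
            continue  # looked numeric but was not, e.g. a lone "."
        metrics.setdefault(match.group("name"), value)
    return metrics


sample = """\
Access time (ns): 3.03414
Cycle time (ns): 1.84197
Total dynamic read energy per access (nJ): 0.381869
Total dynamic write energy per access (nJ): 0.446873
Cache height x width (mm): 3.07383 x 2.89775
"""

metrics = parse_metrics(sample)
for key in ("Total dynamic read energy per access (nJ)",
            "Total dynamic write energy per access (nJ)"):
    print(key, "=", metrics[key])
```

Keeping the first occurrence of each name means the top-level summary values are retained when the same label appears again later in the report.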
Copyright notice: This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivs 2.5 China Mainland License.