A weird idea

Multiple issue and superscalar | R2ISC | How are multiple issue and pipelining related?

The goal of the multiple-issue processors is to allow multiple instructions to issue in a clock cycle. Multiple-issue processors come in three major flavors:

Very Long Instruction Word (VLIW) processors
Statically scheduled superscalar processors. Although statically scheduled superscalars issue a varying rather than a fixed number of instructions per clock, they are actually closer in concept to VLIWs, since both approaches rely on the compiler to schedule code for the processor.
Dynamically scheduled superscalar processors

VLIW has been tried and did not work out: it was apparent that the IA-64 architecture and the compiler were much more difficult to implement than originally thought

Dynamically scheduled superscalar processors: 1) CPI has stalled at around 0.5; 2) nobody knows how much extra circuitry this has added to the CPU.

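Maybe this could work:

In-order multiple issue, where the pipeline may stall: the result is always correct, but multiple issue may bring no speedup.
The programmer uses a pragma to mark which functions to statically schedule, and the compiler spends, say, 1 hour optimizing 100 lines of code. Leave it off during development, then spend a few days optimizing at the end.
A virtual machine can collect statistics on the registers (and other resources) that cause pipeline stalls.
Going from ASM to statically scheduled ASM is also a transformation, so use AI to do (part of) it. Compiler-generated assembly follows recognizable patterns.
Helper tools: for example, highlight every write to eax and every read of eax in the selected assembly, draw a line from one to the other, and so on.
"oooo chess-engine.o" Out Of Order Optimizer
Integrate an x86 core on the motherboard or inside the chip: HCA (Hardcore CPU Affinity). Loongson (龙芯) and Zhaoxin (兆芯) form a joint venture, 兆龙. The desktop version has 1 x86 core; the 32-core server version has 8 cores each of arm, x86, mips, and risc-v: CCC (China CPU Concerto).
Don't rush to fabricate the chip. First use a pure-software simulator, run Fritz Chess, and get a number such as "this design achieves an average of 3.14 instructions per clock cycle", even though at this stage the run may physically take a day, which works out to something like 1 microsecond per clock cycle.
Apparently I'm joking.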
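The first, third, and eighth ideas above (in-order multiple issue that may stall, statistics on stall-causing registers, and measuring instructions per cycle in software first) can be sketched as a toy simulator. This is only a minimal sketch under loud assumptions: the `Instr` record, the `simulate` function, and the per-instruction `latency` field are all made up for illustration, memory and branches are not modeled, and a register is treated as busy until its pending write completes.

```python
from collections import Counter
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    """Hypothetical instruction record, invented for this sketch."""
    op: str
    dst: Optional[str]        # register written, or None
    srcs: Tuple[str, ...]     # registers read
    latency: int = 1          # cycles until the result can be read


def simulate(program, width=2):
    """Issue up to `width` instructions per cycle, strictly in program order.

    Returns (cycles, ipc, stall_regs). stall_regs counts, per register,
    how many times a pending write to it blocked an issue attempt.
    """
    ready_at = {}             # register -> first cycle its value is available
    stall_regs = Counter()
    cycle = 0
    pc = 0
    while pc < len(program):
        issued = 0
        while pc < len(program) and issued < width:
            ins = program[pc]
            regs = set(ins.srcs)
            if ins.dst is not None:
                regs.add(ins.dst)
            # RAW or WAW hazard: some register still has a write in flight.
            blockers = sorted(r for r in regs if ready_at.get(r, 0) > cycle)
            if blockers:
                stall_regs.update(blockers)
                break         # in-order: nothing younger may issue either
            if ins.dst is not None:
                ready_at[ins.dst] = cycle + ins.latency
            issued += 1
            pc += 1
        cycle += 1
    ipc = len(program) / cycle if cycle else 0.0
    return cycles_ipc(cycle, ipc, stall_regs)


def cycles_ipc(cycle, ipc, stall_regs):
    # Small helper so the return shape is documented in one place.
    return cycle, ipc, stall_regs


if __name__ == "__main__":
    chain = [
        Instr("mov", "eax", ()),
        Instr("add", "eax", ("eax",)),
        Instr("add", "eax", ("eax",)),
        Instr("add", "eax", ("eax",)),
    ]
    print(simulate(chain, width=2))
```

On the dependent chain in the example, the second issue slot is wasted every cycle, so the dual-issue machine gets no speedup and the statistics point at `eax`. That is the "always correct, but possibly no benefit" behavior described in the list.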

Multiple Issue Processors I – Computer Architecture (umd.edu)

Consider the simple MIPS integer pipeline that we are familiar with. This gets extended with multiple functional units for the execution stage when we look at different types of fixed and floating point operations. We can also increase the depth of the pipeline, which may be required because of the increase in clock speeds. Now, in multiple issue processors, we increase the width of the pipeline. Several instructions are fetched and decoded in the front-end of the pipeline. Several instructions are issued to the functional units in the back-end. Suppose if m is the maximum number of instructions that can be issued in one cycle, we say that the processor is m-issue wide.

Superscalar processors decide on the fly how many instructions are to be issued. If instructions are issued to the back-end in program order, we have in-order processors. In-order processors are statically scheduled, i.e., the scheduling is done at compile-time. A statically scheduled superscalar must check for any dependences between instructions in the issue packet and any instruction already in the pipeline. They require significant compiler assistance to achieve good performance as the compiler does most of the work of finding and scheduling instructions for parallel execution. In contrast, a dynamically scheduled superscalar requires less compiler assistance, but significant hardware costs. If instructions can be issued to the back-end in any order, we have out-of-order (OOO) processors. OOO processors are dynamically scheduled by the hardware.
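To make "the compiler does most of the work of finding and scheduling instructions for parallel execution" concrete, here is a minimal sketch of greedy list scheduling for one basic block. Everything in it is an assumption for illustration: the `(text, dst, srcs)` tuple format and the `schedule` function are hypothetical, every instruction has unit latency, memory dependences are ignored, and any RAW, WAR, or WAW dependence conservatively pushes the later instruction into a later issue packet.

```python
def schedule(block, width=2):
    """Pack a basic block into issue packets of at most `width` instructions.

    `block` is a list of (text, dst, srcs) tuples, where dst is the register
    written (or None) and srcs is a tuple of registers read.
    Returns a list of packets, each a list of indices into `block`.
    """
    n = len(block)
    # deps[j] holds the earlier instructions that j must come after.
    deps = [set() for _ in range(n)]
    for j in range(n):
        _, dst_j, srcs_j = block[j]
        for i in range(j):
            _, dst_i, srcs_i = block[i]
            raw = dst_i is not None and dst_i in srcs_j   # j reads what i writes
            war = dst_j is not None and dst_j in srcs_i   # j overwrites what i reads
            waw = dst_i is not None and dst_i == dst_j    # both write the same register
            if raw or war or waw:
                deps[j].add(i)

    done = set()
    packets = []
    while len(done) < n:
        # Ready: not yet scheduled and every predecessor sits in an earlier packet.
        ready = [j for j in range(n) if j not in done and deps[j] <= done]
        packet = ready[:width]    # oldest ready instructions first
        packets.append(packet)
        done.update(packet)
    return packets


if __name__ == "__main__":
    block = [
        ("mov eax, [a]", "eax", ()),
        ("add eax, 1", "eax", ("eax",)),
        ("mov ebx, [b]", "ebx", ()),
        ("add ebx, 1", "ebx", ("ebx",)),
    ]
    for packet in schedule(block, width=2):
        print(" || ".join(block[j][0] for j in packet))
```

In program order the example block interleaves poorly on an in-order dual-issue machine, because each `add` directly follows the `mov` it depends on. The scheduler pairs the two independent `mov`s and then the two `add`s, which is the kind of reordering a statically scheduled design relies on the compiler to find.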

The principle sounds simple enough (it's not quantum mechanics) :-), but actually building it is very hard.

posted @ 2022-03-09 17:41 华容道专家


