Is division actually slow?

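A book on FPGAs says: "Because of complexity, the division operator cannot be synthesized automatically. We use an FSMD to implement the long-division algorithm in this subsection." FSMD stands for Finite State Machine with Data Path. An FSM can check whether a string matches a regular expression, but even then it has to output a bool matched. Would a C compiler tell you only that a .c file has no syntax errors, without producing a .o/.obj/.s/.asm? My point is that "with Data Path" hardly needs emphasizing.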

Given pipelining, out-of-order processing, microcode, multi-core processors, etc., there's no guarantee that a particular section of assembly code will take exactly x CPU cycles (or clock cycles, or whatever cycles you count).

https://gmplib.org/~tege/x86-timing.pdf

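LNN means latency for NN-bit operation, e.g. L64. Standing at the end of an assembly line, it is easy to get the illusion that a car takes 10 seconds to build. Follow a single part from the start of the line to the end, and a whole day may well have gone by.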

TNN means throughput for NN-bit operation. The term throughput is used to mean number of instructions per cycle of this type that can be sustained. That implies that more throughput is better, which is consistent with how most people understand the term. Intel use that same term in the exact opposite meaning in their manuals.
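The difference between the two columns can be sketched with a toy model. The numbers below (latency 4, one result per cycle) are made up for illustration and are not taken from the gmplib table, and the function names are mine. The model ignores everything else a real CPU does:

```python
def cycles_dependent_chain(n, latency):
    """n operations where each needs the previous result: latencies add up."""
    return n * latency


def cycles_independent(n, latency, per_cycle):
    """n independent operations: wait one latency for the first result,
    then `per_cycle` results complete every cycle (simplified model)."""
    return latency + (n - 1) / per_cycle


# Hypothetical instruction: latency 4 cycles, throughput 1 per cycle.
print(cycles_dependent_chain(1000, 4))   # 4000
print(cycles_independent(1000, 4, 1))    # 1003.0
```

The first case is the person walking along the assembly line with one part. The second is the person standing at the end counting cars.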

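Take Intel SKL (i.e. Skylake) as an example: there is no reason for div to be much faster than inc and xor. The Pentium 4 entry for div reads 1612. First, the P4 really is slow. Second, that trailing 2 is a footnote marker, not an exponent. I think one can say, roughly, that on Skylake (launched in August 2015, succeeding the Broadwell microarchitecture) div runs at about 1/20 the speed of inc.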

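Before working through a div written in Verilog, it helps to work through one written in Python first. This one is adapted from a Java program on Stack Overflow: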

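def div(dividend, divisor):
    """Shift-and-subtract division of non-negative integers; returns the quotient."""
    k = 0
    # Shift divisor left until it exceeds dividend, counting the shifts in k.
    # In the Java original, divisor < 0 once the shift reached the sign bit;
    # Python ints never overflow, so here "> 0" only guards against divisor == 0.
    while 0 < divisor <= dividend:
        divisor <<= 1
        k += 1
    q = 0  # quotient
    while k > 0:
        k -= 1
        divisor >>= 1
        q <<= 1
        if divisor <= dividend:
            dividend -= divisor
            q += 1
    return q
print(div(5,1), div(5,2), div(5,3), div(5,5), div(4,1), div(4,2), div(3,0), div(0,2), div(1,999))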
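The div above first loops to find out how far to shift. A hardware divider would more likely run a fixed number of steps, one per bit of the dividend. Below is a minimal sketch of that fixed-width variant (restoring division), assuming 0 <= dividend < 2**width. The name `div_fixed` and the `width` parameter are mine, not from the book or the Stack Overflow post:

```python
def div_fixed(dividend, divisor, width=8):
    """Restoring division over exactly `width` steps.

    Assumes 0 <= dividend < 2**width and divisor > 0.
    Returns (quotient, remainder).
    """
    if divisor == 0:
        raise ZeroDivisionError("divisor is zero")
    r = 0  # running remainder
    q = 0  # quotient, built one bit per step
    for i in range(width - 1, -1, -1):
        # Bring down the next bit of the dividend, most significant first.
        r = (r << 1) | ((dividend >> i) & 1)
        q <<= 1
        if r >= divisor:
            r -= divisor
            q |= 1
    return q, r


print(div_fixed(5, 2))    # (2, 1)
print(div_fixed(200, 7))  # (28, 4)
```

Unlike div, this one raises on a zero divisor instead of returning 0, and it hands back the remainder as well.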

See also: "FPGA除法器设计实现" (FPGA divider design and implementation) - super_star123 - 博客园 (cnblogs)

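Binary, octal, and decimal numbers can all be viewed as polynomials, e.g. 123 = 1*10^2 + 2*10^1 + 3*10^0, and polynomials can be divided in a similar way. The advantage of binary is that if a digit is not 1, it must be 0. Positional notation and multiplication/division have something of a chicken-and-egg relationship: you can't learn powers without learning multiplication first. I have no idea how I learned to do 12*34.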
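To make the polynomial view concrete, here is a small sketch. `evaluate` and `poly_divmod` are names I made up. Coefficients are listed highest degree first, and the leading coefficient of the divisor is assumed to be nonzero:

```python
from fractions import Fraction


def evaluate(coeffs, x):
    """Horner's rule: [1, 2, 3] at x = 10 is (1*10 + 2)*10 + 3."""
    n = 0
    for c in coeffs:
        n = n * x + c
    return n


def poly_divmod(num, den):
    """Polynomial long division; returns (quotient, remainder) coefficient lists."""
    num = [Fraction(c) for c in num]
    q = []
    while len(num) >= len(den):
        coef = num[0] / den[0]      # next quotient coefficient
        q.append(coef)
        for i, d in enumerate(den):
            num[i] -= coef * d      # subtract coef * den, aligned at the top
        num.pop(0)                  # the leading term is now zero
    return q, num


print(evaluate([1, 2, 3], 10))      # 123
q, r = poly_divmod([1, 2, 3], [1, 1])
print([int(c) for c in q], [int(c) for c in r])  # [1, 1] [2]
```

So (x^2 + 2x + 3) = (x + 1)(x + 1) + 2, and at x = 10 that reads 123 = 11 * 11 + 2. The match with ordinary long division is not guaranteed in general, because polynomial coefficients do not carry or borrow the way digits do.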

I forget whether I learned this from The C Programming Language by Brian W. Kernighan and Dennis M. Ritchie (the creator of C) or from The UNIX Programming Environment:

while ((c = *s++)) n = n * 10 + c - '0';

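A loop can always be written as comparisons plus goto/jmp, so it can always be turned into a state machine. After all, everyone says a Turing machine can implement any algorithm. They just don't usually mention how long the program has to be. :-)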
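As a rough illustration, here are the two loops of the Python div rewritten as an explicit state machine. The state names ALIGN, STEP, and DONE are my own invention, and the outer while plays the role of the clock, one transition per tick. This is only a sketch of the idea, not the FSMD from the book:

```python
def div_fsm(dividend, divisor):
    """Shift-and-subtract division with the control flow made explicit."""
    state = "ALIGN"
    k = 0  # number of shifts done while aligning
    q = 0  # quotient
    while state != "DONE":          # each iteration is one "clock tick"
        if state == "ALIGN":
            if 0 < divisor <= dividend:
                divisor <<= 1
                k += 1
            else:
                state = "STEP"
        elif state == "STEP":
            if k == 0:
                state = "DONE"
            else:
                k -= 1
                divisor >>= 1
                q <<= 1
                if divisor <= dividend:
                    dividend -= divisor
                    q += 1
    return q


print(div_fsm(5, 1), div_fsm(5, 2), div_fsm(3, 0), div_fsm(1, 999))  # 5 2 0 0
```

The variables dividend, divisor, k, and q are the data the states operate on, which is presumably what the "Data Path" in FSMD refers to.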

Posted by Fun_with_Words