Accurate Neural Training with 4-bit Matrix Multiplications at Standard Formats

Overview
Logarithmic Unbiased Quantization
Code

Chmiel B., Banner R., Hoffer E., Yaacov H. B. and Soudry D. Accurate neural training with 4-bit matrix multiplications at standard formats. ICLR, 2023.

Overview

This paper aims to train models and run inference in 4 bits, and proposes logarithmic unbiased quantization (LUQ) to that end.

Logarithmic Unbiased Quantization

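The authors argue that unbiased quantization is especially important in the backward pass, because it guarantees that the quantized updates agree in expectation with those of the ordinary (unquantized) optimization procedure. In addition, gradients as a whole follow a roughly logarithmic distribution. How to quantize under these two conditions is what gives rise to LUQ.
Stochastic underflow: first, the gradients go through a stochastic 'pruning' step: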

\[T_{\alpha}(x) = \left \{ \begin{array}{ll} x, & \text{if } |x| \ge \alpha, \\ \text{sign}(x) \cdot \alpha & \text{with a probability } \frac{|x|}{\alpha}, \text{if } |x| < \alpha, \\ 0 & \text{with a probability } 1 - \frac{|x|}{\alpha}, \text{if } |x| < \alpha. \end{array} \right . \]
Here \(\alpha = \max(|x|) / 2^{2^{b-1}}\), with the maximum taken over all entries of the tensor being quantized.
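The stochastic underflow step can be sketched in NumPy as follows. This is a minimal illustration rather than the authors' code: the function names `stochastic_underflow` and `underflow_threshold` and the explicit `rng` argument are assumptions made for this example. It also makes the unbiasedness visible: for \(|x| < \alpha\) the expected output is \(\text{sign}(x) \cdot \alpha \cdot |x| / \alpha = x\).

```python
import numpy as np

def underflow_threshold(x, b):
    # alpha = max(|x|) / 2^(2^(b-1)), following the formula in the text
    return np.max(np.abs(x)) / 2.0 ** (2 ** (b - 1))

def stochastic_underflow(x, alpha, rng):
    """Apply T_alpha elementwise (hypothetical helper, not from the paper's code).

    Entries with |x| >= alpha pass through unchanged. Entries with
    |x| < alpha become sign(x) * alpha with probability |x| / alpha
    and 0 otherwise.
    """
    x = np.asarray(x, dtype=np.float64)
    small = np.abs(x) < alpha
    # Bernoulli draw with success probability |x| / alpha
    keep = rng.random(x.shape) < np.abs(x) / alpha
    pruned = np.where(keep, np.sign(x) * alpha, 0.0)
    return np.where(small, pruned, x)

rng = np.random.default_rng(0)
x = np.full(200_000, 0.3)
y = stochastic_underflow(x, alpha=1.0, rng=rng)
print(y.mean())  # close to 0.3, since the step is unbiased
```

Every output value is either 0 or \(\pm\alpha\) for inputs below the threshold, so the empirical mean only matches the input on average over many draws.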
Logarithmic SR: logarithmic quantization uses the bins:

\[\{\alpha, 2\alpha, \ldots, 2^{2^{b-1}} \alpha \}, \]
and then applies stochastic rounding as follows. For \(2^{n-1}\alpha < x < 2^n \alpha\):

\[Q_{\alpha}(x) = \left \{ \begin{array}{ll} 2^{n-1} \alpha & \text{with a probability } \frac{2^n \alpha - x}{2^n \alpha - 2^{n-1} \alpha}, \\ 2^{n} \alpha & \text{with a probability } 1 - \frac{2^n \alpha - x}{2^n \alpha - 2^{n-1} \alpha}. \end{array} \right . \]
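A minimal NumPy sketch of this logarithmic stochastic rounding is given below. The function name `log_stochastic_round` is hypothetical, and two things are assumptions of this example rather than statements from the text: negative inputs are handled by rounding the magnitude and restoring the sign, and every input is assumed to satisfy \(|x| \ge \alpha\) (that is, stochastic underflow has already been applied).

```python
import numpy as np

def log_stochastic_round(x, alpha, rng):
    """Round |x| to a neighbouring bin 2^k * alpha without bias.

    Hypothetical helper: assumes |x| >= alpha for every entry and
    treats negative values symmetrically (an assumption of this sketch).
    """
    x = np.asarray(x, dtype=np.float64)
    mag = np.abs(x) / alpha            # magnitude in units of alpha
    k = np.floor(np.log2(mag))         # lower bin: 2^k <= mag < 2^(k+1)
    lo = 2.0 ** k
    hi = 2.0 * lo
    # Probability of rounding up, i.e. 1 - (hi - mag) / (hi - lo)
    p_up = (mag - lo) / (hi - lo)
    up = rng.random(x.shape) < p_up
    return np.sign(x) * alpha * np.where(up, hi, lo)

rng = np.random.default_rng(0)
x = np.full(200_000, 3.0)
y = log_stochastic_round(x, alpha=1.0, rng=rng)
print(np.unique(y), y.mean())  # values in {2, 4}, mean close to 3
```

With \(x = 3\alpha\) the two neighbouring bins are \(2\alpha\) and \(4\alpha\), each chosen with probability \(1/2\), so the expectation is \(3\alpha\) as required.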
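To simplify this somewhat complicated rounding further, the authors propose RDNP. Unfortunately I did not really follow this part. Note that the midpoint of the bin works out to \((2^n + 2^{n-1}) / 2 = 3 / 4 \cdot 2^n\) (equivalently \(3 / 2 \cdot 2^{n-1}\)), rather than \(3 / 4 \cdot 2^{n-1}\).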
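One way to read RDNP, stated here as an assumption because the note does not spell it out, is deterministic rounding to the nearest power of two in linear distance. Under that reading the decision threshold inside \([2^{n-1}, 2^n]\) is the arithmetic midpoint \(3/4 \cdot 2^n\), so multiplying by \(4/3\) before taking \(\lfloor \log_2 \rfloor\) moves the threshold onto the power of two itself. The helper name `nearest_power_of_two` is hypothetical.

```python
import math

def nearest_power_of_two(x):
    """Round x > 0 to the nearest power of two in linear distance.

    Sketch under the assumption described above, not the paper's code.
    Inside [2^(n-1), 2^n] the midpoint is 3/4 * 2^n, so scaling x by 4/3
    maps that midpoint to exactly 2^n before the floor is taken.
    """
    return 2.0 ** math.floor(math.log2(4.0 / 3.0 * x))

# The midpoint of [2, 4] is 3: values below it go down, values above go up.
print(nearest_power_of_two(2.9), nearest_power_of_two(3.1))
```

Values that sit exactly on a midpoint are left out of the example on purpose, since floating-point rounding of \(4/3\) makes the tie-breaking direction unreliable.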

Code

[The code is in the supplementary material]

posted @ 2024-12-24 10:55 馒头and花卷
