32-bit Multiplication with a 64-bit Result on the Cortex-M0

Introduction

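The M0 is a 32-bit processor that supports only a limited subset of the Thumb instruction set. Its only multiply instruction is MUL, which multiplies two 32-bit numbers and discards the upper 32 bits of the product. This article implements a 32-bit by 32-bit multiplication in assembly, with the result held in two 32-bit registers.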

The idea below comes from: http://anjoola.com/32bitmul.html

I first tried working this out on paper before getting it to work in Thumb. Say we want to multiply two numbers A and B. Both A and B will each take up one 32-bit register. Since A and B are represented in binary, they can be split into a top half and bottom half, like so.

      A          *         B         =     Result
 ---------------       ---------------       -----------
| A_top | A_bot |  *  | B_top | B_bot |  =  | Top | Bot |
 ---------------       ---------------       -----------

This can then be taken as a product of two sums.

((A_top << 16) + A_bot) * ((B_top << 16) + B_bot)
    = (A_top << 16)*(B_top << 16) + (A_top << 16)*B_bot + (B_top << 16)*A_bot + A_bot*B_bot
    = ((A_top*B_top) << 32) + ((A_top*B_bot + B_top*A_bot) << 16) + A_bot*B_bot
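
The expansion above can be checked on a host machine before writing any Thumb code. The following is a minimal C sketch, not part of the original article: the function name mul_by_halves is hypothetical, and uint64_t is used only to confirm the algebra, since the M0 routine has no 64-bit registers to work with.

```c
#include <stdint.h>

/* Rebuild the 64-bit product from the four 16x16 partial products.
   Hypothetical helper; uint64_t is used here only to check the identity. */
uint64_t mul_by_halves(uint32_t a, uint32_t b)
{
    uint32_t a_top = a >> 16, a_bot = a & 0xFFFFu;
    uint32_t b_top = b >> 16, b_bot = b & 0xFFFFu;

    uint64_t top = (uint64_t)a_top * b_top;                            /* weight 2^32 */
    uint64_t mid = (uint64_t)a_top * b_bot + (uint64_t)b_top * a_bot;  /* weight 2^16 */
    uint64_t bot = (uint64_t)a_bot * b_bot;                            /* weight 2^0  */

    return (top << 32) + (mid << 16) + bot;
}
```

For any pair of inputs, mul_by_halves(a, b) should equal (uint64_t)a * b; for example, 0xFFFFFFFF times 0xFFFFFFFF gives 0xFFFFFFFE00000001.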

Top can initially be set to (A_top*B_top), since (A_top*B_top) << 32 moves it over by exactly one register. Similarly, Bot can initially be set to (A_bot*B_bot).

The hard part is the middle term, (A_top*B_bot + B_top*A_bot) << 16, since half of it falls in Top and half falls in Bot. Call the unshifted sum A_top*B_bot + B_top*A_bot Mid from now on. This can be dealt with by first adding the top half of Mid to Top. To get this top half, shift Mid right by 16 places. In the snippets, R0 holds Bot, R1 holds Top, R6 holds Mid, and R7 is a scratch register.

LSRS R7, R6, #16   ; top half of Mid
ADD  R1, R7        ; add to Top

Next, add the bottom half of Mid to Bot. To get this bottom half into position, shift Mid left by 16 places. This leaves the bottom half of Mid followed by 16 zeros, which is what we want.

Checking for Overflow

The problem with only using 32-bit registers is that these additions can overflow. Each carry needs to be added to the right register to get the correct result.

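The first part to check is the result of adding the bottom half of Mid to Bot. Add 1 (the carry) to Top if there is an overflow. ADC adds its source register as well as the carry, so that register has to be zero first.

LSLS R7, R6, #16   ; bottom half of Mid
ADDS R0, R7        ; add to Bot, setting the carry flag
MOVS R7, #0        ; clear R7 (MOVS with an immediate leaves the carry flag unchanged)
ADCS R1, R7        ; add only the carry to Top

If (A_top*B_bot + B_top*A_bot) also overflows 32 bits, that carry also has to be added to Top. It is bit 32 of Mid, and Mid is shifted left by 16 in the result, so it belongs at bit 16 of Top.

ADDS R6, R7, R6    ; A_top*B_bot + B_top*A_bot stored in R6
MOVS R7, #0        ; clear R7 without touching the carry flag
ADCS R7, R7        ; R7 = carry out of the sum (0 or 1)
LSLS R7, R7, #16   ; move the carry to bit 16
ADD  R1, R7        ; overflow added to Top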
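
The carry handling described in this section can be modelled in C with 32-bit variables only. This is a sketch under stated assumptions rather than the article's own code: the function name mul32x32_mid is hypothetical, and a carry out of a 32-bit addition is detected by checking whether the sum came out smaller than one of its addends, which stands in for the processor's carry flag.

```c
#include <stdint.h>

/* 32x32 -> 64 multiply using only 32-bit variables.
   Hypothetical model of the Top/Mid/Bot steps; (sum < addend) plays the
   role of the carry flag. */
void mul32x32_mid(uint32_t a, uint32_t b, uint32_t *top_out, uint32_t *bot_out)
{
    uint32_t a_top = a >> 16, a_bot = a & 0xFFFFu;
    uint32_t b_top = b >> 16, b_bot = b & 0xFFFFu;

    uint32_t top = a_top * b_top;
    uint32_t bot = a_bot * b_bot;

    uint32_t cross = a_top * b_bot;
    uint32_t mid = cross + b_top * a_bot;
    uint32_t mid_carry = (mid < cross);   /* 1 if Mid overflowed 32 bits */

    top += mid >> 16;                     /* top half of Mid goes to Top */
    top += mid_carry << 16;               /* bit 32 of Mid is bit 16 of Top */

    uint32_t low = mid << 16;             /* bottom half of Mid, then 16 zeros */
    bot += low;
    top += (bot < low);                   /* carry from Bot into Top */

    *top_out = top;
    *bot_out = bot;
}
```

With both inputs set to 0xFFFFFFFF, every carry path is exercised and the result is Top = 0xFFFFFFFE, Bot = 0x00000001.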

This code treats both operands as unsigned and does not deal with negative numbers. The article "Negation of 64-Bit Numbers with 32-Bit Registers" has some information on this.
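
One possible way to handle signed operands, sketched here in C as an assumption rather than the approach of the referenced article, is to multiply the magnitudes and negate the 64-bit result when the signs differ. The names neg64, umul32x32, and smul32x32 are hypothetical, and umul32x32 is only a stand-in for the unsigned routine, implemented with uint64_t to keep the example short.

```c
#include <stdint.h>

/* Negate a 64-bit value held as two 32-bit words (two's complement):
   invert both words, add 1 to the low word, and carry into the high word. */
void neg64(uint32_t *hi, uint32_t *lo)
{
    *lo = ~*lo + 1u;
    *hi = ~*hi + (*lo == 0u);   /* the +1 carries only if the low word wrapped to 0 */
}

/* Stand-in for the unsigned 32x32 -> 64 routine (hypothetical). */
static void umul32x32(uint32_t a, uint32_t b, uint32_t *hi, uint32_t *lo)
{
    uint64_t p = (uint64_t)a * b;
    *hi = (uint32_t)(p >> 32);
    *lo = (uint32_t)p;
}

/* Signed multiply: multiply the magnitudes, then fix the sign. */
void smul32x32(int32_t a, int32_t b, uint32_t *hi, uint32_t *lo)
{
    /* 0u - (uint32_t)x is well defined even for INT32_MIN. */
    uint32_t ua = (a < 0) ? 0u - (uint32_t)a : (uint32_t)a;
    uint32_t ub = (b < 0) ? 0u - (uint32_t)b : (uint32_t)b;

    umul32x32(ua, ub, hi, lo);
    if ((a < 0) != (b < 0))
        neg64(hi, lo);
}
```

For example, smul32x32(-3, 5, &hi, &lo) leaves hi = 0xFFFFFFFF and lo = 0xFFFFFFF1, the two's complement encoding of -15.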

Sample code

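; Unsigned 32x32 -> 64-bit multiply for the Cortex-M0 platform
; In:  r0 = A, r1 = B
; Out: r2 = address that receives Result[H], r3 = address that receives Result[L]
; Written in UAL syntax: on the M0 these instructions exist only in their flag-setting (S) form

Tif_32x32Asm                 FUNCTION
                             push {r4,r5,r6,r7,r14}

                             lsrs r6, r0, #16       ;r6 <- [x1]  high half of A
                             lsls r0, r0, #16
                             lsrs r0, r0, #16       ;r0 <- [x0]  low half of A
                             lsrs r7, r1, #16       ;r7 <- [y1]  high half of B
                             lsls r1, r1, #16
                             lsrs r1, r1, #16       ;r1 <- [y0]  low half of B

                             movs r5, r0
                             muls r5, r1, r5        ;r5 <- [x0 * y0]  Bot
                             muls r0, r7, r0        ;r0 <- [x0 * y1]
                             muls r7, r6, r7        ;r7 <- [x1 * y1]  Top
                             muls r1, r6, r1        ;r1 <- [x1 * y0]

                             lsrs r6, r0, #16
                             adds r7, r7, r6        ;Top += high half of x0*y1

                             lsrs r6, r1, #16
                             adds r7, r7, r6        ;Top += high half of x1*y0

                             movs r4, #0            ;zero operand for adcs

                             lsls r6, r0, #16
                             adds r5, r5, r6        ;Bot += low half of x0*y1, sets carry
                             adcs r7, r7, r4        ;Top += carry

                             lsls r6, r1, #16
                             adds r5, r5, r6        ;Bot += low half of x1*y0, sets carry
                             adcs r7, r7, r4        ;Top += carry

                             str  r7, [r2]          ;result[H] -> [r2]
                             str  r5, [r3]          ;result[L] -> [r3]

                             pop {r4,r5,r6,r7,r15}

                             ENDFUNC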
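
To check the routine on a host machine, its steps can be mirrored in C. The sketch below is an assumption-laden model, not generated from the assembly: the name tif_32x32_model is hypothetical, the variable names follow the x0/x1/y0/y1 comments, and (sum < addend) again stands in for the carry flag. It also shows why this version needs no separate check for the middle sum overflowing: the two cross products are folded into Top and Bot one at a time and are never added to each other.

```c
#include <stdint.h>

/* Host-side model of the sample routine (hypothetical name).
   Each cross product is split and folded in separately. */
void tif_32x32_model(uint32_t a, uint32_t b, uint32_t *result_h, uint32_t *result_l)
{
    uint32_t x1 = a >> 16, x0 = a & 0xFFFFu;
    uint32_t y1 = b >> 16, y0 = b & 0xFFFFu;

    uint32_t lo = x0 * y0;      /* r5 */
    uint32_t m0 = x0 * y1;      /* r0 */
    uint32_t hi = x1 * y1;      /* r7 */
    uint32_t m1 = x1 * y0;      /* r1 */

    hi += m0 >> 16;             /* high halves of the cross products */
    hi += m1 >> 16;

    uint32_t t = m0 << 16;      /* low half of x0*y1 */
    lo += t;
    hi += (lo < t);             /* carry into the high word */

    t = m1 << 16;               /* low half of x1*y0 */
    lo += t;
    hi += (lo < t);             /* carry into the high word */

    *result_h = hi;
    *result_l = lo;
}
```

Comparing (((uint64_t)hi << 32) | lo) against (uint64_t)a * b over a set of test inputs gives a quick regression check for any change to the assembly.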

Posted on 2015-08-26 11:35 by jianqi2010
