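32-bit multiplication with a 64-bit result on the Cortex-M0

Introduction

    The M0 is a 32-bit processor with only a limited set of Thumb instructions. Its only multiply instruction is MUL, which multiplies two 32-bit numbers and discards the upper 32 bits of the product. This article uses assembly to multiply two 32-bit numbers and keep the full result in two 32-bit registers.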
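
To make the truncation concrete, here is a minimal C sketch of what MUL keeps. It assumes `uint32_t` arithmetic as a stand-in for the 32-bit registers; the name `mul_low32` is made up for illustration.

```c
#include <stdint.h>

/* Models the M0 MUL instruction: only the low 32 bits of the
   product survive, the upper 32 bits are lost. */
uint32_t mul_low32(uint32_t a, uint32_t b)
{
    return a * b;   /* unsigned arithmetic wraps modulo 2^32 */
}
```

For example, `mul_low32(0x10000, 0x10000)` returns 0, because the true product 2^32 does not fit in 32 bits.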

 

The following idea comes from: http://anjoola.com/32bitmul.html

I first tried working this out on paper before getting it to work in Thumb. Say we want to multiply two numbers A and B. Both A and B will each take up one 32-bit register. Since A and B are represented in binary, they can be split into a top half and bottom half, like so.

      A          *         B         =     Result
 ---------------       ---------------       -----------
| A_top | A_bot |  *  | B_top | B_bot |  =  | Top | Bot |
 ---------------       ---------------       -----------

This can then be taken as a product of two sums.

((A_top << 16) + A_bot) * ((B_top << 16) + B_bot)
    = (A_top << 16)(B_top << 16) + (A_top << 16)(B_bot) + (B_top << 16)(A_bot) + A_bot*B_bot
    = ((A_top*B_top) << 32) + ((A_top*B_bot + B_top*A_bot) << 16) + A_bot*B_bot
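
The expansion above can be checked on a host machine with a short C sketch. It uses `uint64_t` purely to confirm the algebra, so it is not the M0 routine itself; the function name `mul_by_halves` is an assumption made for this example.

```c
#include <stdint.h>

/* Recombine the four 16x16 partial products exactly as in the
   expansion above. The operands are widened to 64 bits so that no
   intermediate value can overflow. */
uint64_t mul_by_halves(uint32_t a, uint32_t b)
{
    uint64_t a_top = a >> 16, a_bot = a & 0xFFFFu;
    uint64_t b_top = b >> 16, b_bot = b & 0xFFFFu;

    return ((a_top * b_top) << 32)
         + ((a_top * b_bot + b_top * a_bot) << 16)
         + (a_bot * b_bot);
}
```

For any pair of inputs the result should equal `(uint64_t)a * b`.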

Top can initially be set to (A_top*B_top), since (A_top*B_top) << 32 moves it over by exactly one register. Similarly, Bot can initially be set to (A_bot*B_bot).

LSRS R7, R6, #16   ; top half of Mid
ADD  R1, R7      ; add to Top

The hard part is the middle 32 bits, (A_top*B_bot + B_top*A_bot) << 16, since half of it falls in Top and half falls in Bot. This can be dealt with by first adding the top half of (A_top*B_bot + B_top*A_bot), which we will call Mid from now on, to Top. To get this top half, shift Mid right by 16 places.

Next, add the bottom half of Mid to Bot. To get this bottom half, shift Mid left by 16 places. This leaves the bottom half of Mid followed by 16 zeros, which is what we want.
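
As a small illustration of the two shifts, the following C sketch shows which part of Mid goes where. The helper names `mid_to_top` and `mid_to_bot` are hypothetical, and the sketch assumes Mid itself fits in 32 bits.

```c
#include <stdint.h>

/* Part of Mid that belongs to Top: its upper 16 bits. */
uint32_t mid_to_top(uint32_t mid)
{
    return mid >> 16;
}

/* Part of Mid that belongs to Bot: its lower 16 bits, moved up
   into bits 31..16 and followed by 16 zeros. */
uint32_t mid_to_bot(uint32_t mid)
{
    return mid << 16;
}
```

With Mid = 0x12345678, the value added to Top is 0x1234 and the value added to Bot is 0x56780000.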

Checking for Overflow

The problem with using only 32-bit registers is that an addition can overflow. Each carry has to be added to the right register to get the correct result.

LSLS R7, R6, #16   ; bottom half of Mid
MOVS R5, #0        ; zero operand for the carry add (assumes R5 is free)
ADDS R0, R0, R7    ; add to Bot, sets the carry flag
ADCS R1, R5        ; add only the carry to Top

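The first thing to check is the result of adding the bottom half of Mid to Bot: if that addition overflows, add 1 (the carry) to Top. If the sum A_top*B_bot + B_top*A_bot itself overflows 32 bits, that carry also has to be added to Top. Because Mid is shifted left by 16 in the final sum, this carry has weight 2^48, so it lands in bit 16 of Top.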

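ADDS R6, R7, R6    ; A_top*B_bot + B_top*A_bot stored in R6, sets the carry flag
ADCS R5, R5        ; R5 = carry out of the sum (assumes R5 was 0 beforehand)
LSLS R5, R5, #16   ; the carry has weight 2^48, i.e. bit 16 of Top
ADD  R1, R5        ; overflow added to Top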
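
The two carries described in this section can be modelled in C using only 32-bit operations. This is a sketch under stated assumptions rather than the original author's code: the function name `mul32x32_mid` and its variable names are invented here, and a carry is detected with the usual "the sum is smaller than an addend" test in place of the processor's carry flag.

```c
#include <stdint.h>

/* 32x32 -> 64 multiply with 32-bit arithmetic only, summing both
   cross products into Mid first (illustrative names). */
void mul32x32_mid(uint32_t a, uint32_t b, uint32_t *top, uint32_t *bot)
{
    uint32_t a_top = a >> 16, a_bot = a & 0xFFFFu;
    uint32_t b_top = b >> 16, b_bot = b & 0xFFFFu;

    uint32_t hi = a_top * b_top;              /* initial Top */
    uint32_t lo = a_bot * b_bot;              /* initial Bot */

    uint32_t cross = a_top * b_bot;
    uint32_t mid = cross + b_top * a_bot;     /* may wrap past 2^32 */
    uint32_t mid_carry = (mid < cross);       /* 1 if the sum wrapped */

    hi += mid_carry << 16;                    /* carry of Mid: bit 16 of Top */
    hi += mid >> 16;                          /* top half of Mid */

    uint32_t low_part = mid << 16;            /* bottom half of Mid */
    lo += low_part;
    hi += (lo < low_part);                    /* carry out of Bot */

    *top = hi;
    *bot = lo;
}
```

Multiplying 0xFFFFFFFF by itself exercises both carries and yields Top = 0xFFFFFFFE, Bot = 0x00000001.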

This code does not handle negative (signed) operands. The article "Negation of 64-Bit Numbers with 32-Bit Registers" has some information on this.

Sample code

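; A in r0; B in r1; r2 = address where Result[H] is stored; r3 = address where Result[L] is stored
; for the Cortex-M0 platform
; Flag-setting mnemonics (S suffix) are written out so that each adcs sees the carry of the preceding adds.
Tif_32x32Asm FUNCTION
        push    {r4,r5,r6,r7,r14}
        lsrs    r6, r0, #16     ; r6 <- x1 (high half of A)
        lsls    r0, r0, #16
        lsrs    r0, r0, #16     ; r0 <- x0 (low half of A)
        lsrs    r7, r1, #16     ; r7 <- y1 (high half of B)
        lsls    r1, r1, #16
        lsrs    r1, r1, #16     ; r1 <- y0 (low half of B)
        mov     r5, r0
        muls    r5, r1, r5      ; r5 <- x0 * y0 (Bot)
        muls    r0, r7, r0      ; r0 <- x0 * y1
        muls    r7, r6, r7      ; r7 <- x1 * y1 (Top)
        muls    r1, r6, r1      ; r1 <- x1 * y0
        lsrs    r6, r0, #16
        adds    r7, r7, r6      ; Top += high half of x0 * y1
        lsrs    r6, r1, #16
        adds    r7, r7, r6      ; Top += high half of x1 * y0
        movs    r4, #0          ; zero operand for the carry adds
        lsls    r6, r0, #16
        adds    r5, r5, r6      ; Bot += low half of x0 * y1
        adcs    r7, r4          ; Top += carry
        lsls    r6, r1, #16
        adds    r5, r5, r6      ; Bot += low half of x1 * y0
        adcs    r7, r4          ; Top += carry
        str     r7, [r2]        ; Result[H] -> [r2]
        str     r5, [r3]        ; Result[L] -> [r3]
        pop     {r4,r5,r6,r7,r15}
        ENDFUNC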
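
For checking the routine on a host machine, a C model that follows the same sequence of steps can be useful. The sketch below is hypothetical (the name `tif_32x32_ref` does not appear in the original). It adds the two cross products to Top and Bot one at a time, as the assembly does, so the separate Mid sum and its overflow never arise.

```c
#include <stdint.h>

/* Hypothetical C model of the assembly routine: same argument
   order, same partial products, same two carry additions. */
void tif_32x32_ref(uint32_t a, uint32_t b,
                   uint32_t *result_h, uint32_t *result_l)
{
    uint32_t x1 = a >> 16, x0 = a & 0xFFFFu;
    uint32_t y1 = b >> 16, y0 = b & 0xFFFFu;

    uint32_t lo = x0 * y0;      /* r5 in the assembly */
    uint32_t m0 = x0 * y1;      /* r0 */
    uint32_t hi = x1 * y1;      /* r7 */
    uint32_t m1 = x1 * y0;      /* r1 */

    hi += m0 >> 16;             /* high halves of the cross products */
    hi += m1 >> 16;

    uint32_t t = m0 << 16;      /* low half of x0 * y1 */
    lo += t;
    hi += (lo < t);             /* carry, as adcs does */

    t = m1 << 16;               /* low half of x1 * y0 */
    lo += t;
    hi += (lo < t);

    *result_h = hi;
    *result_l = lo;
}
```

On the target, and assuming the standard ARM procedure call convention (arguments in r0 to r3), the assembly routine could presumably be declared from C as `void Tif_32x32Asm(uint32_t a, uint32_t b, uint32_t *result_h, uint32_t *result_l);` and compared against this model.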

 

Posted on 2015-08-26 11:35 by jianqi2010
