c - How many 64-bit multiplications are needed to calculate the low 128-bits of a 64-bit by 128-bit product?

Question

Welcome To Ask or Share your Answers For Others

c - How many 64-bit multiplications are needed to calculate the low 128-bits of a 64-bit by 128-bit product?

posted Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)

c - How many 64-bit multiplications are needed to calculate the low 128-bits of a 64-bit by 128-bit product?

Consider that you want to calculate the low 128-bits of the result of multiplying a 64-bit and 128-bit unsigned number, and that the largest multiplication you have available is the C-like 64-bit multiplication which takes two 64-bit unsigned inputs and returns the low 64-bits of the result.

How many multiplications are needed?

Certainly you can do it with eight: break all the inputs up into 32-bit chunks and use your 64-bit multiplication to do the 4 * 2 = 8 required full-width 32*32->64 multiplications, but can one do better?

Of course the algorithm should do only a "reasonable" number of additions or other basic arithmetic on top of the multiplications (I'm not interested in solutions that re-invent multiplication as an addition loop and hence claim "zero" multiplications).

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

1 Reply

深蓝 · Answer 1 · 2021-10-23T19:27:13+0000

Four, but it starts to get a little tricky.

Let a and b be the numbers to be multiplied, with a₀ and a₁ being the low and high 32 bits of a, respectively, and b₀, b₁, b₂, b₃ being 32-bit groups of b, from low to high respectively.

The desired result is the remainder of (a₀ + a₁?2³²) ? (b₀ + b₁?2³² + b₂?2⁶⁴ + b₃?2⁹⁶) modulo 2¹²⁸.

We can rewrite that as (a₀ + a₁?2³²) ? (b₀ + b₁?2³²) + (a₀ + a₁?2³²) ? (b₂?2⁶⁴ + b₃?2⁹⁶) modulo 2¹²⁸.

The remainder of the latter term modulo 2¹²⁸ can be computed as a single 64-bit by 64-bit multiplication (whose result is implicitly multiplied by 2⁶⁴).

Then the former term can be computed with three multiplications using a carefully implemented Karatsuba step. The simple version would involve a 33-bit by 33-bit to 66-bit product which is not available, but there is a trickier version that avoids it:

z0 = a0 * b0
z2 = a1 * b1
z1 = abs(a0 - a1) * abs(b0 - b1) * sgn(a0 - a1) * sgn(b1 - b0) + z0 + z2

The last line contains only one multiplication; the other two pseudo-multiplications are just conditional negations. Absolute-difference and conditional-negate are annoying to implement in pure C, but it could be done.

Categories

c - How many 64-bit multiplications are needed to calculate the low 128-bits of a 64-bit by 128-bit product?

c - How many 64-bit multiplications are needed to calculate the low 128-bits of a 64-bit by 128-bit product?

Please log in or register to add a comment.

Please log in or register to reply this article.

1 Reply

Please log in or register to add a comment.

Just Browsing Browsing

Most popular tags