x64 Instructions for Four Times Faster Math

Users sometimes assume that 64-bit computers (typically running x64 CPUs) will naturally run faster than 32-bit computers (typically running x86 CPUs). A common guess is that they will run twice as fast. After all, 64 is twice as big as 32. But in reality there is usually little difference. Sometimes x64 processes run a bit faster than x86 processes because they have twice as many registers (sixteen instead of eight), but more often they run slightly slower because larger instructions and larger data structures (a consequence of larger pointers) increase cache pressure.
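To make the cache-pressure point concrete, here is a rough sketch of how pointer size changes the size of a small data structure. The node layout (one 32-bit value plus two pointers) and the helper names are made up for illustration, and the code assumes natural alignment and 64-byte cache lines, which is typical but not universal.

```python
def struct_size(field_sizes):
    """Size in bytes of a C-style struct, assuming each field is
    aligned to its own size (natural alignment)."""
    offset = 0
    max_align = 1
    for size in field_sizes:
        align = size
        # Round the offset up to the field's alignment, then place the field.
        offset = (offset + align - 1) // align * align
        offset += size
        max_align = max(max_align, align)
    # Pad the end so that arrays of the struct stay aligned.
    return (offset + max_align - 1) // max_align * max_align


def node_size(pointer_bytes):
    """Hypothetical list node: a 32-bit value plus next and prev pointers."""
    return struct_size([4, pointer_bytes, pointer_bytes])


CACHE_LINE_BYTES = 64  # assumed cache line size

for name, pointer_bytes in (("x86", 4), ("x64", 8)):
    size = node_size(pointer_bytes)
    print(name, "node:", size, "bytes,", CACHE_LINE_BYTES // size, "whole nodes per cache line")
```

Under these assumptions the node doubles from 12 bytes to 24 bytes, so fewer than half as many whole nodes fit in each cache line (two instead of five).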

But in some cases x64 processes can run dramatically faster than x86 processes. If you need access to more than 4 GB of RAM then x64 processes are the way to go, and if you need to do high-precision math (math to hundreds of digits of accuracy) then x64 processes can deliver a four times performance increase. I was discussing this with one of my Fractal eXtreme customers when he recommended that I post some of the discussions. This is part one of a multi-part series on the optimizations that make multi-precision math in Fractal eXtreme (and in some cryptography code I'm sure) as fast as possible.

The first part is already written, as part of the Fractal eXtreme documentation. It explains why high-precision math is four times faster when coded for a 64-bit processor, and you can read it here.
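As a rough illustration of where a factor of four can come from, the sketch below counts the limb multiplies in a plain schoolbook multiplication. It is not Fractal eXtreme's actual code, and the function names are invented for this example. The underlying fact is that an x64 multiply instruction produces a 128-bit product from two 64-bit inputs, while the x86 equivalent produces a 64-bit product from two 32-bit inputs, so the same number needs half as many limbs and therefore a quarter as many multiplies.

```python
def to_limbs(value, limb_bits, count):
    """Split a non-negative integer into `count` limbs, least significant first."""
    mask = (1 << limb_bits) - 1
    return [(value >> (i * limb_bits)) & mask for i in range(count)]


def from_limbs(limbs, limb_bits):
    """Reassemble an integer from limbs, least significant first."""
    return sum(limb << (i * limb_bits) for i, limb in enumerate(limbs))


def schoolbook_multiply(a, b, limb_bits):
    """Multiply two limb arrays the schoolbook way.
    Returns (product limbs, number of limb-by-limb multiplies)."""
    mask = (1 << limb_bits) - 1
    result = [0] * (len(a) + len(b))
    multiplies = 0
    for i, a_limb in enumerate(a):
        carry = 0
        for j, b_limb in enumerate(b):
            # One limb x limb -> double-width multiply, plus the running sums.
            total = a_limb * b_limb + result[i + j] + carry
            multiplies += 1
            result[i + j] = total & mask
            carry = total >> limb_bits
        result[i + len(b)] = carry
    return result, multiplies


def limb_multiplies(total_bits, limb_bits):
    """Count the limb multiplies needed for a total_bits x total_bits product."""
    n = total_bits // limb_bits
    x = (1 << total_bits) - 12345   # arbitrary test operands
    y = (1 << total_bits) // 3
    limbs, count = schoolbook_multiply(
        to_limbs(x, limb_bits, n), to_limbs(y, limb_bits, n), limb_bits)
    assert from_limbs(limbs, limb_bits) == x * y  # check against Python's big ints
    return count


for limb_bits in (32, 64):
    print(limb_bits, "bit limbs:", limb_multiplies(1024, limb_bits), "multiplies")
```

For a 1024-bit by 1024-bit product this reports 1024 multiplies with 32-bit limbs and 256 with 64-bit limbs. Real code also spends time on adds with carry, loads, and stores, so the multiply count is only a first approximation of the speedup.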

Parts two and three, when I get to them, will cover the advantages of diagonal math (officially known as Lattice Multiplication) over rectangular math, and stretching the limits of loop unrolling.

About brucedawson

I'm a programmer, working for Google, focusing on optimization and reliability. Nothing's more fun than making code run 10x as fast. Unless it's eliminating large numbers of bugs. I also unicycle. And play (ice) hockey. And sled hockey. And juggle. And worry about whether this blog should have been called randomutf-8. 2010s in review tells more: https://twitter.com/BruceDawson0xB/status/1212101533015298048
This entry was posted in Fractals, Math, Performance, Programming.
