I’ve long felt that Japan has been severely overlooked in recent years, thanks to its two “lost decades” and China overshadowing it, and supercomputing is no exception.
In 2011, Fujitsu launched the K computer at the RIKEN Advanced Institute for Computational Science campus in Kobe, Japan. Calling it a computer really is a misnomer, though, as is the case with any supercomputer these days. When I think “computer,” I think of the 3-foot-tall black tower a few feet from me making the room warm. In the case of K, it’s rows and rows of cabinets stuffed with rack-mounted servers in a space the size of a basketball court.
With its distributed memory architecture, K had 88,128 eight-core SPARC64 VIIIfx processors in 864 cabinets. Fujitsu was a licensee of the SPARC architecture from Sun Microsystems (later acquired by Oracle) and did some impressive work of its own on the processor. When it launched in 2011, K was ranked the world’s fastest supercomputer on the TOP500 supercomputer list, at a computation speed of over 8 petaflops. It has since been surpassed by supercomputers from the U.S. and China.
Fujitsu sets a new course with an ARMv8-A processor
Supercomputers have a shelf life of three to five years, so it’s time for a replacement. With the demise of the SPARC processor, Fujitsu has decided to chart a new course. Last week at the Hot Chips conference, it announced the publication of specifications for the A64FX CPU, an ARMv8-A processor that will be used in the post-K computer, which Fujitsu and RIKEN hope will be 100 times faster than K.
A64FX will be the first CPU to adopt the Scalable Vector Extension (SVE), an extension of the ARMv8-A instruction set architecture for supercomputers. Fujitsu worked with ARM, which is owned by SoftBank, another Japanese firm, to develop the A64FX.
The CPUs will be directly connected by the proprietary Tofu (Torus Fusion) interconnect developed for the K computer. Tofu is designed to improve parallel performance using a six-dimensional mesh/torus topology that scales to over 100,000 nodes, with full-duplex links that have a peak bandwidth of 10 GB/sec.
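To make the torus idea a little more concrete, here is a minimal Python sketch of how neighbors work in a multi-dimensional torus, where each axis wraps around like a ring. The function name `torus_neighbors` and the dimension sizes in `DIMS` are my own illustrative assumptions, not the real Tofu layout or any Fujitsu API.

```python
def torus_neighbors(coord, dims):
    """Return the distinct neighbors of `coord` in a torus with the given axis sizes."""
    neighbors = set()
    for axis, size in enumerate(dims):
        for step in (-1, 1):
            nxt = list(coord)
            nxt[axis] = (coord[axis] + step) % size  # wrap around the ring on this axis
            nxt = tuple(nxt)
            if nxt != coord:  # an axis of size 1 would point back at the node itself
                neighbors.add(nxt)
    return sorted(neighbors)


# Hypothetical six-dimensional shape, chosen only for illustration.
DIMS = (4, 4, 4, 2, 3, 2)

node_count = 1
for size in DIMS:
    node_count *= size

origin = (0, 0, 0, 0, 0, 0)
links = torus_neighbors(origin, DIMS)

print(node_count)  # 768 nodes in this toy shape
print(len(links))  # 10 distinct neighbors of the origin
```

Note that an axis of size 2 contributes only one neighbor, because stepping forward and backward lands on the same node. That is why this toy shape gives 10 neighbors rather than 12.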
Each processor can provide peak double-precision (64-bit) floating-point performance of over 2.7 TFLOPS, or twice that for single precision (32-bit). Traditional supercomputing workloads require double precision, while artificial intelligence (AI) can get by with single precision, so the A64FX should suit both.
It’s a beast of a chip, too, with 48 cores per chip plus four assistant cores, two 512-bit SIMD pipes per core (comparable to a Xeon or Epyc processor), and support for HBM2 memory, with the chips linked by the Tofu interconnect. Memory bandwidth is expected to top 1 TB/sec.
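Those figures hang together if you run the arithmetic. The sketch below is a back-of-the-envelope check, not Fujitsu’s own math: it assumes each SIMD lane completes one fused multiply-add (counted as two floating-point operations) per cycle and that only the 48 compute cores contribute. The clock speed is not given in the announcement, so the value printed here is merely what the quoted 2.7 TFLOPS would imply under those assumptions.

```python
CORES = 48            # compute cores per chip (from the article)
PIPES_PER_CORE = 2    # 512-bit SIMD pipes per core (from the article)
SIMD_BITS = 512
FLOPS_PER_FMA = 2     # assumption: one fused multiply-add per lane per cycle


def flops_per_cycle(element_bits):
    """Peak floating-point operations per clock cycle for one chip."""
    lanes = SIMD_BITS // element_bits  # 8 lanes at 64-bit, 16 lanes at 32-bit
    return CORES * PIPES_PER_CORE * lanes * FLOPS_PER_FMA


dp = flops_per_cycle(64)  # double precision
sp = flops_per_cycle(32)  # single precision

QUOTED_DP_PEAK = 2.7e12   # "over 2.7 TFLOPS" from the article
QUOTED_MEM_BW = 1.0e12    # "top 1 TB/sec" from the article

implied_clock_ghz = QUOTED_DP_PEAK / dp / 1e9
bytes_per_flop = QUOTED_MEM_BW / QUOTED_DP_PEAK

print(dp, sp)                       # 1536 3072
print(round(implied_clock_ghz, 2))  # 1.76
print(round(bytes_per_flop, 2))     # 0.37
```

Halving the element width doubles the lane count, which is where the “twice that for single precision” figure comes from.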
All told, Fujitsu expects the chip to be 2.5 times faster for HPC and AI than the previous-generation SPARC chip it made, the SPARC64 XIfx.
Improved power efficiency
At the same time, Fujitsu is aiming for better power efficiency through an Energy Monitor and Energy Analyzer, which enable chip-level monitoring of performance so the system can adjust accordingly. The “Power Knob” can change the hardware configuration as needed to manage power use.
The big question is whether other customers will be able to buy it. Fujitsu said it would contribute to the ARM ecosystem, but will anyone outside Japan be able to buy a post-K computer for themselves? Japan likes to keep its best toys for itself. We’ll see. The post-K computer is expected to launch around 2021.