Penguin Computing recently installed the world’s first HPC cluster powered by AMD Accelerated Processing Units (APUs) at Sandia National Labs in Albuquerque, New Mexico.
The experimental system – which comprises 104 servers interconnected via a QDR InfiniBand fabric – delivers a theoretical peak performance of 59.6 TFLOPS.
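For a rough sense of scale, the quoted figures can be divided out to estimate peak throughput per server. The short Python sketch below assumes the 59.6 TFLOPS peak is spread evenly across 104 identical servers, which the announcement does not state explicitly; the variable names are illustrative only.

```python
# Figures quoted in the article.
PEAK_TFLOPS = 59.6   # theoretical peak for the whole cluster
SERVERS = 104        # number of servers in the cluster

# Assumption: every server contributes an equal share of the peak.
per_node_gflops = PEAK_TFLOPS * 1000 / SERVERS

print(f"{per_node_gflops:.1f} GFLOPS per server")  # prints "573.1 GFLOPS per server"
```

Under that assumption, each server accounts for a little over half a teraflop of theoretical peak.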
According to Penguin CTO Phil Pokorny, the Altus 2A00 was designed with AMD specifically to support the Fusion APU architecture, which combines multi-core x86 processing, memory controllers, a PCI-E interface and massively parallel GPU computing on a single piece of silicon.
“The APU includes 400 parallel processing cores that can be [used] for HPC applications through the OpenCL programming framework. Unlike conventional GPU server architectures, APU parallel multiprocessors share the same physical memory space with CPU cores,” Pokorny explained.
“As a result, the programming model for APUs is simpler, bottlenecks for data movement between GPU and main memory are avoided and data duplication is eliminated. These capabilities offer particularly compelling benefits when deployed in conjunction with low-latency RDMA interconnects such as InfiniBand, as they allow for building efficient distributed GPU applications.”
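The difference Pokorny describes can be illustrated with a loose analogy in plain Python. This is not OpenCL code and does not touch a GPU: a `bytearray` stands in for main memory, a copied `bytearray` stands in for a discrete card's separate memory, and a `memoryview` stands in for an APU working on the same physical memory. The function names `scale_discrete` and `scale_shared` are hypothetical and exist only for this sketch.

```python
def scale_discrete(host: bytearray, factor: int) -> bytearray:
    """Analogy for a discrete GPU: data is duplicated in a separate buffer."""
    device = bytearray(host)              # host -> "device" copy (duplicate data)
    for i in range(len(device)):
        device[i] = (device[i] * factor) % 256
    return bytearray(device)              # "device" -> host copy of the result


def scale_shared(host: bytearray, factor: int) -> None:
    """Analogy for an APU: compute operates on the host buffer itself."""
    view = memoryview(host)               # a view of the same memory, no copy
    for i in range(len(view)):
        view[i] = (view[i] * factor) % 256


data = bytearray([1, 2, 3])
copied_result = scale_discrete(data, 2)   # data itself is left unchanged
scale_shared(data, 2)                     # data is updated in place
```

In the first function the input exists twice and the result must be copied back; in the second there is a single buffer and nothing to transfer, which is the simplification the quote attributes to the shared memory space.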
AMD exec Margaret Lewis expressed similar sentiments, telling TG Daily that mainstream deployment of APUs in the HPC world was a definite possibility.
“Yes, APUs could be a viable path for HPC, as the accelerated processing units combine very powerful compute capabilities along with an advanced graphics vector engine.
“The HPC sector definitely understands the value of two different compute engines combined on one piece of silicon. Remember, our customers and partners have been experimenting with discrete graphics boards for a while now in the server world, and the APU is obviously generating interest and could see play in the server arena as well.”