Tuesday, July 15, 2008

The building blocks of HPC

Top 500 again. Linpack HPL. Hitting half a teraflop on a single system.

Using RHEL 5.2 on a single IBM Power 575 system, we were able to hit half a teraflop with Linpack. These water-cooled systems are pretty nice: thirty-two POWER6 cores packed into a fairly dense 2U rack form factor. They are designed for clusters, so 14 nodes (14 single systems) can be loaded into a single rack. Water piping winds its way into each system and over the cores (we of course had to pop one open to see how things looked and worked). The systems can be loaded with 128GB or 256GB of memory. A colleague provided a nice summary of the Linpack result over on IBM's developerWorks.

For Linux, there are several interesting pieces, especially as we look at Linpack as one of the key workloads that takes advantage of easy HPC building blocks. RHEL 5.2 comes with 64KB pages, which provide easy performance gains out of the box. The commercial compilers and math libraries make it easy to get tailored exploitation of the POWER6 systems. And running Linpack across clusters is the basis of the Top 500 results.
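As a quick aside (my own small illustration, not part of the Linpack work itself), it's easy to confirm which base page size a kernel was built with. A RHEL 5.2 ppc64 kernel with 64KB pages should report 65536 here, while a traditional 4KB-page kernel reports 4096:

/* pagesize.c - print the kernel's base page size.
 * Build:  gcc -o pagesize pagesize.c
 * Run:    ./pagesize
 */
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* Ask the kernel for the base page size it exposes to user space. */
    long page_size = sysconf(_SC_PAGESIZE);

    if (page_size < 0) {
        perror("sysconf");
        return 1;
    }
    printf("Base page size: %ld bytes\n", page_size);
    return 0;
}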

It's easy to take advantage of the building blocks in RHEL 5.2: OpenMPI in particular, the InfiniBand stack, and libraries tuned for the POWER hardware are all included. When we fire up a cluster node, we automatically install these components:
  • openmpi (including the -devel packages)
  • openib
  • libehca
  • libibverbs-utils
  • openssl
These building blocks allow us to take the half-a-teraflop single-system Linpack result and begin running it "out of the box" on multiple nodes (a trivial sketch of that kind of MPI program follows below). There are cluster experts around here that I'm learning from, and there are lots of interesting new challenges in the interconnect technologies and configurations. One technology shift emerging in this realm is 10GbE (10 Gigabit Ethernet) as an interconnect versus InfiniBand. InfiniBand has all sorts of learning curves associated with it; every time I try to do something with InfiniBand, I find another thing to learn. It'll be interesting to see whether the 10GbE technology will be more like simply plugging in an Ethernet cable and off we go. A good summer project...
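For anyone curious what "out of the box" looks like with the openmpi packages, here's a minimal sketch of my own (not the actual HPL setup) of the kind of program you can build with mpicc and launch across nodes with mpirun; the hostfile name and rank count below are just placeholders:

/* mpi_hello.c - minimal check that the OpenMPI stack works across nodes.
 * Each rank reports which node it landed on.
 * Build:  mpicc -o mpi_hello mpi_hello.c
 * Run:    mpirun -np 4 --hostfile myhosts ./mpi_hello   (myhosts is a placeholder)
 */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size, name_len;
    char name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);          /* this process's rank  */
    MPI_Comm_size(MPI_COMM_WORLD, &size);          /* total ranks launched */
    MPI_Get_processor_name(name, &name_len);       /* node we're running on */

    printf("Rank %d of %d running on %s\n", rank, size, name);

    MPI_Finalize();
    return 0;
}

If the ranks come back from the hostnames you expect, the MPI plumbing between nodes is in place and you can move on to the real workload.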


One of many bloggers.

Bill Buros

Bill leads an IBM Linux performance team in Austin, TX (the only place really to live in Texas). The team is focused on IBM's Power offerings (old, new, and future), working with IBM's Linux Technology Center (the LTC). While the focus is primarily on Power systems, the team also analyzes and improves overall Linux performance for IBM's xSeries products (both Intel and AMD), driving performance improvements which are both common for Linux and occasionally unique to the hardware offerings.

Performance analysis techniques, tools, and approaches are nicely common across Linux. Having worked for years in performance, I still get daily reminders of how much there is to learn in this space, so in many ways this blog is simply another vehicle on the continuing journey to becoming a more experienced "performance professional". One of several journeys in life.

The Usual Notice

The postings on this site are my own and don't necessarily represent IBM's positions, strategies, or opinions, try as I might to influence them.