Monday, November 26, 2007

Green 500 ?!

Interesting. I flippantly remarked in my last post that I wondered what the energy/thermal ratings were for the Top 500 clustered systems, and then this morning I stumbled on the Green 500 list at http://green500.org.

My first thought was "how in the world do you measure power consumption across so many clustered systems?" It appears that they derive the overall power consumption from whatever can actually be measured, and then calculate the overall performance-per-watt metric from that. The web site even nicely provides a tutorial paper on how to calculate and determine your power consumption.

http://green500.org/docs/tutorials/tutorial.pdf

More stuff to read and try this week. Time to pull the power meter over to a small Linpack test system running the latest RHEL 5.1 release, where we recently pushed out a number of publishes, and see what's happening. We regularly play with Linpack on single SMP servers to make sure they scale fairly linearly (i.e., 4-core to 8-core to 16-core), and this may be a good way to apply some power consumption metrics to the performance metrics. Linpack is good because it's fairly steady-state for a relatively long period (depending, of course, on how big you define the problem size to be solved).
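
As a back-of-the-envelope illustration, the Green 500 metric is essentially sustained Linpack performance divided by average power draw during the run. Here's a minimal C sketch of the arithmetic - the GFLOPS and watt figures below are made-up placeholders, not measurements from our systems:

    /* perf_per_watt.c - toy performance-per-watt calculation.
     * The Linpack rates and power readings are hypothetical
     * placeholders, not real measurements. */
    #include <stdio.h>

    int main(void)
    {
        int    cores[]  = { 4, 8, 16 };             /* SMP configurations  */
        double gflops[] = { 28.0, 54.0, 102.0 };    /* sustained Linpack   */
        double watts[]  = { 350.0, 520.0, 870.0 };  /* average meter draw  */
        int i;

        for (i = 0; i < 3; i++) {
            /* MFLOPS per watt: 1 GFLOPS = 1000 MFLOPS */
            printf("%2d cores: %6.1f GFLOPS / %5.1f W = %6.1f MFLOPS/W\n",
                   cores[i], gflops[i], watts[i],
                   1000.0 * gflops[i] / watts[i]);
        }
        return 0;
    }

If the cores scale linearly but the power draw grows faster (or slower), this ratio makes the trade-off visible at a glance.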

Naturally... what, when, and how to measure watts, thermals, and energy consumption for systems under test are the subject of many debates and discussions in the industry these days. A whole new dimension of being able to say "well, it depends" when asked about performance and the trade-offs. If you get a chance, watch what happens with SPEC.org's initial SPECpower benchmark (at http://www.spec.org/specpower/ ). This initial benchmark is focused on CPU-centric workloads, but more dimensions are undoubtedly coming.

Sunday, November 25, 2007

The World's Fastest Computers

Caught this post from Linux Watch last week as we were getting ready for the Thanksgiving weekend.
The post stemmed from the twice-yearly posting of the Top 500 supercomputers as compiled on the Top500.org web site. This web site keeps track of the rankings of the world's supercomputers based on their Linpack benchmark results. We use Linpack on a day-to-day basis on single systems to keep track of changes being made across a number of pieces of the software stack, but playing on a single system is nothing compared to what these companies and partnerships are doing.

These top computers are amazing; they are a testament to the "scale-out" architectural approach of replicating computing horsepower. Take a look at the number 1 system in the list. They describe the system on the Lawrence Livermore National Laboratory's web site at https://asc.llnl.gov/computing_resources/bluegenel/. According to the Top 500 web site, this "system" has 212,992 processors and 73,728 GB of memory. (I wonder what the power/watts/thermal measurements are for systems like this - I couldn't find any Energy Star ratings on the BlueGene web site.)
  • Poking around some more, it turns out that the memory is referred to in "tebibytes". I probably should've known this, but the "tebi" prefix is short for "tera binary" and is intended for the 2-to-the-nth-power numbers. So a tebibyte is 1024 to the 4th (or 2 to the 40th), whereas the more familiar terabyte is 1000 to the 4th (or 10 to the 12th) - see the quick sketch below. There's a nice table on Wikipedia with a good, concise way of looking at things. It certainly is more precise - we often have clarification discussions about the difference between 1024-based and 1000-based numbers.
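
For the curious, here's a quick C sketch of that difference - just the arithmetic, nothing clever:

    /* tebibyte.c - 1024-based vs 1000-based "tera" */
    #include <stdio.h>

    int main(void)
    {
        unsigned long long tib = 1ULL << 40;        /* 1024^4 = 2^40  */
        unsigned long long tb  = 1000000000000ULL;  /* 1000^4 = 10^12 */

        printf("1 TiB = %llu bytes\n", tib);
        printf("1 TB  = %llu bytes\n", tb);
        printf("a tebibyte is %.1f%% larger\n",
               100.0 * (double)(tib - tb) / (double)tb);
        return 0;
    }

That roughly 10% gap is exactly why the clarification discussions keep coming up.
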
Back to the fastest computers running Linux. The Linux Watch post states that 426 of the top 500 computers rely on Linux. The Linux operating system base is readily optimized for the various pieces of the software stack. One of the surprises in the statistics list was how pervasive Gigabit Ethernet is as the interconnect. I had heard that InfiniBand was the preferred interconnect technology and had assumed it would be the hands-down favorite. In fact, Gigabit Ethernet was listed at 54% of the Top 500, with InfiniBand the 2nd most pervasive at 24%. The remaining systems use a variety of specialized interconnect technologies. It's interesting to see these off-the-shelf technologies being leveraged in the Top 500 supercomputer configurations.

These Top 500 supercomputer systems are a world in and of themselves. They really represent the top of the stack for HPC workloads, and have fairly unique configuration challenges which in many cases dive into the "research" world, with some amazing latency, shared-file, memory-accessibility, and CPU-interconnect technologies. For Linux customers, some of these are bleeding-edge technologies which over time are product'ized and rolled into the commercially supported distros; others are already shipping in today's customer-available distros.

A good example of technology which is pervasive and mature for commercial, research, and academic use is the Open MPI project. Over the last couple of months we've started looking more into Open MPI and related MPI products, and have found that Open MPI is very competitive. The Open MPI organization (at http://www.open-mpi.org/ ) is very active and keeps the MPI implementation at the leading edge across a number of offerings. The feature set of Open MPI v1.2.4 is impressive and provides good flexibility across networks, interconnects, and system implementations. Our work on small clusters is based on Open MPI, which allows us to focus on other system performance issues and concerns. For anyone who hasn't played with MPI, there's a minimal sketch below.
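
Here's what a minimal MPI program looks like - a sketch assuming Open MPI's mpicc and mpirun wrappers are on your path (the file name is my own):

    /* hello_mpi.c - a minimal MPI program.
     * Build and run with Open MPI's wrappers:
     *   mpicc -o hello_mpi hello_mpi.c
     *   mpirun -np 4 ./hello_mpi
     */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        int rank, size;

        MPI_Init(&argc, &argv);                /* start the MPI runtime     */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* this process's rank       */
        MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total number of processes */

        printf("Hello from rank %d of %d\n", rank, size);

        MPI_Finalize();                        /* shut down cleanly         */
        return 0;
    }

Each rank can run on a different node of the cluster, and the same source runs unchanged whether the interconnect underneath is Gigabit Ethernet or InfiniBand - which is part of what makes Open MPI so flexible.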

It'll be interesting to see what we find over the coming months. It'll be fun to start learning more about these large system scaled-out configurations and see what we can apply to real-life customers today.

Sunday, November 4, 2007

Oprofile - A Visual Interface

So in performance, we profile systems a lot. In our performance testing and analysis, we usually rely on normal Linux command-line interfaces and a tailored automated analysis framework for running workloads, micro-benchmarks, and industry standard benchmarks. The test framework is similar to several derivative branches worked on in the Linux community by people like Martin Bligh[1], Andy Whitcroft, and many others. For example, in the community, AutoTest is one of the latest incarnations of automated test environments[2]. Our current autobench framework is driven primarily these days by a member of our Austin analysis team, Karl Rister.

For our performance work, we wrap and re-wrap things with simplifying scripts and strive for parse'able text output. The key focus is to be able to crank the same workload across different OS and software levels, and to use and compare the varying IBM hardware platforms that Linux supports (x86, Power, Blades, Servers, etc). This helps us more quickly understand where Linux needs to be improved, or whether there is something to be improved in the various hardware/firmware combinations and configurations that customers would want to use. Or, in some cases when working with new workloads, it helps us figure out what exactly we're running and what we're really measuring.

But what about the more elaborate graphical user interfaces to performance tools? Not so much. The two user-interface assists we do deploy are wrapping things with web interfaces (easy to crank out, easy to adjust) and graphing the generated data - more simply put: "a picture is worth a thousand words".

That said, someone recently asked about the latest Visual Performance Analyzer (VPA) tool, which can be used with oprofile on Linux. The last time I had played with it was several years ago, so it was time to give it another shot. And I found it was pretty nice. It won't replace our automation framework, but it was a nice alternative way to get and view profile information, especially from a remote system looking "into" a system under test. Sometime over the coming weeks, I'd like to see if we can tap into our pool of previously run and saved profiled workloads. Not sure if that's possible, but we'll see.

So here's what I found, with some quick context definitions as needed.

Using a profiler[3] is simply the process of measuring a program's or system's behavior while things are running. The basic approach is to sample what each CPU is doing on a regular basis - say, every 100,000 clock cycles - and keep track of where in the program (or the system) things are spending the most time. Profilers can do this "automatically", and can be tailored to do some pretty clever tricks using different hardware counters. The toy sketch below shows the sampling idea in miniature.
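
To make that concrete, here's a toy sketch in C of the statistical-sampling idea. This is a hypothetical illustration of the principle, not how oprofile is actually implemented (oprofile drives the hardware performance counters): a profiling timer fires periodically, and each tick is attributed to whatever the program was doing at that moment.

    /* sampler.c - toy statistical sampling with SIGPROF.
     * Not how oprofile works internally; just the sampling idea.
     * A real profiler records the interrupted program counter;
     * here we cheat with a global "phase" marker. */
    #include <signal.h>
    #include <stdio.h>
    #include <sys/time.h>

    static volatile sig_atomic_t phase = 0;
    static volatile long samples[2];

    static void on_sample(int sig)
    {
        (void)sig;
        samples[phase]++;       /* charge this tick to the running phase */
    }

    static void spin(double seconds)   /* burn CPU for roughly this long */
    {
        struct timeval t0, t1;
        gettimeofday(&t0, NULL);
        do {
            gettimeofday(&t1, NULL);
        } while ((t1.tv_sec - t0.tv_sec) +
                 (t1.tv_usec - t0.tv_usec) / 1e6 < seconds);
    }

    int main(void)
    {
        /* sample every 10ms of consumed CPU time */
        struct itimerval timer = { { 0, 10000 }, { 0, 10000 } };
        signal(SIGPROF, on_sample);
        setitimer(ITIMER_PROF, &timer, NULL);

        phase = 0; spin(1.0);   /* ~1 second of "fast" work  */
        phase = 1; spin(3.0);   /* ~3 seconds of "slow" work */

        printf("phase 0: %ld samples\nphase 1: %ld samples\n",
               samples[0], samples[1]);
        return 0;
    }

Run it and phase 1 accumulates roughly three times the samples of phase 0 - which is the whole trick: nothing is measured directly, the hot spots simply collect the most ticks.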

Application programmers use profiling to figure out how to optimize their programs, usually by improving data structures, loops, and program structure. Each programming language is different and has its own performance challenges. In our world, though, we're focused on a hierarchy of performance challenges. With new hardware or new operating system code, we are generally focused more on how to optimize Linux for the hardware, looking for the typical "performance hazards" which can be improved in software, firmware, or hardware.

For VPA, I downloaded the x86 rpm version of VPA from IBM's alphaWorks[4] and installed it on a laptop recently loaded with RHEL 5.1 Linux; I also tried the zip'ed version for Windows. While the files were downloading, I read this on the alphaWorks website:
What is Visual Performance Analyzer?

Visual Performance Analyzer (VPA) is an Eclipse-based performance visualization toolkit. It consists of six major components: Profile Analyzer, Code Analyzer, Pipeline Analyzer, Counter Analyzer, Trace Analyzer, and Control Flow Analyzer.

Interesting. Eclipse-based? So do I need to install the full Eclipse environment? Apparently not. Maynard Johnson (our dependable oprofile Answer-Man) told me that Eclipse now supports a "Rich Client Platform" (RCP) which allows GUI applications to be built with Eclipse widgets without requiring the full Eclipse platform. And sure enough, the two files I downloaded from alphaWorks were both called "vpa-rcp-6.0.0*".

VPA comes with several tools included. The one I was interested in was the Profile Analyzer. I tried examples from both the Linux client and the Windows client.

These screen shot images are from the Windows client, captured using SnagIt[5]. I really wish we had this tool available on Linux... I recommend going to the TechSmith website and requesting a SnagIt port for Linux clients, especially since they're already considering one for the Mac (per their website).

Invoking the Profile Analyzer, I simply defined the connection to the server I was testing. Plenty of options. Lots of flexibility. I just wanted to see if I could connect to the system and profile a couple of seconds of it sitting there idling along. By the way, oprofile was previously installed on the system being tested - but that's easy to do as well.

Worked easily. Connected cleanly. Push-buttons to start and stop profiling. Then I could click on the output and poke around to see where things were running.

(Quick aside: for those with good eyes, the system being tested was named xbox.ltc.austin.ibm.com - it's just a normal POWER5-based system running SLES 10. The analyst who named a set of three related systems has caused no end of raised eyebrows as we report on work we've done on systems named xbox, ps3, and wii... clever... but misleading at times.)

The system under test wasn't doing anything (just idling), but I was impressed with how easy this was. In the coming weeks, we'll play with it more. But you should consider downloading VPA - it's easy to try, easy to play with, and easy to get results from.

Bill B

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
References:

[1] Martin Bligh - http://www.linuxsymposium.org/2006/view_bio.php?id=1697
[2] Autotest - http://test.kernel.org/autotest/
[3] Profilers - http://en.wikipedia.org/wiki/Profiler_%28computer_science%29
[4] Visual Performance Analyzer - http://www.alphaworks.ibm.com/tech/vpa
[5] SnagIt - http://www.techsmith.com/screen-capture.asp

Bill Buros

Bill leads an IBM Linux performance team in Austin, TX (the only place, really, to live in Texas). The team is focused on IBM's Power offerings (old, new, and future), working with IBM's Linux Technology Center (the LTC). While the focus is primarily on Power systems, the team also analyzes and improves overall Linux performance for IBM's xSeries products (both Intel and AMD), driving performance improvements which are both common for Linux and occasionally unique to the hardware offerings.

Performance analysis techniques, tools, and approaches are nicely common across Linux. Having worked for years in performance, I still get daily reminders of how much there is to learn in this space, so in many ways this blog is simply another vehicle in the continuing journey to becoming a more experienced "performance professional". One of several journeys in life.

The Usual Notice

The postings on this site are my own and don't necessarily represent IBM's positions, strategies, or opinions, try as I might to influence them.