The 3 most important things in supercomputing (2014).

I recently returned from my annual pilgrimage to the Supercomputing conference (SC14 this year, of course). My takeaway this year was a list of the 3 most important things in supercomputing, or at least in exascale supercomputing, which is pretty much the only thing anyone was talking about. The 3 things I seemed to hear the most about were (1) power efficiency, (2) memory, and (3) productivity.

Little surprise that power efficiency is on the list; it’s been there for a while. With “leadership” class supercomputers already requiring tens of megawatts of power at petascale, it simply isn’t possible to scale up by another factor of 1000 by doing more of the same. While there didn’t seem to be any consensus on how to solve the problem, there was some consensus on what the problem was: “locality”. Current multicore architectures require moving the data from wherever it is, through several levels of cache, into the processing unit, then reversing the procedure for the result. Moving data takes power, lots of it. Throw in the growing heterogeneity of hybrid architectures (multicore, GPU, FPGA, etc.) and the data path gets longer and more tortuous. So getting the processor and the data in the same locality is a critical architectural issue.
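To make the locality point a little more concrete, here is a toy sketch in Python. It is my own illustration, not anything shown at the conference, and every name and parameter in it (`cache_line_loads`, `line_elems`, the 64 x 64 array, the one-line "cache") is a made-up assumption. It counts how many times a cache line has to be fetched when walking a row-major array in two different orders. It says nothing about actual watts; it only shows that the same work can move very different amounts of data.

```python
def cache_line_loads(rows, cols, order, line_elems=8):
    """Count cache-line fetches for one full pass over a row-major array.

    Toy model (assumption): the cache holds exactly one line of
    `line_elems` consecutive elements, so any access outside the
    current line costs one fetch.
    """
    if order == "row":
        # Walk along each row: consecutive accesses are adjacent in memory.
        indices = ((r, c) for r in range(rows) for c in range(cols))
    elif order == "col":
        # Walk down each column: consecutive accesses are `cols` elements apart.
        indices = ((r, c) for c in range(cols) for r in range(rows))
    else:
        raise ValueError("order must be 'row' or 'col'")

    loads = 0
    current_line = None
    for r, c in indices:
        # Row-major layout: element (r, c) lives at linear offset r * cols + c.
        line = (r * cols + c) // line_elems
        if line != current_line:
            loads += 1
            current_line = line
    return loads


if __name__ == "__main__":
    print("row order:", cache_line_loads(64, 64, "row"))
    print("col order:", cache_line_loads(64, 64, "col"))
```

In this toy model the column-order walk fetches a line on every single access, while the row-order walk fetches one line per eight elements. Real caches are larger and smarter than this, but the basic effect is the same: the access pattern, not the arithmetic, decides how much data gets moved.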

Memory is newer to the list, at least for me. Last year I learned there were multiple new memory technologies coming online in the near future. This year there was an excellent panel discussion “Future of Memory Technology for Exascale and Beyond II”, apparently a continuation of a similar panel last year, but I missed that one somehow. The panelists each reviewed the status of various memory technologies and their potential for satisfying the exascale demand. Again, there didn’t seem to be a consensus on the solution, but locality was certainly the topic, since memory and data movement consume most of the power.

Productivity is the newest on the list. Or perhaps it is new again. In any case, it appeared in several interesting venues. The panel discussion “HPC Productivity or Performance: Choose One” renewed the time-honored debate over the trade-offs between optimizing machine performance and optimizing human productivity, updated for today’s heterogeneous architectures and tomorrow’s unknown exascale future. Locality is a central issue in productivity, since orchestrating data movement is a key programming task. As architectures get increasingly complex, it is getting difficult to even describe where the data is and what path to move it along without having to know way too much about the particular machine your code is running on. This issue was addressed in an interesting Birds of a Feather session “Programming Abstractions for Data Locality”. The session reported and discussed the results of a workshop of the same name held last April in Switzerland (see http://www.padalworkshop.org). Here too there was consensus that locality was the problem, but not much confidence in what the solution might be.

So it seems each of the 3 items in my list reduces to locality. Apparently supercomputing has a lot in common with real estate: the 3 most important things are locality, locality, and locality.

3 Things I learned at SC13

I attended Supercomputing 2013 back in November. That trip was followed immediately by the Thanksgiving recess, a quick trip to Whistler, British Columbia to make a presentation at the SINBAD Consortium’s HPC/BigData Forum, then several weeks of long-anticipated vacation. When I returned in the New Year there was lots of catching up to do, so this is the first chance I’ve had to comment on what I learned at SC13.

It was a very stimulating conference, as always, but I think 3 things stood out. The first was unquestionably Micron Technology’s “Automata processor”. Like a graphics adapter, this is a special purpose accelerator for a particular class of problems. The class includes bio-sequencing applications, and the performance results they were showing at their “Emerging Technologies” booth were stunning. Micron claims the technology “is not a memory device, but it is memory based” and is a “two-dimensional fabric comprised of thousands of processing elements each programmed to perform a targeted task or operation”. Micron further claims the technology is scalable to “hundreds of thousands or even millions of tiny processors”. And this is just the first generation. One can certainly expect the “tiny” processors to get bigger and more capable with time, suggesting an entirely new architecture for HPC. Unburdened by any understanding of how this actually works or its limitations, I have no trouble imagining a gazillabyte or so of this stuff running a massive in-memory Sheaf database, rapidly answering all sorts of complex, compute-intensive topological/geometric/field queries.

The second was the panel discussion “Big Computing: From the Exa-Scale to the Sensor-Scale”, which presented a vision of a beyond-exascale world of ubiquitous sensors and controllers connected by the internet of things. I attended because “sensor swarms” are one of the known practical applications of sheaf theory (see the work of Robert Ghrist, for instance “Sheaves and Sensors”). The panel gave a glimpse of what is coming, and it seems like there ought to be some SheafSystem™ applications in there somewhere. Maybe that’s what you use the gazillascale Sheaf on Micron database for!

Finally, the third standout was the invited talk “Integration of Quantum Computing into High Performance Computing” by Colin P. Williams from D-Wave Systems, Inc. Williams’ talk combined a bit of background on quantum computing with a description of the current and coming capabilities of D-Wave’s product line. I was previously under the impression that quantum computing was something to read about in research journals and not likely to be practical for decades. But D-Wave has quantum computers you can buy now, albeit special purpose ones, as Williams described. He briefly touched on why quantum computers are so powerful and how the power apparently originates in the “many-worlds” interpretation of quantum mechanics. I’d first read about that idea years ago when I was in graduate school and was delighted to learn it actually has practical consequences. Maybe it’s time to refresh my understanding of quantum mechanics. (In my copious spare time!)

So now, “suitably pumped up” with new ideas, I can return to all the tasks we’ve planned for 2014 and look forward to SC14.

Off to Supercomputing 2013

Well, it’s that time of year again, so I’m off to the annual Supercomputing conference. Not speaking this year, but LPS will be present at the exhibition, by way of the University of Bergen, Bergen Language Design Laboratory’s exhibit, booth #4501. They will be displaying a work-in-progress poster describing our collaboration with Professor Magne Haveraaen on specification of a query language for the sheaf data model. I’ll be helping man the booth part time (schedule to be determined) so stop by and say hello!