Welcome to part 4 of my series summarizing the exascale software roadmap document. This document was produced through a series of meetings by scientists and researchers in different areas of HPC, envisioning the software stack for million-core machines. Such machines, with exascale computing power, are due within this decade. In the last two blog posts I summarized the Systems Software section, which is concerned with operating systems, run-time systems, I/O, and systems management. This blog post and the next one discuss the exascale project’s vision for the development environment, which includes interesting topics, mainly: programming models, frameworks, compilers, numerical libraries, and debugging tools. I think this section is of great importance to both computer scientists and researchers from other fields of science, since it is concerned with the tools used directly to build and implement the needed applications or algorithms. So let’s get started.

Programming Models

Original contributors of this section are: Barbara Chapman (U. of Houston), Mitsuhisa Sato (U. of Tsukuba, JP), Taisuke Boku (U. of Tsukuba, JP), Koh Hotta (Fujitsu), Matthias Mueller (TU Dresden, DE), Xuebin Chi (Chinese Academy of Sciences)

The authors believe that the following technology drivers will significantly affect programming models in this decade:

  • An increased number of nodes and an explosion in the number of cores per node, which require programming models to work at different levels of granularity.
  • Heterogeneity of processors, which makes abstracting that heterogeneity a basic task of programming models.
  • An increased number of components, which raises the likelihood of failures. Programming models should be resilient to such failures.
  • The changing nature of and trends in I/O usage, which push programming models to take the expected I/O complexities more seriously.
  • Applications’ complexity will increase dramatically. Programming models should simplify parallel programming and help developers focus on the application or algorithm implementation rather than on architecture-related concerns.
  • The increased depth of the software stack, which requires programming models to detect and report failures at the proper abstraction level.
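
To make the resilience driver a bit more concrete, here is a minimal sketch of checkpoint/restart, one common way to survive failures. This is my own illustration and not something the roadmap prescribes: the `SimulatedFailure` exception, the function name and the in-memory checkpoint are all hypothetical stand-ins for real node failures and real checkpoint storage.

```python
class SimulatedFailure(Exception):
    """Hypothetical stand-in for a node or component failure."""


def run_with_checkpoints(n_steps, checkpoint_every, fail_at=None):
    """Sum 0..n_steps-1 step by step, restarting from the last checkpoint on failure.

    fail_at is a set of step indices at which a one-time failure is injected.
    Returns (total, number_of_restarts).
    """
    pending_failures = set(fail_at or ())
    # Last saved state. A real system would write this to stable storage.
    checkpoint = {"step": 0, "total": 0}
    restarts = 0
    while True:
        # Resume from the most recent checkpoint, not from the beginning.
        step, total = checkpoint["step"], checkpoint["total"]
        try:
            while step < n_steps:
                if step in pending_failures:
                    pending_failures.discard(step)  # each failure fires once
                    raise SimulatedFailure(f"failure at step {step}")
                total += step
                step += 1
                if step % checkpoint_every == 0:
                    checkpoint = {"step": step, "total": total}
            return total, restarts
        except SimulatedFailure:
            restarts += 1  # work done since the last checkpoint is lost
```

Calling `run_with_checkpoints(100, 10, fail_at={25, 73})` still produces the correct sum; only the steps since the last checkpoint are recomputed after each injected failure.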


Based on these foreseen drivers, the following R&D alternatives are available to the community:

  • Hybrid versus uniform programming models. Hybrid models may provide better performance but are very difficult to learn and use. Uniform programming models are easier to program with; however, their abstractions may reduce performance.
  • Domain-specific versus general programming models. Domain-specific models may provide better portability and performance than general models in some application areas.
  • Widely embraced standards versus single implementations. A single implementation is faster to deliver, but widely embraced standards would provide more support for application developers.

It is very difficult to decide which of these alternatives to choose. However, it is already clear that most HPC systems will be built out of heterogeneous architectures to accelerate the compute-intensive parts of applications. This will impose the usage of hybrid programming models such as MPI and OpenMP or MPI and CUDA. According to the authors, the key to successful programming model development is to link existing models for faster and better productivity. Such integration may give the corresponding community more ideas about building a new programming model that provides a unified programming interface.
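
To illustrate the two levels of granularity that a hybrid model combines, here is a small structural sketch of my own. It is not real MPI or OpenMP code: the "nodes" are simulated with a plain loop (where MPI ranks would go) and the intra-node level uses a Python thread pool (where OpenMP threads or a CUDA kernel would go). All function names are hypothetical, and the sketch shows structure only, not performance.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Split data into `parts` contiguous chunks of near-equal size."""
    base, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        chunks.append(data[start:start + size])
        start += size
    return chunks


def node_sum_of_squares(local_data, threads_per_node):
    """Fine-grained level: threads share the data owned by one node."""
    pieces = split(local_data, threads_per_node)
    with ThreadPoolExecutor(max_workers=threads_per_node) as pool:
        partials = pool.map(lambda piece: sum(x * x for x in piece), pieces)
        return sum(partials)


def hybrid_sum_of_squares(data, nodes, threads_per_node):
    """Coarse-grained level: each simulated node owns one chunk of the data."""
    per_node = split(data, nodes)
    # Assumption: in a real MPI code each rank would compute only its own
    # chunk, and this final sum would be a collective reduction across ranks.
    node_results = [node_sum_of_squares(chunk, threads_per_node)
                    for chunk in per_node]
    return sum(node_results)
```

For example, `hybrid_sum_of_squares(list(range(10)), nodes=3, threads_per_node=2)` partitions the ten values across three simulated nodes and then across two threads inside each one. The developer has to reason about both decompositions at once, which is the learning burden the authors attribute to hybrid models.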


Frameworks

Original contributors of this section are: Michael Heroux and Robert Harrison

Frameworks should provide a common collection of interfaces, tools and capabilities that are reusable across a set of related applications. This is always a challenging task for HPC systems due to their inherent complexity. I think there is some redundancy in this section. The main technology drivers I could get from this section are:

  • New applications will be implemented on top of the exascale systems. Current frameworks should be revisited to satisfy the possible new needs.
  • Scalability and extensibility are very important factors that need reconsideration due to hybrid systems and the variability of applications.

According to the authors, we have two options in this case:

  • No Framework. In this case a single application can be developed faster. However, a lot of redundancy will exist if we adopt that option for all applications running on top of the exascale infrastructure.
  • Clean-Slate Framework. Such frameworks take time to develop, and the need for one depends on the other components of the exascale software stack. If a revolutionary option is chosen for the other components (e.g. new OS, programming model, etc.), which is less likely to occur, a new framework will be required to link all these components together.

The authors conclude by suggesting two critical directions for a proper framework that ties all the exascale software components together:

  1. Identify and develop cross-cutting algorithm and software technologies, which is relatively easy to do, based on the experiences of the last few years on the multi- and many-core architectures.
  2. Refactoring for manycore, which is doable by understanding the common requirements of manycore programming that will be true regardless of the final choice in programming models, such as load balancing, fault tolerance, etc.
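
As a rough picture of what a reusable framework layer means, here is a toy sketch of my own. The `Application` interface, the round-robin `balance` service and the two example applications are all hypothetical; the point is only that a shared concern such as load balancing is written once and reused by every application.

```python
from abc import ABC, abstractmethod


class Application(ABC):
    """Hypothetical interface an application implements to use the framework."""

    @abstractmethod
    def work_items(self):
        """Return the list of independent work items."""

    @abstractmethod
    def process(self, item):
        """Compute the result for one work item."""

    @abstractmethod
    def combine(self, results):
        """Merge per-item results into the final answer."""


def balance(items, workers):
    """Shared service: deal items round-robin so loads differ by at most one."""
    return [items[w::workers] for w in range(workers)]


def run(app, workers=4):
    """Shared driver: the same code path serves every application."""
    results = []
    for assigned in balance(app.work_items(), workers):
        # A real framework would hand each `assigned` list to a separate worker.
        results.extend(app.process(item) for item in assigned)
    return app.combine(results)


class WordCount(Application):
    def __init__(self, lines):
        self.lines = lines

    def work_items(self):
        return self.lines

    def process(self, item):
        return len(item.split())

    def combine(self, results):
        return sum(results)


class MaxNorm(Application):
    def __init__(self, vectors):
        self.vectors = vectors

    def work_items(self):
        return self.vectors

    def process(self, item):
        return max(abs(x) for x in item)

    def combine(self, results):
        return max(results)
```

Both `run(WordCount(lines))` and `run(MaxNorm(vectors))` go through the same driver. With the "No Framework" option, each application would carry its own copy of `balance` and `run`.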


Compilers

Original contributors of this section are: Barbara Chapman (U. of Houston), Mitsuhisa Sato (U. of Tsukuba, JP), Taisuke Boku (U. of Tsukuba, JP), Koh Hotta (Fujitsu), Matthias Mueller (TU Dresden), Xuebin Chi (Chinese Academy of Sciences)

Compilers are a critical component in implementing the foreseen programming models. The following technology trends might be the main drivers for compiler design and development for the exascale software stack:

  • Machines will have hybrid processors. Compilers are expected to generate code and collaborate with run-time libraries working on different types of processors at the same time.
  • Memory hierarchies will be highly complex; memory will be distributed across the nodes of exascale systems and will be NUMA within the individual nodes, with many levels of cache and possibly scratchpad memory. Compilers will be expected to generate code that exhibits high levels of locality in order to minimize the cost of memory accesses.
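
One classic locality transformation a compiler can apply is loop tiling (blocking). The sketch below is my own illustration, written by hand in Python so the restructured loop nest is easy to read; the function names and the tile size are assumptions, and Python lists will not show the cache benefit that the same transformation gives in compiled code.

```python
def transpose_naive(a):
    """Straightforward transpose: the writes to `out` stride across rows."""
    n, m = len(a), len(a[0])
    out = [[0] * n for _ in range(m)]
    for i in range(n):
        for j in range(m):
            out[j][i] = a[i][j]
    return out


def transpose_tiled(a, tile=2):
    """Same result, but the loops visit the matrix one tile-by-tile block at a time.

    Within a block, both the reads and the writes stay inside a small region,
    which is what lets compiled code reuse data already in cache.
    """
    n, m = len(a), len(a[0])
    out = [[0] * n for _ in range(m)]
    for ii in range(0, n, tile):          # block row
        for jj in range(0, m, tile):      # block column
            for i in range(ii, min(ii + tile, n)):
                for j in range(jj, min(jj + tile, m)):
                    out[j][i] = a[i][j]
    return out
```

The two functions return identical results for any rectangular matrix; only the order in which elements are visited changes. Picking a good `tile` for a given cache hierarchy is exactly the kind of architecture-aware decision the authors expect from exascale compilers.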

The authors of this section apply the same R&D alternatives to compilers as they did to programming models. They therefore propose the following research points for compilers (I’m including only the important ones):

  • Techniques for the translation of new exascale programming models and languages supporting high productivity and performance, including support for hybrid programming models and for programming models that span heterogeneous systems.
  • Powerful optimization frameworks; implementing parallel program analyses and new, architecture-aware optimizations, including power, will be key to the efficient translation of exascale programs.
  • Exascale compilers could benefit from recent experiences with just-in-time compilation and perform online feedback-based optimizations, try out different optimizations, generate multiple code versions or perform more aggressive speculative optimizations.
  • Implement efficient techniques for fault tolerance.
  • Compilers should interact with the development tools and the run-time environment, for example to instrument code automatically for those tools.
  • Compilers may be able to benefit from auto-tuning approaches, may incorporate techniques for learning from prior experiences, exploit knowledge on suitable optimization strategies that is gained from the development and execution environments, and apply novel techniques that complement traditional translation strategies.
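
The ideas of generating multiple code versions and of auto-tuning can be sketched in a few lines. This is a toy of my own, not how any particular compiler does it: the three variants, the `VARIANTS` table and the `autotune` function are hypothetical, and the selection is purely empirical (time each version on a sample input and keep the fastest).

```python
import time


def sum_squares_loop(values):
    total = 0
    for v in values:
        total += v * v
    return total


def sum_squares_generator(values):
    return sum(v * v for v in values)


def sum_squares_map(values):
    return sum(map(lambda v: v * v, values))


# Several functionally equivalent versions of the same kernel.
VARIANTS = {
    "loop": sum_squares_loop,
    "generator": sum_squares_generator,
    "map": sum_squares_map,
}


def autotune(variants, sample, repeats=3):
    """Time every variant on a sample input; return (best_name, timings)."""
    timings = {}
    for name, func in variants.items():
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            func(sample)
            best = min(best, time.perf_counter() - start)
        timings[name] = best  # keep the best of several runs to reduce noise
    best_name = min(timings, key=timings.get)
    return best_name, timings
```

A typical use is `best, timings = autotune(VARIANTS, list(range(100000)))`, followed by calling `VARIANTS[best]` on the real data. Which variant wins depends on the machine and the interpreter version, which is the reason for measuring instead of guessing.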

Next Time

My next blog post will cover two important subsections: numerical libraries and debugging tools.


This posting is part of a series summarizing the roadmap document of the Exascale Software Project: