Optimizing Compilers for High-Performance Computing
Dr. Louis-Noël Pouchet
Assistant Professor, Colorado State University
Applications running on clusters of shared-memory computers are often implemented using OpenMP+MPI. Productivity can be vastly improved using task-based programming, a paradigm where the user expresses the data and control-flow relations between tasks, offering the runtime maximal freedom to place and schedule tasks. While productivity is increased, high-performance execution remains challenging: the implementation of parallel algorithms typically requires specific task placement and communication strategies to reduce inter-node communications and exploit data locality. Furthermore, pattern-specific and target-specific code optimizations are often needed to achieve high-performance execution of task bodies.
In this talk, we present a new macro-dataflow programming environment for distributed-memory clusters, based on the Intel Concurrent Collections (CnC) runtime. Our language extensions let the user define virtual topologies, task mappings, task-centric data placement, and task and communication scheduling, among others. We introduce a compiler that automatically generates C++ code for the Intel CnC runtime, with key automatic optimizations including task coarsening and coalescing. We also present several pattern-specific optimization scenarios, including our latest results on optimizing stencil computations on regular grids.
Dr. Louis-Noël Pouchet is an Assistant Professor at Colorado State University. He works on pattern-specific languages and compilers for scientific computing, and has designed numerous approaches that use optimizing compilation to map applications effectively to CPUs, FPGAs, and SoCs. His work spans a variety of domains, including compiler optimization, hardware synthesis, machine learning, programming languages, and distributed computing. His research is currently funded by the National Science Foundation, the Department of Energy, and Intel. Previously, he was a Visiting Assistant Professor (2012-2014) at the University of California, Los Angeles, where he was a member of the NSF Center for Domain-Specific Computing, working on both software and hardware customization. He is the author of the PolyOpt and PoCC compilers, and of the PolyBench/C benchmarking suite.
Thursday, September 29, 2016
10:00 – 11:00 am
NCAR Mesa Lab, Main Seminar Room