XSEDE16 has ended

Accelerating Discovery
Wednesday, July 20
 

3:30pm EDT

AD: Tools for studying populations and timeseries of neuroanatomy enabled through GPU acceleration in the Computational Anatomy Gateway
The Computational Anatomy Gateway is a software-as-a-service tool for medical imaging researchers to quantify changes in anatomical structures over time and through the progression of disease. GPU acceleration on the Stampede cluster has enabled the development of new tools that combine the advantages of grid-based and particle-based methods for describing fluid flows, and that scale analysis up from single scans to populations and time series. We describe algorithms for estimating average anatomies and for quantifying atrophy rate over time. We report code performance on datasets of different sizes, revealing that the number of vertices in a triangulated surface presents a bottleneck to our computation. We show results on an example dataset, quantifying atrophy in the entorhinal cortex, a medial temporal lobe brain region whose structure is sensitive to changes in early Alzheimer's disease.


Wednesday July 20, 2016 3:30pm - 4:00pm EDT
Chopin Ballroom

4:00pm EDT

AD: Delayed Update Algorithms for Quantum Monte Carlo Simulation on GPU
QMCPACK is open-source scientific software designed to perform Quantum Monte Carlo simulation, a first-principles method for describing many-fermion systems. The evaluation of each Monte Carlo move requires finding the determinant of a dense matrix of wave functions. This calculation forms a key computational kernel in QMCPACK. After each accepted event, the wave function matrix undergoes a rank-one update to represent a single particle move within the system. The Sherman-Morrison formula is used to update the matrix inverse. Occasionally, the explicit inverse must be recomputed to maintain numerical stability. An alternative approach to this kernel uses QR factorization to maintain stability without re-factorization.
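As an illustration of the rank-one kernel described above, here is a minimal pure-Python sketch. It assumes a single particle move replaces row k of the wave function matrix A. The function names `det_ratio` and `sherman_morrison_row_update` are hypothetical and are not taken from QMCPACK.

```python
def det_ratio(ainv, new_row, k):
    """Return det(A') / det(A), where A' is A with row k replaced by new_row.

    Only column k of the current inverse is needed, which is why the
    acceptance probability of a proposed move is cheap to evaluate.
    """
    return sum(new_row[j] * ainv[j][k] for j in range(len(new_row)))


def sherman_morrison_row_update(ainv, new_row, k):
    """Return the inverse of A' given the inverse of A (Sherman-Morrison).

    A' = A + e_k u^T with u = new_row - old_row. Because old_row^T A^{-1}
    equals e_k^T, the update can be written without knowing old_row.
    """
    n = len(ainv)
    ratio = det_ratio(ainv, new_row, k)
    # w = new_row^T A^{-1}; note that w[k] equals the determinant ratio.
    w = [sum(new_row[i] * ainv[i][j] for i in range(n)) for j in range(n)]
    col_k = [ainv[i][k] for i in range(n)]
    return [
        [ainv[i][j] - col_k[i] * (w[j] - (1.0 if j == k else 0.0)) / ratio
         for j in range(n)]
        for i in range(n)
    ]
```

For example, with A = [[2, 1], [1, 3]] and row 0 replaced by [4, 1], `det_ratio` returns 11/5, the ratio of the two determinants (11 and 5).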

 

Algorithms based on a novel delayed update scheme are explored in this effort. This strategy involves calculating probabilities for multiple successive Monte Carlo moves and delaying their application to the matrix of wave functions until an event is denied or a predetermined limit of acceptances is reached. Updates grouped in this manner are then applied to the matrix en bloc to achieve enhanced computational intensity.
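The block application step can be sketched with the Woodbury identity, the multi-row generalization of Sherman-Morrison. This is a minimal pure-Python illustration, not the algorithm presented in the talk: it assumes each pending move replaces a distinct row, it omits the evaluation of acceptance probabilities against pending moves, and the names `matmul`, `solve`, and `apply_delayed_rows` are hypothetical.

```python
def matmul(a, b):
    """Dense matrix product of two lists of rows."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def solve(m, rhs):
    """Solve m x = rhs (rhs is a matrix) by Gauss-Jordan with partial pivoting."""
    n = len(m)
    aug = [list(m[i]) + list(rhs[i]) for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [x / piv for x in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def apply_delayed_rows(ainv, pending):
    """Apply several accepted row replacements to the inverse in one block.

    pending maps a row index to its new row. With W = (new rows) A^{-1},
    the small matrix S = I + V^T A^{-1} U reduces to the columns of W at
    the replaced row indices, and
        A'^{-1} = A^{-1} - (A^{-1} U) S^{-1} (W - E),
    where U selects the replaced rows and E holds the matching unit rows.
    """
    n = len(ainv)
    rows = sorted(pending)
    m = len(rows)
    w = matmul([pending[k] for k in rows], ainv)            # shape (m, n)
    s = [[w[i][k] for k in rows] for i in range(m)]         # shape (m, m)
    b = [[w[i][j] - (1.0 if j == rows[i] else 0.0) for j in range(n)]
         for i in range(m)]                                 # W - E
    x = solve(s, b)                                         # S^{-1} (W - E)
    cols = [[ainv[i][k] for k in rows] for i in range(n)]   # A^{-1} U
    corr = matmul(cols, x)
    return [[ainv[i][j] - corr[i][j] for j in range(n)] for i in range(n)]
```

Grouping the moves turns a sequence of rank-one updates (matrix-vector work) into matrix-matrix products, which is the source of the higher computational intensity mentioned above.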

 

GPU-accelerated delayed update algorithms are tested and profiled for both Sherman-Morrison and QR-based probability evaluation kernels. Results are evaluated against existing methods for numerical stability and efficiency; emphasis is placed on large systems, where acceleration is critical.


Wednesday July 20, 2016 4:00pm - 4:30pm EDT
Chopin Ballroom

4:30pm EDT

AD: Efficient Primitives for Standard Tensor Linear Algebra
This paper presents the design and implementation of a low-level library to compute general sums and products over multi-dimensional arrays (tensors). Using only three low-level functions, the API at once generalizes core BLAS1-3 and eliminates the need for most tensor transpositions. Despite their relatively low operation count, we show that these transposition steps can become performance limiting in typical use cases for BLAS on tensors. The present API achieves peak performance on the same order of magnitude (teraflops) as vendor-optimized GEMM by using a code generator to output CUDA source code for all computational kernels. The outline for these kernels is a multi-dimensional generalization of the MAGMA BLAS matrix multiplication on GPUs. Separate transposition steps can be skipped because every kernel allows arbitrary multi-dimensional transpositions of its arguments. The library, including its methodology and programming techniques, is made available in SLACK. Future improvements to the library include a high-level interface to translate directly from a LaTeX-like equation syntax to a data-parallel computation.
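The idea of folding transpositions into the contraction itself can be illustrated with a small pure-Python sketch. It is a reference implementation for clarity only, not the library's API or its generated CUDA kernels: the function `contract`, its argument order, and the index-string convention are all assumptions made for this example.

```python
from itertools import product
from math import prod


def strides(shape):
    """Row-major strides for a dense array of the given shape."""
    out, step = [], 1
    for extent in reversed(shape):
        out.append(step)
        step *= extent
    return out[::-1]


def contract(a, a_idx, a_shape, b, b_idx, b_shape, c_idx):
    """Compute C[c_idx] = sum of A[a_idx] * B[b_idx] over indices not in c_idx.

    a and b are flat row-major lists; each index string names the axes in
    storage order. Any permutation of axes is handled through the address
    arithmetic, so no operand is ever physically transposed.
    """
    extent = dict(zip(a_idx, a_shape))
    extent.update(zip(b_idx, b_shape))
    summed = [x for x in extent if x not in c_idx]
    order = list(c_idx) + summed
    sa, sb = strides(a_shape), strides(b_shape)
    c_shape = [extent[x] for x in c_idx]
    sc = strides(c_shape)
    c = [0.0] * prod(c_shape)
    for point in product(*(range(extent[x]) for x in order)):
        pos = dict(zip(order, point))
        ia = sum(pos[x] * s for x, s in zip(a_idx, sa))
        ib = sum(pos[x] * s for x, s in zip(b_idx, sb))
        ic = sum(pos[x] * s for x, s in zip(c_idx, sc))
        c[ic] += a[ia] * b[ib]
    return c, c_shape
```

Calling `contract(A, "ik", [2, 3], B, "kj", [3, 2], "ji")` produces the transpose of the matrix product directly; changing the output string to `"ij"` gives the untransposed product with the same inner loop.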

Wednesday July 20, 2016 4:30pm - 5:00pm EDT
Chopin Ballroom