Efficient and portable sparse solvers for heterogeneous high performance computing systems
Sparse matrix computations arise in the solution of systems of linear equations, matrix factorization, linear least-squares problems, and eigenvalue problems in numerous computational disciplines, ranging from quantum many-body problems and computational fluid dynamics to machine learning and graph analytics. The scale of the problems in these scientific applications typically necessitates execution on massively parallel architectures. Moreover, due to the irregular data access patterns and low arithmetic intensities of sparse matrix computations, achieving high performance and scalability is very difficult. These challenges are further exacerbated by the increasingly complex, deep memory hierarchies of modern architectures, which typically integrate several layers of memory storage. Data movement is an important bottleneck for both efficiency and energy consumption in large-scale sparse matrix computations. Minimizing data movement across the layers of the memory hierarchy and overlapping data movement with computation are key to achieving high performance in sparse matrix computations. My thesis work contributes towards systematically identifying the algorithmic challenges of sparse solvers and providing optimized, high-performing solutions for both shared memory and heterogeneous architectures by minimizing data movement between different memory layers. For this purpose, we first introduce a shared memory task-parallel framework that focuses on optimizing entire solvers rather than a specific kernel. As most recent (or upcoming) supercomputers are equipped with Graphics Processing Units (GPUs), we then evaluate the efficacy of directive-based programming models (i.e., OpenMP and OpenACC) in offloading computations to GPUs to achieve performance portability.
Inspired by the promising results of this work, we port our shared memory task-parallel framework to GPU-accelerated systems and optimize it to handle problem sizes that exceed device memory.
- In Collections: Electronic Theses & Dissertations
- Copyright Status: In Copyright
- Material Type: Theses
- Authors: Rabbi, Md Fazlay
- Thesis Advisors: Aktulga, Hasan Metin
- Committee Members: Kulkarni, Sandeep; O'Shea, Brian; Çatalyürek, Ümit V.
- Date Published: 2022
- Subjects: Computer science; Sparse matrices--Computer programs; Computer architecture; Heterogeneous computing
- Program of Study: Computer Science - Doctor of Philosophy
- Degree Level: Doctoral
- Language: English
- Pages: xiv, 96 pages
- ISBN: 9798438748717
- Permalink: https://doi.org/doi:10.25335/m8sp-v312