Abstract
The CUDA model for graphics processing units (GPUs) presents the programmer with a plethora of programming options, including different memory types, memory access methods and data types. Identifying which options to use, and when, is a non-trivial exercise. This paper explores the effect of these options on the performance of a routine that evaluates sparse matrix-vector products (SpMV) across three generations of NVIDIA GPU hardware. A process for analysing performance and selecting the subset of implementations that perform best is proposed. The potential for mapping sparse matrix attributes to optimal CUDA SpMV implementations is discussed.
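The paper's CUDA kernels are not reproduced on this page. As background only, a minimal sketch of the SpMV computation the paper evaluates, using the standard compressed sparse row (CSR) layout (the example matrix and function name are illustrative, not taken from the paper):

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A*x for a sparse matrix A stored in CSR format."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        # A CUDA kernel typically maps this loop body to one thread
        # (scalar kernel) or one warp (vector kernel) per row; the choice
        # is one of the implementation options whose performance varies
        # across matrices and hardware generations.
        for j in range(row_ptr[row], row_ptr[row + 1]):
            y[row] += values[j] * x[col_idx[j]]
    return y

# Hypothetical 3x3 example: A = [[4, 0, 1], [0, 2, 0], [3, 0, 5]]
row_ptr = [0, 2, 3, 5]          # start of each row's nonzeros
col_idx = [0, 2, 1, 0, 2]       # column index of each nonzero
values  = [4.0, 1.0, 2.0, 3.0, 5.0]
print(spmv_csr(row_ptr, col_idx, values, [1.0, 1.0, 1.0]))  # → [5.0, 2.0, 8.0]
```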
| Original language | English |
|---|---|
| Pages (from-to) | 3-13 |
| Number of pages | 11 |
| Journal | Concurrency and Computation: Practice and Experience |
| Volume | 24 |
| Issue number | 1 |
| DOIs | |
| Publication status | Published - Jan 2012 |
| Externally published | Yes |
Keywords
- CUDA
- Fermi
- GPU
- matrix-vector
- NVIDIA
- S2050
- sparse
Title
Generating optimal CUDA sparse matrix-vector product implementations for evolving GPU hardware