Concurrent Number Cruncher: An Efficient Sparse Linear Solver on
the GPU
Authors
Luc Buatois and Guillaume Caumon and Bruno Lévy
Journal/Article/Conference
High Performance Computation Conference (HPCC-07), http://www.tlc2.uh.edu/hpcc07/
Abstract
A wide class of geometry processing and PDE resolution methods
needs to solve a linear system, where the non-zero pattern of the matrix is dictated
by the connectivity matrix of the mesh. The advent of GPUs with their
ever-growing amount of parallel horsepower makes them a tempting resource for
such numerical computations. This can be helped by new APIs (CTM from ATI
and CUDA from NVIDIA) which give direct access to the multithreaded computational
resources and associated memory bandwidth of GPUs; CUDA even
provides a BLAS implementation but only for dense matrices (CuBLAS).
However, existing GPU linear solvers are restricted to specific types of matrices,
or use non-optimal compressed row storage strategies. By combining recent
GPU programming techniques with supercomputing strategies (namely block
compressed row storage and register blocking), we implement a sparse general-purpose
linear solver which outperforms leading-edge CPU counterparts (MKL /
ACML).
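The block compressed row storage (BCRS) layout named in the abstract can be illustrated with a small CPU-side sketch. This is not the authors' GPU implementation: the 2x2 block size, the `BCRSMatrix` class, and the helper names `bcrs_from_dense` and `bcrs_spmv` are all assumptions chosen for illustration. The per-block-row accumulators held in local variables only mimic the idea behind register blocking.

```python
from dataclasses import dataclass
from typing import List

BLOCK = 2  # assumed block edge length (2x2 blocks), for illustration only


@dataclass
class BCRSMatrix:
    """Hypothetical container for a square matrix stored in BCRS form."""
    n_block_rows: int
    row_ptr: List[int]    # blocks of block row i are row_ptr[i] .. row_ptr[i+1]-1
    col_idx: List[int]    # block column index of each stored block
    values: List[float]   # BLOCK*BLOCK values per stored block, row-major


def bcrs_from_dense(dense: List[List[float]]) -> BCRSMatrix:
    """Build a BCRS matrix, keeping only blocks that contain a non-zero."""
    n = len(dense)
    assert n % BLOCK == 0 and all(len(row) == n for row in dense)
    nb = n // BLOCK
    row_ptr, col_idx, values = [0], [], []
    for bi in range(nb):
        for bj in range(nb):
            block = [dense[bi * BLOCK + r][bj * BLOCK + c]
                     for r in range(BLOCK) for c in range(BLOCK)]
            if any(v != 0 for v in block):
                col_idx.append(bj)
                values.extend(block)
        row_ptr.append(len(col_idx))
    return BCRSMatrix(nb, row_ptr, col_idx, values)


def bcrs_spmv(a: BCRSMatrix, x: List[float]) -> List[float]:
    """Compute y = A * x for a BCRS matrix with 2x2 blocks."""
    y = [0.0] * (a.n_block_rows * BLOCK)
    for bi in range(a.n_block_rows):
        # Accumulate the two output entries of this block row in locals,
        # mimicking the register-blocking idea.
        y0 = y1 = 0.0
        for k in range(a.row_ptr[bi], a.row_ptr[bi + 1]):
            bj = a.col_idx[k]
            # One column index fetch serves four matrix values.
            x0, x1 = x[bj * BLOCK], x[bj * BLOCK + 1]
            v = a.values[k * BLOCK * BLOCK:(k + 1) * BLOCK * BLOCK]
            y0 += v[0] * x0 + v[1] * x1
            y1 += v[2] * x0 + v[3] * x1
        y[bi * BLOCK] = y0
        y[bi * BLOCK + 1] = y1
    return y
```

Compared with plain compressed row storage, each stored column index here covers a whole block of values, which reduces index traffic per floating-point operation. That saving is the usual motivation for blocking when the matrix has dense sub-blocks.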
BibTeX Reference
@INPROCEEDINGS{Buatois07,
  author    = {Luc Buatois and Guillaume Caumon and Bruno Lévy},
  year      = {2007},
  title     = {Concurrent Number Cruncher: An Efficient Sparse Linear Solver on the GPU},
  editor    = {R. Perrott et al.},
  booktitle = {High Performance Computation Conference (HPCC-07), http://www.tlc2.uh.edu/hpcc07/},
  publisher = {Springer},
  series    = {Lecture Notes in Computer Science 4782},
  volume    = {4782},
  pages     = {358--371},
  note      = {Texas instrument Student paper award},
  abstract  = {}
}
