
Equivalent of cuSparse - start with sparse matvec #208

Open
ViralBShah opened this issue Jun 19, 2023 · 7 comments
Labels
enhancement New feature or request libraries Things about libraries and how we use them.

Comments

@ViralBShah
Contributor

I believe there is currently no sparse matrix capability in Metal.jl. What is the easiest way to get some basic things working?

Perhaps a bigger question is whether we can have a generic sparse matrix implementation that can work on all our GPU backends.

@maleadt
Member

maleadt commented Jun 19, 2023

Metal's performance shaders library does not support sparse arrays. Apple Accelerate does, but that's for the CPU. Maybe that's good enough, though (with the memory being unified anyway)?

A generic implementation would be nice, but I don't have much experience with sparse algorithms. What operations would be important?

@ViralBShah
Contributor Author

There is a FixedSparseCSC type that would be a good starting point. A reasonable roadmap might be:

  1. A COO matrix format
  2. The ability to convert back and forth from FixedSparseCSC
  3. Matrix-vector and matrix-matrix multiplication kernels
  4. Broadcast
  5. Reductions and scans
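As a rough illustration of items 1 and 2, a minimal COO container and a CSC-to-COO conversion might look like the sketch below. The `SparseMatrixCOO` type and `coo_from_csc` name are hypothetical, not an existing API:

```julia
using SparseArrays

# Hypothetical minimal COO container; names are illustrative only.
struct SparseMatrixCOO{Tv,Ti}
    m::Int
    n::Int
    rows::Vector{Ti}
    cols::Vector{Ti}
    vals::Vector{Tv}
end

# Convert a CSC matrix (the SparseArrays default) to COO.
# The row indices and values carry over unchanged; the column index of
# each stored entry is recovered by walking the column pointer ranges.
function coo_from_csc(A::SparseMatrixCSC{Tv,Ti}) where {Tv,Ti}
    rows = rowvals(A)
    vals = nonzeros(A)
    cols = similar(rows)
    for j in 1:size(A, 2), k in nzrange(A, j)
        cols[k] = j
    end
    return SparseMatrixCOO{Tv,Ti}(size(A)..., copy(rows), cols, copy(vals))
end
```

COO is attractive as a GPU interchange format because the three arrays can be processed one-thread-per-stored-entry, with no pointer structure to traverse.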

@maleadt
Member

maleadt commented Jun 22, 2023

There is a FixedSparseCSC type that would be a good starting point.

What makes FixedSparseCSC (as opposed to the normal CSC array type) better suited for GPU acceleration? Generally the problem is how to parallelize (as there are fewer opportunities; naively, only over a single dimension).
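To make the single-dimension point concrete, here is a serial sketch of a CSC matvec; the comments mark where a GPU port would parallelize, and why column-parallel CSC needs atomic updates. The function name is illustrative:

```julia
using SparseArrays

# Serial CSC SpMV: y = A*x, iterating stored values column by column.
function csc_matvec!(y, A::SparseMatrixCSC, x)
    fill!(y, zero(eltype(y)))
    rows = rowvals(A)
    vals = nonzeros(A)
    for j in 1:size(A, 2)          # on a GPU: one thread per column
        for k in nzrange(A, j)
            # Different columns can hit the same output row, so on a GPU
            # this scatter must be an atomic add. CSR parallelized over
            # rows avoids the atomics at the cost of gathering from x.
            y[rows[k]] += vals[k] * x[j]
        end
    end
    return y
end
```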

@ViralBShah
Contributor Author

ViralBShah commented Jun 22, 2023

I guess I am trying to figure out what the right programming model to keep in mind here would be. Getting a fast sparse matvec (and getting Conjugate Gradient working), followed by a fast matmul, would be a good starting point to explore what is possible.

I'll experiment with a few things and see how far I can get.
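For reference, conjugate gradient only ever touches the matrix through matvec products, which is why a fast sparse matvec is the single kernel it needs. A minimal textbook CG written against a `matvec` closure; the names are illustrative, not IterativeSolvers' API:

```julia
using LinearAlgebra

# Unpreconditioned conjugate gradient for SPD systems.
# `matvec` is any callable computing p -> A*p, e.g. a sparse SpMV kernel.
function simple_cg(matvec, b; tol=1e-8, maxiter=length(b))
    x = zero(b)
    r = copy(b)        # residual for the zero initial guess
    p = copy(r)
    rs = dot(r, r)
    for _ in 1:maxiter
        Ap = matvec(p)                 # the only matrix operation
        α = rs / dot(p, Ap)
        x .+= α .* p
        r .-= α .* Ap
        rs_new = dot(r, r)
        sqrt(rs_new) < tol && break
        p .= r .+ (rs_new / rs) .* p
        rs = rs_new
    end
    return x
end
```

Swapping in a GPU sparse matvec for the closure is all it takes to run this on device arrays, as long as the vector operations (`dot`, broadcasts) already work there.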

@maleadt
Member

maleadt commented Jun 22, 2023

There are some native kernels I wrote in CUDA.jl, https://github.com/JuliaGPU/CUDA.jl/blob/master/lib/cusparse/broadcast.jl, which use row/column iterators that 'zip' the multiple inputs. Thus, they parallelize across one dimension of the sparse input.

Multiplication is much more difficult, though, as there isn't a straightforward dimension to parallelize over. (The crux of the issue is that we cannot have an efficient getindex to use on sparse inputs in a kernel; we need threads to iterate the row/column values.)
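The getindex point can be made concrete: random access into a CSC matrix is a binary search over one column's stored row indices, so a dense-style kernel would pay that cost per element. A sketch of what scalar indexing does under the hood (the function name is illustrative):

```julia
using SparseArrays

# What A[i, j] costs on a CSC matrix: a binary search over the column's
# stored row indices. This is why sparse kernels iterate stored values
# instead of doing random access.
function csc_getindex(A::SparseMatrixCSC, i::Int, j::Int)
    rows = rowvals(A)
    vals = nonzeros(A)
    r = nzrange(A, j)
    k = searchsortedfirst(rows, i, first(r), last(r), Base.Order.Forward)
    return (k <= last(r) && rows[k] == i) ? vals[k] : zero(eltype(A))
end
```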

@maleadt
Member

maleadt commented Jun 22, 2023

Also note that those CUDA.jl kernels are ideally suited to be ported to GPUArrays using KA.jl, once we start doing that, as they don't use any advanced CUDA features.

@ViralBShah ViralBShah changed the title Equivalent of cuSparse Equivalent of cuSparse - start with sparse matvec Jul 29, 2023
@ViralBShah
Contributor Author

ViralBShah commented Jul 29, 2023

From IterativeSolvers, cg(a, b) works when a is a dense matrix and b is a dense vector. Having a working cg for a sparse matrix would be interesting, since it would open the door to various iterative solvers on GPUs with low effort. The main operation needed for that is a sparse matvec.

After that, a sparse matmul is valuable.

Since CUDA (cuSPARSE) primarily uses CSR, perhaps we could just use that for Metal.jl as well.
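For reference, CSR matvec parallelizes naturally with one row per thread and no atomics, since each thread owns exactly one output element. A serial sketch over raw CSR arrays; the `rowptr`/`colval`/`nzval` layout follows the usual CSR convention, and the function name is illustrative:

```julia
# Serial CSR SpMV: y = A*x over raw CSR arrays.
# rowptr has length(y)+1 entries; row i's stored values live at
# rowptr[i]:(rowptr[i+1]-1). On a GPU, the outer loop maps to one thread
# (or warp) per row: each thread gathers from x and writes y[i] once.
function csr_matvec!(y, rowptr, colval, nzval, x)
    for i in 1:length(y)           # on a GPU: one thread per row
        acc = zero(eltype(y))
        for k in rowptr[i]:(rowptr[i+1] - 1)
            acc += nzval[k] * x[colval[k]]
        end
        y[i] = acc                 # single uncontended write, no atomics
    end
    return y
end
```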

@maleadt maleadt added enhancement New feature or request libraries Things about libraries and how we use them. labels Feb 28, 2024