Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Return sparse matrix instead of dense matrix from adjacency matrix functions #1081

Open
sambaPython24 opened this issue Feb 7, 2024 · 3 comments
Labels
enhancement New feature or request hard

Comments

@sambaPython24
Copy link

sambaPython24 commented Feb 7, 2024

networkx gives adjacency matrices as sparse matrices and that is necessary, since for larger graphs with >100000 nodes with sparse connections, we can have a sparse e.g. coo matrix but a dense matrix does usually not fit into the memory.

I suggest, you give the option to return either a sparse (coo) matrix or its values and coordinates (as an option).


Comparison networkx

@sambaPython24 sambaPython24 changed the title Return sparse matrix instead of dense matrix from adjoint_matrix functions Return sparse matrix instead of dense matrix from adjacency matrix functions Feb 7, 2024
@IvanIsCoding
Copy link
Collaborator

I think converting the graphs into a sparse matrix would be really helpful. But on the Rust side sparsemat/sprs#289 is going to be a big blocker.

We could also try implementing types that are a read-only CSC/CSR matrix from Rust, but I am not sure how useful that is for users

@IvanIsCoding IvanIsCoding added enhancement New feature or request hard labels Feb 8, 2024
@mtreinish
Copy link
Member

mtreinish commented Feb 8, 2024

At least in this case since we just need to generate a sparse matrix. I think this one isn't actually super hard, albeit not as potentially optimized as possible. We should be able construct 1d arrays for each of the inputs like (row, col, data) or (indptr, indices, data) as readonly numpy arrays directly from rust fairly efficiently by iterating over the graph and then it's just a matter of calling the appropriate scipy python side sparse matrix constructor with those arrays we generate.

The thing I'm mostly hesitant about is adding a dependency on scipy to the package, as it's fairly heavy and we're not actually going to be using it for anything inside rustworkx, just for interop with sparse matrices which is a fairly small part of scipy. I wonder if we had a function like rustworkx.sparse_adjacency_matrix(graph: PyGraph) -> (PyArray, PyArray, PyArray) or something like that if it would be enough. Then as an end user you would call:

scipy.sparse.csc_matrix(rustworkx.sparse_adjacency_matrix(graph))

Although if we did add a scipy dependency (including an optional one) we can probably do this call directly in rust via pyo3 fairly easily.

@sambaPython24
Copy link
Author

sambaPython24 commented Feb 10, 2024

@mtreinish, @IvanIsCoding
Thank you for your fast response! From the user side, it would be certainly enough to just give out (row, col, data) as a dict or a tuple,
since it is very easy to construct sparse matrices or transform there layout in scipy or other, even more optimized python libraries.

Unless you do not want to re-write scipy and its sparse matrix operations in rust as well (which would be amazing but probably a bit of an overkill), adding functionality or its own sparse format does not seem to be necessary.

The implemention of the COO matrix is probably just a loop over the edges and instead of filling the dense matrix, appending it to a list. The reason, I did not do that directly in python is, that in Rust it must be a lot faster which matters for the mentioned, large matrices.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request hard
Projects
None yet
Development

No branches or pull requests

3 participants