Add capacity keyword arguments to graph creation #975

jakelishman · 2023-08-29T11:00:34Z

This is just a simple exposure of the allocation control from Rust space to Python space.

I was playing around with this for DAG construction in Qiskit and it (surprisingly to me) didn't have a meaningful effect on performance for what I was doing, but even so, it just feels like something I'd like to be able to expose there as well.

coveralls · 2023-08-29T11:27:46Z

Pull Request Test Coverage Report for Build 9098666721

Details

40 of 40 (100.0%) changed or added relevant lines in 2 files are covered.
No unchanged relevant lines lost coverage.
Overall coverage increased (+0.05%) to 96.496%

Totals
Change from base Build 9009838732:	0.05%
Covered Lines:	16717
Relevant Lines:	17324

💛 - Coveralls

IvanIsCoding · 2023-08-29T22:22:04Z

I’ll let @mtreinish give his opinion as well but I am against exposing this to end-users.

Mostly because Python users should not worry about it. For example, Python lists don’t let you pick a capacity. I think we are leaking a detail from Rust into Python

Also, I think for most of the Qiskit DAG code we spent more time on Rust <-> Python conversion than anything else. For some algorithms it takes more time to convert objects than to actually run the core logic

jakelishman · 2023-08-30T10:12:33Z

That's fair - it's always going to be the case that the other stuff we're doing is slower, it just seemed unnecessary to me to pay the resize cost even as a papercut when I know how many nodes and edges I'm about to generate to put into my graph.

IvanIsCoding · 2023-08-30T12:19:33Z

That's fair - it's always going to be the case that the other stuff we're doing is slower, it just seemed unnecessary to me to pay the resize cost even as a papercut when I know how many nodes and edges I'm about to generate to put into my graph.

I could see a constructor for PyGraph/PyDiGraph which accepts nodes/edges, a little bit similar to what extend_from_edge_list does now

But if we add with_capacity then we need to explain to the end user what is a capacity, why it is different from the number of nodes/edges, etc. Which I would rather not do

jakelishman · 2023-08-30T12:28:28Z

I have a generator expression (or equivalent opaque iterable) in Python-space (technically it's actually generated in Rust space, but we need to go via DAGCircuit, so that's irrelevant here) whose length I know ahead of time, but I don't actually want to generate Python-space allocated storage for the collection ahead of time, especially since that would then have copy semantics on entry to extend_from_edge_list.

Feel free to close the PR if you don't want to deal with explaining capacity to users - I have no proof that it'd have any measurable impact right now anyway.

mtreinish · 2023-08-30T12:59:28Z

For me I think we probably want to expose this functionality in some way for performance. Saving on memory allocations can save a bunch of runtime especially for very large graphs. But, I also share Ivan's concerns around exposing too much internal details to the UI. I'm wondering if there is a compromise here on the interface, like instead of doing with_capacity adding kwargs to the constructor like target_node_count and target_edge_count and document the performance benefit and behavior there

jakelishman · 2023-08-30T13:02:20Z

I'm happy to retarget this PR to whatever interface you guys would prefer, if it's something you'd consider adding. From my side, my ask is only that I can reserve space for nodes and/or edges before I actually provide them - anything that lets me do that, no matter how wordy / hidden, is fine for me, if you guys agree to add it.

(and I can of course live with it on our side if it's not added at all, for the above reasons)

enavarro51 · 2023-11-01T16:16:11Z

What about initial_node_count and initial_edge_count? That indicates to a Python user that it's not a limit, but a starting point. I do think it's probably worthwhile, especially since we now have a lot of hybrid Python/Rust developers on the qiskit side and probably elsewhere as well.

Or could call the method initial_size and just use node_count and edge_count. Or ...

This adds `initial_node_count` and `initial_edge_count` optional keyword arguments to the Python-space constructors of `PyGraph` and `PyDiGraph` to expose the `with_capacity` constructor from Rust space. This also easily allows these to be used during pickling operations.

jakelishman · 2024-05-15T15:14:17Z

Updated to use the new forms suggested above.

alexanderivrii · 2024-05-23T10:20:02Z

src/graph.rs

+                ("initial_node_count", self.graph.node_count()),
+                ("initial_edge_count", self.graph.edge_bound()),


Silly question: why do you use node_count and edge_bound (and not say node_bound and edge_bound, or node_count and edge_count)?

Also, do I understand correctly that the purpose of this function is to speed up unpickling of rustworkx graphs (by passing the initial_node_count and initial_edge_count arguments to __new__)?

alexanderivrii · 2024-05-23T10:22:49Z

FWIW, I also like the ability to specify the number of expected nodes/edges when creating rustworkx graphs, so I am in favor of this PR.

mtreinish added this to the 0.15.0 milestone Feb 22, 2024

jakelishman force-pushed the add-graph-with-capacity branch from eaebe80 to 9b0c146 Compare May 15, 2024 15:13

jakelishman changed the title ~~Add with_capacity constructors in Python space~~ Add capacity keyword arguments to graph creation May 15, 2024

jakelishman added 3 commits May 15, 2024 16:17

Correct release note

cbbecea

Run formatting

55ee436

Add private methods to stubs

8660eea

alexanderivrii reviewed May 23, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add capacity keyword arguments to graph creation #975

Add capacity keyword arguments to graph creation #975

jakelishman commented Aug 29, 2023

coveralls commented Aug 29, 2023 •

edited

IvanIsCoding commented Aug 29, 2023

jakelishman commented Aug 30, 2023

IvanIsCoding commented Aug 30, 2023

jakelishman commented Aug 30, 2023

mtreinish commented Aug 30, 2023 •

edited

jakelishman commented Aug 30, 2023 •

edited

enavarro51 commented Nov 1, 2023 •

edited

jakelishman commented May 15, 2024

alexanderivrii May 23, 2024

alexanderivrii May 23, 2024

alexanderivrii commented May 23, 2024

		("initial_node_count", self.graph.node_count()),
		("initial_edge_count", self.graph.edge_bound()),

Add capacity keyword arguments to graph creation #975

Are you sure you want to change the base?

Add capacity keyword arguments to graph creation #975

Conversation

jakelishman commented Aug 29, 2023

coveralls commented Aug 29, 2023 • edited

Pull Request Test Coverage Report for Build 9098666721

Details

💛 - Coveralls

IvanIsCoding commented Aug 29, 2023

jakelishman commented Aug 30, 2023

IvanIsCoding commented Aug 30, 2023

jakelishman commented Aug 30, 2023

mtreinish commented Aug 30, 2023 • edited

jakelishman commented Aug 30, 2023 • edited

enavarro51 commented Nov 1, 2023 • edited

jakelishman commented May 15, 2024

alexanderivrii May 23, 2024

Choose a reason for hiding this comment

alexanderivrii May 23, 2024

Choose a reason for hiding this comment

alexanderivrii commented May 23, 2024

coveralls commented Aug 29, 2023 •

edited

mtreinish commented Aug 30, 2023 •

edited

jakelishman commented Aug 30, 2023 •

edited

enavarro51 commented Nov 1, 2023 •

edited