Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rx.digraph_find_cycle() does not find the cycle in a graph #1171

Closed
weiT1993 opened this issue Apr 23, 2024 · 7 comments · Fixed by #1181
Closed

rx.digraph_find_cycle() does not find the cycle in a graph #1171

weiT1993 opened this issue Apr 23, 2024 · 7 comments · Fixed by #1181
Assignees
Labels
bug Something isn't working
Milestone

Comments

@weiT1993
Copy link

Information

  • rustworkx version: 0.14.2
  • Python version: 3.10.14
  • Rust version: rustc 1.66.0-nightly (8b705839c 2022-09-26)
  • Operating system: Linux

What is the current behavior?

rx.is_directed_acyclic_graph(graph) returns False but rx.digraph_find_cycle(graph) fails to find the cycle in the graph.

What is the expected behavior?

When rx.is_directed_acyclic_graph(graph) returns False, rx.digraph_find_cycle(graph) should find a cycle in the graph.

Steps to reproduce the problem

Running the following script several times. It quite often reproduces the bug. The graph is always correctly identified as not a DAG, but quite often rx.digraph_find_cycle(graph) does not find a cycle.

import rustworkx as rx

edges = [
    (6, 7),
    (8, 9),
    (15, 16),
    (0, 6),
    (2, 8),
    (9, 10),
    (7, 10),
    (10, 11),
    (11, 12),
    (10, 12),
    (12, 13),
    (12, 14),
    (4, 15),
    (16, 17),
    (14, 3),
    (13, 17),
    (17, 18),
    (18, 19),
    (17, 19),
    (19, 20),
    (19, 5),
    (20, 1),
    (27, 28),
    (28, 29),
    (34, 35),
    (21, 27),
    (23, 29),
    (29, 30),
    (30, 31),
    (29, 31),
    (31, 32),
    (31, 33),
    (25, 34),
    (32, 24),
    (35, 36),
    (33, 36),
    (36, 37),
    (37, 38),
    (36, 38),
    (38, 39),
    (38, 26),
    (39, 22),
    (42, 45),
    (44, 45),
    (40, 44),
    (45, 46),
    (46, 47),
    (45, 47),
    (47, 48),
    (47, 49),
    (48, 43),
    (49, 41),
    (56, 57),
    (57, 58),
    (61, 63),
    (50, 56),
    (52, 57),
    (58, 59),
    (57, 59),
    (59, 60),
    (54, 62),
    (59, 61),
    (60, 53),
    (62, 63),
    (63, 64),
    (64, 65),
    (63, 65),
    (65, 66),
    (65, 67),
    (66, 51),
    (67, 55),
    (74, 75),
    (75, 76),
    (80, 81),
    (68, 74),
    (70, 76),
    (76, 77),
    (77, 78),
    (76, 78),
    (78, 79),
    (78, 80),
    (72, 81),
    (79, 71),
    (81, 82),
    (82, 83),
    (81, 83),
    (83, 84),
    (83, 69),
    (84, 73),
    (1, 42),
    (3, 23),
    (24, 72),
    (22, 50),
    (5, 54),
    (41, 40),
    (26, 52),
    (69, 70),
]
edges = [x + (None,) for x in edges]
nodes = set()
for edge in edges:
    nodes.add(edge[0])
    nodes.add(edge[1])
nodes = list(nodes)

# Create a directed graph (PyDiGraph)
graph = rx.PyDiGraph()

# Add nodes and edges
graph.add_nodes_from(nodes)
graph.add_edges_from(edges)

# Check if the graph is a DAG
is_dag = rx.is_directed_acyclic_graph(graph)
print("Is the graph a DAG? {}".format(is_dag))

# Attempt to find a cycle
cycle = rx.digraph_find_cycle(graph)
print("Cycle found : {}".format(cycle))

Quite often the script produces:

Is the graph a DAG? False
Cycle found : EdgeList[]
@weiT1993 weiT1993 added the bug Something isn't working label Apr 23, 2024
@IvanIsCoding
Copy link
Collaborator

That sounds pretty bad. I haven’t reproduced the issue yet, but to give you a reasoning:

  • The first method that is correct uses topological sort
  • The second method runs a DFS to find a cycle with a given node. If you don’t specify we pick an arbitrary one, which could explain why you get no cycle at all

Until we fix it in 0.14.3 or 0.15.0 please use https://www.rustworkx.org/apiref/rustworkx.strongly_connected_components.html and pick a component with more than one node. That method is correct and should always give you a cycle. I will probably fix find_cycle using the SCC implementation

@IvanIsCoding
Copy link
Collaborator

IvanIsCoding commented Apr 24, 2024

I verified the problem. It is with the arbitrarily chosen starting node. If you pick this one (I plotted the graph to see the cycle):

cycle = rx.digraph_find_cycle(graph, graph.find_node_by_weight(81))
print(cycle)

You get:

EdgeList[(81, 82), (82, 83), (83, 69), (69, 70), (70, 76), (76, 77), (77, 78), (78, 80), (80, 81)]

I'll try to change the behaviour of rx.is_directed_acyclic_graph(graph). It should always return a cycle if it exists and node is specified. I agree it is a bit misleading. But probably the original intention was to pass a node you already knew was in a cycle as an argument

@IvanIsCoding
Copy link
Collaborator

Also, completely unrelated to the bug but you would probably like add_edges_from_no_data. You don't need to create the new edge list you created, just call graph.add_edges_from_no_data(edges) instead of graph.add_edges(edges) if you don't have weights.

@weiT1993
Copy link
Author

weiT1993 commented Apr 24, 2024

Thanks for looking into the bug. I am confused about your reasoning for the bug. The DFS cycle detection algorithm should find a cycle regardless of which arbitrary node was picked when the given graph is connected and only has one component. In my example, the given graph is a single connected graph.

If you are saying the DFS algorithm attempts to find a cycle that has to contain the random starting node if none was given, that would explain the bug. But it is a misleading design choice. For example, if the input is a huge graph with a tiny cycle, this algorithm would have a very small chance of returning the cycle unless the user already knows where the tiny cycle is (in which case, what's the point of running this function anyway?). Was this intentional?

I think the expected behavior for rx.digraph_find_cycle(graph) should be that it always finds a cycle (if there is one) when no starting node is specified. It should find a cycle containing the starting node when one is given.

And I do not think there is anything wrong with rx.is_directed_acyclic_graph(graph). It seems to correctly decide whether the graph is a DAG.

@IvanIsCoding
Copy link
Collaborator

Thanks for looking into the bug. I am confused about your reasoning for the bug. The DFS cycle detection algorithm should find a cycle regardless of which arbitrary node was picked when the given graph is connected and only has one component. In my example, the given graph is a single connected graph.

If you are saying the DFS algorithm attempts to find a cycle that has to contain the random starting node if none was given, that would explain the bug. But it is a misleading design choice. For example, if the input is a huge graph with a tiny cycle, this algorithm would have a very small chance of returning the cycle unless the user already knows where the tiny cycle is (in which case, what's the point of running this function anyway?). Was this intentional?

I think the expected behavior for rx.digraph_find_cycle(graph) should be that it always finds a cycle (if there is one) when no starting node is specified. It should find a cycle containing the starting node when one is given.

And I do not think there is anything wrong with rx.is_directed_acyclic_graph(graph). It seems to correctly decide whether the graph is a DAG.

In short, yes, it is a misnomer. A better name for the function would be digraph_find_cycle_containing_node.

Also, is-directed_acyclic_graph has an independent implementation that behaves as you expected.

@weiT1993
Copy link
Author

weiT1993 commented Apr 24, 2024

So currently, if I want to find a cycle in the graph without knowing which node is in it, I need to write a wrapper to make sure I visit all the nodes? Do you plan to support it?

@IvanIsCoding
Copy link
Collaborator

So currently, if I want to find a cycle in the graph without knowing which node is in it, I need to write a wrapper to make sure I visit all the nodes? Do you plan to support it?

We will fix it in 0.15 to make it smarter and always return a cycle if it exists.

If you need an urgent workaround I suggest checking the strongly connected components. Any component with more than one element is a cycle

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
2 participants