Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

apoc.load.jdbcUpdate inside apoc.periodic.iterate leaves idle connections in 5.19.0 #4074

Closed
nielsjansendk opened this issue May 14, 2024 · 2 comments

Comments

@nielsjansendk
Copy link

Expected Behavior (Mandatory)

When running a apoc.load.jdbcUpdate operation as the second statement in an apoc.periodic.iterate, the connections to the database (postgres) should be closed after they have been executed.

Actual Behavior (Mandatory)

It does not close the connections, and it leaves an idle connection for each statement. When the postgres database reaches max connections the statement fails.

How to Reproduce the Problem

Steps (Mandatory)

  1. Create a table in a postgres database:
    CREATE TABLE nodes ( id serial PRIMARY KEY, my_id integer );

  2. In you neo4j database, create some nodes:
    WITH range(0, 100) as list UNWIND list as l CREATE (n:MyNode{id: l})

  3. Load the postgres driver:
    CALL apoc.load.driver("org.postgresql.Driver")

  4. Try to use apoc.period.iterate to insert into the postgres table:

CALL apoc.periodic.iterate("MATCH (n:MyNode) return n", "WITH n, apoc.text.format('insert into nodes (my_id) values (\\\'%d\\\')',[n.id]) AS sql CALL apoc.load.jdbcUpdate('jdbc:postgresql://my_user:my_password@localhost:5432/my_database',sql) YIELD row AS row2 return row2,n", {batchsize: 10,parallel: true}) yield batches,total,timeTaken,committedOperations,failedOperations,failedBatches,retries,errorMessages, operations, failedParams, updateStatistics return batches,total,timeTaken, committedOperations,failedOperations,failedBatches,retries,errorMessages,operations,failedParams, updateStatistics

  1. Check postgres for connections:

select client_addr, state from pg_stat_activity to see the hanging connections from the neo4j server. They can be killed with select pg_terminate_backend(pid) from pg_stat_activity where state = 'idle' but a pg_cancel_backend will not work.

NOTE: running the same experiment on neo4j 5.18 works fine, it 5.19 that introduces this problem.

Specifications (Mandatory)

Currently used versions

Versions

  • OS: Ubuntu 22.04.4
  • Neo4j: 5.19.0
  • Neo4j-Apoc: 5.19.0
  • pg driver: postgresql-42.7.3.jar
  • Postgres version: 16.2
@jexp
Copy link
Member

jexp commented May 16, 2024

Also happens for apoc.load.csv (holds the lock on windows) not sure if this is related.

https://community.neo4j.com/t/database-is-holding-onto-file-after-apoc-load-csv-creating-a-permission-error/67943

@vga91
Copy link
Collaborator

vga91 commented Jun 7, 2024

The problem does not seem to be caused by apoc.periodic.iterate, in fact even doing apoc.load.jdbcUpdate directly many times, idle connections are left.

However it would seem not to be an apoc problem, as PreparedStatement.close() executes correctly without errors, rather it could depend on Postgres itself leaving connections idle for a certain amount of time, as mentioned here and here.

Even with the latest JDBC driver version, the problem still seems to be present.

Please feel free to reopen the issue, if you have more info to share.

The problem isn't related to the apoc.load.csv one.

@vga91 vga91 closed this as completed Jun 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done (to cherry-pick)
Development

No branches or pull requests

4 participants