
PostgreSQL - connection failed: FATAL: remaining connection slots are reserved for non-replication superuser connections #5071

Open
agusmakmun opened this issue May 15, 2024 · 29 comments

@agusmakmun

agusmakmun commented May 15, 2024

What happened?

I'm getting this PostgreSQL error after upgrading the Django cookiecutter for my project.

connection failed: FATAL: remaining connection slots are reserved for non-replication superuser connections

I'm using many Celery tasks and websocket connections, and I deploy with k8s in my project.

Similar issues are mentioned on Stack Overflow.

One of the threads mentions this package, but I'm not sure whether it works:
https://github.com/altairbow/django-db-connection-pool

Details

  • Python version, run python3 -V: 3.12
  • Docker version (if using Docker), run docker --version: 3
  • docker compose version (if using Docker), run docker compose --version: 3
  • Cookie cutter Django: 2024.04.22
(env-my-labs) ➜  env-my-labs cookiecutter https://github.com/cookiecutter/cookiecutter-django
  [1/27] project_name (My Awesome Project): My Project Labs
  [2/27] project_slug (practical_devsecops_cloud_labs):
  [3/27] description (Behold My Awesome Project!): My Project Labs
  [4/27] author_name (Daniel Roy Greenfeld): My Project Developer
  [5/27] domain_name (example.com): my-domain.com
  [6/27] email (practical-devsecops-developer@my-domain.com): info@my-domain.com
  [7/27] version (0.1.0): 3.7.0
  [8/27] Select open_source_license
    1 - MIT
    2 - BSD
    3 - GPLv3
    4 - Apache Software License 2.0
    5 - Not open source
    Choose from [1/2/3/4/5] (1): 5
  [9/27] Select username_type
    1 - username
    2 - email
    Choose from [1/2] (1): 1
  [10/27] timezone (UTC):
  [11/27] windows (n): n
  [12/27] Select editor
    1 - None
    2 - PyCharm
    3 - VS Code
    Choose from [1/2/3] (1): 3
  [13/27] use_docker (n): y
  [14/27] Select postgresql_version
    1 - 16
    2 - 15
    3 - 14
    4 - 13
    5 - 12
    Choose from [1/2/3/4/5] (1): 5
  [15/27] Select cloud_provider
    1 - AWS
    2 - GCP
    3 - Azure
    4 - None
    Choose from [1/2/3/4] (1): 1
  [16/27] Select mail_service
    1 - Mailgun
    2 - Amazon SES
    3 - Mailjet
    4 - Mandrill
    5 - Postmark
    6 - Sendgrid
    7 - SendinBlue
    8 - SparkPost
    9 - Other SMTP
    Choose from [1/2/3/4/5/6/7/8/9] (1): 5
  [17/27] use_async (n): y
  [18/27] use_drf (n): y
  [19/27] Select frontend_pipeline
    1 - None
    2 - Django Compressor
    3 - Gulp
    4 - Webpack
    Choose from [1/2/3/4] (1): 1
  [20/27] use_celery (n): y
  [21/27] use_mailpit (n): y
  [22/27] use_sentry (n): y
  [23/27] use_whitenoise (n): n
  [24/27] use_heroku (n): n
  [25/27] Select ci_tool
    1 - None
    2 - Travis
    3 - Gitlab
    4 - Github
    5 - Drone
    Choose from [1/2/3/4/5] (1): 3
  [26/27] keep_local_envs_in_vcs (y): y
  [27/27] debug (n): n
 [SUCCESS]: Project initialized, keep up the good work!
@agusmakmun agusmakmun added the bug label May 15, 2024
@foarsitter
Collaborator

This is a postgresql configuration issue and not related to this project.

@foarsitter foarsitter added the wontfix (Wait 10 days and automatically close) label May 15, 2024
@agusmakmun
Author

But the problem is, this doesn't happen on Django 3.2 and only appeared after migrating to Django 4.
Is there any difference between the versions? I checked CONN_MAX_AGE, which uses the same 60s value.

@github-actions github-actions bot removed the wontfix (Wait 10 days and automatically close) label May 15, 2024
@foarsitter
Collaborator

If that is the problem, why don't you report it as such?

  • What else did you change during the migration?
  • What lib are you using for websockets, since this template doesn't include them?
  • What is your max_connections? Run the show max_connections; query to find out.
  • Did you use async before Django 4?
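For reference, the show max_connections check can also be scripted. Below is a minimal sketch, assuming psycopg 3 is installed; the DSN shown is a placeholder, not a real connection string from this project:

```python
# Sketch: read PostgreSQL's max_connections setting with psycopg 3.

def parse_max_connections(row):
    """SHOW returns a single-column text row; convert it to an int."""
    return int(row[0])

def fetch_max_connections(dsn):
    import psycopg  # third-party: pip install psycopg

    with psycopg.connect(dsn) as conn:
        with conn.cursor() as cur:
            cur.execute("SHOW max_connections;")
            return parse_max_connections(cur.fetchone())

# Usage against a live server (placeholder DSN):
#   fetch_max_connections("postgresql://user:pass@localhost:5432/mydb")
# Postgres 12 ships with a default of 100.
```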

@agusmakmun
Author

agusmakmun commented May 15, 2024

I'm using django channels for the websocket connections, and I upgraded to these packages.

django==4.2.11  # pyup: < 5.0  # https://www.djangoproject.com/

# Django Async
channels==4.1.0  # https://github.com/django/channels/
channels-redis==4.2.0  # https://github.com/django/channels_redis/

# asgi_redis (async to sync)
asgiref==3.8.1  # https://github.com/django/asgiref/

max_connections is 100, the default for Postgres 12.

Did you use async before Django 4?

I'm only using async for the websocket things, plus Celery tasks to clean up the sandbox machines.

@foarsitter
Collaborator

  • max_connections isn't related to your project, so do you use your database for more projects?
  • How many gunicorn workers do you use?
  • How many celery workers do you use?
  • If you use the async/uvicorn option you are using the uvicorn worker, correct?

Increasing max_connections to 400 would be a simple solution.

@agusmakmun
Author

  1. Nope, I'm using a single application, but set up using k8s.
  2. 12 workers, as I manually configured in compose/production/django/start:
exec /usr/local/bin/gunicorn config.asgi --bind 0.0.0.0:5000 --chdir=/app -k uvicorn.workers.UvicornWorker --workers=${UVICORN_TOTAL_OF_WORKERS:=12}
  3. I'm not sure about that, but we're using a single celeryworker pod.
kubectl --kubeconfig=.vscode/development/kubeconfig get pods
NAME                           READY   STATUS    RESTARTS   AGE
celerybeat-7dccfd8965-mbjt9    2/2     Running   0          4h31m
celeryworker-fc7d59c8c-9bgcv   2/2     Running   0          4h31m
django-5d54d9ddbf-dmpk6        2/2     Running   0          4h31m
flower-78c4b8f74f-8h8hx        2/2     Running   0          4h31m
next-849ff6c88b-95tgn          1/1     Running   0          4h31m
websocket-768b6cd964-gj9j4     2/2     Running   0          4h31m
  4. Yup, we're using uvicorn workers as shown above.

@agusmakmun
Author

Let's say many users are using the websocket at the same time. Will increasing max_connections to 400 solve the problem?

@foarsitter
Collaborator

foarsitter commented May 15, 2024

How many workers does celery use? 12?

Do the math: you are using two Django pods, resulting in at least 24 connections. I assume the same for celery, which makes at least 48 in total. You are running beat twice, which will result in double scheduling. Why are you running flower twice? Do next & websocket read the database?

However, my guess is that the catch is in worker-connections. Each worker holds many HTTP connections, and each HTTP connection has its own database connection. So with your 24 workers and the default worker-connections of 1000, you need up to 24000 database connections for Django alone. Here is some more info: https://stackoverflow.com/questions/63471960/gunicorn-uvicorn-worker-py-how-to-honor-limit-concurrency-setting
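That worst-case arithmetic can be sketched directly (illustrative numbers from this thread, not a definitive model of connection usage):

```python
# Worst case: every concurrent HTTP connection held by an async worker
# opens its own database connection.

def worst_case_db_connections(pods, workers_per_pod, worker_connections=1000):
    return pods * workers_per_pod * worker_connections

# 2 pods x 12 uvicorn workers x gunicorn's default worker-connections of 1000:
print(worst_case_db_connections(2, 12))  # 24000
```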

@agusmakmun
Author

agusmakmun commented May 15, 2024

How many workers does celery use? 12?

How do I check how many celery workers there are?

This is what I get so far:

kubectl --kubeconfig=kubeconfig exec --stdin --tty celeryworker-6dc48c545d-p879n -- /bin/bash

Defaulted container "celeryworker" out of: celeryworker, vault-agent, vault-agent-init (init)
django@celeryworker-6dc48c545d-p879n:/app$ source /entrypoint
PostgreSQL is available
django@celeryworker-6dc48c545d-p879n:/app$
django@celeryworker-6dc48c545d-p879n:/app$
django@celeryworker-6dc48c545d-p879n:/app$ celery -A config.celery_app status
->  celery@celeryworker-6dc48c545d-p879n: OK

1 node online.

and for more:

django@celeryworker-6dc48c545d-p879n:/app$ celery -A config.celery_app inspect stats
->  celery@celeryworker-6dc48c545d-p879n: OK
    {
        "broker": {
            "alternates": [],
            "connect_timeout": 4,
            "failover_strategy": "round-robin",
            "heartbeat": 120.0,
            "hostname": "redis-master.redis.svc.cluster.local.",
            "insist": false,
            "login_method": null,
            "port": 6379,
            "ssl": false,
            "transport": "redis",
            "transport_options": {},
            "uri_prefix": null,
            "userid": null,
            "virtual_host": "0"
        },
        "clock": "107892",
        "pid": 8,
        "pool": {
            "implementation": "celery.concurrency.prefork:TaskPool",
            "max-concurrency": 8,
            "max-tasks-per-child": "N/A",
            "processes": [
                16,
                17,
                18,
                19,
                20,
                21,
                22,
                23
            ],
            "put-guarded-by-semaphore": false,
            "timeouts": [
                300,
                600
            ],
            "writes": {
                "all": "0.67, 0.17, 0.17",
                "avg": "0.33",
                "inqueues": {
                    "active": 0,
                    "total": 8
                },
                "raw": "4, 1, 1",
                "strategy": "fair",
                "total": 6
            }
        },
        "prefetch_count": 32,
        "rusage": {
            "idrss": 0,
            "inblock": 40,
            "isrss": 0,
            "ixrss": 0,
            "majflt": 3,
            "maxrss": 166536,
            "minflt": 187490,
            "msgrcv": 0,
            "msgsnd": 0,
            "nivcsw": 833,
            "nsignals": 0,
            "nswap": 0,
            "nvcsw": 1597,
            "oublock": 56,
            "stime": 1.8547509999999998,
            "utime": 10.366434
        },
        "total": {
            "my_project.base.tasks.send_announcement_broadcast": 1,
            "my_project.webssh.tasks.cleanup_expired_durations": 1,
            "my_project.webssh.tasks.cleanup_machines": 3,
            "my_project.webssh.tasks.sync_cloud_drive_account": 1
        },
        "uptime": 533
    }

1 node online.

@foarsitter
Collaborator

"max-concurrency": 8
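With prefork, each worker process typically holds its own Django database connection while tasks run, so a rough (hedged) estimate for the celery side is pods times max-concurrency:

```python
# Rough estimate: one prefork worker process ~= one Django DB connection in use.

def celery_db_connections(pods, max_concurrency):
    return pods * max_concurrency

# One celeryworker pod with the reported max-concurrency of 8:
print(celery_db_connections(1, 8))  # 8
```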

@agusmakmun
Author

Is there anything I can do apart from increasing max_connections?

@foarsitter
Collaborator

You have at least these 4 options, as far as I can tell:

@agusmakmun
Author

Thank you so much @foarsitter, we will try your suggestions 👍

@foarsitter
Collaborator

If you resolve your issue by implementing a custom UvicornWorker we would like to receive a PR :)

@agusmakmun
Author

Sure, I'll be happy to do that. As a quick solution, we will first try increasing the Postgres max_connections to 1000.

@browniebroke browniebroke added the solved Wait a bit and close the issue if no response assuming it was solved. label May 15, 2024
@agusmakmun
Author

agusmakmun commented May 16, 2024

Finally my issue was solved. It happened because I was using both shared_task() and @celery_app.task(), which led to duplicate database connections. After rolling everything back to use @celery_app.task() only, the problem was resolved.

Note: I had changed it to @shared_task() while trying to follow users/tasks.py for non-executable tasks for the cronjob.

from celery import shared_task

from .models import User


@shared_task()
def get_users_count():
    """A pointless Celery task to demonstrate usage."""
    return User.objects.count()

Before

  1. myapp/tasks.py

from celery import shared_task
from config import celery_app


@shared_task()
def task_1():
    ...


@celery_app.task()
def task_2_that_called_for_cronjob():
    ...

After

from config import celery_app


@celery_app.task()
def task_1():
    ...


@celery_app.task()
def task_2_that_called_for_cronjob():
    ...

@github-actions github-actions bot removed the solved Wait a bit and close the issue if no response assuming it was solved. label May 16, 2024
@agusmakmun
Author

The issue appeared again.

@agusmakmun agusmakmun reopened this May 16, 2024
@foarsitter
Collaborator

shared_task is just another way of registering tasks for situations when there is no app available yet. So that can't be the solution.

@agusmakmun
Author

agusmakmun commented May 16, 2024

I found that some related issues are caused by upgrading to Django 4 combined with using sync_to_async, which I'm currently using to handle the async processing for the websocket.

  1. https://stackoverflow.com/questions/75848263/too-many-db-connections-django-4-x-asgi
  2. https://code.djangoproject.com/ticket/33497
  3. https://code.djangoproject.com/ticket/32889
  4. django/django@36fa071

This may be due to Django 4.0 having per-request contexts for the thread sensitivity of sync_to_async() — See #32889.
If so, that's kind-of a good thing, in that too many open resources is what you'd expect in async code, and up to now, we've not been hitting that, as we've essentially been running serially. - https://code.djangoproject.com/ticket/33497#comment:4

@foarsitter
Collaborator

So you put CONN_MAX_AGE to 0 and solved the issue?

@agusmakmun
Author

Nope. I changed it to 0, and the error still occurs.

@agusmakmun
Author

agusmakmun commented May 17, 2024

The issue was finally resolved after implementing django-db-connection-pool + setting CONN_MAX_AGE=0

  1. settings.py
DATABASES = {"default": env.db("DATABASE_URL")}
DATABASES["default"]["ATOMIC_REQUESTS"] = True
# https://github.com/altairbow/django-db-connection-pool?tab=readme-ov-file#postgresql
DATABASES["default"]["ENGINE"] = "dj_db_conn_pool.backends.postgresql"
DATABASES["default"]["POOL_OPTIONS"] = {
    "POOL_SIZE": 10,
    "MAX_OVERFLOW": 10,
    "RECYCLE": 24 * 60 * 60,
}

# Set to 0 to disable long(er) living connections
# https://docs.djangoproject.com/en/dev/ref/settings/#std-setting-CONN_MAX_AGE
# https://github.com/cookiecutter/cookiecutter-django/issues/5071
# "connection failed: FATAL: remaining connection slots are reserved for non-replication superuser connections"  # noqa: ERA001
DATABASES["default"]["CONN_MAX_AGE"] = env.int("CONN_MAX_AGE", default=0)
  2. requirements.txt
psycopg[pool]==3.1.18  # https://github.com/psycopg/psycopg
django-db-connection-pool[psycopg3]==1.2.5  # https://github.com/altairbow/django-db-connection-pool

Screenshot 2024-05-17 at 20 05 22

@browniebroke browniebroke added the solved Wait a bit and close the issue if no response assuming it was solved. label May 17, 2024
@agusmakmun
Author

The next error that occurs is ProgrammingError: the last operation didn't produce a result, even after changing the ORM call from .exists() to .count() > 0.

Related issue: psycopg/psycopg#417

@github-actions github-actions bot removed the solved Wait a bit and close the issue if no response assuming it was solved. label May 19, 2024
@browniebroke browniebroke added the solved Wait a bit and close the issue if no response assuming it was solved. label May 19, 2024
@foarsitter
Collaborator

@agusmakmun is CONN_MAX_AGE required when a pool is involved? It shouldn't have to be, I suppose.

@github-actions github-actions bot removed the solved Wait a bit and close the issue if no response assuming it was solved. label May 21, 2024
@agusmakmun
Author

For this case, it was resolved after downgrading Python from docker.io/python:3.12.3-slim-bookworm to python:3.10.10-slim-bullseye.

I also consulted ChatGPT with this query:

For a database server with 2 cores and 2GB of memory, how many POOL_SIZE, MAX_OVERFLOW and RECYCLE. I need numbers only.

DATABASES["default"]["ENGINE"] = "dj_db_conn_pool.backends.postgresql"
DATABASES["default"]["POOL_OPTIONS"] = {
    "POOL_SIZE": 3,
    "MAX_OVERFLOW": 3,
    "RECYCLE": 1800,  # 30 minutes
}

DATABASES["default"]["CONN_MAX_AGE"] = env.int("CONN_MAX_AGE", default=0)

No issues so far since deploying to production. I'm not sure which fix is the correct one: decreasing pool_size to a proper value, or downgrading Python.
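One way to sanity-check these POOL_OPTIONS: with dj-db-conn-pool, each OS process keeps its own pool, so a hedged upper bound on server-side connections is processes times (POOL_SIZE + MAX_OVERFLOW). A sketch with numbers assumed from this thread:

```python
# Upper bound on pooled DB connections across a deployment.
# Each process (uvicorn worker, celery prefork process, ...) owns its own pool.

def pool_ceiling(processes, pool_size, max_overflow):
    return processes * (pool_size + max_overflow)

# e.g. 24 web worker processes + 8 celery processes, POOL_SIZE=3, MAX_OVERFLOW=3:
print(pool_ceiling(24 + 8, 3, 3))  # 192 -- still above Postgres 12's default of 100
```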

@foarsitter
Collaborator

foarsitter commented May 21, 2024

You can certainly remove CONN_MAX_AGE since it defaults to 0.

@Saurav-Paul

The issue finally resolved after implementing the django-db-connection-pool + set the CONN_MAX_AGE=0

Thank you so much. I was suffering from this issue.

@agusmakmun
Author

Guys, FYI, I'm still somehow facing this issue.

@agusmakmun agusmakmun reopened this May 28, 2024
@foarsitter
Collaborator

And did you increase max_connections to 1000?
