Seeking advice for Postgres configuration with Directus for high concurrent users

Hi. The project I am working on is expected to have 20-50k monthly active users. There are many Flows, in both async and filter modes, which operate on a lot of data. I am also using Directus WebSockets, and the REST API is consumed by a Next.js app and a Webflow website.

The Directus instance is set up using Docker Compose on a DigitalOcean droplet with 16 GB RAM, a 4-core CPU, and 160 GB storage. Assets (images and PDFs) are uploaded to DigitalOcean Spaces through Directus and served via CDN. PM2 runs Directus inside the Docker container in fork mode with a single instance, as cluster mode previously caused a memory leak for me. Redis is available to keep PM2 instances in sync, per the Directus docs. Each container has auto-restart enabled on crash or OS restart, and memory limits are set.
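For context, my Compose file is roughly along these lines (a simplified sketch; image tags, service names, and the memory values here are illustrative, not my exact config):

```yaml
# Simplified sketch of the stack described above; values are illustrative.
services:
  directus:
    image: directus/directus:10  # actual tag/custom image with PM2 differs
    restart: unless-stopped      # auto-restart on crash or OS reboot
    deploy:
      resources:
        limits:
          memory: 4g             # per-container memory limit
    environment:
      DB_CLIENT: pg
      DB_HOST: postgres
      REDIS: redis://redis:6379  # shared cache / sync between instances
    depends_on:
      - postgres
      - redis

  redis:
    image: redis:7
    restart: unless-stopped

  postgres:
    image: postgres:15
    restart: unless-stopped
    volumes:
      - pg_data:/var/lib/postgresql/data

volumes:
  pg_data:
```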

But the issue is with the Postgres database. According to the docs, it handles 100 concurrent connections by default (`max_connections`). To allow more, I could either put PgBouncer in front to pool connections or raise the limit in the Postgres config, but I am not sure what number to set it to. Given that the application's permissions and roles are stored as metadata in the database, would using PgBouncer be a security issue?
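These are the knobs I've been looking at so far (the 300 is just a placeholder, not a number I've settled on):

```sql
-- Inspect the current limit (default is 100)
SHOW max_connections;

-- Raise it; takes effect only after a Postgres restart
ALTER SYSTEM SET max_connections = 300;

-- Note: each connection consumes memory (work_mem etc.), so raising
-- this alone doesn't scale indefinitely -- hence the PgBouncer question.
```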

I am looking for advice on how to prepare my Directus instance to handle large numbers of concurrent requests, as the project is getting close to its initial release. Thanks in advance for any advice provided.


I would highly recommend switching from Docker Compose to Kubernetes or another container orchestration setup so you can control scaling and high availability of the Directus containers themselves. On DigitalOcean you could also consider using App Platform, as that'll handle a lot of the uptime concerns on your behalf.

As for the database, you'd indeed want to set up PgBouncer so that the total number of incoming connections from Directus doesn't become a bottleneck. As for what size of database to use, it's hard to say from the info shared, as it really depends on what those 50k users are actively doing. I believe DigitalOcean's managed database service lets you scale up without downtime, so I'd start a little smaller, actively monitor resource usage against real traffic patterns, and scale up as necessary.
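On the security question: Directus enforces roles and permissions in the application layer (they're rows in the `directus_*` tables), and all queries go to Postgres through the single database user Directus is configured with, so PgBouncer just multiplexes that one user's connections; it doesn't bypass any permission checks. A minimal `pgbouncer.ini` might look something like this (hostnames, credentials file path, and pool sizes are placeholders you'd tune for your droplet):

```ini
; Minimal PgBouncer sketch -- values are illustrative, not recommendations
[databases]
directus = host=postgres port=5432 dbname=directus

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = scram-sha-256
auth_file = /etc/pgbouncer/userlist.txt  ; placeholder path
pool_mode = transaction       ; reuse server connections between transactions
max_client_conn = 1000        ; clients PgBouncer will accept
default_pool_size = 20        ; actual Postgres connections per db/user pair
```

You'd then point Directus's `DB_HOST`/`DB_PORT` at PgBouncer instead of Postgres directly. Do verify transaction pooling against your Directus version first, since it breaks session-level features like prepared statements held across transactions.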