Investigating API handshake timeouts and Flow execution lag?

Hey everyone,
I’m currently working on a project where I’ve layered Directus over an existing PostgreSQL database to handle our internal task management. It’s been a great experience so far, but I’ve run into a consistent performance bottleneck when executing complex Flows. Specifically, when I have a Flow triggered by a collection update that needs to push data to an external webhook, the “Activity Log” shows the process hanging for several seconds before either completing or occasionally throwing a 504 timeout.

I’ve checked the server resources and they seem stable, so I’m wondering if this is a concurrency issue within the Node.js environment or if there’s a specific way to optimize the payload handling. I even tried a roblox modified configuration for my background execution environment to see if a more isolated script handler would prevent the main Directus thread from stalling, but I’m still seeing intermittent handshake errors.

Has anyone else dealt with execution lag when running heavy logic inside of Flows? I’m trying to figure out if I should move this logic into a custom “Operation” extension or if there are specific environment variables (like FLOWS_EXECUTION_MAX_COUNT) that I should be tweaking to keep the UI responsive during these background tasks. I’d really appreciate any insights or debugging tips from anyone who has scaled their automations beyond simple CRUD triggers!

I can’t really help with debugging your particular situation, but I have some crazy complex flows running on our instance, so I might be able to offer some tips.

For anything involving more than a few operations (or anything with lots of branching logic or loops), I’ve been using serverless functions on DigitalOcean (which also happens to be where our Directus instance is hosted).

This works really well and allows me to just ping the whole payload off to the function via a webhook, write whatever I want in Python and return the result. It works well with either blocking or non-blocking flows. It’s handy because you can tweak the timeout and memory settings for each function individually if they are complex.
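To give a rough idea of the shape, here is a minimal sketch of such a function. It assumes the `main(event, context)` entry point and the `{"body": ...}` return style used by DigitalOcean Functions’ Python runtime, and it assumes the Flow’s webhook request sends the trigger’s `keys` and `payload` as the JSON body. `build_result` and the fields it reads are hypothetical placeholders, not anything Directus defines.

```python
from typing import Any


def build_result(data: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical business logic; replace with whatever the Flow needs."""
    keys = data.get("keys") or []
    changes = data.get("payload") or {}
    return {
        "updated_ids": keys,
        "changed_fields": sorted(changes),  # names of the fields that changed
        "needs_review": changes.get("status") == "blocked",  # made-up rule
    }


def main(event: Any, context: Any = None) -> dict[str, Any]:
    """Entry point: receives the parsed request body, returns the response."""
    if not isinstance(event, dict):
        return {"statusCode": 400, "body": {"error": "expected a JSON object"}}
    return {"statusCode": 200, "body": build_result(event)}
```

In a blocking Flow, the returned body is what the next operation would get to work with; in a non-blocking one, the response can simply be ignored.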

Some of them do take a few seconds to run because of their complexity, so I often need to decide whether it’s important for the user to see the result instantly. If not, I’ll just use a non-blocking Flow and let it do its thing.

I do have some blocking flows that can take a couple of seconds to run and return the result, but I’ve never experienced the flakiness you’re describing. It sounds like you’re just hitting resource limits, though you say you’ve already checked that. Even so, it might be worth upping things and seeing if it helps.
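One way to check whether the lag comes from the external endpoint rather than from Directus itself is to time the same call outside of a Flow. This is only a rough sketch using the Python standard library; the URL and sample payload in the usage note are placeholders, and `post_json` and `time_calls` are helper names made up for illustration.

```python
import json
import time
import urllib.request
from typing import Callable


def post_json(url: str, payload: dict, timeout: float = 10.0) -> int:
    """POST a JSON payload and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # The explicit timeout makes a hanging endpoint fail fast instead of stalling.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


def time_calls(send: Callable[[], object], attempts: int = 5) -> list[float]:
    """Run send() repeatedly and return each call's duration in seconds."""
    durations = []
    for _ in range(attempts):
        start = time.perf_counter()
        send()
        durations.append(time.perf_counter() - start)
    return durations
```

For example, `time_calls(lambda: post_json("https://example.invalid/my-function", {"keys": [1]}))` with your own endpoint substituted in returns a list of timings. If those are consistently fast while the Flow still hangs, the bottleneck is more likely on the Directus side.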

It’s also worth considering n8n for complex flows. There are official Directus operations in there now, and they work really well. It also lets you do looping and complex flows much more easily than within Directus itself. I road-tested it recently and thought it was great, but honestly, I find it easier to just write my own Python scripts and use external functions.