Sovereign Platform is in pre-launch alpha.
Not yet available to purchase. Sign up for our mailing list for upcoming launch dates.
Things go wrong — APIs time out, services return errors, data is not in the expected format. Sovereign Workflows is designed to handle failures gracefully so your automations are resilient and recoverable.
When a step fails with a retriable error (like a network timeout or a temporary service outage), the workflow engine automatically retries it.
The default retry policy uses exponential backoff with jitter:
This approach gives transient issues time to resolve while avoiding overwhelming a struggling service.
The default retry behavior can be configured at the platform level:
| Setting | Default | Description |
|---|---|---|
| Max attempts | 3 | How many times to retry before giving up |
| Initial delay | 2 seconds | Wait time before the first retry |
| Backoff factor | 2.0 | Multiplier for each subsequent delay |
| Jitter | Enabled | Adds randomness to retry timing |
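
To make the defaults in the table concrete, here is a minimal sketch of how a retry schedule could be derived from them. It is not the platform's implementation: the function name `retry_delays`, the reading of "max attempts" as the number of retries, and the jitter strategy (scaling each delay by a random factor between 0.5 and 1.5) are all assumptions made for illustration.

```python
import random


def retry_delays(max_attempts=3, initial_delay=2.0, backoff_factor=2.0,
                 jitter=True, rng=None):
    """Return the wait in seconds before each retry.

    Hypothetical helper. Assumes "max attempts" counts retries, as the
    table's description suggests.
    """
    rng = rng or random.Random()
    delays = []
    for retry in range(max_attempts):
        # Exponential backoff: initial delay multiplied by the factor per retry
        base = initial_delay * backoff_factor ** retry
        # Assumed jitter strategy: scale the base delay by a random factor
        # between 0.5 and 1.5 so that clients do not retry in lockstep
        delay = base * rng.uniform(0.5, 1.5) if jitter else base
        delays.append(delay)
    return delays


print(retry_delays(jitter=False))  # [2.0, 4.0, 8.0]
print(retry_delays())              # jittered values, different on each run
```

With jitter disabled, the defaults produce waits of 2, 4, and 8 seconds. With jitter enabled, each wait varies around those values.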
Enterprise Retry Features
Enterprise tier unlocks advanced retry capabilities: adaptive retry policies that learn from historical failure patterns, circuit breaker policies that stop retrying when a service is consistently down, and custom per-action retry policies.
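
For readers unfamiliar with the circuit breaker pattern mentioned above, the following is a generic sketch of the idea, not a description of how the Enterprise feature works. The class name, the threshold of consecutive failures, and the cool-down period are hypothetical. Time is passed in explicitly to keep the example deterministic.

```python
class CircuitBreaker:
    """Generic circuit breaker sketch (all names and defaults are assumptions)."""

    def __init__(self, failure_threshold=3, reset_after=30.0):
        self.failure_threshold = failure_threshold  # failures before opening
        self.reset_after = reset_after              # cool-down in seconds
        self.consecutive_failures = 0
        self.opened_at = None                       # None means circuit closed

    def allow(self, now):
        """Return True if a call (or retry) may be attempted at time `now`."""
        if self.opened_at is None:
            return True
        # After the cool-down, allow trial calls again; a new failure
        # re-opens the circuit, a success closes it
        return now - self.opened_at >= self.reset_after

    def record_success(self):
        self.consecutive_failures = 0
        self.opened_at = None

    def record_failure(self, now):
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.failure_threshold:
            self.opened_at = now


breaker = CircuitBreaker(failure_threshold=2, reset_after=10.0)
breaker.record_failure(now=0.0)
breaker.record_failure(now=1.0)
print(breaker.allow(now=5.0))   # False: the service is treated as down
print(breaker.allow(now=11.0))  # True: the cool-down has elapsed
```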
Not all failures trigger retries. The system distinguishes between:
When a step fails after exhausting its retries, you have two options:
Let the workflow fail — if there is no failure edge, the entire workflow execution is marked as failed. This is the default behavior and is appropriate when any step failure means the whole process should stop.
Route to an error handler — connect a failure edge from the step to another node. When the step fails, instead of stopping the workflow, execution continues along the failure path. This lets you:
Building Resilient Workflows
For critical workflows, add failure edges to key steps. Even a simple "send an email when this step fails" can save hours of investigation time.
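
The two outcomes described above (fail the workflow, or continue along a failure edge) can be sketched in a few lines. This is an illustrative model only: `run_step`, the dictionary-based step and edge representations, and the status strings are assumptions, not the platform's API.

```python
def run_step(step, failure_edges, handlers):
    """Run one step; on failure, follow its failure edge if one exists.

    Hypothetical model: `failure_edges` maps a step name to a handler name.
    """
    try:
        return {"status": "succeeded", "output": step["action"]()}
    except Exception as exc:
        handler_name = failure_edges.get(step["name"])
        if handler_name is None:
            # No failure edge: the whole execution is marked as failed
            return {"status": "failed", "error": str(exc)}
        # Failure edge present: execution continues along the failure path
        return {"status": "handled", "output": handlers[handler_name](exc)}


def charge_card():
    raise TimeoutError("payment service timed out")


step = {"name": "charge_card", "action": charge_card}
handlers = {"notify_ops": lambda exc: f"email sent: {exc}"}

print(run_step(step, {}, handlers))
print(run_step(step, {"charge_card": "notify_ops"}, handlers))
```

The first call has no failure edge, so the result is a failed execution. The second call routes the same error to a notification handler and the execution continues.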
When a ForEach node is processing a collection, you can choose what happens when individual iterations fail:
Fail Fast — stop all remaining iterations as soon as one fails. The ForEach node reports a failure. Use this when partial results are not useful (e.g., a batch update that must be all-or-nothing).
Continue — let all iterations run to completion, even if some fail. The ForEach node reports the aggregated results: how many succeeded, how many failed, and how many were cancelled. Use this when partial results are acceptable (e.g., sending notifications to a list of users — one failure should not block the rest).
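
The difference between the two modes can be illustrated with a small sequential sketch. The function `for_each`, the mode names `"fail_fast"` and `"continue"`, and the `"completed"` status are assumptions for illustration. The real ForEach node may run iterations concurrently and report results differently.

```python
def for_each(items, action, mode="fail_fast"):
    """Apply `action` to each item and return aggregated counts.

    Hypothetical sequential model of the two failure modes.
    """
    items = list(items)
    summary = {"succeeded": 0, "failed": 0, "cancelled": 0}
    for index, item in enumerate(items):
        try:
            action(item)
            summary["succeeded"] += 1
        except Exception:
            summary["failed"] += 1
            if mode == "fail_fast":
                # Stop immediately; remaining iterations never run
                summary["cancelled"] = len(items) - index - 1
                break
    # Assumption: only fail-fast reports the node itself as failed
    failed_fast = mode == "fail_fast" and summary["failed"] > 0
    summary["status"] = "failed" if failed_fast else "completed"
    return summary


def notify(user):
    if user == "bob":
        raise RuntimeError("mailbox full")


users = ["alice", "bob", "carol", "dave"]
print(for_each(users, notify, mode="fail_fast"))
print(for_each(users, notify, mode="continue"))
```

In fail-fast mode, the failure on the second user cancels the remaining two. In continue mode, the other three notifications are still sent and the single failure appears in the counts.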
When a workflow execution fails, you can investigate what went wrong:
This information helps you quickly identify whether the issue was transient (so a simple re-run would fix it) or systemic (requiring a change to the workflow).
If a workflow is running and you need to stop it — perhaps you noticed an issue with the data or the workflow is taking too long — you can cancel the execution.
Cancelling an execution: