Define Workflow Failure Strategy
A Failure Strategy defines how your Workflow handles different failure conditions.
In this topic:
- Before You Begin
- Review: Workflow and Phase Priority
- Step: Add Workflow Failure Strategy
Before You Begin
- Add a Workflow
- Add Phases to a Workflow
- Skip Workflow Steps
- Review: Multiple Failure Strategies in a Workflow
Review: Workflow and Phase Priority
You can define a Failure Strategy at the Workflow step and Phase level.
What is a Phase? Unless you add multiple phases to your Workflow, the Workflow is considered a single Phase. Canary Workflows use multiple phases, and other Workflow types, such as Blue/Green and Rolling, are considered single Phase Workflows.
The Failure Strategy applied to a Workflow step takes precedence over the Failure Strategy applied to the Phase.
The Workflow step Failure Strategy does not propagate to the parent Phase.
Step: Add Workflow Failure Strategy
To define the failure strategy for the entire Workflow, do the following:
- In a Workflow, click Failure Strategy. The default failure strategy appears.
The default failure strategy is to fail the Workflow if there is any application error, and to roll back the Workflow execution. You can modify the default strategy or add additional strategies.
- Click Add Failure Strategy. The Failure Strategy settings appear.
The dialog has the following fields:
Failure
Select the type of error:
- Application Error: Harness encountered an application error during deployment.
The following types are listed but not supported at this time:
- Connectivity Error: Harness is unable to connect to the target host/cluster/etc, or a provider, such as a Git repo.
- Authentication Error: Harness is unable to authenticate using the credentials you supplied in the Cloud Provider, Artifact Source, Source Repo Provider, and other connectors.
- Verification Error: If you have set up verification steps in your Workflow and a deployment event is flagged as an error by the step, Harness will fail the deployment.
Scope
Select the scope of the strategy. If you select Workflow, the Action is applied to the entire Workflow. If you select Workflow Phase, the Action is applied to the Workflow Phase only.
For example, if you selected Workflow Phase and then selected the Action Rollback Phase Execution, and a failure occurred in the second Phase of the Workflow, then the second Phase of the Workflow would be rolled back but the first Phase of the Workflow would not be rolled back.
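The scope rule above can be sketched in a few lines of Python. This is only an illustration of the behavior described in this topic, not Harness code: the function name `phases_to_roll_back` and its parameters are hypothetical, and the returned list makes no claim about the order in which Harness performs the rollback.

```python
def phases_to_roll_back(scope, executed_phases, failed_phase):
    """Return the phases covered by a rollback action for the given scope.

    Hypothetical helper: models only the Scope setting described above.
    """
    if failed_phase not in executed_phases:
        raise ValueError("failed_phase must be one of executed_phases")
    if scope == "Workflow Phase":
        # Only the Phase where the failure occurred is affected.
        return [failed_phase]
    if scope == "Workflow":
        # The action applies to the entire Workflow.
        return list(executed_phases)
    raise ValueError(f"unknown scope: {scope}")


# A failure in the second Phase with the Workflow Phase scope
# leaves the first Phase untouched.
affected = phases_to_roll_back("Workflow Phase", ["Phase 1", "Phase 2"], "Phase 2")
```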
Action
Select the action for Harness to take in the event of a failure, such as a retry or a rollback:
Manual Intervention
Applies to Workflow steps only, but not Approval steps or Resource Lock.
You are prompted to approve or reject the deployment on the Deployments page.
Timeout (Manual Intervention)
If you select Manual Intervention in Action, enter a timeout in Timeout and an action in Action after timeout (such as Ignore). Once the timeout is reached, the action is executed.
The default value for Timeout is 14 days (14d).
The available actions in Action after timeout are:
- Mark as Success
- End Execution
- Abort Workflow
- Rollback Workflow
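As a rough sketch of the timeout behavior described above, the following Python models a pending manual intervention. The function and parameter names are hypothetical and chosen for illustration; the only values taken from this topic are the 14-day default and the action names.

```python
from datetime import datetime, timedelta

# Default from this topic: 14 days (14d).
DEFAULT_TIMEOUT = timedelta(days=14)


def action_after_timeout(started_at, now, action, timeout=DEFAULT_TIMEOUT):
    """Return the configured action once the timeout is reached, else None.

    Hypothetical helper: None means the intervention is still waiting
    for someone to approve or reject the deployment.
    """
    if now - started_at >= timeout:
        return action
    return None


start = datetime(2024, 1, 1)
# After 3 days the intervention is still pending; after 14 days the action runs.
pending = action_after_timeout(start, start + timedelta(days=3), "End Execution")
expired = action_after_timeout(start, start + timedelta(days=14), "End Execution")
```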
If a manual intervention has occurred, you can see it in the Workflow step details in Deployments.
Rollback Workflow Execution
(Applies to Workflow Phase only)
Harness initiates rollback of the Workflow.
Although Failure Strategies can be applied at both the Workflow step and Phase levels, Rollback Workflow Execution currently applies to Workflow Phases only, not to Workflow steps.
Rollback Phase Execution
Harness will initiate rollback of the Phase.
Rollback Provisioner After Phases
See New features added to Harness and Features behind Feature Flags (Early Access) for Feature Flag information.
This option is for Canary and Multiservice Workflows that use Infrastructure Provisioners in their Pre-deployment steps to provision the target infrastructure.
By default, provisioners are rolled back before deployment phases and all provisioners are rolled back in the same order in which they were deployed.
When the Rollback Provisioner After Phases failure strategy is used, rollback will happen as follows:
- Deployment phases are rolled back before the Infrastructure Provisioners in Pre-deployment steps.
- All Infrastructure Provisioners in the Pre-deployment Steps are rolled back in the reverse order in which they were deployed.
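The two rollback orderings above can be compared with a short Python sketch. It is a model of the ordering rules stated in this topic, not Harness code: the function name and the `rollback_provisioner_after_phases` parameter are hypothetical, and the order of the phases relative to each other is simply passed through, since this topic does not specify it.

```python
def rollback_order(provisioners, phases, rollback_provisioner_after_phases=False):
    """Return the sequence in which items are rolled back.

    provisioners: Infrastructure Provisioners in the order they were deployed.
    phases: deployment phases (their relative order is left unchanged here).
    """
    if rollback_provisioner_after_phases:
        # Phases first, then provisioners in reverse deployment order.
        return list(phases) + list(reversed(provisioners))
    # Default: provisioners first, in the same order they were deployed.
    return list(provisioners) + list(phases)


default_order = rollback_order(["network", "cluster"], ["Phase 1", "Phase 2"])
after_phases = rollback_order(
    ["network", "cluster"], ["Phase 1", "Phase 2"],
    rollback_provisioner_after_phases=True,
)
```

Reversing the provisioner order matters when a later provisioner depends on an earlier one, for example a cluster that was created inside a previously provisioned network.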
Ignore
Harness ignores any failure and continues with deployment. The deployment does not fail on this step and the status is Passed.
Retry
(Applies to Workflow steps only)
Harness retries the step where the failure occurred.
End Execution
Harness ends the Workflow (fails the state) without rolling back. The status of the Workflow is Failed. Typically, End Execution is used with Manual Intervention.
Abort Workflow
Harness aborts the Workflow without rolling back. The status of the Workflow is Aborted.
Step: Step-level Failure Strategy
To define the failure strategy for the step section of a Workflow, do the following:
- Next to the step section title, click more options (⋮). The step-level settings appear.
- In Failure Strategy, click Custom. The Failure Strategy settings appear.
- Click Add Failure Strategy.
- Fill out the strategy. The dialog has the following fields:
- Failure - Select the type of error, such as Verification or Application. The step-level Failure Strategy has the same options as the Phase-level Failure Strategy, except that it also includes Timeout Error.
- Action - Select the action for Harness to take in the event of a failure, such as a retry or a rollback.
- Specific Steps - Select any specific Workflow steps that you want to target for the Failure Strategy.
The criteria for the strategy are applied to those steps only. If you do not select any steps, the strategy is applied to all steps in that Workflow section.
There is no Scope setting, like the Scope setting in the Workflow-level Failure Strategy, because the scope of this strategy is the step section.
- Click Submit. The failure strategy is added to the step section.
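The Specific Steps rule from the procedure above can be expressed as a small Python sketch. The function name `targeted_steps` and the step names in the example are hypothetical; the only behavior modeled is that an empty selection means every step in the section.

```python
def targeted_steps(section_steps, specific_steps=()):
    """Return the steps in a Workflow section that a strategy applies to."""
    if not specific_steps:
        # No selection: the strategy covers every step in the section.
        return list(section_steps)
    selected = set(specific_steps)
    return [step for step in section_steps if step in selected]


section = ["Setup", "Deploy", "Verify"]
all_steps = targeted_steps(section)
only_verify = targeted_steps(section, ["Verify"])
```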
The Timeout Error condition in a Workflow step-level Failure Strategy helps you gracefully manage ECS step timeouts when you are deploying many containers in a Workflow or Pipeline.
In Specific Steps, you can select one or more of the following step types:
- ECS Service Setup
- ECS Run Task
- ECS Daemon Service Setup
- Setup Load Balancer
- Setup Route 53
- ECS Upgrade Containers
- ECS Steady State Check
- Swap Target Groups
- Swap Route 53 DNS
- Rollback ECS Setup
- ECS Rollback Containers
- Rollback Route 53 Weights
- Rollback Swap Target Groups
Review: Multiple Failure Strategies in a Workflow
When using multiple Failure Strategies in a Workflow, consider the following:
- Failure strategies that do not overlap (different types of failures selected) behave as expected.
- Two failures cannot occur at the same time, so the Failure Strategy that matches whichever error occurs is used.
Conflicts might arise between failure strategies on the same level or different levels. By level, we mean the step-level or the Workflow level:
If there is a conflict between multiple failures in strategies on the same level, the first applicable strategy is used, and the remaining strategies are ignored.
For example, consider these two strategies:
- Abort Workflow on Verification Failure or Authentication Failure.
- Ignore on Verification Failure or Connectivity Error.
Here's what will happen:
- On a verification failure, the Workflow is aborted.
- On an authentication failure, the Workflow is aborted.
- On a connectivity error, the error is ignored.
If there is a conflict between the selected failures in strategies on different levels, the step-level strategy is used and the Workflow-level strategy is ignored.
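The conflict rules above can be summarized in a minimal Python sketch. This is a model of the documented behavior under stated assumptions, not the Harness implementation: the `FailureStrategy` class, the function names, and the failure labels used as strings are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureStrategy:
    failures: frozenset  # failure types this strategy matches
    action: str          # action to take when one of them occurs


def first_match(failure, strategies):
    """Same level: the first applicable strategy wins, the rest are ignored."""
    for strategy in strategies:
        if failure in strategy.failures:
            return strategy
    return None


def resolve_action(failure, step_level=(), workflow_level=()):
    """Different levels: a matching step-level strategy overrides the Workflow level."""
    match = first_match(failure, step_level)
    if match is None:
        match = first_match(failure, workflow_level)
    return match.action if match is not None else None


# The two Workflow-level strategies from the example above, in order.
workflow_level = [
    FailureStrategy(frozenset({"Verification", "Authentication"}), "Abort Workflow"),
    FailureStrategy(frozenset({"Verification", "Connectivity"}), "Ignore"),
]
on_verification = resolve_action("Verification", workflow_level=workflow_level)
on_connectivity = resolve_action("Connectivity", workflow_level=workflow_level)
```

Because the lookup stops at the first match, the order in which strategies are listed on a level determines the outcome whenever their failure types overlap.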