Handling Transient Failures with Retry Logic
Overview
Networking issues or service instability can occasionally cause steps in your CircleCI job to fail. In such cases, adding retry logic can help your job recover without manual intervention. This article shows how to implement a simple retry loop using a Bash while statement inside a run step.
Retry Loop Example
Here is an example of retry logic in a CircleCI job:
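- run: |
    # Number of attempts remaining
    N=5
    while [ "$N" -gt 0 ]
    do
      # Stop as soon as the command succeeds
      if docker-compose up -d; then
        echo "Spun up services via Docker Compose"
        exit 0
      fi
      echo "Failed to spin up services. Retrying..."
      N=$(( N - 1 ))
      # Wait before the next attempt
      sleep 10
    done
    # Every attempt failed, so fail the step
    exit 1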
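In this example, the command docker-compose up -d is attempted up to 5 times. If any attempt succeeds, exit 0 ends the step immediately with a success status. If every attempt fails, the loop finishes and exit 1 marks the step as failed. You can change the value of N to adjust how many attempts are made.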
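If several steps need the same behavior, the loop can be wrapped in a small shell function. The following is a minimal sketch, assuming the step runs under Bash. The function name retry and its argument order (attempt count, delay in seconds, then the command) are illustrative choices, not a built-in CircleCI feature.

```shell
# retry: run a command up to MAX_ATTEMPTS times, pausing DELAY seconds between tries.
# Hypothetical helper for illustration; usage: retry MAX_ATTEMPTS DELAY command [args...]
retry() {
  local max_attempts=$1 delay=$2
  shift 2
  local attempt=1
  while true; do
    # Return success as soon as the command succeeds
    if "$@"; then
      return 0
    fi
    # Give up once the attempt budget is used
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "Command failed after $attempt attempts: $*" >&2
      return 1
    fi
    echo "Attempt $attempt of $max_attempts failed. Retrying in ${delay}s..." >&2
    attempt=$(( attempt + 1 ))
    sleep "$delay"
  done
}
```

With the function defined at the top of the run step, the example above could be written as retry 5 10 docker-compose up -d. Unlike the inline loop, this sketch does not sleep after the final failed attempt.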
The sleep 10 line introduces a delay (in seconds) between retries. This can help if the issue is due to temporary unavailability or delayed resource readiness.
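A fixed delay is the simplest choice. One common variation, not part of the example above, is to lengthen the wait after each failure (exponential backoff) so that a struggling service gets more time to recover. The helper below is a sketch under assumed values: the name backoff_delay, the 10-second base, and the 60-second cap are all illustrative.

```shell
# backoff_delay: print the wait (in seconds) to use after a given failed attempt.
# Doubles the base delay for each failed attempt and caps it at a maximum.
# Hypothetical helper; usage: backoff_delay ATTEMPT [BASE] [MAX]
backoff_delay() {
  local attempt=$1 base=${2:-10} max=${3:-60}
  # attempt 1 -> base, attempt 2 -> 2 * base, attempt 3 -> 4 * base, ...
  local delay=$(( base * (1 << (attempt - 1)) ))
  if [ "$delay" -gt "$max" ]; then
    delay=$max
  fi
  echo "$delay"
}
```

To use it, the loop would need a counter of failed attempts starting at 1 (called attempt here), and sleep 10 would become sleep "$(backoff_delay "$attempt")". Longer waits add to the total job duration, so the cap is worth keeping low.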
Note: Retrying steps can increase the total job duration and may result in higher credit usage. Be mindful of this when deciding how many retries to allow.