Handling Transient Failures with Retry Logic

Overview

Network issues or service instability can occasionally cause steps in your CircleCI job to fail. In such cases, retry logic lets the job recover without manual intervention. This article shows how to implement a simple retry loop using a Bash while loop inside a run step.

Retry Loop Example

Here is an example of retry logic in a CircleCI job:

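- run: |
    # Number of attempts remaining
    N=5
    while [ "$N" -gt 0 ]
    do
      # On success, end the step immediately with a passing status
      if docker-compose up -d; then
        echo "Spun up services via Docker Compose"
        exit 0
      fi

      echo "Failed to spin up services. Retrying..."
      N=$(( N - 1 ))
      sleep 10
    done
    # All attempts failed, so fail the step
    exit 1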

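In this example, the command docker-compose up -d is attempted up to 5 times. If it succeeds on any attempt, exit 0 ends the step immediately with a passing status. If every attempt fails, the loop finishes and exit 1 marks the step as failed. You can change the value of N to adjust how many attempts are made.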
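
If several steps need the same behavior, the loop can be wrapped in a small shell function. The following is only a sketch under assumptions: retry is a hypothetical helper name, not a CircleCI or Bash built-in, and its argument order (maximum attempts, delay in seconds, then the command) is chosen purely for illustration. Unlike the inline loop, it returns a status rather than calling exit, and it skips the delay after the final failed attempt.

```shell
# retry: hypothetical helper (an assumption, not a CircleCI built-in).
# Usage: retry <max_attempts> <delay_seconds> <command> [args...]
retry() {
  local max_attempts=$1
  local delay=$2
  shift 2                       # the remaining arguments are the command to run
  local attempt=1

  while true; do
    # Run the command; stop as soon as it succeeds
    if "$@"; then
      return 0
    fi

    # Give up once the attempt budget is used, without a trailing sleep
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "Command failed after $attempt attempts: $*" >&2
      return 1
    fi

    echo "Attempt $attempt of $max_attempts failed. Retrying in ${delay}s..." >&2
    attempt=$(( attempt + 1 ))
    sleep "$delay"
  done
}
```

With this function defined earlier in the same run step, a call such as retry 5 10 docker-compose up -d would behave much like the inline loop: up to 5 attempts with 10 seconds between them, and a non-zero status if all of them fail.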

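The sleep 10 line waits 10 seconds after each failed attempt before trying again. This can help if the issue is due to temporary unavailability or delayed resource readiness. As written, the delay also runs after the final failed attempt, just before the step exits with a failure.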
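
A fixed delay is the simplest choice, but a delay that grows with each failure gives a struggling service more time to recover. The sketch below shows one way to compute such a delay. It is an illustration rather than part of the original example: backoff_delay is a hypothetical helper name, and the doubling strategy, base delay, and cap are all assumptions you would tune for your own job.

```shell
# backoff_delay: hypothetical helper (an assumption for illustration).
# Prints the number of seconds to wait before the next attempt.
# Usage: backoff_delay <attempt> <base_seconds> <max_seconds>
#   attempt is 1 for the first failure, 2 for the second, and so on.
backoff_delay() {
  local attempt=$1
  local base=$2
  local max=$3

  # Double the base delay for every previous failure: base, 2*base, 4*base, ...
  local delay=$(( base * (1 << (attempt - 1)) ))

  # Cap the delay so a long run of failures does not stall the job indefinitely
  if [ "$delay" -gt "$max" ]; then
    delay=$max
  fi

  echo "$delay"
}
```

With a base of 10 and a cap of 60, attempts 1 through 5 yield delays of 10, 20, 40, 60, and 60 seconds. To use it in a retry loop, keep a counter that starts at 1 and increases after each failure, then replace the fixed delay with sleep "$(backoff_delay "$attempt" 10 60)".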

Note: Retrying steps can increase the total job duration and may result in higher credit usage. Be mindful of this when deciding how many retries to allow.

Additional Resources

Bash For-Loop Examples
