Overview
This article is a central troubleshooting reference for both runner types:
- Machine Runner 3.x — an agent installed directly on a VM or physical machine (Linux, macOS, Windows)
- Container Runner — a Helm-deployed agent that schedules jobs as pods in a Kubernetes cluster
If you are still using Launch Agent 1.x, stop here and migrate first — see Issue 4: Launch Agent 1.x jobs are failing (EOL) below.
Quick Pre-Checks
Before diving into specific issues, confirm the following:
| Check | How |
|---|---|
| Runner is registered and visible | Org Settings → Self-Hosted Runners → confirm resource class appears and shows a runner |
| Runner version | circleci-runner --version (machine runner) or helm list -n <namespace> (container runner) |
| Resource class name in config matches exactly | Names are case-sensitive: my-org/my-runner ≠ my-org/My-Runner |
| Runner has outbound internet access to runner.circleci.com | Port 443 required |
| Runner token is valid and not rotated | If token was recently rotated, restart the runner process with the new token |
Issue 1: "We cannot run this job using the selected resource class"
Symptom: The job fails immediately with the message quoted in the heading above.
Cause A — Resource class does not exist
Verify the resource class was created:
    circleci runner resource-class list <your-namespace>
If missing, create it:
    circleci runner resource-class create <your-namespace>/<resource-class-name> "description"
Cause B — Runner is not enabled for your plan
Self-hosted runners require a Scale, Custom, or Server plan. Performance and Free plans do not have access. Check at Org Settings → Plan.
Cause C — Typo in config.yml
The resource class in your config must exactly match what was created. Check for capitalization differences, leading/trailing spaces, or namespace mismatches:
    # Must match the registered resource class exactly
    resource_class: my-org/my-runner-name
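Stray whitespace and capitalization differences are hard to spot by eye. The following is a minimal sketch, not an official tool, that prints every `resource_class` value in a config file wrapped in brackets and checks for an exact match. The function names `list_resource_classes` and `has_resource_class` are hypothetical, and the sketch assumes each value sits on the same line as its key and is unquoted.

```shell
# Print every resource_class value in a config file, wrapped in [brackets]
# so that stray leading or trailing whitespace becomes visible.
# Assumption: values are unquoted and on the same line as the key.
list_resource_classes() {
  local config_file="$1"
  grep -E '^[[:space:]]*resource_class:' "$config_file" \
    | sed -E 's/^[[:space:]]*resource_class:[[:space:]]?(.*)$/[\1]/'
}

# Exit 0 only if the expected name appears as an exact, case-sensitive value.
has_resource_class() {
  local config_file="$1" expected="$2"
  list_resource_classes "$config_file" | grep -Fxq "[$expected]"
}
```

For example, `has_resource_class .circleci/config.yml my-org/my-runner-name || echo "no exact match"` reports a mismatch, and `list_resource_classes .circleci/config.yml` shows what the file actually contains.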
Issue 2: Jobs Queued or Stuck in "Not Running" / "Preparing Environment"
Check 1 — Confirm at least one runner is online
Go to Org Settings → Self-Hosted Runners. If the resource class shows "No runners" or all runners appear offline, the runner process has stopped or lost connectivity.
Check 2 — Review maxConcurrentTasks
Each resource class has a maxConcurrentTasks limit (default: 20). If this limit is reached, additional jobs queue even if runner machines appear idle. Contact CircleCI Support to request an increase.
Check 3 — Inspect runner logs
See Runner Log File Locations below. Look for:
- failed to claim task: runner cannot reach the CircleCI backend
- context deadline exceeded: network timeout to runner.circleci.com
- token is invalid: runner token was rotated; restart the runner with the new token
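The three messages above can be searched for in one pass. This is a rough sketch that assumes the runner log has been saved to a file and that the messages appear verbatim somewhere in a log line. The function name `diagnose_runner_log` is hypothetical.

```shell
# Scan a runner log file for the three messages listed above and print
# one hint per message type found. Prints nothing if none are present.
diagnose_runner_log() {
  local log_file="$1"
  if grep -q 'failed to claim task' "$log_file"; then
    echo 'failed to claim task: runner cannot reach the CircleCI backend'
  fi
  if grep -q 'context deadline exceeded' "$log_file"; then
    echo 'context deadline exceeded: network timeout to runner.circleci.com'
  fi
  if grep -q 'token is invalid' "$log_file"; then
    echo 'token is invalid: restart the runner with the new token'
  fi
}
```

On a systemd host, one way to use it is to dump recent logs first with `journalctl -u circleci-runner > runner.log`, then run `diagnose_runner_log runner.log`.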
Check 4 — For container runner, check pod status
    kubectl get pods -n <namespace>
    kubectl logs deployment/container-agent -n <namespace>
If the container-agent pod is not in Running state, see Issues 5 and 6 below.
Issue 3: Runner Appears Online but Jobs Are Not Being Claimed
Cause A — Runner is at maxConcurrentTasks capacity
If a previous batch of jobs did not release cleanly (e.g., machine rebooted mid-job), tasks may still be counted as active in the backend. Contact Support to clear stuck task claims.
Cause B — Runner cannot reach the task assignment endpoint
The runner must be able to reach:
- runner.circleci.com:443
- *.circle-artifacts.com (for artifact and cache operations)
Test from the runner machine:
    curl -I https://runner.circleci.com/api/v3/runner/unclaim
Cause C — Clock skew on the runner machine
TLS certificate validation requires the system clock to be within a few minutes of actual time. If the clock is skewed, authentication will fail silently. Verify NTP is configured and the clock is accurate (timedatectl on Linux).
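Where `timedatectl` is unavailable, the local clock can be compared with the `Date` header of an HTTPS response. This is a sketch under stated assumptions: `curl` and GNU `date` (for the `-d` option) are installed, and the 120-second threshold in the usage line is an illustrative value, not a documented limit. All three function names are hypothetical.

```shell
# Print the absolute difference, in seconds, between two epoch timestamps.
abs_skew_seconds() {
  local local_epoch="$1" remote_epoch="$2"
  local diff=$(( local_epoch - remote_epoch ))
  echo "${diff#-}"
}

# Fetch the Date header from an HTTPS endpoint and print it as epoch seconds.
# Assumes curl and GNU date are available.
remote_epoch() {
  local url="$1" header
  header=$(curl -sI "$url" | tr -d '\r' | sed -n 's/^[Dd]ate: //p' | head -n 1)
  date -d "$header" +%s
}

# Exit 0 if the two timestamps differ by no more than max_skew seconds.
clock_within() {
  local max_skew="$1" local_epoch="$2" remote="$3"
  [ "$(abs_skew_seconds "$local_epoch" "$remote")" -le "$max_skew" ]
}
```

A possible invocation is `clock_within 120 "$(date +%s)" "$(remote_epoch https://runner.circleci.com)" || echo "clock skew detected"`. If the `curl` request itself fails with a certificate date error, that already points to a skewed clock.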
Issue 4: Launch Agent 1.x Jobs Are Failing (EOL)
Support for Launch Agent 1.x ended on September 17, 2024. Any runner still running a 1.x version will fail.
Symptoms:
- Jobs fail immediately with no useful error in the job output
- Runner logs show authentication or connection errors with no clear cause
Action required: Migrate to Machine Runner 3.x
The migration is straightforward — the configuration file is 1:1 compatible. No config changes are required.
    # macOS (Homebrew)
    brew install circleci-runner
    # Linux (Debian/Ubuntu)
    apt install circleci-runner
    # Linux (RHEL/CentOS)
    yum install circleci-runner
After installing, your existing config file (launch-agent-config.yaml) works without modification:
    circleci-runner start --config launch-agent-config.yaml
Full migration docs: https://circleci.com/docs/guides/execution-runner/migrate-from-launch-agent-to-machine-runner-3-on-linux/
Issue 5: Container Runner — Jobs Stuck in "Task Lifecycle" Stage (K8s Throttling)
Symptom: Jobs hang in the "Task lifecycle" stage. Container-agent logs show:
    waited for 3s due to client-side throttling, not priority and fairness, request: ...
Cause: The single container-agent pod is saturating the Kubernetes API rate limits under high task concurrency.
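To gauge how often throttling occurs before and after changing the replica count, the matching log lines can be counted. This is a small sketch with a hypothetical function name, `count_throttling`, that reads log text on standard input.

```shell
# Count log lines on stdin that report Kubernetes client-side throttling.
# grep -c exits non-zero when the count is 0, so "|| true" keeps the
# function's exit status at 0 in that case.
count_throttling() {
  grep -c 'due to client-side throttling' || true
}
```

For example: `kubectl logs deployment/container-agent -n <namespace> --since=10m | count_throttling`.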
Fix: Increase the replica count in values.yaml:
    agent:
      replicaCount: 2
Apply the change:
    helm upgrade container-agent container-agent/container-agent -n <namespace> -f values.yaml
Issue 6: Container Runner — Pods Remain in "Pending" State
| Cause | How to check |
|---|---|
| Node out of memory (OOM) | kubectl describe node <node-name> — look for MemoryPressure: True |
| Node disk pressure | kubectl describe node <node-name> — look for DiskPressure: True |
| No nodes match pod affinity/tolerations | kubectl describe pod <task-pod-name> -n <namespace> — look for Unschedulable events |
| Image pull failure | kubectl describe pod <task-pod-name> -n <namespace> — look for ImagePullBackOff or ErrImagePull |
For image pull issues with a private registry, see How to use imagePullSecrets on Container Runner.
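The pod-level checks in the table can be combined into one filter over `kubectl describe pod` output. This is a sketch only: the function name `classify_pending_pod` is hypothetical, and it also matches `FailedScheduling`, the event reason the Kubernetes scheduler typically records for unschedulable pods, which the table does not list.

```shell
# Read `kubectl describe pod` output on stdin and print which of the
# indicators from the table appear in it.
# Assumption: FailedScheduling is matched in addition to Unschedulable.
classify_pending_pod() {
  local text found
  text=$(cat)
  found=0
  if printf '%s\n' "$text" | grep -Eq 'ImagePullBackOff|ErrImagePull'; then
    echo 'image pull failure'
    found=1
  fi
  if printf '%s\n' "$text" | grep -Eq 'Unschedulable|FailedScheduling'; then
    echo 'unschedulable'
    found=1
  fi
  if [ "$found" -eq 0 ]; then
    echo 'no known indicator'
  fi
}
```

Usage: `kubectl describe pod <task-pod-name> -n <namespace> | classify_pending_pod`. Node-level conditions such as MemoryPressure still need `kubectl describe node`.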
Issue 7: OIDC Tokens Not Available in Runner Jobs
Symptom: $CIRCLE_OIDC_TOKEN is empty or the job fails when trying to use it.
Cause: OIDC token generation writes a file to /tmp. If /tmp is mounted with the noexec flag (common in hardened environments), this fails silently.
Diagnose:
    mount | grep /tmp
    # Look for "noexec" in the output
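For scripted checks, the same test can be expressed as an exit status. This sketch assumes Linux-style `mount` output of the form `<source> on /tmp type <fs> (<options>)`. The function name `tmp_is_noexec` is hypothetical.

```shell
# Read `mount` output on stdin; exit 0 if /tmp is mounted with noexec.
# Assumption: Linux-style lines, "<source> on /tmp type <fs> (<options>)".
# If /tmp is not a separate mount, no line matches and the exit status is 1.
tmp_is_noexec() {
  grep -E ' on /tmp ' | grep -Eq '[(,]noexec[,)]'
}
```

Usage: `mount | tmp_is_noexec && echo "/tmp is mounted noexec"`.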
Fix options:
- Remove the noexec flag from /tmp if your security policy permits.
- Configure the runner to use an alternative working directory that allows execution.
- Use a native credential mechanism (AWS IAM instance profiles, GCP Workload Identity) instead of OIDC on that runner.
Issue 8: "fork/exec /bin/bash: bad file descriptor" (Container Runner)
Symptom:
    failed to start cmd: fork/exec /bin/bash: bad file descriptor
Cause: The job's Docker image does not have /bin/bash, or the image entrypoint conflicts with the runner's task agent.
Fix:
- Ensure the image includes bash (for Debian-based images, RUN apt-get update && apt-get install -y bash), or use an image that includes it.
- Explicitly set the shell in your job config:
    jobs:
      my-job:
        shell: /bin/sh -eo pipefail
Issue 9: SSH Debugging Not Working on Self-Hosted Runners
Container Runner does not support SSH debugging. This is a current product limitation — "Rerun job with SSH" is not available for container runner jobs.
Machine Runner does support SSH reruns. If it's not working, verify:
- Project Settings → Advanced → Enable SSH reruns is turned on
- The runner machine is network-accessible from your IP on the SSH port
Runner Log File Locations
Machine Runner 3.x
| OS | Log location |
|---|---|
| Linux (systemd) | journalctl -u circleci-runner -f |
| Linux (file) | /var/log/circleci-runner/circleci-runner.log |
| macOS | ~/Library/Logs/com.circleci.runner/circleci-runner.log |
| Windows | C:\ProgramData\CircleCI\circleci-runner.log |
To increase log verbosity, set log_level: debug in the runner config file and restart the service.
Container Runner
    # Container agent logs
    kubectl logs deployment/container-agent -n <namespace> --tail=200
    # Logs for a specific task pod
    kubectl logs <task-pod-name> -n <namespace>
    # Events (most useful for Pending pods)
    kubectl describe pod <task-pod-name> -n <namespace>
When Escalating to Support
Include the following in your ticket to avoid back-and-forth:
- Runner type: Machine Runner or Container Runner
- Runner version: circleci-runner --version or Helm chart version (helm list -n <namespace>)
- Resource class name exactly as it appears in config.yml
- OS and version (machine runner) or Kubernetes version and cloud provider (container runner)
- Runner logs from the time window of the failure
- The specific failing job URL from app.circleci.com
- Output of circleci runner resource-class list <namespace>
- Whether the issue is intermittent or consistent
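Some of these details can be gathered on a machine runner host with a short script. The sketch below uses a hypothetical function name, `collect_support_info`, and covers only the timestamp, OS, and runner version. Logs, the job URL, and the resource class listing still need to be attached separately.

```shell
# Write basic environment details for a support ticket to the given file.
# Commands that are not installed are reported rather than treated as errors.
collect_support_info() {
  local out_file="$1"
  {
    echo "Collected: $(date -u +%Y-%m-%dT%H:%M:%SZ)"
    echo "OS: $(uname -sr)"
    if command -v circleci-runner >/dev/null 2>&1; then
      echo "Runner version: $(circleci-runner --version 2>&1)"
    else
      echo "Runner version: circleci-runner not found on PATH"
    fi
  } > "$out_file"
}
```

Usage: `collect_support_info support-info.txt`, then attach the resulting file to the ticket.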
Additional Resources
- Self-hosted runner overview
- Machine Runner 3 installation
- Container Runner installation
- Migrate from launch agent to machine runner 3
- Machine runner 3 configuration reference
- Container runner Helm chart (GitHub)