Clearing the Nomad job queue
When CircleCI jobs are stuck in the queue or otherwise won't run, it is sometimes necessary to clear the Nomad job queue manually as part of regular troubleshooting. Refer to our Introduction to Nomad Cluster Operation article for a basic overview.
CircleCI Server 3.x/4.x
To cancel a single job:
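- Make sure you are connected to the Kubernetes cluster.
- List the pods in the execution layer to find the Nomad server pod:
kubectl get pods -l layer=execution -n <namespace>
- List the current Nomad jobs:
kubectl exec -it <nomad_server_pod> -n <namespace> -- nomad status
- Stop the job, replacing $ID with a value from the ID column of the previous output:
kubectl exec -it <nomad_server_pod> -n <namespace> -- nomad stop $ID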
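When the queue is long, it can help to narrow the nomad status output to jobs in one state before picking an ID. The following is a minimal sketch, not part of the official procedure: jobs_with_status is a hypothetical helper, and it assumes the usual nomad status layout in which ID is the first column and Status is the fourth.

```shell
# Hypothetical helper: print the IDs of jobs whose Status column equals $1.
# Assumes `nomad status` prints a header row followed by rows shaped like
# "ID  Type  Priority  Status  ..." (ID in column 1, Status in column 4).
jobs_with_status() {
  awk -v want="$1" 'NR > 1 && $4 == want { print $1 }'
}

# Example, using the same placeholders as the steps above. The -it flags are
# left out because the output is piped rather than shown in a terminal:
#   kubectl exec <nomad_server_pod> -n <namespace> -- nomad status | jobs_with_status pending
```

Any ID printed by the helper can then be passed to nomad stop as in the last step.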
To force cancel all jobs in the queue:
- Make sure you are connected to the Kubernetes cluster.
- List the pods in the execution layer to find the Nomad server pod:
kubectl get pods -l layer=execution -n <namespace>
- Stop every job that nomad status reports:
kubectl exec -it <nomad_server_pod> -n <namespace> -- sh -c "nomad status | cut -d' ' -f1 | grep -v 'ID' | xargs -n1 nomad stop"
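Before force cancelling, it may be worth previewing which IDs the command would act on. The sketch below is one way to do that and is an illustration only: nomad_job_ids is a hypothetical helper. It skips the header by line position instead of using grep -v 'ID', which would also drop any job whose ID happened to contain the text ID.

```shell
# Hypothetical helper: read `nomad status` output on stdin and print one job
# ID per line. The first line is always skipped, so output consisting of a
# single line (a header alone, or a one-line message) yields nothing.
nomad_job_ids() {
  awk 'NR > 1 && NF > 0 { print $1 }'
}

# Dry run: list the IDs without stopping anything.
#   kubectl exec <nomad_server_pod> -n <namespace> -- nomad status | nomad_job_ids
```

If the printed list looks right, the force-cancel command above can be run as documented.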
CircleCI Server 2.x
Running nomad status while SSH'd into the Services machine displays all currently running Nomad jobs, each of which represents a specific CircleCI job.
Running nomad stop $ID with a job ID from the ID column of this output clears an individual job from the queue.
However, the list of jobs can sometimes be very long, depending on how long queueing has been happening and how frequently developers have been pushing new commits.
Instead of manually running nomad stop for each job in the queue, you can automate the process with the following shell pipeline:
nomad status | awk 'NR>1{print $1}' | xargs -iID replicated admin nomad stop ID
This pipeline cancels all queued and running jobs in the Nomad job queue.
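Because this stops every job at once, a dry run can be useful first. The following sketch prints the stop commands instead of executing them. preview_nomad_stops is a hypothetical helper name, and the command prefix is copied from the line above rather than verified against a particular Server 2.x release.

```shell
# Hypothetical helper: turn `nomad status` output on stdin into the stop
# commands that would be run, one per job, without executing any of them.
preview_nomad_stops() {
  awk 'NR > 1 && NF > 0 { print "replicated admin nomad stop " $1 }'
}

# On the Services machine:
#   nomad status | preview_nomad_stops
```

Reviewing the printed commands gives a chance to confirm the number of jobs affected before clearing the queue.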