Telegraf Pod CrashLoopBackOff After Installing or Upgrading CircleCI Server with Helm 3.18.0

Issue Summary:

When deploying CircleCI Server using Helm 3.18.0, the telegraf pods may fail to start and enter a CrashLoopBackOff state due to invalid TOML syntax in the generated ConfigMap.

Symptoms:

Running kubectl logs <telegraf-pod> shows:

I! Loading config: /etc/telegraf/telegraf.conf
E! loading config file /etc/telegraf/telegraf.conf failed: error parsing data: line 30: invalid TOML syntax

Diagnosis:

Inspect the telegraf ConfigMap:

kubectl get cm -n <circleci-namespace> telegraf -o yaml

You'll likely find:

[[inputs.statsd]]
datadog_extensions = true
max_ttl = "12h"
metric_separator = "."
percentiles = [,,
]

The percentiles array is rendered with empty elements ([,, ]) instead of numeric values. This is invalid TOML, so Telegraf fails to parse the file.
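The inspection above can also be scripted. The following is a minimal sketch, not an official CircleCI tool: has_broken_percentiles is a hypothetical helper name, and it only assumes that the malformed array starts with a comma, as in the output shown above.

```shell
# Hypothetical helper: reads ConfigMap text on stdin and succeeds (exit 0)
# if a percentiles array begins with a comma, i.e. has an empty first element.
has_broken_percentiles() {
  grep -Eq 'percentiles[[:space:]]*=[[:space:]]*\[[[:space:]]*,'
}

# Usage (replace the placeholder with your namespace):
#   kubectl get cm -n <circleci-namespace> telegraf -o yaml | has_broken_percentiles \
#     && echo "telegraf ConfigMap has a malformed percentiles array"
```

If the helper reports nothing, the ConfigMap is probably not affected by this issue, and the Telegraf log should be checked for a different parse error.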

Cause:

This is related to a bug in Helm 3.18.0 when rendering arrays in templates with default values.

See Helm issue #30918 for full context.

Solution:

  • Upgrade to Helm 3.18.1 or later and redeploy with helm upgrade to resolve the issue.
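Before redeploying, it can help to confirm which Helm client version is in use. The sketch below assumes the version string has the form printed by helm version --short (a leading "v" and a "+g<commit>" suffix); is_affected_helm is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds (exit 0) only for the affected release.
# Accepts strings in the form printed by `helm version --short`,
# for example "v3.18.0+g<commit>".
is_affected_helm() {
  ver="${1#v}"      # drop the leading "v", if present
  ver="${ver%%+*}"  # drop the "+g<commit>" build suffix, if present
  [ "$ver" = "3.18.0" ]
}

# Only runs when a helm binary is on the PATH.
if command -v helm >/dev/null 2>&1; then
  if is_affected_helm "$(helm version --short)"; then
    echo "Helm 3.18.0 detected: upgrade to 3.18.1 or later before redeploying"
  else
    echo "Helm client is not 3.18.0"
  fi
fi
```

The check matches only the exact release named in this article, so it does not attempt to compare version ranges.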

Workaround:

Manually patch the telegraf ConfigMap with valid TOML syntax. The values below are the values.yaml defaults:

percentiles = [50, 95, 99]
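As an alternative to editing by hand, the patch could be scripted roughly as follows. This is a sketch under assumptions: fix_percentiles is a hypothetical helper, the file names are placeholders, and it expects the broken array to appear exactly as shown in the Diagnosis section (an opening "[,," line followed by a "]" line) with the ConfigMap data exported as plain multi-line text. Review the diff before applying anything to the cluster.

```shell
# Hypothetical helper: rewrites a percentiles array that was rendered as
# "[,," followed by "]" on the next line. Other lines pass through unchanged.
fix_percentiles() {
  sed -e '/percentiles = \[ *,[ ,]*$/{
N
s/percentiles = \[ *,[ ,]*\n[[:space:]]*\]/percentiles = [50, 95, 99]/
}'
}

# Usage (replace the placeholder with your namespace):
#   kubectl get cm -n <circleci-namespace> telegraf -o yaml > telegraf-cm.yaml
#   fix_percentiles < telegraf-cm.yaml > telegraf-cm-fixed.yaml
#   diff telegraf-cm.yaml telegraf-cm-fixed.yaml
#   kubectl replace -f telegraf-cm-fixed.yaml
```

A later helm upgrade run with Helm 3.18.0 would presumably re-render the broken value, so this remains a stopgap until the Helm client is upgraded.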

Then delete the telegraf pod so that it restarts with the corrected configuration:

kubectl delete pod -n <circleci-namespace> <telegraf-pod-name>