Troubleshooting¶
This page describes how to get the information you need to diagnose problems with a Varnish Gateway installation. The list of known failure modes will grow as the project matures.
Getting logs¶
Most problems surface in one of three log streams. Start here before digging deeper.
Operator logs¶
The operator reconciles Gateway, HTTPRoute, and GatewayClassParameters resources. Errors during reconciliation — missing references, invalid VCL, failed resource creation — appear here:
bash
kubectl logs -n varnish-gateway-system -l app.kubernetes.io/component=operator -f
Chaperone logs¶
Chaperone manages the varnishd process inside each gateway pod. Reload failures, endpoint resolution problems, and varnishd crashes appear here:
bash
kubectl logs -f <gateway-pod> -c varnish-gateway
Varnish request logs¶
For request-level debugging — cache misses, backend errors, unexpected routing — use varnishlog inside the pod:
bash
kubectl exec -it <gateway-pod> -c varnish-gateway -- \
varnishlog -n /var/run/varnish/vsm -g request
Filter to specific status codes, URLs, or backends with -q:
```bash
503 errors¶
kubectl exec -it
Backend-side 5xx¶
kubectl exec -it
See guides/logging.md for the full set of varnishlog flags and query syntax.
Dashboard¶
The built-in dashboard gives a live view of ghost's routing state, recent reload events, and filtered varnishlog output — without needing kubectl exec. Port-forward to reach it:
bash
kubectl port-forward <gateway-pod> 9000:9000
Then open http://localhost:9000. The /api/state endpoint returns a
JSON snapshot of the current vhost configuration, backend health, and
recent events. The /api/varnishlog endpoint streams filtered
varnishlog-json over Server-Sent Events. See
observability.md for the full endpoint
list.
Checking resource status¶
Gateway API resources carry status conditions that surface problems without needing logs:
```bash
Gateway status — look for Accepted and Programmed conditions¶
kubectl describe gateway
HTTPRoute status — look for Accepted and ResolvedRefs conditions¶
kubectl describe httproute
GatewayClass status¶
kubectl describe gatewayclass varnish ```
A Gateway stuck in Programmed=False or an HTTPRoute with
ResolvedRefs=False usually indicates a missing Secret, Service, or
ReferenceGrant. The status message includes specifics.
Known failure modes¶
Pods crashlooping¶
Check the chaperone logs for the crash reason. Possible causes:
- Invalid VCL syntax in user VCL — chaperone logs the
vcl.loaderror from varnishadm. - Missing or unreadable ConfigMap mount — the pod starts but varnishd cannot load its configuration.
- Insufficient memory — if a memory limit is set too close to the configured storage size, the kernel OOM-kills varnishd. Check
kubectl describe podfor OOMKilled status.
Reloads not taking effect¶
Both VCL and ghost reloads use content-based deduplication — if the
generated content is byte-identical to the previous version, no reload
is issued. Check the chaperone logs for reload events and errors. The
Prometheus counters chaperone_ghost_reloads_total and
chaperone_vcl_reloads_total confirm whether reloads are happening.
A failed VCL reload leaves the previous VCL active. The error is logged and the gateway continues serving with the last good configuration.
TLS handshake failures¶
Check that the referenced Secret exists, is of type kubernetes.io/tls,
and contains valid PEM in tls.crt and tls.key. For cross-namespace
certificate references, verify that a ReferenceGrant exists in the
Secret's namespace. See guides/tls.md for the full
TLS setup.
See also¶
- Observability — health endpoints, metrics, dashboard, and log access
- Logging guide — sidecar and ad-hoc varnishlog usage
- TLS guide