Troubleshooting
Common production issues and how to recover from them.
Deployment Has Non-Ready Pods
Diagnose and recover deployments that have non-ready or failing pods.
Artifact History & Rollback
Inspect artifact history and roll back to a previous version when needed.
Expanding a PVC
Safely expand a Kubernetes PersistentVolumeClaim without downtime.
Rolling Restart of Pods
Perform a rolling restart of pods without downtime.
Unlocking Terraform State
Recover from a locked Terraform state when a release operation is interrupted.
Setting Up Loki for VMs
Configure Loki to collect logs from virtual machines as well as Kubernetes pods.
Handle Loki Alerts
Diagnose and resolve common Loki log-aggregator alerts.
Removing Unhealthy Ingester Instances
Remove unhealthy ingester instances from the Loki ring when ingestion is failing.