Chaos Engineering: Breaking Your Cluster to Make It Stronger

Your cluster looks healthy. Pods are running. Metrics are green. Everything works. Until a node fails during peak traffic. Or the database connection pool exhausts. Or that one service nobody remembers deploying starts consuming all available memory. You can wait for these things to happen in production at 3 AM. Or you can break things intentionally, on your terms, and fix the weaknesses before they become outages. This is chaos engineering. ...

April 28, 2026 · 7 min read · Tom Meurs

Alerting That Works: From Alert Fatigue to Actionable Notifications

Your phone buzzes at 3 AM. You groggily check: “High CPU usage on node-worker-3.” You look at the graph, see it’s been at 75% for 10 minutes, and go back to sleep. Tomorrow, same alert. Next week, you stop checking altogether. This is alert fatigue, and it’s dangerous. When everything alerts, nothing does. Real incidents get lost in the noise. I’ve been on both sides — drowning in alerts, and running systems where pages are rare and always actionable. The difference isn’t better tools. It’s better thinking about what deserves attention. ...

April 16, 2026 · 7 min read · Tom Meurs