Drift Detection with ArgoCD: How to Know If Your Cluster Is Still in Sync

GitOps promises that Git is the source of truth. But what if someone kubectl edits a deployment? What if a mutating webhook changes a resource? What if the cluster silently diverges from what Git says it should be?

This is configuration drift, and it’s one of the most insidious problems in Kubernetes operations. ArgoCD can help you detect it — if you configure it correctly.

What Is Configuration Drift?

Drift happens when the actual state of your cluster differs from the desired state in Git.

flowchart LR
    subgraph git["Git says (Source of truth)"]
        G1["replicas: 3"]
        G2["image: v1.2.3"]
        G3["cpu: 100m"]
    end

    subgraph cluster["Cluster has (Actual state)"]
        C1["replicas: 5"]
        C2["image: v1.2.3"]
        C3["cpu: 200m"]
    end

    git -.->|"≠"| cluster

How did replicas become 5 when Git says 3? Possible causes:

Manual changes: Someone ran kubectl scale or kubectl edit
Horizontal Pod Autoscaler: HPA adjusted replicas
Mutating webhooks: Admission controllers modified resources
Controller side effects: Operators made changes
Partial syncs: Sync failed midway

Some drift is intentional (HPA). Most is not. The problem is not knowing which is which.

Why Drift Matters

Without drift detection, you have no guarantee that Git represents reality. This breaks:

Audit trails: “What’s deployed?” becomes “check the cluster” instead of “check Git”
Disaster recovery: Rebuilding from Git won’t match the old state
Security: Unauthorized changes go unnoticed
Reproducibility: Two clusters from the same Git won’t be identical

The moment you have undetected drift, you’ve lost the core benefit of GitOps.

ArgoCD’s Sync Status

ArgoCD continuously compares Git to cluster state. The sync status tells you:

Synced: Cluster matches Git exactly
OutOfSync: Differences detected
Unknown: ArgoCD can’t determine state
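
These values are stored on the Application resource itself, under its status. A trimmed, illustrative excerpt is sketched below; the resource name and namespace are placeholders, and real objects carry many more fields.

```yaml
# Illustrative excerpt of an Application's status (placeholder names).
status:
  sync:
    status: OutOfSync        # Synced | OutOfSync | Unknown
  health:
    status: Healthy
  resources:
    - group: apps
      kind: Deployment
      name: backend          # placeholder
      namespace: default     # placeholder
      status: OutOfSync      # per-resource sync status
      health:
        status: Healthy
```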

Application	Sync Status	Health
frontend	Synced	Healthy
backend	OutOfSync	Healthy
database	Synced	Healthy
cache	Synced	Degraded

OutOfSync is ArgoCD's signal that the cluster and Git disagree, either because the cluster drifted or because a Git change has not been applied yet. But ArgoCD's default behavior might surprise you.

Self-Heal: Automatic Drift Correction

ArgoCD can automatically revert drift:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: frontend
spec:
  syncPolicy:
    automated:
      selfHeal: true  # Revert manual changes
      prune: true     # Delete orphaned resources

With selfHeal: true, when someone runs kubectl scale deployment frontend --replicas=5, ArgoCD will revert it to what Git says within seconds.

This is powerful but has implications:

Intentional changes get reverted
HPA adjustments get overwritten
You can’t quickly hotfix production

For most applications, selfHeal should be enabled. It’s the “GitOps purist” approach.
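
The snippet above shows only the sync policy. For context, a complete Application might look roughly like the sketch below. The repository URL, path, project, and destination namespace are hypothetical placeholders, not values from this article.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: frontend
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://git.example.com/platform/deployments.git  # hypothetical repo
    targetRevision: main
    path: apps/frontend                                        # hypothetical path
  destination:
    server: https://kubernetes.default.svc                     # the cluster ArgoCD runs in
    namespace: frontend                                        # hypothetical namespace
  syncPolicy:
    automated:
      selfHeal: true  # Revert changes made directly in the cluster
      prune: true     # Delete resources that were removed from Git
```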

Handling Intentional Drift: Ignore Differences

Some fields should be managed outside Git:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: autoscaled-app
spec:
  ignoreDifferences:
    - group: apps
      kind: Deployment
      jsonPointers:
        - /spec/replicas
    - group: ""
      kind: Service
      jsonPointers:
        - /spec/clusterIP

This tells ArgoCD: “Don’t report drift for these fields.”

Common fields to ignore:

/spec/replicas (if using HPA)
/spec/clusterIP (assigned by Kubernetes)
/metadata/annotations/<key> (a specific controller-added annotation; ignoring all annotations would hide real changes)
/status (always managed by controllers)
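
Ignore rules work best when they are as narrow as possible. As a sketch, ignoreDifferences also accepts jqPathExpressions and managedFieldsManagers in addition to jsonPointers. The annotation key and sidecar name below are hypothetical.

```yaml
spec:
  ignoreDifferences:
    - group: apps
      kind: Deployment
      jsonPointers:
        # "~1" is the JSON Pointer escape for "/" inside a key
        - /metadata/annotations/example.com~1last-restart   # hypothetical annotation
    - group: apps
      kind: Deployment
      jqPathExpressions:
        # Ignore one injected container rather than the whole container list
        - .spec.template.spec.containers[] | select(.name == "injected-sidecar")  # hypothetical name
    - group: apps
      kind: Deployment
      managedFieldsManagers:
        # Ignore any field owned by this field manager
        - kube-controller-manager
```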

Detecting Drift Without Auto-Fix

Sometimes you want to know about drift but not automatically fix it:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: critical-app
spec:
  syncPolicy:
    automated:
      selfHeal: false  # Don't auto-fix
      prune: false     # Don't auto-delete

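Now ArgoCD reports OutOfSync when the live cluster drifts but does not revert it; someone has to sync manually. Note that the automated block still applies new Git commits on its own; it only stops ArgoCD from re-syncing to undo live changes. This is useful for: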

Critical production systems where you want human review
Debugging drift sources
Applications managed partially outside GitOps
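
If the goal is that nothing at all is applied without a person triggering it, including new commits, one option is to leave out the automated block entirely. A minimal sketch:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: critical-app
spec:
  # No "automated" key: ArgoCD still compares Git to the cluster and reports
  # OutOfSync, but applies nothing until someone runs a sync.
  syncPolicy: {}
```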

Notifications: Alert on Drift

Don’t stare at the dashboard. Get notified:

apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-notifications-cm
  namespace: argocd
data:
  trigger.on-out-of-sync: |
    - when: app.status.sync.status == 'OutOfSync'
      send: [app-out-of-sync]
  template.app-out-of-sync: |
    message: |
      Application {{.app.metadata.name}} is OutOfSync.
      Sync Status: {{.app.status.sync.status}}
      Health: {{.app.status.health.status}}
      Repository: {{.app.spec.source.repoURL}}

Connect this to Slack, PagerDuty, or email. Drift should trigger alerts.
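
A trigger and template do nothing until a notification service is configured and an Application subscribes to the trigger. The sketch below assumes Slack, a trigger registered under the name on-out-of-sync (substitute whatever follows `trigger.` in your ConfigMap), and a hypothetical channel name.

```yaml
# Notification service, defined next to the trigger and template.
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-notifications-cm
  namespace: argocd
data:
  service.slack: |
    token: $slack-token   # resolved from the argocd-notifications-secret Secret
---
# Subscription, set on each Application that should alert.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: backend
  annotations:
    # Format: notifications.argoproj.io/subscribe.<trigger>.<service>: <recipient>
    notifications.argoproj.io/subscribe.on-out-of-sync.slack: platform-alerts  # hypothetical channel
```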

The Diff View: Understanding Drift

When drift occurs, ArgoCD shows exactly what changed:

argocd app diff frontend

Or in the UI, click on an OutOfSync application to see the diff:

--- Git (desired)
+++ Cluster (actual)
@@ -1,4 +1,4 @@
 spec:
-  replicas: 3
+  replicas: 5
   template:
     spec:

This is invaluable for understanding what drifted and why.

Refresh vs Sync

Two different operations:

Refresh: Compare Git to cluster, update status. No changes made.

argocd app get frontend --refresh

Sync: Apply Git state to cluster. Changes made.

argocd app sync frontend

Refresh is safe and frequent (every 3 minutes by default). Sync changes the cluster and should be deliberate (unless automated).
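
The refresh interval is configurable through the argocd-cm ConfigMap. The sketch below shortens it to one minute; the value is only an example, and ArgoCD components may need a restart before the new interval takes effect.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
data:
  # How often ArgoCD re-checks Git for changes (default 180s).
  timeout.reconciliation: 60s
```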

Drift Detection Strategy

Here’s my approach:

For Development/Staging

selfHeal: true — Revert all drift
prune: true — Delete orphaned resources
Fast feedback, pure GitOps

For Production (Most Apps)

selfHeal: true — Revert drift
prune: true — Delete orphaned
Alerts on any OutOfSync event
Investigate why drift happened

For Production (Critical/Special)

selfHeal: false — Human review required
prune: false — Manual deletion only
Strict alerts
Explicit sync approval

For HPA-Managed Apps

ignoreDifferences:
  - group: apps
    kind: Deployment
    jsonPointers:
      - /spec/replicas
syncPolicy:
  automated:
    selfHeal: true  # For other fields
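
One caveat: ignoreDifferences only affects the comparison. When a sync runs for another reason, such as a new commit, ArgoCD applies the whole manifest and can reset replicas to the value in Git. A possible fix, sketched below, is the RespectIgnoreDifferences sync option; another is to omit replicas from the manifest in Git altogether.

```yaml
spec:
  ignoreDifferences:
    - group: apps
      kind: Deployment
      jsonPointers:
        - /spec/replicas
  syncPolicy:
    automated:
      selfHeal: true
    syncOptions:
      # Leave ignored fields untouched during sync, not only during diff
      - RespectIgnoreDifferences=true
```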

Finding the Drift Source

When you see drift, investigate:

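Check who changed it: Which client last wrote the field?

kubectl get deployment frontend -o yaml --show-managed-fields

The managedFields entries name the manager (for example kubectl-edit or kube-controller-manager) and the fields it owns. For the actual user identity, consult the API server audit log if auditing is enabled; Kubernetes Events do not record who edited an object.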

Check controller logs: Did an operator make changes?
Check admission webhooks: Are mutations happening?
```
kubectl get mutatingwebhookconfigurations
```
Check the diff: What exactly changed?
```
argocd app diff app-name
```
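
To answer "who ran kubectl?" reliably, the API server needs an audit policy that records write requests. A minimal sketch is below. It assumes you control the API server flags; managed Kubernetes services usually expose audit logging through their own settings instead. The choice to drop all other events is illustrative.

```yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Record who changed Deployments (user, verb, timestamp), without request bodies
  - level: Metadata
    verbs: ["create", "update", "patch", "delete"]
    resources:
      - group: apps
        resources: ["deployments", "deployments/scale"]
  # Illustrative catch-all: log nothing else
  - level: None
```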

Preventing Drift at the Source

Better than detecting drift is preventing it:

RBAC restrictions: Limit who can modify resources

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: readonly
rules:
  - apiGroups: ["*"]
    resources: ["*"]
    verbs: ["get", "list", "watch"]  # No create/update/delete
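
A Role has no effect until it is bound to someone. A sketch of the matching RoleBinding follows; the group name and namespace are hypothetical, and the Role must live in the same namespace as the binding.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: developers-readonly
  namespace: frontend            # hypothetical namespace; must match the Role's
subjects:
  - kind: Group
    name: developers             # hypothetical group from your identity provider
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: readonly
  apiGroup: rbac.authorization.k8s.io
```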

Policy enforcement: Use Kyverno to block manual changes

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-gitops
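spec:
  validationFailureAction: Enforce  # Default is Audit, which only reports; newer Kyverno releases set this per rule
  background: false                 # Rules that match on request subjects cannot run as background scans
  rules:
    - name: block-manual-changes
      match:
        any:
          - resources:
              kinds:
                - Deployment
      exclude:
        any:
          - subjects:
              - kind: ServiceAccount
                name: argocd-application-controller
                namespace: argocd   # ServiceAccount subjects need a namespace; assumes ArgoCD runs in argocd
      validate:
        message: "Changes must go through GitOps"
        deny: {}                    # No conditions: deny every matching request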

Training: Teach teams to change Git, not cluster

Monitoring Drift Over Time

Track drift as a metric:

# Prometheus query for out-of-sync apps
count(argocd_app_info{sync_status="OutOfSync"})

Alert if it’s non-zero for too long:

- alert: GitOpsDriftDetected
  expr: count(argocd_app_info{sync_status="OutOfSync"}) > 0
  for: 10m
  labels:
    severity: warning
  annotations:
    summary: "GitOps drift detected"
    description: "One or more applications are OutOfSync with Git"
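
The rule above fires once for the whole fleet. A per-application variant tells you which app drifted. The sketch below wraps it in a PrometheusRule, which assumes the Prometheus Operator is installed; the rule name and namespace are hypothetical.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: argocd-drift             # hypothetical
  namespace: monitoring          # hypothetical
spec:
  groups:
    - name: argocd-drift
      rules:
        - alert: GitOpsDriftDetectedPerApp
          # One alert per Application, labelled with its name and project
          expr: argocd_app_info{sync_status="OutOfSync"} == 1
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "{{ $labels.name }} is OutOfSync"
            description: "Application {{ $labels.name }} (project {{ $labels.project }}) has been OutOfSync for 10 minutes"
```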

My Checklist for Drift-Free GitOps

[ ] selfHeal enabled for most applications
[ ] ignoreDifferences configured for HPA-managed replicas
[ ] Notifications set up for OutOfSync events
[ ] RBAC restricts direct cluster modifications
[ ] Policy enforcement prevents manual changes
[ ] Monitoring alerts on drift
[ ] Team trained on GitOps workflow

Configuration drift is the enemy of reliable infrastructure. Detect it immediately, fix it automatically where safe, and investigate ruthlessly when it happens. Git should always reflect reality — that’s the whole point.
