Longhorn vs Rook-Ceph: Kubernetes Storage Compared

The first time you run a stateful workload on a self-hosted cluster, you hit a wall. No cloud provider storage class to lean on. Just your nodes, their disks, and a Postgres pod that refuses to schedule because nothing can give it a PersistentVolume. So you start reading, and within an hour you’ve narrowed it down to two names that keep coming up: Longhorn and Rook-Ceph.

I’ve run both in production. So let me get my bias out of the way before anything else: I default to Longhorn on small clusters, and I’ll explain exactly why later. Keep that in mind as you read, because it colours how I weigh things. Both are CNCF projects, both give you replicated block storage that survives a node dying, and both are good software. They just disagree about how much complexity you should be signing up for.

What you’re actually choosing between

Longhorn is distributed block storage written for Kubernetes from day one. Each volume gets replicated across your nodes using plain Linux storage primitives, and there’s no separate storage system underneath that you need to learn.

Rook-Ceph is a Kubernetes operator wrapped around Ceph. Ceph is a distributed storage system that’s older than Kubernetes by years, runs petabytes at places like CERN, and brings its entire feature set with it: block, object, filesystem, erasure coding, the lot. Rook teaches Kubernetes how to drive it.

That difference in lineage is the whole story. Longhorn was born in your cluster. Ceph moved in and brought a lot of luggage. The luggage is useful when you need it and a burden when you don’t.

The criteria I actually care about

A feature checklist tells you nothing useful here, because both tools will tick most of the boxes. What matters is how a choice plays out in operation. I weigh four things: how much operational surface I’m taking on, how it behaves under failure (this is storage, so this is the whole game), how it uses the resources I have, and how far it scales before it breaks. Performance matters too, but for most homelab and small-production workloads it’s rarely the thing that decides it, so I’ll treat it as a secondary concern.

Longhorn: the one that lives in your cluster

flowchart TD
    subgraph longhorn["Longhorn Architecture"]
        subgraph node1["Node 1"]
            E1["Longhorn Engine"]
            R1["Replica"]
        end
        subgraph node2["Node 2"]
            E2["Longhorn Engine"]
            R2["Replica"]
        end
        subgraph node3["Node 3"]
            R3["Replica"]
        end
    end

    PV["PersistentVolume"] --> E1
    E1 --> R1
    E1 --> R2
    E1 --> R3

How Longhorn works

The model is small enough to hold in your head, which is most of the appeal. Every PVC gets its own Longhorn engine running as a pod. That engine writes your data to replicas sitting on the local disks of several nodes, and it writes synchronously, so nothing gets acknowledged until every replica has it. The workload talks to its volume over iSCSI, exposed by the engine.

Engine per volume: each PVC gets a dedicated Longhorn engine (runs as a pod)
Replicas on nodes: data replicated to multiple nodes’ local disks
Synchronous replication: all replicas written before acknowledging
iSCSI frontend: engine exposes the volume via iSCSI to the workload
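
To make the replica model concrete, here is a minimal sketch of a StorageClass that asks Longhorn for three replicas per volume. The class name longhorn-3x is hypothetical. The provisioner and parameter names follow Longhorn’s documented StorageClass parameters as I understand them, so check them against the version you install.

```yaml
# Hypothetical StorageClass; the name longhorn-3x is made up for illustration.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-3x
provisioner: driver.longhorn.io
allowVolumeExpansion: true
reclaimPolicy: Delete
volumeBindingMode: Immediate
parameters:
  numberOfReplicas: "3"        # one replica on each of three nodes
  staleReplicaTimeout: "2880"  # minutes before a failed replica is cleaned up
  fsType: ext4
```

Any PVC that names this class gets its own engine and three synchronously written replicas.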

When I want to know what a volume is doing, I open the Longhorn UI and I can see it: which replicas are healthy, where they live, whether a rebuild is in progress. No black box. That fits how I want to run infrastructure, as I wrote in Sovereign Infrastructure: I need to understand what I’m running, and Longhorn lets me.

Installing it

helm repo add longhorn https://charts.longhorn.io
helm repo update

helm install longhorn longhorn/longhorn \
  --namespace longhorn-system \
  --create-namespace

Basic configuration, applied by adding -f longhorn-values.yaml to the helm install above:

# longhorn-values.yaml
defaultSettings:
  defaultReplicaCount: 3
  defaultDataPath: /var/lib/longhorn
  storageMinimalAvailablePercentage: 15
  defaultLonghornStaticStorageClass: longhorn

persistence:
  defaultClass: true
  defaultClassReplicaCount: 3

Where it shines

Helm install, and you have storage. No dedicated storage nodes, no pool topology to design first. The built-in web UI shows you volume management, backup status, and node health without you wiring up anything. Backups go straight to S3-compatible storage with incremental snapshots, which on my setup means I point it at my own MinIO and forget about it. None of this carries legacy baggage, because there’s no older system being adapted underneath.

Where it bites you

The costs are real. Longhorn is good for the workloads most of us run, but it isn’t built for extreme IOPS, because every volume’s traffic funnels through its own engine pod and that pod is a ceiling. It works well up to roughly 100 nodes and gets awkward past that. And every replica is a full copy of your data, so three replicas means three times the raw capacity. There’s no erasure coding to soften that.

Rook-Ceph: the option with the heavy luggage

flowchart TD
    subgraph rook["Rook-Ceph Architecture"]
        subgraph mgmt["Management"]
            OP["Rook Operator"]
            MON["Ceph Monitors"]
            MGR["Ceph Manager"]
        end
        subgraph storage["Storage"]
            OSD1["OSD<br/>(disk 1)"]
            OSD2["OSD<br/>(disk 2)"]
            OSD3["OSD<br/>(disk 3)"]
            OSD4["OSD<br/>(disk 4)"]
        end
        subgraph access["Access"]
            RBD["RBD<br/>(Block)"]
            RGW["RGW<br/>(Object)"]
            CFS["CephFS<br/>(Filesystem)"]
        end
    end

    PV["PersistentVolume"] --> RBD
    RBD --> OSD1
    RBD --> OSD2

How Rook-Ceph works

Ceph’s model is genuinely clever, and that cleverness is exactly why there’s more to learn. Each disk becomes an OSD, an Object Storage Daemon. Data spreads across those OSDs using the CRUSH algorithm and placement rules you define, so Ceph decides where each piece of data lives based on your failure domains rather than dumb round-robin. On top of that you get three ways in: block via RBD, S3-compatible object storage via RGW, and a real filesystem via CephFS. Holding the whole thing together is a quorum of monitor daemons tracking cluster state.

OSDs on disks: each disk becomes an Object Storage Daemon
CRUSH algorithm: data distributed across OSDs using placement rules
Multiple access methods: block (RBD), object (S3-compatible), filesystem (CephFS)
Monitors for consensus: cluster state managed by monitor daemons
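
Of the three access methods, the filesystem is the one this article doesn’t show elsewhere, so here is a rough sketch of what asking Rook for a CephFS looks like. The name shared-fs is hypothetical, and the pool sizes are assumptions for a three-node cluster. The field layout follows the Rook CephFilesystem resource as I know it, so verify it against your Rook version.

```yaml
# Hypothetical CephFS definition; shared-fs is a made-up name.
apiVersion: ceph.rook.io/v1
kind: CephFilesystem
metadata:
  name: shared-fs
  namespace: rook-ceph
spec:
  metadataPool:
    replicated:
      size: 3            # metadata is small, so full replication is cheap
  dataPools:
    - name: replicated
      failureDomain: host  # CRUSH places each copy on a different host
      replicated:
        size: 3
  metadataServer:
    activeCount: 1
    activeStandby: true    # keep a warm standby MDS for failover
```

Note that this adds metadata server daemons on top of the monitors, managers, and OSDs, which is the pattern with Ceph: each capability brings its own components.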

Every one of those moving parts is a thing you can inspect, which is great, and a thing you have to understand when it misbehaves, which is the catch.

Installing it

helm repo add rook-release https://charts.rook.io/release
helm repo update

# Install Rook operator
helm install rook-ceph rook-release/rook-ceph \
  --namespace rook-ceph \
  --create-namespace

# Create Ceph cluster
kubectl apply -f ceph-cluster.yaml

Cluster configuration:

# ceph-cluster.yaml
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    image: quay.io/ceph/ceph:v18.2.0

  mon:
    count: 3
    allowMultiplePerNode: false

  mgr:
    count: 2

  storage:
    useAllNodes: true
    useAllDevices: false
    deviceFilter: "^sd[b-z]"  # Use sdb, sdc, etc.

  resources:
    mon:
      requests:
        cpu: 500m
        memory: 1Gi
    osd:
      requests:
        cpu: 500m
        memory: 2Gi

Where it shines

This is where the luggage pays off. Ceph handles petabytes, the kind of scale that runs at CERN and Bloomberg, so you will not outgrow it. You get block, object, and filesystem storage from one system, plus erasure coding, snapshots, and cross-cluster mirroring. The tuning surface is enormous, which means a team that knows what they’re doing can shape it to a specific workload. And erasure coding cuts the storage overhead: instead of paying 3x for replication you can land closer to 1.5x, which at large capacity is real money saved.
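
The 3x versus 1.5x figures fall out of simple arithmetic. As a worked example, assume an erasure-coded pool with k = 2 data chunks and m = 1 coding chunk, and 12 TB of raw disk (both numbers are illustrative, not measurements):

```latex
\begin{aligned}
\text{overhead}_{\text{replication}} &= n, & n = 3 &\;\Rightarrow\; 3\times \\
\text{overhead}_{\text{EC}} &= \frac{k + m}{k}, & k = 2,\ m = 1 &\;\Rightarrow\; \frac{3}{2} = 1.5\times \\
\text{usable} &= \frac{\text{raw}}{\text{overhead}}, & \text{raw} = 12\ \text{TB} &\;\Rightarrow\; 4\ \text{TB vs. } 8\ \text{TB}
\end{aligned}
```

The saving isn’t free: a 2+1 profile survives the loss of one chunk, while three full replicas survive the loss of two copies.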

Where it bites you

That same power is also the cost. There are far more moving parts, and monitors, managers, and OSDs all want resources and attention. The floor is high: three monitors, two managers, and your OSDs before you’ve stored a single byte, and the memory footprint is significant. Ceph carries decades of features and configuration, so the learning curve is steep and it doesn’t flatten quickly. For real performance you often end up dedicating nodes to OSDs, which means hardware you’ve set aside specifically for storage. None of that is a flaw. It’s the price of what Ceph gives you, and you only want to pay it if you’ll use what you bought.

Head to head

Aspect	Longhorn	Rook-Ceph
Complexity	Low	High
Setup time	10 minutes	30+ minutes
Resource overhead	Low	High
Max scale	~100 nodes	1000+ nodes
Storage types	Block only	Block, Object, Filesystem
Performance	Good	Excellent (when tuned)
Storage efficiency	3x (replication)	1.5x+ (erasure coding)
Backup	Built-in S3	External tools
UI	Excellent	Ceph Dashboard
Community	Growing	Mature

The table is handy for a quick glance, but the decision lives in the rows you’ll actually feel. For me that’s the resource overhead and the complexity columns, because those are the things I pay for every single day a cluster runs, not just on the day I install it.

When Longhorn is the right call

Reach for Longhorn when the shape of your situation looks like this:

Small to medium clusters (under 100 nodes)
Simplicity matters and you want storage that just works
Limited ops capacity, a small team that can’t dedicate time to babysitting storage
General workloads like databases and stateful apps with moderate I/O
Homelab or edge where resources are tight

# Typical Longhorn workload
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn
  resources:
    requests:
      storage: 100Gi

When Rook-Ceph earns its keep

Reach for Rook-Ceph when:

Large clusters (100+ nodes)
Multiple storage types needed, block AND object AND filesystem from one system
Performance critical and you need to tune for specific workloads
Storage efficiency matters and erasure coding will save you real capacity
Dedicated storage team, people who can learn Ceph and operate it well

That last point is the one people skip. Ceph rewards a team that knows it and punishes one that doesn’t. If nobody owns the storage, the complexity owns you.

# Rook-Ceph pools: one replicated, one erasure coded
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicated-pool
  namespace: rook-ceph
spec:
  failureDomain: host
  replicated:
    size: 3
---
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: erasure-coded-pool
  namespace: rook-ceph
spec:
  failureDomain: host
  erasureCoded:
    dataChunks: 2
    codingChunks: 1
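
Pools on their own don’t give a workload anything to claim. A sketch of the StorageClass that ties the two pools above together might look like this. The class name rook-ceph-block-ec is hypothetical, and the secret names assume Rook’s defaults in the rook-ceph namespace. For RBD on erasure coding, the replicated pool holds image metadata and the erasure-coded pool holds the data.

```yaml
# Hypothetical StorageClass; assumes Rook's default CSI secret names.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block-ec
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
  clusterID: rook-ceph            # namespace of the CephCluster
  pool: replicated-pool           # RBD image metadata
  dataPool: erasure-coded-pool    # RBD image data
  imageFormat: "2"
  imageFeatures: layering
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/controller-expand-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
  csi.storage.k8s.io/fstype: ext4
reclaimPolicy: Delete
allowVolumeExpansion: true
```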

The nuance on performance

This is the criterion I parked earlier, and here’s why it rarely decides it. For the workloads most people run, both are fast enough, and the difference only shows up at the edges.

Longhorn under load

# Tune replica count for performance vs durability
defaultSettings:
  defaultReplicaCount: 2  # Faster than 3, less durable
  # Use a dedicated disk path
  defaultDataPath: /mnt/fast-ssd/longhorn

Longhorn is I/O bound by that per-volume engine pod. Push a high-IOPS workload through it and the engine becomes your bottleneck, which is the trade-off you accept for the simple architecture.

Rook-Ceph under load

# Dedicated OSD nodes
spec:
  placement:
    osd:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
            - matchExpressions:
                - key: storage-node
                  operator: In
                  values:
                    - "true"

  # NVMe optimization (also under the CephCluster spec)
  storage:
    config:
      osdsPerDevice: "1"
      storeType: bluestore

Ceph can saturate modern NVMe drives when it’s configured properly. The phrase “when it’s configured properly” is carrying weight there, and reaching that point is exactly the work you’re signing up for.

Backups

Storage you can’t restore from isn’t storage, it’s a liability with a countdown. So this matters more than raw throughput.

Longhorn backups

Built in. Configure an S3 target:

defaultSettings:
  backupTarget: s3://longhorn-backups@us-east-1/
  backupTargetCredentialSecret: longhorn-s3-credentials
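
The credential secret referenced there has to exist in the Longhorn namespace. A minimal sketch, with placeholder values you must replace and an endpoint URL that is purely an assumption for a self-hosted S3 service such as MinIO:

```yaml
# Placeholder credentials; do not commit real keys to version control.
apiVersion: v1
kind: Secret
metadata:
  name: longhorn-s3-credentials
  namespace: longhorn-system
type: Opaque
stringData:
  AWS_ACCESS_KEY_ID: "<access-key>"
  AWS_SECRET_ACCESS_KEY: "<secret-key>"
  # Only needed for non-AWS endpoints; this URL is hypothetical.
  AWS_ENDPOINTS: "https://minio.example.internal:9000"
```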

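Schedule recurring backups. The default group covers every volume that has no other recurring job or group assigned:

apiVersion: longhorn.io/v1beta2
kind: RecurringJob
metadata:
  name: daily-backup
  namespace: longhorn-system
spec:
  cron: "0 2 * * *"  # every day at 02:00
  task: backup
  groups:
    - default
  retain: 7          # keep the last 7 backups
  concurrency: 1     # back up one volume at a time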
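
If you’d rather opt individual volumes in than rely on the default group, recent Longhorn releases can read recurring-job labels from the PVC. This is a sketch under that assumption, and the label keys are from memory of the Longhorn docs, so confirm them for your version before relying on it:

```yaml
# Assumes a Longhorn version that syncs recurring-job labels from PVCs.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-data
  labels:
    recurring-job.longhorn.io/source: enabled        # tell Longhorn to read labels from this PVC
    recurring-job.longhorn.io/daily-backup: enabled  # attach the daily-backup job by name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn
  resources:
    requests:
      storage: 100Gi
```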

Rook-Ceph backups

There’s no equivalent built-in flow, so you reach for Velero with Ceph CSI snapshots:

velero install \
  --provider aws \
  --plugins velero/velero-plugin-for-csi \
  --features=EnableCSI
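
Velero’s CSI support also needs a VolumeSnapshotClass it is allowed to use for the Ceph driver. A sketch, assuming Rook’s default secret names and that the CSI snapshot CRDs and controller are already installed in the cluster; the class name is hypothetical:

```yaml
# Hypothetical snapshot class for RBD volumes; the label marks it for Velero.
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: csi-rbdplugin-snapclass
  labels:
    velero.io/csi-volumesnapshot-class: "true"
driver: rook-ceph.rbd.csi.ceph.com
parameters:
  clusterID: rook-ceph
  csi.storage.k8s.io/snapshotter-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/snapshotter-secret-namespace: rook-ceph
deletionPolicy: Delete
```

CephFS volumes would need a second class pointing at the CephFS driver.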

Or native Ceph mirroring for disaster recovery between clusters, which is genuinely nice once you’re operating at the scale where a second cluster exists.

My take, and what I actually run

Time to cash in the bias I flagged at the top. I run Longhorn in my homelab:

# My Longhorn configuration
defaultSettings:
  defaultReplicaCount: 2  # 3 nodes, 2 replicas
  defaultDataPath: /mnt/storage/longhorn
  backupTarget: s3://backups@minio/longhorn/
  backupTargetCredentialSecret: minio-credentials
  storageMinimalAvailablePercentage: 20

persistence:
  defaultClass: true

Three nodes is the deciding fact. That’s too small to justify Ceph’s overhead, where the monitors and managers alone would eat a chunk of capacity I can’t spare. The 2 AM test settles the rest: when a volume misbehaves and I’m half-awake, I want to open the Longhorn UI and see what’s wrong, not page through Ceph internals trying to remember which daemon does what. Backups land in my own MinIO over S3, and every spare MB stays spare on small nodes.

The day I’m running 50-plus nodes or genuinely need object storage alongside block, I’ll switch to Ceph and gladly pay the complexity tax, because at that point I’d be using what I paid for. That day isn’t here. Your context might put it a lot closer, and if you’ve got the team and the scale, Ceph is a fantastic choice. Read your own situation, not mine.

Migrating later if you outgrow Longhorn

Starting on Longhorn and worried you’re painting yourself into a corner? You aren’t. The path out is boring, which is the best thing you can say about a migration:

Back up the data from the Longhorn volume
Deploy Rook-Ceph alongside it
Restore into Ceph volumes
Update workloads to use the new StorageClass
Retire Longhorn once everything’s moved

Both speak CSI, so your workloads see the same interface either way. The switch is a StorageClass change, not a rewrite.
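
For a single volume, the restore step can be as dull as a one-off copy. The sketch below swaps the backup-and-restore route for a direct copy Job, which is a different technique from the steps above but often enough for small volumes. Every name in it is hypothetical (postgres-data-ceph, rook-ceph-block, copy-postgres-data), and it assumes the workload is scaled to zero first so both ReadWriteOnce volumes can attach to the Job’s pod.

```yaml
# New claim on the Ceph-backed class (hypothetical class name).
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-data-ceph
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: rook-ceph-block
  resources:
    requests:
      storage: 100Gi
---
# One-off Job that mounts both claims and copies old to new.
apiVersion: batch/v1
kind: Job
metadata:
  name: copy-postgres-data
spec:
  backoffLimit: 0            # fail loudly rather than retry a partial copy
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: copy
          image: busybox:1.36
          # -a preserves ownership, permissions, and timestamps
          command: ["sh", "-c", "cp -a /old/. /new/"]
          volumeMounts:
            - name: old
              mountPath: /old
              readOnly: true
            - name: new
              mountPath: /new
      volumes:
        - name: old
          persistentVolumeClaim:
            claimName: postgres-data
        - name: new
          persistentVolumeClaim:
            claimName: postgres-data-ceph
```

Once the Job completes, point the workload’s claim at postgres-data-ceph and scale it back up. Keep the Longhorn volume until you’ve verified the data.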

Picking the complexity you can carry

Storage is the part of Kubernetes that punishes mistakes hardest. Get it wrong and you lose data, the one failure mode you can’t roll back. Over-build it and you spend your weeks feeding complexity you never needed.

Map it to scale and the answer usually falls out. Homelab and small clusters point at Longhorn. Medium production goes either way depending on whether those extra Ceph features earn their keep. Large scale points at Ceph. Both are solid software, and either will serve you well. What actually separates them is how much operational weight you want to carry, and that’s a question only you can answer for your own cluster.

Pick the simplest thing that survives your failure modes. When the cluster grows past it, you’ll know, and the door out is open.
