Kafka on Object Storage Was Inevitable. The Next Step Is Open.

Kafka on object storage is not a trend. It is a correction. WarpStream proved that the Kafka protocol can run without stateful brokers by pushing durability into object storage. The next step is taking that architecture out of vendor-managed control planes and making it open and self-hosted. KafScale is built for teams that want Kafka client compatibility, object storage durability, and Kubernetes-native operations without depending on a managed metadata service.

The problem was never Kafka clients

The Kafka protocol is one of the most successful infrastructure interfaces ever shipped. It is stable, widely implemented, and deeply integrated into tooling and teams. The part that aged poorly is not the protocol. It is the original broker-centric storage model.

Stateful brokers made sense in a disk-centric era where durability lived on the same machines that ran compute. That coupling forces partition rebalancing, replica movement, disk hot spots, slow recovery, and persistent overprovisioning.

WarpStream proved the core idea

WarpStream demonstrated that Kafka compatibility does not require broker-local disks. By storing log segments in object storage and running stateless compute to serve the Kafka protocol, they showed that elastic scaling and simplified operations are possible without changing clients.

Their work validated an architectural shift that many teams suspected was viable but had not seen proven at scale.

When Confluent acquired WarpStream in 2024, it confirmed that Kafka on object storage is no longer an experiment but a mainstream direction for streaming platforms.

You can read more about WarpStream’s architecture and approach on their site: warpstream.com

What comes after “diskless Kafka”

Once an architecture becomes credible, teams stop asking whether it works and start asking whether it fits their constraints.

WarpStream, like many modern streaming services, relies on a vendor-managed control plane for metadata and coordination. For many organizations, that is a reasonable trade-off. For others, it is a hard blocker.

  • Regulated environments that cannot depend on external control planes
  • Sovereign and private cloud deployments
  • Teams that require open-source licensing and forkability
  • Platforms that prefer self-hosted economics over managed margins

The next logical evolution: open and self-hosted

KafScale is an open-source, Apache 2.0 licensed streaming platform that applies the stateless, object-storage-backed Kafka model in a fully self-hosted form. It runs on Kubernetes, stores durable log segments in S3, and uses etcd for topic metadata, offsets, and consumer group coordination.
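The division of labour described above can be illustrated with a toy model. This is a minimal sketch, not KafScale's implementation: `ObjectStore` and `MetadataStore` are hypothetical in-memory stand-ins for S3 and etcd, and the segment key layout is invented for illustration. It only shows why a broker that holds no local state can be replaced without losing data or consumer positions.

```python
from dataclasses import dataclass, field


class ObjectStore:
    """Hypothetical in-memory stand-in for an S3 bucket with write-once keys."""

    def __init__(self):
        self._objects = {}

    def put(self, key, records):
        if key in self._objects:
            raise ValueError(f"segment {key} is immutable")
        self._objects[key] = tuple(records)

    def get(self, key):
        return self._objects[key]


@dataclass
class MetadataStore:
    """Hypothetical in-memory stand-in for etcd."""
    # (topic, partition) -> list of (base_offset, record_count, object_key)
    segments: dict = field(default_factory=dict)
    # (group, topic, partition) -> next offset to read
    committed: dict = field(default_factory=dict)


class Broker:
    """Keeps only references to the shared stores, never data of its own."""

    def __init__(self, store, meta):
        self.store = store
        self.meta = meta

    def produce(self, topic, partition, records):
        index = self.meta.segments.setdefault((topic, partition), [])
        base = index[-1][0] + index[-1][1] if index else 0
        key = f"{topic}/{partition}/{base:020d}.segment"  # invented layout
        self.store.put(key, records)              # write the segment first
        index.append((base, len(records), key))   # then record it in metadata
        return base

    def fetch(self, topic, partition, offset):
        out = []
        for base, count, key in self.meta.segments.get((topic, partition), []):
            if base + count > offset:
                out.extend(self.store.get(key)[max(0, offset - base):])
        return out

    def commit(self, group, topic, partition, offset):
        self.meta.committed[(group, topic, partition)] = offset

    def committed(self, group, topic, partition):
        return self.meta.committed.get((group, topic, partition), 0)


store, meta = ObjectStore(), MetadataStore()
first = Broker(store, meta)
first.produce("orders", 0, ["a", "b"])
first.produce("orders", 0, ["c"])
first.commit("billing", "orders", 0, 2)
del first  # the pod goes away

replacement = Broker(store, meta)  # a fresh pod with no local state
resume_at = replacement.committed("billing", "orders", 0)
print(replacement.fetch("orders", 0, resume_at))  # ['c']
```

The replacement broker resumes the consumer group exactly where the first one left off, because everything it needs lives in the two shared stores. A real system also has to handle concurrent writers, batching, and failures between the segment write and the metadata update, none of which this sketch attempts.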

This is not a rejection of what WarpStream proved. It is the next logical step after that proof became accepted.

From proof to platform

WarpStream demonstrated that Kafka compatibility does not require stateful brokers. KafScale applies the same architectural correction under different operational and licensing constraints.

| Dimension | WarpStream | KafScale |
| --- | --- | --- |
| Kafka protocol | Compatible | Compatible |
| Broker state | Stateless | Stateless |
| Durable storage | Object storage | Object storage (S3) |
| Metadata control | Vendor managed | Self-hosted (etcd) |
| Deployment model | Managed service | Kubernetes |
| License | Closed source | Apache 2.0 |
| Primary tradeoff | Operational convenience | Operational control |

In short, KafScale offers:
  • Kafka protocol compatibility, so existing clients and tooling continue to work
  • Stateless brokers, treated as ephemeral Kubernetes pods
  • Object storage as the source of truth, using immutable segments in S3
  • Self-hosted metadata, using etcd for topic maps, offsets, and consumer group state
  • Apache 2.0 licensing, with no usage restrictions or control plane dependency
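In practice, protocol compatibility means a client's configuration should differ only in where it connects. The sketch below uses standard Kafka client property names (`bootstrap.servers`, `client.id`, `acks`); the hostnames and the `producer_config` helper are hypothetical, and whether a particular setting is honoured by a given broker implementation should be checked against its documentation.

```python
def producer_config(bootstrap_servers):
    """Build a producer config; only the bootstrap address is cluster-specific."""
    return {
        "bootstrap.servers": bootstrap_servers,
        "client.id": "orders-service",
        "acks": "all",
    }


# Hypothetical addresses for a broker-centric cluster and a KafScale deployment.
existing = producer_config("kafka-0.internal.example:9092")
object_storage_backed = producer_config("kafscale.internal.example:9092")

changed = {k for k in existing if existing[k] != object_storage_backed[k]}
print(changed)  # {'bootstrap.servers'}
```

A dictionary of this shape can be passed to a Kafka client library that accepts these property names, so application code stays as it is.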

The architecture is inevitable. The deployment model should be a choice.

Tradeoffs you should understand

Separating compute from durable storage trades some latency for simpler operations, because durability now depends on object storage rather than a local disk. It is a strong fit for durable pipelines, logs, ETL, and asynchronous event transport. It is not a universal replacement for every Kafka workload.

  • If you need sub-10ms latency, stateful brokers are usually a better fit.
  • If you rely on exactly-once transactions or compacted topics, KafScale does not target that scope.
  • If you want a fully managed service, a managed offering is the right choice.
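The checklist above can be written down as a small pre-flight check. This is an illustrative sketch only: the `Workload` fields and the `fit_concerns` helper are hypothetical names, and the 10 ms threshold is simply the figure quoted in the list.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    latency_budget_ms: float
    needs_transactions: bool = False
    needs_compaction: bool = False
    wants_managed_service: bool = False


def fit_concerns(w):
    """Return the reasons, if any, that a workload falls outside the target scope."""
    concerns = []
    if w.latency_budget_ms < 10:
        concerns.append("sub-10ms latency: stateful brokers are usually a better fit")
    if w.needs_transactions or w.needs_compaction:
        concerns.append("exactly-once transactions or compacted topics are out of scope")
    if w.wants_managed_service:
        concerns.append("a managed offering is the right choice")
    return concerns


etl = Workload(latency_budget_ms=500)
trading = Workload(latency_budget_ms=5, needs_transactions=True)

print(fit_concerns(etl))           # []
print(len(fit_concerns(trading)))  # 2
```

An empty list means none of the listed blockers apply. It does not replace testing the workload against a real deployment.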

KafScale is built for the common case: durable replay, predictable retention, and minimal operational overhead.

Why this matters now

WarpStream accelerated industry acceptance by proving that Kafka on object storage works in production. That question is now settled.

The next phase is about control, licensing, and deployment freedom. The remaining decision is not whether to adopt this architecture, but how much control teams retain when they do.

This article builds on earlier discussions shared with the developer community.
