Kafka on Object Storage Was Inevitable. The Next Step Is Open.

Kafka on object storage is not a trend. It is a correction. WarpStream proved that the Kafka protocol can run without stateful brokers by pushing durability into object storage. The next logical evolution is taking that architecture out of vendor-managed control planes and making it open and self-hosted. KafScale is built for teams that want Kafka client compatibility, object storage durability, and Kubernetes-native operations without depending on a managed metadata service.

The problem was never Kafka clients

The Kafka protocol is one of the most successful infrastructure interfaces ever shipped. It is stable, widely implemented, and deeply integrated into tooling and teams. The part that aged poorly is not the protocol. It is the original broker-centric storage model.

Stateful brokers made sense in a disk-centric era where durability lived on the same machines that ran compute. That coupling forces partition rebalancing, replica movement, disk hot spots, slow recovery, and persistent overprovisioning.

WarpStream proved the core idea

WarpStream demonstrated that Kafka compatibility does not require broker-local disks. By storing log segments in object storage and running stateless compute to serve the Kafka protocol, they showed that elastic scaling and simplified operations are possible without changing clients.
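To make the idea concrete, here is a toy sketch of the split between stateless compute and durable storage. It is not WarpStream's or KafScale's implementation: `ObjectStore`, `MetadataStore`, and `StatelessBroker` are hypothetical in-memory stand-ins, and the segment key format is an assumption chosen for illustration. The point it shows is that a broker holding no log data can be replaced by another instance without moving anything.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectStore:
    """Hypothetical stand-in for object storage: immutable blobs by key."""
    blobs: dict[str, tuple[bytes, ...]] = field(default_factory=dict)

    def put(self, key: str, records: tuple[bytes, ...]) -> None:
        if key in self.blobs:
            raise ValueError(f"segment {key} is immutable")
        self.blobs[key] = records

    def get(self, key: str) -> tuple[bytes, ...]:
        return self.blobs[key]


@dataclass
class MetadataStore:
    """Hypothetical stand-in for shared metadata: a segment index per partition."""
    # partition -> [(base_offset, segment_key)], appended in offset order
    segments: dict[str, list[tuple[int, str]]] = field(default_factory=dict)
    next_offset: dict[str, int] = field(default_factory=dict)


class StatelessBroker:
    """Keeps no log data of its own; all durable state lives in the two stores."""

    def __init__(self, store: ObjectStore, meta: MetadataStore) -> None:
        self.store = store
        self.meta = meta

    def produce(self, partition: str, records: list[bytes]) -> int:
        """Write one immutable segment and return its base offset."""
        if not records:
            raise ValueError("refusing to write an empty segment")
        base = self.meta.next_offset.get(partition, 0)
        key = f"{partition}/{base:020d}.segment"  # assumed key layout
        self.store.put(key, tuple(records))
        self.meta.segments.setdefault(partition, []).append((base, key))
        self.meta.next_offset[partition] = base + len(records)
        return base

    def fetch(self, partition: str, offset: int) -> list[bytes]:
        """Return every record at or after `offset`, read from the object store."""
        out: list[bytes] = []
        for base, key in self.meta.segments.get(partition, []):
            records = self.store.get(key)
            if base + len(records) <= offset:
                continue  # segment ends before the requested offset
            out.extend(records[max(0, offset - base):])
        return out


store, meta = ObjectStore(), MetadataStore()
first = StatelessBroker(store, meta)
first.produce("orders-0", [b"r0", b"r1"])
first.produce("orders-0", [b"r2"])

# A brand-new broker instance serves the same data with no replica movement.
replacement = StatelessBroker(store, meta)
tail = replacement.fetch("orders-0", 1)
```

A real system also has to handle concurrent writers, batching across partitions, and caching, none of which this sketch attempts.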

Their work validated an architectural shift that many teams suspected was viable but that had not yet been proven at scale.

When Confluent acquired WarpStream in 2024, it confirmed that Kafka on object storage is no longer an experiment but a mainstream direction for streaming platforms.

You can read more about WarpStream’s architecture and approach on their site: warpstream.com

What comes after “diskless Kafka”

Once an architecture becomes credible, teams stop asking whether it works and start asking whether it fits their constraints.

WarpStream, like many modern streaming services, relies on a vendor-managed control plane for metadata and coordination. For many organizations, that is a reasonable trade-off. For others, it is a hard blocker.

  • Regulated environments that cannot depend on external control planes
  • Sovereign and private cloud deployments
  • Teams that require open-source licensing and forkability
  • Platforms that prefer self-hosted economics over managed margins

The next logical evolution: open and self-hosted

KafScale is an open-source, Apache 2.0 licensed streaming platform that applies the stateless, object-storage-backed Kafka model in a fully self-hosted form. It runs on Kubernetes, stores durable log segments in S3, and uses etcd for topic metadata, offsets, and consumer group coordination.

This is not a rejection of what WarpStream proved. It is the next logical step after that proof became accepted.

From Proof to Platform

WarpStream demonstrated that Kafka compatibility does not require stateful brokers. KafScale applies the same architectural correction under different operational and licensing constraints.

Dimension        | WarpStream              | KafScale
-----------------+-------------------------+--------------------
Kafka Protocol   | Compatible              | Compatible
Broker State     | Stateless               | Stateless
Durable Storage  | Object Storage          | Object Storage (S3)
Metadata Control | Vendor Managed          | Self-Hosted (etcd)
Deployment Model | Managed Service         | Kubernetes
License          | Closed Source           | Apache 2.0
Primary Tradeoff | Operational Convenience | Operational Control

  • Kafka protocol compatibility, so existing clients and tooling continue to work
  • Stateless brokers, treated as ephemeral Kubernetes pods
  • Object storage as the source of truth, using immutable segments in S3
  • Self-hosted metadata, using etcd for topic maps, offsets, and consumer group state
  • Apache 2.0 licensing, with no usage restrictions or control plane dependency
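In practice, protocol compatibility means a client migration should mostly come down to changing the bootstrap address. The sketch below illustrates that with a plain configuration dictionary. The keys are standard Kafka client settings, but both hostnames and the `retarget` helper are hypothetical, and whether a given client feature works depends on which parts of the protocol KafScale implements.

```python
def retarget(client_config: dict[str, str], bootstrap_servers: str) -> dict[str, str]:
    """Return a copy of a Kafka client config pointed at a different cluster."""
    updated = dict(client_config)
    updated["bootstrap.servers"] = bootstrap_servers
    return updated


existing = {
    "bootstrap.servers": "kafka-0.internal:9092",  # hypothetical current cluster
    "group.id": "billing-consumers",
    "auto.offset.reset": "earliest",
}

# Hypothetical Kubernetes Service DNS name for the new brokers.
migrated = retarget(existing, "kafscale-broker.streaming.svc:9092")
```

Everything except the bootstrap address carries over unchanged, which is the practical meaning of "existing clients and tooling continue to work".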

The architecture is inevitable. The deployment model should be a choice.

Tradeoffs you should understand

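Separating compute from durable storage simplifies operations, but it also changes the latency profile: writes are not durable until they reach object storage, which typically takes longer than an append to a local disk. The model is a strong fit for durable pipelines, logs, ETL, and asynchronous event transport. It is not a universal replacement for every Kafka workload.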

  • If you need sub-10ms latency, stateful brokers are usually a better fit.
  • If you rely on exactly-once transactions or compacted topics, those features are outside KafScale's scope.
  • If you want a fully managed service, a managed offering is the right choice.
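The checklist above can be written down as a small fit check. This is only a sketch of the article's three criteria: the `Workload` fields and the `blockers` function are hypothetical names, and the 10 ms threshold is taken directly from the first bullet rather than from any benchmark.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    latency_budget_ms: float
    needs_transactions: bool = False
    needs_compaction: bool = False
    wants_managed_service: bool = False


def blockers(workload: Workload) -> list[str]:
    """List the reasons a workload falls outside the object-storage model's sweet spot."""
    reasons: list[str] = []
    if workload.latency_budget_ms < 10:
        reasons.append("sub-10ms latency: stateful brokers are usually a better fit")
    if workload.needs_transactions:
        reasons.append("exactly-once transactions are out of scope")
    if workload.needs_compaction:
        reasons.append("compacted topics are out of scope")
    if workload.wants_managed_service:
        reasons.append("a managed offering is the right choice")
    return reasons


log_pipeline = Workload(latency_budget_ms=500)
trading_feed = Workload(latency_budget_ms=5, needs_transactions=True)

fits = not blockers(log_pipeline)      # no blockers: a candidate workload
concerns = blockers(trading_feed)      # two blockers: latency and transactions
```

An empty list means none of the stated exclusions apply; it does not replace measuring the workload against a real deployment.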

KafScale is built for the common case: durable replay, predictable retention, and minimal operational overhead.

Why this matters now

WarpStream accelerated industry acceptance by proving that Kafka on object storage works in production. That question is now settled.

The next phase is about control, licensing, and deployment freedom. The remaining decision is not whether to adopt this architecture, but how much control teams retain when they do.

This article builds on earlier discussions shared with the developer community.

Where to go next

If you need help with distributed systems, backend engineering, or data platforms, check my Services.
