

Kafscale: Kafka-Compatible Streaming with Stateless Brokers and S3 Segment Storage

Kafscale is an Apache Kafka protocol-compatible streaming platform built for workloads that need durable message delivery, consumer offsets, and replay without running stateful Kafka brokers. Brokers are stateless pods, S3 stores immutable segments as the source of truth, and etcd handles topic metadata and consumer group offsets. Kubernetes manages scaling and failover.

The 80% use case at 30% of the cost. Most Kafka deployments function as durable pipes: they move events, track offsets, and support replay. They do not need sub-millisecond latency, exactly-once transactions, or compacted topics. Kafscale targets this common case at roughly $110/month for 100 GB/day of throughput, versus $400+ for self-managed Kafka or $200+ for managed offerings.

Who Kafscale is for

  • Platform teams running Kafka as infrastructure plumbing, not as a product feature
  • Data engineering teams using Kafka primarily for CDC, log aggregation, or event sourcing where latency tolerance is 100ms+
  • Organizations where Kafka operational burden exceeds the value of features they do not use
  • Greenfield projects that want Kafka client compatibility without Kafka operational complexity

Not a fit for: Trading systems, real-time bidding, or workloads requiring exactly-once semantics, compacted topics, or single-digit millisecond latency.

Architecture

[Architecture diagram: Kafka clients (Flink, Wayang, apps) speak the Kafka protocol to stateless broker pods (Broker 0-2, HPA-scaled, no local disk) inside a Kubernetes cluster. Brokers flush segments to and fetch/cache from S3 segment storage, the source of truth holding .kfs and .index objects, and keep topic config, consumer groups, and offsets in etcd.]

Brokers accept Kafka protocol connections, buffer writes, flush segments to S3, serve reads with caching and read-ahead, and coordinate consumer groups. etcd stores metadata: topic configuration, partition state, consumer group membership, and committed offsets. S3 stores immutable segment and index objects that represent the message log.

What problem Kafscale solves

Most Kafka deployments function as durable pipes. They move events from point A to points B through N, track consumer offsets, and depend on replay when downstream systems fail. Many teams do not need sub-millisecond latency, exactly-once transactions, or compacted topics, but they still pay for stateful broker operations, disk management, rebalancing workflows, and on-call complexity.

Kafscale targets the common case by trading latency for operational simplicity. Brokers do not persist message logs locally. Message data is stored as immutable segments in S3. Kubernetes provides scheduling, scaling, and failover. The goal is to make durable message transport easier to operate without changing client integrations, since the system speaks the Kafka protocol.

Scope

In scope

  • Kafka protocol compatibility for core producer and consumer workflows
  • Produce and fetch paths backed by immutable segment storage
  • Consumer groups, membership, heartbeats, and committed offsets
  • Topic administration needed for everyday platform use
  • Kubernetes operator integration via CRDs for cluster and topic lifecycle

Explicit non-goals

  • Exactly-once semantics and transactions
  • Compacted topics
  • Kafka internal replication and ISR protocols
  • Embedding stream processing inside the broker

Kafscale handles durable message transport and nothing else. Stream processing remains the responsibility of compute engines such as Apache Flink, Apache Wayang, or any other stack that reads from Kafka topics. This keeps the broker surface area small and preserves compatibility with the Kafka ecosystem.

Storage and data model

Each topic is partitioned. Each partition is represented as an ordered sequence of immutable segment files plus a sparse index file used for offset-to-position lookup. Segment keys are based on base offsets so storage remains append-friendly and retention is handled with S3 lifecycle policies.

Topics and partitions

[Diagram: topic orders with partitions 0, 1, and 2, each an ordered run of segment files (seg-000000.kfs, seg-050000.kfs, seg-100000.kfs, ...) stored under s3://bucket/namespace/orders/{partition}/segment-{offset}.kfs]

S3 key layout

s3://{bucket}/{namespace}/{topic}/{partition}/segment-{base_offset}.kfs
s3://{bucket}/{namespace}/{topic}/{partition}/segment-{base_offset}.index
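
For illustration, here is a small Go helper (hypothetical, not from the Kafscale codebase) that renders these keys. Zero-padding the base offset is an assumption that keeps segment keys lexicographically ordered within a partition prefix:

package main

import "fmt"

// segmentKey builds the S3 object key for a segment file. The
// 20-digit zero-padded offset (max uint64 width) is an assumption;
// it makes a prefix LIST return segments in offset order.
func segmentKey(bucket, namespace, topic string, partition int, baseOffset uint64) string {
    return fmt.Sprintf("s3://%s/%s/%s/%d/segment-%020d.kfs",
        bucket, namespace, topic, partition, baseOffset)
}

// indexKey builds the matching sparse-index object key.
func indexKey(bucket, namespace, topic string, partition int, baseOffset uint64) string {
    return fmt.Sprintf("s3://%s/%s/%s/%d/segment-%020d.index",
        bucket, namespace, topic, partition, baseOffset)
}

func main() {
    fmt.Println(segmentKey("kafscale-prod-us-east-1", "default", "orders", 0, 50000))
}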

Segment file format

Each segment is a self-contained file with messages and metadata. The format includes a header for identification and versioning, message batches containing the actual records, and a footer with checksums for integrity verification.

  • Segment header (32 bytes): magic 0x4B414653 ("KAFS"), version (1), flags, base offset (8 B), message count (4 B), timestamp (8 B)
  • Message batches: batch 1, batch 2, ...
  • Segment footer (16 bytes): CRC32 (4 B), last offset (8 B), magic 0x454E4421 ("END!")
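
A sketch of this layout as Go structs, assuming big-endian encoding and 4-byte version and flags fields so the fixed header totals exactly 32 bytes (the format above does not specify those two widths):

package main

import (
    "bytes"
    "encoding/binary"
    "fmt"
)

const (
    headerMagic = 0x4B414653 // "KAFS"
    footerMagic = 0x454E4421 // "END!"
)

// SegmentHeader mirrors the 32-byte on-disk header. The 4-byte
// Version and Flags widths are assumptions.
type SegmentHeader struct {
    Magic        uint32
    Version      uint32
    Flags        uint32
    BaseOffset   uint64
    MessageCount uint32
    Timestamp    uint64
}

// SegmentFooter mirrors the 16-byte footer: CRC32 + last offset + magic.
type SegmentFooter struct {
    CRC32      uint32
    LastOffset uint64
    Magic      uint32
}

func main() {
    var buf bytes.Buffer
    h := SegmentHeader{Magic: headerMagic, Version: 1, BaseOffset: 50000, MessageCount: 50000, Timestamp: 1700000000000}
    binary.Write(&buf, binary.BigEndian, h) // fixed-size fields encode without padding
    fmt.Println("header bytes:", buf.Len()) // prints: header bytes: 32
}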

Write path

When a producer sends messages, the broker validates ownership, buffers the data, assigns offsets, and eventually flushes to S3. The acks setting controls when the producer receives confirmation.

1. Receive: parse the ProduceRequest; validate the topic and partition ownership.
2. Buffer: append to the write buffer; assign offsets from etcd.
3. Flush decision: flush when the buffer reaches 4 MB, when 500 ms have elapsed, or on an explicit flush; otherwise return to the producer immediately (acks=0 or acks=1).
4. Build segment: compress with Snappy; build the header, footer, and index.
5. S3 upload: PutObject for segment.kfs and PutObject for segment.index.
6. Commit: update the high watermark (HWM) in etcd; ack producers waiting on acks=all.
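
A minimal sketch of the flush decision in step 3, using the thresholds named above; the type and field names are illustrative:

package main

import (
    "fmt"
    "time"
)

const (
    flushBytes    = 4 << 20                // 4 MB buffer threshold
    flushInterval = 500 * time.Millisecond // time threshold
)

// writeBuffer accumulates produced batches for one partition.
type writeBuffer struct {
    size      int       // bytes currently buffered
    firstByte time.Time // arrival time of the oldest unflushed byte
}

// shouldFlush flushes on an explicit request, when the buffer reaches
// 4 MB, or when 500 ms have elapsed since the first buffered byte.
func (b *writeBuffer) shouldFlush(now time.Time, explicit bool) bool {
    if explicit || b.size >= flushBytes {
        return true
    }
    return b.size > 0 && now.Sub(b.firstByte) >= flushInterval
}

func main() {
    b := &writeBuffer{size: 5 << 20, firstByte: time.Now()}
    fmt.Println(b.shouldFlush(time.Now(), false)) // true: over the 4 MB threshold
}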

Read path

When a consumer fetches messages, the broker locates the relevant segment, checks the cache, and retrieves data from S3 if needed. Read-ahead prefetching improves performance for sequential consumers.

1. Receive: parse the FetchRequest.
2. Locate segment: binary search for the requested offset.
3. Check cache: LRU segment lookup.
4. Cache hit: serve the cached data. Cache miss: load the index from S3, find the byte position, and issue a range GET with read-ahead.
5. Check buffer: append unflushed messages when the requested offset is above the HWM.
6. Build response: return the FetchResponse with the current HWM.
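
The segment lookup in step 2 reduces to a binary search over sorted base offsets; this Go sketch is illustrative rather than the broker's actual code:

package main

import (
    "fmt"
    "sort"
)

// locateSegment returns the index of the segment whose base offset is
// the largest one not exceeding the requested offset. baseOffsets must
// be sorted ascending, as they are when listed by key from S3.
func locateSegment(baseOffsets []uint64, offset uint64) (int, bool) {
    // First index whose base offset exceeds the target; the segment
    // containing the offset is the one just before it.
    i := sort.Search(len(baseOffsets), func(i int) bool {
        return baseOffsets[i] > offset
    })
    if i == 0 {
        return 0, false // offset precedes the earliest retained segment
    }
    return i - 1, true
}

func main() {
    segs := []uint64{0, 50000, 100000}
    idx, ok := locateSegment(segs, 73421)
    fmt.Println(idx, ok) // 1 true: the segment with base offset 50000
}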

S3 resiliency and backpressure

Kafscale deliberately avoids persistent local queues. When S3 misbehaves, the system surfaces it through protocol-native backpressure and operator automation instead of inventing new operational knobs.

S3 health states and transitions:

  • HEALTHY: flushes proceed normally, full read-ahead enabled, rollouts allowed, HPA active. Moves to DEGRADED when latency exceeds its threshold.
  • DEGRADED: conservative retries with jittered backoff, REQUEST_TIMED_OUT when the retry budget expires, rollouts paused, read-ahead slowed. Moves to UNAVAILABLE when the error rate exceeds its threshold, or back to HEALTHY when latency recovers.
  • UNAVAILABLE: immediate UNKNOWN_SERVER_ERROR, segment flushes disabled, HPA halted, alerts fired, S3Unavailable flag exposed. Moves back to DEGRADED as errors decrease.

Metrics: kafscale_s3_health_state, produce_backpressure_state, s3_put_latency_ms, s3_put_failures_total. Configuration: KAFSCALE_S3_LATENCY_WARN_MS, KAFSCALE_S3_ERROR_RATE_WARN.

Every broker tracks S3 health as Healthy, Degraded, or Unavailable based on sliding-window PutObject latency and error metrics. The same health monitor wraps the fetch path so degraded buckets slow read-ahead and emit REQUEST_TIMED_OUT; unavailability raises UNKNOWN_SERVER_ERROR immediately so consumers understand the outage.
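
A condensed sketch of those transitions as a pure function; the window statistics and thresholds are stand-ins for whatever the broker actually samples:

package main

import "fmt"

// S3Health mirrors the three states described above.
type S3Health int

const (
    Healthy S3Health = iota
    Degraded
    Unavailable
)

// windowStats summarizes a sliding window of recent PutObject calls;
// the field names here are illustrative.
type windowStats struct {
    p99LatencyMs float64
    errorRate    float64
}

// next applies the transitions above: high latency degrades a healthy
// bucket, a high error rate marks it unavailable, and recovery walks
// back one step at a time.
func next(cur S3Health, w windowStats, latencyWarnMs, errRateWarn float64) S3Health {
    switch cur {
    case Healthy:
        if w.p99LatencyMs > latencyWarnMs {
            return Degraded
        }
    case Degraded:
        if w.errorRate > errRateWarn {
            return Unavailable
        }
        if w.p99LatencyMs <= latencyWarnMs {
            return Healthy
        }
    case Unavailable:
        if w.errorRate <= errRateWarn {
            return Degraded
        }
    }
    return cur
}

func main() {
    s := next(Healthy, windowStats{p99LatencyMs: 900}, 500, 0.05)
    fmt.Println(s == Degraded) // true
}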

Operator guardrails

The Kubernetes operator watches broker health via control-plane RPCs or Prometheus. When any broker is Degraded, rollouts are paused. If a quorum reports Unavailable, the operator halts HPA decisions, emits alerts, and optionally rechecks IAM credentials and endpoints before resuming.
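
Reduced to pseudocode, the guardrail logic might look like the following hedged Go sketch, where the health strings and the strict-majority quorum rule are assumptions:

package main

import "fmt"

// brokerState is what the operator observes per broker via
// control-plane RPCs or Prometheus scrapes.
type brokerState struct {
    health string // "healthy", "degraded", or "unavailable"
}

// guardrails pauses rollouts if any broker is not healthy and halts
// HPA decisions when a quorum of brokers reports S3 unavailable.
func guardrails(brokers []brokerState) (pauseRollouts, haltHPA bool) {
    unavailable := 0
    for _, b := range brokers {
        if b.health != "healthy" {
            pauseRollouts = true
        }
        if b.health == "unavailable" {
            unavailable++
        }
    }
    haltHPA = unavailable > len(brokers)/2 // strict majority = quorum (assumed)
    return pauseRollouts, haltHPA
}

func main() {
    pause, halt := guardrails([]brokerState{{"healthy"}, {"unavailable"}, {"unavailable"}})
    fmt.Println(pause, halt) // true true
}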

Surfacing state

The broker exposes /metrics with Prometheus-style gauges, and BrokerControl.GetStatus returns a sentinel partition named __s3_health whose state field reflects the current S3 state. Operators or HPAs can watch either interface to gate rollouts or trigger alerts. For ops teams that prefer push semantics, the broker also opens a StreamMetrics gRPC stream that continuously emits the latest health snapshot, plus derived latency and error stats, to the operator so automation can react without scraping delays.
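
A minimal example of publishing such a gauge with the standard Prometheus Go client; only the metric name comes from the section above, and the 0/1/2 state encoding is an assumption:

package main

import (
    "log"
    "net/http"

    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/promhttp"
)

// s3HealthState carries the current S3 state as a numeric gauge so
// HPAs and alert rules can threshold on it.
var s3HealthState = prometheus.NewGauge(prometheus.GaugeOpts{
    Name: "kafscale_s3_health_state",
    Help: "Current S3 health: 0=healthy, 1=degraded, 2=unavailable (encoding assumed).",
})

func main() {
    prometheus.MustRegister(s3HealthState)
    s3HealthState.Set(0) // healthy

    // Serve /metrics so operators and HPAs can scrape the gauge.
    http.Handle("/metrics", promhttp.Handler())
    log.Fatal(http.ListenAndServe(":9090", nil))
}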

Consumer group protocol

Kafscale implements the standard Kafka consumer group protocol. Groups transition through states as members join, leave, or fail heartbeats. The broker handles coordination, assignment, and offset tracking.

  • Empty → PreparingRebalance: the first member joins.
  • PreparingRebalance → CompletingRebalance: all members have joined.
  • CompletingRebalance → Stable: the leader assigns partitions.
  • Stable → PreparingRebalance: a member leaves or a heartbeat times out.
  • Any state → Empty: all members leave or expire.
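
The same transitions written out as a small Go state function; the event names are illustrative rather than the broker's internal API:

package main

import "fmt"

type groupState int

const (
    Empty groupState = iota
    PreparingRebalance
    CompletingRebalance
    Stable
)

type groupEvent int

const (
    MemberJoined groupEvent = iota
    AllMembersJoined
    LeaderAssigned
    MemberLeftOrTimedOut
    AllMembersGone
)

// transition encodes the state machine above as a pure function.
func transition(s groupState, e groupEvent) groupState {
    switch {
    case e == AllMembersGone:
        return Empty // from any state: all members leave or expire
    case s == Empty && e == MemberJoined:
        return PreparingRebalance
    case s == PreparingRebalance && e == AllMembersJoined:
        return CompletingRebalance
    case s == CompletingRebalance && e == LeaderAssigned:
        return Stable
    case s == Stable && e == MemberLeftOrTimedOut:
        return PreparingRebalance
    }
    return s
}

func main() {
    s := transition(Empty, MemberJoined)
    s = transition(s, AllMembersJoined)
    s = transition(s, LeaderAssigned)
    fmt.Println(s == Stable) // true
}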

Operational defaults

  • Bucket naming: kafscale-{environment}-{region} to isolate IAM and retention policies
  • Region affinity: bucket region matches the Kubernetes cluster region to avoid cross-region cost and latency
  • Encryption: SSE-KMS with a customer-managed CMK when provided; SSE-S3 fallback with a warning
  • Lifecycle retention: operator-managed prefix rules derived from topic retention configuration
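
A sketch of how the first two defaults could be applied in code; the function names are hypothetical, and the returned values map to the standard S3 ServerSideEncryption header values:

package main

import "fmt"

// bucketName applies the kafscale-{environment}-{region} convention.
func bucketName(environment, region string) string {
    return fmt.Sprintf("kafscale-%s-%s", environment, region)
}

// encryptionConfig prefers SSE-KMS when a customer-managed key is
// supplied and falls back to SSE-S3; warn tells the caller to log
// the fallback.
func encryptionConfig(kmsKeyARN string) (sseAlgorithm string, warn bool) {
    if kmsKeyARN != "" {
        return "aws:kms", false // SSE-KMS with the provided CMK
    }
    return "AES256", true // SSE-S3 fallback
}

func main() {
    fmt.Println(bucketName("prod", "us-east-1")) // kafscale-prod-us-east-1
}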

Current development status

Kafscale is under active development and not yet production-ready. Compatibility regression testing, fault injection coverage, and repeatable benchmarks are required before any production recommendation.


How to use Kafscale in an architecture

Kafscale is intended to be used as a Kafka-compatible transport layer. Producers and consumers connect using standard Kafka client libraries. Downstream compute engines such as Flink or Wayang read from Kafscale topics using their existing Kafka connectors. The platform focus remains durable delivery and replay, not embedded processing.
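
As a concrete example, producing and consuming through an off-the-shelf Go client such as segmentio/kafka-go requires nothing Kafscale-specific; the broker address below is a placeholder:

package main

import (
    "context"
    "fmt"

    "github.com/segmentio/kafka-go"
)

func main() {
    ctx := context.Background()

    // Produce with a stock Kafka writer; only the address changes
    // when pointing at Kafscale.
    w := &kafka.Writer{
        Addr:  kafka.TCP("kafscale-broker:9092"), // hypothetical service address
        Topic: "orders",
    }
    defer w.Close()
    if err := w.WriteMessages(ctx, kafka.Message{
        Key:   []byte("order-1"),
        Value: []byte(`{"id": 1}`),
    }); err != nil {
        panic(err)
    }

    // Consume through the standard consumer group protocol; offsets
    // are committed via the group coordinator.
    r := kafka.NewReader(kafka.ReaderConfig{
        Brokers: []string{"kafscale-broker:9092"},
        GroupID: "billing",
        Topic:   "orders",
    })
    defer r.Close()
    msg, err := r.ReadMessage(ctx)
    if err != nil {
        panic(err)
    }
    fmt.Printf("offset %d: %s\n", msg.Offset, msg.Value)
}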

