Design Phase · Oxia-Powered · Pulsar-Native

Nereus

Pulsar-native Shared-Storage Streaming Engine

Oxia-backed metadata and coordination plane. Object-storage-first durable data plane. Stateless, leaderless broker. Pulsar native protocol, KoP Kafka compatibility, and lakehouse-native stream-table duality, all in one architecture.

πŸ—οΈ L0—L4 Layered Architecture
β˜‘οΈ Oxia Metadata Plane
πŸ“¦ Object Data Plane
⚑ 8 Design Futures

Five Layers, One Truth

L0–L4 layered architecture. The single internal coordinate, streamId + offset, powers Pulsar, Kafka, cursor, retention, compaction, and lakehouse.

🔌

L1/L2 · Protocol Projection

Pulsar native protocol + KoP Kafka compatibility. ManagedLedger facade projects MessageId/Position onto the internal offset truth without introducing a second durable log.

Pulsar Broker · KoP · MessageId · Position
↓
☑️

Oxia · Metadata and Coordination Plane

Offset authority, append fencing, offset index, object manifest, cursor state, transaction state, routing. Oxia decides visibility, not the broker and not the object store.

Offset Index · Fencing · CAS · Cursor · Notifications
↓
⚡

L0 · Core Stream Storage

Multi-stream WAL objects aggregate writes across partitions. Per-stream compacted objects serve reads and lakehouse queries from the same physical bytes.

StreamStorage · Object WAL · Commit-time Offset Assignment · Read Resolver
↓
🗄️

L3 · Compaction + Lakehouse

Generation replacement swaps offset index targets atomically. SBT and SDT expose committed streams as Iceberg/Delta/Hudi tables.

🌐

L4 · Routing + Elasticity

Stateless broker, preferred routing, append session fencing, zone-aware failover. No data movement on scale-in/out.

↓
☁️

Shared Object Data Plane

Multi-stream WAL objects, per-stream compacted objects, SBT/SDT table files, cursor/txn snapshots. Objects are immutable once committed. Visibility is Oxia's domain.

S3 · GCS · MinIO · Parquet · Iceberg

Shared-Storage Architecture with Pulsar-Native Enhancement

Nereus is built on a shared-storage streaming architecture, adding first-class Pulsar-native semantics: MessageId, ManagedCursor, Shared/Key_Shared subscriptions, transactions, and replication.

Capability | Nereus Design | Layer
Oxia metadata service | Oxia: offset index, fencing, routing, cursor, txn, manifest | L0/L4
Stateless / leaderless broker | Broker not durable owner; preferred broker for locality only | L4
Object WAL | Multi-stream WAL object; independent per-stream slice visibility | L0
Offset index | {streamId, offsetEnd} → object range + entry index + cumulative size | L0
Commit-time offset assignment | Offset assigned by Oxia at index-commit time, not by broker | L0
Compaction | Generation replacement: same offset, new physical object, atomic swap | L3
Lakehouse / stream-table duality | SBT (built-in table) + SDT (external delivery); catalog not on ack path | L3
Cost profile | Object WAL + Oxia + object storage for large-scale, low-cost workloads | L0/L3
Latency profile | BookKeeper WAL or local WAL feeding into the same StreamStorage truth | L0
Kafka protocol | KoP projection: Kafka offset = stream record offset | L2
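
The offset index row above maps {streamId, offsetEnd} to an object range. A minimal in-memory sketch of how a read resolver could look up the entry covering a given offset is shown below. The class and field names (`IndexEntry`, `OffsetIndex`, `resolve`) are hypothetical, and treating `offset_end` as an exclusive bound is an assumption, not something the design states.

```python
import bisect
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    offset_end: int       # assumed exclusive end of the offset range
    object_key: str       # object that holds the bytes for this range
    byte_start: int       # start of the slice inside the object
    byte_length: int      # length of the slice in bytes
    cumulative_size: int  # total stream bytes up to offset_end


class OffsetIndex:
    """Toy stand-in for the metadata-plane offset index (not the Oxia API)."""

    def __init__(self):
        self._entries = {}  # stream_id -> entries sorted by offset_end

    def commit(self, stream_id, entry):
        entries = self._entries.setdefault(stream_id, [])
        if entries and entry.offset_end <= entries[-1].offset_end:
            raise ValueError("offset_end must strictly increase")
        entries.append(entry)

    def resolve(self, stream_id, offset):
        """Return the entry whose range contains offset, or None."""
        if offset < 0:
            return None
        entries = self._entries.get(stream_id, [])
        ends = [e.offset_end for e in entries]
        i = bisect.bisect_right(ends, offset)  # first entry with end > offset
        return entries[i] if i < len(entries) else None


index = OffsetIndex()
index.commit("s1", IndexEntry(100, "wal-1", 0, 4096, 4096))
index.commit("s1", IndexEntry(250, "wal-2", 512, 8192, 12288))
```

Keying by the end offset means a single ordered lookup finds the covering range; offsets at or beyond the last committed end resolve to nothing, which matches the idea that uncommitted data is not visible.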

Designed from First Principles

☑️

Oxia is the Offset Authority

Oxia decides visibility, assigns offsets, fences stale commits. Object store stores bytes, not truth. Broker is stateless for correctness.
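
To illustrate commit-time offset assignment, the sketch below models a metadata plane that hands out offsets only when an index entry is committed. Everything here (`MetadataPlane`, `commit_slice`, the in-memory dictionaries) is hypothetical and stands in for Oxia; it is not the Oxia client API.

```python
class MetadataPlane:
    """Toy offset authority: offsets exist only once a commit succeeds."""

    def __init__(self):
        self._next_offset = {}  # stream_id -> next offset to assign
        self._index = {}        # stream_id -> list of (offset_end, object_key)

    def commit_slice(self, stream_id, object_key, record_count):
        """Assign offsets to an already-uploaded slice; return its base offset."""
        if record_count <= 0:
            raise ValueError("record_count must be positive")
        base = self._next_offset.get(stream_id, 0)
        end = base + record_count  # exclusive end (assumption)
        self._index.setdefault(stream_id, []).append((end, object_key))
        self._next_offset[stream_id] = end
        return base

    def visible_end(self, stream_id):
        """Offsets below this value are visible to readers."""
        return self._next_offset.get(stream_id, 0)


plane = MetadataPlane()
# Two brokers uploaded objects in some order; offsets follow commit order only.
base_b = plane.commit_slice("orders", "wal-from-broker-b", 3)
base_a = plane.commit_slice("orders", "wal-from-broker-a", 2)
```

In this model the broker never picks an offset: upload order is irrelevant, and the base offset returned by the commit is the only source of truth.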

📦

Multi-Stream WAL Object

One physical WAL object aggregates multiple stream slices. Each slice becomes visible independently when its Oxia offset index entry is committed.
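
A rough sketch of the idea: records from several streams are packed into one blob, and each stream's slice is described by a byte range that can be committed on its own. The length-prefixed framing and the helper names (`build_wal_object`, `read_slice`) are assumptions made for illustration, not the Nereus object format.

```python
def build_wal_object(batches):
    """Pack {stream_id: [bytes, ...]} into one blob plus per-stream slice info."""
    blob = bytearray()
    slices = {}
    for stream_id, records in batches.items():
        start = len(blob)
        for record in records:
            # Hypothetical framing: 4-byte big-endian length, then payload.
            blob += len(record).to_bytes(4, "big") + record
        slices[stream_id] = (start, len(blob) - start, len(records))
    return bytes(blob), slices


def read_slice(blob, start, length):
    """Decode the records stored in blob[start:start + length]."""
    records, pos, end = [], start, start + length
    while pos < end:
        size = int.from_bytes(blob[pos:pos + 4], "big")
        records.append(blob[pos + 4:pos + 4 + size])
        pos += 4 + size
    return records


blob, slices = build_wal_object({
    "orders": [b"o1", b"o2"],
    "clicks": [b"c1"],
})

# Only the "orders" slice has been committed to the index so far, so only it
# is visible, even though both slices live in the same physical object.
committed = {"orders": slices["orders"]}
orders_start, orders_len, _ = committed["orders"]
visible_orders = read_slice(blob, orders_start, orders_len)
```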

🔄

Generation Replacement

Compaction never changes stream offsets. It replaces which physical object an offset range points to; readers switch atomically to the highest visible generation.
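
The sketch below models generation replacement with a plain dictionary: each offset range maps generations to object keys, and a reader picks the highest published generation. The names (`GenerationIndex`, `publish`, `resolve`) are hypothetical, and the single in-memory assignment only stands in for whatever atomic update the metadata plane would provide.

```python
class GenerationIndex:
    """Toy model: offset ranges stay fixed, the backing object can change."""

    def __init__(self):
        # (stream_id, offset_start, offset_end) -> {generation: object_key}
        self._ranges = {}

    def publish(self, stream_id, start, end, generation, object_key):
        key = (stream_id, start, end)
        self._ranges.setdefault(key, {})[generation] = object_key

    def resolve(self, stream_id, offset):
        """Return (generation, object_key) of the newest object covering offset."""
        for (sid, start, end), generations in self._ranges.items():
            if sid == stream_id and start <= offset < end:
                newest = max(generations)
                return newest, generations[newest]
        return None


gen_index = GenerationIndex()
gen_index.publish("orders", 0, 100, 0, "wal-7")
before = gen_index.resolve("orders", 42)

# Compaction rewrites the same offsets into a new object and publishes it
# as a higher generation; the offset range itself does not change.
gen_index.publish("orders", 0, 100, 1, "compacted-3.parquet")
after = gen_index.resolve("orders", 42)
```

Because the older generation stays in the map, a fallback to it remains possible in this model until garbage collection removes it.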

🗄️

Stream-Table Duality

SBT exposes streams as built-in Iceberg tables. SDT delivers to external catalogs. Lakehouse queries and streaming reads share the same compacted objects.

🔌

Pulsar Native + KoP Kafka

Pulsar protocol as first-class, Kafka via KoP projection. Both map to the same stream offset truth. No second durable log for Kafka workloads.

⚡

Dual WAL Profiles

Latency-optimized with BookKeeper WAL, cost-optimized with Object WAL. Same StreamStorage API, same Oxia offset truth, same compaction pipeline.


Fencing & Correctness

Oxia append sessions with monotonic fencing tokens. Stale broker commits are rejected at the metadata plane. No split-brain visible commit.
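
One way to picture monotonic fencing tokens: opening a new append session bumps a per-stream token, and a commit carrying an older token is rejected. The sketch is an in-memory illustration with hypothetical names (`AppendSessions`, `FencedError`); it does not use or describe the Oxia client API.

```python
class FencedError(Exception):
    """Raised when a commit carries a stale fencing token."""


class AppendSessions:
    def __init__(self):
        self._current = {}  # stream_id -> latest fencing token

    def open_session(self, stream_id):
        """Start a new session; any earlier token for the stream becomes stale."""
        token = self._current.get(stream_id, 0) + 1
        self._current[stream_id] = token
        return token

    def commit(self, stream_id, token):
        """Accept the commit only if the token is the current one."""
        if token != self._current.get(stream_id):
            raise FencedError(f"stale token {token} for stream {stream_id}")
        return True


sessions = AppendSessions()
old_token = sessions.open_session("orders")  # first broker
new_token = sessions.open_session("orders")  # takeover by a second broker

new_commit_ok = sessions.commit("orders", new_token)
try:
    sessions.commit("orders", old_token)
    stale_rejected = False
except FencedError:
    stale_rejected = True
```

The check lives in the metadata plane, so a partitioned or slow broker can still upload bytes but cannot make them visible.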


Immutable Objects

WAL, compacted, snapshot, and table objects are immutable once committed. GC is driven by Oxia reference counting, not by object store list operations.
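
Reference-counted GC can be sketched as follows: every index entry that points at an object holds a reference, and an object becomes collectable only when its count drops to zero. The `ObjectRefs` class and its methods are hypothetical and exist only for illustration.

```python
class ObjectRefs:
    """Toy reference counter for immutable objects."""

    def __init__(self):
        self._refs = {}  # object_key -> number of live references

    def add_ref(self, object_key):
        self._refs[object_key] = self._refs.get(object_key, 0) + 1

    def drop_ref(self, object_key):
        count = self._refs.get(object_key, 0)
        if count == 0:
            raise ValueError(f"no live reference to {object_key}")
        self._refs[object_key] = count - 1

    def collectable(self):
        """Objects with zero references, found without listing the bucket."""
        return sorted(k for k, n in self._refs.items() if n == 0)


refs = ObjectRefs()
refs.add_ref("wal-7")  # slice for stream "orders"
refs.add_ref("wal-7")  # slice for stream "clicks" in the same WAL object
refs.add_ref("compacted-3.parquet")

refs.drop_ref("wal-7")  # "orders" range now points at the compacted object
after_first_drop = refs.collectable()
refs.drop_ref("wal-7")  # "clicks" range has been replaced as well
after_second_drop = refs.collectable()
```

A multi-stream WAL object stays alive in this model until every stream slice inside it has been superseded.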

8 Futures to Full Architecture

One target architecture, split into designable, reviewable modules. Each future maps to one or more layers (L0–L4).

1

Core StreamStorage + Object WAL Design Draft

StreamStorage API, object WAL, Oxia offset index, commit-time offset assignment, read resolver. L0

2

ManagedLedger Facade Design Draft

ManagedLedger/ManagedCursor compatibility runtime, virtual ledger projection, Position/MessageId mapping. L1

3

Cursor & Subscription State Design Draft

Mark-delete offset, individual ack ranges, cursor snapshot objects, cursor CAS, Shared/Failover/Exclusive recovery. L1

4

Compaction + Generation Replacement Design Draft

WAL→compacted Parquet, generation overlay, atomic index replacement, fallback, GC protection. L3

5

KoP Compatibility Design Draft

Kafka offset=stream offset, Produce/Fetch mapping, group offset, transaction visibility, leader epoch projection. L2

6

Lakehouse SBT / SDT Design Draft

SBT built-in table, SDT external delivery, Iceberg catalog commit, catalog repair, delivery idempotence. L3

7

Routing / Brown-out / Elasticity Design Draft

Broker session, zone-aware routing ring, preferred broker, append session transfer, cache invalidation. L4

8

Advanced Pulsar Semantics Design Draft

Key_Shared, delayed delivery, pending ack transaction, replicated subscription, schema/system topic bootstrap, geo-replication. L1/L4
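
Future 3 above lists a mark-delete offset together with individual ack ranges. A minimal sketch of how those two pieces of cursor state could interact is shown below. The `Cursor` class is hypothetical, and defining mark-delete as "every offset up to and including this one is acknowledged" is an assumption.

```python
class Cursor:
    def __init__(self):
        self.mark_delete = -1  # all offsets <= mark_delete are acknowledged
        self._acked = set()    # acknowledged offsets above mark_delete

    def ack(self, offset):
        if offset <= self.mark_delete:
            return  # already covered by the mark-delete offset
        self._acked.add(offset)
        # Advance mark-delete across any contiguous acknowledged offsets.
        while self.mark_delete + 1 in self._acked:
            self.mark_delete += 1
            self._acked.remove(self.mark_delete)

    def ack_ranges(self):
        """Individually acked offsets as inclusive (start, end) ranges."""
        ranges = []
        for offset in sorted(self._acked):
            if ranges and ranges[-1][1] + 1 == offset:
                ranges[-1][1] = offset
            else:
                ranges.append([offset, offset])
        return [tuple(r) for r in ranges]


cursor = Cursor()
for acked_offset in (0, 2, 3, 5):
    cursor.ack(acked_offset)
state_with_hole = (cursor.mark_delete, cursor.ack_ranges())

cursor.ack(1)  # filling the hole lets mark-delete jump forward
state_after_fill = (cursor.mark_delete, cursor.ack_ranges())
```

Storing ranges rather than single offsets keeps the persisted state small when consumers acknowledge mostly in order.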

Explore & Contribute

Nereus is in the design phase. Start with the architecture overview, then dive into individual future design docs.

# Clone the design repository
$ git clone https://github.com/nereusstream/nereus.git
$ cd nereus
# Read the overall architecture
$ cat docs/design/nereus-overall-architecture.md
# Browse the design index
$ cat docs/design/nereus-design-index.md
→ 14 authoritative design docs · 8 futures · L0–L4 layers