Building Sentry: A Distributed Message Broker in Go (From Scratch)
I’m learning systems engineering and system design by doing —
not by watching tutorials, not by drawing diagrams, but by actually building infrastructure.
This blog documents my journey building Sentry, a Kafka-inspired distributed message broker written in Go, from scratch.
Not a wrapper.
Not a framework experiment.
Not a CRUD backend.
A real broker: raw TCP, binary wire protocol, append-only logs, offset indexing, crash recovery, and concurrency at scale.
Why I Started This Project
Most backend engineers (including me, earlier) live behind abstractions:
HTTP frameworks
ORMs
Message queues as black boxes
Managed services that “just work”
But at some point I realized something uncomfortable:
I could use Kafka, Redis, and RabbitMQ…
but I had no real idea how they worked internally.
So I asked myself:
How does a broker actually store messages on disk?
How are offsets tracked?
What happens when the process crashes mid-write?
How do consumers resume safely?
How does a binary protocol work over raw TCP?
What breaks under network lag or partial writes?
And that’s where Sentry was born.
What Is Sentry?
Sentry is a high-performance, Kafka-like distributed message broker written in Go.
It focuses on:
Simplicity over features
Deterministic behavior
Low-level correctness
Failure-first design
Observability via logs
Learning by implementation
The goal is not to replace Kafka.
The goal is to understand Kafka-class systems by building one.
High-Level Architecture
At a high level, Sentry has these core layers:
Network Layer
Raw TCP server
Per-connection goroutines
Binary protocol decoding
Protocol Layer
Custom wire protocol
Length-prefixed frames
Correlation IDs
Versioning support
Broker Core
Message routing
Partition selection
Offset assignment
Persistence Layer
Append-only log segments
Offset → byte index files
Time-based index files
Crash-safe replay
Consumer Layer
Offset-based reads
Replay semantics
Deterministic fetch order
Each layer is explicit.
Nothing is hidden behind magic.
Custom Binary Wire Protocol
One of the first things I built was the wire protocol.
Instead of HTTP or gRPC, Sentry uses a custom binary protocol over raw TCP.
Why?
Because production brokers don’t speak JSON over HTTP.
They speak:
Framed binary messages
Big-endian encoded integers
Compact payloads
Deterministic layouts
Versioned schemas
Protocol Design
Each request frame looks like:

```
| Frame Length   (4 bytes) |
| Message Type   (2 bytes) |
| Correlation ID (4 bytes) |
| Payload        (N bytes) |
```
Key features:
Length-prefixed framing
So partial TCP reads can be reassembled correctly.
Correlation IDs
So responses match requests in concurrent connections.
Big-endian encoding
For deterministic cross-platform decoding.
Version field (planned)
For forward compatibility.
Concurrency Model
Sentry uses Go’s strengths:
Goroutines
Channels
Mutexes
Worker pools
Ingress Model
One goroutine per TCP connection
Each connection reads frames
Frames are pushed into a bounded worker pool
Workers decode and route requests
This prevents:
Unbounded goroutine growth
Memory pressure
Head-of-line blocking
Backpressure
If the worker pool queue is full:
New requests block
Producers slow down
The system protects itself
No silent overload.
No hidden memory leaks.
Persistence: Append-Only Log Segments
This is the heart of the system.
Every topic partition in Sentry is backed by:
A .log file → message bytes
A .index file → offset → byte position
A .timeindex file → timestamp → offset
Why Append-Only?
Because append-only logs give you:
Crash safety
Sequential disk writes
High throughput
Simple replay
Deterministic ordering
Segment Structure
Each segment is defined by:
```go
type Segment struct {
	baseOffset uint64   // first offset stored in this segment
	log        *os.File // append-only message bytes (.log)
	index      *os.File // offset -> byte position entries (.index)
	timeIndex  *os.File // timestamp -> offset entries (.timeindex)
	topic      string
	partition  uint32
	active     bool // only the active segment accepts writes
}
```
Writing a Message
The write path:
Append message bytes to .log
Record (relativeOffset, bytePosition) in .index
Record (timestamp, relativeOffset) in .timeindex
fsync the log file
Return offset to producer
This ensures:
Durability
Deterministic offsets
Replay safety
Offset Indexing
To avoid scanning entire log files:
Each message write updates an index entry:
```
| Relative Offset (4 bytes) |
| Byte Position   (4 bytes) |
```
Stored in .index.
This allows:
O(1) offset → file seek
Fast consumer reads
Predictable latency
Time-Based Indexing
Each message also updates a .timeindex file:
```
| Timestamp       (8 bytes) |
| Relative Offset (4 bytes) |
```
This enables future features like:
Fetch by timestamp
Log compaction
Retention policies
Crash Recovery
This is where systems engineering actually begins.
When the broker restarts:
It scans the partition directory
Discovers all segment files
Loads .index files
Replays .log files if needed
Rebuilds in-memory offsets
Marks last segment as active
This ensures:
No data loss
No duplicate offsets
Safe resume for consumers
Failure-First Design
Everything in Sentry is built assuming failure:
Disk write failures
Partial TCP reads
Process crashes
Power loss
Client disconnects
Examples:
Writes check active segment state
Index lookups validate bounds
Reads handle EOF explicitly
Recovery paths are first-class
Consumer Fetch Logic
Consumers fetch messages using:
Topic
Partition
Offset
Max batch size
The broker:
Looks up offset in .index
Seeks to byte position
Reads message bytes
Returns batch to consumer
Advances consumer offset
This allows:
Replay from any offset
Exactly-once processing, when clients track offsets and deduplicate
Crash-safe resumes
Observability via Logs
Every major action logs explicitly:
```
[INFO] Sentry Broker Starting…
[INFO] Data directory ready: ./data
[INFO] Loading log segments from disk…
[INFO] Loaded 3 log segments from disk
[INFO] Recovered offsets up to 10432
[INFO] Listening for producer connections…
```
Why this matters:
Debuggability
Replay verification
Crash analysis
Trust in the system
What This Project Taught Me

1. Abstractions Hide Complexity
Before Sentry, “Kafka stores messages” was a black box.
Now I know:
How bytes hit disk
How offsets are assigned
How indexes are structured
How recovery actually works
2. Concurrency Is Not Free
Goroutines are cheap, but not infinite.
Without:
Worker pools
Backpressure
Bounded queues
your system will explode under load.
3. Failure Is the Default
If you don’t explicitly design for:
crashes
restarts
partial writes
corrupt state
your system is already broken.
4. Performance Is a Trade-Off
Every choice has consequences:
fsync = durability vs throughput
memory buffers = speed vs safety
batching = latency vs efficiency
There is no “best” choice.
Only conscious trade-offs.
What’s Next for Sentry
This is only the beginning.
Planned improvements:
Consumer groups
Replication across nodes
Leader election
Segment compaction
Retention policies
Snapshotting
Metrics & dashboards
Raft-based metadata layer
Why I’m Building This in Public
Because:
It keeps me accountable
I get feedback from real engineers
It forces me to document decisions
It creates a learning trail others can follow
Final Thoughts
Building Sentry taught me something critical:
Building features is easy.
Designing systems that don’t break is the real challenge.
This project took me far beyond CRUD apps and APIs.
It forced me to think in:
bytes
offsets
failure modes
recovery paths
concurrency limits
durability guarantees
And honestly?
This has been the most educational project I’ve ever built.
Repo & Resources
GitHub:
https://github.com/tejas2428cse990-svg/Sentry.git
Blog series (coming soon):
👉 Dev.to link
Feedback Welcome