High-Frequency Settlement Engine

C / Go / PostgreSQL / Docker / Protobufs

Execution Architecture

System Architecture Diagram
Fig 1. System design separating the C matching engine from the Go persistence and logging layer. Communication occurs over Unix Domain Sockets using length-prefixed Protobuf framing, avoiding the overhead and added jitter of the TCP stack.

The system architecture decouples low-latency matching from high-integrity settlement. The execution core is implemented in C to maintain deterministic sub-millisecond response times. To eliminate the jitter caused by syscalls and heap fragmentation in malloc(), we implemented a custom slab allocator: at startup, the engine requests a contiguous 128 MB memory block and organizes it into a LIFO pointer stack. Incoming orders are assigned to pre-warmed memory slots, ensuring O(1) allocation and deallocation while maximizing CPU L1 cache locality during the matching loop.
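The production allocator is written in C, but the LIFO free-stack idea can be sketched in a few lines of Go. The `Order` fields and sizes below are invented for illustration; the point is that both operations are index pushes and pops over one pre-allocated block.

```go
package main

import "fmt"

// Order is the fixed-size node stored in the slab.
type Order struct {
	ID    uint64
	Price uint64
	Qty   uint32
}

// Slab pre-allocates every Order up front and tracks free slots on
// a LIFO stack, so Alloc and Free are both O(1) and the most
// recently freed (cache-hot) slot is handed out first.
type Slab struct {
	nodes []Order
	free  []int32 // stack of free slot indices
}

func NewSlab(capacity int) *Slab {
	s := &Slab{
		nodes: make([]Order, capacity), // one contiguous block
		free:  make([]int32, capacity),
	}
	for i := range s.free {
		s.free[i] = int32(capacity - 1 - i) // slot 0 ends up on top
	}
	return s
}

// Alloc pops the top of the free stack; returns -1 when exhausted.
func (s *Slab) Alloc() int32 {
	if len(s.free) == 0 {
		return -1
	}
	idx := s.free[len(s.free)-1]
	s.free = s.free[:len(s.free)-1]
	return idx
}

// Free pushes the slot back, making it the next one reused (LIFO).
func (s *Slab) Free(idx int32) { s.free = append(s.free, idx) }

func main() {
	s := NewSlab(4)
	a := s.Alloc()
	s.nodes[a] = Order{ID: 1, Price: 10_500, Qty: 2}
	s.Free(a)
	b := s.Alloc() // same slot returned: LIFO reuse
	fmt.Println(a == b) // true
}
```

The LIFO discipline is what makes Fig 2's cache-locality claim work: the slot just freed is the one most likely to still sit in L1 when the next order arrives.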

Slab Allocator Memory Layout
Fig 2. Visualization of the slab allocator. The LIFO stack ensures that recently freed order nodes, which are likely still resident in the CPU cache, are immediately reused for incoming requests.

Persistence & ACID Compliance

The settlement gateway, written in Go, bridges the execution engine to a PostgreSQL ledger. We enforce financial integrity through strict ACID transactions: balance updates use row-level locking (`SELECT ... FOR UPDATE`) to isolate concurrent trades involving the same user ID, effectively serializing conflicting state changes at the database level. To mitigate the write amplification caused by the Postgres Write-Ahead Log (WAL) during high-throughput bursts (10k+ TPS), we implemented a micro-batching mechanism.
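A sketch of what one such settlement transaction might look like from `database/sql`. The table and column names (`accounts`, `user_id`, `balance`) are hypothetical, and the lock-ordering helper is an assumed detail: acquiring the two row locks in ascending user-ID order prevents two concurrent settlements from deadlocking on each other.

```go
package main

import (
	"database/sql"
	"fmt"
)

// lockOrder returns the two account IDs in ascending order, so
// concurrent settlements always acquire row locks in the same
// sequence and cannot deadlock on each other.
func lockOrder(a, b uint64) (uint64, uint64) {
	if a < b {
		return a, b
	}
	return b, a
}

// settle debits the buyer and credits the seller inside one
// transaction, pinning both balance rows with SELECT ... FOR UPDATE
// before either is modified.
func settle(tx *sql.Tx, buyer, seller, amount uint64) error {
	first, second := lockOrder(buyer, seller)
	for _, id := range []uint64{first, second} {
		var bal uint64
		if err := tx.QueryRow(
			`SELECT balance FROM accounts WHERE user_id = $1 FOR UPDATE`,
			id).Scan(&bal); err != nil {
			return err
		}
	}
	if _, err := tx.Exec(
		`UPDATE accounts SET balance = balance - $1 WHERE user_id = $2`,
		amount, buyer); err != nil {
		return err
	}
	_, err := tx.Exec(
		`UPDATE accounts SET balance = balance + $1 WHERE user_id = $2`,
		amount, seller)
	return err
}

func main() {
	f, s := lockOrder(42, 7)
	fmt.Println(f, s) // 7 42
}
```

Row locks held by one transaction force any conflicting transaction on the same user to wait at the `SELECT`, which is the serialization the paragraph above describes.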

Batch Commit Logs
Fig 3. Gateway output showing the group-commit protocol. Instead of fsyncing every trade individually, the worker aggregates net balance deltas in memory and flushes them via the Postgres COPY protocol. The variance in batch size stems from the randomization in the ingress engine: not every order has an immediate match. This reduces disk I/O operations by more than three orders of magnitude while maintaining durability guarantees.
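The in-memory netting step of the group commit can be sketched as follows. The `Batcher` type and its API are invented for this example; the real gateway's flush trigger (size or timer) is omitted.

```go
package main

import "fmt"

// Batcher accumulates net balance deltas in memory; one flush then
// writes a single row per touched account (via COPY in the real
// gateway) instead of issuing one fsync per trade.
type Batcher struct {
	deltas map[uint64]int64 // user ID -> net balance change
}

func NewBatcher() *Batcher { return &Batcher{deltas: map[uint64]int64{}} }

// AddTrade nets the buyer's debit and the seller's credit into the
// pending batch; opposing trades on the same account cancel out.
func (b *Batcher) AddTrade(buyer, seller uint64, amount int64) {
	b.deltas[buyer] -= amount
	b.deltas[seller] += amount
}

// Flush hands the netted deltas to the caller and resets the batch;
// the caller streams them to Postgres in one transaction.
func (b *Batcher) Flush() map[uint64]int64 {
	out := b.deltas
	b.deltas = map[uint64]int64{}
	return out
}

func main() {
	b := NewBatcher()
	b.AddTrade(1, 2, 100) // user 1 pays user 2
	b.AddTrade(2, 1, 40)  // partial round trip nets against it
	fmt.Println(b.Flush()) // map[1:-60 2:60]
}
```

Netting is where the three-orders-of-magnitude I/O reduction comes from: a burst of thousands of trades touching a few hot accounts collapses into a handful of rows per flush.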

Data serialization is handled via Protocol Buffers. We define a strict schema using `uint64` and `uint32` types to enforce fixed-point arithmetic, preventing the floating-point rounding errors that are unacceptable in financial software. The entire stack is orchestrated via Docker Compose, using named volumes to work around the virtiofs limitation on Unix Domain Sockets in macOS environments.
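To make the fixed-point convention concrete, here is a minimal sketch assuming values are carried as micro-units (a scale of 1e6 per whole unit — the actual scale factor in the schema is not stated above, so treat it as an assumption).

```go
package main

import "fmt"

// Monetary values travel as uint64 micro-units (assumed 1e6 per
// whole unit), matching the schema's fixed-point convention, so no
// float ever touches a balance.
const microsPerUnit = 1_000_000

// notionalMicros multiplies an integer quantity by a price quoted
// in micro-units; the result is exact, unlike float64 math.
func notionalMicros(qty uint32, priceMicros uint64) uint64 {
	return uint64(qty) * priceMicros
}

func main() {
	// 3 units at 0.1 each: exact in fixed point, whereas 0.1 is
	// not even representable as a binary float.
	total := notionalMicros(3, 100_000)
	fmt.Println(total) // 300000
}
```

Keeping all arithmetic in integers means equality checks and ledger sums are exact, which is the property the `uint64`/`uint32` schema types are enforcing.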