Building a sub-3-second alert pipeline across 14 chains
When we first described Opsion as delivering sub-3-second alerts, a few engineers pushed back. On a single chain with a fast block time, that is tractable. Across 14 EVM-compatible networks with different finality assumptions, RPC reliability profiles, and chain re-org behaviour, it sounded like we were setting ourselves up to fail.
We were not. This post explains how the pipeline actually works, what trade-offs we made, and where the 3-second figure comes from.
14 chains is not 14x one chain
The naive approach to multi-chain monitoring is to run one listener per chain, normalise the output, and feed it into a single rule engine. This works in demos. It breaks in production because the failure modes of each chain are different, and a system that treats them identically will have silent gaps.
- Ethereum mainnet has roughly 12-second block times but high RPC provider reliability. Missing a block is rare, but handling it gracefully is still required.
- Polygon has roughly 2-second block times, higher RPC variability, and re-org rates that need careful confirmation-depth tuning.
- Arbitrum and Optimism use sequencers, so "confirmed" means something different from what it does on L1. Sub-second transaction inclusion is possible, but L1 finality takes longer.
- Some chains have intermittent RPC outages lasting minutes. Your monitoring cannot simply pause for those minutes.
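One way to make these differences explicit is to drive each listener from a per-chain profile rather than shared constants. The sketch below shows the shape of that configuration; the type name, field names, and the specific values are illustrative, not our production settings.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ChainProfile:
    """Per-chain tuning knobs (illustrative values, not production config)."""
    chain_id: int
    block_time_ms: int       # expected block interval
    poll_interval_ms: int    # polling-fallback cadence when WebSocket is unavailable
    confirmation_depth: int  # blocks before an event is treated as settled
    reorg_window: int        # sliding window of blocks kept for reconciliation

# Hypothetical profiles for three of the chains discussed above.
PROFILES = {
    "ethereum": ChainProfile(1, 12_000, 6_000, 1, 32),
    "polygon":  ChainProfile(137, 2_000, 1_000, 3, 128),
    "arbitrum": ChainProfile(42161, 250, 250, 1, 256),
}
```

A table of profiles like this keeps chain-specific behaviour in data rather than scattered through listener code, which is what makes 14 chains manageable.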
The architecture
Layer 1: Chain listeners
Each supported chain has a dedicated listener process. Listeners subscribe to new block events via WebSocket where available, falling back to polling at chain-appropriate intervals. For each new block, the listener fetches the full transaction list, decodes relevant fields, and pushes normalised events onto an internal message queue.
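The polling fallback reduces to a simple step: compare the last processed block against the chain head and fetch everything in between. A minimal async sketch, where `get_head` and `fetch_block` stand in for the chain's RPC calls and all names are illustrative:

```python
import asyncio

async def poll_once(last_seen, get_head, fetch_block, queue):
    """One polling step: fetch every block between last_seen and the head.

    In production this runs in a loop at a chain-appropriate interval,
    with WebSocket subscriptions preferred where available.
    """
    head = await get_head()
    for n in range(last_seen + 1, head + 1):
        block = await fetch_block(n)
        await queue.put(block)  # handed to normalisation downstream
    return head
```

Because the step is driven by block numbers rather than "whatever arrived", a slow poll cycle fetches several blocks instead of skipping them.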
Listeners maintain a sliding window of the last N blocks per chain and reconcile any gaps on reconnection. A 30-second RPC outage on Polygon does not mean those blocks are missed. It means the gap is detected and backfilled during recovery.
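The sliding window makes gap detection on reconnection cheap: anything between the oldest retained block and the current head that was never recorded is a backfill candidate. A minimal sketch, with illustrative names and window size:

```python
from collections import deque

class BlockWindow:
    """Sliding window of recently seen block numbers with gap detection.

    Used on reconnection to find blocks to backfill after an RPC outage;
    the class name and default size are illustrative.
    """
    def __init__(self, size=128):
        self.seen = deque(maxlen=size)

    def record(self, number):
        self.seen.append(number)

    def gaps(self, head):
        have = set(self.seen)
        start = min(have) if have else head
        return [n for n in range(start, head + 1) if n not in have]
```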
Layer 2: Normalisation
Every chain event is mapped to a common schema before it touches the rule engine. The schema captures chain ID, block number, transaction hash, from and to addresses, value, token contract, decoded method call, gas data, and a computed risk metadata field populated from our address database.
Normalisation is also where we resolve token decimals, convert values to USD using our internal price oracle, and annotate addresses with known labels such as "exchange hot wallet", "sanctioned entity", or "mixer contract".
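The common schema can be pictured as a flat record. The field names below are illustrative, not our exact production schema, but they cover the fields described above:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class NormalisedEvent:
    """Common shape every chain event is mapped to before rule evaluation.

    Field names are illustrative; the production schema also carries gas
    data and the full decoded call arguments.
    """
    chain_id: int
    block_number: int
    tx_hash: str
    from_addr: str
    to_addr: str
    value_usd: float                  # converted at detection time
    token_contract: Optional[str] = None
    method: Optional[str] = None      # decoded method call, when the ABI is known
    risk_labels: List[str] = field(default_factory=list)  # e.g. "mixer contract"
```

Flattening everything chain-specific away at this layer is what lets a single rule engine serve all 14 chains.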
Layer 3: The rule engine
Normalised events fan out to a rule evaluation engine that checks each event against all active rules for all watching users in parallel. Rules are compiled at creation time to a bytecode representation that evaluates in microseconds per event. The rule engine runs as a stateless worker fleet, so horizontal scaling is straightforward.
Rules can reference stateful context: rolling 24-hour volumes per address, historical counterparty interactions, and organisation-level risk thresholds. This state lives in Redis with sub-millisecond read latency.
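The production engine compiles rules to bytecode; as a simplified stand-in, a closure captures the same idea of paying the parse cost once at rule-creation time and evaluating cheaply per event. Everything here is illustrative:

```python
import operator

def compile_rule(field_name, op, threshold):
    """Compile a simple comparison rule into a fast per-event predicate.

    A closure-based stand-in for the bytecode compilation described above;
    names and the operator set are illustrative.
    """
    cmp = {">": operator.gt, "<": operator.lt, "==": operator.eq}[op]
    def evaluate(event):
        return cmp(event.get(field_name, 0), threshold)
    return evaluate

# A hypothetical rule: flag transfers above $1M.
large_transfer = compile_rule("value_usd", ">", 1_000_000)
```

Because compiled rules are pure functions of the event (plus cached context), the worker fleet stays stateless and scales by adding machines.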
Layer 4: Alert delivery
A rule match produces an alert record in Postgres and triggers a delivery fanout: WebSocket push to open console sessions, webhook delivery to configured endpoints, and channel notifications via Slack, Telegram, and Lark. The delivery layer is decoupled from rule evaluation and retries failed deliveries with exponential backoff.
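The retry policy can be sketched as a small wrapper around any delivery channel. The function name, attempt count, and base delay below are illustrative, not our production values:

```python
import time

def deliver_with_backoff(send, payload, max_attempts=5, base_delay=0.5,
                         sleep=time.sleep):
    """Retry a delivery callable with exponential backoff.

    `send` is any delivery channel (webhook POST, Slack API call, ...).
    `sleep` is injectable so the policy is testable without real waits.
    """
    for attempt in range(max_attempts):
        try:
            send(payload)
            return True
        except Exception:
            if attempt == max_attempts - 1:
                return False  # exhausted; production would dead-letter here
            sleep(base_delay * (2 ** attempt))
    return False
```

Keeping delivery behind this boundary means a flapping webhook endpoint slows its own alerts down, not rule evaluation for everyone else.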
Where does 3 seconds come from?
The end-to-end latency breakdown for a typical Ethereum mainnet transaction that matches a rule:
- Block propagation to our RPC provider: ~800ms, varies by provider and network conditions
- Block fetch and transaction decode: ~120ms
- Normalisation including address annotation: ~40ms
- Rule evaluation across all active rules: ~15ms
- Alert record write and WebSocket push: ~25ms
Total is roughly 1 second in the median case. The 3-second figure is our P95. It accounts for RPC provider lag, Redis cache misses on less-common addresses, and network jitter to alert delivery endpoints. We advertise P95, not P50, because "usually fast" is not good enough for compliance workflows.
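For reference, the P50/P95 distinction is just two points on the same latency distribution. A nearest-rank percentile over a batch of samples is enough to compute both (the function name is illustrative; production metrics use streaming estimators rather than full sorts):

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile over a batch of latency samples (in ms)."""
    xs = sorted(samples)
    rank = math.ceil(p / 100 * len(xs))
    return xs[max(rank - 1, 0)]
```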
We measure and alert on our own pipeline latency continuously. If median detection latency for any chain exceeds 5 seconds for more than 60 seconds, an internal incident is triggered automatically.
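The watchdog logic above ("median above 5 seconds for more than 60 seconds") can be sketched as a small stateful check. The thresholds mirror the policy described; the class name, window size, and injectable clock are illustrative:

```python
import statistics
import time
from collections import deque

class LatencySLO:
    """Trigger an internal incident when median latency breaches a threshold
    for a sustained period. Names and window size are illustrative."""

    def __init__(self, threshold_ms=5_000, sustain_s=60, clock=time.monotonic):
        self.threshold_ms = threshold_ms
        self.sustain_s = sustain_s
        self.clock = clock
        self.window = deque(maxlen=100)   # recent per-event latencies
        self.breach_started = None        # when the current breach began

    def record(self, latency_ms):
        """Record one latency sample; return True if an incident should fire."""
        self.window.append(latency_ms)
        now = self.clock()
        if statistics.median(self.window) > self.threshold_ms:
            if self.breach_started is None:
                self.breach_started = now
            return (now - self.breach_started) >= self.sustain_s
        self.breach_started = None  # median recovered; reset the timer
        return False
```

Resetting the timer when the median recovers means a single slow block does not page anyone; only sustained degradation does.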
Trade-offs we made
Speed required making deliberate choices. A few worth calling out:
- We process transactions at 1-confirmation depth on most chains. For use cases where false positives from re-orgs are unacceptable, users can configure a higher confirmation depth at the cost of latency.
- Address annotation is done from a precomputed cache, not real-time on-chain lookups. New addresses get annotated within 30 seconds of first appearance, so very new wallets may not have full labels on their first transaction.
- We normalise token values to USD at the time of detection using the best available price. Price feed latency means this figure can diverge slightly from final-settled prices on volatile assets.
What we are building next
The next phase focuses on cross-chain correlation: identifying when an entity is executing a strategy that spans multiple chains at the same time. This requires a stateful graph model that persists across per-chain event streams. It is more complex than per-chain rule evaluation, but essential for catching the most sophisticated on-chain behaviour.