ADR 0027: State DB Concurrency Model and Tiered Loss Window
Status: Accepted for architectural direction.
Date: 2026-05-15.
Context
The version 1 backend uses a writable SQLite state database (ADR 0010).
Phase P10 targets a single Fly.io machine serving 50–200 concurrent users.
SQLite in its default rollback-journal mode takes an exclusive lock for
each write that blocks readers and other writers, so concurrent requests
would cascade into SQLITE_BUSY errors when they collide.
SQLite supports WAL mode, in which one writer and many readers can access the database concurrently: sqlite.org/wal.html.
modernc.org/sqlite is a pure-Go SQLite driver that accepts URI pragmas
via the connection string, so WAL and busy_timeout can be set at
sql.Open time instead of through per-connection PRAGMA statements that
apply only to the single connection that issued them.
The state DB lives on a single Fly volume; Fly takes daily volume snapshots and retains them for an operator-configurable window.
Hourly VACUUM INTO produces a consistent point-in-time backup file
without blocking the live writer (WAL is required for the non-blocking
property).
A horizontally scaled deployment would require either a remote write target (e.g., a server-class DB) or distributed coordination on top of SQLite; both fall outside V1’s scope.
Decision
The state DB will open with the URI pragma chain
?_pragma=journal_mode(WAL)&_pragma=busy_timeout(5000)&_pragma=synchronous(NORMAL)&_pragma=foreign_keys(ON).
The state DB connection pool will be sized with SetMaxOpenConns(50) /
SetMaxIdleConns(10) / SetConnMaxIdleTime(5m).
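A minimal sketch of how the DSN pragma chain and pool sizing combine at startup, assuming the modernc.org/sqlite driver (registered as "sqlite") and an illustrative /data/state.sqlite path:

```go
package state

import (
	"database/sql"
	"time"

	_ "modernc.org/sqlite" // pure-Go driver; registers as "sqlite"
)

// stateDSN carries the full pragma chain so every pooled connection,
// not just the first one, gets WAL, busy_timeout, synchronous, and
// foreign_keys settings. The file path is illustrative.
const stateDSN = "file:/data/state.sqlite" +
	"?_pragma=journal_mode(WAL)" +
	"&_pragma=busy_timeout(5000)" +
	"&_pragma=synchronous(NORMAL)" +
	"&_pragma=foreign_keys(ON)"

func OpenStateDB() (*sql.DB, error) {
	db, err := sql.Open("sqlite", stateDSN)
	if err != nil {
		return nil, err
	}
	// Pool sizing from this decision.
	db.SetMaxOpenConns(50)
	db.SetMaxIdleConns(10)
	db.SetConnMaxIdleTime(5 * time.Minute)
	return db, db.Ping()
}
```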
All state-mutating endpoints will route through a single WriteCoordinator
goroutine with a buffered request channel (depth 64); the coordinator
serializes every mutation against the same *sql.DB so concurrent
mutators cannot collide on the WAL writer slot.
Reads bypass the coordinator and run on the connection pool directly, since WAL permits unlimited concurrent readers alongside one writer.
The WriteCoordinator will return ErrWriteQueueFull when the channel
is at capacity and a 5-second enqueue deadline elapses; the API layer
maps this to HTTP 503 + Retry-After: 5.
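A sketch of the coordinator under the rules above; the WriteCoordinator and ErrWriteQueueFull names come from this decision, while the Mutation type, the Submit signature, and the response plumbing are illustrative, not the actual implementation:

```go
package state

import (
	"context"
	"database/sql"
	"errors"
	"time"
)

// ErrWriteQueueFull is returned when the queue stays full past the
// enqueue deadline; the API layer maps it to 503 + Retry-After: 5.
var ErrWriteQueueFull = errors.New("state write queue full")

// Mutation is one serialized unit of work against the state DB.
type Mutation func(ctx context.Context, db *sql.DB) error

type writeRequest struct {
	ctx  context.Context
	m    Mutation
	done chan error
}

type WriteCoordinator struct {
	db    *sql.DB
	queue chan writeRequest
}

func NewWriteCoordinator(db *sql.DB) *WriteCoordinator {
	c := &WriteCoordinator{db: db, queue: make(chan writeRequest, 64)}
	go c.run()
	return c
}

// run drains the queue in a single goroutine, so at most one mutation
// holds the WAL writer slot at any time.
func (c *WriteCoordinator) run() {
	for req := range c.queue {
		req.done <- req.m(req.ctx, c.db)
	}
}

// Submit enqueues a mutation and waits for its result. If the queue is
// still full after 5 seconds, it returns ErrWriteQueueFull.
func (c *WriteCoordinator) Submit(ctx context.Context, m Mutation) error {
	// done is buffered so the coordinator never blocks on a caller that
	// has already given up.
	req := writeRequest{ctx: ctx, m: m, done: make(chan error, 1)}
	select {
	case c.queue <- req:
	case <-time.After(5 * time.Second):
		return ErrWriteQueueFull
	case <-ctx.Done():
		return ctx.Err()
	}
	select {
	case err := <-req.done:
		return err
	case <-ctx.Done():
		return ctx.Err()
	}
}
```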
Hourly VACUUM INTO /data/snapshots/state-<utc-timestamp>.db will run
in a BackupRunner goroutine; retention sweeps keep the most recent
168 files (one week).
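One way the BackupRunner loop could look; the snapshot directory, filename pattern, and 168-file retention come from this decision, while the hard-coded hourly interval, the pruning helper, and the error handling are illustrative (a configurable interval is discussed under Consequences):

```go
package state

import (
	"context"
	"database/sql"
	"fmt"
	"log"
	"os"
	"path/filepath"
	"sort"
	"time"
)

const (
	snapshotDir = "/data/snapshots"
	keepFiles   = 168 // one week of hourly snapshots
)

// RunBackups takes an hourly VACUUM INTO snapshot and prunes old files.
// Under WAL, the snapshot read does not block the live writer.
func RunBackups(ctx context.Context, db *sql.DB) {
	ticker := time.NewTicker(time.Hour)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			name := fmt.Sprintf("state-%s.db",
				time.Now().UTC().Format("20060102T150405Z"))
			dest := filepath.Join(snapshotDir, name)
			// The path is program-generated (no quotes), so embedding it as
			// an SQL string literal is safe here.
			if _, err := db.ExecContext(ctx,
				fmt.Sprintf("VACUUM INTO '%s'", dest)); err != nil {
				log.Printf("snapshot failed: %v", err)
				continue
			}
			if err := pruneSnapshots(snapshotDir, keepFiles); err != nil {
				log.Printf("snapshot prune failed: %v", err)
			}
		}
	}
}

// pruneSnapshots keeps the newest keep files; lexicographic order of the
// UTC timestamps in the filenames matches chronological order.
func pruneSnapshots(dir string, keep int) error {
	matches, err := filepath.Glob(filepath.Join(dir, "state-*.db"))
	if err != nil {
		return err
	}
	if len(matches) <= keep {
		return nil
	}
	sort.Strings(matches)
	for _, old := range matches[:len(matches)-keep] {
		if err := os.Remove(old); err != nil {
			return err
		}
	}
	return nil
}
```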
Fly daily volume snapshots will be retained for 14 days as a second backup tier.
Litestream and other streaming replicators are explicitly rejected for V1; the single-machine + tiered-snapshot model is the V1 boundary.
The migration trigger out of single-machine SQLite into a server-class
DB will be state_writes_per_second p99 > 50 sustained for 1 hour.
Tiered Loss Window
Tier 1 — process crash, volume intact: zero loss. WAL recovery on the
next sql.Open replays committed frames.
Tier 2 — state DB corruption, volume intact: ≤1 hour loss. Restore from
the newest /data/snapshots/state-*.db by cp over state.sqlite.
Tier 3 — volume catastrophic failure: ≤24 hour loss. Restore from the
most recent Fly daily volume snapshot via
fly volumes create --snapshot-id ....
Tier 4 — beyond Fly retention: loss window = (now − last manual offsite
copy). Offsite copy is an operator chore documented in
docs/operations/p10-deploy/.
CONTEXT D7 originally stated “≤1 hour loss” as a universal claim; this ADR supersedes that with the four-tier model above.
Consequences
The URI pragma chain supersedes the single per-connection
PRAGMA foreign_keys = ON issued by the previous startup path; because
the pragmas travel in the DSN, they are applied to every connection in
the pool, fixing a latent bug where additional pooled connections
lacked foreign-key enforcement.
The single-writer model precludes horizontal scaling without re-architecting the state DB layer. A second machine cannot safely share the same state SQLite file.
The WriteCoordinator adds one goroutine-hop of latency per mutation; at
the V1 write rate (RESEARCH Q1 envelope: ≤50 writes/sec at 200 users)
the added cost is negligible relative to the SQLite commit latency of
each mutation.
The hourly snapshot interval bounds the worst-case data loss to 1 hour
within Tier 2. Operators trading capacity for a tighter window can
lower UWSCRAPE_STATE_SNAPSHOT_INTERVAL.
The 168-file retention default consumes roughly 168 × the size of one state DB on the volume; for V1 traffic this is well under the 1 GB volume budget.
/api/v1/health returns 503 + Retry-After: 5 until the catalog
runtime caches have warmed; Fly’s health check grace_period=30s
covers warmup so the rolling deploy waits before sending traffic.
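For illustration, a handler matching that behavior might look like the sketch below; the ready() readiness hook and the handler name are assumptions, not the actual implementation:

```go
package api

import "net/http"

// healthHandler reports 503 + Retry-After: 5 until ready() returns true,
// matching the warmup behavior described above.
func healthHandler(ready func() bool) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if !ready() {
			// Catalog caches still warming: tell Fly (and clients) to retry.
			w.Header().Set("Retry-After", "5")
			http.Error(w, "warming up", http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	}
}
```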
References
- ADR 0010 (preserved): Go runtime with read-only index and separate state store.
- ADR 0020 (superseded for production adapter choice): SvelteKit frontend application framework; V1 production now uses adapter-static embedded in the Go binary via embed.FS. See REQ-043 / REQ-046 amendments.
- P10 CONTEXT D2 (write coordinator), D3 (Fly.io single machine), D4 (rolling restart), D5 (rate limits), D6 (logging), D7 (snapshots), D10 (single-binary frontend serving).
- P10 RESEARCH Q1 (write rate envelope), Q3 (tiered loss model), Q5 (proxy headers).