Backend Runtime Operations Specification
Status: Draft v0.1.
Project: UWScrape.
Directory: docs/specs.
Audience: backend implementers, operators, release reviewers, and local development maintainers.
Last reviewed: 2026-05-11.
Primary architecture document: docs/reference/architecture/backend-runtime-architecture/.
Related documents:
- docs/specifications/backend-api-spec/
- docs/specifications/backend-state-storage-spec/
- docs/decisions/0006-backend-index-contract/
- docs/decisions/0007-index-release-gate-policy/
- docs/decisions/0010-go-runtime-readonly-index-and-state-store/
- docs/decisions/0011-anonymous-state-token-policy/
1. Purpose
This document specifies version 1 backend runtime operations.
It covers startup validation, configuration, health responses, request limits, timeouts, logging, metrics, static asset serving, and deployment assumptions.
It does not specify endpoint payload semantics.
It does not specify scraper or index builder commands.
It does not choose a concrete SQLite driver.
2. Source Anchors
The Go implementation should use net/http server concepts such as Server, handlers, request contexts, timeouts, and shutdown: Go net/http.
The backend should use Go database/sql concepts for state and index database access unless a later implementation spec narrows this: Go database/sql.
SQLite URI filename parameters such as mode=ro are documented by SQLite: SQLite URI filenames.
HTTP caching behavior follows RFC 9111: RFC 9111.
HTTP method and status semantics follow RFC 9110: RFC 9110.
Token handling follows OWASP session guidance for unpredictable identifiers, secure handling, and avoiding URL transport and logging: OWASP Session Management Cheat Sheet.
3. Runtime Inputs
Version 1 backend inputs:
- backend binary;
- published index directory;
- writable state database path;
- token verifier key file or equivalent local secret source;
- optional static frontend directory for fallback deployments;
- environment configuration.
The backend does not fetch Waterloo or Kuali data at runtime.
The backend does not read raw scraper snapshots at runtime.
The backend does not apply parser patches at runtime.
The backend does not hot-swap indexes in version 1.
4. Configuration Keys
Required keys:
| Key | Meaning |
|---|---|
| UWSCRAPE_BIND_ADDR | HTTP bind address, such as 127.0.0.1:8080. |
| UWSCRAPE_INDEX_DIR | Published index directory. |
| UWSCRAPE_STATE_DB_PATH | Writable state SQLite database path. |
| UWSCRAPE_TOKEN_KEY_PATH | Secret key material for token verifiers. |
Optional keys:
| Key | Meaning |
|---|---|
| UWSCRAPE_STATIC_DIR | Optional fallback frontend static asset directory. Primary v1 frontend deployment uses the SvelteKit Node server. |
| UWSCRAPE_LOG_LEVEL | Log verbosity. |
| UWSCRAPE_ALLOWED_ORIGINS | Allowed frontend origins when split from API. |
| UWSCRAPE_MAX_STATE_BODY_BYTES | Override state request body limit. |
| UWSCRAPE_MAX_QUERY_BODY_BYTES | Override query request body limit. |
| UWSCRAPE_MAX_GRAPH_BODY_BYTES | Override graph request body limit. |
Production defaults:
- no unapproved index override;
- no unsupported schema override;
- no request body logging;
- no raw token logging;
- bind to localhost unless configured otherwise.
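The required keys above can be loaded and checked in one pass. The following Go sketch is illustrative, not normative: the `Config` type and `LoadConfig` function are hypothetical names, and the sketch covers only the four required keys. It collects every missing key before failing, so one startup attempt reports all configuration gaps.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Config holds the four required keys from section 4.
// Optional keys are omitted to keep the sketch short.
type Config struct {
	BindAddr     string
	IndexDir     string
	StateDBPath  string
	TokenKeyPath string
}

// LoadConfig reads required keys through getenv and reports all missing
// keys in a single error, so startup can fail closed with a full picture.
func LoadConfig(getenv func(string) string) (Config, error) {
	var missing []string
	get := func(key string) string {
		v := strings.TrimSpace(getenv(key))
		if v == "" {
			missing = append(missing, key)
		}
		return v
	}
	cfg := Config{
		BindAddr:     get("UWSCRAPE_BIND_ADDR"),
		IndexDir:     get("UWSCRAPE_INDEX_DIR"),
		StateDBPath:  get("UWSCRAPE_STATE_DB_PATH"),
		TokenKeyPath: get("UWSCRAPE_TOKEN_KEY_PATH"),
	}
	if len(missing) > 0 {
		return Config{}, errors.New("missing required configuration: " + strings.Join(missing, ", "))
	}
	return cfg, nil
}

func main() {
	// Demo environment; a real binary would pass os.Getenv instead.
	demo := map[string]string{
		"UWSCRAPE_BIND_ADDR":      "127.0.0.1:8080",
		"UWSCRAPE_INDEX_DIR":      "data/published/2026-2027-undergraduate",
		"UWSCRAPE_STATE_DB_PATH":  "data/runtime/state.sqlite",
		"UWSCRAPE_TOKEN_KEY_PATH": "data/runtime/token-key",
	}
	cfg, err := LoadConfig(func(k string) string { return demo[k] })
	if err != nil {
		fmt.Println("startup refused:", err)
		return
	}
	fmt.Println("bind address:", cfg.BindAddr)
}
```

Passing the lookup function as a parameter keeps the loader testable without mutating the process environment.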
5. Startup Sequence
Startup must fail closed by default.
Sequence:
- load configuration;
- validate configuration values;
- resolve absolute paths;
- read build-metadata.json;
- read validation-summary.json;
- read release-decision.json;
- verify release decision status is approved or approved_with_warnings;
- verify index_schema_version is supported;
- open course-universe.sqlite read-only;
- compare SQLite metadata with build-metadata.json;
- validate required index tables exist;
- run cheap index consistency probes;
- open or create state database;
- run state database migrations;
- load token verifier key material;
- initialize route handlers;
- start HTTP server.
If any required step fails, the backend must not serve ordinary API traffic.
If an explicit diagnostic server mode is added later, development diagnostics must be visible in /api/v1/health and /api/v1/index.
Diagnostic mode must not serve ordinary query or state mutation traffic with an unsupported schema, missing release decision, or rejected release decision.
6. Published Index Checks
The backend must confirm the published index directory contains:
- course-universe.sqlite;
- build-metadata.json;
- validation-summary.json;
- release-decision.json;
- build-report.md.
Optional:
- graph-projection.json.
The backend must treat SQLite as canonical.
If optional graph projection data is present but malformed, has the wrong view type, omits index/catalog identity metrics, or declares identity metrics that disagree with the published index metadata, the backend must refuse startup. Missing graph projection data is not a startup error; the backend falls back to SQLite-backed graph construction.
Graph routes that use the projection cache must identify that in response
metrics, for example with projection_source: "published_artifact".
The backend must parse release-decision.json.
Required release decision fields for startup:
- release decision id;
- release decision status;
- reviewed timestamp;
- index_id;
- index_schema_version;
- parser version;
- source catalog metadata;
- artifact hashes.
Allowed release decision statuses:
- approved;
- approved_with_warnings;
- rejected.
Startup accepts approved and approved_with_warnings.
Startup rejects rejected, missing status, unknown status, missing release decision file, and release decision metadata that conflicts with build-metadata.json or SQLite metadata.
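The status gate can be expressed as a small pure function. The Go sketch below is a minimal illustration under stated assumptions: the JSON field names (`release_decision_id`, `status`, `index_id`) and the function name `CheckReleaseDecision` are hypothetical, and only the status check and one cross-file identity check are shown. The real file layout is defined by the index contract, not by this example.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// releaseDecision mirrors a subset of release-decision.json.
// Field names are assumptions made for this sketch.
type releaseDecision struct {
	ID      string `json:"release_decision_id"`
	Status  string `json:"status"`
	IndexID string `json:"index_id"`
}

// CheckReleaseDecision accepts approved and approved_with_warnings and
// rejects everything else, including a missing or unknown status and an
// index_id that disagrees with build-metadata.json.
func CheckReleaseDecision(raw []byte, buildIndexID string) error {
	var d releaseDecision
	if err := json.Unmarshal(raw, &d); err != nil {
		return fmt.Errorf("release decision unreadable: %w", err)
	}
	switch d.Status {
	case "approved", "approved_with_warnings":
		// Accepted; warnings are surfaced elsewhere.
	case "rejected":
		return fmt.Errorf("release decision %q is rejected", d.ID)
	case "":
		return errors.New("release decision status is missing")
	default:
		return fmt.Errorf("unknown release decision status %q", d.Status)
	}
	if d.IndexID != buildIndexID {
		return fmt.Errorf("release decision index_id %q conflicts with build metadata %q", d.IndexID, buildIndexID)
	}
	return nil
}

func main() {
	raw := []byte(`{"release_decision_id":"rd-1","status":"approved_with_warnings","index_id":"idx-1"}`)
	if err := CheckReleaseDecision(raw, "idx-1"); err != nil {
		fmt.Println("startup refused:", err)
		return
	}
	fmt.Println("release decision accepted")
}
```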
7. Single Active Catalog Policy
Version 1 serves one active published index and one active catalog for normal runtime evaluation.
Multi-catalog runtime loading is not a version 1 requirement.
If saved state references a different catalog version, query responses must report catalog mismatch.
If the state catalog is unavailable, query responses must report catalog_unavailable or an academic unknown, depending on endpoint semantics.
The backend must not silently reinterpret old state under the active catalog.
8. Read-Only Index Opening
The backend should open the index database read-only.
When the SQLite driver supports URI filenames, use mode=ro.
Use immutable=1 only when deployment guarantees the file will not change during process lifetime.
Do not use immutable=1 if deployment overwrites the same SQLite file in place.
The backend must not write to the index database.
The backend should include a test or startup check that detects accidental write capability when practical.
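A read-only SQLite URI can be assembled with the standard library alone. The sketch below is illustrative: `IndexDSN` is a hypothetical helper, and whether a given SQLite driver passes the string through as a URI filename depends on the driver, which this spec deliberately does not choose. The `file:` form and the `mode` and `immutable` parameters are the ones documented by SQLite.

```go
package main

import (
	"fmt"
	"net/url"
)

// IndexDSN builds a SQLite URI filename that requests read-only access.
// Set immutable only when deployment guarantees the file never changes
// while the process is running.
func IndexDSN(absPath string, immutable bool) string {
	q := url.Values{}
	q.Set("mode", "ro")
	if immutable {
		q.Set("immutable", "1")
	}
	// Percent-encode characters such as spaces, '?' and '#' in the path.
	escaped := (&url.URL{Path: absPath}).EscapedPath()
	return "file:" + escaped + "?" + q.Encode()
}

func main() {
	fmt.Println(IndexDSN("/srv/uwscrape/index/course-universe.sqlite", false))
}
```

A startup probe that attempts a harmless write and expects it to fail is one way to detect a driver that ignores the read-only request.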
9. State Database Startup
The backend opens or creates the state database at UWSCRAPE_STATE_DB_PATH.
If the database does not exist, create it and apply migrations.
If the database exists, verify supported state store schema version.
If the database schema is newer than the backend supports, fail startup.
If the database schema is older and migrations are available, run migrations.
State database migration failure is startup failure.
State database migration must not touch the published index.
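The rules in this section reduce to a decision on three inputs. The Go sketch below shows that decision in isolation; `StartupAction`, `DecideStateStartup`, and the integer schema versions are hypothetical names and representations chosen for illustration, not part of the state storage spec.

```go
package main

import "fmt"

// StartupAction names what the backend does with the state database.
type StartupAction string

const (
	ActionCreateAndMigrate StartupAction = "create_and_migrate"
	ActionMigrate          StartupAction = "migrate"
	ActionServe            StartupAction = "serve"
	ActionFail             StartupAction = "fail_startup"
)

// DecideStateStartup maps database existence and schema versions to an
// action. current is the version found on disk; supported is the newest
// version this backend build understands.
func DecideStateStartup(exists bool, current, supported int) StartupAction {
	switch {
	case !exists:
		return ActionCreateAndMigrate
	case current > supported:
		// A newer schema may hold data this build cannot interpret.
		return ActionFail
	case current < supported:
		return ActionMigrate
	default:
		return ActionServe
	}
}

func main() {
	fmt.Println(DecideStateStartup(true, 2, 3))
}
```

A migration error encountered while carrying out `ActionCreateAndMigrate` or `ActionMigrate` is still a startup failure, as stated above.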
10. Health Endpoint
GET /api/v1/health returns operational health.
Required fields:
- status;
- degraded;
- checks;
- loaded index summary when available;
- release decision summary when an index is loaded.
status values:
- ok;
- degraded;
- starting;
- error.
Health response must not include:
- filesystem paths;
- raw tokens;
- token verifier values;
- verifier key material;
- full state data.
11. Index Metadata Endpoint
GET /api/v1/index returns loaded index metadata.
It should include:
- index_id;
- index_schema_version;
- catalog_version_id;
- catalog_title;
- upstream_catalog_id;
- release_status;
- release_decision_id;
- release_decision_reviewed_at;
- build_started_at;
- build_completed_at;
- parser_version;
- validation_summary;
- release decision summary.
It should not include local absolute paths by default.
It should not include raw source payloads.
12. Request Body Limits
Default limits:
| Request class | Default |
|---|---|
| state create or replace | 256 KiB |
| query request | 256 KiB |
| graph view request | 128 KiB |
| migration preview | 256 KiB |
The backend should reject oversized requests before fully decoding them.
Go implementations can use request body limiting through net/http facilities such as MaxBytesReader: Go net/http.
Oversized requests return 413 payload_too_large.
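A minimal sketch of the limit using `http.MaxBytesReader` follows. It is illustrative only: the route `/api/v1/query`, the helper names, and the JSON error body shape are assumptions, and `*http.MaxBytesError` requires Go 1.19 or later. The handler stops reading at the limit, so an oversized body is never fully buffered or decoded.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

const maxQueryBodyBytes = 256 << 10 // 256 KiB, the default for query requests

// limitBody caps how many bytes a handler may read from the request body.
func limitBody(limit int64, next http.HandlerFunc) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		r.Body = http.MaxBytesReader(w, r.Body, limit)
		next(w, r)
	}
}

// queryHandler stands in for a real handler; it only reads the body.
func queryHandler(w http.ResponseWriter, r *http.Request) {
	_, err := io.ReadAll(r.Body) // a real handler would decode JSON here
	var tooLarge *http.MaxBytesError
	switch {
	case errors.As(err, &tooLarge):
		w.Header().Set("Content-Type", "application/json")
		w.WriteHeader(http.StatusRequestEntityTooLarge)
		io.WriteString(w, `{"error":{"code":"payload_too_large"}}`) // body shape is assumed
	case err != nil:
		http.Error(w, "cannot read body", http.StatusBadRequest)
	default:
		w.WriteHeader(http.StatusOK)
	}
}

// statusForBody posts n bytes to the limited handler and returns the status.
func statusForBody(n int) int {
	body := strings.NewReader(strings.Repeat("x", n))
	req := httptest.NewRequest(http.MethodPost, "/api/v1/query", body)
	rec := httptest.NewRecorder()
	limitBody(maxQueryBodyBytes, queryHandler)(rec, req)
	return rec.Code
}

func main() {
	fmt.Println("small body:", statusForBody(1024))
	fmt.Println("oversized body:", statusForBody(maxQueryBodyBytes+1))
}
```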
13. Timeouts
Default timeouts:
| Timeout | Default |
|---|---|
| read header timeout | 5 seconds |
| request body read timeout | 15 seconds |
| catalog request timeout | 5 seconds |
| state request timeout | 10 seconds |
| query request timeout | 15 seconds |
| graph view request timeout | 15 seconds |
| migration preview timeout | 30 seconds |
Request handlers should use request contexts for cancellation.
Client disconnect should cancel unnecessary downstream work.
Long-running future planning should use an explicit asynchronous workflow rather than ordinary synchronous query requests.
Timeouts that affect academic query completeness must be reflected as time_limit_reached unknowns when a partial response is still returned.
Operational timeouts may return HTTP errors when no useful response can be produced.
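The defaults above map onto two standard mechanisms: `http.Server` fields for connection-level limits and `context.WithTimeout` for per-route deadlines. The sketch below is a hedged illustration. The route class names, `timeoutFor`, `withTimeout`, and `newServer` are hypothetical, falling back to the shortest timeout for an unknown class is an assumption, and `ReadTimeout` is only an approximation of a body read timeout because it covers the whole request read, headers included.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"time"
)

// routeTimeouts holds the default per-route-class timeouts from section 13.
var routeTimeouts = map[string]time.Duration{
	"catalog":           5 * time.Second,
	"state":             10 * time.Second,
	"query":             15 * time.Second,
	"graph_view":        15 * time.Second,
	"migration_preview": 30 * time.Second,
}

// timeoutFor returns the timeout for a route class. Unknown classes get
// the shortest default; that fallback is an assumption of this sketch.
func timeoutFor(class string) time.Duration {
	if d, ok := routeTimeouts[class]; ok {
		return d
	}
	return 5 * time.Second
}

// withTimeout attaches a deadline to the request context. Handlers and
// database calls that honor the context stop when it expires or when the
// client disconnects.
func withTimeout(class string, next http.Handler) http.Handler {
	d := timeoutFor(class)
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		ctx, cancel := context.WithTimeout(r.Context(), d)
		defer cancel()
		next.ServeHTTP(w, r.WithContext(ctx))
	})
}

// newServer applies the connection-level defaults.
func newServer(addr string, h http.Handler) *http.Server {
	return &http.Server{
		Addr:              addr,
		Handler:           h,
		ReadHeaderTimeout: 5 * time.Second,
		ReadTimeout:       15 * time.Second, // headers plus body
	}
}

func main() {
	h := withTimeout("query", http.NotFoundHandler())
	srv := newServer("127.0.0.1:8080", h)
	fmt.Println("read header timeout:", srv.ReadHeaderTimeout)
	fmt.Println("query timeout:", timeoutFor("query"))
}
```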
14. Rate Limiting
Version 1 should rate limit:
- state creation;
- token verification failures;
- state mutation;
- query endpoints;
- graph view endpoints;
- migration preview;
- state deletion attempts.
Catalog GET endpoints can have looser limits.
Rate limiting can use:
- client IP;
- token verifier id after successful authentication;
- route class.
Rate limit failures return 429 rate_limited.
Rate limit responses must not reveal whether a state token exists.
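One simple way to key limits by client and route class is a fixed-window counter. The sketch below uses only the standard library and is not the prescribed algorithm: `RateLimiter`, `NewPerMinuteLimiter`, and the route class string are hypothetical, the one-minute window is an arbitrary choice, and the sketch never evicts idle windows, which a long-running process would need.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// limiterKey separates counters by client identity and route class.
type limiterKey struct {
	client     string
	routeClass string
}

type window struct {
	start time.Time
	count int
}

// RateLimiter is a fixed-window counter guarded by a mutex.
type RateLimiter struct {
	mu      sync.Mutex
	limit   int
	period  time.Duration
	now     func() time.Time // replaceable clock for tests
	windows map[limiterKey]*window
}

// NewPerMinuteLimiter allows limit requests per key per minute.
func NewPerMinuteLimiter(limit int) *RateLimiter {
	return &RateLimiter{
		limit:   limit,
		period:  time.Minute,
		now:     time.Now,
		windows: make(map[limiterKey]*window),
	}
}

// Allow reports whether the request fits in the current window. A false
// result corresponds to a 429 rate_limited response.
func (l *RateLimiter) Allow(client, routeClass string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	k := limiterKey{client: client, routeClass: routeClass}
	t := l.now()
	w, ok := l.windows[k]
	if !ok || t.Sub(w.start) >= l.period {
		l.windows[k] = &window{start: t, count: 1}
		return true
	}
	if w.count >= l.limit {
		return false
	}
	w.count++
	return true
}

func main() {
	l := NewPerMinuteLimiter(2)
	for i := 1; i <= 3; i++ {
		fmt.Println("attempt", i, "allowed:", l.Allow("203.0.113.7", "state_create"))
	}
}
```

Keying unauthenticated requests by client IP and route class alone keeps the decision independent of whether a presented token exists.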
15. Logging
Logs should include:
- request id;
- method;
- route pattern;
- status code;
- duration;
- response size when available;
- index id;
- query type;
- graph view type;
- semantic status counts;
- error code.
Logs must not include:
- raw state tokens;
- token verifiers;
- verifier key material;
- authorization headers;
- full request bodies by default;
- user notes;
- full grade history by default;
- raw source HTML by default.
Debug mode may log additional local details only when explicitly enabled.
Debug mode must still redact tokens.
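Redaction is easiest to enforce when headers pass through one function before they reach any logger. The Go sketch below is a minimal illustration: `LoggableHeaders`, the `sensitiveHeaders` set, and the `[REDACTED]` placeholder are hypothetical choices, and a real deployment may need to extend the set.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// sensitiveHeaders lists canonical header names whose values never
// reach the logs, in any log level.
var sensitiveHeaders = map[string]bool{
	"Authorization": true,
}

// LoggableHeaders returns a copy of h that is safe to log. Sensitive
// values are replaced with a fixed placeholder, so the log still shows
// that the header was present.
func LoggableHeaders(h http.Header) map[string]string {
	out := make(map[string]string, len(h))
	for name, values := range h {
		if sensitiveHeaders[http.CanonicalHeaderKey(name)] {
			out[name] = "[REDACTED]"
			continue
		}
		out[name] = strings.Join(values, ", ")
	}
	return out
}

func main() {
	h := http.Header{}
	h.Set("Authorization", "Bearer example-token")
	h.Set("Accept", "application/json")
	fmt.Println(LoggableHeaders(h))
}
```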
16. Metrics
Recommended metrics:
- request count by route;
- request latency by route;
- error count by code;
- rate limit count by route;
- query latency by query type;
- query academic status counts;
- unknown count by unknown_reason;
- conflict count by conflict_reason;
- graph view node count;
- graph view edge count;
- graph view truncation count;
- state creation count;
- state replacement count;
- state deletion count;
- token verification failure count;
- loaded index id;
- loaded index schema version;
- state database schema version.
Metrics must not contain raw tokens.
Metrics should not contain user-specific course lists.
Unknown counts are important because they reveal parser and data quality gaps.
V1 diagnostics expose a redacted in-process request-metrics snapshot under
/api/v1/diagnostics for local demo and development triage. Route keys must be
templated, for example /api/v1/courses/{subject}/{catalog_number}, so
diagnostics can identify hot graph/query routes without recording course-specific
paths, state tokens, or student-state content. The snapshot should include
request count, client/server error count, in-flight count, and average/max/last
duration in milliseconds.
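The snapshot fields listed above fit in a small mutex-guarded map keyed by route template. The following Go sketch is illustrative and not the project's implementation: `Metrics`, `RouteSnapshot`, `Begin`, `End`, and `Snapshot` are hypothetical names. Callers pass the templated route key, never the concrete request path.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// RouteSnapshot is the redacted per-route view exposed to diagnostics.
type RouteSnapshot struct {
	Count, ClientErrors, ServerErrors, InFlight int
	AvgMs, MaxMs, LastMs                        float64
}

type routeStats struct {
	count, clientErrors, serverErrors, inFlight int
	totalMs, maxMs, lastMs                      float64
}

// Metrics aggregates request statistics by templated route key.
type Metrics struct {
	mu     sync.Mutex
	routes map[string]*routeStats
}

func NewMetrics() *Metrics { return &Metrics{routes: make(map[string]*routeStats)} }

// stats returns the entry for route; the caller must hold m.mu.
func (m *Metrics) stats(route string) *routeStats {
	s, ok := m.routes[route]
	if !ok {
		s = &routeStats{}
		m.routes[route] = s
	}
	return s
}

// Begin marks a request as in flight.
func (m *Metrics) Begin(route string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.stats(route).inFlight++
}

// End records the outcome and duration of a finished request.
func (m *Metrics) End(route string, status int, d time.Duration) {
	m.mu.Lock()
	defer m.mu.Unlock()
	s := m.stats(route)
	s.inFlight--
	s.count++
	switch {
	case status >= 500:
		s.serverErrors++
	case status >= 400:
		s.clientErrors++
	}
	ms := float64(d) / float64(time.Millisecond)
	s.totalMs += ms
	s.lastMs = ms
	if ms > s.maxMs {
		s.maxMs = ms
	}
}

// Snapshot returns a copy of the statistics for one route.
func (m *Metrics) Snapshot(route string) RouteSnapshot {
	m.mu.Lock()
	defer m.mu.Unlock()
	s, ok := m.routes[route]
	if !ok {
		return RouteSnapshot{}
	}
	snap := RouteSnapshot{Count: s.count, ClientErrors: s.clientErrors,
		ServerErrors: s.serverErrors, InFlight: s.inFlight, MaxMs: s.maxMs, LastMs: s.lastMs}
	if s.count > 0 {
		snap.AvgMs = s.totalMs / float64(s.count)
	}
	return snap
}

func main() {
	m := NewMetrics()
	route := "/api/v1/courses/{subject}/{catalog_number}"
	m.Begin(route)
	m.End(route, 200, 12*time.Millisecond)
	fmt.Printf("%+v\n", m.Snapshot(route))
}
```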
17. Optional Static Asset Serving
The backend may serve static frontend assets in version 1 only as an optional fallback or local demo mode.
Static assets are optional.
The primary version 1 frontend deployment is the SvelteKit Node server defined by ADR 0020 and the frontend runtime architecture.
API routes remain under /api/v1.
Unknown /api/ routes must return API errors, not index.html.
Unknown non-API routes may serve index.html for client-side routing when static serving is enabled.
Hashed static assets may use long cache lifetimes.
index.html should use cautious caching.
API caching is specified separately from static asset caching.
Static asset serving must not change backend academic semantics.
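The routing rules above can be sketched as a router that sends every `/api/` path to the API handler and applies the `index.html` fallback only to other paths. This Go example is illustrative: the handler names, the `not_found` error code, the JSON body shape, and the `no-cache` value for `index.html` are assumptions, and the in-memory file system stands in for the configured static directory.

```go
package main

import (
	"fmt"
	"io/fs"
	"net/http"
	"net/http/httptest"
	"strings"
	"testing/fstest"
)

// apiRoutes builds the API mux. Unknown API paths get a JSON error.
func apiRoutes() http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/api/v1/health", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"status":"ok"}`)
	})
	mux.HandleFunc("/api/", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		w.WriteHeader(http.StatusNotFound)
		fmt.Fprint(w, `{"error":{"code":"not_found"}}`) // body shape is assumed
	})
	return mux
}

// newRouter keeps API paths away from the static fallback.
func newRouter(api http.Handler, static fs.FS) http.Handler {
	files := http.FileServer(http.FS(static))
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path == "/api" || strings.HasPrefix(r.URL.Path, "/api/") {
			api.ServeHTTP(w, r)
			return
		}
		name := strings.TrimPrefix(r.URL.Path, "/")
		if info, err := fs.Stat(static, name); err == nil && !info.IsDir() && name != "index.html" {
			files.ServeHTTP(w, r) // existing asset, for example a hashed bundle
			return
		}
		// Anything else is treated as a client-side route.
		index, err := fs.ReadFile(static, "index.html")
		if err != nil {
			http.NotFound(w, r)
			return
		}
		w.Header().Set("Content-Type", "text/html; charset=utf-8")
		w.Header().Set("Cache-Control", "no-cache")
		w.Write(index)
	})
}

// probe sends a GET to the router and returns status and content type.
func probe(path string) (int, string) {
	static := fstest.MapFS{
		"index.html":           {Data: []byte("<!doctype html><title>UWScrape</title>")},
		"assets/app.3f9a1c.js": {Data: []byte("console.log('demo')")},
	}
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, path, nil)
	newRouter(apiRoutes(), static).ServeHTTP(rec, req)
	return rec.Code, rec.Header().Get("Content-Type")
}

func main() {
	for _, p := range []string{"/api/v1/unknown", "/courses/CS/135", "/assets/app.3f9a1c.js"} {
		code, ctype := probe(p)
		fmt.Println(p, code, ctype)
	}
}
```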
18. CORS
If frontend and backend are same-origin, CORS may be disabled.
If frontend and backend are split, allowed origins must be explicit.
The backend should not use wildcard CORS for state endpoints in production.
Authorization headers must be allowed only for trusted origins.
Cookie credentials are out of scope for v1 because v1 has no browser cookie sessions.
HttpOnly, Secure, and SameSite cookie settings belong to a later session extension.
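An explicit origin allowlist can be sketched as follows. The example is illustrative and makes several assumptions: that UWSCRAPE_ALLOWED_ORIGINS is a comma-separated list, that the helper names `ParseAllowedOrigins` and `cors` are acceptable stand-ins, and that the listed methods and headers match the real API. Wildcard entries are dropped so a misconfiguration cannot open state endpoints to every origin.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// ParseAllowedOrigins turns a comma-separated list into an exact-match
// set. Empty entries and the wildcard "*" are ignored.
func ParseAllowedOrigins(raw string) map[string]bool {
	allowed := make(map[string]bool)
	for _, o := range strings.Split(raw, ",") {
		o = strings.TrimSpace(o)
		if o == "" || o == "*" {
			continue
		}
		allowed[o] = true
	}
	return allowed
}

// cors echoes the Origin header only when it is on the allowlist, and
// permits the Authorization header only for those origins.
func cors(allowed map[string]bool, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		origin := r.Header.Get("Origin")
		if origin != "" && allowed[origin] {
			h := w.Header()
			h.Set("Access-Control-Allow-Origin", origin)
			h.Add("Vary", "Origin")
			h.Set("Access-Control-Allow-Headers", "Authorization, Content-Type")
			h.Set("Access-Control-Allow-Methods", "GET, POST, PUT, DELETE, OPTIONS") // assumed method set
		}
		if r.Method == http.MethodOptions {
			// Preflight: answer without invoking the API handler.
			w.WriteHeader(http.StatusNoContent)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	allowed := ParseAllowedOrigins("https://app.example.org, https://staging.example.org")
	_ = cors(allowed, http.NotFoundHandler())
	fmt.Println("allowed origins:", len(allowed))
}
```

No `Access-Control-Allow-Credentials` header is set, which matches the absence of cookie sessions in v1.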
19. Deployment
Version 1 deployment is a single backend process.
Inputs:
- backend binary;
- published index directory;
- state database path;
- token verifier key path;
- optional static directory.
Index promotion should happen by changing configuration or symlink target and restarting the backend.
The backend should not hot-swap indexes in version 1.
The state database must be backed up independently from the published index.
Backups should be protected as sensitive data.
20. Local Development
Example local run:

    UWSCRAPE_BIND_ADDR=127.0.0.1:8080 \
    UWSCRAPE_INDEX_DIR=data/published/2026-2027-undergraduate \
    UWSCRAPE_STATE_DB_PATH=data/runtime/state.sqlite \
    UWSCRAPE_TOKEN_KEY_PATH=data/runtime/token-key \
    go run ./cmd/uwscrape-server

The exact command may change after implementation.
Local development must use the same published index contract as production.
Do not introduce a separate development-only index format.
21. Check Config Mode
The backend should support a startup validation mode.
Example:

    go run ./cmd/uwscrape-server --check-config

The mode should:
- load configuration;
- validate index metadata;
- open index read-only;
- run index probes;
- open or create state database if configured to do so;
- validate state schema;
- report success or failure;
- exit without serving HTTP.
This mode is useful for deployment and release validation.
22. Failure Modes
Startup failure examples:
- missing index directory;
- missing course-universe.sqlite;
- unsupported index schema;
- missing release decision;
- rejected release decision;
- unsupported release decision status;
- unreadable metadata file;
- SQLite metadata mismatch;
- state database migration failure;
- missing token key in production mode;
- invalid bind address.
Runtime error examples:
- invalid JSON;
- missing authorization;
- invalid state token;
- request too large;
- graph hard bound exceeded;
- state version conflict;
- catalog mismatch;
- catalog unavailable.
Catalog mismatch and catalog unavailable are not always HTTP errors.
For query endpoints, they may be academic unknowns or warnings inside a successful response.
23. Security Checklist
Implementation must:
- generate tokens with cryptographically secure randomness;
- store only token verifiers;
- compare verifiers safely;
- reject tokens in URLs;
- redact authorization headers;
- use Cache-Control: no-store for state responses;
- rate limit token failures;
- avoid wildcard CORS for state endpoints;
- avoid logging request bodies by default;
- use HTTPS in non-local deployment;
- keep published index read-only.
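The first three checklist items can be illustrated with the Go standard library. This sketch is an assumption-laden example, not the scheme fixed by the token policy ADR: it assumes 32 random bytes per token, base64url encoding, and a verifier computed as HMAC-SHA-256 over the token with the key material from UWSCRAPE_TOKEN_KEY_PATH. The function names are hypothetical.

```go
package main

import (
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"encoding/base64"
	"fmt"
)

// NewToken returns an unpredictable token from the operating system's
// cryptographically secure random source.
func NewToken() (string, error) {
	b := make([]byte, 32)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return base64.RawURLEncoding.EncodeToString(b), nil
}

// Verifier derives the value that is stored instead of the raw token.
// HMAC-SHA-256 is an assumed construction for this sketch.
func Verifier(key []byte, token string) []byte {
	mac := hmac.New(sha256.New, key)
	mac.Write([]byte(token))
	return mac.Sum(nil)
}

// VerifyToken recomputes the verifier and compares it in constant time.
func VerifyToken(key []byte, token string, stored []byte) bool {
	return hmac.Equal(Verifier(key, token), stored)
}

func main() {
	key := []byte("demo key; load real key material from the key file")
	token, err := NewToken()
	if err != nil {
		fmt.Println("token generation failed:", err)
		return
	}
	stored := Verifier(key, token)
	fmt.Println("valid token accepted:", VerifyToken(key, token, stored))
	fmt.Println("altered token accepted:", VerifyToken(key, token+"x", stored))
}
```

Only `stored` would be written to the state database; the raw token is returned to the client once and never logged.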
24. Test Scenarios
Runtime operation tests should cover:
- startup succeeds with approved index;
- startup succeeds with approved-with-warnings index and exposes warnings;
- startup fails with missing index file;
- startup fails with unsupported index schema;
- startup fails with missing release decision;
- startup fails with rejected release decision;
- index opens read-only;
- state database migrates from empty file;
- future state database version fails startup;
- missing token key fails production startup;
- health endpoint redacts paths and secrets;
- index endpoint reports loaded index;
- state response has Cache-Control: no-store;
- catalog response can be cacheable;
- request body limit returns 413;
- route timeout produces expected error or unknown;
- logs redact authorization headers;
- static serving does not swallow /api/ errors;
- check-config exits without serving HTTP.
25. Acceptance Criteria
An implementation satisfies this spec when:
- startup fail-closed behavior is implemented;
- single active index policy is enforced;
- state store and index store are separate files;
- state database migrations are versioned;
- token key material is required for state-token deployments;
- request limits and timeouts are configured;
- logs and metrics redact tokens;
- health and index endpoints expose useful non-secret state;
- static serving is optional and cannot override API routes.
26. References
- Go net/http: https://pkg.go.dev/net/http
- Go database/sql: https://pkg.go.dev/database/sql
- SQLite URI filenames: https://www.sqlite.org/uri.html
- RFC 9110, HTTP Semantics: https://www.rfc-editor.org/rfc/rfc9110
- RFC 9111, HTTP Caching: https://www.rfc-editor.org/rfc/rfc9111
- OWASP Session Management Cheat Sheet: https://cheatsheetseries.owasp.org/cheatsheets/Session_Management_Cheat_Sheet.html