0005 — Observability model ​
Status: Accepted (tooling TBD; stack fork resolved by ADR 0024 — API compute = Azure Container Apps, DB = Postgres on Azure DB for PostgreSQL Flexible Server)
Date: 2026-06-17
ADO work item: AB#3073
Deciders: Kristopher Turner (platform owner)
Context
The platform needs monitoring, alerting, and logging across three surfaces: infrastructure, backend API, and web/mobile clients. The community also requires an application-level audit log (who logged in, from where, how long) for security compliance and for the minister's oversight of a closed community.
The four-layer observability model is locked; the specific tooling for layers 2–3 is coupled to the compute/DB stack fork (ADR 0004) and is decided once that fork closes.
Decision
We will implement a four-layer observability model. Layer 1 (audit log) is implemented regardless of stack choice. Layers 2–3 tooling follows the stack: Azure Application Insights + Azure Monitor for Azure-native; OpenTelemetry + Grafana for non-Azure. Layer 4 is the Clerk dashboard (ADR 0003).
Four layers:
1. App audit/activity log: `AuditLog` table in the platform database. Tracks every security-relevant event: login, logout, session duration, approval actions, content publish. This layer is implemented in Phase 2 regardless of stack.
2. App telemetry: SDK instrumentation in `apps/web`, `apps/api`, and `apps/mobile`. Captures errors, performance traces, and custom events.
   - Azure-native stack: Azure Application Insights (~5 GB/month free).
   - Non-Azure stack: OpenTelemetry + Grafana Cloud free tier + Sentry for error tracking.
3. Infrastructure monitoring + alerts: resource health, availability, cost alerts.
   - Azure-native stack: Azure Monitor + Log Analytics + action groups (~5 GB/month free).
   - Non-Azure stack: provider-native dashboards + OpenTelemetry collector.
4. Auth-provider logs: Clerk dashboard (session events, sign-in attempts, suspicious activity). See ADR 0003.
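As an illustration of Layer 1, the following is a minimal sketch of what an `AuditLog` row and a session-duration query might look like. Only the table name and the tracked event kinds come from this ADR; the column names, the `AuditEventType` values, and the `InMemoryAuditLog` class are hypothetical, and the in-memory array stands in for the platform database so the sketch runs on its own.

```typescript
// Hypothetical event names, derived from the events listed for Layer 1.
type AuditEventType = "login" | "logout" | "approval_action" | "content_publish";

// Hypothetical row shape for the AuditLog table; column names are assumptions.
interface AuditLogEntry {
  id: number;
  userId: string;
  eventType: AuditEventType;
  occurredAt: Date;
  ipAddress: string | null; // "from where"
  sessionId: string | null; // links login and logout events
  details: Record<string, string>;
}

// In-memory stand-in for the database table (assumption, for illustration only).
class InMemoryAuditLog {
  private entries: AuditLogEntry[] = [];
  private nextId = 1;

  // Append one event and return the stored row with its generated id.
  record(entry: Omit<AuditLogEntry, "id">): AuditLogEntry {
    const row: AuditLogEntry = { id: this.nextId++, ...entry };
    this.entries.push(row);
    return row;
  }

  // Session duration in seconds: logout time minus login time for one session.
  // Returns null if either event is missing (for example, session still open).
  sessionDurationSeconds(sessionId: string): number | null {
    const login = this.entries.find(
      (e) => e.sessionId === sessionId && e.eventType === "login",
    );
    const logout = this.entries.find(
      (e) => e.sessionId === sessionId && e.eventType === "logout",
    );
    if (!login || !logout) return null;
    return (logout.occurredAt.getTime() - login.occurredAt.getTime()) / 1000;
  }
}

// Usage: a login followed by a logout 30 minutes later.
const auditLog = new InMemoryAuditLog();
auditLog.record({
  userId: "user_123",
  eventType: "login",
  occurredAt: new Date("2026-06-17T10:00:00Z"),
  ipAddress: "203.0.113.7",
  sessionId: "sess_1",
  details: {},
});
auditLog.record({
  userId: "user_123",
  eventType: "logout",
  occurredAt: new Date("2026-06-17T10:30:00Z"),
  ipAddress: "203.0.113.7",
  sessionId: "sess_1",
  details: {},
});
const durationSeconds = auditLog.sessionDurationSeconds("sess_1");
const openSessionDuration = auditLog.sessionDurationSeconds("sess_unknown");
```

Deriving session duration from paired login and logout rows, rather than storing it as a column, is one possible design; it keeps the table append-only, at the cost of a lookup per session.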
Alternatives considered
| Option | Pros | Cons | Why not chosen |
|---|---|---|---|
| Azure App Insights + Monitor (chosen for Azure path) | Native Azure integration; free tier; no extra setup | Coupled to Azure stack; not portable | Chosen if ADR 0004 → Azure-native |
| OpenTelemetry + Grafana (chosen for non-Azure path) | Stack-agnostic; OSS; vendor-neutral | Ops overhead; Grafana Cloud free tier limits | Chosen if ADR 0004 → Supabase/non-Azure |
| Sentry alone | Strong error tracking + session replay | Not a full observability solution; cost at scale | Used as a complement, not replacement |
| Datadog | Best-in-class unified observability | Expensive; not justified at ~200 members | Revisit if community grows significantly |
| No structured observability | Nothing to build | Unacceptable — audit log is a security and compliance requirement for a closed community with minors | Hard requirement |
Consequences
Positive
- The `AuditLog` table (Layer 1) is independent of tooling choice, so it ships in Phase 2 regardless.
- The four-layer model is budget-friendly: free-tier telemetry and monitoring are viable at 200 members.
- Minister/admin can audit who accessed the platform and when — required for a closed community with parent-managed child accounts.
Negative / trade-offs
- Layers 2–3 tooling cannot be finalized until ADR 0004 (stack fork) closes. Do not instrument before that.
- If the stack changes post-Phase 1, telemetry SDK calls must be updated throughout `apps/api`, `apps/web`, and `apps/mobile`. Mitigation: wrap the telemetry calls behind a thin `packages/shared-utils/telemetry` facade so the SDK is swappable.
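The facade mitigation could be sketched roughly as follows. This is an assumption-laden illustration, not the contents of `packages/shared-utils/telemetry`: the `TelemetryBackend` interface, `setTelemetryBackend`, and both backends are hypothetical names. A real adapter for Application Insights, OpenTelemetry, or Sentry would implement the same interface and be registered once at app startup.

```typescript
// Hypothetical facade surface; method names are assumptions, not an existing API.
interface TelemetryBackend {
  trackEvent(name: string, properties?: Record<string, string>): void;
  trackError(error: Error, properties?: Record<string, string>): void;
}

// No-op backend: a safe default before any SDK is chosen or configured.
const noopBackend: TelemetryBackend = {
  trackEvent: () => {},
  trackError: () => {},
};

// Dev/test backend that records calls in memory instead of sending them anywhere.
class InMemoryBackend implements TelemetryBackend {
  readonly events: { name: string; properties: Record<string, string> }[] = [];
  readonly errors: Error[] = [];

  trackEvent(name: string, properties: Record<string, string> = {}): void {
    this.events.push({ name, properties });
  }

  trackError(error: Error): void {
    this.errors.push(error);
  }
}

let activeBackend: TelemetryBackend = noopBackend;

// Swap the SDK adapter in one place; call sites never import an SDK directly.
function setTelemetryBackend(backend: TelemetryBackend): void {
  activeBackend = backend;
}

// The object app code would call. It forwards to whichever backend is active.
const telemetry: TelemetryBackend = {
  trackEvent: (name, properties) => activeBackend.trackEvent(name, properties),
  trackError: (error, properties) => activeBackend.trackError(error, properties),
};

// Usage: calls made before a backend is set are silently dropped by the no-op.
telemetry.trackEvent("dropped_before_setup");

const recorder = new InMemoryBackend();
setTelemetryBackend(recorder);
telemetry.trackEvent("content_published", { surface: "web" });
telemetry.trackError(new Error("upload failed"));
```

With this shape, changing stacks means writing one new adapter and changing the `setTelemetryBackend` call, while call sites in the three apps stay untouched.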
Risks
- Free-tier data retention: the Application Insights free tier retains data for 90 days; the Log Analytics free tier retains data for 7 days. Document these limits and set up export or archival if compliance requires longer retention.
- Mobile telemetry: Application Insights does not have an official React Native SDK; Sentry or OpenTelemetry React Native is required for `apps/mobile` regardless of stack choice.
References
- Platform strategy — observability section
- ADR 0004 — Cloud/hosting stack — stack fork determines tooling (coupled)
- ADR 0003 — Authentication: Clerk — Layer 4 auth-provider logs