
Operations (runbooks and operational docs)

Operational documentation for tasks performed during an incident: runbooks, on-call procedures, deploy and rollback steps, backup/restore (PITR) and DR procedures, and health-check references.

| Document | Description |
| --- | --- |
| backup-dr-runbook.md | Backup strategy, recovery procedures, and DR scenarios for the production environment. Covers PostgreSQL PITR and pg_dump restore, Azure Blob Storage redundancy, Key Vault soft delete, Container App revision rollback, and portability to non-Azure hosts. |
| observability-runbook.md | Four-layer observability model (ADR 0005): structured log format, required fields, what to log and what not to log, Application Insights metrics, Azure Monitor alert rules, KQL query examples, and on-call escalation path. |
| cost-governance.md | Cost model per service (Container Apps, PostgreSQL Flexible Server, Cloudflare R2, Twilio, SendGrid, Clerk), cost-monitoring configuration, Microsoft Nonprofits credit plan, and the locked cost-optimisation decisions and their ADR references. |

Additional runbooks (deploy, on-call) will be written as infrastructure lands in Phase 1 (Bicep IaC and the observability baseline; see ADR 0004 and ADR 0005).

Heritage Community Hub — Internal. Access restricted via Cloudflare Access + Entra ID.