0013 — Notification Transport (channels)
Status: Accepted (2026-06-18 — SMS provider decided: Twilio, provider-neutral behind the SmsProvider adapter; aligned to the portability posture in ADR 0024)
Date: 2026-06-17
ADO work item: AB#3138
Deciders: Kristopher Turner (platform owner)
Context
The platform needs a single notification transport that every feature uses to reach members across all active channels. Without a shared transport, each feature would hard-code its own provider calls, making provider changes expensive and creating inconsistent delivery behavior across the platform.
This ADR covers transport only — the channels by which approved content and system events are delivered to members. It does not cover:
- Content authoring or approval — those are defined in ADR 0012 (Announcements) and ADR 0023 (Communications authoring & approval workflow).
- What constitutes a valid notification subject — the single platform content type for broadcast communications is the Announcement (ADR 0012); system notifications (RSVP confirmations, approval outcomes) are generated by the API layer and passed to the transport in the same way.
The transport's job is to fan out a resolved payload across all of a member's active channels.
Several constraints govern this decision:
- Member phone is required for adults (ADR 0007). Every adult account carries a verified phone number. This is the foundation for SMS delivery — the data prerequisite is already satisfied by the account model, removing a common blocker for SMS adoption.
- Cost (governing). The platform targets free or near-zero recurring spend. Email can start free; push is free via the Expo tier; SMS is always paid (per-message). The governing cost constraint and the fact that SMS is now in-scope creates direct tension that must be managed through volume budgeting and an explicit owner decision on provider.
- No user-to-user messaging. ADR 0007 and the closed-community design rule out two-way chat and direct messages between members. This collapses the moderation problem: without inbound member-originated messages, there is nothing for the transport to moderate.
- API-first / headless (ADR 0008). The transport is a platform shared service callable by any feature through the API; features do not contact channel providers directly.
- Expo push stack (ADR 0002). Mobile push is handled by Expo's push notification service routing to APNs (iOS) and FCM (Android). This reuses the already-decided RN+Expo stack and avoids a separate push infrastructure.
- Security audit S4 (now closed). The security audit identified provider selection for email and SMS as an open item. This revision brings SMS in-scope and selects both providers — SendGrid (email) and Twilio (SMS), each behind a provider-neutral adapter. S4 is closed; cost is accepted as a modest ministry operating expense (see SMS cost estimate below).
- Secrets in Key Vault (ADR 0004). All provider API keys and credentials are stored in Key Vault and injected at runtime into the containerized API (ADR 0024) via the `SecretsProvider` adapter.
- Observability (ADR 0005). Every delivery attempt, success or failure, is logged to Application Insights through the standard platform observability model.
Decision
We will build a single channel-agnostic notification transport as a platform shared service (ADR 0008, responsibility 4) that exposes one `POST /notifications/send` abstraction. Channel adapters (SMS, email, in-app, mobile push) sit behind it, so features call the transport, not a provider. Every approved Announcement and every system notification (RSVP confirmation, approval outcome) fans out across all of a member's active channels simultaneously: SMS text, app push (if the app is installed and push is enabled), email, and the always-on in-app record. Per-member channel preferences may suppress optional channels, including non-urgent SMS, but cannot suppress urgent SMS for adult members whose phone is on file.
Providers: push via Expo → APNs/FCM (free); email via SendGrid free tier; SMS via Twilio. Email and SMS sit behind provider-neutral adapters (`EmailProvider`, `SmsProvider`), consistent with the portability posture in ADR 0024. No Azure-locked communications provider (e.g. Azure Communication Services) is used, so the platform's comms layer moves with it to any host.
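The provider-neutral adapter idea can be sketched as follows. This is a minimal illustration, not the platform's actual code: the ADR names `SmsProvider` but does not define its methods, so the `send` signature, the `SmsMessage` and `SmsResult` shapes, the stub classes, and `createSmsProvider` are all assumptions made for the example.

```typescript
// Hypothetical shapes: the ADR names SmsProvider but not its methods.
interface SmsMessage {
  to: string; // destination phone number
  body: string;
}

interface SmsResult {
  ok: boolean;
  providerMessageId?: string;
  errorCode?: string;
}

interface SmsProvider {
  readonly name: string;
  send(message: SmsMessage): Promise<SmsResult>;
}

// Stand-in adapters. Real ones would call the vendor's HTTP API with
// credentials read from Key Vault; no network call is made here.
class TwilioSmsProvider implements SmsProvider {
  readonly name = "twilio";
  async send(message: SmsMessage): Promise<SmsResult> {
    return { ok: true, providerMessageId: `twilio-stub-${message.to}` };
  }
}

class TelnyxSmsProvider implements SmsProvider {
  readonly name = "telnyx";
  async send(message: SmsMessage): Promise<SmsResult> {
    return { ok: true, providerMessageId: `telnyx-stub-${message.to}` };
  }
}

type SmsProviderName = "twilio" | "telnyx";

// The only place that knows which vendor is live; callers see SmsProvider.
const smsProviderFactories: Record<SmsProviderName, () => SmsProvider> = {
  twilio: () => new TwilioSmsProvider(),
  telnyx: () => new TelnyxSmsProvider(),
};

function createSmsProvider(name: SmsProviderName): SmsProvider {
  return smsProviderFactories[name]();
}
```

Under these assumptions, swapping vendors means changing the configured provider name; code that calls `send` is untouched.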
Transport architecture
```text
Feature call (Announcements approved, RSVP confirmed, approval outcome …)
        │
        ▼
POST /notifications/send        ← single API endpoint in apps/api
        │
        ├─► SMS adapter    → Twilio (decided; provider-neutral; paid per-message)
        ├─► Push adapter   → Expo Push API → APNs / FCM (free)
        ├─► Email adapter  → SendGrid free tier (free at low volume)
        └─► In-app adapter → INSERT into Notifications table (Postgres — ADR 0024)
```
Fan-out is parallel. A failure on one channel does not block delivery on others. Each adapter logs a delivery attempt (success or error code) to Application Insights per ADR 0005.
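A minimal sketch of the fan-out-and-continue behavior, assuming each adapter exposes a promise-returning `send` that rejects on failure. The `ChannelAdapter` and `DeliveryAttempt` shapes and the `fanOut` function are hypothetical names for this example; the returned attempts stand in for the records that would be logged to Application Insights.

```typescript
type Channel = "sms" | "push" | "email" | "in-app";

// Hypothetical adapter shape: resolves on delivery, rejects on failure.
interface ChannelAdapter {
  channel: Channel;
  send(memberId: string, body: string): Promise<void>;
}

interface DeliveryAttempt {
  channel: Channel;
  ok: boolean;
  errorCode?: string;
}

// Starts every adapter at once and converts each outcome into a record,
// so a rejection on one channel never prevents the others from completing.
async function fanOut(
  adapters: ChannelAdapter[],
  memberId: string,
  body: string,
): Promise<DeliveryAttempt[]> {
  const attempts = adapters.map(
    async (adapter): Promise<DeliveryAttempt> => {
      try {
        await adapter.send(memberId, body);
        return { channel: adapter.channel, ok: true };
      } catch (err) {
        const errorCode = err instanceof Error ? err.message : String(err);
        return { channel: adapter.channel, ok: false, errorCode };
      }
    },
  );
  // Each attempt catches its own error, so Promise.all cannot reject here.
  return Promise.all(attempts);
}
```

Because every attempt is wrapped individually, the caller always receives one record per channel, which is the unit that would be logged per ADR 0005.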
Multi-channel fan-out
When the transport receives a send request, it resolves the recipient list from the audience scope, then dispatches to every active channel concurrently for each member:
| Channel | Trigger condition | Notes |
|---|---|---|
| SMS | Adult member; verified phone on file (ADR 0007) | Sent for every notification unless member opts out of non-urgent SMS (see preferences below) |
| Push | App installed; member has not disabled push at the OS level | Expo token on file; falls back gracefully if token is stale |
| Email | Member has email on file (all adults — ADR 0007) | SendGrid free tier; sufficient at congregation scale; provider-neutral adapter |
| In-app | Always | Written to Notifications table; visible when member opens the app or web shell |
Delivery and fallback behavior
- The transport dispatches all channels concurrently for each recipient.
- A push token rejection (expired or invalid) causes the push adapter to remove the stale token and log the event; the other channels continue.
- An email hard bounce is logged and the member's email delivery flag is suspended until an admin clears it; other channels continue.
- An SMS delivery failure (carrier rejection, invalid number) is logged; the member's phone number is flagged for admin review; other channels continue.
- In-app delivery is the always-available baseline — it cannot be suppressed by a preference or a provider failure.
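The per-channel follow-up actions listed above can be sketched as a pure state update. The `MemberDeliveryState` fields, the `DeliveryFailure` union, and `applyFailure` are hypothetical names chosen for this example; the ADR does not specify how these flags are stored.

```typescript
// Hypothetical per-member delivery state; field names are illustrative.
interface MemberDeliveryState {
  pushTokens: string[];
  emailSuspended: boolean;
  phoneFlaggedForReview: boolean;
}

type DeliveryFailure =
  | { kind: "push_token_rejected"; token: string }
  | { kind: "email_hard_bounce" }
  | { kind: "sms_delivery_failed" };

// Returns the updated state for one failure. Other channels are not
// touched, matching the "other channels continue" rule.
function applyFailure(
  state: MemberDeliveryState,
  failure: DeliveryFailure,
): MemberDeliveryState {
  if (failure.kind === "push_token_rejected") {
    // Remove only the stale token; keep any other registered devices.
    const remaining = state.pushTokens.filter((t) => t !== failure.token);
    return { ...state, pushTokens: remaining };
  }
  if (failure.kind === "email_hard_bounce") {
    // Suspended until an admin clears the flag.
    return { ...state, emailSuspended: true };
  }
  // SMS failure: flag the phone number for admin review.
  return { ...state, phoneFlaggedForReview: true };
}
```

In-app delivery has no entry in the failure union because it is the baseline that neither a preference nor a provider failure can suppress.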
Per-member channel preferences
Members may configure preferences through their profile:
| Preference | Effect |
|---|---|
| Opt out of non-urgent SMS | SMS suppressed for priority = 0 (normal) notifications; always sent for priority = 2 (urgent) |
| Push notifications off | Push suppressed (OS-level disables are respected automatically via Expo) |
| Email digest (future) | Not in scope for initial build; noted for Phase 2 consideration |
Children (parent-managed accounts — ADR 0007) have no email and no phone; they receive in-app notifications only through the parent-approved session.
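The trigger conditions and preferences above can be combined into one per-recipient check. This is a sketch under stated assumptions: the `Member` fields and `resolveChannels` are hypothetical names, and because the source defines behavior only for priority 0 and priority 2, the sketch suppresses SMS only at priority 0 and sends at any other value.

```typescript
type Channel = "sms" | "push" | "email" | "in-app";

// Hypothetical member view; field names are illustrative only.
interface Member {
  isAdult: boolean;
  hasVerifiedPhone: boolean;
  hasEmail: boolean;
  hasPushToken: boolean;
  optOutNonUrgentSms: boolean;
  pushOff: boolean;
}

const PRIORITY_NORMAL = 0;

function resolveChannels(member: Member, priority: number): Channel[] {
  // In-app is the baseline and is never suppressed.
  const channels: Channel[] = ["in-app"];

  // Children have no phone or email and receive in-app only.
  if (!member.isAdult) return channels;

  // Assumption: the opt-out applies only to priority 0 (normal).
  const smsSuppressed =
    member.optOutNonUrgentSms && priority === PRIORITY_NORMAL;
  if (member.hasVerifiedPhone && !smsSuppressed) channels.push("sms");

  if (member.hasPushToken && !member.pushOff) channels.push("push");
  if (member.hasEmail) channels.push("email");
  return channels;
}
```

For example, an adult who opted out of non-urgent SMS still gets SMS when the call passes priority 2, while a child account always resolves to in-app only.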
SMS provider — DECIDED: Twilio (provider-neutral, behind the SmsProvider adapter)
Decision (owner-confirmed 2026-06-18): SMS is delivered through Twilio, accessed only through the platform's SmsProvider adapter. The earlier "ACS or Twilio" framing is resolved against ACS: Azure Communication Services SMS is an Azure-locked resource, which would re-introduce exactly the lock-in that ADR 0024 (portable-by-design) sets out to avoid. SMS is the one channel that must keep working unchanged if the platform relocates to AWS, GCP, or on-prem — so the SMS provider must be cloud-neutral. Twilio runs identically on any host.
Why Twilio over the cheaper alternatives. Because the provider sits behind an adapter, the choice is reversible by config, so it was made on reliability + ecosystem rather than squeezing the last cent:
| Provider | US per-segment | Number/mo | Portable? | Notes |
|---|---|---|---|---|
| Twilio (chosen) | ~$0.0083 | ~$1.15 | Yes (cloud-neutral) | Best-in-class docs/SDK/reliability; owns SendGrid → SMS + email under one portable vendor |
| Telnyx | ~$0.0040 | ~$1.00 | Yes | Cheapest carrier-grade; documented as the cost-down drop-in if SMS volume grows |
| Plivo | ~$0.0050–0.0077 | ~$0.80 | Yes | Closest Twilio-compatible API; viable alternative |
| Azure Communication Services (ACS) | ~$0.0075 | ~$2.00 | No — Azure-locked | Rejected: contradicts ADR 0024 portability |
All US A2P providers require the same A2P 10DLC registration via The Campaign Registry (brand + campaign) before sending — this is a carrier requirement, not provider-specific, so it is not a differentiator. Carrier surcharges apply on top of every message regardless of provider.
Cost estimate at congregation scale. Assume ~150 adult members, ~3 notifications/week:
- 150 × 3 × 4.3 weeks = ~1,935 SMS/month
- Twilio: ~1,935 × $0.0083 + ~$1.15 number ≈ ~$17/month (before carrier pass-through fees)
- Telnyx (cost-down option): ~1,935 × $0.0040 + ~$1.00 ≈ ~$9/month
The absolute spread is ~$8/month — immaterial at this scale — so Twilio's reliability and the single-vendor SMS+email consolidation (SendGrid) win for the initial build. If SMS volume grows materially, swapping the SmsProvider adapter to Telnyx is a config change, not a rewrite.
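The arithmetic behind these estimates can be reproduced with a short calculation. The function and type names are illustrative, and the sketch assumes one billable segment per notification and excludes carrier pass-through fees, as the figures above do.

```typescript
interface SmsPricing {
  perSegmentUsd: number;
  numberPerMonthUsd: number;
}

// Messages per month = adults x notifications per week x weeks per month.
function monthlySmsCount(
  adults: number,
  notificationsPerWeek: number,
  weeksPerMonth: number = 4.3,
): number {
  return Math.round(adults * notificationsPerWeek * weeksPerMonth);
}

// Assumes one segment per message; carrier surcharges are not included.
function monthlyCostUsd(messages: number, pricing: SmsPricing): number {
  return messages * pricing.perSegmentUsd + pricing.numberPerMonthUsd;
}

const messagesPerMonth = monthlySmsCount(150, 3);
const twilioUsd = monthlyCostUsd(messagesPerMonth, {
  perSegmentUsd: 0.0083,
  numberPerMonthUsd: 1.15,
});
const telnyxUsd = monthlyCostUsd(messagesPerMonth, {
  perSegmentUsd: 0.004,
  numberPerMonthUsd: 1.0,
});

console.log(messagesPerMonth); // 1935
console.log(twilioUsd.toFixed(2)); // "17.21"
console.log(telnyxUsd.toFixed(2)); // "8.74"
console.log((twilioUsd - telnyxUsd).toFixed(2)); // "8.47"
```

Messages longer than one segment would multiply the per-message cost, so the estimate is a floor for long announcement texts.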
Cost-control levers (retained): members may opt out of non-urgent SMS (priority 0); urgent messages (priority 2 — weather/safety) always send. This caps spend while preserving the emergency-reach use case that justifies SMS at all.
Activation gate. Live Twilio credentials are added to Key Vault only under an ADO task linked to this ADR, after A2P 10DLC registration completes. This closes security audit item S4 (provider selected; cost accepted as a modest ministry operating expense).
Boundary vs. Announcements (ADR 0012) and authoring workflow (ADR 0023)
| Concern | Owner ADR |
|---|---|
| Announcement content, metadata, priority, audience | ADR 0012 — Announcements |
| Authoring, approval state machine, `comms_author` role | ADR 0023 — Communications authoring & approval |
| Channel adapters (SMS, email, in-app, push) | This ADR — Transport |
| Member channel preferences | This ADR — Transport |
| Delivery fan-out, fallback, retry | This ADR — Transport |
| Receipt recording (`AnnouncementReceipts`) | ADR 0012 — Announcements |
| Approver notification when submission enters queue | This ADR — Transport (system notification) |
Announcements authors a broadcast and calls POST /notifications/send on approval. System events (RSVP confirmation, approval outcome, queue submission alert to approvers) are generated by the relevant API handler and passed to the same endpoint. The transport does not know about editorial state; it executes delivery.
Provider decisions summary
| Channel | Provider | Cost | Status |
|---|---|---|---|
| SMS | Twilio (behind SmsProvider adapter; Telnyx as cost-down swap) | ~$17/month at congregation scale (paid, per-message) | Decided — activate after A2P 10DLC registration |
| Push | Expo Push → APNs/FCM | Free (Expo free tier) | Active — initial build |
| Email | SendGrid free tier (behind EmailProvider adapter) | ~$0 at low volume | Active — initial build; provider-neutral (no Azure lock-in) |
| In-app | Postgres Notifications table (ADR 0024) | Covered by database budget | Active — initial build |
Alternatives considered
| Option | Pros | Cons | Why not chosen |
|---|---|---|---|
| Multi-channel transport: SMS + push + email + in-app, fan out all channels (chosen) | Maximum reach; every member receives on every active channel; no member is missed because they don't open the app | SMS has a recurring cost (~$15–20/month); increases transport implementation surface | — chosen |
| SMS deferred; email + push + in-app only | Eliminates SMS cost; simpler initial adapter set | Members without push enabled and who don't check the app miss time-sensitive notifications; phone is already collected (ADR 0007) so not using it is a wasted asset | Rejected — phone is a required field (ADR 0007); the cost is modest; urgency use case (weather cancellation) justifies the channel |
| All-in-one paid comms SaaS (e.g. Twilio Notify, Courier, Knock) | Unified dashboard; rich routing; no per-channel adapter to build | Monthly platform fee from day one; exceeds cost constraint; provider lock-in for all channels | Rejected — cost; the transport abstraction provides equivalent routing without a platform fee |
| Hard-code one email provider per feature (no shared transport) | Simpler per-feature initially | Provider scattered across features; change requires touching every feature; inconsistent delivery and retry behavior | Rejected — breaks the API-first / shared-service model in ADR 0008 |
| Push + in-app only (no email, no SMS) | Zero provider cost; no external accounts | Misses members who have push disabled; no reach to members without a device installed; no urgency channel for weather / safety notifications | Rejected — insufficient reach guarantee for a congregation that includes less tech-engaged members |
| Two-way chat / direct messages between members | Members could converse in-app | Re-introduces moderation at scale; contradicts the closed-community design; significant infrastructure cost | Rejected — out of scope; the closed-community and no-user-to-user-messaging constraints are locked |
Consequences
Positive
- Any feature that calls `POST /notifications/send` automatically gains all active channels with no per-feature provider integration work.
- Multi-channel fan-out maximizes the probability that every member receives time-sensitive content (weather cancellations, schedule changes) regardless of whether they have the app installed or push enabled.
- Provider migration (e.g. replacing SendGrid with another portable email provider) is isolated to the email adapter; no feature endpoints change.
- The one-way model eliminates all inbound-message moderation requirements from the transport layer; moderation effort stays in the approval workflow (ADR 0023) where it already lives.
- Expo push reuses the existing RN+Expo investment (ADR 0002) — no separate push infrastructure or additional vendor account beyond what mobile already requires.
- Secrets are centralized in Key Vault (ADR 0004); rotating a provider key does not require a deployment.
- Delivery attempts are logged to Application Insights (ADR 0005); bounce rates, push failures, and SMS error codes are observable without bespoke logging code in the adapters.
Negative / trade-offs
- SMS cost. At congregation scale, Twilio SMS adds roughly $17/month (before carrier surcharges), a recurring line item accepted by the owner (2026-06-18) as a modest ministry operating cost and justified by guaranteed urgent reach. The `SmsProvider` adapter keeps Telnyx (~$9/month) available as a config-only swap if volume makes the delta material.
- Per-member preference complexity. Honoring opt-out preferences (e.g. suppress non-urgent SMS) adds per-recipient evaluation logic in the fan-out path. At congregation scale this is not a performance concern, but it is code surface that must be tested.
- Adapter failure surface increases. With four active channels, there are four classes of provider failure to monitor and handle. The fan-out-and-continue model limits blast radius, but each adapter needs its own error handling and retry logic.
- In-app notification delivery requires the member to open the app or web shell. There is no guarantee of timely delivery for time-sensitive messages if the member has push notifications disabled and SMS is not yet active.
Risks
- SMS cost runaway. A misconfigured fan-out or a compromised author flooding the queue could drive per-message spend up. Mitigation: the non-urgent SMS opt-out reduces normal volume; set a Twilio spend alert / monthly cap; the approval workflow (ADR 0023) gates what reaches fan-out. If volume grows, swap the `SmsProvider` adapter to Telnyx (~half the per-segment cost).
- SendGrid account suspension. Free-tier accounts are subject to usage policies; a spam complaint or policy violation can suspend delivery. Mitigation: the `EmailProvider` adapter means a fallback provider can be wired in hours; monitor bounce and complaint rates via SendGrid webhooks → Application Insights (ADR 0005).
- Expo push service outage. Expo's push service is a third-party dependency. Mitigation: in-app notifications remain available; the push adapter can be extended to call APNs/FCM directly as a fallback if Expo SLAs prove insufficient.
- Scope creep toward two-way messaging. Stakeholder requests for replies or threads will arise. Mitigation: this ADR is the explicit record of the one-way transport decision; any two-way capability requires a new ADR and explicit cost/moderation analysis.
- Stale push tokens causing phantom failures. Push tokens expire when members reinstall the app. Mitigation: the push adapter removes stale tokens on first rejection; this is standard Expo practice and requires no custom retry logic.
Amendment — Web Push (VAPID) for PWA (2026-06-24)
ADO: Feature AB#4429, Story AB#4430, Tasks AB#4431–4435
Amendment context
ADR 0031 establishes the PWA (Progressive Web App) as the verified cross-platform delivery path: members can install heritageva.app on their iPhone, Android, or Windows desktop home screen and use it as a full app without an App Store download. The existing Push channel in the transport architecture routes only through Expo Push → APNs/FCM, which covers the React Native mobile app (ADR 0002). PWA users — running the React web app in a browser — are not Expo app users and do not have Expo push tokens. Without Web Push, they cannot receive notifications through the push channel even when the PWA is installed.
Web Push (VAPID) is the browser-native push standard: the browser registers a subscription endpoint with its vendor's push service, the API sends a push message to that endpoint using the web-push Node.js package, and the service worker shows a notification. It is free, requires no App Store, and works on Chrome, Edge, Firefox, and Safari (iOS 16.4+ with the PWA installed to the home screen).
This is foundational transport infrastructure, not a feature. The permission prompt shown to the member belongs to the first feature that uses it (Calendar or Announcements), not to the infrastructure itself, because a prompt shown at login with no context tends to be denied.
Amendment Decision
The push channel is extended with a Web Push (VAPID) sub-channel alongside the existing Expo Push → APNs/FCM sub-channel. Both sub-channels are dispatched concurrently within the existing push adapter. VAPID key material is stored in Key Vault (`kv-heritageva-prod-eus`). Per-member web push subscriptions are stored in a new `push_subscriptions` table in Postgres. The member notification permission prompt is deferred to the first feature that sends notifications. Infrastructure ships with the Platform foundation (Feature AB#4429).
Extended push architecture
```text
Push adapter (apps/api — existing)
│
├─► Expo Push API → APNs / FCM       ← existing (native mobile app, ADR 0002)
└─► Web Push (VAPID) sub-adapter     ← NEW (PWA users)
    │
    ├─► Chrome / Edge (FCM endpoint via browser)
    ├─► Firefox (Mozilla Push Service)
    └─► Safari iOS 16.4+ (PWA installed to home screen)
```
Infrastructure components
| Component | Location | Notes |
|---|---|---|
| VAPID key pair | `kv-heritageva-prod-eus` secrets `vapid-private-key` + `vapid-public-key` | One key pair for all members; generated once with `npx web-push generate-vapid-keys` |
| `push_subscriptions` table | Postgres migration | `id`, `member_id` (FK→members, cascade delete), `endpoint` (unique), `p256dh`, `auth`, `user_agent`, `created_at` |
| `POST /api/v1/notifications/subscribe/web` | apps/api | Authenticated; upsert on `endpoint` to handle browser subscription refresh |
| `DELETE /api/v1/notifications/subscribe/web` | apps/api | Authenticated; remove on logout or permission revoke |
| `web-push` npm package | apps/api/package.json | Sends VAPID-signed push messages to browser push endpoints |
| Push event listener | apps/web/public/sw.js | `self.addEventListener('push', ...)` + `notificationclick` to focus URL |
| `VAPID_PUBLIC_KEY` env var | Container App (web) | Used by `pushManager.subscribe()` in the React app |
| `VAPID_PRIVATE_KEY` env var | Container App (api only) | Never exposed to the browser |
Stale subscription cleanup
When a push endpoint returns 410 Gone (member cleared browser data or revoked permission at the OS level), the web push sub-adapter deletes the stale row from push_subscriptions. This mirrors the existing Expo stale-token cleanup behavior in the push adapter.
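A minimal sketch of the cleanup path, written against injected dependencies so it runs without a network. The `SendFn` and `SubscriptionStore` types and `sendWebPush` are hypothetical; in apps/api the sender would presumably wrap the web-push package's `sendNotification`, which rejects with an error that carries a `statusCode` property.

```typescript
interface WebPushSubscription {
  endpoint: string;
  keys: { p256dh: string; auth: string };
}

// Hypothetical sender; expected to reject with { statusCode } on failure.
type SendFn = (sub: WebPushSubscription, payload: string) => Promise<void>;

// Hypothetical store over the push_subscriptions table.
interface SubscriptionStore {
  deleteByEndpoint(endpoint: string): Promise<void>;
}

type WebPushOutcome = "sent" | "removed_stale" | "failed";

// Reads a numeric statusCode from an unknown error value, if present.
function statusCodeOf(err: unknown): number | undefined {
  if (typeof err === "object" && err !== null && "statusCode" in err) {
    const code = (err as { statusCode: unknown }).statusCode;
    return typeof code === "number" ? code : undefined;
  }
  return undefined;
}

async function sendWebPush(
  sub: WebPushSubscription,
  payload: string,
  send: SendFn,
  store: SubscriptionStore,
): Promise<WebPushOutcome> {
  try {
    await send(sub, payload);
    return "sent";
  } catch (err) {
    if (statusCodeOf(err) === 410) {
      // 410 Gone: the subscription no longer exists, so drop the row.
      await store.deleteByEndpoint(sub.endpoint);
      return "removed_stale";
    }
    // Any other error is reported but leaves the subscription in place.
    return "failed";
  }
}
```

Keeping the row on non-410 errors means a transient push-service failure does not silently unsubscribe a member.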
Known limitations
| Limitation | Detail |
|---|---|
| iOS requires 16.4+ | And the PWA must be installed to the home screen — in-browser Safari push is not supported |
| Permission is not auto-granted | The browser prompt is shown by the first feature that calls Notification.requestPermission() — not here |
| Children (ADR 0007) | No push — in-app only, same as today |
Spec
docs/internal/design/push-notifications.md
References
- Platform strategy — Communication / Messaging + Notifications service
- Platform strategy — Security audit S4 (provider selection)
- ADR 0002 — Mobile: React Native + Expo
- ADR 0004 — Cloud hosting stack / Key Vault
- ADR 0005 — Observability model
- ADR 0006 — Two-plane RBAC
- ADR 0007 — Account & Family-Group identity — adult phone required; email required; children receive in-app only; data prerequisites for SMS/email.
- ADR 0008 — Platform composition
- ADR 0024 — Cloud portability & provider abstraction — the portability posture that drives the provider-neutral `EmailProvider` / `SmsProvider` adapters and the rejection of Azure-locked ACS for communications.
- ADR 0012 — Announcements (one-way broadcast) — the primary caller of this transport; owns `AnnouncementReceipts`.
- ADR 0023 — Communications authoring & approval workflow — defines the state machine that produces the `approved` event that triggers fan-out.
- ADR 0031 — Code-first design workflow and cross-platform delivery — establishes PWA as the verified cross-platform path and the rationale for Web Push.
- docs/internal/design/push-notifications.md — implementation spec.
- ADO: Epic AB#3138 (Messaging & Notifications), Feature AB#3145 (Notifications service).
- ADO (Web Push infrastructure): Feature AB#4429, Story AB#4430, Tasks AB#4431–4435.