Dealix On-Call / Incident Contact
Effective: 2026-04-23
Review: after first paid deal (G5), then set up a real rotation if there is more than one engineer
Primary contact
- Name: Sami Assiri
- Email: sami.assiri11@gmail.com
- Role: Founder / sole maintainer
- Timezone: Asia/Riyadh (UTC+3)
- Availability: 24/7 during Primitive Launch phase (first 14 days)
Until there's a second engineer or a paid on-call service, Sami is the escalation path. This is a conscious trade-off — it's cheaper than a pager rotation until we have revenue.
Secondary (automated)
- Agent: Perplexity agent with GitHub connector (`api_credentials=["github"]`)
- Can: read repo state, open PRs, merge PRs with admin override, read GitHub-side CI logs
- Cannot: SSH to prod, read `.env` values, run k6/drills on the server
- Contact method: via existing Perplexity chat thread
Escalation decision tree
```
incident
├─ can I see it in /health/deep ?
│   ├─ yes + 5xx → page Sami immediately
│   └─ no → check /admin/dlq/stats, /admin/approvals/stats
├─ is revenue path affected (/checkout, /webhooks/moyasar) ?
│   ├─ yes → page Sami + open GitHub issue with label `P0`
│   └─ no → file GitHub issue with label `P1`, handle next business day
└─ is it a security incident (suspected compromise) ?
    └─ yes → follow RUNBOOK.md Scenario 6; rotate secrets; email Sami
```
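The tree above can be sketched as a small pure function, which may be handy for a future alerting script or for rehearsing drills. This is only an illustration: the `Triage` class, the `triage` function, and its parameter names are hypothetical, and it assumes the three questions are independent checks whose actions accumulate. The tree does not say what to do when an incident is visible in `/health/deep` without 5xx responses, so the sketch adds no action for that case.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Triage:
    """Outcome of walking the escalation tree (hypothetical helper type)."""
    page_sami: bool = False
    issue_label: Optional[str] = None
    actions: List[str] = field(default_factory=list)


def triage(visible_in_health_deep: bool, has_5xx: bool,
           revenue_path_affected: bool, security_incident: bool) -> Triage:
    result = Triage()

    # Question 1: can I see it in /health/deep?
    if visible_in_health_deep and has_5xx:
        result.page_sami = True
        result.actions.append("page Sami immediately")
    elif not visible_in_health_deep:
        result.actions.append("check /admin/dlq/stats and /admin/approvals/stats")
    # Visible but without 5xx: the tree is silent, so nothing is added here.

    # Question 2: is the revenue path (/checkout, /webhooks/moyasar) affected?
    if revenue_path_affected:
        result.page_sami = True
        result.issue_label = "P0"
        result.actions.append("open GitHub issue with label P0")
    else:
        result.issue_label = "P1"
        result.actions.append("file GitHub issue with label P1, handle next business day")

    # Question 3: is it a security incident (suspected compromise)?
    if security_incident:
        result.actions.extend([
            "follow RUNBOOK.md Scenario 6",
            "rotate secrets",
            "email Sami",
        ])

    return result
```

For example, `triage(True, True, True, False)` yields a result with `page_sami=True` and `issue_label="P0"`.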
Channels
| Channel | Use for | Target response time |
|---|---|---|
| Email (sami.assiri11@gmail.com) | Everything | <4h business hours |
| GitHub issue with `P0` label | Incidents with repo evidence | Same as email |
| Perplexity agent chat | Automated daily briefs + follow-ups | Next scheduled run |
To be added (blocked on credentials):
- WhatsApp Business number — blocker: need number from Sami
- UptimeRobot status page → automated alerts — blocker: UptimeRobot API key
- Slack webhook for DLQ alerts — blocker: Slack workspace + webhook URL
What closes I2
This document merged to docs/ON_CALL.md → closed ✅ (you're looking at it)
I3 (public status page) remains blocked on the UptimeRobot key.
Incident response checklist (first 15 minutes)
- Acknowledge — reply to the alert, even just "on it"
- Assess — one `/health/deep` curl + one `/admin/dlq/stats` curl
- Decide — fix forward OR roll back (see RUNBOOK Scenario 2)
- Communicate — if customer-visible, prepare a one-line status update
- Execute — run the appropriate RUNBOOK scenario
- Verify — re-check `/health/deep` and the specific endpoint that failed
- Document — file the incident in `docs/incidents/YYYY-MM-DD-<slug>.md`
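The Assess and Verify steps both come down to requesting a couple of endpoints and looking at the status codes. A minimal sketch of that check, using only the Python standard library, might look like the following. The function names and the `base_url` argument are hypothetical. It also sends no authentication, which is an assumption: the `/admin/...` endpoints may well require a token or header that this document does not describe.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict

# Endpoints named in the Assess step of the checklist.
ASSESS_PATHS = ("/health/deep", "/admin/dlq/stats")


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the status code of a GET request, including 4xx/5xx codes.

    Connection failures (DNS, refused, timeout) are not caught here and
    will raise, which is itself a useful signal during an incident.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code


def assess(base_url: str,
           fetch: Callable[[str], int] = http_status) -> Dict[str, int]:
    """Fetch each assess endpoint once and map its path to a status code."""
    base = base_url.rstrip("/")
    return {path: fetch(base + path) for path in ASSESS_PATHS}


def needs_attention(statuses: Dict[str, int]) -> bool:
    """True if any endpoint answered with a 5xx status."""
    return any(code >= 500 for code in statuses.values())
```

The `fetch` parameter exists so the logic can be exercised without touching production; in real use, `assess("https://<your-host>")` would perform the two requests.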
Post-mortem is required for any P0 incident within 48 hours.
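To make the Document step and the 48-hour rule concrete, here is a small sketch that derives the incident file path and the post-mortem deadline from a detection timestamp. The helper names are hypothetical, the slug rule (lowercase, non-alphanumerics collapsed to hyphens) is one reasonable convention rather than something this document specifies, and it assumes the 48 hours are counted from the moment the incident is detected.

```python
import re
from datetime import datetime, timedelta, timezone

# Asia/Riyadh is UTC+3 with no daylight saving, per the primary contact details.
RIYADH = timezone(timedelta(hours=3))


def incident_path(detected_at: datetime, title: str) -> str:
    """Build docs/incidents/YYYY-MM-DD-<slug>.md from a timestamp and a title."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"docs/incidents/{detected_at:%Y-%m-%d}-{slug}.md"


def postmortem_due(detected_at: datetime) -> datetime:
    """Deadline for the P0 post-mortem, assuming the clock starts at detection."""
    return detected_at + timedelta(hours=48)
```

With `detected_at = datetime(2026, 4, 23, 22, 30, tzinfo=RIYADH)` and the title `"Moyasar webhook 5xx!"`, `incident_path` returns `docs/incidents/2026-04-23-moyasar-webhook-5xx.md` and `postmortem_due` returns 22:30 on 2026-04-25 in the same timezone.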