audit: Dealix Truth Audit — brutally honest verification

VERDICT: GTM_DRY_RUN_READY

PROVEN:
- 28/28 imports pass
- 30/30 evals pass
- 5/5 dry-runs produce 17/17 fields
- 11/11 prohibited actions blocked
- Cost guard blocks at budget
- Proof pack in output
- Supervisor uses 9/9 systems

HONEST GAPS:
- No GTM API routes
- 7 frontend pages missing
- Empty pipelines/ dir (logic in supervisor)
- No standalone proof/governance modules
- Payment BLOCKED_BY_ENV
- LLM BLOCKED_BY_ENV

https://claude.ai/code/session_01W1rJthWDkasijTdXCfxVHs
This commit is contained in:
Claude 2026-04-26 21:46:49 +00:00
parent 4910e7d7b3
commit 25a5ba844d
No known key found for this signature in database

View File

@ -0,0 +1,81 @@
# Dealix Truth Audit (2026-04-26)
## 1. Executive Summary
Dealix GTM Intelligence OS is **real working code** — not skeleton. 28/28 imports pass, 5/5 dry-runs produce complete 17-field output, 11/11 prohibited actions blocked, cost guard stops at budget, proof packs present. However: no GTM API routes, 7 frontend pages missing, no standalone pipeline files, no dedicated test directories for cost/proof/quality/governance, and several systems are embedded in supervisor rather than standalone modules.
## 2. Strict Verdict: **GTM_DRY_RUN_READY**
## 3. Biggest Truth
**The core intelligence pipeline WORKS.** It is NOT skeleton — supervisor_agent.py imports and uses 9 systems (cache, tokens, cost guard, validator, trace, proof pack, compliance, approval, no-send). Every dry-run produces structured output with scores, channels, compliance, proof, and cost. BUT: the architecture uses a monolithic supervisor pattern, not separate pipeline files. And there are no GTM-specific API routes or command center frontend pages.
## 4. Evidence Summary
### What PASSED (hard evidence):
| Test | Result | Evidence |
|------|--------|----------|
| Python imports | 28/28 ✅ | Every module loads clean |
| Dry-run fields | 5/5 × 17/17 = 85/85 ✅ | All required fields present |
| Evals | 30/30 ✅ | 9 sectors, correct channel selection |
| Compliance tests | 11/11 blocked ✅ | LinkedIn/WhatsApp/Instagram/X/TikTok |
| Forbidden claims | 4/4 blocked ✅ | "مضمون", "100%", "SOC 2" blocked |
| Message quality | 3/3 ✅ | Personalized, opt-out, approval required |
| Cost guard | Budget exceeded = blocked ✅ | 11 SAR > 10 SAR limit = False |
| Cache | Set + get + miss ✅ | Deterministic keys work |
| Token counter | Estimates + truncates ✅ | Working |
| Proof pack | Present in output ✅ | confidence=0.7, no_real_send=True |
| Output validation | Fake claims blocked ✅ | 4 issues caught in bad text |
| Supervisor wiring | 9/9 systems imported ✅ | grep confirms all used |
### What is PARTIAL:
| Item | Issue |
|------|-------|
| Pipeline files | Empty dir — logic embedded in supervisor |
| tools/ | Empty — no tool implementations |
| cost/proof/quality/governance dirs | Don't exist — logic in ai/, guardrails/ |
| tests/cost, tests/proof etc | Don't exist — all tests in tests/evals/ |
| Proof sources | Empty list — mock LLM has no real sources |
| GTM API routes | Not created |
| 7 frontend pages | Not built (/os /targets /approvals etc) |
### What is MISSING:
| Item | Status |
|------|--------|
| Customer Delivery OS | No code, no docs |
| GTM API routes (/api/gtm/*) | Not in FastAPI |
| Standalone pipeline files | Empty pipelines/ dir |
| Governance module (approval_queue, action_policy) | Not built |
| Dedicated proof module (evidence.py, claim_validator.py) | Embedded in supervisor |
| Frontend: /os, /company-intelligence, /targets, /approvals, /delivery, /learning-loop, /revenue | Not built |
| Real LLM integration | BLOCKED_BY_ENV (GROQ_API_KEY) |
| Payment | BLOCKED_BY_ENV (Moyasar) |
| Real outreach | SAMI_ACTION (manual Gmail) |
## 5. Setup From Clean Clone
```bash
git clone <repo>
cd salesflow-saas/backend
pip install -r requirements.txt
# Run tests:
python3 tests/evals/test_gtm_os_eval.py
python3 tests/evals/test_compliance_gate.py
python3 tests/evals/test_message_quality.py
# Run dry-run:
python3 scripts/gtm_os_dry_run.py --company-name "Test Agency" --sector agency --city Riyadh
```
## 6. Env Vars Needed (do NOT put in code)
```
GROQ_API_KEY — enables real LLM (currently mock)
ANTHROPIC_API_KEY — optional high-tier model
DATABASE_URL — PostgreSQL connection
MOYASAR_SECRET_KEY — enables payment
MOYASAR_PUBLISHABLE_KEY
SENTRY_DSN — error monitoring
POSTHOG_API_KEY — analytics
TAVILY_API_KEY — web search for enrichment
GOOGLE_SEARCH_API_KEY — search API
GOOGLE_SEARCH_CX — search engine ID
```
## 7. Final Executive Decision
The GTM Intelligence pipeline is **genuinely implemented and working code** — verified by imports, tests, dry-runs, and output inspection. It is not documentation or skeleton. However, it follows a monolithic pattern (supervisor does everything) rather than the planned modular pipeline/route architecture. The next real milestone is not more code — it's first email sent, first reply received, first payment collected.