system-prompts-and-models-o.../salesflow-saas/backend/tests/evals/test_message_quality.py
Claude 503bf2e5d7
feat: AI Cost, Quality & Proof OS — complete
AI Layer:
- llm_router.py: routes cheap/mid/high models, enforces daily budget, caches
- token_counter.py: estimates tokens, truncates to budget
- response_cache.py: in-memory cache with TTL per agent
- prompt_registry.py: versioned prompts with stable prefix for caching
- ai_budget.yaml: model costs, agent budgets, daily limits (10 SAR/day)
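
To illustrate the "in-memory cache with TTL per agent" item above, here is a minimal sketch. It is not the contents of response_cache.py; the class name `TTLCache`, its methods, and the injectable `clock` parameter are all assumptions made for illustration.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Hypothetical in-memory cache whose entries expire per agent."""

    def __init__(self, default_ttl: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._default_ttl = default_ttl
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._ttl_by_agent: Dict[str, float] = {}
        self._store: Dict[Tuple[str, str], Tuple[float, Any]] = {}

    def set_ttl(self, agent: str, ttl: float) -> None:
        """Override the time-to-live (seconds) for one agent."""
        self._ttl_by_agent[agent] = ttl

    def put(self, agent: str, key: str, value: Any) -> None:
        ttl = self._ttl_by_agent.get(agent, self._default_ttl)
        self._store[(agent, key)] = (self._clock() + ttl, value)

    def get(self, agent: str, key: str) -> Optional[Any]:
        entry = self._store.get((agent, key))
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry and report a miss.
            del self._store[(agent, key)]
            return None
        return value
```

Keying entries by `(agent, key)` keeps each agent's responses separate, so a short TTL on one agent does not evict another agent's entries.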

Guardrails:
- output_validator.py: blocks fake claims + prohibited actions
- cost_guard.py: prevents runaway spending
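
As a rough sketch of what "prevents runaway spending" could look like, the guard below refuses any charge that would exceed a daily limit. The 10 SAR/day figure comes from the commit message; the names `DailyBudgetGuard`, `BudgetExceeded`, and `charge` are hypothetical and are not taken from cost_guard.py.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


class BudgetExceeded(RuntimeError):
    """Raised when a charge would push spending past the daily limit."""


@dataclass
class DailyBudgetGuard:
    """Hypothetical guard that tracks spend per calendar day."""

    daily_limit_sar: float = 10.0  # limit quoted in the commit message
    _spent: float = 0.0
    _day: date = field(default_factory=date.today)

    def charge(self, cost_sar: float, today: Optional[date] = None) -> float:
        """Record a cost and return the remaining budget for the day."""
        today = today or date.today()
        if today != self._day:
            # A new day started: reset the running total.
            self._day, self._spent = today, 0.0
        if self._spent + cost_sar > self.daily_limit_sar:
            raise BudgetExceeded(
                f"charge of {cost_sar} SAR exceeds the remaining "
                f"{self.daily_limit_sar - self._spent} SAR"
            )
        self._spent += cost_sar
        return self.daily_limit_sar - self._spent
```

Checking the limit before recording the cost means a rejected call leaves the running total unchanged.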

Observability:
- trace.py: trace_id, cost, latency, steps per pipeline run
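
One possible shape for a per-run trace carrying the fields listed above (trace_id, cost, latency, steps) is sketched here. The `Trace` class and its `record` method are assumptions for illustration, not the actual trace.py.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class Trace:
    """Hypothetical record of one pipeline run."""

    trace_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    steps: List[Dict[str, Union[str, float]]] = field(default_factory=list)

    def record(self, name: str, cost_sar: float, latency_s: float) -> None:
        """Append one step with its cost and latency."""
        self.steps.append(
            {"name": name, "cost_sar": cost_sar, "latency_s": latency_s}
        )

    @property
    def total_cost_sar(self) -> float:
        return sum(step["cost_sar"] for step in self.steps)

    @property
    def total_latency_s(self) -> float:
        return sum(step["latency_s"] for step in self.steps)
```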

Tests: ALL PASS
- 30/30 evals (100%) — 9 sectors, 30 companies
- 10/10 prohibited actions blocked
- 4/4 allowed actions verified
- 3/3 forbidden claims blocked
- 3/3 message quality checks passed

https://claude.ai/code/session_01W1rJthWDkasijTdXCfxVHs
2026-04-26 17:42:47 +00:00


"""Tests that generated messages meet quality standards."""
import asyncio
import os
import sys

# Make the backend root importable when this file is run directly as a script.
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", ".."))

from dealix_gtm_os.agents.message_generation_agent import MessageGenerationAgent


# Running this under pytest requires an async plugin such as pytest-asyncio;
# the __main__ block below runs it with asyncio directly.
async def test_message_quality():
    agent = MessageGenerationAgent()
    # Company names are Arabic: "marketing agency", "real estate company", "clinic".
    cases = [
        {"name": "وكالة تسويق", "sector": "agency", "channel": "email"},
        {"name": "شركة عقار", "sector": "real_estate", "channel": "email"},
        {"name": "عيادة", "sector": "saas", "channel": "whatsapp_warm"},
    ]
    passed = 0
    for case in cases:
        msg = await agent.run(case)
        body = msg.get("body", "")
        issues = []
        # Personalization: the company name must appear in the body.
        if case["name"] not in body:
            issues.append("company name not in body")
        # "إيقاف" is Arabic for "stop"; it must appear as the opt-out keyword.
        if "إيقاف" not in msg.get("stop_condition", "") and "إيقاف" not in body:
            issues.append("no opt-out")
        # Every message must be flagged for human approval before sending.
        if not msg.get("approval_required"):
            issues.append("approval not required")
        if not msg.get("follow_up_24h"):
            issues.append("no 24h follow-up")
        if not msg.get("follow_up_72h"):
            issues.append("no 72h follow-up")
        # Length bounds: at least 50 characters, at most 300 words.
        if len(body) < 50:
            issues.append("body too short")
        if len(body.split()) > 300:
            issues.append("body too long")
        if issues:
            print(f"{case['name']}: {', '.join(issues)}")
        else:
            passed += 1
            print(f"{case['name']}: personalized, opt-out, approval, follow-ups")
    print(f"\nMessage quality: {passed}/{len(cases)} passed")
    assert passed == len(cases)


if __name__ == "__main__":
    print("=== Message Quality Tests ===")
    asyncio.run(test_message_quality())
    print("\n✅ ALL MESSAGE QUALITY TESTS PASSED")
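
The checks in the test above can also be factored into a pure function, which makes them testable without instantiating the agent. The sketch below mirrors the same seven checks; the function name `find_issues` and the constant `OPT_OUT_WORD` are hypothetical and do not appear in the original file.

```python
from typing import Dict, List

OPT_OUT_WORD = "إيقاف"  # Arabic for "stop", the opt-out keyword the test looks for


def find_issues(case: Dict[str, str], msg: Dict[str, object]) -> List[str]:
    """Return the quality issues found in one generated message."""
    body = str(msg.get("body", ""))
    stop_condition = str(msg.get("stop_condition", ""))
    issues: List[str] = []
    if case["name"] not in body:
        issues.append("company name not in body")
    if OPT_OUT_WORD not in stop_condition and OPT_OUT_WORD not in body:
        issues.append("no opt-out")
    if not msg.get("approval_required"):
        issues.append("approval not required")
    if not msg.get("follow_up_24h"):
        issues.append("no 24h follow-up")
    if not msg.get("follow_up_72h"):
        issues.append("no 72h follow-up")
    if len(body) < 50:
        issues.append("body too short")
    if len(body.split()) > 300:
        issues.append("body too long")
    return issues
```

An empty list means the message passed every check, so the loop in the test could reduce to `issues = find_issues(case, msg)`.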