mirror of
https://github.com/x1xhlol/system-prompts-and-models-of-ai-tools.git
synced 2026-06-18 15:29:36 +00:00
Security Curator (4 modules): the first line of defense
- secret_redactor: 11 patterns (GitHub PAT, OpenAI/Anthropic/Supabase/WhatsApp/Moyasar/Sentry/Google/AWS/private keys); never returns raw secret
- patch_firewall: blocks .env / credentials.json / RSA keys; scans added lines for secret patterns
- trace_redactor: masks phones (+966...) and emails for PII safety
- tool_output_sanitizer: cleans tool outputs before they hit ledger/Proof Pack/UI/observability
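
A minimal sketch of how the redaction and masking steps above might look. The two secret patterns, the function names, and the placeholder formats are assumptions for illustration only; the actual module reportedly ships 11 patterns that are not shown here.

```python
import re

# Hypothetical subset of secret patterns (the real list is not shown in the source).
SECRET_PATTERNS = {
    "github_pat": re.compile(r"ghp_[A-Za-z0-9]{36}"),
    "aws_access_key": re.compile(r"AKIA[0-9A-Z]{16}"),
}
# Saudi numbers in international form: +966 followed by nine digits.
PHONE_RE = re.compile(r"\+966\d{9}")
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_secrets(text: str) -> str:
    """Replace each matched secret with a labelled placeholder, never the raw value."""
    for name, pattern in SECRET_PATTERNS.items():
        text = pattern.sub(f"[REDACTED:{name}]", text)
    return text


def mask_pii(text: str) -> str:
    """Mask +966 phone numbers and email addresses before text reaches a trace."""
    text = PHONE_RE.sub("+966" + "*" * 9, text)
    return EMAIL_RE.sub("[email]", text)
```

Running both functions on tool output before it is logged gives a layered defense similar in spirit to the redactor and sanitizer pair described above.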
Growth Curator (5 modules): self-improvement
- message_curator: grades Arabic messages (0..100), detects 8 risky phrases, suggests Saudi-tone skeleton
- playbook_curator: scores playbooks by outcome (accept/reply/meeting/deal); winner/promising/needs_work/archive
- mission_curator: scores completed missions; ship_it_widely/iterate/rework_or_retire
- skill_inventory: deterministic 23-skill catalog across 5 layers
- curator_report: weekly Arabic summary titled "What we learned this week" (ماذا تعلمنا هذا الأسبوع)
Meeting Intelligence (5 modules)
- transcript_parser: accepts Google Meet entries OR plain "Speaker: text" format
- meeting_brief: 6-section pre-meeting brief in Arabic (objective/questions/objections/offer/next-step)
- objection_extractor: 8 categories (price/timing/authority/trust/integration/competitor/results/complexity)
- followup_builder: email + WhatsApp drafts; live_send_allowed=False always
- deal_risk: 0..100 score from objections + missing next-step + decision-maker absence + days-since-touch
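
As an illustration of the deal_risk idea, the four signals can be folded into a bounded score. The weights, field names, and caps below are assumptions chosen so the maximum is exactly 100; the source does not state the real formula.

```python
from dataclasses import dataclass


@dataclass
class DealSignals:
    """Hypothetical container for the four inputs named in the deal_risk bullet."""
    objection_count: int
    has_next_step: bool
    decision_maker_present: bool
    days_since_touch: int


def deal_risk_score(s: DealSignals) -> int:
    """Combine the signals into a 0..100 risk score (all weights are assumed)."""
    score = min(s.objection_count, 4) * 10          # objections: up to 40
    score += 0 if s.has_next_step else 25           # missing next step: 25
    score += 0 if s.decision_maker_present else 20  # decision-maker absent: 20
    score += min(max(s.days_since_touch, 0), 15)    # staleness: up to 15
    return max(0, min(100, score))
```

Capping each component keeps any single signal from dominating the score, which is one plausible way to keep the result interpretable.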
Model Router (5 modules)
- provider_registry: 7 providers (Claude Sonnet/Haiku, GPT-4-class, GPT-4o-mini, Gemini Pro, Azure OAI KSA-region, Local Qwen Arabic-tuned)
- task_router: 10 task types × routing decisions with reasons_ar
- cost_policy: bulk → low; output > 1500 tokens → high
- fallback_policy: high-sensitivity workloads prefer KSA-region/self-hosted FIRST
- usage_dashboard: deterministic demo of all task routes
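
The cost and fallback rules above can be sketched as follows. The provider IDs, the "residency" field, the "medium" default, and the choice to check the bulk rule first are all assumptions; only the two thresholds and the KSA-region/self-hosted preference come from the bullets.

```python
# Hypothetical provider records; the real registry lists 7 providers.
PROVIDERS = [
    {"id": "claude_sonnet", "residency": "global"},
    {"id": "azure_openai_ksa", "residency": "ksa_region"},
    {"id": "local_qwen_ar", "residency": "self_hosted"},
]


def cost_class(is_bulk: bool, expected_output_tokens: int) -> str:
    """Bulk work maps to low cost; long outputs (> 1500 tokens) map to high."""
    if is_bulk:
        return "low"
    if expected_output_tokens > 1500:
        return "high"
    return "medium"  # assumed default, not stated in the source


def order_providers(providers: list[dict], high_sensitivity: bool) -> list[dict]:
    """Put KSA-region and self-hosted providers first for sensitive workloads."""
    if not high_sensitivity:
        return list(providers)
    preferred = {"ksa_region", "self_hosted"}
    # sorted() is stable, so the original order is kept within each group.
    return sorted(providers, key=lambda p: p["residency"] not in preferred)
```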
Connector Catalog (3 modules): integrations catalog
- 14 connectors (WhatsApp Cloud, Gmail, Calendar, Google Meet, Moyasar, LinkedIn Lead Forms, Google Business Profile, X API, Instagram, Sheets, CRM, Website Forms, Composio, MCP Gateway)
- Each has launch_phase (1-4), risk_level, allowed_actions, blocked_actions, Arabic risk dossier
- WhatsApp blocks cold_send_without_consent; Moyasar blocks store_card_number; MCP requires allowlist
Agent Observability (5 modules): agent monitoring and evaluations
- trace_events: SHA256-hashes user/company IDs; sanitizes payload/output before logging
- safety_eval: 7 rules (guarantee, scarcity_fake, medical_claim, financial, regulatory, personal_data, urgency); 0..100 → safe/needs_review/blocked
- saudi_tone_eval: positive markers (هلا, لاحظت, يناسبك) vs negative (تحية طيبة وبعد, synergy, leverage); arabic_ratio bonus
- eval_pack: 5 curated cases with expected verdicts
- cost_tracker: per workflow/provider/task_type aggregation
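
A minimal sketch of the ID-hashing step in trace_events, assuming hypothetical function and field names. The optional salt is an addition for illustration; the source only says that user and company IDs are SHA256-hashed.

```python
import hashlib


def hash_id(raw_id: str, salt: str = "") -> str:
    """Return a hex SHA-256 digest so the raw ID never reaches the trace."""
    return hashlib.sha256(f"{salt}{raw_id}".encode("utf-8")).hexdigest()


def build_trace_event(user_id: str, company_id: str, action: str) -> dict:
    """Build a trace record that carries only hashed identifiers."""
    return {
        "user_hash": hash_id(user_id),
        "company_hash": hash_id(company_id),
        "action": action,
    }
```

Because the digest is deterministic, events from the same user can still be correlated without storing the original identifier.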
Routers (6 new) — 30 endpoints
- /api/v1/security-curator/{demo, redact, inspect-diff, sanitize-output}
- /api/v1/growth-curator/{skills/inventory, messages/grade, messages/improve, messages/duplicates, missions/next, report/weekly, report/demo}
- /api/v1/meeting-intelligence/{brief, brief/demo, transcript/summarize, followup/draft, deal-risk}
- /api/v1/model-router/{providers, tasks, route, cost-class, usage/demo}
- /api/v1/connector-catalog/{catalog, summary, status, risks, {key}}
- /api/v1/agent-observability/{trace/build, safety/eval, tone/eval, evals/run}
Tests (6 new files, 76 tests)
- test_security_curator: 16 tests (PAT detect, key redact, env diff block, payload scan, trace mask)
- test_growth_curator: 16 tests (Arabic grade, risky phrases, dup detect, playbook scoring, mission recommend, weekly report)
- test_meeting_intelligence: 13 tests (transcript parse, brief sections, objection extract, followup drafts, deal risk)
- test_dealix_model_router: 11 tests (every task → ≥1 provider, KSA-region for high sensitivity, cost class, primary override)
- test_agent_observability: 12 tests (trace hashing, safety verdicts, tone scoring, eval pack)
- test_connector_catalog: 11 tests (≥12 connectors, every has risk/blocked actions, WA cold-send blocked, Moyasar card-storage blocked)
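
The duplicate-detection case listed for test_growth_curator could be approximated with the standard library. This sketch uses `difflib.SequenceMatcher` and a hypothetical `find_near_duplicates` helper; it is not the repository's `detect_duplicates`, whose similarity measure is not shown. It returns `(i, j, ratio)` triples, matching the shape the test unpacks.

```python
from difflib import SequenceMatcher
from itertools import combinations


def find_near_duplicates(messages: list[str], threshold: float = 0.8):
    """Return (i, j, ratio) for every pair whose similarity meets the threshold."""
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(messages), 2):
        ratio = SequenceMatcher(None, a, b).ratio()
        if ratio >= threshold:
            pairs.append((i, j, ratio))
    return pairs
```

Pairwise comparison is quadratic in the number of messages, which is usually acceptable for a weekly batch but would need indexing or shingling at larger scale.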
Docs (8 new + 1 updated)
- AGENT_SECURITY_CURATOR.md (Arabic)
- GROWTH_CURATOR_STRATEGY.md (Arabic)
- MEETING_INTELLIGENCE.md (Arabic)
- MODEL_PROVIDER_ROUTER.md (Arabic)
- CONNECTOR_CATALOG.md (Arabic)
- AGENT_OBSERVABILITY_EVALS.md (Arabic)
- PRIVATE_BETA_LAUNCH_TODAY.md (Arabic) — go-checklist + offer + risks
- DEMO_SCRIPT_12_MINUTES.md (Arabic) — minute-by-minute demo flow
- FIRST_20_OUTREACH_MESSAGES.md (Arabic) — 7 personas + 3 follow-ups, all under safety/tone evals
- DEALIX_100_PERCENT_LAUNCH_PLAN.md — added §34 Self-Improving Agent Platform + §35 Private Beta Launch
Landing
- landing/private-beta.html — Arabic RTL, dark theme, pricing, 11 demo endpoints, safety banner
Test results
- 76/76 new tests pass
- Full suite: 663 passed, 2 skipped (missing API keys, unrelated)
- 0 existing tests broken
Safety
- All 6 layers honor approval-first, draft-only, no-live-send
- Hash user/company IDs before any trace
- No secrets in logs/embeddings/traces (3-layer defense: redactor + sanitizer + firewall)
- Saudi tone eval rejects stiff corporate language such as "تحية طيبة وبعد" (a formal letter greeting) and "synergy"
- Safety eval blocks "ضمان 100%" ("100% guarantee"), medical claims, and fake urgency
- Connector Catalog: WhatsApp blocks cold-send, Moyasar blocks card storage, MCP requires allowlist
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
156 lines
5.4 KiB
Python
"""Unit tests for the Growth Curator."""

from __future__ import annotations

from auto_client_acquisition.growth_curator import (
    build_weekly_curator_report,
    detect_duplicates,
    grade_message,
    inventory_skills,
    recommend_next_mission,
    recommend_next_playbook,
    score_mission,
    score_playbook,
    suggest_improvement,
)


# ── Skill Inventory ──────────────────────────────────────────
def test_inventory_lists_kill_feature():
    out = inventory_skills()
    assert out["total"] >= 20
    kill_ids = [s["id"] for s in out["kill_features"]]
    assert "first_10_opportunities" in kill_ids


def test_inventory_layers_present():
    out = inventory_skills()
    layers = set(out["layers"])
    assert {"platform_services", "intelligence_layer",
            "growth_curator", "security_curator"}.issubset(layers)


# ── Message Curator ──────────────────────────────────────────
def test_grades_natural_arabic_message_high():
    text = ("هلا أحمد، لاحظت توسعكم في فريق المبيعات. "
            "نشتغل على Dealix كمدير نمو عربي. "
            "يناسبك أعرض لك مثال 10 دقائق هذا الأسبوع؟")
    g = grade_message(text, sector="training")
    assert g.score >= 60
    assert g.verdict in ("publish", "needs_edit")


def test_blocks_risky_phrases():
    text = "آخر فرصة! ضمان 100% نتائج مضمونة. اضغط الآن."
    g = grade_message(text)
    assert g.risky_phrases
    assert g.verdict in ("needs_edit", "reject")


def test_rejects_non_arabic():
    text = "Hello there, just checking in. Cheers."
    g = grade_message(text)
    assert g.verdict == "reject"


def test_detects_near_duplicates():
    msgs = [
        "هلا أحمد، لاحظت توسعكم. يناسبك أعرض لك Pilot؟",
        "هلا محمد، لاحظت توسعكم. يناسبك أعرض لك Pilot؟",
        "totally unrelated message in english",
    ]
    pairs = detect_duplicates(msgs, threshold=0.8)
    assert any({i, j} == {0, 1} for i, j, _r in pairs)


def test_suggest_improvement_returns_skeleton():
    out = suggest_improvement("Hi")
    assert "suggested_skeleton_ar" in out
    assert "هلا" in out["suggested_skeleton_ar"]


# ── Playbook Curator ────────────────────────────────────────
def test_score_playbook_winner_tier():
    """Strong outcomes across all signals should push into winner/promising."""
    pb = {
        "used_count": 100, "accept_count": 90,
        "replied_count": 80, "meeting_count": 60, "deal_count": 40,
    }
    s = score_playbook(pb)
    assert s["score"] >= 50
    assert s["tier"] in ("winner", "promising")


def test_score_playbook_needs_work_tier():
    """Modest outcomes should map to needs_work."""
    pb = {
        "used_count": 100, "accept_count": 60,
        "replied_count": 40, "meeting_count": 20, "deal_count": 8,
    }
    s = score_playbook(pb)
    assert s["tier"] in ("needs_work", "promising")


def test_score_playbook_unproven_for_zero_uses():
    s = score_playbook({"used_count": 0})
    assert s["tier"] == "unproven"
    assert s["score"] == 0


def test_recommend_next_playbook_default_when_empty():
    rec = recommend_next_playbook([])
    assert rec["recommended_id"] == "default_warm_outreach"


def test_recommend_next_playbook_picks_promising_first():
    pbs = [
        {"id": "p1", "title": "Winner", "tier": "winner", "score": 80},
        {"id": "p2", "title": "Promising", "tier": "promising", "score": 60},
    ]
    rec = recommend_next_playbook(pbs)
    assert rec["recommended_id"] == "p2"


# ── Mission Curator ─────────────────────────────────────────
def test_score_mission_ship_it_with_strong_outcome():
    out = score_mission({
        "opportunities_generated": 10,
        "drafts_approved": 5,
        "meetings_booked": 3,
        "revenue_influenced_sar": 60_000,
        "time_to_value_minutes": 8,
        "risks_blocked": 2,
    })
    assert out["score"] >= 70
    assert out["verdict"] == "ship_it_widely"


def test_recommend_next_mission_starts_with_kill_feature():
    rec = recommend_next_mission(None)
    assert rec["recommended_mission_id"] == "first_10_opportunities"


def test_recommend_next_mission_after_kill_feature():
    history = [{"mission_id": "first_10_opportunities"}]
    rec = recommend_next_mission(history, growth_brain={
        "growth_priorities": ["fill_pipeline"],
    })
    assert rec["recommended_mission_id"] == "meeting_booking_sprint"


# ── Curator Report ───────────────────────────────────────────
def test_weekly_report_handles_empty_input():
    rep = build_weekly_curator_report()
    assert rep["messages"]["total"] == 0
    assert rep["playbooks"]["total"] == 0
    assert rep["missions"]["total"] == 0
    assert rep["next_playbook"]["recommended_id"]


def test_weekly_report_marks_low_quality_for_archive():
    rep = build_weekly_curator_report(messages=[
        {"id": "m1", "text": "Hi"},
        {"id": "m2", "text": "آخر فرصة! ضمان 100% نتائج مضمونة!"},
    ])
    assert rep["messages"]["to_archive"] >= 1