mancitrus/system-prompts-and-models-of-ai-tools

mirror of https://github.com/x1xhlol/system-prompts-and-models-of-ai-tools.git synced 2026-06-18 15:29:36 +00:00

Sami Assiri b4531f0a4c feat(tier1): docs-governance CI, evidence gate, closure artifacts, trust/execution docs

- Replace repo-preflight with docs-governance workflow and check_docs_links.py
- Class B bundle: require correlation_id for external_*; AuditMetadata trace fields
- Root-safe TIER1 §2; optional .githooks pre-push for main
- Add RELEASE_READINESS_MATRIX_AR, SOURCE_OF_TRUTH_INDEX, operational severity, external index
- ExecWeeklyGovernanceContract; expand trust-fabric, execution-fabric, ADR-0001, ws5, Saudi overlays
- Wire MASTER TOC, enterprise-readiness, completion-program, architecture_brief paths

Made-with: Cursor

2026-04-16 16:46:36 +03:00


Trust fabric — verification, observability, security

Canonical: MASTER_OPERATING_PROMPT.md.

The trust fabric is an operating substrate, not an item on a product feature checklist. It wraps both the decision plane and the execution plane.

Components (minimum conceptual set)

Policy gate — rules evaluated before promotion or external commitment.
Approval routing — human or committee paths per approval class (see approval-policy.md).
Authorization — RBAC/ReBAC for memos, rooms, launches, admin actions.
Audit logging — durable records of who/what/when for governed actions.
Tool verification — evidence between intent, claim, and actual tool execution (pattern over vendor lock-in).
Evidence packs — tied to decision memos for Class B / R2+ work.
Security validation — white-box review before higher environments; stored findings; release blockers for critical issues.
Traces, logs, metrics — correlation IDs across API, workers, and workflows.
Continuous evaluation — offline datasets, online trace review, regression reviews.
Red-team workflows — for agent, RAG, tool, and MCP surfaces.
Rollback review — explicit compensation/rollback notes for risky changes.
Metadata — provenance, freshness, reversibility on outputs and events where applicable.

Tool verification layer (per interaction)

Capture where possible:

Request ID, run ID, agent or workflow ID
Intended action vs claimed action vs actual tool call
Parameters, outputs, material side effects
Timestamps
Verification status: verified | partially_verified | unverified | contradicted

If the system claims that something happened but the evidence is insufficient to confirm it, treat the claim as contradicted until the record is corrected.
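The capture list and status values above can be sketched as a small record type plus a status function. This is a minimal illustration, not code from the repo: the `ToolInteraction` class, its field names, and the exact mapping from evidence to status are assumptions. Only the four status strings and the rule "claim without evidence is contradicted" come from the text.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Optional


@dataclass
class ToolInteraction:
    """One tool interaction as seen by the verification layer (hypothetical shape)."""
    request_id: str
    run_id: str
    agent_id: str
    intended_action: str
    claimed_action: Optional[str]      # what the agent says it did
    actual_tool_call: Optional[str]    # what the tool log shows, if anything
    parameters: dict[str, Any] = field(default_factory=dict)
    outputs: Optional[dict[str, Any]] = None
    side_effects: list[str] = field(default_factory=list)
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


def verification_status(rec: ToolInteraction) -> str:
    """Map captured evidence to one of the four statuses (assumed mapping)."""
    if rec.claimed_action is None and rec.actual_tool_call is None:
        return "unverified"            # nothing claimed, nothing observed
    if rec.claimed_action is not None and rec.actual_tool_call is None:
        return "contradicted"          # claim without evidence (rule from the text)
    if rec.claimed_action is not None and rec.claimed_action != rec.actual_tool_call:
        return "contradicted"          # claim disagrees with the observed call
    if rec.actual_tool_call != rec.intended_action:
        return "partially_verified"    # call is evidenced but deviates from intent
    if rec.outputs is None:
        return "partially_verified"    # call is evidenced but outputs were not captured
    return "verified"
```

A record where the agent claims `send_email` but no tool call was logged comes back as `contradicted`, which matches the rule stated above.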

Evaluation and observability

Require:

Distributed tracing or correlation IDs end-to-end
Workflow step telemetry (start, success, failure, retry)
Tool-call, approval, rollback, and provider-routing telemetry
Structured output validation and I/O guardrails where LLMs drive branches
Periodic regression reviews for prompt/model/router changes
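
As one illustration of the step telemetry requirement, the sketch below wraps a workflow step and emits start, success, failure, and retry events that all carry the same correlation ID. The function name `run_step`, the event names, and the `emit` callback are hypothetical; a real deployment would route these events to its tracing or logging backend.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_step(name: str, fn: Callable[[], T], correlation_id: str,
             emit: Callable[[dict], None], max_attempts: int = 3) -> T:
    """Run one workflow step and emit start/success/failure/retry events."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    def event(kind: str, attempt: int, **extra) -> dict:
        # Every event carries the same correlation_id so steps can be joined later.
        return {"event": kind, "step": name, "attempt": attempt,
                "correlation_id": correlation_id, "ts": time.time(), **extra}

    for attempt in range(1, max_attempts + 1):
        emit(event("step.start", attempt))
        try:
            result = fn()
        except Exception as exc:
            is_last = attempt == max_attempts
            emit(event("step.failure" if is_last else "step.retry", attempt,
                       error=repr(exc)))
            if is_last:
                raise                   # surface the failure after the last attempt
        else:
            emit(event("step.success", attempt))
            return result
    raise RuntimeError("unreachable")   # the loop always returns or raises
```

Passing `emit=events.append` collects events in a list for tests; passing a function that writes JSON lines would feed a log pipeline instead.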

Security gate scope

Before shipping or promoting, review the following surfaces: auth, permissions, API routes, admin flows, uploads, webhooks, customer-facing messaging, AI-triggered action surfaces, connectors, release surfaces, MCP/tool surfaces, RAG and document ingestion paths.

Expect: severity classification, stored findings, and release-blocking rules for critical classes of issues.
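
A release-blocking rule over stored findings might look like the sketch below. It does not reflect the API of `security_gate.py`; the `Severity` levels, the `Finding` fields, and the default threshold are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    # Hypothetical levels; the real scale belongs to the severity model doc.
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


@dataclass(frozen=True)
class Finding:
    finding_id: str
    surface: str            # e.g. "webhooks", "uploads", "mcp_tools"
    severity: Severity
    resolved: bool = False


def release_blockers(findings: list[Finding],
                     block_at: Severity = Severity.CRITICAL) -> list[Finding]:
    """Return unresolved findings at or above the blocking severity."""
    return [f for f in findings if not f.resolved and f.severity >= block_at]
```

A promotion step would refuse to proceed whenever `release_blockers(...)` returns a non-empty list, and would store that list alongside the release record.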

Dealix pointers

Security-related services: salesflow-saas/backend/app/services/security_gate.py, salesflow-saas/backend/app/utils/security.py.
Audit models: e.g. salesflow-saas/backend/app/models/audit_log.py.
Launch discipline: salesflow-saas/docs/LAUNCH_CHECKLIST.md, salesflow-saas/verify-launch.ps1.

Target Tier-1 components (policy, IAM, secrets) — vs current

The following are architecture targets for enterprise-grade trust. They are not all implemented as named products in this repo today. Track status in ../dealix-six-tracks.md and technology-radar-tier1.md.

Component	Role	Target use in Dealix	Current (typical)
OPA / Rego	Policy decision point over JSON inputs (deploy, tenancy, risk)	Central PDP for “may this workflow step run?”	Application policy in Python (`dealix_os/policy_engine.py`, services) — evolve toward policy-as-data
OpenFGA or Cedar	Fine-grained authorization (ReBAC / analyzable policies)	DD room, term sheet, board memo, agent-on-behalf-of-user	RBAC in app + tenant checks — evolve toward explicit relationship model
HashiCorp Vault (or cloud equivalent)	Secrets, dynamic credentials, audit	Short-lived DB/API credentials, connector secrets	Env + platform secrets — tighten rotation and audit story
Keycloak (or enterprise IdP)	Identity, SSO, brokering	B2B tenants, executive users	JWT / tenant auth in app — map to IdP roadmap

Integration pattern: policy engines and PDPs should consume the same A/R/S and actor_type fields as events (see events-and-schema.md) — avoid duplicating conflicting rules in prompts.
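
One way to keep policy inputs aligned with events is to derive the policy input from the event with a single function, so an in-app check and any future external PDP see identical fields. The sketch below is an assumption-heavy illustration: the key names `A`, `R`, `S`, the sample values, and the rule itself are placeholders, since the real field definitions live in events-and-schema.md and are not reproduced here.

```python
# Hypothetical key names; the authoritative schema is in events-and-schema.md.
REQUIRED_FIELDS = ("A", "R", "S", "actor_type")


def policy_input_from_event(event: dict) -> dict:
    """Project an event onto the fields every policy decision point consumes."""
    missing = [name for name in REQUIRED_FIELDS if name not in event]
    if missing:
        raise ValueError(f"event is missing policy fields: {missing}")
    return {name: event[name] for name in REQUIRED_FIELDS}


def may_run_step(policy_input: dict,
                 restricted_r: frozenset = frozenset({"R2", "R3"})) -> bool:
    """Placeholder rule: agent actors may not run steps at restricted R levels."""
    is_agent = policy_input["actor_type"] == "agent"
    return not (is_agent and policy_input["R"] in restricted_r)
```

Because the rule reads only the projected input, the same dictionary could later be sent as the JSON input document to an external policy engine without changing the event schema.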

Spike gate: no production dependency on OPA/OpenFGA/Vault/Keycloak until ADR + security review + tests; see ../adr/0001-tier1-execution-policy-spikes.md.

Runtime policies (Tier-1 operational — beyond the radar)

These are enforcement expectations once a component is in-path in production (pair with github-and-release.md and operational-severity-model.md).

OpenFGA — pinned authorization models

No production call path without a recorded authorization_model_id (or equivalent immutable model version) in configuration and deploy manifests. Models are immutable in OpenFGA; pin IDs per environment and rotate via controlled rollout — see ../references/tier1-external-index.md (OpenFGA links).
Agent-on-behalf-of-user flows MUST be modeled explicitly (no implicit super-user tuples).
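
The pinning rule can be enforced with a startup or CI check over the deploy manifest. The sketch below assumes a manifest shaped as a mapping from environment name to settings with an `authorization_model_id` key; that shape and the function name are hypothetical, and the check deliberately does not call any OpenFGA API.

```python
def require_pinned_models(manifest: dict[str, dict]) -> dict[str, str]:
    """Return env -> pinned model ID, or raise if any environment is unpinned."""
    pinned: dict[str, str] = {}
    for env_name, settings in manifest.items():
        model_id = str(settings.get("authorization_model_id") or "").strip()
        # Reject empty values and floating aliases such as "latest".
        if not model_id or model_id.lower() == "latest":
            raise ValueError(f"{env_name}: authorization_model_id must be pinned")
        pinned[env_name] = model_id
    return pinned
```

Running this check in the deploy pipeline turns a missing or floating model reference into a failed build rather than a runtime surprise.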

Vault (or equivalent) — dual audit devices

Production clusters MUST enable at least two independent audit devices (e.g. file + SIEM socket) so tampering or loss of one sink does not erase the audit trail — see Vault audit documentation in ../references/tier1-external-index.md.
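
A compliance check for this rule could inspect the list of enabled audit devices. The sketch below assumes the input is a mapping from mount path to a dictionary with a `type` key, similar in shape to what Vault reports when listing audit devices; treat that shape as an assumption and verify it against the Vault documentation. Requiring two distinct device types is one conservative reading of "independent", not a rule stated by Vault.

```python
def has_dual_audit(devices: dict[str, dict]) -> bool:
    """True if at least two audit devices of different types are enabled."""
    types = {d.get("type") for d in devices.values() if d.get("type")}
    return len(devices) >= 2 and len(types) >= 2


# Example input (hypothetical values): a file device plus a socket device.
example_devices = {
    "file/": {"type": "file"},
    "siem/": {"type": "socket"},
}
```

Two file devices writing to the same disk would pass a simple count but fail this check, which is the failure mode the dual-device rule is meant to prevent.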

OpenTelemetry — log correlation

Critical paths (approvals, external commitments, connector facade calls) MUST emit trace_id, span_id, and a stable correlation_id (or equivalent) in structured logs and audit receipts so SIEM queries can join API ↔ worker ↔ workflow — see OTel logging spec in ../references/tier1-external-index.md.
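
A minimal sketch of this requirement using only the standard library is shown below. It is not the OpenTelemetry SDK: the `JsonFormatter` class and `log_critical_path` helper are hypothetical names, and the ID checks assume the W3C Trace Context hex encoding (32 hex characters for a trace ID, 16 for a span ID).

```python
import json
import logging
import re

_TRACE_ID = re.compile(r"^[0-9a-f]{32}$")   # 16 bytes, lowercase hex
_SPAN_ID = re.compile(r"^[0-9a-f]{16}$")    # 8 bytes, lowercase hex


class JsonFormatter(logging.Formatter):
    """Render log records as JSON with the three correlation fields."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "message": record.getMessage(),
            "trace_id": getattr(record, "trace_id", None),
            "span_id": getattr(record, "span_id", None),
            "correlation_id": getattr(record, "correlation_id", None),
        })


def log_critical_path(logger: logging.Logger, message: str, *,
                      trace_id: str, span_id: str, correlation_id: str) -> None:
    """Log a critical-path event, refusing to log without valid correlation fields."""
    if not _TRACE_ID.match(trace_id):
        raise ValueError("trace_id must be 32 lowercase hex characters")
    if not _SPAN_ID.match(span_id):
        raise ValueError("span_id must be 16 lowercase hex characters")
    if not correlation_id:
        raise ValueError("correlation_id must be non-empty")
    logger.info(message, extra={"trace_id": trace_id, "span_id": span_id,
                                "correlation_id": correlation_id})
```

Making the three fields keyword-only arguments moves the enforcement to the call site, so an approval or connector call cannot be logged without them.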
