ClaimsAuditors: Observation vs Enforcement

Lucid separates observation from enforcement. ClaimsAuditors produce claims (structured observations about AI traffic). The Gateway evaluates those claims against a Cedar policy and makes the enforcement decision.

Key Principle

Auditors observe. The Gateway decides. This separation means you can change what is blocked or allowed by editing a Cedar policy -- without redeploying any auditor code.

The Claims Model

A claim is a typed observation about a piece of AI traffic. Examples:

| Claim Name | Type | Example Value | Auditor |
|------------|------|---------------|---------|
| toxic_content | number (0-1) | 0.12 | LLM Judge Auditor |
| injection_risk | number (0-1) | 0.82 | LLM Judge Auditor |
| pii_found | boolean | true | PII Compliance Auditor |
| detected_regions | string[] | ["US", "EU"] | Sovereignty Auditor |
| safety_score | number (0-1) | 0.95 | Eval Auditor |

Claims are the universal interface between auditors and the policy layer. Every auditor, whether built-in or custom, communicates exclusively through claims.
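What a claim looks like in code can be sketched with a stand-in dataclass (the real Claim type comes from lucid_schemas; this minimal version is an assumption for illustration):

```python
from dataclasses import dataclass
from typing import Union

# Stand-in for lucid_schemas.Claim -- illustrative only.
@dataclass
class Claim:
    name: str
    value: Union[float, bool, list]

# Typed observations mirroring the example rows above.
observations = [
    Claim(name="toxic_content", value=0.12),
    Claim(name="pii_found", value=True),
    Claim(name="detected_regions", value=["US", "EU"]),
]
```

Note that no claim carries a decision -- interpreting the values is the Gateway's job.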

How It Works

flowchart LR
    subgraph Auditors["ClaimsAuditors (Observe)"]
        direction TB
        A1["Guardrails"]
        A2["PII"]
        A3["Sovereignty"]
        A4["Eval"]
        A5["..."]
    end

    subgraph Gateway["Gateway (Enforce)"]
        direction TB
        G1["Collect Claims"]
        G2["Evaluate Cedar Policy"]
        G3["Produce Evidence"]
        G1 --> G2 --> G3
    end

    A1 -->|claims| G1
    A2 -->|claims| G1
    A3 -->|claims| G1
    A4 -->|claims| G1
    A5 -->|claims| G1

  1. Each auditor receives the request/response data
  2. Each auditor returns claims describing what it observed
  3. The Gateway collects all claims into a single ClaimsContext
  4. The Gateway evaluates one Cedar policy against that context
  5. The Cedar policy produces an allow/deny decision
  6. The Gateway bundles everything into a signed Evidence record
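Steps 3-6 can be sketched in Python. Here a plain predicate stands in for the Cedar policy and a hash stubs the Evidence signature; both are simplifying assumptions, not the Gateway's actual implementation:

```python
from dataclasses import dataclass
import hashlib
import json

@dataclass
class Claim:
    name: str
    value: object

def gateway_decide(auditor_outputs, policy):
    """Sketch of steps 3-6: collect claims, evaluate one policy, emit evidence."""
    # Step 3: merge every auditor's claims into a single context.
    context = {c.name: c.value for claims in auditor_outputs for c in claims}
    # Steps 4-5: evaluate the (stand-in) policy against that context.
    decision = "deny" if policy(context) else "allow"
    # Step 6: bundle claims + decision into an evidence record; the real
    # Gateway signs it, here we just hash the payload.
    payload = json.dumps({"claims": context, "decision": decision}, sort_keys=True)
    return {"decision": decision, "digest": hashlib.sha256(payload.encode()).hexdigest()}

# Stand-in for the Cedar rule 'forbid when toxic_content > 0.8'.
policy = lambda ctx: ctx.get("toxic_content", 0) > 0.8

evidence = gateway_decide(
    [[Claim("toxic_content", 0.12)], [Claim("pii_found", False)]], policy
)
# evidence["decision"] == "allow"
```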

The Four Phases

ClaimsAuditors can produce claims at four lifecycle phases. Each phase receives different data.


Phase 1: Build & Deploy (Artifact)

Claims about static assets verified at deployment time.

Integrity Watchdog

Produces claims about model weight hash validity and SBOM integrity.

signature_valid, format_allowed

Model Card Validator

Produces claims about model transparency metadata completeness.

required_benchmarks_complete, safety_score

Phase 2: Input Gate (Request)

Claims about the user's request before it reaches the model.

PII Sanitizer

Produces claims about PII entities detected in the prompt.

pii_found, pii_risk_score

Injection Shield

Produces claims about prompt injection patterns detected.

injection_risk, secret_leaked

Geo-Fencer

Produces claims about the request origin's jurisdiction.

detected_regions, location_confidence

Phase 3: Runtime Mirror (Execution)

Claims about the model's behavior during inference.

Token Economist

Produces claims about token usage, latency, and cost.

token_count, latency_ms

Loop Breaker

Produces claims about recursive tool-calling patterns.

loop_exceeded, tool_count

MCP Firewall

Produces claims about MCP tool usage and domain access.

network_allowed, tool_denied

Phase 4: Output Gate (Response)

Claims about the model's response before it reaches the user.

Truth Proxy

Produces claims about hallucination detection and RAG groundedness.

faithfulness, hallucination_score

Fairness Judge

Produces claims about bias and disparate impact metrics.

demographic_parity_diff, stereotype_detected

Toxicity Audit

Produces claims about harmful content in the response.

toxic_content, bias_detected

The ClaimsAuditor Pattern

All auditors follow the same pattern: subclass ClaimsAuditor, use the @claims decorator to mark observation methods, and deploy with serve().

from lucid_auditor_sdk import ClaimsAuditor, claims, serve, Phase
from lucid_schemas import Claim

class MyAuditor(ClaimsAuditor):
    @claims(phase=Phase.REQUEST)
    def observe(self, request: dict, *, risk_threshold: float = 0.8) -> list[Claim]:
        # Analyze the request and return observations
        score = self.analyze(request)
        return [Claim(name="my_observation", value=score > risk_threshold)]

    def analyze(self, request: dict) -> float:
        # Placeholder risk scoring -- replace with real analysis logic
        return 0.0

serve(MyAuditor(), port=8080)

The auditor never returns Deny(), Proceed(), or any decision. It returns claims. The Cedar policy in the Gateway decides what those claims mean.

See the Auditor Development Guide for full implementation details.

Claim Provenance

Every claim carries an optional provenance field that records which auditor settings produced it. The @claims decorator auto-stamps provenance using the keyword-only parameters injected into the method:

@claims(phase=Phase.REQUEST)
def scan_security(self, request: dict, *, injection_threshold: float = 0.9) -> list[Claim]:
    score = self.pipeline.detect(request["prompt"])
    # provenance is auto-stamped: {"injection_threshold": 0.9}
    return [Claim(name="injection_risk", value=score)]

This makes every claim self-describing:

  • What was measured: claim.value
  • How it was measured: claim.provenance (the exact settings used)
  • Who measured it: evidence.attester_id
  • When: claim.timestamp
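The auto-stamping behavior can be sketched with a simplified decorator. This is not the SDK's implementation -- it omits the phase argument and uses a stand-in Claim type -- but it shows how keyword-only defaults can be introspected and recorded on every claim:

```python
import functools
import inspect
from dataclasses import dataclass, field

@dataclass
class Claim:
    name: str
    value: object
    provenance: dict = field(default_factory=dict)

def claims(func):
    """Sketch: read keyword-only defaults from the signature, merge any
    overrides, and stamp the exact settings used onto each returned claim."""
    defaults = {
        name: p.default
        for name, p in inspect.signature(func).parameters.items()
        if p.kind is inspect.Parameter.KEYWORD_ONLY
    }

    @functools.wraps(func)
    def wrapper(self, request, **overrides):
        settings = {**defaults, **overrides}
        result = func(self, request, **settings)
        for claim in result:
            claim.provenance = dict(settings)  # auto-stamp provenance
        return result
    return wrapper

class SecurityAuditor:
    @claims
    def scan(self, request, *, injection_threshold: float = 0.9):
        return [Claim("injection_risk", 0.3)]

out = SecurityAuditor().scan({"prompt": "hi"})
# out[0].provenance == {"injection_threshold": 0.9}
```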

Cedar Policy Decides

The Gateway evaluates a Cedar policy against the collected claims. For example:

// Block requests with high toxicity
forbid(principal, action == Action::"invoke", resource)
when { context.claims.toxic_content > 0.8 };

// Block requests with PII unless the agent has PII access
forbid(principal, action == Action::"invoke", resource)
when { context.claims.pii_found == true }
unless { resource.has_pii_access == true };

This means:

  • Changing thresholds requires editing the Cedar policy, not auditor code
  • Adding new rules requires editing the Cedar policy, not auditor code
  • Auditors focus purely on accurate observation

See the Cedar Policies Guide for authoring details.


Detection Settings in AuditorPolicy

Detection settings are declared as keyword-only parameters on @claims-decorated methods and stored in the AuditorPolicy.detection section alongside Cedar response rules. This means detection and response configuration live in one policy document.

How Detection Overrides Work

| Concern | Model | Example |
|---------|-------|---------|
| Cedar policies | Deny-overrides (like AWS SCPs) | Org forbid rule cannot be overridden by workspace permit |
| Detection overrides | Per-policy overrides via AuditorPolicy.detection | Policy sets injection_threshold: 0.5 for a specific agent |

Detection overrides are scoped to the AuditorPolicy and resolved by the Gateway at runtime. The @claims decorator auto-introspects parameter defaults from the method signature.
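The default-introspection and override resolution described above might look like the following sketch (resolve_settings and the parameter names are assumptions, not the Gateway's actual API):

```python
import inspect

def resolve_settings(func, policy_detection: dict) -> dict:
    """Sketch: defaults come from the method's keyword-only parameters;
    the AuditorPolicy.detection section overrides them per policy."""
    defaults = {
        name: p.default
        for name, p in inspect.signature(func).parameters.items()
        if p.kind is inspect.Parameter.KEYWORD_ONLY
    }
    unknown = set(policy_detection) - set(defaults)
    if unknown:
        # Reject settings the auditor never declared.
        raise ValueError(f"unknown detection settings: {sorted(unknown)}")
    return {**defaults, **policy_detection}

# Hypothetical @claims-decorated method signature.
def scan(self, request, *, injection_threshold: float = 0.9, scan_tools: bool = True):
    ...

# A policy tightens one setting for a specific agent; the rest keep defaults.
settings = resolve_settings(scan, {"injection_threshold": 0.5})
# settings == {"injection_threshold": 0.5, "scan_tools": True}
```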

Enforcement Modes

Each field in AuditorPolicy.detection can carry an enforcement mode that constrains how overrides are applied at the policy scope:

| Mode | Behavior | Use Case |
|------|----------|----------|
| Floor | Override can raise but not lower a numeric value | injection_threshold >= 0.7 -- policy can tighten to 0.9 |
| Ceiling | Override can lower but not raise a numeric value | max_tool_calls_per_session <= 50 |
| Exact | Override must use the specified value | pii_compliance_mode = "hipaa" |
| Superset | Override must include all specified items, may add more | pii_types_enabled >= [SSN, CREDIT_CARD] |
| Unlocked | No constraint (default) | log_level, otlp_endpoint |

In the Observer UI, enforced fields display a lock badge with the enforcement mode.

The JSON schema for detection overrides with enforcement:

{
  "detection_overrides": {
    "pii_found": {
      "pii_types_enabled": ["email", "phone", "ssn"],
      "score_threshold": 0.7
    }
  },
  "enforcement": {
    "pii_types_enabled": { "mode": "superset" },
    "score_threshold": { "mode": "floor" }
  }
}

When an override violates an enforcement constraint, the API returns 422 Unprocessable Entity with a clear error message explaining the violation.
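The enforcement checks from the table above can be sketched as follows (function name and message wording are assumptions; the real API surfaces violations as a 422 response):

```python
def check_override(field, value, baseline, mode):
    """Return an error message if an override violates its enforcement
    mode, or None if the override is allowed."""
    if mode == "floor" and value < baseline:
        return f"{field}: override {value} is below the enforced floor {baseline}"
    if mode == "ceiling" and value > baseline:
        return f"{field}: override {value} exceeds the enforced ceiling {baseline}"
    if mode == "exact" and value != baseline:
        return f"{field}: must be exactly {baseline!r}"
    if mode == "superset" and not set(baseline) <= set(value):
        missing = sorted(set(baseline) - set(value))
        return f"{field}: missing required items {missing}"
    return None  # "unlocked", or the constraint is satisfied

# A floor can be tightened but not loosened:
# check_override("score_threshold", 0.9, 0.7, "floor") -> None (allowed)
# check_override("score_threshold", 0.5, 0.7, "floor") -> error message
```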

Enforced vs Configurable Settings

Settings with enforcement modes at a parent policy scope are constrained and cannot be changed beyond what the enforcement mode allows. This enables org admins to mandate security baselines while allowing flexibility on operational configuration.

| Badge | Meaning |
|-------|---------|
| Locked (Set by Org Policy) | Field enforced at org level -- not overridable in child policies |
| Locked (Set by Workspace Policy) | Field enforced at workspace level -- not overridable in agent policies |
| INHERITED | Value inherited but overridable |
| OVERRIDDEN | Value changed from parent default |
| DIRECT | Set at this policy scope |

Mandatory Auditors

Auditor sources cascade and merge (never subtract):

System auditors (always)        -> [observability, guardrails]      Cannot remove
  + Org-required auditors       -> [pii, ...]                       Cannot remove
  + Workspace-required auditors -> [sovereignty, ...]               Cannot remove
  + Framework-required auditors -> (from selected frameworks)       Cannot remove
  + User-selected auditors      -> [red-team, ...]                  Can remove
  = Final auditor set
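The additive cascade above can be sketched as a set union in which only the user-selected auditors are removable (names are illustrative):

```python
def final_auditor_set(system, org, workspace, framework, user):
    """Sketch: required sources only merge (never subtract); return the
    final auditor set plus the non-removable subset."""
    required = set(system) | set(org) | set(workspace) | set(framework)
    return required | set(user), required

final, locked = final_auditor_set(
    system=["observability", "guardrails"],
    org=["pii"],
    workspace=["sovereignty"],
    framework=[],
    user=["red-team"],
)
# "red-team" is in final but not in locked, so it can be removed later
```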

Auditor Presets

Presets are policy templates that bundle both detection overrides and Cedar response rules optimized for a specific risk tolerance. They provide a fast on-ramp for configuring auditors without manually tuning every field.

Per-Auditor Tiers

Each auditor supports three preset tiers:

| Tier | Risk Tolerance | False-Positive Rate | Target |
|------|----------------|---------------------|--------|
| Starter | High tolerance | < 2% | Individual developers, prototyping |
| Balanced | Moderate | 3-5% | Production teams with SLOs |
| Strict | Low tolerance | 5-8% | Regulated industries (finance, healthcare, government) |

Apply a per-auditor preset via the API:

POST /api/v1/workspaces/{id}/apply-preset
{ "auditor": "guardrails", "tier": "balanced" }

Quick-Start Bundles

Bundles configure multiple auditors at once for common deployment scenarios:

| Bundle | Auditors | Preset Tiers | Claims Monitored |
|--------|----------|--------------|------------------|
| Solo Builder | Guardrails, PII, Governance, Observability | All Starter | ~35 |
| Production Team | Guardrails, PII, Governance, Observability, RAG Quality, Fairness, Sovereignty | Mostly Balanced | ~65 |
| Regulated Enterprise | All 11 auditors | Mostly Strict | ~97 |

Apply a bundle to a workspace:

POST /api/v1/workspaces/{id}/apply-preset
{ "bundle": "production_team" }

The Observer UI provides a preset selector on the Detection Rules tab: a dropdown for per-auditor tiers and a modal for quick-start bundles, with a confirmation dialog that previews what will change before applying.


Claim Flow Navigation

The auditor detail page uses a Claim Flow navigation bar that represents the three-stage data flow through an auditor. Each segment is clickable and activates the corresponding tab below.

  Detection Rules ──produces──► Claims ──evaluated by──► Response Rules

| Segment | Color | Purpose |
|---------|-------|---------|
| Detection Rules | Blue/Indigo | Configure what the auditor observes and how sensitively |
| Claims | Neutral/Gray | View the claims this auditor produces (read-only reference) |
| Response Rules | Amber/Orange | Define actions when claims exceed thresholds |
  • Detection Rules (formerly "Settings") control thresholds, entity lists, scan targets, and feature flags
  • Claims shows a table of all claim names, types, and descriptions produced by the auditor
  • Response Rules (formerly "Policy") contain the Cedar/IFTTT rules that map claim observations to decisions (deny, warn, redact, escalate)

Deep linking is supported via the ?tab=detection|claims|response query parameter.


Policy Version History

Every change to an AuditorPolicy (including detection overrides and Cedar response rules) is recorded via policy versioning. Each version captures:

  • Who made the change (user ID, email)
  • When the change was made (timestamp)
  • Policy scope (org, workspace, or agent)
  • Detection overrides and Cedar rules at that version
  • Enforcement modes at the time of change

Policy versions are queryable via the Verifier API. The Observer UI displays the version history as a timeline view accessible from the Detection Rules tab.


Self-Hosted Auditor Registration

Organizations can register custom ClaimsAuditors that run on their own infrastructure. Registered auditors appear in the catalog, the agent creation wizard, and settings panels (auto-generated from @claims parameter metadata via the /vocabulary endpoint). Three deployment modes are supported:

| Mode | Trust Tier | Description |
|------|------------|-------------|
| Sidecar | TEE-attested (highest) | Same pod, operator-injected |
| In-cluster | mTLS-verified | Customer's K8s, separate pod |
| External | mTLS or API-key | Customer's infrastructure |

The passport displays trust tier per component, so downstream verifiers can see where each piece of the chain ran and at what trust level.

See the Auditor Development Guide for details on building and registering custom auditors.


What's Next?