AI Infrastructure · 2026-04-01

Human Values API — How to Embed Principles into AI Systems

By R. Dustin Henderson, PhD

The System Prompt Is Not a Governance Layer

Every developer building production AI has done this: you need the model to behave a certain way — be honest but tactful, prioritize safety over conversion, stay conservative in regulated contexts — so you write it into the system prompt.

It feels like a solution. It is not.

System prompts are ephemeral text strings sitting in a context window. They do not persist across agents in a multi-agent workflow. They do not surface value conflicts when two principles collide. They produce no audit log. Under adversarial pressure, they collapse.

More precisely: there is no such thing as prompt-based values governance. There is only the appearance of it — right up until the moment it fails in front of a user, a regulator, or a board.

What you actually need is a human values API: a dedicated runtime service that holds your principles, evaluates AI-generated outputs against them before delivery, and returns alignment scores, conflict maps, and suggested revisions — all in under 100ms.


What a Human Values API Is

A human values API is not a content filter. It is not fine-tuning. It is not Constitutional AI, which operates at training time and is embedded in the model itself.

It is a runtime evaluation layer that sits between generation and delivery. You POST a generated response. You get back:

  • An alignment score (0–1)
  • Active tensions — which values are in conflict in this specific context
  • A resolution hint — how to navigate the conflict
  • An audit log ID — a permanent record of the reasoning
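Concretely, a response might be shaped like this. The field names mirror the list above; the exact payload structure and example values are assumptions, not the documented wire format:

```python
from typing import TypedDict

class Tension(TypedDict):
    values: list[str]      # the two principles in conflict
    severity: float        # 0-1, how sharply they collide in this context

class AlignmentResult(TypedDict):
    alignment_score: float       # 0-1, higher = better aligned
    active_tensions: list[Tension]
    resolution_hint: str         # how to navigate the conflict
    audit_log_id: str            # permanent record of the reasoning

# A hypothetical response body from the alignment endpoint
result: AlignmentResult = {
    "alignment_score": 0.72,
    "active_tensions": [
        {"values": ["truthfulness", "kindness"], "severity": 0.4}
    ],
    "resolution_hint": "Lead with the honest assessment, then soften the delivery.",
    "audit_log_id": "audit-7f3c",
}

assert 0.0 <= result["alignment_score"] <= 1.0
```

Typing the response up front means your gating logic fails loudly at review time rather than silently at runtime.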

The key insight: values should be infrastructure, not configuration text. The difference is the same as that between a database and a CSV file. Both hold data. Only one gives you versioning, queries, transactions, and audit trails.


Integrating a Human Values API into Your LLM Pipeline

Step 1: Define Your Values Schema

Values are encoded as structured, versioned objects — not free text:

{
  "value_id": "honesty-over-harmony",
  "priority_weight": 0.85,
  "tension_pair": "truthfulness vs. kindness",
  "domain": ["customer-service", "health"],
  "version": "1.2.0"
}

This is not a prompt. It is a data structure. It was reviewed by specific humans on a specific date. It has a domain scope. You can diff it, audit it, and version-control it like any other infrastructure artifact.
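To make "data structure, not prompt" concrete, here is a minimal sketch of loading and validating a value object in code. The field names come from the JSON above; the dataclass, validation rules, and helper are illustrative assumptions:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ValueSpec:
    value_id: str
    priority_weight: float   # 0-1, relative importance when values collide
    tension_pair: str        # the named conflict this value navigates
    domain: list[str]        # where this value is in scope
    version: str             # semver, so changes are diffable and auditable

    def __post_init__(self):
        if not 0.0 <= self.priority_weight <= 1.0:
            raise ValueError("priority_weight must be in [0, 1]")

honesty = ValueSpec(
    value_id="honesty-over-harmony",
    priority_weight=0.85,
    tension_pair="truthfulness vs. kindness",
    domain=["customer-service", "health"],
    version="1.2.0",
)

def applies_to(value: ValueSpec, domain: str) -> bool:
    """A value is only active inside its declared domain scope."""
    return domain in value.domain

assert applies_to(honesty, "health")
assert not applies_to(honesty, "finance")
```

Because the object is frozen and versioned, a change to `priority_weight` shows up in a diff and a review, not in a silently edited prompt string.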

Step 2: Score at Runtime

After generation and before delivery, POST to the values endpoint:

import httpx  # assumed HTTP client; the await/json/timeout shape matches httpx

async with httpx.AsyncClient() as client:
    result = await client.post(
        "https://api.trucontext.ai/v1/values/align",
        json={
            "content": response_text,
            "values_profile": "org-default",
            "return_guidance": True
        },
        timeout=0.1  # 100ms hard cap
    )
    result.raise_for_status()

The API returns an alignment score and active tensions in under 100ms — less than 10% of a typical LLM inference call.

Step 3: Gate, Regenerate, or Monitor

Three patterns for production use:

  • Inline gate: If the score falls below your threshold, regenerate with the suggested revision as system context.
  • Shadow mode: Fire the alignment check asynchronously. Do not block delivery. Build the audit trail silently, surface drift patterns in the dashboard.
  • Multi-agent governance: Run checks at the orchestration layer — not just on output edges, but on task delegation itself.
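The inline-gate pattern can be sketched end to end. The helper names (`generate`, `check_alignment`), the threshold, and the retry count are all assumptions; the stubs stand in for your LLM call and the alignment endpoint:

```python
import asyncio

ALIGNMENT_THRESHOLD = 0.8  # assumed org policy; tune per domain
MAX_RETRIES = 2

# Stubs so the control flow is runnable; real versions would call your
# model and POST to the values alignment endpoint.
async def generate(context: str) -> str:
    return "draft grounded in: " + context

async def check_alignment(candidate: str) -> dict:
    # Pretend candidates regenerated with guidance pass the gate
    score = 0.9 if "[Revision guidance]" in candidate else 0.6
    return {
        "alignment_score": score,
        "resolution_hint": "State the limitation plainly, then offer options.",
    }

async def gated_generate(prompt: str) -> str:
    """Generate, score against the values layer, regenerate on failure."""
    candidate = await generate(prompt)
    for _ in range(MAX_RETRIES):
        check = await check_alignment(candidate)
        if check["alignment_score"] >= ALIGNMENT_THRESHOLD:
            return candidate
        # Below threshold: feed the suggested revision back as system context
        revised = f"{prompt}\n\n[Revision guidance]: {check['resolution_hint']}"
        candidate = await generate(revised)
    return candidate  # retries exhausted; in production, log and fall back

result = asyncio.run(gated_generate("Explain our refund policy honestly."))
```

Shadow mode is the same call fired with `asyncio.create_task` instead of awaited inline, so delivery is never blocked while the audit trail accumulates.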

Why This Changes Your Architecture

When values live in infrastructure, the model's job inverts. It does not need to remember your principles. It generates candidates. The values layer judges them against a stable, auditable schema.

Generation and judgment are different jobs. Separating them is how you get reliability.

This also changes your compliance story. In healthcare, finance, or legal, "we instructed the model to behave ethically" is not a defensible answer to a regulatory inquiry. "We run every response through a values alignment API that logs the active principles and resolution logic" is a different story entirely. It is the difference between hoping and governing.


TruContext Values Oracle

TruContext implements this architecture as the Values Oracle API — a production runtime values service for AI pipelines.

What it includes:

  • Structured values schema management — typed, versioned, reviewed by humans
  • Sub-100ms alignment scoring via precomputed value embeddings
  • Tension detection with resolution guidance, not just pass/fail
  • Full audit log on every check
  • Multi-agent orchestration support
  • Values drift monitoring across sessions

The API is in private beta. TruContext is building it with a founding developer cohort — teams who have shipped agents in production and felt the subtle wrongness that better prompts do not fix.

Founding Developer Offer: The first 1,000 developers get 1M Ops free, direct API access, schema design sessions with the team, and founding pricing locked permanently.

If you have been hacking values into system prompts and watching them drift, there is a better architecture — and a cohort building it together.

Join the Founding Developer cohort → trucontext.ai

TruContext is the persistent values layer for AI systems.
