AI access control for LLMs and RAG

Updated June 2026 · 9 min read

When the consumer of your data is a model rather than a person in a UI, the access controls you built for human sessions stop applying. This guide explains why, and how to enforce least privilege on the records and fields a model is about to read.

01 What AI access control is

AI access control is the set of controls that decide which data an AI system, such as an LLM or a RAG pipeline, may receive on behalf of a given caller. It resolves the caller's identity and clearance, then governs what records and fields are allowed to enter the prompt. Its defining property is that it acts at the input: data the caller is not entitled to is withheld before the model ever sees it.

The discipline goes by a few names. LLM access control and AI authorization describe the same idea: applying the principle of least privilege to a non-human consumer that reads data and produces an answer. What makes it distinct from ordinary application security is the consumer. A model has no concept of roles, ownership, or clearance, and it will use everything placed in its context. Controlling its access is therefore not about hardening the model. It is about governing what reaches the model in the first place. This is one pillar of the wider discipline of AI data governance.

02 Why classic RBAC and ABAC break for models

Role-based and attribute-based access control were designed for a world of human users moving through an application. That world has assumptions an AI caller quietly violates.

  • There is no session-bound UI to gate each view. In a traditional app, every screen enforces its own rules, so access control is spread across the interface. A model bypasses the interface entirely and reads source data directly, so there is no screen left to do the gating.
  • One request fans out across many sources. A human clicks one record at a time. An AI caller retrieves from many systems in a single turn and fuses them into one answer, so a control scoped to a single application boundary never sees the full picture.
  • Relevance, not permission, drives retrieval. A similarity search returns the most relevant passage regardless of who is asking. Without an authorization filter, the most relevant chunk is often the most sensitive, and it flows straight into the context.
  • The caller is a service, not a person. The model usually runs under a service identity. Unless the original human's clearance is carried through to the data decision, every request looks identical and the system cannot tell a clinician from a contractor.

The principle of least privilege still holds. What changes is where you enforce it. RBAC and ABAC are not wrong; their enforcement point, the application session, is simply the one thing a model routes around. AI access control keeps the model of roles and clearances but moves the decision to the data the model is about to consume.

03 Identity and clearance for AI callers

Every access decision starts with a question: on whose behalf is this request being made, and what is that principal cleared to see. For AI callers, answering it takes deliberate work.

The model typically acts under a service account, but the request originated with a person, an agent, or another system that has a real clearance. Effective AI access control resolves that originating principal and binds the request to it, so the decision is made against the caller's actual rights rather than the broad permissions of the service. The clearance itself is best expressed not as a binary but as a level on a lattice. Custosa uses a five-level clearance lattice, which lets policy distinguish graduated sensitivity and map each role to exactly the tier of data it may read.

Once identity and clearance are resolved, they become the inputs to the policy decision for every record and field in the request. The rest of this guide is about where that decision is enforced and how it is made trustworthy.
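
As a rough illustration of resolving the originating principal and comparing clearance levels, the sketch below models the lattice as a simple total order of five levels. The level names, the role mapping, and the `resolve_principal` and `may_read` helpers are all hypothetical and are not Custosa's actual levels or API.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Clearance(IntEnum):
    """Five ordered levels. The names are illustrative assumptions."""
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3
    SECRET = 4


@dataclass(frozen=True)
class Principal:
    subject: str
    role: str
    clearance: Clearance


# Hypothetical role-to-clearance mapping; a real one would come from policy.
ROLE_CLEARANCE = {
    "contractor": Clearance.INTERNAL,
    "analyst": Clearance.CONFIDENTIAL,
    "clinician": Clearance.RESTRICTED,
}


def resolve_principal(service_identity: str, on_behalf_of: Optional[dict]) -> Principal:
    """Bind the request to the originating caller, not the service account."""
    if not on_behalf_of:
        # No originating caller was carried through: fall back to the lowest level.
        return Principal(service_identity, "service", Clearance.PUBLIC)
    role = on_behalf_of.get("role", "")
    # Unknown roles also get the lowest level rather than the service's rights.
    clearance = ROLE_CLEARANCE.get(role, Clearance.PUBLIC)
    return Principal(on_behalf_of.get("sub", "unknown"), role, clearance)


def may_read(principal: Principal, sensitivity: Clearance) -> bool:
    """A principal may read data at or below its own clearance level."""
    return principal.clearance >= sensitivity
```

With this mapping, a request carried on behalf of a clinician can read RESTRICTED data while one on behalf of a contractor cannot, even though both arrive through the same service identity.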

04 Access control at retrieval versus at output

The single most consequential design choice is where the control sits. There are three candidate points, and they are not equivalent.

| Enforcement point | What it does | Why it is or is not enough |
|---|---|---|
| At retrieval | Filters candidate documents by the caller's clearance before they are ranked, so unauthorized records are never fetched. | Necessary. Stops whole-document exposure where a relevant but restricted file is surfaced to the wrong caller. |
| At the prompt boundary | Evaluates each remaining field and redacts anything above clearance before the context is assembled. | Necessary. Catches the sensitive field inside an otherwise-permitted record that document-level filtering misses. |
| At the output | Inspects the model's answer after generation and tries to strip anything sensitive. | Insufficient alone. It is detection, not prevention; it fails open and cannot see the authorization context. |

The conclusion is to enforce at the input, in both senses. Filter at retrieval so unauthorized records are never fetched, then redact at the prompt boundary so any sensitive field that remains is masked before the context is built. If unauthorized content never enters the context, the model cannot leak it. Output checks can serve as a backstop, but they cannot be the load-bearing control. For the retrieval-side mechanics specifically, see permission-aware RAG; for the disclosure paths a leak can take, see LLM data leakage.
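
The retrieval-side half of that rule can be sketched in a few lines. This is a minimal illustration, assuming each candidate carries a numeric sensitivity label and a relevance score; the `Doc` type and `retrieve` function are hypothetical. In a real vector store the filter would normally be pushed into the query itself so restricted documents are never loaded at all.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    doc_id: str
    sensitivity: int   # clearance level required to read, 0 (lowest) to 4 (highest)
    score: float       # relevance score from a similarity search


def retrieve(candidates: list[Doc], caller_clearance: int, k: int = 3) -> list[Doc]:
    """Apply the authorization filter first, then rank what remains."""
    permitted = [d for d in candidates if d.sensitivity <= caller_clearance]
    return sorted(permitted, key=lambda d: d.score, reverse=True)[:k]
```

Given a highly relevant document that requires level 4, a caller at level 1 never sees it in the ranked list, no matter how high its score is.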

05 Field-level authorization (per-field PASS and REDACT)

Authorization is rarely all-or-nothing at the document level. A record can be mostly safe yet contain one field a given role may not see, so the right granularity is the field, not the file.

Field-level authorization evaluates each field of a record independently and returns a per-field verdict, pass or redact, against the caller's clearance. The fields that exceed clearance are masked or removed; the useful remainder still reaches the model. The same record therefore yields a different, correct view for each role. A clinician sees the diagnosis but not the identifier; an analyst sees the aggregate but not the material non-public detail. Because the sensitive field is withheld before the prompt is built, the model cannot leak what it never received.

This is the difference between a document-level allow or deny and a record that is reshaped per caller. Document-level filtering forces a binary: ship the whole record or none of it. Per-field redaction preserves utility, returning the parts the caller is entitled to while withholding the parts they are not, in the same response.
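
A minimal sketch of per-field verdicts follows, assuming each field has a numeric sensitivity label. The `authorize_fields` helper, the `[REDACTED]` placeholder, and the example field names are illustrative assumptions, not a documented interface.

```python
from enum import Enum


class Verdict(Enum):
    PASS = "pass"
    REDACT = "redact"


REDACTED = "[REDACTED]"


def authorize_fields(record: dict, field_levels: dict, caller_clearance: int):
    """Return (view, verdicts): the reshaped record and one verdict per field."""
    view, verdicts = {}, {}
    for name, value in record.items():
        required = field_levels.get(name)
        # A field with no sensitivity label is redacted rather than passed.
        if required is not None and required <= caller_clearance:
            verdicts[name] = Verdict.PASS
            view[name] = value
        else:
            verdicts[name] = Verdict.REDACT
            view[name] = REDACTED
    return view, verdicts


record = {"patient_id": "P-1", "diagnosis": "asthma", "ward": "B"}
levels = {"patient_id": 4, "diagnosis": 3, "ward": 1}
clinician_view, clinician_verdicts = authorize_fields(record, levels, caller_clearance=3)
```

Here a caller at level 3 receives the diagnosis and ward but a masked identifier, while a caller at level 1 would receive only the ward. Only the view, never the original record, would be placed in the prompt.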

06 Deterministic policy for explainable decisions

For the control to be trusted, the access decision itself must be reliable. The pass-or-redact verdict should come from a formal policy engine, not from a model or a heuristic.

Custosa uses a deterministic policy engine built on Cedar, a formal policy language. Determinism means the same inputs always produce the same verdict. That has three practical consequences. Decisions are explainable, because each one traces to a written rule rather than a probabilistic guess. They are reproducible, because the same request re-evaluates to the same outcome, which is what makes audit and testing meaningful. And they are defensible, because a deterministic rule can be reviewed, versioned, and reasoned about. A control that sometimes guesses is not a control; an authorization layer that asks another model to decide only moves the trust problem.
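
To make the idea of a traceable, repeatable verdict concrete, here is a small rule table in plain Python. It is not Cedar and not Custosa's engine; the rule IDs, the request shape, and the `decide` function are assumptions made for illustration. The evaluation order it uses (an explicit forbid wins over a permit, and no matching permit means deny) is one common convention for such engines.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    rule_id: str
    effect: str                        # "permit" or "forbid"
    applies: Callable[[dict], bool]    # pure function of the request


# Hypothetical written rules; each verdict cites the IDs that produced it.
RULES = (
    Rule("permit-within-clearance", "permit",
         lambda r: r["clearance"] >= r["field_level"]),
    Rule("forbid-contractor-identifiers", "forbid",
         lambda r: r["role"] == "contractor" and r["field"] == "patient_id"),
)


def decide(request: dict) -> tuple[str, list[str]]:
    """Return (verdict, rule IDs). No randomness, clock, or model call is involved."""
    matched = [rule for rule in RULES if rule.applies(request)]
    forbids = [rule.rule_id for rule in matched if rule.effect == "forbid"]
    if forbids:
        return "redact", forbids       # an explicit forbid overrides any permit
    permits = [rule.rule_id for rule in matched if rule.effect == "permit"]
    if permits:
        return "pass", permits
    return "redact", ["default-deny"]  # nothing permitted the read
```

Because `decide` depends only on its input and a fixed rule table, re-running it on a logged request reproduces the original verdict and the rule that explains it.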

07 Fail-closed defaults

A good access control fails safely. When the policy engine cannot reach a confident verdict, for example because a record is missing the metadata a rule needs, the safe default is to withhold rather than to allow.

Custosa fails closed: it blocks when it cannot reach a verdict. This is the opposite of output filtering, which fails open by letting anything it does not recognize pass through. Failing closed means an error, a gap in metadata, or an unanticipated case results in less data reaching the model, never more. For systems handling regulated data, that asymmetry is the point. The cost of a false withhold is a retry or a manual review; the cost of a false release can be a reportable disclosure.
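
The fail-closed default might look like the following sketch, where any missing or malformed metadata produces a redact verdict. The `decide_field` function and the `sensitivity` key are hypothetical names chosen for the example.

```python
def decide_field(field_meta, caller_clearance: int) -> str:
    """Return 'pass' only on a confident verdict; every other path is 'redact'."""
    try:
        required = field_meta["sensitivity"]   # KeyError if the label is missing
        if not isinstance(required, int) or isinstance(required, bool):
            raise TypeError("sensitivity label is not an integer level")
        return "pass" if caller_clearance >= required else "redact"
    except (KeyError, TypeError):
        # Missing or malformed metadata: withhold rather than allow.
        return "redact"
```

Note that the error path and the "above clearance" path produce the same outcome, so a gap in metadata can only reduce what reaches the model.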

08 Evidence of every access decision

Enforcement stops the exposure; evidence proves the enforcement happened. A system that makes access decisions on regulated data needs a durable record of which caller was allowed which records and fields, when, and under which policy, or it cannot answer to an auditor.

An application log is not enough, because a log you control is mutable and therefore only a claim. The stronger form is evidence that is signed with HMAC-SHA256, so each decision is attributable; hash-chained into an append-only ledger, so altering or removing any entry breaks the chain and is detectable; content-free, so it records that a field was redacted without copying the field itself; and offline-verifiable, so it can be checked independently without contacting Custosa. That combination turns "we enforced least privilege" into something a third party can confirm. For the full treatment of how that record is built, see content-free, tamper-evident evidence.
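
The general shape of such a record can be sketched with Python's standard `hmac` and `hashlib` modules. This is an illustrative ledger only, not Custosa's evidence format: the entry fields, the `append` and `verify` helpers, and the all-zero genesis value are assumptions, and a timestamp is omitted to keep the example short.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # placeholder "previous MAC" for the first entry


def _payload(entry: dict) -> bytes:
    # Content-free: field names and verdicts are recorded, never field values.
    body = {k: entry[k] for k in ("caller", "field", "verdict", "policy", "prev")}
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode()


def append(ledger: list, key: bytes, caller: str, field: str,
           verdict: str, policy: str) -> dict:
    """Add an entry whose MAC covers its body and the previous entry's MAC."""
    prev = ledger[-1]["mac"] if ledger else GENESIS
    entry = {"caller": caller, "field": field, "verdict": verdict,
             "policy": policy, "prev": prev}
    entry["mac"] = hmac.new(key, _payload(entry), hashlib.sha256).hexdigest()
    ledger.append(entry)
    return entry


def verify(ledger: list, key: bytes) -> bool:
    """Recompute every MAC and check each link; needs only the key, no network."""
    prev = GENESIS
    for entry in ledger:
        expected = hmac.new(key, _payload(entry), hashlib.sha256).hexdigest()
        if entry["prev"] != prev or not hmac.compare_digest(entry["mac"], expected):
            return False
        prev = entry["mac"]
    return True
```

Editing an entry or removing one from the middle makes `verify` return False. This sketch does not detect truncation of the most recent entries; recording the latest MAC somewhere independent would be one way to cover that case.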

09 An AI access control implementation checklist

Use this as a baseline when designing or reviewing access control for an LLM or RAG deployment that touches sensitive data.

  • Resolve the originating principal, not just the service identity, so decisions reflect the caller's real clearance.
  • Express clearance as a lattice, not a binary, so policy can map each role to graduated sensitivity.
  • Filter at retrieval by clearance before ranking, so unauthorized records are never fetched.
  • Redact at the field level before the prompt, so a mostly-safe record does not leak its one sensitive field.
  • Decide with a deterministic policy engine, so every verdict is explainable and reproducible.
  • Fail closed when a verdict cannot be reached, so errors reduce exposure rather than increase it.
  • Do not rely on output filtering as the primary control; treat it only as a backstop.
  • Record signed, content-free, hash-chained evidence for every pass or redact decision.
  • Keep the data plane in your environment, so records are evaluated where they live and never leave for the decision.

Frequently asked questions

What is AI access control?

AI access control is the set of controls that decide which data an AI system, such as an LLM or a RAG pipeline, is allowed to receive on behalf of a given caller. It resolves the caller's identity and clearance, then governs what records and fields may enter the prompt. Unlike a content filter on the answer, it acts at the input: data the caller is not entitled to is withheld before the model ever sees it, so the model cannot leak what it never received.

How is AI access control different from normal RBAC?

Classic RBAC and ABAC assume a human user inside an application session, clicking through screens that each enforce their own rules. An AI caller has none of that. It fans out across many sources in one request, has no session-bound UI to gate each view, and assembles a single answer from everything it retrieved. AI access control keeps the principle of least privilege but moves enforcement to the data the model is about to consume, evaluating every record and field at runtime rather than trusting an application boundary that the model bypassed.

Where should access control sit in an AI pipeline?

In two places at once. The first is retrieval: filter candidate documents by the caller's clearance before they are ranked, so unauthorized records are never fetched. The second is the prompt boundary: evaluate each remaining field and redact anything above clearance before the context is assembled. Enforcing only at the output is too late, because by then the data is already in the model's context. Enforcing at the input means the prompt contains only what the caller is entitled to, by construction.

What is field-level authorization?

Field-level authorization evaluates each field of a record independently and returns a per-field verdict, pass or redact, based on the caller's clearance. Authorization is rarely all-or-nothing at the document level: a record can be mostly safe yet hold one field a given role may not see. Per-field decisions mask only the parts that exceed clearance and let the useful remainder through, so the same record yields a different, correct view for each role without leaking its sensitive fields.

Can you audit AI access decisions?

Yes, if the system is built to produce evidence rather than just logs. A trustworthy record signs every pass or redact decision, hash-chains the entries so altering one breaks the chain, and keeps the record content-free so it proves what was withheld without copying the sensitive data. Because it can be verified offline without contacting the vendor, an auditor can confirm independently which caller was allowed which fields, when, and under which policy.