01 What a data control plane is and where it sits
An AI data control plane is a layer that sits between your data and the model and governs which records and fields a given caller may send into a prompt. It inspects every record and field at runtime, before the model sees it, applies a policy to pass or redact each field by role, and sends only the approved context onward. It is the data-governance equivalent of a control plane: it does not generate answers; it decides what data is allowed to reach the thing that does.
The distinction worth drawing early is between this and a model gateway. A model gateway sits in front of model providers and manages traffic: routing, retries, caching, rate limits, and observability for the API calls themselves. A data control plane sits one step earlier, in front of the data, and governs the content of the request rather than the mechanics of the call. The two are complementary, but they solve different problems. This page is about the data layer: the control that ensures the prompt contains only what the caller is entitled to. It is one part of the broader practice of AI data governance for LLM applications.
The reason a dedicated layer is needed is that a model has no concept of roles, ownership, or clearance, and it will use everything placed in its context. So the question of what data an AI application may use cannot be answered inside the model. It has to be answered before the data reaches the model, by something that understands identity, policy, and the structure of the data.
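To make the pass-or-redact decision concrete, here is a minimal sketch of a per-field policy evaluated by role. It is an illustration under stated assumptions, not the product's API: the `POLICY` table, the role names, the `govern_record` function, and the default-deny rule for fields the policy does not mention are all hypothetical.

```python
from dataclasses import dataclass

REDACTED = "[REDACTED]"

# Hypothetical policy: field name -> roles cleared to see that field.
POLICY = {
    "name": {"support", "billing", "admin"},
    "email": {"support", "admin"},
    "card_last4": {"billing", "admin"},
    "ssn": {"admin"},
}


@dataclass(frozen=True)
class Verdict:
    field: str
    decision: str  # "pass" or "redact"


def govern_record(record: dict, role: str, policy: dict = POLICY):
    """Return (governed_record, verdicts) for one record and one caller role.

    Assumption: fields with no policy entry are redacted (default deny).
    """
    governed, verdicts = {}, []
    for field, value in record.items():
        allowed = role in policy.get(field, set())
        governed[field] = value if allowed else REDACTED
        verdicts.append(Verdict(field, "pass" if allowed else "redact"))
    return governed, verdicts


record = {"name": "Ada", "email": "ada@example.com", "ssn": "000-00-0000", "notes": "vip"}
governed, verdicts = govern_record(record, role="support")
```

The decision depends only on the caller's role and the field name, so the same input always yields the same verdict, and the model is never consulted.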
02 The integration shape: gateway or SDK
A data control plane integrates in one of two shapes. Both enforce the same policy and produce the same evidence; the difference is where the inspection step physically sits in your topology.
- As a gateway (reverse proxy). The control plane runs as a proxy in the data path. You point your retriever or API client at it, and it inspects and redacts records in line before they continue to your application. This shape is transparent to application code and gives you a single chokepoint, which suits teams that want one enforcement point without threading the control through every call site.
- As an SDK or API client. Your application calls the control plane directly, passing the retrieved records in and receiving the governed result back. This embeds the inspection step explicitly in your pipeline, which suits teams that want precise control over exactly when records are evaluated, or that already mediate retrieval through their own service.
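The two shapes can be sketched side by side. This is a simplified, in-process model written under assumptions: `govern`, `FIELD_LEVEL`, and the integer clearance levels are hypothetical, and a real gateway would be a network proxy rather than a Python wrapper. What it shows is that both shapes apply the same decision and differ only in who invokes it.

```python
from typing import Callable

Record = dict
Retriever = Callable[[str], list[Record]]

# Hypothetical policy: minimum clearance level required to see each field.
FIELD_LEVEL = {"title": 0, "team": 0, "salary": 2}


def govern(records: list[Record], clearance: int) -> list[Record]:
    """Stand-in for the control plane: withhold fields above the caller's clearance.

    Fields missing from the policy are treated as unreachable (default deny).
    """
    return [
        {k: v for k, v in r.items() if FIELD_LEVEL.get(k, float("inf")) <= clearance}
        for r in records
    ]


def context_with_sdk(question: str, retriever: Retriever, clearance: int) -> list[Record]:
    """SDK shape: the application calls the governance step explicitly."""
    records = retriever(question)
    return govern(records, clearance)


def gateway(retriever: Retriever, clearance_for: Callable[[], int]) -> Retriever:
    """Gateway shape: wrap the retriever so callers are governed without code changes."""
    def governed_retriever(query: str) -> list[Record]:
        return govern(retriever(query), clearance_for())
    return governed_retriever


def fake_retriever(query: str) -> list[Record]:
    return [{"title": "Engineer", "salary": 100}]


via_sdk = context_with_sdk("who is on the team?", fake_retriever, clearance=0)
via_gateway = gateway(fake_retriever, clearance_for=lambda: 0)("who is on the team?")
```

In the gateway form the application keeps calling what looks like its ordinary retriever; in the SDK form the call site decides exactly when records are evaluated.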
On the wire, the transport can be REST, gRPC, or streaming, so the control plane fits both request-response and token-streaming designs. Authentication of the caller is handled with API keys, mutual TLS, or OIDC against your identity provider, which is what lets the policy decision be made against the caller's real clearance rather than a single shared service identity. The exact endpoints, client libraries, and parameters are part of the developer documentation available to design partners; the shapes above are the public integration surface.
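As a rough illustration of deciding against the caller's clearance rather than a shared service identity, the sketch below maps identity claims to a clearance level. It assumes the claims come from a token your identity provider has already verified; the `groups` claim name, the `GROUP_CLEARANCE` table, and `clearance_from_claims` are hypothetical and stand in for whatever your provider and policy actually define.

```python
# Hypothetical mapping from identity-provider groups to clearance levels.
GROUP_CLEARANCE = {"staff": 1, "managers": 2, "hr-admins": 3}


def clearance_from_claims(claims: dict) -> int:
    """Return the highest clearance granted by any of the caller's groups.

    Assumption: `claims` was extracted from an already-verified token.
    Callers with no recognized group get clearance 0.
    """
    groups = claims.get("groups", [])
    return max((GROUP_CLEARANCE.get(g, 0) for g in groups), default=0)


end_user = {"sub": "user-42", "groups": ["staff", "managers"]}
shared_service = {"sub": "rag-service"}  # no end-user context attached

user_clearance = clearance_from_claims(end_user)
service_clearance = clearance_from_claims(shared_service)
```

A request that carries only the service identity resolves to the lowest clearance here, which is the safe default when the end user cannot be established.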
| Integration question | Answer |
|---|---|
| How does it deploy? | As a gateway (reverse proxy) in the data path, or called in-process via an SDK or API client. |
| What transports are supported? | REST, gRPC, or streaming, so it fits request-response and token-streaming pipelines. |
| How is the caller authenticated? | API keys, mutual TLS, or OIDC against your own identity provider. |
| Where does the data plane run? | Inside your environment. Records are inspected where they live and never leave your boundary for the decision. |
| Which models does it support? | Any. It governs the data before the prompt, so OpenAI, Anthropic, Gemini, and self-hosted models all work the same way. |
| What does it leave behind? | A signed, content-free, hash-chained evidence entry for every pass or redact decision. |
03 Where it sits in a RAG pipeline
In a retrieval-augmented generation pipeline, the control plane sits after retrieval and before the model call. The cleanest way to see it is as a five-step flow.
- Authenticate the caller. Resolve who the request is on behalf of and what they are cleared to see, using API keys, mTLS, or OIDC. The clearance, not just the service identity, is the input to every later decision.
- Retrieve context. Run your normal retrieval against your vector store or system of record to gather the candidate records for the question. This step is unchanged from a standard RAG design.
- Inspect and redact per field by role. Pass the candidate records through the control plane, which evaluates every record and field at runtime and returns a per-field pass or redact verdict against the caller's clearance. Fields above clearance are withheld; the useful remainder passes through.
- Send only approved context to the model. Assemble the prompt from the approved fields only, then call the model. Because unauthorized fields never enter the context, the model cannot leak what it never received.
- Seal the decision as evidence. Record a signed, content-free, hash-chained entry of which caller was allowed which fields, under which policy, so the decision can be verified offline later.
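The five steps above can be sketched end to end. Everything named here is a hypothetical stand-in: `CALLERS` plays the identity provider, `STORE` the retriever, `FIELD_ROLES` the policy, and `inspect_and_redact` the governance call, whose real API is not public. The evidence in step 5 is left unsigned and unchained to keep the sketch short.

```python
from typing import Callable

# Hypothetical stand-ins for the identity provider, the policy, and the data store.
CALLERS = {"token-analyst": {"sub": "u1", "role": "analyst"}}
FIELD_ROLES = {"account": {"analyst", "auditor"}, "balance": {"auditor"}}
STORE = [{"account": "ACME", "balance": 1200}]


def authenticate(token: str) -> dict:
    return CALLERS[token]  # raises KeyError for unknown callers


def retrieve(question: str) -> list[dict]:
    return STORE  # your normal retrieval, unchanged


def inspect_and_redact(records: list[dict], role: str):
    """Placeholder for the governance call: per-field pass or redact by role."""
    approved, verdicts = [], []
    for i, rec in enumerate(records):
        kept = {}
        for field, value in rec.items():
            ok = role in FIELD_ROLES.get(field, set())
            verdicts.append({"record": i, "field": field, "decision": "pass" if ok else "redact"})
            if ok:
                kept[field] = value
        approved.append(kept)
    return approved, verdicts


def answer(token: str, question: str, call_model: Callable[[str], str]):
    caller = authenticate(token)                                          # 1. authenticate
    candidates = retrieve(question)                                       # 2. retrieve
    approved, verdicts = inspect_and_redact(candidates, caller["role"])   # 3. inspect and redact
    prompt = f"Context: {approved}\nQuestion: {question}"                 # 4. approved context only
    reply = call_model(prompt)
    evidence = {"caller": caller["sub"], "verdicts": verdicts}            # 5. field names, no values
    return reply, evidence


# An echo "model" makes it easy to see exactly what the model would receive.
reply, evidence = answer("token-analyst", "What is ACME's balance?", call_model=lambda p: p)
```

With the echo model, `reply` is the prompt itself: it contains the account name but not the balance, because the analyst role is not cleared for that field.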
The governance step is described here by its position rather than by its API, and that is deliberate. The exact API for the governance call is available to design partners; what is public and stable is the position of the step. It belongs between retrieval and the model, where it can filter at the record level and redact at the field level before the context is built. For the retrieval-side mechanics specifically, see permission-aware RAG.
04 Where it sits in an AI agent or tool-use flow
Agents complicate the picture because they make many calls, not one. A tool-using agent may query several systems, call functions, and feed each result back into its own context across multiple turns. Every one of those tool results is a place where sensitive data can enter the prompt, so the governance step has to sit on the tool boundary, not just at a single retrieval.
The principle is the same as RAG, applied at every point where data flows back to the model. When a tool returns records, those records pass through the control plane before they reach the agent's context, so each tool result is inspected and redacted by the caller's clearance. In the gateway shape, this is natural: tool calls route through the proxy and are governed in line. In the SDK shape, the inspection step wraps each tool's return path. Either way, the agent only ever reasons over data the caller is entitled to, on every turn, and each of those decisions is sealed as evidence so a multi-step agent run leaves a complete, verifiable trail.
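In the SDK shape, wrapping each tool's return path might look like the sketch below. It is illustrative only: `govern_tool`, `FIELD_ROLES`, the role names, and the `trail` list (a plain in-memory stand-in for sealed evidence) are assumptions, not part of any published interface.

```python
import functools

# Hypothetical policy: field name -> roles cleared to see it.
FIELD_ROLES = {"ticket_id": {"agent", "lead"}, "customer_email": {"lead"}}


def govern_tool(tool, role: str, trail: list):
    """Wrap a tool so its results are governed before reaching the agent's context."""
    @functools.wraps(tool)
    def wrapper(*args, **kwargs):
        governed = []
        for rec in tool(*args, **kwargs):
            kept = {}
            for field, value in rec.items():
                ok = role in FIELD_ROLES.get(field, set())
                # Record the decision by tool and field name only, never the value.
                trail.append((tool.__name__, field, "pass" if ok else "redact"))
                if ok:
                    kept[field] = value
            governed.append(kept)
        return governed
    return wrapper


def search_tickets(query: str) -> list[dict]:
    return [{"ticket_id": 7, "customer_email": "a@example.com"}]


trail: list = []
tools = {"search_tickets": govern_tool(search_tickets, role="agent", trail=trail)}
result = tools["search_tickets"]("refund")
```

Because the agent only ever sees the wrapped tools, every turn that calls a tool adds its decisions to the trail, whichever tool is used and however many turns the run takes.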
05 Model-agnostic across OpenAI, Anthropic, and Gemini
Because the control governs data before the prompt is assembled, it is independent of which model receives the prompt. The same policy and the same redaction step apply whether the approved context then goes to OpenAI, Anthropic's Claude, Google Gemini, or a model you host yourself. The provider sits downstream of the control, so it never participates in the data decision.
That independence has a practical payoff. You can switch providers, run an A/B test across two of them, or route different workloads to different models without touching how data is governed. There is no per-provider rework, because the control was never tied to a provider in the first place. A control built into one vendor's API would have to be rebuilt for the next; a data-layer control built before the prompt does not. For the full treatment of this property, see model-agnostic AI data protection.
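A small sketch shows why the provider drops out of the data decision. The providers here are stub functions rather than real SDK calls, and `build_governed_prompt` and its `allowed_fields` argument are hypothetical; the point is only that the governed prompt is built once, before any provider is chosen.

```python
from typing import Callable

ModelFn = Callable[[str], str]


def build_governed_prompt(question: str, records: list[dict], allowed_fields: set) -> str:
    """Build the prompt from approved fields only, with no knowledge of the provider."""
    context = [{k: v for k, v in r.items() if k in allowed_fields} for r in records]
    return f"Context: {context}\nQuestion: {question}"


seen: dict = {}


def make_provider(name: str) -> ModelFn:
    """Stub standing in for a provider SDK call; it records the prompt it receives."""
    def call(prompt: str) -> str:
        seen[name] = prompt
        return f"{name} answered"
    return call


providers = {name: make_provider(name) for name in ("provider_a", "provider_b")}

records = [{"title": "Engineer", "salary": 100}]
prompt = build_governed_prompt("Who is on the team?", records, allowed_fields={"title"})

# Switching or mixing providers changes only this loop, never the governance step.
replies = {name: call(prompt) for name, call in providers.items()}
```

Both stubs receive a byte-identical prompt, which is what makes provider switching or A/B routing a downstream concern.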
06 Deployment modes and data residency
Where the control plane runs matters as much as what it does, because the whole point is to govern data without moving it. The architecture separates the two planes for exactly this reason.
The data plane runs inside your environment. Records are inspected and redacted where they live, and they never cross your boundary for the decision to be made. The managed control plane receives only content-free verdict evidence, the signed record of what was allowed or withheld, never the underlying data itself. For teams that need everything inside their own perimeter, self-managed, on-premises, and air-gapped deployments are available, along with a FIPS build. The result is that adding governance does not add a new place where your sensitive data lives.
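To illustrate what content-free, signed, hash-chained evidence could look like, here is a minimal sketch using only the Python standard library. The entry layout, the `seal` and `verify` functions, and the use of HMAC-SHA256 as the signature are assumptions made for the example; the source does not specify the actual format or signing scheme, and a real deployment may well use asymmetric signatures.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64
BODY_KEYS = ("caller", "policy", "verdicts", "prev_hash")


def _payload(body: dict) -> bytes:
    # Canonical JSON so the same body always hashes to the same value.
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode()


def seal(chain: list, key: bytes, caller: str, policy_id: str, verdicts: dict) -> dict:
    """Append an entry holding field names and decisions only, never field values."""
    prev = chain[-1]["entry_hash"] if chain else GENESIS
    body = {"caller": caller, "policy": policy_id, "verdicts": verdicts, "prev_hash": prev}
    payload = _payload(body)
    entry = dict(
        body,
        entry_hash=hashlib.sha256(payload).hexdigest(),
        signature=hmac.new(key, payload, hashlib.sha256).hexdigest(),
    )
    chain.append(entry)
    return entry


def verify(chain: list, key: bytes) -> bool:
    """Check each entry's hash, signature, and link to its predecessor, offline."""
    prev = GENESIS
    for entry in chain:
        payload = _payload({k: entry[k] for k in BODY_KEYS})
        expected_sig = hmac.new(key, payload, hashlib.sha256).hexdigest()
        if (
            entry["prev_hash"] != prev
            or entry["entry_hash"] != hashlib.sha256(payload).hexdigest()
            or not hmac.compare_digest(entry["signature"], expected_sig)
        ):
            return False
        prev = entry["entry_hash"]
    return True


key = b"demo-key-not-for-production"
chain: list = []
seal(chain, key, "user-42", "policy-v1", {"name": "pass", "ssn": "redact"})
seal(chain, key, "user-42", "policy-v1", {"email": "pass"})
intact = verify(chain, key)
```

Because each entry embeds the hash of the one before it, altering or removing an earlier decision breaks verification for the chain, and nothing in an entry reveals the underlying data.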
07 How to get started
At the design stage, the decision is architectural, not code-level: settle on where the control sits and which shape fits your stack. Use this as a checklist.
- Decide the shape. Gateway (reverse proxy) for a transparent single chokepoint, or SDK / API client for explicit in-process control.
- Place the step after retrieval, before the model. In a RAG pipeline it sits at the prompt boundary; in an agent it sits on every tool return path.
- Carry the caller's clearance, not just a service identity. Authenticate with API keys, mTLS, or OIDC so decisions reflect who the request is really for.
- Keep the data plane in your environment. Records are inspected where they live; only content-free evidence leaves.
- Treat the model as downstream. Pick OpenAI, Anthropic, or Gemini freely; governance does not change with the provider.
- Plan for evidence from day one. Every pass or redact decision should be signed and hash-chained, so you can prove what was withheld.
The detailed developer documentation, API reference, and quickstart are available to design partners rather than published, because Custosa is early-stage and currently runs in production with design partners. If you are building an AI application on sensitive data and want the full integration path, request access below.
Put a data control plane in your stack
See where Custosa sits between your retriever and the model, inspecting every record and field at runtime and sending only the approved context into the prompt, the same way across OpenAI, Anthropic, and Gemini.
Frequently asked questions
What is a data control plane for AI?
A data control plane for AI is a layer that sits between your data and the model and governs which records and fields a given caller is allowed to send into a prompt. It inspects data at runtime before the model sees it, applies a deterministic policy to pass or redact each field by role, and produces signed evidence of the decision. It is distinct from a model gateway that load-balances or caches calls: the control plane governs the data, not the traffic, so the model only ever receives what the caller is entitled to.
Should I integrate as a gateway or an SDK?
Both shapes enforce the same policy; the choice is about where the control sits in your topology. A gateway, or reverse proxy, is transparent: you point your retriever or API client at it and it inspects and redacts in line, which suits teams that want a single chokepoint without touching application code. An SDK or API client embeds the inspection step directly in your pipeline, which suits teams that want explicit control over exactly when records are evaluated. Many deployments use the gateway as the default and reach for the in-process path where they need finer control.
Does it work with OpenAI, Anthropic, and Gemini?
Yes. Because a data control plane governs the data before the prompt is built, it is model-agnostic by design. The same policy and the same redaction step apply whether the approved context is then sent to OpenAI, Anthropic's Claude, Google Gemini, or a self-hosted model. The model provider is downstream of the control, so switching or mixing providers does not change how data is governed and does not require redoing the controls.
Where does it sit in a RAG pipeline?
It sits after retrieval and before the model call. The flow is: authenticate the caller, retrieve candidate context as usual, pass that context through the control plane so each field is inspected and redacted by role, then send only the approved fields into the prompt, and finally seal the decision as evidence. Placing it at this boundary means unauthorized fields never enter the context, so the model cannot leak what it never received.
Does my data leave my environment?
No. The data plane runs inside your environment, so records are inspected and redacted where they live and never cross your boundary for the decision. The managed control plane receives only content-free verdict evidence, never the underlying data. Self-managed, on-premises, and air-gapped deployments are available, along with a FIPS build, for teams that need the entire system inside their own perimeter.