Custosa vs output filtering: Guardrails AI and NeMo Guardrails

Updated June 2026 · 6 min read

Output-filtering frameworks such as Guardrails AI and NVIDIA NeMo Guardrails validate and shape the text flowing into and out of a model. Custosa works one layer earlier, controlling which records and fields reach the model at all. This page sets out what each does well, where the gap is, and why the two are complementary rather than alternatives.

The short version: output-filtering frameworks shape what a model is allowed to say; Custosa controls what data the model is allowed to see. Guardrails AI and NeMo Guardrails add input and output rails around the model, validating prompts and completions as text with schema checks, regex and PII filters, and topical and safety rails. Custosa works one layer earlier, inspecting the source records field by field and withholding prohibited data before the prompt is ever built. A filter cannot make a model un-see a record it already received, which is why prevention at the data layer and filtering at the text layer solve different problems and pair well.

If you are evaluating a Guardrails AI alternative for stopping sensitive data from reaching a model, the honest framing is that these are not competing products. Output filtering is conversation control. Custosa is data control. The table below maps the two categories against the capabilities buyers most often conflate.

| Capability | Output filtering | Custosa |
| --- | --- | --- |
| Validate output format and schema | ✓ | Not its job |
| Topical and safety rails on responses | ✓ | Not its job |
| Regex and PII filters on prompt and output text | ✓ | |
| Re-ask or fix-up loops on failed validation | ✓ | No |
| Inspect source records before retrieval | No; sees prompt text | ✓ |
| Per-field PASS or REDACT by role | No | ✓ |
| Deterministic formal policy engine | No; heuristic rails | ✓ Cedar, fail-closed |
| Withhold data before the prompt is built | No; acts on assembled text | ✓ |
| Signed, hash-chained, content-free evidence | No | ✓ |
| Runs inside your environment; records never leave | Varies by deployment | ✓ Data plane in your boundary |

01 What output-filtering frameworks do well

Guardrails AI and NVIDIA NeMo Guardrails are open frameworks that wrap a model with programmable rails on the way in and the way out. They give application teams a structured way to constrain model behavior without hand-rolling validation for every call, and they are genuinely useful for keeping a production assistant reliable and on-topic.

  • Structured output validation. Guardrails AI lets you define an expected schema and validate completions against it, with composable validators and re-ask or fix-up loops when a response fails, so a model that returns malformed output is corrected rather than passed downstream.
  • Topical and dialogue rails. NeMo Guardrails uses Colang, its dialogue-modeling language, to define conversational flows and rails, keeping a bot within approved topics and steering it away from prohibited ones.
  • Safety and quality checks. Both frameworks support checks for toxicity, off-topic responses, and other quality rules on model output, and they can add input rails that screen a prompt before the model sees it.
  • Text-level PII and regex filters. Both can detect and mask common PII patterns and apply regex rules to prompt and response strings, which catches a class of obvious disclosures.
  • Defense at the conversation layer. Input rails can help screen for prompt-injection patterns and keep an application's behavior within bounds.
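
To make the validate-and-re-ask idea concrete, here is a minimal sketch in plain Python. It does not use the Guardrails AI or NeMo Guardrails APIs; the schema, the `validate` and `call_with_reask` helpers, and the stand-in model are all hypothetical and only illustrate the control flow.

```python
import json
from typing import Callable, Optional

# Hypothetical schema: required keys and their expected Python types.
REQUIRED_FIELDS = {"ticket_id": int, "summary": str}


def validate(completion: str) -> tuple[Optional[dict], str]:
    """Return (parsed, "") on success or (None, reason) on failure."""
    try:
        data = json.loads(completion)
    except json.JSONDecodeError as exc:
        return None, f"not valid JSON: {exc.msg}"
    if not isinstance(data, dict):
        return None, "top-level value must be an object"
    for key, expected in REQUIRED_FIELDS.items():
        if not isinstance(data.get(key), expected):
            return None, f"field {key!r} missing or not {expected.__name__}"
    return data, ""


def call_with_reask(model: Callable[[str], str], prompt: str, max_attempts: int = 3) -> dict:
    """Call the model, re-asking with the validation error until the output passes."""
    current = prompt
    for _ in range(max_attempts):
        parsed, reason = validate(model(current))
        if parsed is not None:
            return parsed
        current = f"{prompt}\n\nYour previous answer was rejected ({reason}). Return only valid JSON."
    raise ValueError(f"no valid completion after {max_attempts} attempts")


def fake_model_factory() -> Callable[[str], str]:
    """Stand-in for a real model client: fails once, then returns valid JSON."""
    replies = iter(["Sure! Here you go", '{"ticket_id": 42, "summary": "Reset password"}'])
    return lambda _prompt: next(replies)


result = call_with_reask(fake_model_factory(), "Summarize the ticket as JSON.")
print(result)
```

Note what the loop inspects: only the completion string. Nothing in it knows which records were used to build the prompt.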

These are valuable capabilities, and for shaping how a model converses they are the right tool. The question is what they can see about your data, and when they see it.

02 The gap: filtering cannot un-see data the model already received

Output filtering operates on text as it passes the model: a prompt going in, a completion coming back, both as strings. By the time an output rail runs, retrieval has already happened and the sensitive record has already been read and assembled into the prompt. The model received it. The filter is now trying to catch sensitive content on the way out, by pattern, after the fact. That ordering is the heart of the gap.

  • It cannot inspect the source records before retrieval. A rail sees the rendered prompt, not the underlying rows, documents, or fields pulled to build it. It cannot decide whether a particular record should have been retrieved for a particular user, because that decision was made upstream of the text it inspects.
  • It cannot give deterministic field-level verdicts by role. Text filters operate on string patterns, not on a model of the record's fields or the actor's role. They cannot reliably say "this diagnosis field is permitted for a clinician but must be withheld from a billing analyst," because the same string is sensitive in one context and benign in another.
  • It cannot produce signed, content-free evidence of access. A failed-validation log is operational, not a tamper-evident, offline-verifiable record of who was allowed to see which field and why.

The deeper issue is that post-hoc text filtering cannot un-see data the model already received. Once a prohibited record is in the prompt, it has influenced the model's internal state, and a string filter on the output is a probabilistic net, not a guarantee. Preventing leakage reliably means the prohibited data never reaches the model in the first place, which is a data-layer control, not a text-layer one. This is the same prevention-first logic behind sound RAG security.

An output rail can redact a Social Security number it recognizes in a completion. It cannot stop a model from reasoning over a salary field a support agent should never have been shown, because by the time the rail runs, the field is already in the prompt. Withholding it beforehand is a different kind of control.
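
The contrast can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular framework implements its PII rail: the SSN pattern is simplified, the record is invented, and the two completions are hand-written stand-ins for what a model might produce once the full record is in its prompt.

```python
import re

# Simplified US SSN pattern, for illustration only.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def output_rail(completion: str) -> str:
    """Mask anything in the completion that looks like an SSN."""
    return SSN_PATTERN.sub("[REDACTED-SSN]", completion)


# Hypothetical record that was already assembled into the prompt, salary included.
record = {"name": "Dana Reyes", "ssn": "123-45-6789", "salary": 184000}

# Hand-written completions standing in for model output.
echoes_ssn = "Dana's SSN on file is 123-45-6789."
uses_salary = "Dana is paid well above the team median, so a retention offer is unlikely to help."

print(output_rail(echoes_ssn))   # the SSN is masked
print(output_rail(uses_salary))  # unchanged: nothing here matches a pattern
```

The second completion contains no string a pattern could match, yet it is built on the salary field. The rail has nothing to catch.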

03 What Custosa does, and where the two overlap

Custosa is a runtime data-control plane for enterprise AI. Its data plane runs inside the customer's environment and inspects every record and field at runtime, before the model, so records never leave the boundary. A deterministic formal policy engine, built on Cedar rather than a model, evaluates each field against the actor's role using a five-level clearance lattice and issues a per-field PASS or REDACT verdict, withholding prohibited fields before the prompt is built. Because the engine is deterministic and fail-closed, the same inputs always produce the same decision, and an unresolved decision blocks rather than leaks. Every decision is signed with HMAC-SHA256 and hash-chained into an append-only, tamper-evident, content-free evidence ledger that is verifiable offline; the control plane receives only content-free verdict evidence.
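
As a rough illustration of per-field, fail-closed verdicts, the Python sketch below stands in for a policy engine. It is not Cedar and not Custosa's implementation. The five level names, the role and field tables, and the `verdict` and `inspect` helpers are hypothetical, and the lattice is modeled as a simple total order for brevity.

```python
from enum import IntEnum


class Clearance(IntEnum):
    # Hypothetical names for five ordered levels; the real lattice is not specified here.
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3
    SECRET = 4


# Hypothetical policy tables: what each role holds and what each field requires.
ROLE_CLEARANCE = {"clinician": Clearance.RESTRICTED, "billing_analyst": Clearance.INTERNAL}
FIELD_CLEARANCE = {
    "patient_name": Clearance.INTERNAL,
    "invoice_total": Clearance.INTERNAL,
    "diagnosis": Clearance.RESTRICTED,
}


def verdict(role: str, field: str) -> str:
    """PASS only when both lookups resolve and the role's level covers the field's."""
    role_level = ROLE_CLEARANCE.get(role)
    field_level = FIELD_CLEARANCE.get(field)
    if role_level is None or field_level is None:
        return "REDACT"  # fail-closed: an unresolved decision never passes
    return "PASS" if role_level >= field_level else "REDACT"


def inspect(role: str, record: dict) -> tuple[dict, dict]:
    """Return (fields the role may see, per-field verdicts)."""
    verdicts = {name: verdict(role, name) for name in record}
    visible = {name: value for name, value in record.items() if verdicts[name] == "PASS"}
    return visible, verdicts


# Invented record; "notes" is deliberately unclassified to show the fail-closed path.
record = {"patient_name": "A. Patel", "invoice_total": 240.0, "diagnosis": "J45.909", "notes": "free text"}

for role in ("clinician", "billing_analyst"):
    visible, verdicts = inspect(role, record)
    print(role, verdicts, sorted(visible))
```

The same inputs always yield the same verdicts because the function is a pure table lookup. The unclassified `notes` field is redacted for every role because no rule resolves it.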

The overlap with output filtering is real and worth stating plainly. Both can act on data before it reaches a user, and both reduce the chance that sensitive content surfaces. A guardrail framework with text-level PII filtering and Custosa with field-level redaction will both show "PII handling" on a feature checklist. The difference is depth, determinism, and timing: the guardrail framework acts on assembled text with heuristic rails, while Custosa acts on the structured records with a formal policy engine before the text exists, and records signed evidence of each decision. One shapes the conversation; the other controls the data the conversation is built from.
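
The signed, hash-chained evidence mentioned above can be sketched with Python's standard `hmac` and `hashlib` modules. The entry layout, the `append_entry` and `verify` helpers, and the demo key are assumptions made for illustration; they are not Custosa's ledger format. Each entry carries field names and verdicts but no field values, and each signature covers the previous entry's signature.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # placeholder "previous signature" for the first entry


def _mac(key: bytes, body: dict) -> str:
    """HMAC-SHA256 over a canonical JSON encoding of the entry body."""
    payload = json.dumps(body, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def append_entry(ledger: list, key: bytes, role: str, field: str, verdict: str) -> dict:
    """Append a content-free entry chained to the previous entry's signature."""
    prev = ledger[-1]["sig"] if ledger else GENESIS
    body = {"seq": len(ledger), "role": role, "field": field, "verdict": verdict, "prev": prev}
    entry = {**body, "sig": _mac(key, body)}
    ledger.append(entry)
    return entry


def verify(ledger: list, key: bytes) -> bool:
    """Recompute every signature and chain link; needs only the ledger and the key."""
    prev = GENESIS
    for seq, entry in enumerate(ledger):
        body = {k: v for k, v in entry.items() if k != "sig"}
        if body.get("prev") != prev or body.get("seq") != seq:
            return False
        if not hmac.compare_digest(entry.get("sig", ""), _mac(key, body)):
            return False
        prev = entry["sig"]
    return True


KEY = b"demo-key-not-for-production"  # hypothetical key, for the example only
ledger: list = []
append_entry(ledger, KEY, "billing_analyst", "diagnosis", "REDACT")
append_entry(ledger, KEY, "billing_analyst", "invoice_total", "PASS")
print(verify(ledger, KEY))

# Flipping a recorded verdict after the fact breaks verification.
tampered = [dict(entry) for entry in ledger]
tampered[0]["verdict"] = "PASS"
print(verify(tampered, KEY))
```

Because HMAC is symmetric, whoever verifies this sketch needs the signing key. How keys are held and distributed in a real deployment is outside what this example shows.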

04 Why they are complementary

Because output filtering and Custosa work at different layers, the practical answer is usually both, with each owning what it is built for.

  • Use output filtering to shape the conversation: enforce response schema and format, keep the assistant on approved topics, block unsafe or off-brand output, and run re-ask loops when a completion fails validation. This is conversation safety and structure around the model.
  • Use Custosa to control the data: per-record, per-field, per-role inspection at runtime, withholding prohibited fields before the prompt is built, with deterministic policy and signed, tamper-evident evidence. This is data control beneath the model, inside your environment.

A common deployment runs Custosa inspection at retrieval time, inside your boundary, so prohibited fields are withheld before any prompt is assembled. The already-filtered context then flows to the model, where a framework such as Guardrails AI or NeMo Guardrails enforces output format and conversation rails. The model is never handed a field the actor may not see, and the guardrail framework keeps the response well-formed and on-topic. Custosa inspection typically adds 50 to 110 ms of latency at p99, which fits inside an interactive AI request alongside guardrail overhead.
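
The ordering described above can be sketched end to end in Python. Everything here is a hypothetical stand-in: the allow-list replaces a real policy engine, `output_rail` replaces a guardrail framework, and `fake_model` replaces a model client. The point is only the sequence: withhold, build the prompt, call the model, then validate the output.

```python
import json
from typing import Callable

# Hypothetical allow-list standing in for a policy engine's per-field verdicts.
ALLOWED_FIELDS = {"support_agent": {"name", "plan", "open_tickets"}}


def withhold(role: str, record: dict) -> dict:
    """Data layer: keep only fields the role may see; unknown roles get nothing."""
    allowed = ALLOWED_FIELDS.get(role, set())
    return {k: v for k, v in record.items() if k in allowed}


def build_prompt(question: str, context: dict) -> str:
    """Assemble the prompt from already-filtered context."""
    return (
        f"Context: {json.dumps(context, sort_keys=True)}\n"
        f"Question: {question}\n"
        "Answer as JSON with an 'answer' key."
    )


def output_rail(completion: str) -> dict:
    """Text layer: reject completions that are not the expected JSON shape."""
    data = json.loads(completion)
    if not isinstance(data, dict) or not isinstance(data.get("answer"), str):
        raise ValueError("completion failed schema check")
    return data


def answer(role: str, question: str, record: dict, model: Callable[[str], str]) -> dict:
    """Withhold first, then build the prompt, call the model, and validate the output."""
    prompt = build_prompt(question, withhold(role, record))
    return output_rail(model(prompt))


seen_prompts: list = []


def fake_model(prompt: str) -> str:
    """Stand-in model that records the prompt it was given."""
    seen_prompts.append(prompt)
    return '{"answer": "The customer is on the Pro plan."}'


record = {"name": "Dana Reyes", "plan": "Pro", "open_tickets": 2, "salary": 184000}
reply = answer("support_agent", "Which plan is this customer on?", record, fake_model)
print(reply["answer"])
print("salary" in seen_prompts[0])  # False: the field never reached the prompt
```

The two controls do not overlap in this sketch: `withhold` never looks at model text, and `output_rail` never looks at the record.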

Custosa is early-stage and in production with design partners. It is not an output-filtering framework and is not trying to replace one. Custosa controls what reaches the model; guardrails shape what the model says. This division mirrors how field-level control fits the wider stack described in AI data governance.

Control the data before the rails ever run

Custosa inspects every record and field at runtime, redacts by role inside your environment, and signs content-free evidence of each decision. It runs alongside the output-filtering framework you already use.

05 Frequently asked questions

What is LLM output filtering?

LLM output filtering is the practice of validating and sanitizing the text a model produces, and often the text going into it, before it reaches a user. Frameworks such as Guardrails AI and NVIDIA NeMo Guardrails add input and output rails around a model: schema and format checks, regex and PII filters, toxicity and topical rails, and re-ask loops when a response fails validation. The controls operate mostly at the text layer, on the prompt and the completion as strings.

Is Custosa an alternative to Guardrails AI or NeMo Guardrails?

Not really; they work at different layers and are usually complementary. Guardrails AI and NeMo Guardrails shape the conversation by validating and filtering model inputs and outputs as text. Custosa controls the source data before the model: it inspects records and fields at runtime, applies deterministic role-based policy, and withholds prohibited fields before the prompt is built. Custosa governs what reaches the model; the guardrail frameworks shape what the model is allowed to say.

Can output filtering stop data leakage?

It reduces some leakage but cannot prevent it at the source. Output filters act after retrieval has already placed sensitive records into the prompt, so the model has already received the data, and the filter is left trying to catch it on the way out by pattern. Detection is probabilistic and context-blind: the same value can be permitted for one role and prohibited for another. Preventing leakage reliably requires withholding the data before the model sees it, which is a data-layer control rather than a text-layer one.

What is the difference between prevention and filtering?

Filtering inspects text after the model has already received the data and tries to remove or block sensitive content on the way out. Prevention stops prohibited data from reaching the model at all. Custosa is a prevention control: it evaluates each field against the actor's role before the prompt is assembled and withholds what that actor may not see, so the model never receives it. A filter cannot make the model un-see a record it was already given; prevention ensures it was never given.

Can you use Custosa and output-filtering frameworks together?

Yes, and many teams should. A common pattern runs Custosa at retrieval time inside your environment, so prohibited fields are withheld before any prompt is built, then uses Guardrails AI or NeMo Guardrails around the model to enforce output format, block off-topic or unsafe responses, and keep the conversation on the rails. Custosa owns the data-control and evidence layer; the guardrail framework owns conversation safety and structure. Together they cover both what the model can see and what it is allowed to say.