Guardrails for modern AI teams

TestSavant SDK

TestSavant SDK gives you policy-as-code guardrails across every model, tool, and route. Stop prompt injection, toxic drift, and policy gaps with scans that run before and after each model call.

Start in minutes View pricing

Low Latency Policy-as-code Audit-ready Easy integration

policy_guardrails.py

from testsavant.guard import InputGuard, OutputGuard

input_guard = InputGuard(api_key=API_KEY, project_id=PROJECT_ID)
output_guard = OutputGuard(api_key=API_KEY, project_id=PROJECT_ID)

prompt = "Summarize the latest release train for executives."
if input_guard.scan(prompt).is_valid:
    completion = llm.generate(prompt)
    verdict = output_guard.scan(prompt=prompt, output=completion)
    if verdict.is_valid:
        ship(completion)
    else:
        escalate(verdict.results)

Our Scanners

Input scanners

Anonymize

Anonymize(entity_types, tag="base", threshold=0.5, redact=False)

Detects and redacts PII before it hits your model.

BanCode

BanCode(tag="base", threshold=0.5)

Stops code snippets or scripts from being submitted upstream.

BanCompetitors

BanCompetitors(competitors, tag="base", threshold=0.5, redact=False)

Filters prompts that mention disallowed competitor names.

BanTopics

BanTopics(topics, tag="base", threshold=0.5, mode="blacklist")

Blocks any prompt requesting sensitive or restricted themes.

Code

Code(languages, tag="base", threshold=0.5, is_blocked=True)

Detects programming languages you want to quarantine before use.

Gibberish

Gibberish(tag="base", threshold=0.5)

Keeps garbage prompts from wasting cycles and tokens.

Language

Language(valid_languages, tag="base", threshold=0.5)

Ensures incoming prompts match your allowed language list.

NSFW

NSFW(tag="base", threshold=0.5)

Screens explicit or harmful requests before they enter workflows.

PromptInjection

PromptInjection(tag="base", threshold=0.5)

Neutralizes adversarial jailbreak attempts embedded in prompts.

Toxicity

Toxicity(tag="base", threshold=0.5)

Detects hateful, harassing, or unsafe user inputs.

Anonymize

Anonymize(entity_types, tag="base", threshold=0.5, redact=False)

Detects and redacts PII before it hits your model.

BanCode

BanCode(tag="base", threshold=0.5)

Stops code snippets or scripts from being submitted upstream.

BanCompetitors

BanCompetitors(competitors, tag="base", threshold=0.5, redact=False)

Filters prompts that mention disallowed competitor names.

BanTopics

BanTopics(topics, tag="base", threshold=0.5, mode="blacklist")

Blocks any prompt requesting sensitive or restricted themes.

Code

Code(languages, tag="base", threshold=0.5, is_blocked=True)

Detects programming languages you want to quarantine before use.

Gibberish

Gibberish(tag="base", threshold=0.5)

Keeps garbage prompts from wasting cycles and tokens.

Language

Language(valid_languages, tag="base", threshold=0.5)

Ensures incoming prompts match your allowed language list.

NSFW

NSFW(tag="base", threshold=0.5)

Screens explicit or harmful requests before they enter workflows.

PromptInjection

PromptInjection(tag="base", threshold=0.5)

Neutralizes adversarial jailbreak attempts embedded in prompts.

Toxicity

Toxicity(tag="base", threshold=0.5)

Detects hateful, harassing, or unsafe user inputs.

Output scanners

BanCode

BanCode(tag="base", threshold=0.5)

Strips executable code blocks before responses reach users.

BanCompetitors

BanCompetitors(competitors, tag="base", threshold=0.5, redact=False)

Removes mentions of competitor brands from generated output.

BanTopics

BanTopics(topics, tag="base", threshold=0.5, mode="blacklist")

Keeps restricted subject matter out of final responses.

Bias

Bias(tag="base", threshold=0.5)

Identifies biased or unfair statements before delivery.

Code

Code(languages, tag="base", threshold=0.5, is_blocked=True)

Flags unauthorized code languages within model responses.

FactualConsistency

FactualConsistency(tag="base", minimum_score=0.5)

Compares answers against sources to highlight hallucinations.

Gibberish

Gibberish(tag="base", threshold=0.5)

Prevents meaningless responses from reaching customers.

Language

Language(valid_languages, tag="base", threshold=0.5)

Enforces that output languages stay within approved options.

LanguageSame

LanguageSame(tag="base", threshold=0.5)

Verifies the reply matches the customer’s original language.

MaliciousURL

MaliciousURL(tag="base", threshold=0.5)

Scans for phishing or malicious links before they’re sent.

NoRefusal

NoRefusal(tag="base", threshold=0.5)

Catches unnecessary refusals so you can trigger fallbacks.

NSFW

NSFW(tag="base", threshold=0.5)

Blocks explicit or brand-unsafe completion content.

Toxicity

Toxicity(tag="base", threshold=0.5)

Removes hateful or abusive language before it leaves the agent.

BanCode

BanCode(tag="base", threshold=0.5)

Strips executable code blocks before responses reach users.

BanCompetitors

BanCompetitors(competitors, tag="base", threshold=0.5, redact=False)

Removes mentions of competitor brands from generated output.

BanTopics

BanTopics(topics, tag="base", threshold=0.5, mode="blacklist")

Keeps restricted subject matter out of final responses.

Bias

Bias(tag="base", threshold=0.5)

Identifies biased or unfair statements before delivery.

Code

Code(languages, tag="base", threshold=0.5, is_blocked=True)

Flags unauthorized code languages within model responses.

FactualConsistency

FactualConsistency(tag="base", minimum_score=0.5)

Compares answers against sources to highlight hallucinations.

Gibberish

Gibberish(tag="base", threshold=0.5)

Prevents meaningless responses from reaching customers.

Language

Language(valid_languages, tag="base", threshold=0.5)

Enforces that output languages stay within approved options.

LanguageSame

LanguageSame(tag="base", threshold=0.5)

Verifies the reply matches the customer’s original language.

MaliciousURL

MaliciousURL(tag="base", threshold=0.5)

Scans for phishing or malicious links before they’re sent.

NoRefusal

NoRefusal(tag="base", threshold=0.5)

Catches unnecessary refusals so you can trigger fallbacks.

NSFW

NSFW(tag="base", threshold=0.5)

Blocks explicit or brand-unsafe completion content.

Toxicity

Toxicity(tag="base", threshold=0.5)

Removes hateful or abusive language before it leaves the agent.

Why teams choose TestSavant

Guardrails that collaborate with your stack, not against it.

Every SDK surface is designed for fast approvals, low latency, and audit-ready evidence. Deploy guardrails once, enforce across every channel.

Unified Guardrails

Centralize policy rules across chat, agents, retrieval flows, and API tools. No more subtle drift between teams.

• Declarative policies with per-route overrides
• Works across OpenAI, Anthropic, Azure OpenAI, Vertex
• Integrates with eval or red-team pipelines

Defense-in-Depth

Detect prompt injection, refusals, brand safety violations, and more with layered scanners.

• Pre and post execution checkpoints
• Out-of-box scanners + custom scoring plug-ins
• Alerts via Slack, PagerDuty, or webhook

Evidence, Automated

Generate policy logs mapped to SOC2, ISO, and EU AI Act controls—straight from SDK usage.

• Structured JSON and CSV exports
• Redlines highlighted for compliance review
• Link into TestSavant Studio for audits

Enterprise-Ready

Role-based access, SSO, secrets management, and regional data residency baked in.

• Fine-grained API keys with rotation hooks
• Region pinning (US / EU / APAC)
• Single-tenant deployments on request

Observability Included

Stream guardrail outcomes to dashboards and incident tooling with zero extra code.

• Connectors for Datadog, Splunk, Snowflake
• Real-time dashboards with anomaly alerts
• Automatic retention policies

Test-Driven Guardrails

Replay production incidents, run eval suites, and auto-tune thresholds with Studio.

• Import red-team transcripts directly
• Version guardrails alongside feature flags
• Promote changes with confidence scores

Quick Start

Drop-in guardrails in under five minutes

Pick your runtime, add scanners, and start shipping safer prompts immediately.

Python · Input Scanners

from testsavant.guard import InputGuard
from testsavant.guard.input_scanners import PromptInjection, Toxicity

input_guard = InputGuard(api_key="YOUR_API_KEY", project_id="YOUR_PROJECT_ID")
input_guard.add_scanner(PromptInjection(tag="base", threshold=0.45))
input_guard.add_scanner(Toxicity(tag="brand", threshold=0.65))

prompt = "Write a short story about a friendly robot."
result = input_guard.scan(prompt)

if result.is_valid:
    print("Prompt is safe.")
else:
    print("Prompt failed guardrails:", result.results)

Python · Output Scanners

from testsavant.guard import OutputGuard
from testsavant.guard.output_scanners import Toxicity, NoRefusal, FactualConsistency

output_guard = OutputGuard(api_key="YOUR_API_KEY", project_id="YOUR_PROJECT_ID")
output_guard.add_scanner(Toxicity(tag="public", threshold=0.5))
output_guard.add_scanner(NoRefusal(threshold=0.8))
output_guard.add_scanner(FactualConsistency(threshold=0.7))

prompt = "How do I build a computer?"
llm_output = llm.generate(prompt)

result = output_guard.scan(prompt=prompt, output=llm_output)

if result.is_valid:
    deliver(llm_output)
else:
    escalate(result.results)

Node.js · Edge Worker

import { OutputGuard } from "testsavant-sdk";
import { Toxicity, BanCode } from "testsavant-sdk/output";

const guard = new OutputGuard({
  apiKey: process.env.TESTSAVANT_KEY,
  projectId: "YOUR_PROJECT_ID"
});

guard.addScanner(new Toxicity({ tag: "support", threshold: 0.55 }));
guard.addScanner(new BanCode({ extensions: [".sh", ".ps1"] }));

export default async function handle(request) {
  const { prompt, completion } = await request.json();
  const result = await guard.scan({ prompt, output: completion });

  if (!result.isValid) {
    return new Response(JSON.stringify(result.results), { status: 422 });
  }
  return new Response(completion, { status: 200 });
}

pip install testsavant-sdk

Need a sandbox? Generate an API key.

Integration walkthrough

Integrate the TestSavant SDK in four steps

Follow this path to wire guardrails into your stack and start enforcing policies the same afternoon.

Install the package

Add testsavant-sdk to your project via pip or npm and pull in the language helpers you need.

Configure your client

Drop in your API key and project ID, choose the policies to enforce, and set any environment-specific overrides.

Wrap your model calls

Use InputGuard and OutputGuard around prompts, completions, and tool responses to score every interaction.

Ship and monitor

Deploy with confidence, stream incidents to the console, and iterate on thresholds as your traffic patterns evolve.

Built for hybrid AI stacks

Route-aware guardrails without the glue code

Swap models, call tools, or fall back to retrieval without losing coverage. The SDK tracks risk posture end-to-end.

Input scanners

Shape inbound prompts before they reach your model, tool, or retrieval layer.

• Prompt injection & jailbreak blocks
• Personally identifiable information detection
• Domain restrictions & session fingerprints

Output scanners

Catch toxic, non-compliant, or low-quality completions before they reach an end user.

• Toxicity, hate, self-harm policy coverage
• Hallucination & factual consistency checks
• Sensitive code or credential leakage

Tooling & agents

Protect function calling, RAG, and multi-step agents with policy-aware guardrails.

• Tool permission filters
• Retrieval checks on embedded documents
• Session memory scrubbing & retention control

Fewer Incidents

Block prompt injection, policy evasions, and unsafe tool calls before they reach downstream systems.

• Catch risks before or after the model invocation
• Auto escalate to security or trust teams
• Reduce manual reviews by up to 60%

Cleaner Audits

Every decision is logged with the exact policy, signal, and remediation trail auditors expect.

• SOC2, ISO, HIPAA, EU AI Act friendly exports
• Evidence packs generated automatically
• Replay incidents alongside risk scores

Faster Delivery

Launch new prompts, tools, and agents with guardrails already validated through automated tests.

• Policy diffing and preview environments
• Git + CI integrations to block regressions
• Safe rollout toggles with instant rollback

FAQs

What is TestSavant SDK?

It’s a multi-language SDK that enforces guardrails around prompts, completions, function calls, and RAG pipelines. Policies are versioned with your codebase and approved once.

How do I install the SDK?

Install with pip install testsavant-sdk or npm install testsavant-sdk. Both ship with rich typings.

What risks can I scan for?

Prompt injection, toxicity, self-harm, PII, gibberish, hallucinations, policy breaches, credential leakage, source code exfiltration, and more. Add your own detectors via custom scanners.

Do I need an API key?

Yes. Create a project inside the TestSavant console to get API keys. Rotate them with the CLI, Terraform provider, or secret managers like Vault.

Will this handle production scale?

Absolutely. Teams rely on the SDK in high-volume support bots, search experiences, and agentic workflows with latency budgets under 15ms.

Put unified guardrails in front of every agent today

Safer inputs. Safer outputs. Consistent policy. Clear proof for every review.

Launch the console See pricing tiers

TestSavant SDK

Our Scanners

Anonymize

BanCode

BanCompetitors

BanTopics

Code

Gibberish

Language

NSFW

PromptInjection

Toxicity

Anonymize

BanCode

BanCompetitors

BanTopics

Code

Gibberish

Language

NSFW

PromptInjection

Toxicity

BanCode

BanCompetitors

BanTopics

Bias

Code

FactualConsistency

Gibberish

Language

LanguageSame

MaliciousURL

NoRefusal

NSFW

Toxicity

BanCode

BanCompetitors

BanTopics

Bias

Code

FactualConsistency

Gibberish

Language

LanguageSame

MaliciousURL

NoRefusal

NSFW

Toxicity

Guardrails that collaborate with your stack, not against it.

Unified Guardrails

Defense-in-Depth

Evidence, Automated

Enterprise-Ready

Observability Included

Test-Driven Guardrails

Drop-in guardrails in under five minutes

Python · Input Scanners

Python · Output Scanners

Node.js · Edge Worker

Integrate the TestSavant SDK in four steps

Install the package

Configure your client

Wrap your model calls

Ship and monitor

Route-aware guardrails without the glue code

Input scanners

Output scanners

Tooling & agents

Fewer Incidents

Cleaner Audits

Faster Delivery

FAQs

Put unified guardrails in front of every agent today

Menu

Say Hello

Newsletter