Guardrails Overview

Guardrails are safety and content filtering mechanisms in PLai Framework that protect both your agents and users by detecting, blocking, or masking inappropriate, sensitive, or harmful content in real time. Powered by Amazon Bedrock Guardrails, they provide enterprise-grade AI safety and compliance controls.

What are Guardrails?

Guardrails act as protective barriers that monitor and control content flowing through your AI agents. They operate at two critical points:

INPUT Guardrails

Filter and validate user messages before they reach the AI model

OUTPUT Guardrails

Validate and filter AI-generated responses before delivery to users

Unlike Answer Filters, which guide specific responses to queries, Guardrails enforce broader safety and compliance rules across all interactions, so your agent operates within defined boundaries at all times.
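The two checkpoints described above can be sketched as a small pipeline. This is an illustrative sketch only, not PLai Framework's implementation: the names GuardrailResult, run_guardrails, handle_message, block_hacking, and echo_model are all hypothetical, and the keyword check stands in for the real detection models.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class GuardrailResult:
    blocked: bool
    text: str  # the (possibly masked) text, or the safety message when blocked


# Hypothetical type: a guardrail takes text and returns a result.
Guardrail = Callable[[str], GuardrailResult]

SAFETY_MESSAGE = "I cannot assist with activities that could be harmful or illegal."


def run_guardrails(text: str, guardrails: List[Guardrail]) -> GuardrailResult:
    """Apply guardrails in order; stop at the first block, carry masked text forward."""
    for guardrail in guardrails:
        result = guardrail(text)
        if result.blocked:
            return GuardrailResult(True, SAFETY_MESSAGE)
        text = result.text
    return GuardrailResult(False, text)


def handle_message(
    user_message: str,
    model: Callable[[str], str],
    input_guardrails: List[Guardrail],
    output_guardrails: List[Guardrail],
) -> str:
    # Checkpoint 1: INPUT guardrails run before the model sees anything.
    checked_input = run_guardrails(user_message, input_guardrails)
    if checked_input.blocked:
        return checked_input.text  # the model is never called
    draft = model(checked_input.text)
    # Checkpoint 2: OUTPUT guardrails run before the response is delivered.
    return run_guardrails(draft, output_guardrails).text


# Toy stand-ins for a detection model and an AI model (assumptions for the demo).
def block_hacking(text: str) -> GuardrailResult:
    return GuardrailResult("hack into" in text.lower(), text)


def echo_model(prompt: str) -> str:
    return f"You said: {prompt}"


if __name__ == "__main__":
    print(handle_message("Tell me how to hack into a server", echo_model, [block_hacking], []))
    print(handle_message("Hello", echo_model, [block_hacking], []))
```

The point of the structure is that a blocked input short-circuits the call to the model, while masked text is passed forward to the next stage.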

How Guardrails Work

Guardrails inspect content in real time using advanced detection models.

Action Types

Guardrails can take different actions when they detect policy violations:

Block

Complete prevention of content

When sensitive content is detected, the guardrail completely blocks the message or response.

Best for:

Hate speech
Explicit sexual content
Violent or harmful content
Prohibited topics (politics, religion)
Policy violations

Example:

User: "Tell me how to hack into..."
Guardrail: BLOCKED
Response: "I cannot assist with activities that could be harmful or illegal."

Mask (Anonymize)

Redaction of sensitive information

Detects and masks personally identifiable information (PII) and sensitive data while allowing the conversation to continue.

Best for:

Email addresses
Phone numbers
Credit card numbers
Social security numbers
Names and addresses
Other PII data

Example:

User: "My email is john.doe@example.com and my phone is 555-1234"
After Masking: "My email is [EMAIL_REDACTED] and my phone is [PHONE_REDACTED]"

Masking allows conversations to continue while protecting sensitive information from being processed or stored.
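To make the masking behavior concrete, here is a minimal sketch that reproduces the example above with regular expressions. The regexes are a simplified stand-in chosen for illustration; the actual detection is performed by Amazon Bedrock Guardrails, and the function name mask_pii and both patterns are assumptions, not part of PLai Framework.

```python
import re

# Simplified illustrative patterns; real PII detection covers far more formats.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_PATTERN = re.compile(r"\b(?:\d{3}[-.\s])?\d{3}[-.\s]\d{4}\b")


def mask_pii(text: str) -> str:
    """Replace detected emails and phone numbers with redaction tokens."""
    text = EMAIL_PATTERN.sub("[EMAIL_REDACTED]", text)
    text = PHONE_PATTERN.sub("[PHONE_REDACTED]", text)
    return text


if __name__ == "__main__":
    print(mask_pii("My email is john.doe@example.com and my phone is 555-1234"))
```

The rest of the sentence is left untouched, which is what lets the conversation continue after masking.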

Key Features

1. INPUT and OUTPUT Protection

INPUT Guardrails

Protect your AI model from harmful inputs

INPUT guardrails validate user messages before they reach the AI model:

Filter malicious prompts: Block prompt injection and jailbreak attempts
Mask PII: Remove sensitive user data before processing
Block prohibited topics: Prevent queries about restricted subjects
Validate content safety: Screen for harmful or inappropriate input

When to use:

User-facing chatbots
Public-facing agents
Compliance-sensitive applications
Customer service bots

OUTPUT Guardrails

Ensure safe, compliant AI responses

OUTPUT guardrails validate AI-generated content before delivery to users:

Prevent harmful outputs: Block toxic, violent, or inappropriate responses
Protect sensitive information: Mask PII in AI-generated content
Enforce brand safety: Ensure responses meet brand guidelines
Maintain compliance: Verify regulatory compliance

When to use:

Public content generation
Customer-facing communications
Regulated industries
Brand-sensitive applications

2. PII Detection and Masking

Guardrails can automatically detect and mask various types of personally identifiable information:

Contact Information

Email addresses
Phone numbers
Physical addresses
Social media handles

Financial Data

Credit card numbers
Bank account numbers
SSN/Tax IDs
Financial statements

Personal Identifiers

Full names
Date of birth
Driver’s license numbers
Passport numbers

Privacy & Compliance: PII masking helps you comply with data protection regulations such as GDPR, CCPA, and HIPAA.
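As a rough illustration of how the categories above might translate into a guardrail definition, the sketch below builds a sensitive-information policy in the shape Amazon Bedrock Guardrails uses (a piiEntitiesConfig list of type/action pairs). The mapping from this page's categories to entity types is an assumption made for the example, it covers only some of the listed items, and the entity type names should be checked against the Bedrock API reference. How PLai Framework configures this internally is not documented here.

```python
from typing import Dict, List

# Assumed mapping from the categories on this page to Bedrock PII entity types.
# Only a subset of the listed items is covered.
PII_ENTITY_TYPES: Dict[str, List[str]] = {
    "Contact Information": ["EMAIL", "PHONE", "ADDRESS"],
    "Financial Data": [
        "CREDIT_DEBIT_CARD_NUMBER",
        "US_BANK_ACCOUNT_NUMBER",
        "US_SOCIAL_SECURITY_NUMBER",
    ],
    "Personal Identifiers": ["NAME", "DRIVER_ID", "US_PASSPORT_NUMBER"],
}


def build_pii_policy(categories: List[str], action: str = "ANONYMIZE") -> dict:
    """Build a sensitive-information policy dict for the chosen categories.

    action is "ANONYMIZE" (mask) or "BLOCK".
    """
    if action not in ("ANONYMIZE", "BLOCK"):
        raise ValueError(f"unsupported action: {action}")
    entities = [
        {"type": entity_type, "action": action}
        for category in categories
        for entity_type in PII_ENTITY_TYPES[category]
    ]
    return {"piiEntitiesConfig": entities}


if __name__ == "__main__":
    print(build_pii_policy(["Contact Information"]))
```

Using ANONYMIZE corresponds to the Mask action described earlier; BLOCK would reject the whole message instead.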

3. Default Guardrail

PLai Framework includes a default guardrail that automatically protects all agents from harmful INPUT content:

Default INPUT Guardrail: Automatically blocks user queries containing:

🚫 Sexual content: Explicit sexual material or inappropriate content
🚫 Hate speech: Discrimination, prejudice, or hateful content
🚫 Insults: Personal attacks or abusive language
🚫 Politics: Political opinions, debates, or partisan content
🚫 Religion: Religious debates or divisive religious content

Status: Always active on all agents by default

The default guardrail provides baseline protection without requiring any configuration. You can create additional custom guardrails for specific needs.
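For readers familiar with Amazon Bedrock, the sketch below shows what a guardrail equivalent to the default one could look like as a Bedrock-style configuration: content filters for sexual content, hate speech, and insults, plus denied topics for politics and religion, all applied to input only. This is not the actual default guardrail definition. The guardrail name, the HIGH filter strength, the topic definitions, and the blocked-message text are all assumptions made for illustration.

```python
def build_default_input_guardrail() -> dict:
    """Return a Bedrock-style guardrail config mirroring the default INPUT guardrail."""
    # Content filters applied to input only (outputStrength "NONE").
    # The HIGH strength is an assumption, not a documented PLai setting.
    content_filters = [
        {"type": filter_type, "inputStrength": "HIGH", "outputStrength": "NONE"}
        for filter_type in ("SEXUAL", "HATE", "INSULTS")
    ]
    # Denied topics; definitions are taken from the list above.
    denied_topics = [
        {
            "name": "Politics",
            "definition": "Political opinions, debates, or partisan content.",
            "type": "DENY",
        },
        {
            "name": "Religion",
            "definition": "Religious debates or divisive religious content.",
            "type": "DENY",
        },
    ]
    return {
        "name": "default-input-guardrail",  # hypothetical name
        "contentPolicyConfig": {"filtersConfig": content_filters},
        "topicPolicyConfig": {"topicsConfig": denied_topics},
        "blockedInputMessaging": "Sorry, I cannot help with that request.",
        "blockedOutputsMessaging": "Sorry, I cannot provide that response.",
    }


if __name__ == "__main__":
    import json

    print(json.dumps(build_default_input_guardrail(), indent=2))
```

A dictionary of this shape is what an administrator would pass to the Bedrock create_guardrail operation; in PLai Framework the default guardrail is already active, so no such step is needed.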

4. On-Demand Creation

Guardrails are created on-demand based on your specific requirements:

Identify Protection Needs

Determine what content needs to be blocked or masked for your use case

Request Guardrail Creation

Guardrails are created through the Amazon Bedrock Guardrails service

Configure and Apply

Apply the guardrail to your agents with appropriate settings

Monitor and Adjust

Review guardrail performance and refine as needed

Guardrails are powered by Amazon Bedrock Guardrails, providing enterprise-grade content filtering backed by AWS’s advanced AI safety models.

Guardrail Scope

Guardrails can be configured at two organizational levels: General Guardrails and Organization-Specific Guardrails.

General Guardrails

Available to all organizations

Platform-wide default guardrail
Standard safety and compliance guardrails
Common PII masking rules
Industry-standard content filters

Characteristics:

Maintained by PLai Framework
Automatically updated
Best practices built-in
No configuration required

Ideal for getting started quickly with proven safety measures

Use Cases

1. Customer Service & Support

Protect customer interactions and maintain professional communication standards.

Guardrails to implement:

INPUT: Block offensive language and inappropriate requests
INPUT: Mask customer PII (phone, email, account numbers)
OUTPUT: Prevent sharing of internal system information
OUTPUT: Ensure professional, brand-aligned language

2. Healthcare & Medical Applications

HIPAA compliance requires strict protection of protected health information (PHI).

Guardrails to implement:

INPUT: Mask all PHI (patient names, diagnoses, records)
INPUT: Block requests for medical advice or diagnoses
OUTPUT: Prevent disclosure of patient information
OUTPUT: Block medical recommendations outside scope

3. Financial Services

PCI-DSS and financial regulations mandate protection of financial data.

Guardrails to implement:

INPUT: Mask credit card numbers, account details, SSNs
INPUT: Block fraudulent or suspicious requests
OUTPUT: Prevent disclosure of account information
OUTPUT: Ensure regulatory compliance in responses

4. Education & E-Learning

Protect students and maintain an appropriate educational environment.

Guardrails to implement:

INPUT: Block inappropriate content (sexual, violent, hateful)
INPUT: Mask student PII (names, emails, student IDs)
OUTPUT: Ensure age-appropriate responses
OUTPUT: Prevent academic integrity violations

5. Content Moderation

Guardrails to implement:

INPUT: Filter user-generated content for harmful material
INPUT: Block spam and malicious content
OUTPUT: Ensure community guidelines compliance
OUTPUT: Maintain platform safety standards

Benefits of Guardrails

AI Safety

Prevent harmful AI behavior

Block toxic outputs
Prevent bias and discrimination
Stop misinformation
Ensure appropriate content

Privacy Protection

Protect user privacy

Automatic PII detection
Data masking and anonymization
Regulatory compliance
Reduced data exposure

Compliance

Meet regulatory requirements

GDPR compliance
HIPAA compliance
PCI-DSS compliance
Industry regulations

Brand Protection

Safeguard your reputation

Prevent PR incidents
Maintain brand voice
Control public messaging
Ensure professionalism

Risk Mitigation

Reduce operational risks

Limit legal liability
Prevent security incidents
Control information disclosure
Audit trail maintenance

User Trust

Build user confidence

Demonstrate commitment to safety
Transparent data handling
Consistent behavior
Professional interactions

Guardrails vs. Answer Filters

Understanding the difference between these two features:

Feature	Guardrails	Answer Filters
Purpose	Safety & compliance enforcement	Response guidance for specific queries
Scope	All interactions	Specific query patterns
Action	Block or mask content	Guide response content
Trigger	Policy violations detected	Query similarity matching
Granularity	Broad safety rules	Specific Q&A pairs
Technology	Amazon Bedrock Guardrails	Semantic similarity
Best For	Content safety, PII protection, compliance	Consistent answers to FAQs

Best Practice: Use both Guardrails and Answer Filters together for comprehensive agent control. Guardrails provide safety boundaries, while Answer Filters ensure consistent, high-quality responses within those boundaries.

Limitations & Considerations

Detection Accuracy

Understanding AI detection limits

Guardrails use advanced AI models but are not perfect:

May occasionally miss sophisticated attempts to bypass filters
Can have false positives (blocking safe content)
Context-dependent detection may vary
Evolving adversarial techniques

Mitigation:

Regular monitoring and review
Continuous model updates
Human oversight for critical applications
Multi-layered security approach

Performance Impact

Latency considerations

Guardrails add processing time to each interaction:

INPUT guardrails: +50-200ms
OUTPUT guardrails: +50-200ms
PII masking: +100-300ms
Multiple guardrails compound latency

Mitigation:

Use only necessary guardrails
Optimize guardrail selection
Consider async processing where possible
Balance safety with performance needs
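The compounding effect can be estimated with simple arithmetic. The sketch below sums the latency ranges listed above, assuming the checks run one after another and their costs simply add up; that sequencing is an assumption, and the function name added_latency_ms is hypothetical.

```python
from typing import Iterable, Tuple

# Latency ranges in milliseconds, taken from the figures listed above.
LATENCY_MS = {
    "input": (50, 200),
    "output": (50, 200),
    "pii_masking": (100, 300),
}


def added_latency_ms(enabled: Iterable[str]) -> Tuple[int, int]:
    """Return the (best case, worst case) added latency for the enabled checks.

    Assumes the checks run sequentially, so their latencies add up.
    """
    names = list(enabled)
    low = sum(LATENCY_MS[name][0] for name in names)
    high = sum(LATENCY_MS[name][1] for name in names)
    return low, high


if __name__ == "__main__":
    print(added_latency_ms(["input"]))                           # (50, 200)
    print(added_latency_ms(["input", "output", "pii_masking"]))  # (200, 700)
```

Under these assumptions, enabling all three checks adds between 200 and 700 ms per interaction, which is why using only the necessary guardrails matters.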

Language Support

Multi-language considerations

Guardrail effectiveness varies by language:

Best performance in English
Good support for major European languages
Limited support for some languages
Cultural context differences

Mitigation:

Test thoroughly in target languages
Consider language-specific guardrails
Monitor performance by language
Adjust thresholds as needed

Context Sensitivity

Understanding context limitations

Guardrails may struggle with:

Sarcasm and irony
Cultural nuances
Domain-specific terminology
Context-dependent appropriateness

Mitigation:

Test with realistic scenarios
Provide feedback for improvement
Use domain-specific guardrails
Combine with human review

Getting Started

Ready to implement Guardrails for your agents?

Assess Your Needs

Identify what content needs protection:

What are your compliance requirements?
What PII needs to be masked?
What topics should be prohibited?
What are your safety priorities?

Review Default Guardrail

Understand the built-in protection already active on your agents

Plan Custom Guardrails

Determine if you need organization-specific guardrails for:

Industry-specific compliance
Custom PII handling
Organization-specific policies
Brand-specific requirements

Request Creation

Work with your PLai administrator to create custom guardrails through the Amazon Bedrock Guardrails service

Configure and Test

Apply guardrails to your agents and test thoroughly with realistic scenarios

Monitor and Refine

Review guardrail performance and adjust as needed

Next Steps

Configuration Guide

Learn how to configure and apply Guardrails to your agents

Best Practices

Discover expert tips for effective Guardrail implementation

API Reference

Explore the Guardrails API for programmatic control

Answer Filters

Learn about complementary response control features

Frequently Asked Questions

How does the default guardrail work?

The default INPUT guardrail is automatically active on all agents and blocks queries containing sexual content, hate speech, insults, politics, or religion. It runs before any custom guardrails and requires no configuration.

Can I disable the default guardrail?

The default guardrail provides essential safety protection and is always active. However, you can create custom guardrails with different policies for organization-specific needs.

How is PII detected and masked?

Guardrails use advanced pattern matching and AI models to detect PII such as emails, phone numbers, addresses, and financial data. Detected PII is replaced with tokens like [EMAIL_REDACTED] or [PHONE_REDACTED].

What happens when a guardrail blocks content?

When content is blocked, users receive a polite safety message indicating that the request cannot be fulfilled. The original content is logged for monitoring but not processed by the AI model.

How do I create a custom guardrail?

Custom guardrails are created through the Amazon Bedrock Guardrails service. Contact your PLai administrator or account manager to request creation of organization-specific guardrails.

Do guardrails work with all AI models?

Yes, guardrails operate independently of the underlying AI model. They filter content before and after model processing, working with any model supported by PLai Framework.

Can I see when guardrails are triggered?

Yes, guardrail activations are logged in your agent’s analytics. You can monitor trigger frequency, blocked content patterns, and guardrail effectiveness.

What's the difference between blocking and masking?

Blocking completely prevents content from being processed (used for harmful content). Masking redacts specific sensitive information while allowing the conversation to continue (used for PII protection).
