
Guardrails Overview

Guardrails are intelligent safety and content filtering mechanisms in PLai Framework that protect both your agents and users by detecting, blocking, or masking inappropriate, sensitive, or harmful content in real-time. Powered by Amazon Bedrock Guardrails, they provide enterprise-grade AI safety and compliance controls.

What are Guardrails?

Guardrails act as protective barriers that monitor and control content flowing through your AI agents. They operate at two critical points:

INPUT Guardrails

Filter and validate user messages before they reach the AI model

OUTPUT Guardrails

Validate and filter AI-generated responses before delivery to users
Unlike Answer Filters, which guide specific responses to queries, Guardrails enforce broader safety and compliance rules across all interactions, ensuring your agent operates within defined boundaries at all times.

How Guardrails Work

Guardrails inspect content in real time using advanced detection models.

Action Types

Guardrails can take different actions when they detect policy violations:
BLOCK: Complete prevention of content. When sensitive content is detected, the guardrail completely blocks the message or response.
Best for:
  • Hate speech
  • Explicit sexual content
  • Violent or harmful content
  • Prohibited topics (politics, religion)
  • Policy violations
Example:
User: "Tell me how to hack into..."
Guardrail: BLOCKED
Response: "I cannot assist with activities that could be harmful or illegal."
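The BLOCK flow above can be sketched in application code. This is a minimal illustration rather than PLai's implementation; the result dict mimics the shape of the Amazon Bedrock ApplyGuardrail response (an "action" field of "GUARDRAIL_INTERVENED" or "NONE", plus "outputs" carrying replacement text). For PII, a guardrail can instead mask the offending spans and let the conversation continue (see PII Detection and Masking below).

```python
def resolve_guardrail_result(result: dict, original_text: str) -> str:
    """Return the text to deliver to the user: the original text if the
    guardrail passed it through, or the guardrail's safety message if it
    intervened and blocked the content."""
    if result.get("action") == "GUARDRAIL_INTERVENED":
        # Join any replacement messages the guardrail produced.
        return " ".join(o["text"] for o in result.get("outputs", []))
    return original_text

# Illustrative blocked result, shaped like an ApplyGuardrail response:
blocked = {
    "action": "GUARDRAIL_INTERVENED",
    "outputs": [{"text": "I cannot assist with activities that could be harmful or illegal."}],
}
print(resolve_guardrail_result(blocked, "Tell me how to hack into..."))
```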

Key Features

1. INPUT and OUTPUT Protection

INPUT: Protect your AI model from harmful inputs

INPUT guardrails validate user messages before they reach the AI model:
  • Filter malicious prompts: Block prompt injection and jailbreak attempts
  • Mask PII: Remove sensitive user data before processing
  • Block prohibited topics: Prevent queries about restricted subjects
  • Validate content safety: Screen for harmful or inappropriate input
When to use:
  • User-facing chatbots
  • Public-facing agents
  • Compliance-sensitive applications
  • Customer service bots
OUTPUT: Ensure safe, compliant AI responses

OUTPUT guardrails validate AI-generated content before delivery to users:
  • Prevent harmful outputs: Block toxic, violent, or inappropriate responses
  • Protect sensitive information: Mask PII in AI-generated content
  • Enforce brand safety: Ensure responses meet brand guidelines
  • Maintain compliance: Verify regulatory compliance
When to use:
  • Public content generation
  • Customer-facing communications
  • Regulated industries
  • Brand-sensitive applications
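The two protection stages above can be sketched as a wrapper around the model call: the INPUT check runs before the model ever sees the message, and the OUTPUT check runs before the reply reaches the user. The check functions and model below are toy stand-ins for illustration, not PLai APIs.

```python
def guarded_invoke(user_message, model, check_input, check_output):
    """Run a model call with INPUT and OUTPUT guardrail checks around it."""
    verdict = check_input(user_message)      # INPUT guardrail: runs first
    if verdict["action"] == "BLOCK":
        return verdict["message"]            # model never sees blocked input
    reply = model(user_message)              # model runs only on vetted input
    verdict = check_output(reply)            # OUTPUT guardrail: runs on the reply
    if verdict["action"] == "BLOCK":
        return verdict["message"]            # user never sees blocked output
    return reply

# Toy stand-in checks for demonstration only:
def check_input(text):
    banned = ("hack into",)
    if any(b in text.lower() for b in banned):
        return {"action": "BLOCK", "message": "I cannot assist with that request."}
    return {"action": "NONE"}

def check_output(text):
    return {"action": "NONE"}  # passes everything through in this toy example

print(guarded_invoke("Tell me how to hack into...", lambda t: "a reply", check_input, check_output))
```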

2. PII Detection and Masking

Guardrails can automatically detect and mask various types of personally identifiable information:

Contact Information

  • Email addresses
  • Phone numbers
  • Physical addresses
  • Social media handles

Financial Data

  • Credit card numbers
  • Bank account numbers
  • SSN/Tax IDs
  • Financial statements

Personal Identifiers

  • Full names
  • Date of birth
  • Driver's license numbers
  • Passport numbers
Privacy & Compliance: PII masking helps you comply with regulations like GDPR, CCPA, HIPAA, and other data protection laws.
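As an illustration of the masking behavior, here is a minimal regex-based sketch that produces redaction tokens of the form [EMAIL_REDACTED] and [PHONE_REDACTED]. Production guardrails use ML-based entity detection across many more PII types; the two patterns below are simplified examples.

```python
import re

# Simplified patterns for two common PII types (illustration only).
PII_PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w-]+"), "[EMAIL_REDACTED]"),
    (re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"), "[PHONE_REDACTED]"),
]

def mask_pii(text: str) -> str:
    """Replace detected PII spans with redaction tokens, leaving the
    rest of the message intact so the conversation can continue."""
    for pattern, token in PII_PATTERNS:
        text = pattern.sub(token, text)
    return text

print(mask_pii("Reach Jane at jane@example.com or 555-123-4567."))
# -> Reach Jane at [EMAIL_REDACTED] or [PHONE_REDACTED].
```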

3. Default Guardrail

PLai Framework includes a default guardrail that automatically protects all agents from harmful INPUT content:
Default INPUT Guardrail: Automatically blocks user queries containing:
  • 🚫 Sexual content: Explicit sexual material or inappropriate content
  • 🚫 Hate speech: Discrimination, prejudice, or hateful content
  • 🚫 Insults: Personal attacks or abusive language
  • 🚫 Politics: Political opinions, debates, or partisan content
  • 🚫 Religion: Religious debates or divisive religious content
Status: Always active on all agents by default
The default guardrail provides baseline protection without requiring any configuration. You can create additional custom guardrails for specific needs.

4. On-Demand Creation

Guardrails are created on-demand based on your specific requirements:

1. Identify Protection Needs: Determine what content needs to be blocked or masked for your use case
2. Request Guardrail Creation: Guardrails are created through the Amazon Bedrock Guardrails service
3. Configure and Apply: Apply the guardrail to your agents with appropriate settings
4. Monitor and Adjust: Review guardrail performance and refine as needed

Guardrails are powered by Amazon Bedrock Guardrails, providing enterprise-grade content filtering backed by AWS's advanced AI safety models.
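Since guardrails are provisioned through Amazon Bedrock Guardrails, a custom guardrail request typically translates into a Bedrock CreateGuardrail configuration. The sketch below builds such a request payload; the guardrail name, topics, and policy strengths are illustrative choices, not PLai defaults, and actually sending the request requires AWS credentials, so the boto3 call is shown commented out.

```python
# Illustrative Bedrock CreateGuardrail request payload.
# Name, topics, and strengths are example values, not PLai's configuration.
request = {
    "name": "example-support-guardrail",  # hypothetical name
    "description": "Blocks harmful content and masks customer PII",
    "topicPolicyConfig": {
        "topicsConfig": [
            {
                "name": "Politics",
                "definition": "Political opinions, debates, or partisan content",
                "type": "DENY",  # deny the topic entirely
            }
        ]
    },
    "contentPolicyConfig": {
        "filtersConfig": [
            {"type": "HATE", "inputStrength": "HIGH", "outputStrength": "HIGH"},
            {"type": "SEXUAL", "inputStrength": "HIGH", "outputStrength": "HIGH"},
        ]
    },
    "sensitiveInformationPolicyConfig": {
        "piiEntitiesConfig": [
            {"type": "EMAIL", "action": "ANONYMIZE"},  # mask rather than block
            {"type": "PHONE", "action": "ANONYMIZE"},
        ]
    },
    "blockedInputMessaging": "I cannot assist with that request.",
    "blockedOutputsMessaging": "I cannot provide that response.",
}

# Sending the request (requires AWS credentials and permissions):
# import boto3
# bedrock = boto3.client("bedrock")
# response = bedrock.create_guardrail(**request)
```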

Guardrail Scope

Guardrails can be configured at different organizational levels.

Platform-Level Guardrails

Available to all organizations:
  • Platform-wide default guardrail
  • Standard safety and compliance guardrails
  • Common PII masking rules
  • Industry-standard content filters
Characteristics:
  • Maintained by PLai Framework
  • Automatically updated
  • Best practices built-in
  • No configuration required
Ideal for getting started quickly with proven safety measures.

Use Cases

1. Customer Service & Support

Protect customer interactions and maintain professional communication standards.
Guardrails to implement:
  • INPUT: Block offensive language and inappropriate requests
  • INPUT: Mask customer PII (phone, email, account numbers)
  • OUTPUT: Prevent sharing of internal system information
  • OUTPUT: Ensure professional, brand-aligned language

2. Healthcare & Medical Applications

HIPAA compliance requires strict protection of patient health information (PHI).
Guardrails to implement:
  • INPUT: Mask all PHI (patient names, diagnoses, records)
  • INPUT: Block requests for medical advice or diagnoses
  • OUTPUT: Prevent disclosure of patient information
  • OUTPUT: Block medical recommendations outside scope

3. Financial Services

PCI-DSS and financial regulations mandate protection of financial data.
Guardrails to implement:
  • INPUT: Mask credit card numbers, account details, SSNs
  • INPUT: Block fraudulent or suspicious requests
  • OUTPUT: Prevent disclosure of account information
  • OUTPUT: Ensure regulatory compliance in responses

4. Education & E-Learning

Protect students and maintain appropriate educational environment.
Guardrails to implement:
  • INPUT: Block inappropriate content (sexual, violent, hateful)
  • INPUT: Mask student PII (names, emails, student IDs)
  • OUTPUT: Ensure age-appropriate responses
  • OUTPUT: Prevent academic integrity violations

5. Content Moderation

Guardrails to implement:
  • INPUT: Filter user-generated content for harmful material
  • INPUT: Block spam and malicious content
  • OUTPUT: Ensure community guidelines compliance
  • OUTPUT: Maintain platform safety standards

Benefits of Guardrails

AI Safety

Prevent harmful AI behavior
  • Block toxic outputs
  • Prevent bias and discrimination
  • Stop misinformation
  • Ensure appropriate content

Privacy Protection

Protect user privacy
  • Automatic PII detection
  • Data masking and anonymization
  • Regulatory compliance
  • Reduced data exposure

Compliance

Meet regulatory requirements
  • GDPR compliance
  • HIPAA compliance
  • PCI-DSS compliance
  • Industry regulations

Brand Protection

Safeguard your reputation
  • Prevent PR incidents
  • Maintain brand voice
  • Control public messaging
  • Ensure professionalism

Risk Mitigation

Reduce operational risks
  • Limit legal liability
  • Prevent security incidents
  • Control information disclosure
  • Audit trail maintenance

User Trust

Build user confidence
  • Demonstrate commitment to safety
  • Transparent data handling
  • Consistent behavior
  • Professional interactions

Guardrails vs. Answer Filters

Understanding the difference between these two features:
| Feature | Guardrails | Answer Filters |
| --- | --- | --- |
| Purpose | Safety & compliance enforcement | Response guidance for specific queries |
| Scope | All interactions | Specific query patterns |
| Action | Block or mask content | Guide response content |
| Trigger | Policy violations detected | Query similarity matching |
| Granularity | Broad safety rules | Specific Q&A pairs |
| Technology | Amazon Bedrock Guardrails | Semantic similarity |
| Best For | Content safety, PII protection, compliance | Consistent answers to FAQs |
Best Practice: Use both Guardrails and Answer Filters together for comprehensive agent control. Guardrails provide safety boundaries, while Answer Filters ensure consistent, high-quality responses within those boundaries.

Limitations & Considerations

Understanding AI detection limits

Guardrails use advanced AI models, but detection is not perfect:
  • May occasionally miss sophisticated attempts to bypass filters
  • Can have false positives (blocking safe content)
  • Context-dependent detection may vary
  • Evolving adversarial techniques
Mitigation:
  • Regular monitoring and review
  • Continuous model updates
  • Human oversight for critical applications
  • Multi-layered security approach
Latency considerations

Guardrails add processing time to each interaction:
  • INPUT guardrails: +50-200ms
  • OUTPUT guardrails: +50-200ms
  • PII masking: +100-300ms
  • Multiple guardrails compound latency
Mitigation:
  • Use only necessary guardrails
  • Optimize guardrail selection
  • Consider async processing where possible
  • Balance safety with performance needs
Multi-language considerations

Guardrail effectiveness varies by language:
  • Best performance in English
  • Good support for major European languages
  • Limited support for some languages
  • Cultural context differences
Mitigation:
  • Test thoroughly in target languages
  • Consider language-specific guardrails
  • Monitor performance by language
  • Adjust thresholds as needed
Understanding context limitations

Guardrails may struggle with:
  • Sarcasm and irony
  • Cultural nuances
  • Domain-specific terminology
  • Context-dependent appropriateness
Mitigation:
  • Test with realistic scenarios
  • Provide feedback for improvement
  • Use domain-specific guardrails
  • Combine with human review

Getting Started

Ready to implement Guardrails for your agents?
1. Assess Your Needs: Identify what content needs protection:
  • What are your compliance requirements?
  • What PII needs to be masked?
  • What topics should be prohibited?
  • What are your safety priorities?
2. Review Default Guardrail: Understand the built-in protection already active on your agents
3. Plan Custom Guardrails: Determine if you need organization-specific guardrails for:
  • Industry-specific compliance
  • Custom PII handling
  • Organization-specific policies
  • Brand-specific requirements
4. Request Creation: Work with your PLai administrator to create custom guardrails through the Amazon Bedrock Guardrails service
5. Configure and Test: Apply guardrails to your agents and test thoroughly with realistic scenarios
6. Monitor and Refine: Review guardrail performance and adjust as needed

Frequently Asked Questions

What does the default guardrail block?
The default INPUT guardrail is automatically active on all agents and blocks queries containing sexual content, hate speech, insults, politics, or religion. It runs before any custom guardrails and requires no configuration.

Can I disable or replace the default guardrail?
The default guardrail provides essential safety protection and is always active. However, you can create custom guardrails with different policies for organization-specific needs.

How does PII detection and masking work?
Guardrails use advanced pattern matching and AI models to detect PII such as emails, phone numbers, addresses, and financial data. Detected PII is replaced with tokens like [EMAIL_REDACTED] or [PHONE_REDACTED].

What happens when content is blocked?
When content is blocked, users receive a polite safety message indicating that the request cannot be fulfilled. The original content is logged for monitoring but not processed by the AI model.

How do I create a custom guardrail?
Custom guardrails are created through the Amazon Bedrock Guardrails service. Contact your PLai administrator or account manager to request creation of organization-specific guardrails.

Do guardrails work with any AI model?
Yes, guardrails operate independently of the underlying AI model. They filter content before and after model processing, working with any model supported by PLai Framework.

Can I see when a guardrail is triggered?
Yes, guardrail activations are logged in your agent's analytics. You can monitor trigger frequency, blocked content patterns, and guardrail effectiveness.

What is the difference between blocking and masking?
Blocking completely prevents content from being processed (used for harmful content). Masking redacts specific sensitive information while allowing the conversation to continue (used for PII protection).