
Guardrails Configuration

Learn how to configure and apply Guardrails to protect your agents with intelligent content filtering. This guide covers everything from understanding the default guardrail to creating and managing custom guardrails.

Prerequisites

Before configuring Guardrails, ensure you have:
Agent Created: You need an existing agent to apply guardrails to
Admin Access: Only project Admins and Owners can configure guardrails
Requirements Identified: Know what content needs protection
Guardrails are managed in the Guardrails section of the agent dashboard.

Understanding the Default Guardrail

Every agent in PLai Framework comes with a default INPUT guardrail automatically active:

Default Guardrail Coverage

Sexual Content

Blocks explicit sexual material, inappropriate content, or sexual advances

Hate Speech

Blocks discrimination, prejudice, hateful content, or targeted harassment

Insults & Abuse

Blocks personal attacks, abusive language, or aggressive insults

Politics & Religion

Blocks political debates, partisan content, religious disputes, or divisive topics
Important: The default guardrail is enabled automatically. It provides baseline protection for all agents without requiring any configuration.

Guardrail Types

Guardrails can be configured with different directions and actions:

1. Direction: INPUT vs OUTPUT

INPUT guardrails are applied to user messages before AI processing.
Purpose:
  • Protect AI model from harmful inputs
  • Filter malicious prompts
  • Mask sensitive user data
  • Block prohibited topics
Use Cases:
  • User-facing chatbots
  • Public interfaces
  • Customer service applications
  • Community platforms
Processing Flow:
User Message → INPUT Guardrail → AI Model → Response
INPUT guardrails run before the AI model sees the content, providing first-line defense against inappropriate inputs.
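The processing flow above can be sketched in a few lines of Python. This is a minimal illustration, not the PLai Framework implementation: `run_with_input_guardrail`, `keyword_guardrail`, `echo_model`, and the safety message wording are all hypothetical names chosen for the example.

```python
from typing import Callable, Optional

# Hypothetical safety message; the real wording is defined by the product.
SAFETY_MESSAGE = "Sorry, I can't help with that request. Please try rephrasing."


def run_with_input_guardrail(
    user_message: str,
    input_guardrail: Callable[[str], Optional[str]],
    model: Callable[[str], str],
) -> str:
    """Run the INPUT guardrail first and call the model only if it passes.

    `input_guardrail` returns the name of the violated category, or None.
    """
    violation = input_guardrail(user_message)
    if violation is not None:
        # The model is never called, so it never sees the message.
        return SAFETY_MESSAGE
    return model(user_message)


# Toy stand-ins used only to exercise the flow.
def keyword_guardrail(message: str) -> Optional[str]:
    return "insults" if "idiot" in message.lower() else None


def echo_model(message: str) -> str:
    return f"MODEL: {message}"


if __name__ == "__main__":
    print(run_with_input_guardrail("Hello there", keyword_guardrail, echo_model))
    print(run_with_input_guardrail("You idiot", keyword_guardrail, echo_model))
```

The key property is ordering: the guardrail check happens before the model call, so a flagged message short-circuits the pipeline.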

2. Action: Block vs Mask

The Block action completely prevents content from passing.
When to use:
  • Harmful content (hate speech, violence)
  • Prohibited topics (politics, religion)
  • Policy violations
  • Security threats
  • Inappropriate requests
Behavior:
Content Detected → BLOCKED → Safety Message Displayed
Example Configuration:
{
  "type": "INPUT",
  "action": "BLOCK",
  "categories": [
    "hate_speech",
    "violence",
    "sexual_content"
  ]
}
User Experience:
  • Request is not processed
  • Polite safety message displayed
  • User prompted to rephrase
  • Interaction logged for monitoring
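To make the Block behavior concrete, the sketch below evaluates the example configuration against a set of categories that some upstream classifier is assumed to have detected. The function `apply_block`, the result fields, and the safety message text are hypothetical; only the configuration shape comes from the example above.

```python
import json
import logging
from typing import Dict, List, Set

logger = logging.getLogger("guardrails.sketch")

# The example configuration from this section.
CONFIG = json.loads("""
{
  "type": "INPUT",
  "action": "BLOCK",
  "categories": ["hate_speech", "violence", "sexual_content"]
}
""")

# Hypothetical wording for the polite safety message.
SAFETY_MESSAGE = "This request can't be processed. Please rephrase and try again."


def apply_block(config: Dict, detected: Set[str]) -> Dict:
    """Decide the outcome for one message.

    `detected` is the set of categories reported by an assumed classifier.
    """
    hits: List[str] = sorted(detected & set(config["categories"]))
    if config["action"] == "BLOCK" and hits:
        # Log the trigger so it can be reviewed during monitoring.
        logger.info("guardrail triggered: %s", hits)
        return {"processed": False, "reply": SAFETY_MESSAGE, "triggered": hits}
    return {"processed": True, "reply": None, "triggered": []}
```

A blocked request is not processed and returns the safety message, and the trigger is logged, which mirrors the user experience listed above.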

Creating Custom Guardrails

Custom guardrails are created on demand through the Amazon Bedrock Guardrails service to meet your specific requirements.

When to Create Custom Guardrails

Healthcare (HIPAA):
  • Mask protected health information (PHI)
  • Block medical advice outside scope
  • Prevent patient data disclosure
Financial Services (PCI-DSS):
  • Mask financial account details
  • Block unauthorized financial advice
  • Protect transaction information
Legal:
  • Prevent unauthorized legal advice
  • Protect privileged information
  • Maintain confidentiality
  • Custom prohibited topics
  • Brand-specific content rules
  • Internal data protection
  • Proprietary information safeguards
  • Employee information protection
  • Custom PII types (employee IDs, patient numbers)
  • Industry-specific identifiers
  • Regional data protection (EU vs US)
  • Multi-language PII detection
  • Content generation safety
  • Academic integrity
  • Child safety protections
  • Community guidelines enforcement
  • Custom safety categories

Custom Guardrail Creation Process

Step 1: Define Requirements

Document your specific needs.
Required Information:
  • Purpose: What should this guardrail protect?
  • Direction: INPUT, OUTPUT, or both?
  • Action: Block or Mask?
  • Content Categories: What to filter?
  • PII Types: What to mask (if applicable)?
  • Scope: General or organization-only?
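One way to capture the required information before submitting a request is a small structured record with basic sanity checks. The sketch below is only an illustration: the class name, field names, and allowed values are assumptions derived from the checklist above, not a format that PLai Framework requires.

```python
from dataclasses import dataclass, field
from typing import List

# Allowed values assumed from the checklist above.
VALID_DIRECTIONS = {"INPUT", "OUTPUT", "BOTH"}
VALID_ACTIONS = {"BLOCK", "MASK"}
VALID_SCOPES = {"general", "organization-only"}


@dataclass
class GuardrailRequirements:
    purpose: str
    direction: str
    action: str
    content_categories: List[str] = field(default_factory=list)
    pii_types: List[str] = field(default_factory=list)
    scope: str = "organization-only"

    def problems(self) -> List[str]:
        """Return a list of human-readable issues; empty means complete."""
        issues = []
        if not self.purpose.strip():
            issues.append("purpose is empty")
        if self.direction not in VALID_DIRECTIONS:
            issues.append(f"unknown direction: {self.direction}")
        if self.action not in VALID_ACTIONS:
            issues.append(f"unknown action: {self.action}")
        if self.scope not in VALID_SCOPES:
            issues.append(f"unknown scope: {self.scope}")
        if self.action == "MASK" and not self.pii_types:
            issues.append("MASK action needs at least one PII type")
        if self.action == "BLOCK" and not self.content_categories:
            issues.append("BLOCK action needs at least one content category")
        return issues
```

For example, `GuardrailRequirements("Mask PHI in chats", "INPUT", "MASK").problems()` reports the missing PII types, which is the kind of gap worth catching before the request is sent.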
Step 2: Request Creation

Contact your PLai Framework administrator or account manager.
Provide:
  • Requirements document
  • Use case description
  • Compliance regulations
  • Timeline needs
  • Testing requirements
Methods:
  • Support ticket
  • Account manager email
  • Admin dashboard request
  • API (for enterprise customers)
Guardrail creation typically takes 2-5 business days depending on complexity and testing requirements.
Step 3: Review and Testing

Once created, thoroughly test the guardrail.
Test Scenarios:
  • Positive cases (should trigger)
  • Negative cases (should not trigger)
  • Edge cases
  • Performance impact
  • False positives
  • False negatives
Testing Checklist:
✅ Blocks/masks intended content
✅ Allows safe content through
✅ No excessive false positives
✅ Acceptable latency impact
✅ Works across different phrasings
✅ Handles edge cases appropriately
✅ Logs triggers correctly
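The positive and negative test scenarios can be run as a small labeled test set. The sketch below assumes the guardrail under test can be wrapped as a function that returns `True` when it triggers; `evaluate_guardrail` and the toy `ssn_keyword_guardrail` are hypothetical helpers for illustration, not part of any PLai Framework or Bedrock API.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def evaluate_guardrail(
    guardrail: Callable[[str], bool],
    cases: Iterable[Tuple[str, bool]],
) -> Dict[str, List[str]]:
    """Compare guardrail decisions with expected labels.

    Each case is (text, should_trigger). Returns the texts that were
    wrongly flagged (false positives) or wrongly allowed (false negatives).
    """
    false_positives: List[str] = []
    false_negatives: List[str] = []
    for text, should_trigger in cases:
        triggered = guardrail(text)
        if triggered and not should_trigger:
            false_positives.append(text)
        elif should_trigger and not triggered:
            false_negatives.append(text)
    return {"false_positives": false_positives, "false_negatives": false_negatives}


# Deliberately naive guardrail, used only to show both failure modes.
def ssn_keyword_guardrail(text: str) -> bool:
    return "ssn" in text.lower()


CASES = [
    ("My SSN is 123-45-6789", True),                     # positive case
    ("What is the weather today?", False),               # negative case
    ("My social security number is 123-45-6789", True),  # different phrasing
    ("How do I format an SSN field in a form?", False),  # edge case
]

if __name__ == "__main__":
    print(evaluate_guardrail(ssn_keyword_guardrail, CASES))
```

With this naive keyword check, the rephrased message slips through and the harmless form question is flagged, which is why the checklist calls for testing across different phrasings.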
Step 4: Apply to Agents

Add the guardrail to your agents through:
  • Agent dashboard UI
  • API endpoint
  • Bulk application (multiple agents)
See the "Applying Guardrails to Agents" section below for details.
Step 5: Monitor and Refine

After deployment:
  • Monitor trigger rates
  • Review blocked content
  • Check for false positives
  • Adjust configuration as needed
  • Collect user feedback
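As a rough illustration of monitoring trigger rates, the sketch below aggregates per-guardrail rates from a list of logged events. The event shape (`guardrail` and `triggered` keys) is an assumption made for this example; the actual log format exposed by PLai Framework may differ.

```python
from collections import Counter
from typing import Dict, Iterable, Mapping


def trigger_rates(events: Iterable[Mapping[str, object]]) -> Dict[str, float]:
    """Return the fraction of evaluated messages that triggered each guardrail.

    Each event is assumed to look like {"guardrail": "default", "triggered": True}.
    """
    totals: Counter = Counter()
    hits: Counter = Counter()
    for event in events:
        name = str(event["guardrail"])
        totals[name] += 1
        if event["triggered"]:
            hits[name] += 1
    return {name: hits[name] / totals[name] for name in totals}


if __name__ == "__main__":
    sample = [
        {"guardrail": "default", "triggered": False},
        {"guardrail": "default", "triggered": True},
        {"guardrail": "pii-mask", "triggered": True},
    ]
    print(trigger_rates(sample))
```

A sudden rise in a guardrail's rate after deployment is a reasonable prompt to review the blocked content for false positives.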

Applying Guardrails to Agents

Via Dashboard UI

Step 1: Navigate to Guardrails Section

  1. Open your Agent Dashboard
  2. Select the Guardrails tab
  3. Select the guardrail(s) you want to apply.
Step 2: Configure Settings

For each selected guardrail:
Direction:
  • ◉ INPUT only
  • ◯ OUTPUT only
  • ◯ Both INPUT and OUTPUT
Priority (if multiple guardrails):
  • Higher priority guardrails run first
  • Range: 1 (highest) to 10 (lowest)
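The priority rule can be sketched as a simple sort in which lower numbers run first. The `AppliedGuardrail` class and `execution_order` function are hypothetical names for illustration. How the platform orders guardrails that share the same priority is not stated here, so the tie behavior below (original list order) is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AppliedGuardrail:
    name: str
    priority: int  # 1 (highest) to 10 (lowest)

    def __post_init__(self) -> None:
        if not 1 <= self.priority <= 10:
            raise ValueError(f"priority must be between 1 and 10, got {self.priority}")


def execution_order(guardrails: List[AppliedGuardrail]) -> List[str]:
    """Return guardrail names in the order they would run.

    sorted() is stable, so guardrails with equal priority keep their
    original relative order (an assumption, not documented behavior).
    """
    return [g.name for g in sorted(guardrails, key=lambda g: g.priority)]


if __name__ == "__main__":
    applied = [
        AppliedGuardrail("brand-rules", 5),
        AppliedGuardrail("default", 1),
        AppliedGuardrail("pii-mask", 2),
    ]
    print(execution_order(applied))
```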

Additional Resources

Guardrails in PLai Framework are powered by Amazon Bedrock Guardrails. For technical details on the underlying service, see the Amazon Bedrock Guardrails documentation.
GDPR Compliance:
  • PII masking for EU users
  • Data protection requirements
  • Right to be forgotten
HIPAA Compliance:
  • PHI protection requirements
  • HIPAA Security Rule
  • HIPAA Privacy Rule
PCI-DSS Compliance:
  • Payment card data protection
  • Cardholder data environment
  • Security assessment procedures
For custom guardrail creation:
  • Contact your account manager
  • Submit detailed requirements document
  • Expected turnaround: 2-5 business days