List LLMs
List all available language models and their configurations.
Response
Object containing all available language models, keyed by model identifier
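A minimal sketch of calling this endpoint with Python's requests library; the base URL, the /llms path, and the bearer-token auth header are assumptions to adapt to your PLai Framework deployment.
```python
import requests

# Assumed values -- substitute the base URL and credentials of your
# deployment; the /llms path is also an assumption.
BASE_URL = "https://plai.example.com/api"
API_KEY = "your-api-key"

resp = requests.get(
    f"{BASE_URL}/llms",
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=30,
)
resp.raise_for_status()

# The response is an object keyed by model identifier.
models = resp.json()
for identifier, config in models.items():
    print(identifier, config)
```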
Get LLM
Get information about a specific language model.
Path Parameters
The model identifier (e.g., "OPENAI_GPT_4_O", "ANTHROPIC_CLAUDE_3_5_SONNET_V1")
Response
Human-readable name of the model
Array of provider names that support this model
Model category (speech-to-text, embedding, completion)
Whether the model supports tool/function calling
Whether the model supports streaming responses
Whether the model supports temperature parameter adjustment
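A similar sketch for fetching a single model; the path shape and the response field names shown in the comments are assumptions based on the fields described above, not a documented schema.
```python
import requests

BASE_URL = "https://plai.example.com/api"  # assumed
API_KEY = "your-api-key"                   # assumed

model_id = "OPENAI_GPT_4_O"
resp = requests.get(
    f"{BASE_URL}/llms/{model_id}",         # assumed path shape
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=30,
)
resp.raise_for_status()
info = resp.json()

# Field names are illustrative; check the actual response schema.
print(info.get("name"))           # human-readable name
print(info.get("providers"))      # providers that support this model
print(info.get("category"))       # completion, embedding, or speech-to-text
print(info.get("supportsTools"))  # tool/function calling flag
```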
Available Models
The PLai Framework supports a wide range of language models across different providers and categories:
Completion Models
OpenAI Models
- GPT-4o (OPENAI_GPT_4_O) - Latest multimodal model
- GPT-4o mini (OPENAI_GPT_4_O_MINI) - Faster, cost-effective version
- GPT-4 Turbo (OPENAI_GPT_4_TURBO) - Enhanced GPT-4 with longer context
- GPT-3.5 Turbo (OPENAI_GPT_3_5_TURBO) - Fast and efficient model
- o1 (OPENAI_O1) - Advanced reasoning model
- o1-mini (OPENAI_O1_MINI) - Smaller reasoning model
Anthropic Models
- Claude 3.5 Sonnet (ANTHROPIC_CLAUDE_3_5_SONNET_V1) - Balanced performance
- Claude 3.5 Haiku (ANTHROPIC_CLAUDE_3_5_HAIKU_20241022) - Fast responses
- Claude 3 Sonnet (ANTHROPIC_CLAUDE_3_SONNET_V1) - Previous generation
- Claude 3 Haiku (ANTHROPIC_CLAUDE_3_HAIKU_V1) - Previous generation fast model
Meta Llama Models
- Llama 3.1 405B (META_LLAMA_3_1_405B_INSTRUCT_TURBO) - Largest model
- Llama 3.1 70B (META_LLAMA_3_1_70B_INSTRUCT_TURBO) - High performance
- Llama 3.1 8B (META_LLAMA_3_1_8B_INSTRUCT_TURBO) - Efficient model
- Llama 3.2 90B Vision (META_LLAMA_3_2_90B_VISION) - Multimodal capabilities
Google Models
- Gemini 2.0 Flash (GOOGLE_GEMINI_2_FLASH) - Latest fast model
- Gemma 2 9B (GOOGLE_GEMMA_2_9B_IT) - Open model
- Gemma 1.1 7B (GOOGLE_GEMMA_1_1_7B_IT) - Lightweight model
Groq Models
- Llama 3 Groq 70B (GROQ_LLAMA_3_GROQ_70B_Tool_Use) - Tool-optimized
- Llama 3.1 70B (GROQ_LLAMA_3_1_70B_VERSATILE) - Versatile model
- Llama 3.3 70B (GROQ_LLAMA_3_3_70B_VERSATILE) - Latest version
Embedding Models
- Text Embedding 3 Large (OPENAI_TEXT_EMBEDDING_3_LARGE) - High-quality embeddings with 3072 dimensions
- Text Embedding 3 Small (OPENAI_TEXT_EMBEDDING_3_SMALL) - Efficient embeddings with 1536 dimensions
- Ada 002 (OPENAI_EMBEDDING_ADA_002) - Legacy embedding model
Speech-to-Text Models
- Whisper (OPENAI_WHISPER) - Advanced speech recognition with multilingual support
Model Categories
- Completion Models
- Embedding Models
- Speech-to-Text Models
Completion models generate text completions and are ideal for:
- Chat applications
- Content generation
- Question answering
- Code generation
- Text analysis
Key features of completion models:
- Tool/function calling support (most models)
- Streaming responses
- Temperature control for creativity
- Context length varies by model (4K to 128K+ tokens)
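Because each entry in the List LLMs response reports its category and capability flags, you can filter for models that meet a feature requirement before sending traffic. A sketch, reusing the illustrative field names from the example above:
```python
def completion_models_with_tools(models: dict) -> list:
    """Return identifiers of completion models that report tool/function calling.

    `models` is the object returned by the List LLMs endpoint; the
    "category" and "supportsTools" field names are illustrative.
    """
    return [
        identifier
        for identifier, config in models.items()
        if config.get("category") == "completion" and config.get("supportsTools")
    ]
```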
Model Selection Guidelines
Performance vs Cost
1. High Performance - Use GPT-4o, Claude 3.5 Sonnet, or Llama 3.1 405B for complex reasoning tasks
2. Balanced - Use GPT-4o mini, Claude 3.5 Haiku, or Llama 3.1 70B for general applications
3. Cost-Effective - Use GPT-3.5 Turbo, Llama 3.1 8B, or Gemma models for simple tasks
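One way to encode these tiers in application code is a small lookup helper. The identifiers below come from the model lists on this page; the tier names and fallback order are illustrative, not part of the framework itself.
```python
# Illustrative mapping of the performance tiers to model identifiers.
MODEL_TIERS = {
    "high_performance": [
        "OPENAI_GPT_4_O",
        "ANTHROPIC_CLAUDE_3_5_SONNET_V1",
        "META_LLAMA_3_1_405B_INSTRUCT_TURBO",
    ],
    "balanced": [
        "OPENAI_GPT_4_O_MINI",
        "ANTHROPIC_CLAUDE_3_5_HAIKU_20241022",
        "META_LLAMA_3_1_70B_INSTRUCT_TURBO",
    ],
    "cost_effective": [
        "OPENAI_GPT_3_5_TURBO",
        "META_LLAMA_3_1_8B_INSTRUCT_TURBO",
        "GOOGLE_GEMMA_2_9B_IT",
    ],
}


def pick_model(tier: str, available: set) -> str:
    """Return the first model in the tier that your deployment actually exposes.

    `available` would be the set of keys returned by the List LLMs endpoint.
    """
    for identifier in MODEL_TIERS[tier]:
        if identifier in available:
            return identifier
    raise ValueError(f"No model from tier '{tier}' is available")
```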
Feature Requirements
Provider Considerations
Providers differ along several dimensions:
- Latency: Groq typically offers faster inference
- Cost: Together AI and OpenRouter often have competitive pricing
- Reliability: OpenAI and Anthropic provide highly stable APIs
- Privacy: Some providers offer enhanced privacy features
Model availability and pricing may vary by provider. Check your API service configurations for specific model access.