List LLMs

List all available language models and their configurations.

Response

models
object
Object containing all available language models, keyed by model identifier
curl --location --request GET 'https://api.plaisolutions.com/llms'
{
  "OPENAI_GPT_4_O": {
    "display_name": "GPT-4o",
    "providers": ["OPENAI"],
    "category": "completion",
    "allows_tools": true,
    "allows_streaming": true,
    "allows_temperature": true
  },
  "ANTHROPIC_CLAUDE_3_5_SONNET_V1": {
    "display_name": "Claude 3.5 Sonnet",
    "providers": ["ANTHROPIC"],
    "category": "completion",
    "allows_tools": true,
    "allows_streaming": true,
    "allows_temperature": true
  },
  "OPENAI_TEXT_EMBEDDING_3_LARGE": {
    "display_name": "Text Embedding 3 Large",
    "providers": ["OPENAI"],
    "category": "embedding",
    "allows_tools": false,
    "allows_streaming": false,
    "allows_temperature": false
  },
  "OPENAI_WHISPER": {
    "display_name": "Whisper",
    "providers": ["OPENAI"],
    "category": "speech-to-text",
    "allows_tools": false,
    "allows_streaming": false,
    "allows_temperature": false
  }
}
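
The `models` object is keyed by identifier, so a client can group it by `category` to drive a selection menu. A minimal Python sketch, using a truncated copy of the sample response above rather than a live request:

```python
import json
from collections import defaultdict

# Truncated sample matching the /llms response shape shown above.
body = """{
  "OPENAI_GPT_4_O": {"display_name": "GPT-4o", "providers": ["OPENAI"],
    "category": "completion", "allows_tools": true,
    "allows_streaming": true, "allows_temperature": true},
  "OPENAI_TEXT_EMBEDDING_3_LARGE": {"display_name": "Text Embedding 3 Large",
    "providers": ["OPENAI"], "category": "embedding",
    "allows_tools": false, "allows_streaming": false,
    "allows_temperature": false}
}"""

models = json.loads(body)

# Group model identifiers by category, e.g. to populate a dropdown per category.
by_category = defaultdict(list)
for model_id, info in models.items():
    by_category[info["category"]].append(model_id)

print(sorted(by_category))  # ['completion', 'embedding']
```

In a real client the `body` string would come from `GET /llms`; the grouping logic is the same.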

Get LLM

Get information about a specific language model.

Path Parameters

llm
string
required
The model identifier (e.g., "OPENAI_GPT_4_O", "ANTHROPIC_CLAUDE_3_5_SONNET_V1")

Response

display_name
string
Human-readable name of the model
providers
array
Array of provider names that support this model
category
string
Model category (speech-to-text, embedding, completion)
allows_tools
boolean
Whether the model supports tool/function calling
allows_streaming
boolean
Whether the model supports streaming responses
allows_temperature
boolean
Whether the model supports temperature parameter adjustment
curl --location --request GET 'https://api.plaisolutions.com/llms/OPENAI_GPT_4_O'
{
  "display_name": "GPT-4o",
  "providers": ["OPENAI"],
  "category": "completion",
  "allows_tools": true,
  "allows_streaming": true,
  "allows_temperature": true
}
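
Since capability flags vary per model, a client can guard feature use on the response before sending a request. A hypothetical helper (the `supports` function is illustrative, not part of the API; the `gpt4o` dict mirrors the response above):

```python
def supports(model_info: dict, feature: str) -> bool:
    """Check a capability flag such as 'tools', 'streaming', or 'temperature'."""
    return bool(model_info.get(f"allows_{feature}", False))

# Response body from GET /llms/OPENAI_GPT_4_O, as shown above.
gpt4o = {
    "display_name": "GPT-4o",
    "providers": ["OPENAI"],
    "category": "completion",
    "allows_tools": True,
    "allows_streaming": True,
    "allows_temperature": True,
}

print(supports(gpt4o, "tools"))   # True
print(supports(gpt4o, "vision"))  # False (unknown flags default to False)
```

Defaulting unknown flags to `False` keeps the guard safe if new capability fields appear in later responses.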

Available Models

The PLai Framework supports a wide range of language models across different providers and categories:

Completion Models

  • GPT-4o (OPENAI_GPT_4_O) - Latest multimodal model
  • GPT-4o mini (OPENAI_GPT_4_O_MINI) - Faster, cost-effective version
  • GPT-4 Turbo (OPENAI_GPT_4_TURBO) - Enhanced GPT-4 with longer context
  • GPT-3.5 Turbo (OPENAI_GPT_3_5_TURBO) - Fast and efficient model
  • o1 (OPENAI_O1) - Advanced reasoning model
  • o1-mini (OPENAI_O1_MINI) - Smaller reasoning model
  • Claude 3.5 Sonnet (ANTHROPIC_CLAUDE_3_5_SONNET_V1) - Balanced performance
  • Claude 3.5 Haiku (ANTHROPIC_CLAUDE_3_5_HAIKU_20241022) - Fast responses
  • Claude 3 Sonnet (ANTHROPIC_CLAUDE_3_SONNET_V1) - Previous generation
  • Claude 3 Haiku (ANTHROPIC_CLAUDE_3_HAIKU_V1) - Previous generation fast model
  • Llama 3.1 405B (META_LLAMA_3_1_405B_INSTRUCT_TURBO) - Largest model
  • Llama 3.1 70B (META_LLAMA_3_1_70B_INSTRUCT_TURBO) - High performance
  • Llama 3.1 8B (META_LLAMA_3_1_8B_INSTRUCT_TURBO) - Efficient model
  • Llama 3.2 90B Vision (META_LLAMA_3_2_90B_VISION) - Multimodal capabilities
  • Gemini 2.0 Flash (GOOGLE_GEMINI_2_FLASH) - Latest fast model
  • Gemma 2 9B (GOOGLE_GEMMA_2_9B_IT) - Open model
  • Gemma 1.1 7B (GOOGLE_GEMMA_1_1_7B_IT) - Lightweight model
  • Llama 3 Groq 70B (GROQ_LLAMA_3_GROQ_70B_Tool_Use) - Tool-optimized
  • Llama 3.1 70B (GROQ_LLAMA_3_1_70B_VERSATILE) - Versatile model
  • Llama 3.3 70B (GROQ_LLAMA_3_3_70B_VERSATILE) - Latest version

Embedding Models

Text Embedding 3 Large

OPENAI_TEXT_EMBEDDING_3_LARGE - High-quality embeddings with 3072 dimensions

Text Embedding 3 Small

OPENAI_TEXT_EMBEDDING_3_SMALL - Efficient embeddings with 1536 dimensions

Ada 002

OPENAI_EMBEDDING_ADA_002 - Legacy embedding model
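
When provisioning a vector store, the index dimension must match the chosen embedding model. A small lookup table sketch; the 3072 and 1536 values for the v3 models come from the list above, and 1536 for Ada 002 is that model's standard published dimension:

```python
# Embedding output dimension per model identifier.
EMBEDDING_DIMS = {
    "OPENAI_TEXT_EMBEDDING_3_LARGE": 3072,
    "OPENAI_TEXT_EMBEDDING_3_SMALL": 1536,
    "OPENAI_EMBEDDING_ADA_002": 1536,  # legacy model, standard 1536-dim output
}

def index_dimension(model_id: str) -> int:
    """Dimension a vector index must have to store this model's embeddings."""
    return EMBEDDING_DIMS[model_id]

print(index_dimension("OPENAI_TEXT_EMBEDDING_3_LARGE"))  # 3072
```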

Speech-to-Text Models

Whisper

OPENAI_WHISPER - Advanced speech recognition with multilingual support

Model Categories

Completion models generate text and are ideal for:
  • Chat applications
  • Content generation
  • Question answering
  • Code generation
  • Text analysis
Features:
  • Tool/function calling support (most models)
  • Streaming responses
  • Temperature control for creativity
  • Context length varies by model (4K to 128K+ tokens)

Model Selection Guidelines

Performance vs Cost

1. High Performance: Use GPT-4o, Claude 3.5 Sonnet, or Llama 3.1 405B for complex reasoning tasks
2. Balanced: Use GPT-4o mini, Claude 3.5 Haiku, or Llama 3.1 70B for general applications
3. Cost-Effective: Use GPT-3.5 Turbo, Llama 3.1 8B, or Gemma models for simple tasks

Feature Requirements

Not all models support all features. Check the allows_tools, allows_streaming, and allows_temperature properties before implementation.
Use the List LLMs endpoint to programmatically check model capabilities and build dynamic model selection interfaces.
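
That check can be made concrete by filtering the `/llms` listing on required capabilities before choosing a model. A hypothetical sketch (the `models_with` helper and the truncated `catalog` are illustrative, not part of the API):

```python
def models_with(models: dict, **required: bool) -> list:
    """Return identifiers supporting every capability passed as True.

    Keyword names map to response fields, e.g. tools -> allows_tools.
    """
    return [
        model_id for model_id, info in models.items()
        if all(info.get(f"allows_{name}", False)
               for name, want in required.items() if want)
    ]

# Truncated catalog in the shape returned by GET /llms.
catalog = {
    "OPENAI_GPT_4_O": {"category": "completion", "allows_tools": True,
                       "allows_streaming": True, "allows_temperature": True},
    "OPENAI_WHISPER": {"category": "speech-to-text", "allows_tools": False,
                       "allows_streaming": False, "allows_temperature": False},
}

print(models_with(catalog, tools=True, streaming=True))  # ['OPENAI_GPT_4_O']
```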

Provider Considerations

Providers differ along several dimensions:
  • Latency: Groq typically offers faster inference
  • Cost: Together AI and OpenRouter often have competitive pricing
  • Reliability: OpenAI and Anthropic provide highly stable APIs
  • Privacy: Some providers offer enhanced privacy features
Model availability and pricing may vary by provider. Check your API service configurations for specific model access.