AI Models
Agent HQ supports multiple AI model providers, letting you choose the right model for each job.
Available providers
Anthropic (Claude)
Claude models excel at complex reasoning, code generation, and following detailed instructions.
- Claude Opus 4.6 — Most capable model for complex tasks.
- Claude Sonnet 4.6 — Strong balance of speed and capability.
- Claude Sonnet 4.5 — Fast and capable. Great for most code tasks.
- Claude Haiku 4.5 — Fastest and cheapest. Good for simple tasks and quick iterations.
OpenAI
- GPT-4o — Strong general-purpose model for code and reasoning.
- GPT-4o Mini — Faster, lower-cost option for lightweight tasks.
Google (Gemini)
Gemini models offer fast responses with strong reasoning capabilities.
- Gemini 2.5 Pro — Most capable Gemini model.
- Gemini 2.5 Flash — Fast iteration speed with good code quality.
Cloudflare Workers AI
Free-tier models running on Cloudflare’s edge network. No API key required.
- Llama 3.3 70B — Open-source model available at no cost.
- No external API calls — Runs entirely on Cloudflare’s infrastructure.
Choosing a model
Per-conversation
When creating a new conversation, you can select a model using the model selector in the chat interface. All tasks created from that conversation will use the selected model by default.
Per-task
When creating a task directly from the task board, you can choose a model in the New Task Dialog.
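The defaulting behavior described above can be pictured as a simple fallback: a task uses its own model if one was chosen for it, and otherwise inherits the model of the conversation it came from. The sketch below is illustrative only. The `Conversation` and `Task` classes, the `resolve_model` function, and the model identifier strings are hypothetical and do not reflect Agent HQ's actual data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Conversation:
    # Model picked with the model selector in the chat interface (hypothetical field).
    model: str


@dataclass
class Task:
    # Model chosen explicitly for this task, if any (hypothetical field).
    model: Optional[str] = None
    # Parent conversation; None for tasks created directly on the task board.
    conversation: Optional[Conversation] = None


def resolve_model(task: Task) -> str:
    """Return the task's own model, else the model of its parent conversation."""
    if task.model is not None:
        return task.model
    if task.conversation is not None:
        return task.conversation.model
    raise ValueError("task has neither its own model nor a parent conversation")
```

For example, a task created from a conversation without an explicit choice resolves to the conversation's model, while a task created on the task board resolves to whatever was picked for it.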
Recommendations
| Use case | Recommended model |
|---|---|
| Complex code generation | Claude Opus 4.6 or Sonnet 4.6 |
| Quick fixes and small changes | Claude Haiku 4.5 or Gemini 2.5 Flash |
| General coding tasks | Claude Sonnet 4.5 or GPT-4o |
| Experimentation / low budget | Cloudflare Workers AI (free) |
| Large refactoring tasks | Claude Opus 4.6 |
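If you want to apply these recommendations in your own scripts or tooling, the table can be expressed as a small lookup. This is a sketch under assumptions: the `RECOMMENDATIONS` mapping and the `recommend` function are hypothetical helpers, not part of Agent HQ, and the values are copied from the table above using the display names from the provider lists.

```python
from typing import Dict, List

# Use case -> recommended models, in the order given in the table.
RECOMMENDATIONS: Dict[str, List[str]] = {
    "complex code generation": ["Claude Opus 4.6", "Claude Sonnet 4.6"],
    "quick fixes and small changes": ["Claude Haiku 4.5", "Gemini 2.5 Flash"],
    "general coding tasks": ["Claude Sonnet 4.5", "GPT-4o"],
    "experimentation / low budget": ["Cloudflare Workers AI (free)"],
    "large refactoring tasks": ["Claude Opus 4.6"],
}


def recommend(use_case: str) -> List[str]:
    """Return the recommended models for a use case, or an empty list if unknown."""
    return RECOMMENDATIONS.get(use_case.strip().lower(), [])
```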
Extended thinking
Models that support extended thinking (like Claude) show their reasoning process in real time. You’ll see a collapsible “Thinking” section in the chat that reveals Pilot’s chain of thought, including timing information for each reasoning step.
This is useful for understanding why Pilot made certain decisions.
API keys
To use models from Anthropic, OpenAI, or Google, your workspace needs the corresponding API keys configured.
Cloudflare Workers AI models work out of the box with no additional configuration.
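The rule above amounts to a per-provider check: three providers need a key configured in the workspace, and one does not. The following sketch illustrates that logic. The provider identifiers, the `KEY_REQUIRED` mapping, and the `provider_available` function are hypothetical names chosen for illustration; how Agent HQ actually stores and checks keys is not described here.

```python
from typing import Dict, Set

# Whether each provider needs an API key configured in the workspace.
# Provider identifiers are hypothetical labels for the four providers above.
KEY_REQUIRED: Dict[str, bool] = {
    "anthropic": True,
    "openai": True,
    "google": True,
    "cloudflare": False,  # Workers AI works with no additional configuration
}


def provider_available(provider: str, configured_keys: Set[str]) -> bool:
    """Return True if the provider can be used given the set of providers with keys."""
    if provider not in KEY_REQUIRED:
        raise ValueError(f"unknown provider: {provider}")
    return (not KEY_REQUIRED[provider]) or provider in configured_keys
```

With no keys configured at all, only the Cloudflare Workers AI provider would pass this check.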
Cost tracking
Every message and task tracks token usage (input and output tokens). Combined with model pricing, this lets you see the cost of each task. See Budget & Cost Tracking for details.
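As a rough illustration of how token usage and pricing combine into a cost, the sketch below multiplies input and output token counts by separate per-token rates and sums the result over a task's messages. It assumes prices are quoted per million tokens, which is a common convention but an assumption here. The `Pricing` class, both functions, and any price values you pass in are hypothetical and not Agent HQ's actual implementation or rates.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Pricing:
    # Prices in USD per one million tokens (assumed unit; values are placeholders).
    input_per_million: float
    output_per_million: float


def message_cost(input_tokens: int, output_tokens: int, pricing: Pricing) -> float:
    """Cost of one message: each token count times its rate, scaled from per-million."""
    total = (
        input_tokens * pricing.input_per_million
        + output_tokens * pricing.output_per_million
    )
    return total / 1_000_000


def task_cost(messages: Iterable[Tuple[int, int]], pricing: Pricing) -> float:
    """Cost of a task: the sum over its (input_tokens, output_tokens) message pairs."""
    return sum(message_cost(i, o, pricing) for i, o in messages)
```

Under this model, a free-tier provider is simply a `Pricing(0.0, 0.0)` entry, so its tasks always cost zero regardless of token usage.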