Enabled when BASETEN_API_KEY is set in your environment.
Overview
Baseten Model APIs expose managed LLMs through an OpenAI-compatible chat completions endpoint. The local baseten provider reuses Chat Completions request
translation and adds Baseten-specific connection config, a provider icon, and model
discovery.
Key Features
- OpenAI-compatible chat completions via https://inference.baseten.co/v1
- Model discovery from Baseten’s Model APIs catalog
- Catalog pricing copied to model definitions for request cost logs
- Tool calling, JSON mode, and streaming capability hints
- Baseten provider icon with lab-specific model icons when catalog IDs identify the maker
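To make the first feature concrete, here is a minimal sketch of what an OpenAI-compatible request against the base URL listed above could look like. The helper name `buildChatRequest`, the `ChatRequest` shape, and the placeholder model ID are hypothetical, and Bearer authentication is an assumption based on how OpenAI-compatible clients usually send keys. The helper only builds the request; it does not send it.

```typescript
// Base URL taken from the feature list above.
const BASETEN_BASE_URL = "https://inference.baseten.co/v1";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Hypothetical container for a ready-to-send request.
interface ChatRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Hypothetical helper: builds (but does not send) a Chat Completions request.
function buildChatRequest(
  apiKey: string,
  model: string,
  messages: ChatMessage[],
  stream = false,
): ChatRequest {
  return {
    url: `${BASETEN_BASE_URL}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        // Assumption: Bearer auth, as OpenAI-compatible clients send it.
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      // Standard Chat Completions fields; `stream` toggles streaming.
      body: JSON.stringify({ model, messages, stream }),
    },
  };
}
```

The result can be passed to `fetch(req.url, req.init)`. Replace the model ID with one returned by model discovery.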
Environment Setup
.dev.vars
Provider Definition
The provider lives in agents/providers/baseten.ts and is auto-discovered by
AgentBuilder.
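The provider file itself is not reproduced here. As a rough illustration of the connection config and the BASETEN_API_KEY gate, a sketch might look like the following. The `ProviderConnection` interface, its field names, and `isProviderEnabled` are hypothetical and are not the defineProvider API; only the environment variable name and the base URL come from this page.

```typescript
// Hypothetical shape; the real defineProvider options are not shown on this page.
interface ProviderConnection {
  id: string;
  apiKeyEnvVar: string;
  baseUrl: string;
}

const basetenConnection: ProviderConnection = {
  id: "baseten",
  apiKeyEnvVar: "BASETEN_API_KEY",
  baseUrl: "https://inference.baseten.co/v1",
};

// Returns true only when the configured key is present and non-empty.
// `env` is passed in explicitly so the check works in any runtime.
function isProviderEnabled(
  conn: ProviderConnection,
  env: Record<string, string | undefined>,
): boolean {
  const value = env[conn.apiKeyEnvVar];
  return typeof value === "string" && value.trim().length > 0;
}
```

Passing the environment as an argument keeps the check independent of how a given runtime exposes variables.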
Model Discovery
The provider fetches available models from Baseten’s management API. Catalog pricing is read from cost_per_million_input_tokens and cost_per_million_output_tokens and stored
as inputPrice and outputPrice on generated model definitions.
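The pricing mapping can be sketched as below, together with the per-request arithmetic a cost log would need. Only the two `cost_per_million_*` fields and the `inputPrice` / `outputPrice` names come from this page. The `CatalogEntry` and `ModelDefinition` shapes, the `id` field, and both function names are assumptions, and the catalog request itself is left out because its endpoint is not shown here.

```typescript
// Hypothetical catalog entry; only the two cost_* fields are named on this page.
interface CatalogEntry {
  id: string;
  cost_per_million_input_tokens?: number;
  cost_per_million_output_tokens?: number;
}

// Hypothetical model definition carrying the copied prices.
interface ModelDefinition {
  id: string;
  inputPrice: number | undefined; // price per million input tokens
  outputPrice: number | undefined; // price per million output tokens
}

// Copies catalog pricing onto a model definition without changing units.
function toModelDefinition(entry: CatalogEntry): ModelDefinition {
  return {
    id: entry.id,
    inputPrice: entry.cost_per_million_input_tokens,
    outputPrice: entry.cost_per_million_output_tokens,
  };
}

// Estimates one request's cost; returns undefined when either price is missing.
function estimateCost(
  model: ModelDefinition,
  inputTokens: number,
  outputTokens: number,
): number | undefined {
  if (model.inputPrice === undefined || model.outputPrice === undefined) {
    return undefined;
  }
  return (
    (inputTokens / 1_000_000) * model.inputPrice +
    (outputTokens / 1_000_000) * model.outputPrice
  );
}
```

For example, with an input price of 2 and an output price of 4 per million tokens, a request using 500,000 input tokens and 250,000 output tokens costs 0.5 × 2 + 0.25 × 4 = 2.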
Request Routing
Generation and streaming are delegated to the Chat Completions implementation using Baseten’s inference base URL, https://inference.baseten.co/v1.
Related
OpenAI Provider
Compare with the first-party OpenAI provider
defineProvider
Provider extension API