Enabled when BASETEN_API_KEY is set in your environment.

Overview

Baseten Model APIs expose managed LLMs through an OpenAI-compatible chat completions endpoint. The local baseten provider reuses the Chat Completions request translation and adds Baseten-specific connection config, a provider icon, and model discovery.

Key Features

  • OpenAI-compatible chat completions via https://inference.baseten.co/v1
  • Model discovery from Baseten’s Model APIs catalog
  • Catalog pricing copied to model definitions for request cost logs
  • Tool calling, JSON mode, and streaming capability hints
  • Baseten provider icon with lab-specific model icons when catalog IDs identify the maker
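As an illustration of the last point, a catalog ID such as deepseek-ai/DeepSeek-V4-Pro carries the maker as a prefix before the slash. The sketch below shows one way that prefix could be extracted, with a fallback to the Baseten icon. The function names and the 'baseten' fallback key are hypothetical and are not taken from the provider source.

```typescript
// Hypothetical helper: extract the maker prefix from a catalog ID like
// "deepseek-ai/DeepSeek-V4-Pro". Returns undefined when there is no prefix.
function makerFromCatalogId(catalogId: string): string | undefined {
  const slash = catalogId.indexOf('/');
  if (slash <= 0) return undefined; // no slash, or nothing before it
  return catalogId.slice(0, slash).toLowerCase();
}

// Hypothetical icon lookup key: the maker when the ID identifies one,
// otherwise the generic provider icon (assumed here to be keyed "baseten").
function iconKeyFor(catalogId: string): string {
  return makerFromCatalogId(catalogId) ?? 'baseten';
}
```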

Environment Setup

.dev.vars
BASETEN_API_KEY=...
The provider targets Baseten’s public inference and model catalog APIs.

Provider Definition

The provider lives in agents/providers/baseten.ts and is auto-discovered by AgentBuilder. A model definition references it by importing the provider:
import { defineModel } from '@standardagents/spec';
import baseten from '../providers/baseten';

export default defineModel({
  name: 'baseten_deepseek_v4_pro',
  provider: baseten,
  model: 'deepseek-ai/DeepSeek-V4-Pro',
});

Model Discovery

The provider fetches available models from Baseten’s management API:
GET https://api.baseten.co/v1/model_apis
The returned Model API catalog supplies model names, descriptions, context windows, and pricing metadata. The provider uses the catalog model name as the model identifier sent to chat completions. Catalog prices are parsed from cost_per_million_input_tokens and cost_per_million_output_tokens and stored as inputPrice and outputPrice on generated model definitions.
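The price mapping described above can be sketched as a pure function. This is a minimal sketch under stated assumptions: only the field names cost_per_million_input_tokens, cost_per_million_output_tokens, inputPrice, and outputPrice come from the text. The CatalogEntry shape, the name field, and the choice to accept either numbers or numeric strings are assumptions, since the catalog's response format is not shown here.

```typescript
// Assumed shape of one catalog entry. Only the two cost_* field names are
// taken from the documentation; everything else is hypothetical.
interface CatalogEntry {
  name: string;
  description?: string;
  cost_per_million_input_tokens?: string | number | null;
  cost_per_million_output_tokens?: string | number | null;
}

// Hypothetical subset of a generated model definition.
interface DiscoveredModel {
  model: string;
  description?: string;
  inputPrice?: number;
  outputPrice?: number;
}

// Accepts a number or numeric string. Returns undefined for missing,
// negative, or unparseable values so that bad data never becomes a price.
function parsePrice(value: string | number | null | undefined): number | undefined {
  if (value === null || value === undefined || value === '') return undefined;
  const parsed = typeof value === 'number' ? value : Number.parseFloat(value);
  return Number.isFinite(parsed) && parsed >= 0 ? parsed : undefined;
}

// The catalog model name is passed through unchanged as the model identifier.
function toDiscoveredModel(entry: CatalogEntry): DiscoveredModel {
  return {
    model: entry.name,
    description: entry.description,
    inputPrice: parsePrice(entry.cost_per_million_input_tokens),
    outputPrice: parsePrice(entry.cost_per_million_output_tokens),
  };
}
```

Leaving a price undefined when it cannot be parsed keeps cost logs from recording a misleading zero.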

Request Routing

Generation and streaming are delegated to the Chat Completions implementation using Baseten’s inference base URL:
POST https://inference.baseten.co/v1/chat/completions
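
To make the routing concrete, the sketch below assembles a request for that endpoint. It is illustrative only and is not the provider's actual implementation: buildChatCompletionsRequest and its types are hypothetical names, the body follows the general OpenAI chat completions shape, and the Bearer authorization scheme is an assumption based on OpenAI-compatible clients, so check Baseten's documentation for the header format it expects.

```typescript
// Base URL taken from the documentation above.
const BASETEN_INFERENCE_BASE_URL = 'https://inference.baseten.co/v1';

interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

interface PreparedRequest {
  url: string;
  init: { method: 'POST'; headers: Record<string, string>; body: string };
}

// Hypothetical helper: builds the URL and fetch options without sending anything.
function buildChatCompletionsRequest(
  apiKey: string,
  model: string,
  messages: ChatMessage[],
  stream = false,
): PreparedRequest {
  return {
    url: `${BASETEN_INFERENCE_BASE_URL}/chat/completions`,
    init: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        // Assumption: Bearer scheme, as used by OpenAI-compatible clients.
        Authorization: `Bearer ${apiKey}`,
      },
      // The catalog model name is sent as the model identifier.
      body: JSON.stringify({ model, messages, stream }),
    },
  };
}
```

The returned url and init can be passed directly to fetch(url, init). Setting stream to true requests a streamed response in the OpenAI chat completions convention.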

OpenAI Provider

Compare with the first-party OpenAI provider

defineProvider

Provider extension API