# Baseten Model APIs

> Baseten Model APIs as an OpenAI-compatible provider for Standard Agents

<Info>
  **Enabled when** `BASETEN_API_KEY` is set in your environment.
</Info>

## Overview

Baseten Model APIs expose managed LLMs through an OpenAI-compatible chat
completions endpoint. The local `baseten` provider reuses the Chat Completions
request translation and adds Baseten-specific connection config, a provider
icon, and model discovery.

<Card title="Key Features" icon="server">
  * OpenAI-compatible chat completions via `https://inference.baseten.co/v1`
  * Model discovery from Baseten's Model APIs catalog
  * Catalog pricing copied to model definitions for request cost logs
  * Tool calling, JSON mode, and streaming capability hints
  * Baseten provider icon with lab-specific model icons when catalog IDs identify the maker
</Card>

## Environment Setup

```bash .dev.vars theme={null}
BASETEN_API_KEY=...
```

The provider targets Baseten's public endpoints: `https://inference.baseten.co/v1`
for inference and `https://api.baseten.co/v1/model_apis` for the model catalog.

## Provider Definition

The provider lives in `agents/providers/baseten.ts` and is auto-discovered by
AgentBuilder. To use it, import it in a model definition and pass it as
`provider`:

```typescript theme={null}
import { defineModel } from '@standardagents/spec';
import baseten from '../providers/baseten';

export default defineModel({
  name: 'baseten_deepseek_v4_pro',
  provider: baseten,
  model: 'deepseek-ai/DeepSeek-V4-Pro',
});
```

## Model Discovery

The provider fetches available models from Baseten's management API:

```text theme={null}
GET https://api.baseten.co/v1/model_apis
```

The returned Model API catalog supplies model names, descriptions, context
windows, and pricing metadata. The provider uses the catalog model name as the
model identifier sent to chat completions. Catalog prices are parsed from
`cost_per_million_input_tokens` and `cost_per_million_output_tokens` and stored
as `inputPrice` and `outputPrice` on generated model definitions.
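
The price mapping described above can be sketched roughly as follows. This is a minimal illustration, not the provider's actual code: the `CatalogEntry` shape, the `parsePrice` and `toModelPricing` helpers, and the assumption that prices may arrive as either numbers or numeric strings are all hypothetical. Only the two `cost_per_million_*` field names and the `inputPrice`/`outputPrice` targets come from this page.

```typescript
// Hypothetical shape of one catalog entry. Only the two cost_* field names
// come from the documentation; everything else is assumed for illustration.
interface CatalogEntry {
  name: string;
  cost_per_million_input_tokens?: string | number | null;
  cost_per_million_output_tokens?: string | number | null;
}

// Pricing fields as they would be stored on a generated model definition.
interface ModelPricing {
  model: string;
  inputPrice: number | undefined;
  outputPrice: number | undefined;
}

// Parse a price that may be a number or a numeric string.
// Missing, blank, negative, or non-numeric values yield undefined.
function parsePrice(value: string | number | null | undefined): number | undefined {
  if (value === null || value === undefined) return undefined;
  if (typeof value === 'string' && value.trim() === '') return undefined;
  const parsed = typeof value === 'number' ? value : Number(value);
  return Number.isFinite(parsed) && parsed >= 0 ? parsed : undefined;
}

// Map a catalog entry to the pricing fields, using the catalog model name
// as the model identifier.
function toModelPricing(entry: CatalogEntry): ModelPricing {
  return {
    model: entry.name,
    inputPrice: parsePrice(entry.cost_per_million_input_tokens),
    outputPrice: parsePrice(entry.cost_per_million_output_tokens),
  };
}
```

Returning `undefined` for unparseable prices, rather than `0`, keeps a missing catalog price from being logged as a free request.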

## Request Routing

Generation and streaming are delegated to the Chat Completions implementation
using Baseten's inference base URL:

```text theme={null}
POST https://inference.baseten.co/v1/chat/completions
```

## Related

<CardGroup cols={2}>
  <Card title="OpenAI Provider" icon="brain" href="/providers/openai">
    Compare with the first-party OpenAI provider
  </Card>

  <Card title="defineProvider" icon="code" href="/api-reference/define/provider">
    Provider extension API
  </Card>
</CardGroup>
