# Supported AI Models

> Complete list of AI models supported by Cloudidr LLM Ops with current pricing. 

## Overview

Cloudidr LLM Ops tracks API costs for the models listed below. Your spend is calculated from each provider's published pricing.

> The most up-to-date list of supported models, providers, and pricing is in the LLM Ops left side panel under **Starting Guide** → **Model Pricing**.

<Note>
  **Last Updated:** January 10, 2026

  Model pricing is subject to change by the providers. We update our pricing regularly to ensure accurate cost tracking.
</Note>

<Warning>
  **Model Not Listed?**

  If your model is not in this list, please contact us at [support@cloudidr.com](mailto:support@cloudidr.com) and we'll add support for it.
</Warning>

***

## Pricing Tables

<Tabs>
  <Tab title="Anthropic">
    ## Anthropic Claude Models

    All pricing is per 1 million tokens.

    | Model                      | Input Cost | Output Cost |
    | -------------------------- | ---------- | ----------- |
    | **Claude Opus 4.5**        |            |             |
    | claude-opus-4-5-20251101   | \$5.00     | \$25.00     |
    | **Claude Opus 4.1**        |            |             |
    | claude-opus-4-1-20250805   | \$15.00    | \$75.00     |
    | **Claude Opus 4**          |            |             |
    | claude-opus-4-20250514     | \$15.00    | \$75.00     |
    | **Claude Sonnet 4.5**      |            |             |
    | claude-sonnet-4-5-20250929 | \$3.00     | \$15.00     |
    | **Claude Sonnet 4**        |            |             |
    | claude-sonnet-4-20250514   | \$3.00     | \$15.00     |
    | **Claude Haiku 4.5**       |            |             |
    | claude-haiku-4-5-20251001  | \$1.00     | \$5.00      |
    | **Claude 3.5 Haiku**       |            |             |
    | claude-3-5-haiku-20241022  | \$0.80     | \$4.00      |
    | **Claude 3 Haiku**         |            |             |
    | claude-3-haiku-20240307    | \$0.25     | \$1.25      |

    <Info>
      **Model Recommendations:**

      * **Opus** - Most capable, best for complex reasoning
      * **Sonnet** - Balanced performance and cost
      * **Haiku** - Fastest and most affordable
    </Info>

    ### Integration Guide

    See the [Anthropic Integration](/guides/llm-ops/integrations/anthropic) page to start tracking costs.
  </Tab>

  <Tab title="OpenAI">
    ## OpenAI Models

    All pricing is per 1 million tokens unless otherwise noted.

    ### GPT-5 Family

    | Model               | Input Cost | Output Cost |
    | ------------------- | ---------- | ----------- |
    | gpt-5.2             | \$1.75     | \$14.00     |
    | gpt-5.2-chat-latest | \$1.75     | \$14.00     |
    | gpt-5.2-pro         | \$21.00    | \$168.00    |
    | gpt-5.1             | \$1.25     | \$10.00     |
    | gpt-5.1-chat-latest | \$1.25     | \$10.00     |
    | gpt-5.1-codex-max   | \$1.25     | \$10.00     |
    | gpt-5.1-codex       | \$1.25     | \$10.00     |
    | gpt-5.1-codex-mini  | \$0.25     | \$2.00      |
    | gpt-5               | \$1.25     | \$10.00     |
    | gpt-5-chat-latest   | \$1.25     | \$10.00     |
    | gpt-5-codex         | \$1.25     | \$10.00     |
    | gpt-5-pro           | \$15.00    | \$120.00    |
    | gpt-5-mini          | \$0.25     | \$2.00      |
    | gpt-5-nano          | \$0.05     | \$0.40      |
    | gpt-5-search-api    | \$1.25     | \$10.00     |

    ### GPT-4.1 Family

    | Model        | Input Cost | Output Cost |
    | ------------ | ---------- | ----------- |
    | gpt-4.1      | \$2.00     | \$8.00      |
    | gpt-4.1-mini | \$0.40     | \$1.60      |
    | gpt-4.1-nano | \$0.10     | \$0.40      |

    ### GPT-4o Family

    | Model                      | Input Cost | Output Cost |
    | -------------------------- | ---------- | ----------- |
    | gpt-4o                     | \$2.50     | \$10.00     |
    | gpt-4o-2024-05-13          | \$5.00     | \$15.00     |
    | gpt-4o-mini                | \$0.15     | \$0.60      |
    | gpt-4o-mini-2024-07-18     | \$0.15     | \$0.60      |
    | gpt-4o-search-preview      | \$2.50     | \$10.00     |
    | gpt-4o-mini-search-preview | \$0.15     | \$0.60      |

    ### Realtime & Audio Models

    | Model                        | Input Cost | Output Cost |
    | ---------------------------- | ---------- | ----------- |
    | gpt-realtime                 | \$4.00     | \$16.00     |
    | gpt-realtime-mini            | \$0.60     | \$2.40      |
    | gpt-4o-realtime-preview      | \$5.00     | \$20.00     |
    | gpt-4o-mini-realtime-preview | \$0.60     | \$2.40      |
    | gpt-audio                    | \$2.50     | \$10.00     |
    | gpt-audio-mini               | \$0.60     | \$2.40      |
    | gpt-4o-audio-preview         | \$2.50     | \$10.00     |
    | gpt-4o-mini-audio-preview    | \$0.15     | \$0.60      |

    ### Image Generation Models (Per-Image Pricing)

    | Model    | Resolution | Quality  | Price Per Image |
    | -------- | ---------- | -------- | --------------- |
    | dall-e-3 | 1024×1024  | Standard | \$0.040         |
    | dall-e-3 | 1024×1792  | Standard | \$0.080         |
    | dall-e-3 | 1792×1024  | Standard | \$0.080         |
    | dall-e-3 | 1024×1024  | HD       | \$0.080         |
    | dall-e-3 | 1024×1792  | HD       | \$0.120         |
    | dall-e-3 | 1792×1024  | HD       | \$0.120         |

    ### Audio Transcription Models (Per-Second Pricing)

    | Model     | Price Per Second | Notes                     |
    | --------- | ---------------- | ------------------------- |
    | whisper-1 | \$0.0001         | \$0.006/min transcription |

    ### Text-to-Speech Models (Per-Character Pricing)

    | Model    | Price Per 1M Characters | Notes            |
    | -------- | ----------------------- | ---------------- |
    | tts-1    | \$15.00                 | Standard quality |
    | tts-1-hd | \$30.00                 | High definition  |

    ### o-Series (Reasoning Models)

    | Model                 | Input Cost | Output Cost |
    | --------------------- | ---------- | ----------- |
    | o1                    | \$15.00    | \$60.00     |
    | o1-pro                | \$150.00   | \$600.00    |
    | o1-mini               | \$1.10     | \$4.40      |
    | o3                    | \$2.00     | \$8.00      |
    | o3-pro                | \$20.00    | \$80.00     |
    | o3-mini               | \$1.10     | \$4.40      |
    | o3-deep-research      | \$10.00    | \$40.00     |
    | o4-mini               | \$1.10     | \$4.40      |
    | o4-mini-deep-research | \$2.00     | \$8.00      |

    ### GPT-4 Legacy

    | Model               | Input Cost | Output Cost |
    | ------------------- | ---------- | ----------- |
    | gpt-4               | \$30.00    | \$60.00     |
    | gpt-4-turbo         | \$10.00    | \$30.00     |
    | gpt-4-turbo-preview | \$10.00    | \$30.00     |

    ### GPT-3.5 Family

    | Model             | Input Cost | Output Cost |
    | ----------------- | ---------- | ----------- |
    | gpt-3.5-turbo     | \$0.50     | \$1.50      |
    | gpt-3.5-turbo-16k | \$3.00     | \$4.00      |

    <Info>
      **Model Recommendations:**

      * **gpt-4o** - Best balance of capability and speed
      * **gpt-4o-mini** - Most cost-effective for simple tasks
      * **o1/o3** - Advanced reasoning for complex problems
      * **dall-e-3** - High-quality image generation
      * **whisper-1** - Audio transcription at \$0.006/minute
      * **tts-1** - Natural text-to-speech
    </Info>

    ### Integration Guide

    See the [OpenAI Integration](/guides/llm-ops/integrations/openai) page to start tracking costs.
  </Tab>

  <Tab title="Google">
    ## Google Gemini Models

    All pricing is per 1 million tokens unless otherwise noted.

    ### Gemini 3 Series

    | Model                      | Input Cost | Output Cost | Special Rates          |
    | -------------------------- | ---------- | ----------- | ---------------------- |
    | gemini-3-pro-preview       | \$2.00     | \$12.00     |                        |
    | gemini-3-pro-image-preview | \$2.00     | \$12.00     | Image Output: \$120.00 |
    | gemini-3-flash-preview     | \$0.50     | \$3.00      | Audio Input: \$1.00    |

    ### Gemini 2.5 Series

    | Model                                         | Input Cost | Output Cost | Special Rates                            |
    | --------------------------------------------- | ---------- | ----------- | ---------------------------------------- |
    | **Pro Models**                                |            |             |                                          |
    | gemini-2.5-pro                                | \$1.25     | \$10.00     |                                          |
    | gemini-2.5-pro-preview-tts                    | \$1.00     | \$20.00     | TTS: Audio output                        |
    | **Flash Models**                              |            |             |                                          |
    | gemini-2.5-flash                              | \$0.30     | \$2.50      | Audio Input: \$1.00                      |
    | gemini-2.5-flash-preview-09-2025              | \$0.30     | \$2.50      | Audio Input: \$1.00                      |
    | gemini-2.5-flash-preview-tts                  | \$0.50     | \$10.00     | TTS: Audio output                        |
    | gemini-2.5-flash-image                        | \$0.30     | \$2.50      | Image Output: \$30.00                    |
    | **Flash-Lite Models**                         |            |             |                                          |
    | gemini-2.5-flash-lite                         | \$0.10     | \$0.40      | Audio Input: \$0.30                      |
    | gemini-2.5-flash-lite-preview-09-2025         | \$0.10     | \$0.40      | Audio Input: \$0.30                      |
    | **Specialized Models**                        |            |             |                                          |
    | gemini-2.5-computer-use-preview-10-2025       | \$1.25     | \$10.00     |                                          |
    | gemini-2.5-flash-native-audio-preview-12-2025 | \$0.50     | \$2.00      | Audio In: \$3.00<br />Audio Out: \$12.00 |

    ### Gemini 2.0 Series

    | Model                 | Input Cost | Output Cost | Special Rates       |
    | --------------------- | ---------- | ----------- | ------------------- |
    | gemini-2.0-flash      | \$0.10     | \$0.40      | Audio Input: \$0.70 |
    | gemini-2.0-flash-lite | \$0.075    | \$0.30      |                     |

    ### Latest Aliases (Dynamic)

    | Model                    | Input Cost | Output Cost | Maps To               |
    | ------------------------ | ---------- | ----------- | --------------------- |
    | gemini-flash-latest      | \$0.30     | \$2.50      | gemini-2.5-flash      |
    | gemini-pro-latest        | \$1.25     | \$10.00     | gemini-2.5-pro        |
    | gemini-flash-lite-latest | \$0.10     | \$0.40      | gemini-2.5-flash-lite |

    ### Gemini 1.5 Series (Legacy)

    | Model            | Input Cost | Output Cost |
    | ---------------- | ---------- | ----------- |
    | gemini-1.5-pro   | \$1.25     | \$5.00      |
    | gemini-1.5-flash | \$0.075    | \$0.30      |

    ### Other Legacy Models

    | Model        | Input Cost | Output Cost |
    | ------------ | ---------- | ----------- |
    | gemini-pro   | \$0.50     | \$1.50      |
    | gemini-flash | \$0.075    | \$0.30      |
    | palm-2       | \$0.50     | \$1.50      |

    ### Image Generation Models (Per-Image Pricing)

    | Model                         | Price Per Image | Notes               |
    | ----------------------------- | --------------- | ------------------- |
    | imagen-4.0-fast-generate-001  | \$0.02          | Fast generation     |
    | imagen-4.0-generate-001       | \$0.04          | Standard generation |
    | imagen-4.0-ultra-generate-001 | \$0.06          | Ultra quality       |

    ### Video Generation Models (Per-Second Pricing)

    | Model                         | Price Per Second | Notes                     |
    | ----------------------------- | ---------------- | ------------------------- |
    | veo-3.1-fast-generate-preview | \$0.15           | Fast generation (Preview) |
    | veo-3.1-generate-preview      | \$0.40           | Standard (Preview)        |
    | veo-3.0-fast-generate-001     | \$0.15           | Fast generation (Stable)  |
    | veo-3.0-generate-001          | \$0.40           | Standard (Stable)         |

    <Info>
      **Model Recommendations:**

      * **gemini-2.5-pro** - Most capable for complex tasks
      * **gemini-2.5-flash** - Best cost/performance balance
      * **gemini-2.5-flash-lite** - Most affordable option
      * **gemini-2.5-flash-image** - Native image generation
      * **imagen-4.0-fast** - Fast image generation at \$0.02/image
      * **veo-3.0-fast** - Fast video generation at \$0.15/second
    </Info>

    <Note>
      **Special Pricing Notes:**

      * **TTS Models:** Output is audio tokens, not text
      * **Audio Input:** Premium pricing for audio/video multimodal input
      * **Image Generation (Imagen):** Charged per image generated, not per token
      * **Video Generation (Veo):** Charged per second of video generated
      * **gemini-2.5-flash-image:**
        * Text input: \$0.30 per 1M tokens
        * Text output: \$2.50 per 1M tokens
        * Image output: \$30.00 per 1M tokens (equivalent to \$0.039 per image)
    </Note>

    ### Integration Guide

    See the [Gemini Integration](/guides/llm-ops/integrations/gemini) page to start tracking costs.
  </Tab>
</Tabs>

***

## Cost Comparison

<AccordionGroup>
  <Accordion title="Most Affordable Models (Under $1/M input tokens)" icon="dollar-sign">
    **Best for high-volume, simple tasks:**

    * **GPT-5 Nano:** \$0.05 input / \$0.40 output
    * **Gemini 2.0 Flash Lite:** \$0.075 input / \$0.30 output
    * **Gemini 1.5 Flash:** \$0.075 input / \$0.30 output
    * **GPT-4.1 Nano:** \$0.10 input / \$0.40 output
    * **Gemini 2.5 Flash Lite:** \$0.10 input / \$0.40 output
    * **GPT-4o Mini:** \$0.15 input / \$0.60 output
    * **Claude 3 Haiku:** \$0.25 input / \$1.25 output
    * **Claude 3.5 Haiku:** \$0.80 input / \$4.00 output

    Perfect for: Classification, extraction, simple Q\&A, high-throughput tasks
  </Accordion>

  <Accordion title="Mid-Range Models ($1-$5/M input tokens)" icon="balance-scale">
    **Balanced performance and cost:**

    * **Claude Haiku 4.5:** \$1.00 input / \$5.00 output
    * **GPT-4.1:** \$2.00 input / \$8.00 output
    * **o3:** \$2.00 input / \$8.00 output
    * **GPT-4o:** \$2.50 input / \$10.00 output
    * **Claude Sonnet 4.5:** \$3.00 input / \$15.00 output
    * **Claude Opus 4.5:** \$5.00 input / \$25.00 output

    Perfect for: Customer support, content generation, code assistance
  </Accordion>

  <Accordion title="Premium Models ($10+/M input tokens)" icon="crown">
    **Advanced reasoning and complex tasks:**

    * **Claude Opus 4:** \$15.00 input / \$75.00 output
    * **o1:** \$15.00 input / \$60.00 output
    * **GPT-5 Pro:** \$15.00 input / \$120.00 output
    * **GPT-4 (legacy):** \$30.00 input / \$60.00 output
    * **o1-pro:** \$150.00 input / \$600.00 output

    Perfect for: Complex reasoning, research, code generation, expert analysis
  </Accordion>

  <Accordion title="Image Generation (Per-Image Pricing)" icon="image">
    **AI image generation models:**

    * **Imagen 4 Fast:** \$0.02 per image
    * **DALL-E 3 Standard (1024×1024):** \$0.040 per image
    * **Imagen 4 Standard:** \$0.04 per image
    * **gemini-2.5-flash-image:** \$0.039 per image (+ token costs)
    * **Imagen 4 Ultra:** \$0.06 per image
    * **DALL-E 3 HD (1024×1024):** \$0.080 per image
    * **DALL-E 3 Large (1024×1792 or 1792×1024):** \$0.080 (Standard) or \$0.120 (HD) per image

    Perfect for: Marketing materials, product images, illustrations
  </Accordion>

  <Accordion title="Video Generation (Per-Second Pricing)" icon="video">
    **AI video generation models:**

    * **Veo 3.0/3.1 Fast:** \$0.15 per second
    * **Veo 3.0/3.1 Standard:** \$0.40 per second

    Perfect for: Marketing videos, product demos, content creation
  </Accordion>

  <Accordion title="Audio Models (Transcription & TTS)" icon="microphone">
    **Audio transcription models:**

    * **Whisper-1:** \$0.006 per minute (\$0.0001/second)

    **Text-to-speech models:**

    * **TTS-1 (Standard):** \$15.00 per 1M characters
    * **TTS-1-HD (High Definition):** \$30.00 per 1M characters

    Perfect for: Transcription services, voice assistants, audiobooks, accessibility
  </Accordion>
</AccordionGroup>

***

## How Pricing Works

### Token Calculation

LLM Ops tracks both input and output tokens separately:

* **Input tokens** = Your prompt + any system messages + conversation history
* **Output tokens** = The model's response

**Example:**

```text theme={null}
Input: "Write a haiku about AI" (6 tokens)
Output: "Silicon dreams flow / Algorithms learn and grow / Future unfolds now" (15 tokens)

Cost with GPT-4o:
Input: 6 tokens × $2.50/1M = $0.000015
Output: 15 tokens × $10.00/1M = $0.00015
Total: $0.000165
```

### Cost Calculation

Your total cost is calculated as:

```text theme={null}
Total Cost = (Input Tokens / 1,000,000) × Input Price Per Million Tokens
           + (Output Tokens / 1,000,000) × Output Price Per Million Tokens
```

All costs are tracked in real-time and displayed in your [LLM Ops Dashboard](https://llm-ops.cloudidr.com/dashboard).
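
The formula above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the LLM Ops implementation: the `PRICES` dictionary and the `token_cost` function are hypothetical names, and the single entry is copied from the GPT-4o row of the pricing table. `Decimal` is used so that very small amounts are not distorted by floating-point rounding.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)

# Hypothetical price table (USD per 1M tokens); values copied from the GPT-4o row.
PRICES = {
    "gpt-4o": {"input": Decimal("2.50"), "output": Decimal("10.00")},
}


def token_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the USD cost of one request from its token counts."""
    price = PRICES[model]
    input_cost = Decimal(input_tokens) * price["input"] / TOKENS_PER_MILLION
    output_cost = Decimal(output_tokens) * price["output"] / TOKENS_PER_MILLION
    return input_cost + output_cost


# The haiku example: 6 input tokens and 15 output tokens on GPT-4o.
haiku_cost = token_cost("gpt-4o", 6, 15)
```

With the haiku example's counts, `haiku_cost` matches the \$0.000165 total shown earlier.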

***

## Image & Video Generation Pricing

### Image Generation Models

Models such as **DALL-E 3**, **Imagen 4**, and **gemini-2.5-flash-image** generate images. They are billed per image or per image-output token rather than by text tokens alone:

**DALL-E 3 (Per-Image):**

```text theme={null}
Cost = Number of Images × Price Per Image

Example (Standard 1024×1024):
Generate 5 images with dall-e-3
Cost = 5 images × $0.040 = $0.20

Example (HD 1024×1792):
Generate 3 images with dall-e-3 HD quality
Cost = 3 images × $0.120 = $0.36
```

**Imagen 4 Models (Per-Image):**

```text theme={null}
Cost = Number of Images × Price Per Image

Example:
Generate 5 images with imagen-4.0-fast-generate-001
Cost = 5 images × $0.02 = $0.10
```

**gemini-2.5-flash-image (Token-Based):**

```text theme={null}
Text Input Cost = Input Tokens × $0.30/1M
Text Output Cost = Text Output Tokens × $2.50/1M
Image Output Cost = Image Output Tokens × $30.00/1M

Example:
Input: "Generate a sunset image" (100 tokens)
Output: Text description (50 tokens) + Image (1,290 tokens)

Text Input: 100 × $0.30/1M = $0.00003
Text Output: 50 × $2.50/1M = $0.000125
Image Output: 1,290 × $30.00/1M = $0.0387
Total: $0.038855

Note: Each image consumes approximately 1,290 tokens
This equals ~$0.039 per image
```
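
The three-rate calculation for gemini-2.5-flash-image can be sketched as follows. The function name and constants are hypothetical; the rates are the ones listed on this page, and the figure of roughly 1,290 tokens per image comes from the note above.

```python
from decimal import Decimal

MILLION = Decimal(1_000_000)

# USD per 1M tokens for gemini-2.5-flash-image, as listed on this page.
TEXT_INPUT_RATE = Decimal("0.30")
TEXT_OUTPUT_RATE = Decimal("2.50")
IMAGE_OUTPUT_RATE = Decimal("30.00")


def flash_image_cost(
    input_tokens: int, text_output_tokens: int, image_output_tokens: int
) -> Decimal:
    """Sum the three token-based components of one image request."""
    text_input = Decimal(input_tokens) * TEXT_INPUT_RATE / MILLION
    text_output = Decimal(text_output_tokens) * TEXT_OUTPUT_RATE / MILLION
    image_output = Decimal(image_output_tokens) * IMAGE_OUTPUT_RATE / MILLION
    return text_input + text_output + image_output


# The sunset example: 100 input tokens, 50 text output tokens, one 1,290-token image.
sunset_cost = flash_image_cost(100, 50, 1290)
```

Almost all of the cost comes from the image output tokens, which is why the per-image figure of about \$0.039 is a good approximation for short prompts.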

### Video Generation Models

**Veo Models (Per-Second):**

```text theme={null}
Cost = Video Duration (seconds) × Price Per Second

Example:
Generate 10-second video with veo-3.0-fast-generate-001
Cost = 10 seconds × $0.15 = $1.50
```

### Audio Transcription Models

**Whisper (Per-Second):**

```text theme={null}
Cost = Audio Duration (seconds) × Price Per Second

Example:
Transcribe 125.5-second audio file with whisper-1
Cost = 125.5 seconds × $0.0001 = $0.01255

Note: $0.0001/second = $0.006/minute
```

### Text-to-Speech Models

**TTS Models (Per-Character):**

```text theme={null}
Cost = (Characters / 1,000,000) × Price Per Million Characters

Example (Standard):
Generate speech from 1,250 characters with tts-1
Cost = (1,250 / 1,000,000) × $15.00 = $0.01875

Example (HD):
Generate speech from 1,250 characters with tts-1-hd
Cost = (1,250 / 1,000,000) × $30.00 = $0.0375
```

<Note>
  **Provider Counting:**

  Image, video, and audio generation costs are calculated based on what the provider reports:

  * **Images (DALL-E 3, Imagen 4):** Provider returns number of images generated
  * **Videos (Veo):** Provider returns video duration in seconds
  * **Audio (Whisper):** Provider returns audio duration in seconds
  * **TTS (TTS-1, TTS-1-HD):** Provider returns character count of input text

  We trust the provider's counts and multiply by our pricing table.
</Note>

***

## Multimodal Pricing

### Two Types of Multimodal Models

**1. Multimodal Understanding Models (Token-Based)**

* These models **analyze** images, videos, and audio
* Input media is converted to tokens by the provider
* Charged per token (text + media tokens combined)
* Examples: GPT-4o, Claude Opus, Gemini 2.5 Flash

**2. Media Generation Models (Per-Unit)**

* These models **create** images, video, or speech, or transcribe audio to text
* DALL-E 3, Imagen 4: Charged per image generated
* Veo 3: Charged per second of video generated
* Whisper: Charged per second of audio transcribed
* TTS: Charged per character of text input
* gemini-2.5-flash-image: Hybrid (token-based, but image output uses premium rate)

<Note>
  **How Multimodal Tokens Are Tracked:**

  For **understanding models**, providers (OpenAI, Anthropic, Google) automatically convert images, video, and audio into tokens and report them in the usage data of the response. LLM Ops tracks the total token count returned by the provider.

  **Image/video/audio input tokens are included in `input_tokens`** - they are not tracked separately.

  For **generation models**, we track based on what the provider charges:

  * **Text tokens:** Standard input/output pricing
  * **Images generated:** Per-image or per-token (depending on model)
  * **Video generated:** Per-second of video
  * **Audio transcribed:** Per-second of audio
  * **TTS generated:** Per-character of input text
</Note>

### Multimodal Understanding Models (Token-Based)

**These models analyze images, video, and audio sent as input:**

<Tabs>
  <Tab title="Images">
    **Models with image understanding:**

    * Gemini 2.0/2.5 Flash
    * GPT-4o
    * GPT-4o Mini
    * Claude Opus 4/4.5
    * Claude Sonnet 4/4.5

    **How it works:**

    1. You send an image with your prompt
    2. Provider converts image to tokens based on resolution
    3. Provider returns total `input_tokens` (text + image)
    4. LLM Ops tracks the total as input tokens
    5. Cost = `input_tokens × input_price`

    **Note:** Higher resolution images = more input tokens = higher cost
  </Tab>

  <Tab title="Audio">
    **Models with audio support:**

    * GPT-4o Audio Preview
    * GPT-4o Realtime Preview
    * Gemini 2.0 Flash
    * Gemini 2.5 Flash (with audio input premium)

    **How it works:**

    1. You send audio with your prompt
    2. Provider converts audio to tokens based on duration
    3. Provider returns total `input_tokens` (text + audio)
    4. LLM Ops tracks the total as input tokens
    5. Cost = `input_tokens × input_price`
  </Tab>

  <Tab title="Video">
    **Models with video support:**

    * Gemini 2.0/2.5 Flash
    * Gemini 1.5 Pro

    **How it works:**

    1. You send video with your prompt
    2. Provider converts video to tokens based on duration, resolution, and frame count
    3. Provider returns total `input_tokens` (text + video)
    4. LLM Ops tracks the total as input tokens
    5. Cost = `input_tokens × input_price`

    **Note:** Longer videos at higher resolution = significantly more input tokens
  </Tab>
</Tabs>

### Pricing Breakdown Summary

Here's how different types of content are charged:

| Content Type             | Understanding (Input)                                                                                | Generation (Output)                                                                                                                                              | Examples                                                                    |
| ------------------------ | ---------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------- |
| **Text**                 | Per token                                                                                            | Per token                                                                                                                                                        | All models                                                                  |
| **Images (Input)**       | Converted to tokens by provider<br />Included in `input_tokens`                                      | N/A                                                                                                                                                              | GPT-4o, Claude Opus, Gemini Flash                                           |
| **Audio (Input)**        | Converted to tokens by provider<br />Included in `input_tokens`<br />Some models charge premium rate | N/A                                                                                                                                                              | GPT-4o Audio, Gemini Flash <br /> (Audio input: \$1.00/1M vs \$0.30/1M text) |
| **Audio Transcription**  | Per second (\$0.0001/sec = \$0.006/min)                                                              | Text output (per token)                                                                                                                                          | whisper-1                                                                   |
| **Video (Input)**        | Converted to tokens by provider<br />Included in `input_tokens`                                      | N/A                                                                                                                                                              | Gemini 2.5 Flash                                                            |
| **Images (Output)**      | N/A                                                                                                  | **DALL-E 3:** Per image (\$0.040-\$0.120) <br /> **Imagen 4:** Per image (\$0.02-\$0.06) <br /> **gemini-2.5-flash-image:** Per token (\$30/1M = ~\$0.039/image) | dall-e-3, imagen-4.0-fast<br />gemini-2.5-flash-image                       |
| **Video (Output)**       | N/A                                                                                                  | Per second (\$0.15-\$0.40/sec)                                                                                                                                   | veo-3.0-fast, veo-3.1                                                       |
| **Audio (Output - TTS)** | N/A                                                                                                  | **OpenAI TTS:** Per character (\$15-\$30/1M) <br /> **Gemini TTS:** Per token (\$10-\$20/1M)                                                                     | tts-1, tts-1-hd<br />gemini-2.5-flash-tts                                   |

<Note>
  **Key Takeaways:**

  * **Understanding (Input):** Media → Tokens → Cost per token
  * **Generation (Output):**
    * Images: Per image (DALL-E 3, Imagen 4) OR per token (gemini-2.5-flash-image)
    * Video: Per second (Veo)
    * Audio Transcription: Per second (Whisper)
    * Text-to-Speech: Per character (OpenAI TTS) OR per token (Gemini TTS)
  * **Provider Controls Conversion:** We trust provider counts
  * **Not Tracked Separately:** Input media tokens are combined with text tokens in `input_tokens`
</Note>

### Example: Image Token Calculation

```text theme={null}
Request:
- Text prompt: "Describe this image" (4 tokens)
- Image: 1024x1024 JPG (converted to 765 tokens by provider)

Provider Response:
{
  "usage": {
    "input_tokens": 769,     // 4 text + 765 image
    "output_tokens": 50      // Response tokens
  }
}

Cost Calculation (using GPT-4o pricing):
Input: 769 tokens × $2.50/1M = $0.0019225
Output: 50 tokens × $10.00/1M = $0.0005
Total: $0.0024225
```

**LLM Ops Dashboard shows:**

* Input Tokens: 769 (includes both text and image)
* Output Tokens: 50
* Total Cost: \$0.0024225
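
The same calculation can be sketched from a raw response body. This is an illustration under stated assumptions: the `input_tokens` and `output_tokens` field names follow the example above, while real providers may name or nest their usage fields differently. `cost_from_usage` is a hypothetical helper, not part of any SDK.

```python
import json
from decimal import Decimal

MILLION = Decimal(1_000_000)

# Response body shaped like the example above (comments removed so it is valid JSON).
response_body = '{"usage": {"input_tokens": 769, "output_tokens": 50}}'


def cost_from_usage(body: str, input_rate: Decimal, output_rate: Decimal) -> Decimal:
    """Compute the request cost from the provider-reported usage block.

    Rates are USD per 1M tokens. Image tokens are already folded into
    input_tokens, so they need no separate handling here.
    """
    usage = json.loads(body)["usage"]
    input_cost = Decimal(usage["input_tokens"]) * input_rate / MILLION
    output_cost = Decimal(usage["output_tokens"]) * output_rate / MILLION
    return input_cost + output_cost


# GPT-4o rates from the pricing table: $2.50 input, $10.00 output per 1M tokens.
image_request_cost = cost_from_usage(response_body, Decimal("2.50"), Decimal("10.00"))
```

Because the 765 image tokens arrive inside `input_tokens`, the function never sees the image itself, which mirrors how the totals appear in the dashboard.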

<Warning>
  **Image/Audio/Video Breakdown Not Available:**

  LLM Ops does **not** currently separate multimodal tokens from text tokens. All input tokens (text + image + video + audio) are tracked together as `input_tokens`.

  If you need separate multimodal token tracking, please contact us at [support@cloudidr.com](mailto:support@cloudidr.com).
</Warning>

***

## Pricing Updates

Model pricing is set by the providers (Anthropic, OpenAI, Google) and can change at any time.

**How we handle updates:**

* ✅ We monitor provider pricing pages daily
* ✅ Updates are applied within 24 hours of provider changes
* ✅ Historical data uses pricing from the time of request
* ✅ You're notified of major pricing changes
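
One way to picture "pricing from the time of request" is a dated price history with a lookup by request date. The sketch below is an assumption about how such a lookup could work, not a description of the LLM Ops internals. The `HISTORY` entries, including their dates and prices, are invented purely for illustration.

```python
from bisect import bisect_right
from datetime import date
from decimal import Decimal

# Invented price history for one hypothetical model, sorted by effective date:
# (effective_from, input USD per 1M tokens, output USD per 1M tokens).
HISTORY = [
    (date(2025, 1, 1), Decimal("3.00"), Decimal("12.00")),
    (date(2025, 6, 1), Decimal("2.50"), Decimal("10.00")),
]


def price_at(history, when: date) -> tuple[Decimal, Decimal]:
    """Return the (input, output) rates that were in effect on the given date."""
    effective_dates = [entry[0] for entry in history]
    # bisect_right finds the first entry that starts after `when`;
    # the entry just before it is the one in effect.
    index = bisect_right(effective_dates, when) - 1
    if index < 0:
        raise LookupError("no price was in effect on that date")
    _, input_rate, output_rate = history[index]
    return input_rate, output_rate


old_rates = price_at(HISTORY, date(2025, 3, 15))  # before the change
new_rates = price_at(HISTORY, date(2025, 6, 1))   # on the day of the change
```

A request dated before a price change keeps the earlier rates, so recalculating historical spend does not shift when a provider updates its pricing.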

<Note>
  **Last pricing update:** January 10, 2026

  Check this page regularly for pricing updates.
</Note>

***

## Need a Model Added?

If you're using a model that's not listed here:

<Steps>
  <Step title="Check Provider Documentation">
    Verify the model exists in your provider's official API docs
  </Step>

  <Step title="Contact Us">
    Email [support@cloudidr.com](mailto:support@cloudidr.com) with:

    * Model name
    * Provider (Anthropic/OpenAI/Google)
    * Link to provider pricing
  </Step>

  <Step title="We'll Add It">
    We typically add new models within 2-3 business days
  </Step>
</Steps>

***

## Next Steps

<CardGroup cols={3}>
  <Card title="Anthropic Integration" icon="sparkles" href="/guides/llm-ops/integrations/anthropic">
    Start tracking Claude costs
  </Card>

  <Card title="OpenAI Integration" icon="brain" href="/guides/llm-ops/integrations/openai">
    Start tracking GPT costs
  </Card>

  <Card title="Google Integration" icon="google" href="/guides/llm-ops/integrations/gemini">
    Start tracking Gemini costs
  </Card>
</CardGroup>
