
Overview

Track costs and monitor usage for Google’s Gemini API by routing your requests through LLM Ops. This guide shows you how to integrate using Python, JavaScript, or cURL.
Security Guarantee: LLM Ops does not store your API keys, request prompts, or response content in the analytics database—only metadata needed for cost analytics. The proxy must forward request bodies to Google to complete the call; optional operational logging may exist in your deployment environment.

Quick Start

Point the Gemini client at the LLM Ops API host (same path layout as Google: /v1beta/models/...). The proxy serves:
  • Original API host: https://generativelanguage.googleapis.com
  • LLM Ops API host: https://api.llm-ops.cloudidr.com (paths such as /v1beta/models/{model}:generateContent stay the same; only the host changes)
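To make the host swap concrete, here is a minimal sketch in Python that assembles a proxied generateContent request (the model name, placeholder keys, and the `build_request` helper are our own illustration, not part of any SDK; the path is Google's standard REST route):

```python
import os

# Only the host changes; the Gemini path layout is preserved by the proxy.
PROXY_HOST = "https://api.llm-ops.cloudidr.com"
PATH = "/v1beta/models/gemini-2.0-flash-exp:generateContent"

def build_request(prompt: str):
    """Assemble URL, headers, and JSON body for a proxied generateContent call."""
    url = f"{PROXY_HOST}{PATH}?key={os.environ['GOOGLE_API_KEY']}"
    headers = {"X-Cloudidr-Key": os.environ["CLOUDIDR_KEY"]}
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return url, headers, body

# To send, e.g.: requests.post(url, headers=headers, json=body)
```

The same URL with `generativelanguage.googleapis.com` substituted for the proxy host would hit Google directly, bypassing cost tracking.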

API Keys

You’ll need two credentials:
  1. Google API Key - Your Gemini API key from aistudio.google.com (or your Google Cloud project)
  2. Cloudidr Key - Your tracking token from the LLM Ops dashboard (tokens are typically prefixed with trk_)
The marketing site llmfinops.ai points to the same product; the LLM Ops dashboard is the canonical app host. Set both keys as environment variables:
export GOOGLE_API_KEY="AIzaSy..."
export CLOUDIDR_KEY="trk_..."
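As a sketch, the variables above can be read back at startup so a missing key fails fast (the `load_keys` helper is our own, not part of the SDK):

```python
import os

def load_keys() -> tuple[str, str]:
    """Read both credentials from the environment, failing fast if either is missing."""
    try:
        return os.environ["GOOGLE_API_KEY"], os.environ["CLOUDIDR_KEY"]
    except KeyError as missing:
        raise SystemExit(f"Missing environment variable: {missing.args[0]}")
```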

Integration Examples

Install SDK

pip install google-generativeai

Basic Example

import google.generativeai as genai

genai.configure(
    api_key="AIzaSy...",  # Your Google API key
    transport="rest",
    client_options={
        "api_endpoint": "https://api.llm-ops.cloudidr.com"
    }
)

model = genai.GenerativeModel('gemini-2.0-flash-exp')

response = model.generate_content(
    "What is the capital of France?",
    request_options={
        "headers": {
            "X-Cloudidr-Key": "trk_..."
        }
    }
)

print(response.text)

With Metadata (Department/Team/Agent Tracking)

import google.generativeai as genai

genai.configure(
    api_key="AIzaSy...",
    transport="rest",
    client_options={
        "api_endpoint": "https://api.llm-ops.cloudidr.com"
    }
)

model = genai.GenerativeModel('gemini-2.0-flash-exp')

# X-Project is preferred for team/project; X-Team is a legacy alias
response = model.generate_content(
    "Explain quantum computing in simple terms",
    request_options={
        "headers": {
            "X-Cloudidr-Key": "trk_...",
            "X-Department": "research",
            "X-Team": "ml",
            "X-Agent": "science-explainer"
        }
    }
)

print(response.text)

Streaming Example

import google.generativeai as genai

genai.configure(
    api_key="AIzaSy...",
    transport="rest",
    client_options={
        "api_endpoint": "https://api.llm-ops.cloudidr.com"
    }
)

model = genai.GenerativeModel('gemini-2.0-flash-exp')

response = model.generate_content(
    "Write a story about a robot learning to paint",
    stream=True,
    request_options={
        "headers": {
            "X-Cloudidr-Key": "trk_...",
            "X-Agent": "story-generator"
        }
    }
)

for chunk in response:
    print(chunk.text, end="", flush=True)

Chat Example (Multi-turn)

import google.generativeai as genai

genai.configure(
    api_key="AIzaSy...",
    transport="rest",
    client_options={
        "api_endpoint": "https://api.llm-ops.cloudidr.com"
    }
)

model = genai.GenerativeModel('gemini-2.0-flash-exp')
chat = model.start_chat(history=[])

headers = {
    "X-Cloudidr-Key": "trk_...",
    "X-Agent": "chat-bot"
}

response1 = chat.send_message(
    "Hello! What's your name?",
    request_options={"headers": headers}
)
print(response1.text)

response2 = chat.send_message(
    "Can you help me with Python?",
    request_options={"headers": headers}
)
print(response2.text)

Multimodal Example (Image Analysis)

import google.generativeai as genai
from PIL import Image

genai.configure(
    api_key="AIzaSy...",
    transport="rest",
    client_options={
        "api_endpoint": "https://api.llm-ops.cloudidr.com"
    }
)

model = genai.GenerativeModel('gemini-2.0-flash-exp')

img = Image.open('photo.jpg')
response = model.generate_content(
    ["What's in this image?", img],
    request_options={
        "headers": {
            "X-Cloudidr-Key": "trk_...",
            "X-Agent": "vision-analyzer"
        }
    }
)

print(response.text)

Cost Tracking Headers

| Header | Description | Example |
|---|---|---|
| X-Cloudidr-Key | Required - Your Cloudidr tracking token | trk_abc123... |
| X-Department | Track costs by department | engineering, sales, marketing, support |
| X-Project | Track costs by project/team (preferred) | backend, frontend, ml, data, qa |
| X-Team | Legacy alias for project/team (same as X-Project) | backend, frontend |
| X-Agent | Track costs by agent/application | chatbot, summarizer, analyzer, translator |

Supported Models

All Google Gemini models supported by the proxy are available. See the Supported Models page for the complete list of available models and pricing.

What Gets Tracked

LLM Ops automatically captures:
  • Token usage - Input and output tokens (multimodal input counts toward input tokens)
  • Cost - Real-time cost calculation
  • Latency - Request duration
  • Model - Which Gemini model was used
  • Metadata - Department, team, agent
  • Errors - Failed requests and error types
Media you send affects token usage; totals appear in input tokens from Google's usage metadata. Google may also report hidden or thoughts tokens (context, safety, etc.) in usage; LLM Ops uses those counts for billing alignment where present.
What We DON’T Track:
  • ❌ Customer API keys
  • ❌ Request content (prompts)
  • ❌ Response content (completions)
  • ❌ Raw image/video/audio bytes in our analytics database
We only persist the metadata needed for cost analytics in our application database. Full Gemini safety rating objects are not stored as separate dashboard fields unless your product explicitly adds that; token-based usage is the primary signal in the proxy.
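As a back-of-the-envelope illustration of the cost calculation, here is a sketch that multiplies token counts from usage metadata by per-million-token rates (the rates used below are placeholders for illustration, not actual Gemini pricing):

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_rate_per_m: float, output_rate_per_m: float) -> float:
    """Cost = tokens / 1M * rate, summed over input and output."""
    return (input_tokens / 1_000_000 * input_rate_per_m
            + output_tokens / 1_000_000 * output_rate_per_m)

# With placeholder rates of $0.10/M input and $0.40/M output:
# estimate_cost(500_000, 100_000, 0.10, 0.40) -> 0.09 (0.05 input + 0.04 output)
```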

View Your Data

After making requests, view your costs in the LLM Ops Dashboard:
  • Agent Explorer - See costs by agent/application
  • Department Breakdown - Compare department spending
  • Team Analysis - Track team-level costs
  • Model Comparison - Compare costs across Gemini models
  • Time Series - Track spending over time

Migration from Direct API

Switching from direct Gemini API to LLM Ops requires updating the endpoint and adding the tracking header on each request:
# Before
genai.configure(api_key="AIzaSy...")

# After - point REST transport at the proxy host
genai.configure(
    api_key="AIzaSy...",
    transport="rest",
    client_options={
        "api_endpoint": "https://api.llm-ops.cloudidr.com"
    }
)

response = model.generate_content(
    "Your prompt",
    request_options={
        "headers": {"X-Cloudidr-Key": "trk_..."}
    }
)

Multimodal Support

Gemini supports images, video, and audio. All requests go through the same proxy and are billed from Google's usage metadata; the integration is identical to the Multimodal Example above: configure the proxy endpoint, pass the PIL image alongside the prompt, and include the tracking headers.
Multimodal token tracking: Google converts images/video/audio to tokens and includes them in usage metadata. LLM Ops records total input/output tokens and cost; it does not typically store a separate line item per modality in the database.

Cost Optimization Tips

Images and videos can consume significant tokens:
  • Track total input token usage in dashboard
  • Identify agents with high token consumption
  • Optimize image resolution before sending to API
LLM Ops tracks total input tokens (text + multimodal combined in usage).
Many Gemini models support large context windows (limits vary by model):
  • Process large documents in fewer calls when appropriate
  • Balance context size vs. token cost
Fewer round trips can reduce overhead; very large contexts still incur proportional token cost.
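One way to act on the batching tip is to greedily pack documents into fewer calls under a size budget (character count is a rough proxy for tokens here; the `pack_documents` helper is our own sketch):

```python
def pack_documents(docs: list[str], max_chars: int) -> list[list[str]]:
    """Greedily group documents into batches whose combined size stays under budget."""
    batches, current, size = [], [], 0
    for doc in docs:
        # Start a new batch when adding this doc would exceed the budget.
        if current and size + len(doc) > max_chars:
            batches.append(current)
            current, size = [], 0
        current.append(doc)
        size += len(doc)
    if current:
        batches.append(current)
    return batches
```

Each batch can then be joined into a single prompt, trading one large-context call for several small ones.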
Use the dashboard to find cost-saving opportunities:
  • Track performance vs. cost by model
  • Test different Gemini variants for your workload
  • Move high-volume, low-complexity tasks to cheaper models where quality allows

Troubleshooting

Check these common issues:
  • ✅ Use API host https://api.llm-ops.cloudidr.com with Gemini paths (/v1beta/models/...)—same structure as generativelanguage.googleapis.com.
  • ✅ Confirm the header name is X-Cloudidr-Key (not X-Cloudidr-Token) on every request.
  • ✅ Pass your Google API key as Google expects (?key=, x-goog-api-key, or Authorization, depending on client).
  • ✅ Verify your Cloudidr tracking token is valid.
Two separate keys are needed:
  • Your Google API key (for Gemini access)
  • Your Cloudidr tracking token (for cost tracking)
Make sure both are set correctly and not swapped.
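A quick sanity check for swapped keys, based on the prefixes shown above (AIzaSy for Google API keys, trk_ for Cloudidr tokens); purely illustrative:

```python
def classify_key(value: str) -> str:
    """Guess which credential a string is, using the documented prefixes."""
    if value.startswith("AIzaSy"):
        return "google"
    if value.startswith("trk_"):
        return "cloudidr"
    return "unknown"
```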
Wait a few moments:
  • Cost data may take 10-30 seconds to appear in dashboard
  • Check the correct time range in dashboard filters
  • Verify requests are returning 200 OK status

Next Steps

View Dashboard

See your Gemini API costs in real-time

Supported Models

View all supported Gemini models

OpenAI Integration

Add cost tracking for GPT models

Set Budgets

Configure spending alerts and limits