
Overview

Track costs and monitor usage for OpenAI’s API (GPT-4, GPT-3.5, o1) by routing your requests through LLM Ops. This guide shows you how to integrate using Python, JavaScript, or cURL.
Security Guarantee: LLM Ops does NOT store your API keys, request prompts, or response content. We only track metadata needed for cost analytics.

Quick Start

Replace OpenAI’s base URL with the LLM Ops proxy URL to automatically track costs:
  • Original: https://api.openai.com/v1
  • LLM Ops: https://api.llm-ops.cloudidr.com/v1

API Keys

You’ll need two credentials:
  1. OpenAI API Key - Your API key from platform.openai.com
  2. Cloudidr Token - Your tracking token from llmfinops.ai
Set them as environment variables:
export OPENAI_API_KEY="sk-proj-..."
export CLOUDIDR_TOKEN="cloudidr_..."
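
With both variables exported, the client can be built from the environment instead of hard-coding secrets in source files. A minimal sketch - the `require_env` and `client_kwargs` helpers are ours, not part of the OpenAI SDK:

```python
import os

def require_env(name: str) -> str:
    """Fetch a required environment variable, failing fast if it is unset."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"{name} is not set; export it before starting the app")
    return value

def client_kwargs() -> dict:
    """Assemble the keyword arguments for OpenAI(...) from the environment."""
    return {
        "api_key": require_env("OPENAI_API_KEY"),
        "base_url": "https://api.llm-ops.cloudidr.com/v1",
        "default_headers": {"X-Cloudidr-Token": require_env("CLOUDIDR_TOKEN")},
    }

# Usage: client = OpenAI(**client_kwargs())
```

Failing fast at startup is preferable to a confusing 401 later, since the two credentials are easy to mix up.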

Integration Examples

Install SDK

pip install openai

Basic Example

from openai import OpenAI

# Initialize client with LLM Ops proxy
client = OpenAI(
    api_key="sk-proj-...",  # Your OpenAI API key
    base_url="https://api.llm-ops.cloudidr.com/v1",
    default_headers={
        "X-Cloudidr-Token": "cloudidr_..."  # Required for cost tracking
    }
)

# Make API call - costs are automatically tracked
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)

print(response.choices[0].message.content)

With Metadata (Department/Team/Agent Tracking)

from openai import OpenAI

client = OpenAI(
    api_key="sk-proj-...",
    base_url="https://api.llm-ops.cloudidr.com/v1"
)

# Track costs by department, team, and agent
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": "Explain async/await in Python"}
    ],
    extra_headers={
        "X-Cloudidr-Token": "cloudidr_...",
        "X-Department": "engineering",
        "X-Team": "backend",
        "X-Agent": "code-tutor"
    }
)

print(response.choices[0].message.content)

Streaming Example

from openai import OpenAI

client = OpenAI(
    api_key="sk-proj-...",
    base_url="https://api.llm-ops.cloudidr.com/v1",
    default_headers={
        "X-Cloudidr-Token": "cloudidr_...",
        "X-Agent": "chat-bot"
    }
)

# Streaming is fully supported
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "Write a haiku about programming"}
    ],
    stream=True
)

for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

Function Calling Example

from openai import OpenAI

client = OpenAI(
    api_key="sk-proj-...",
    base_url="https://api.llm-ops.cloudidr.com/v1",
    default_headers={
        "X-Cloudidr-Token": "cloudidr_...",
        "X-Agent": "weather-assistant"
    }
)

# Function calling with cost tracking
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string"}
            },
            "required": ["location"]
        }
    }
}]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "What's the weather in San Francisco?"}
    ],
    tools=tools
)

print(response.choices[0].message.tool_calls)
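
The response above contains the tool call but not its result; the usual next step is to run the named function locally and send its output back to the model. A sketch of that dispatch step - `get_weather` here is a stand-in implementation, and in a real app you would append the result as a `role: "tool"` message and call the API again:

```python
import json

# Stand-in implementation of the get_weather tool; a real app would
# call an actual weather service here.
def get_weather(location: str) -> dict:
    return {"location": location, "temperature_c": 18, "conditions": "foggy"}

TOOL_REGISTRY = {"get_weather": get_weather}

def dispatch_tool_call(name: str, arguments_json: str) -> str:
    """Run the tool the model requested and return its result as a JSON string."""
    args = json.loads(arguments_json)
    result = TOOL_REGISTRY[name](**args)
    return json.dumps(result)

# With the response from above:
# call = response.choices[0].message.tool_calls[0]
# tool_output = dispatch_tool_call(call.function.name, call.function.arguments)
```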

Cost Tracking Headers

Add these headers to organize your costs by department, team, or agent:
  • X-Cloudidr-Token (Required) - Your CloudIDR tracking token. Example: cloudidr_abc123...
  • X-Department - Track costs by department. Examples: engineering, sales, marketing, support
  • X-Team - Track costs by team. Examples: backend, frontend, ml, data, qa
  • X-Agent - Track costs by agent/application. Examples: chatbot, summarizer, analyzer, translator

Supported Models

All OpenAI models are supported. See the Supported Models page for the complete list of available models and pricing.

What Gets Tracked

LLM Ops automatically captures:
  • Token usage - Prompt, completion, and total tokens
  • Cost - Real-time cost calculation
  • Latency - Request duration (TTFT, total time)
  • Model - Which OpenAI model was used
  • Metadata - Department, team, agent
  • Errors - Failed requests and error types
  • Function calls - Tool/function usage tracking
What We DON’T Track:
  • ❌ Customer API keys
  • ❌ Request content (prompts)
  • ❌ Response content (completions)
We only track metadata needed for cost analytics.
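
The token counts LLM Ops records are the same ones the OpenAI SDK exposes on non-streaming responses via `response.usage`, which makes it easy to cross-check the dashboard locally. A small sketch (`summarize_usage` is our helper name, not an SDK function):

```python
def summarize_usage(response) -> dict:
    """Extract the token counts from a chat completion response."""
    u = response.usage
    return {
        "prompt_tokens": u.prompt_tokens,
        "completion_tokens": u.completion_tokens,
        "total_tokens": u.total_tokens,
    }

# Usage: print(summarize_usage(response))
```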

View Your Data

After making requests, view your costs in the LLM Ops Dashboard:
  • Agent Explorer - See costs by agent/application
  • Department Breakdown - Compare department spending
  • Team Analysis - Track team-level costs
  • Model Comparison - Compare costs across models
  • Time Series - Track spending over time
  • Cost Optimization - Get recommendations for cheaper models

Migration from Direct API

Switching from the direct OpenAI API to LLM Ops is a two-line change:
# Before
client = OpenAI(api_key="sk-proj-...")

# After - add base_url and X-Cloudidr-Token header
client = OpenAI(
    api_key="sk-proj-...",
    base_url="https://api.llm-ops.cloudidr.com/v1",      # ← Add this
    default_headers={"X-Cloudidr-Token": "cloudidr_..."}  # ← Add this
)
Everything else stays the same - no other code changes are needed.

Cost Optimization Tips

Use the LLM Ops dashboard to identify which agents can switch to cheaper models:
  • Track cost per request by model
  • Compare quality vs. cost trade-offs
  • Identify high-volume, low-complexity tasks
Perfect candidates for model switching appear in the Agent Explorer.
Function calling adds token overhead:
  • Track function call frequency per agent
  • Identify redundant or unnecessary calls
  • Optimize function descriptions to reduce tokens
LLM Ops tracks tool usage separately from chat completion.
OpenAI’s prompt caching can reduce input-token costs by up to 50% for repeated prompt prefixes, such as long system prompts:
  • Track cache hit rates in dashboard
  • Identify agents with repeated prompts
  • Structure prompts for maximum cache benefit
LLM Ops shows cache savings in cost breakdowns.
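
To sanity-check the dashboard’s numbers, a per-request cost can be estimated from the token counts. The rates below are illustrative placeholders, not real pricing - always use the figures on the Supported Models page:

```python
# Illustrative per-million-token rates in USD -- placeholders only,
# NOT real OpenAI pricing.
RATES = {
    "gpt-4o": {"input": 2.50, "output": 10.00},
}

def estimate_cost_usd(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Rough cost estimate: token counts times the per-million-token rate."""
    r = RATES[model]
    return (prompt_tokens * r["input"] + completion_tokens * r["output"]) / 1_000_000
```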

Troubleshooting

Check these common issues:
  • ✅ Verify base URL is https://api.llm-ops.cloudidr.com/v1
  • ✅ Confirm X-Cloudidr-Token header is included in all requests
  • ✅ Check that your OpenAI API key is valid
  • ✅ Ensure you’re using /v1 in the endpoint path
Two separate keys are needed:
  • Your OpenAI API key (for GPT access)
  • Your CloudIDR token (for cost tracking)
Make sure both are set correctly and not swapped.
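
A quick way to catch missing or swapped credentials before making any requests - the prefix checks assume the key formats shown in this guide (sk-... for OpenAI, cloudidr_... for CloudIDR):

```python
import os

def check_credentials() -> list:
    """Return a list of credential problems; an empty list means both look right."""
    problems = []
    openai_key = os.environ.get("OPENAI_API_KEY", "")
    token = os.environ.get("CLOUDIDR_TOKEN", "")
    if not openai_key:
        problems.append("OPENAI_API_KEY is not set")
    elif not openai_key.startswith("sk-"):
        problems.append("OPENAI_API_KEY does not start with sk- (keys may be swapped)")
    if not token:
        problems.append("CLOUDIDR_TOKEN is not set")
    elif not token.startswith("cloudidr_"):
        problems.append("CLOUDIDR_TOKEN does not start with cloudidr_ (keys may be swapped)")
    return problems
```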
If costs don’t appear right away, wait a few moments:
  • Cost data may take 10-30 seconds to appear in dashboard
  • Check the correct time range in dashboard filters
  • Verify requests are returning 200 OK status

Next Steps