Overview
Track costs and monitor usage for Google’s Gemini API by routing your requests through LLM Ops. This guide shows you how to integrate using Python, JavaScript, or cURL.

Security Guarantee: LLM Ops does not store your API keys, request prompts, or response content in the analytics database; only the metadata needed for cost analytics is kept. The proxy must forward request bodies to Google to complete the call, and optional operational logging may exist in your deployment environment.
Quick Start
Point the Gemini client at the LLM Ops API host (same path layout as Google: `/v1beta/models/...`). The proxy serves:

- Original API host: `https://generativelanguage.googleapis.com`
- LLM Ops API host: `https://api.llm-ops.cloudidr.com` (paths such as `/v1beta/models/{model}:generateContent` stay the same; only the host changes)
API Keys
You’ll need two credentials:

- Google API Key - Your Gemini API key from aistudio.google.com (or your Google Cloud project)
- Cloudidr Key - Your tracking token from the LLM Ops dashboard (tokens are typically prefixed with `trk_`)
Integration Examples
- Python
- JavaScript
- cURL
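As a starting point for the Python path, here is a minimal stdlib-only sketch of calling `generateContent` through the proxy. The model name, API key, and `trk_` token are placeholders, and the helper names (`build_request`, `generate_content`) are our own, not part of any SDK; official Gemini clients can usually be pointed at the proxy simply by overriding their base URL.

```python
import json
import urllib.request

LLM_OPS_HOST = "https://api.llm-ops.cloudidr.com"

def build_request(model, prompt, google_api_key, cloudidr_key, **tracking):
    """Build a Gemini generateContent request routed through the LLM Ops proxy.

    Only the host differs from a direct Google call; the path and body are
    the standard Gemini REST format.
    """
    url = f"{LLM_OPS_HOST}/v1beta/models/{model}:generateContent"
    headers = {
        "Content-Type": "application/json",
        "x-goog-api-key": google_api_key,   # your Google credential
        "X-Cloudidr-Key": cloudidr_key,     # required LLM Ops tracking token
    }
    # Optional attribution headers, e.g. department="engineering" -> X-Department
    for name, value in tracking.items():
        headers[f"X-{name.capitalize()}"] = value
    payload = {"contents": [{"parts": [{"text": prompt}]}]}
    return url, headers, payload

def generate_content(model, prompt, google_api_key, cloudidr_key, **tracking):
    """Send the request; the response body is Google's standard JSON."""
    url, headers, payload = build_request(
        model, prompt, google_api_key, cloudidr_key, **tracking
    )
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

The JavaScript and cURL variants differ only in syntax: same host, same path, same `X-Cloudidr-Key` header.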
Cost Tracking Headers
| Header | Description | Example |
|---|---|---|
| `X-Cloudidr-Key` | **Required** - Your Cloudidr tracking token | `trk_abc123...` |
| `X-Department` | Track costs by department | `engineering`, `sales`, `marketing`, `support` |
| `X-Project` | Track costs by project/team (preferred) | `backend`, `frontend`, `ml`, `data`, `qa` |
| `X-Team` | Legacy alias for project/team (same as `X-Project`) | `backend`, `frontend` |
| `X-Agent` | Track costs by agent/application | `chatbot`, `summarizer`, `analyzer`, `translator` |
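The headers above can travel together as a plain dictionary on every request; a minimal sketch with illustrative values (the token is a placeholder):

```python
# Full cost-attribution header set for one request; values are examples.
tracking_headers = {
    "X-Cloudidr-Key": "trk_abc123",  # required tracking token
    "X-Department": "engineering",
    "X-Project": "backend",          # preferred over the legacy X-Team
    "X-Agent": "chatbot",
}
```

Only `X-Cloudidr-Key` is required; omit any attribution header you do not need, and prefer `X-Project` to the legacy `X-Team`.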
Supported Models
All Google Gemini models supported by the proxy are available. See the Supported Models page for the complete list of available models and pricing.

What Gets Tracked
LLM Ops automatically captures:

✅ Token usage - Input and output tokens (including multimodal input counted toward input tokens)
✅ Cost - Real-time cost calculation
✅ Latency - Request duration
✅ Model - Which Gemini model was used
✅ Metadata - Department, team, agent
✅ Errors - Failed requests and error types
✅ Multimodal inputs - Media you send affects token usage; totals appear in input tokens from Google’s usage metadata. Google may report hidden or “thoughts” tokens (context, safety, etc.) in usage; LLM Ops uses those counts for billing alignment where present.
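The token fields above come from the `usageMetadata` object on each Gemini response. A minimal sketch of pulling out the counts described here (the field names are the Gemini REST API's; `extract_usage` and the sample values are our own illustration):

```python
def extract_usage(response_json):
    """Pull the token counts from a Gemini generateContent response."""
    usage = response_json.get("usageMetadata", {})
    return {
        "input_tokens": usage.get("promptTokenCount", 0),
        "output_tokens": usage.get("candidatesTokenCount", 0),
        # Present on thinking models; counted toward billing when set.
        "thoughts_tokens": usage.get("thoughtsTokenCount", 0),
        "total_tokens": usage.get("totalTokenCount", 0),
    }

# Illustrative response fragment
sample = {"usageMetadata": {"promptTokenCount": 264,
                            "candidatesTokenCount": 41,
                            "totalTokenCount": 305}}
print(extract_usage(sample))
# → {'input_tokens': 264, 'output_tokens': 41, 'thoughts_tokens': 0, 'total_tokens': 305}
```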
View Your Data
After making requests, view your costs in the LLM Ops Dashboard:

- Agent Explorer - See costs by agent/application
- Department Breakdown - Compare department spending
- Team Analysis - Track team-level costs
- Model Comparison - Compare costs across Gemini models
- Time Series - Track spending over time
Migration from Direct API
Switching from the direct Gemini API to LLM Ops requires updating the endpoint and adding the tracking header on each request.

Multimodal Support

Gemini supports images, video, and audio; all requests go through the same proxy and are billed from Google’s usage metadata.

Multimodal token tracking: Google converts images/video/audio to tokens and includes them in usage. LLM Ops records total input/output tokens and cost; there is typically not a separate line item per modality in the database.
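A minimal sketch of a multimodal request body, assuming the image is sent inline as base64 (the `inlineData`/`mimeType` names are the Gemini REST camelCase form; the helper names are our own):

```python
import base64

def inline_image_part(raw_bytes, mime_type="image/png"):
    """Wrap raw image bytes as a Gemini inlineData part."""
    return {"inlineData": {"mimeType": mime_type,
                           "data": base64.b64encode(raw_bytes).decode("ascii")}}

def multimodal_payload(prompt, raw_bytes, mime_type="image/png"):
    # Standard generateContent body; the proxy forwards it unchanged, and the
    # image's token cost surfaces in the recorded input-token count.
    return {"contents": [{"parts": [inline_image_part(raw_bytes, mime_type),
                                    {"text": prompt}]}]}
```

The payload is POSTed exactly as in the text-only case; only the `parts` list changes.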
Cost Optimization Tips
Monitor High-Token Multimodal Requests
Images and videos can consume significant tokens:
- Track total input token usage in dashboard
- Identify agents with high token consumption
- Optimize image resolution before sending to API
Leverage Large Context Windows
Many Gemini models support large context windows (limits vary by model):
- Process large documents in fewer calls when appropriate
- Balance context size vs. token cost
Compare Model Variants
Use the dashboard to find cost-saving opportunities:
- Track performance vs. cost by model
- Test different Gemini variants for your workload
- Move high-volume, low-complexity tasks to cheaper models where quality allows
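A back-of-envelope way to compare variants on the same workload; the per-million-token prices below are illustrative assumptions only, so check the Supported Models page for real rates:

```python
# Hypothetical (input_usd, output_usd) prices per 1M tokens; illustrative only.
PRICES = {
    "gemini-1.5-flash": (0.075, 0.30),
    "gemini-1.5-pro": (1.25, 5.00),
}

def estimate_cost(model, input_tokens, output_tokens):
    """Rough dollar cost for comparing variants on an identical token mix."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# Same monthly workload on two variants; flash is ~17x cheaper on this mix.
for model in PRICES:
    print(model, round(estimate_cost(model, 500_000, 50_000), 4))
```

Pair estimates like these with the dashboard's actuals before moving traffic between variants.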
Troubleshooting
Requests not being tracked
Check these common issues:
- ✅ Use the API host `https://api.llm-ops.cloudidr.com` with Gemini paths (`/v1beta/models/...`), the same structure as `generativelanguage.googleapis.com`.
- ✅ Confirm the header name is `X-Cloudidr-Key` (not `X-Cloudidr-Token`) on every request.
- ✅ Pass your Google API key as Google expects (`?key=`, `x-goog-api-key`, or `Authorization`, depending on client).
- ✅ Verify your Cloudidr tracking token is valid.
Authentication errors
Two separate keys are needed:
- Your Google API key (for Gemini access)
- Your Cloudidr tracking token (for cost tracking)
Cost data not appearing
Wait a few moments:
- Cost data may take 10-30 seconds to appear in the dashboard
- Check the correct time range in dashboard filters
- Verify requests are returning 200 OK status
Next Steps
- View Dashboard - See your Gemini API costs in real-time
- Supported Models - View all supported Gemini models
- OpenAI Integration - Add cost tracking for GPT models
- Set Budgets - Configure spending alerts and limits

