Video Analysis with Gemini 3.0: A Developer's Guide
We built Pet Health Scan using Gemini's video analysis capabilities. Here's everything we learned.
Why Gemini for Video?
Gemini 3.0 excels at video analysis for several reasons:
- Native video understanding (not frame-by-frame)
- Long context windows (handles full videos)
- Cost-effective compared to alternatives
- Fast inference for real-time applications
Getting Started
API Setup
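A minimal setup sketch, assuming you call the Gemini REST endpoint directly rather than through an SDK. The helper names (`createGeminiConfig`, `generateContentUrl`, `requestHeaders`) are hypothetical, and the default model id is a placeholder assumption; check the model list for the exact identifier.

```typescript
// Sketch only: the default model id is a placeholder assumption.
interface GeminiConfig {
  apiKey: string;
  model: string;
  baseUrl: string;
}

// Validates the key up front so a missing key fails at startup, not mid-request.
function createGeminiConfig(
  apiKey: string | undefined,
  model: string = "gemini-3.0-flash",
): GeminiConfig {
  const key = (apiKey ?? "").trim();
  if (key === "") {
    throw new Error("Missing Gemini API key");
  }
  return {
    apiKey: key,
    model,
    baseUrl: "https://generativelanguage.googleapis.com/v1beta",
  };
}

// URL of the generateContent method for the configured model.
function generateContentUrl(config: GeminiConfig): string {
  return `${config.baseUrl}/models/${encodeURIComponent(config.model)}:generateContent`;
}

// The API key travels in a header, so it stays out of URLs and access logs.
function requestHeaders(config: GeminiConfig): Record<string, string> {
  return {
    "Content-Type": "application/json",
    "x-goog-api-key": config.apiKey,
  };
}
```

In a Node service you would typically call `createGeminiConfig(process.env.GEMINI_API_KEY)` once at startup and reuse the result.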
Basic Video Analysis
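A basic request can be sketched as a JSON body holding the video as inline base64 data followed by a text prompt. This assumes the REST `generateContent` format; `buildVideoRequest` and `extractText` are hypothetical helper names. Inline data suits short clips, while larger uploads usually go through the Files API instead.

```typescript
// Request and response shapes, reduced to the fields this sketch uses.
interface InlineDataPart {
  inline_data: { mime_type: string; data: string };
}
interface TextPart {
  text: string;
}
type Part = InlineDataPart | TextPart;

interface GenerateContentRequest {
  contents: { role: "user"; parts: Part[] }[];
}

interface GenerateContentResponse {
  candidates?: { content?: { parts?: { text?: string }[] } }[];
}

// Builds a single-turn request: the video first, then the instruction.
function buildVideoRequest(
  videoBase64: string,
  mimeType: string,
  prompt: string,
): GenerateContentRequest {
  return {
    contents: [
      {
        role: "user",
        parts: [
          { inline_data: { mime_type: mimeType, data: videoBase64 } },
          { text: prompt },
        ],
      },
    ],
  };
}

// Joins the text parts of the first candidate; returns "" when there are none.
function extractText(response: GenerateContentResponse): string {
  const parts = response.candidates?.[0]?.content?.parts ?? [];
  return parts.map((part) => part.text ?? "").join("");
}
```

The body is sent with `fetch(url, { method: "POST", headers, body: JSON.stringify(request) })`, and the parsed JSON response goes through `extractText`.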
Real-World Application: Pet Health Scan
Pet Health Scan analyzes videos of pets to identify potential health issues. Here's how we built it.
Architecture
Video Upload Handler
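The validation step of an upload handler might look like the sketch below. The limits (about 50MB, MP4/QuickTime/WebM, 60 seconds) are the ones quoted in this guide's FAQ; `VideoMeta` and `validateVideo` are hypothetical names, and how you obtain the duration (for example from a probe step) is left out.

```typescript
const MAX_BYTES = 50 * 1024 * 1024; // about 50MB, per the FAQ in this guide
const MAX_DURATION_SECONDS = 60;
const ALLOWED_MIME_TYPES = ["video/mp4", "video/quicktime", "video/webm"];

interface VideoMeta {
  sizeBytes: number;
  mimeType: string;
  durationSeconds: number;
}

type ValidationResult = { ok: true } | { ok: false; errors: string[] };

// Collects every problem instead of stopping at the first one,
// so the user can fix them all in a single retry.
function validateVideo(meta: VideoMeta): ValidationResult {
  const errors: string[] = [];
  if (meta.sizeBytes <= 0) {
    errors.push("File is empty.");
  } else if (meta.sizeBytes > MAX_BYTES) {
    errors.push("File is larger than 50MB.");
  }
  if (!ALLOWED_MIME_TYPES.includes(meta.mimeType)) {
    errors.push(`Unsupported format: ${meta.mimeType}.`);
  }
  if (meta.durationSeconds > MAX_DURATION_SECONDS) {
    errors.push("Video is longer than 60 seconds.");
  }
  return errors.length === 0 ? { ok: true } : { ok: false, errors };
}
```

Running this before any upload or API call means rejected files cost nothing.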
Health Analysis Prompt
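The prompt below is a hypothetical illustration of how such a prompt could be assembled, not the prompt Pet Health Scan ships. `PetInfo`, `buildHealthPrompt`, and the JSON shape it requests are all assumptions.

```typescript
interface PetInfo {
  species: string;
  ageYears?: number;
  ownerConcern?: string;
}

// Builds a prompt that asks for observations and possibilities, not a diagnosis.
function buildHealthPrompt(pet: PetInfo): string {
  const context: string[] = [`Species: ${pet.species}`];
  if (pet.ageYears !== undefined) {
    context.push(`Age: ${pet.ageYears} years`);
  }
  if (pet.ownerConcern) {
    context.push(`Owner's concern: ${pet.ownerConcern}`);
  }
  return [
    "You are assisting with a preliminary review of a pet video.",
    ...context,
    "Describe the gait, posture, breathing, and behavior you can actually see.",
    "List potential health issues as possibilities, each with its visible evidence and an MM:SS timestamp.",
    "Do not give a diagnosis. Recommend a veterinarian visit when anything looks concerning.",
    'Respond as JSON: {"observations": [], "possibleIssues": [], "urgency": "low" | "medium" | "high"}.',
  ].join("\n");
}
```

Asking for a fixed JSON shape keeps the response easy to parse and render in the app.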
Advanced Techniques
Timestamp Analysis
For long videos, you can ask for insights at specific timestamps rather than a single summary of the whole clip.
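One way to do this is to name the moments of interest in the prompt in MM:SS form. The sketch below only builds that prompt text; `formatTimestamp` and `buildTimestampPrompt` are hypothetical helpers, and the exact wording of the instruction is an assumption.

```typescript
// Converts a number of seconds into an MM:SS string, e.g. 75 -> "01:15".
function formatTimestamp(totalSeconds: number): string {
  if (!Number.isFinite(totalSeconds) || totalSeconds < 0) {
    throw new RangeError("Timestamp must be a non-negative number of seconds");
  }
  const whole = Math.floor(totalSeconds);
  const minutes = Math.floor(whole / 60);
  const seconds = whole % 60;
  return `${String(minutes).padStart(2, "0")}:${String(seconds).padStart(2, "0")}`;
}

// Appends the sorted timestamps to the question so the answer is
// organized per moment instead of as one summary.
function buildTimestampPrompt(timestampsSeconds: number[], question: string): string {
  const marks = [...timestampsSeconds]
    .sort((a, b) => a - b)
    .map(formatTimestamp);
  return `${question}\nAnswer separately for each of these timestamps (MM:SS): ${marks.join(", ")}.`;
}
```

The returned string is sent as the text part next to the video.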
Comparison Analysis
You can also compare two videos in one request, for example before/after footage or two angles of the same movement.
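A comparison request can be sketched by interleaving a short text label before each video so the model can refer to them by name. This assumes the model accepts more than one video part per request; `LabeledVideo` and `buildComparisonParts` are hypothetical names.

```typescript
interface InlineDataPart {
  inline_data: { mime_type: string; data: string };
}
interface TextPart {
  text: string;
}
type Part = InlineDataPart | TextPart;

interface LabeledVideo {
  label: string; // e.g. "Before" or "After"
  base64: string;
  mimeType: string;
}

// Produces: label A, video A, label B, video B, then the comparison question.
function buildComparisonParts(
  first: LabeledVideo,
  second: LabeledVideo,
  question: string,
): Part[] {
  const toParts = (video: LabeledVideo): Part[] => [
    { text: `${video.label}:` },
    { inline_data: { mime_type: video.mimeType, data: video.base64 } },
  ];
  return [
    ...toParts(first),
    ...toParts(second),
    { text: `Compare ${first.label} with ${second.label}. ${question}` },
  ];
}
```

The resulting array becomes the `parts` of a single user turn. Two videos double the input tokens, so trimming both clips first keeps the request affordable.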
Streaming Analysis
For real-time feedback while a video is being processed, stream the response instead of waiting for the full result.
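The sketch below covers only the client-side parsing of a streamed response. It assumes the REST streaming variant (`streamGenerateContent` with `alt=sse`), which delivers server-sent events whose `data:` lines each carry one JSON chunk on a single line. `parseSseBuffer` is a hypothetical helper.

```typescript
interface StreamChunk {
  candidates?: { content?: { parts?: { text?: string }[] } }[];
}

// Parses every complete event in the buffer and returns the text found,
// plus the unfinished tail to prepend to the next network chunk.
function parseSseBuffer(buffer: string): { texts: string[]; rest: string } {
  const texts: string[] = [];
  // Events are separated by a blank line; the final piece may be incomplete.
  const events = buffer.split(/\r?\n\r?\n/);
  const rest = events.pop() ?? "";
  for (const event of events) {
    for (const line of event.split(/\r?\n/)) {
      if (!line.startsWith("data:")) continue;
      const payload = line.slice("data:".length).trim();
      if (payload === "") continue;
      const chunk = JSON.parse(payload) as StreamChunk;
      const parts = chunk.candidates?.[0]?.content?.parts ?? [];
      const text = parts.map((part) => part.text ?? "").join("");
      if (text !== "") texts.push(text);
    }
  }
  return { texts, rest };
}
```

In a read loop you would append each decoded network chunk to `rest`, call `parseSseBuffer` again, and push the returned texts to the UI as they arrive.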
Performance Optimization
Video Preprocessing
Preprocessing videos before upload reduces cost and improves speed.
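As a sketch, the targets recommended in this guide (720p, 15fps, 60 seconds) map onto an ffmpeg invocation. This assumes ffmpeg is installed on the server; `buildFfmpegArgs` is a hypothetical helper that only builds the argument list, and the codec and quality settings are illustrative choices.

```typescript
interface PreprocessOptions {
  height: number; // output height in pixels
  fps: number; // output frame rate
  maxSeconds: number; // keep only the first N seconds
}

const DEFAULT_PREPROCESS: PreprocessOptions = { height: 720, fps: 15, maxSeconds: 60 };

// Builds the argument list for an `ffmpeg` child process.
function buildFfmpegArgs(
  input: string,
  output: string,
  options: Partial<PreprocessOptions> = {},
): string[] {
  const { height, fps, maxSeconds } = { ...DEFAULT_PREPROCESS, ...options };
  return [
    "-y", // overwrite the output file if it exists
    "-i", input,
    "-t", String(maxSeconds), // truncate to the first maxSeconds
    // scale=-2:H keeps the aspect ratio with an even width.
    // Note: this also upscales sources shorter than H pixels.
    "-vf", `scale=-2:${height},fps=${fps}`,
    "-c:v", "libx264", "-preset", "veryfast", "-crf", "28",
    "-c:a", "aac", "-b:a", "64k",
    "-movflags", "+faststart",
    output,
  ];
}
```

Passing the array to `spawn("ffmpeg", args)` from `node:child_process` avoids shell quoting issues with user-supplied file names.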
Caching
Cache analysis results so that identical videos are not analyzed twice.
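A minimal in-memory sketch: key the cache on a SHA-256 hash of the video bytes plus the model and prompt, so a changed prompt never returns a stale result. `AnalysisCache` is a hypothetical class, and a production deployment would more likely use a shared store such as Redis.

```typescript
import { createHash } from "node:crypto";

interface CacheEntry<T> {
  value: T;
  expiresAt: number; // epoch milliseconds
}

class AnalysisCache<T> {
  private readonly entries = new Map<string, CacheEntry<T>>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // `now` is injectable so expiry can be tested without waiting.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Identical bytes + model + prompt always produce the same key.
  static keyFor(video: Uint8Array, prompt: string, model: string): string {
    return createHash("sha256")
      .update(video)
      .update("\0" + model + "\0" + prompt)
      .digest("hex");
  }

  // Returns the cached value, or undefined when missing or expired.
  get(key: string): T | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: T): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

Hashing the content rather than the file name means a re-uploaded or renamed copy of the same clip still hits the cache.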
Cost Management
Pricing (as of January 2026)
| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| Gemini 3.0 Pro | $1.25 | $5.00 |
| Gemini 3.0 Flash | $0.075 | $0.30 |
Video token calculation: Approximately 263 tokens per second of video.
Example: A 30-second video ≈ 7,890 input tokens (30 × 263) ≈ $0.01 with Pro, $0.0006 with Flash. Output tokens are billed on top at the output rate.
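The arithmetic can be sketched as a small estimator built from the table and the 263 tokens-per-second figure above. `estimateCostUsd` is a hypothetical helper, and the prices are the ones quoted in this guide, so update them if the pricing changes.

```typescript
const TOKENS_PER_VIDEO_SECOND = 263;

// USD per 1M tokens, copied from the pricing table in this guide.
const PRICING = {
  pro: { input: 1.25, output: 5.0 },
  flash: { input: 0.075, output: 0.3 },
};

type Tier = keyof typeof PRICING;

interface CostEstimate {
  inputTokens: number;
  inputCostUsd: number;
  outputCostUsd: number;
  totalUsd: number;
}

// Estimates cost from video length and expected output size.
// Prompt text tokens are ignored here; they are small next to the video.
function estimateCostUsd(durationSeconds: number, outputTokens: number, tier: Tier): CostEstimate {
  const inputTokens = Math.round(durationSeconds * TOKENS_PER_VIDEO_SECOND);
  const inputCostUsd = (inputTokens / 1_000_000) * PRICING[tier].input;
  const outputCostUsd = (outputTokens / 1_000_000) * PRICING[tier].output;
  return { inputTokens, inputCostUsd, outputCostUsd, totalUsd: inputCostUsd + outputCostUsd };
}
```

For a 30-second clip and a 500-token report, this gives about $0.0099 input plus $0.0025 output on Pro, so the output share is not negligible when responses are long.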
Cost Optimization
- Use Flash for initial screening, Pro for detailed analysis
- Truncate long videos to relevant sections
- Lower resolution when full quality isn't needed
- Batch similar requests when possible
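The first tip, screening with Flash and escalating to Pro, can be sketched as a two-tier function. The analyzers are injected so the flow stays independent of any SDK; `ScreeningResult`, `Analyzer`, and `analyzeTiered` are hypothetical names, and how screening decides that a clip needs review is up to your prompt.

```typescript
interface ScreeningResult {
  needsDetailedReview: boolean;
  summary: string;
}

type Analyzer<T> = (video: Uint8Array) => Promise<T>;

interface TieredResult {
  tier: "flash" | "pro";
  report: string;
}

// Runs the cheap screening pass first and only pays for the detailed
// pass when screening flags something.
async function analyzeTiered(
  video: Uint8Array,
  screen: Analyzer<ScreeningResult>,
  detailed: Analyzer<string>,
): Promise<TieredResult> {
  const screening = await screen(video);
  if (!screening.needsDetailedReview) {
    return { tier: "flash", report: screening.summary };
  }
  return { tier: "pro", report: await detailed(video) };
}
```

With the prices above, a clip that stops at screening costs a small fraction of a Pro analysis, so the savings grow with the share of videos that need no escalation.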
Error Handling
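A sketch of the classification step, assuming the REST response exposes `finishReason` on a candidate and `blockReason` in the prompt feedback, and that quota errors arrive as HTTP 429. The `FailureKind` categories, function names, and user-facing messages are hypothetical.

```typescript
type FailureKind = "safety" | "recitation" | "quota" | "transient" | "fatal";

interface ApiOutcome {
  httpStatus: number;
  finishReason?: string; // from candidates[0].finishReason
  blockReason?: string; // from promptFeedback.blockReason
}

// Returns null when the outcome is a normal, usable response.
function classifyFailure(outcome: ApiOutcome): FailureKind | null {
  if (outcome.httpStatus === 429) return "quota";
  if (outcome.httpStatus >= 500) return "transient";
  if (outcome.httpStatus >= 400) return "fatal";
  if (outcome.blockReason === "SAFETY" || outcome.finishReason === "SAFETY") return "safety";
  if (outcome.finishReason === "RECITATION") return "recitation";
  return null;
}

// Only quota and server errors are worth retrying.
function isRetryable(kind: FailureKind): boolean {
  return kind === "quota" || kind === "transient";
}

// Exponential backoff: 1s, 2s, 4s, ... capped at maxMs (attempt starts at 0).
function backoffDelayMs(attempt: number, baseMs: number = 1000, maxMs: number = 30000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Maps a failure to a message that is safe to show to end users.
function userMessage(kind: FailureKind): string {
  switch (kind) {
    case "safety":
      return "This video could not be analyzed because it was flagged by a content filter.";
    case "recitation":
      return "The analysis was blocked. Please try a different video.";
    case "quota":
    case "transient":
      return "The service is busy. Please try again in a moment.";
    case "fatal":
      return "Something went wrong while analyzing this video.";
  }
}
```

A retry loop would call `classifyFailure`, wait `backoffDelayMs(attempt)` when `isRetryable` is true, and otherwise log the outcome and return `userMessage(kind)`.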
Production Checklist
- Video validation (size, format, duration)
- Rate limiting per user
- Error handling for all API responses
- Cost monitoring and alerts
- Caching for repeat analyses
- Logging for debugging
- User consent for video processing
- Data retention policy
Frequently Asked Questions
Q: How much does Gemini 3.0 video analysis cost per video?
A 30-second video uses approximately 7,890 input tokens. With Gemini 3.0 Pro at $1.25/$5.00 per million input/output tokens, that is roughly $0.01 of input cost per analysis, plus a smaller charge for the generated output. Using Gemini 3.0 Flash ($0.075/$0.30) drops the input cost to about $0.0006. For cost optimization, use Flash for initial screening and only escalate to Pro when the screening detects something that needs detailed analysis.
Q: Does Gemini analyze video frame-by-frame or as a continuous stream?
Gemini 3.0 accepts video as a native input, so you send the clip itself rather than extracting and captioning individual frames yourself. The model reasons over the visual and audio content together across the whole clip, which lets it pick up motion, temporal patterns, and context, such as a subtle limp that only appears during certain movements. Pipelines that analyze isolated frames lose those temporal relationships. Note that the API still samples the video internally (Google's documentation for earlier Gemini models describes a default of about one frame per second), so very brief events can fall between samples.
Q: What are the video size and format limitations for the Gemini API?
The Gemini API accepts videos up to approximately 50MB in common formats including MP4, QuickTime, and WebM. For optimal results, preprocess videos to 720p resolution, 15fps, and limit duration to 60 seconds. Lower resolution and frame rate reduce token consumption and cost without significantly impacting analysis quality for most use cases. Always validate file size and format before sending to the API.
Q: How do you handle errors and safety filters in Gemini video analysis?
Gemini may reject videos that trigger safety filters (SAFETY errors) or contain content too similar to training data (RECITATION errors). Implement specific error handling for each case: return a user-friendly message for safety blocks, retry with exponential backoff for quota errors, and log all failures for debugging. Always build a fallback path so your application degrades gracefully when the API rejects or fails to process a video.
Need Video AI for Your Project?
We specialize in video analysis applications with Gemini.
AI 4U Labs builds production video AI applications. Pet Health Scan is one of 30+ apps we've shipped.


