Why This Matters
Google's Gemini API is one of the most capable available in 2025. With a context window of up to 2 million tokens and multimodal input (text, image, audio, video), it handles complex technical tasks that many competitors cannot. Mastering this API gives you a real edge for building production-grade AI applications.
Prerequisites
- Python 3.9+
- A Google account and API key (free at aistudio.google.com)
- Basic Python knowledge
Installation
```bash
pip install google-generativeai python-dotenv
```
Create a `.env` file at the project root:

```
GEMINI_API_KEY=your_api_key_here
```
Step 1: Client Initialization

```python
import os

import google.generativeai as genai
from dotenv import load_dotenv

load_dotenv()
genai.configure(api_key=os.environ["GEMINI_API_KEY"])

# Choose the model
model = genai.GenerativeModel("gemini-1.5-pro")
```
Step 2: Simple Text Generation

```python
response = model.generate_content("Explain the concept of tokenization in NLP.")
print(response.text)
```
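By default, `generate_content` uses the model's default sampling settings; a `generation_config` lets you control them, including the `max_output_tokens` limit discussed under Common Mistakes. A minimal sketch (parameter names follow the `google-generativeai` SDK; the values are illustrative assumptions, not tuned recommendations):

```python
# Values below are illustrative assumptions, not recommendations.
generation_config = {
    "temperature": 0.7,         # sampling randomness
    "top_p": 0.95,              # nucleus sampling cutoff
    "max_output_tokens": 2048,  # raise this if responses come back truncated
}

# Attach it at model creation (requires a configured API key):
# model = genai.GenerativeModel("gemini-1.5-pro",
#                               generation_config=generation_config)
```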
Step 3: Streaming (for long responses)

```python
for chunk in model.generate_content("Write an article about LLMs.", stream=True):
    print(chunk.text, end="", flush=True)
```
Step 4: Image Analysis (multimodal)

```python
import PIL.Image

img = PIL.Image.open("screenshot.png")
response = model.generate_content(["What do you see in this image?", img])
print(response.text)
```
Gemini Model Comparison
| Model | Context | Speed | Cost | Best for |
|---|---|---|---|---|
| gemini-1.5-pro | 2M tokens | Slow | $$$ | Long document analysis |
| gemini-1.5-flash | 1M tokens | Fast | $ | Real-time applications |
| gemini-1.0-pro | 32k tokens | Medium | $$ | General use |
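As a rule of thumb, the table can be collapsed into a small helper. `pick_model` is a hypothetical sketch, not part of the SDK; its thresholds simply mirror the rows above:

```python
def pick_model(context_tokens: int, needs_low_latency: bool) -> str:
    """Pick a Gemini model name based on the comparison table (illustrative)."""
    if context_tokens > 1_000_000:
        return "gemini-1.5-pro"    # only model with a 2M-token window
    if needs_low_latency:
        return "gemini-1.5-flash"  # fastest and cheapest
    if context_tokens > 32_000:
        return "gemini-1.5-flash"  # 1.0-pro tops out at 32k tokens
    return "gemini-1.0-pro"        # fine for short, general-purpose calls

print(pick_model(1_500_000, False))  # gemini-1.5-pro
```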
⚠️ Common Mistakes
- `ResourceExhausted`: you've hit the rate limit. Add a `time.sleep(1)` between calls.
- `InvalidArgument`: the image is too large. Resize it to 4 MB max.
- Truncated response: increase `max_output_tokens` in `generation_config`.
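A `time.sleep(1)` works in a pinch, but the more robust pattern for rate limits is exponential backoff. A minimal, library-agnostic sketch; in a real client you would pass `google.api_core.exceptions.ResourceExhausted` as the retriable exception:

```python
import time

def with_retries(fn, retriable_exc, max_attempts=5, base_delay=1.0):
    """Call fn(), retrying on retriable_exc with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retriable_exc:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error
            time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...

# Usage sketch (assumes `model` from Step 1 and the google-api-core package):
# from google.api_core.exceptions import ResourceExhausted
# text = with_retries(lambda: model.generate_content("Hi").text, ResourceExhausted)
```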
Key Takeaways
- The API is free up to a certain quota, enough to get started
- `gemini-1.5-flash` is 10x cheaper than `pro` for 95% of use cases
- Streaming significantly improves UX for long responses
- The 2M token context window allows injecting entire codebases
- Always handle rate-limit errors with exponential backoff retries
Resources
- Official Documentation
- Google AI Studio (free playground)
- Python SDK on GitHub