# Token Counting Guide

## Overview

This bot implements comprehensive token counting for both text and images, with special handling for Discord image links stored in MongoDB with 24-hour expiration.

## Token Encoding by Model

### o200k_base (200k vocabulary) - Newer Models

Used for:

- ✅ **gpt-4o** and **gpt-4o-mini**
- ✅ **gpt-4.1**, **gpt-4.1-mini**, **gpt-4.1-nano** (NEW!)
- ✅ **gpt-5**, **gpt-5-mini**, **gpt-5-nano**, **gpt-5-chat**
- ✅ **o1**, **o1-mini**, **o1-preview**
- ✅ **o3**, **o3-mini**
- ✅ **o4**, **o4-mini**

### cl100k_base (100k vocabulary) - Older Models

Used for:

- ✅ **gpt-4** (original, not 4o or 4.1)
- ✅ **gpt-3.5-turbo**

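If you need to reproduce this mapping outside the bot (for example in a standalone script), a minimal sketch using `tiktoken` is shown below. The prefix list and the `get_encoding_for_model` helper are illustrative assumptions, not the exact logic in `token_counter.py`.

```python
import tiktoken

# Assumed prefix list for o200k_base models; token_counter.py may resolve encodings differently.
O200K_PREFIXES = ("gpt-4o", "gpt-4.1", "gpt-5", "o1", "o3", "o4")

def get_encoding_for_model(model: str) -> tiktoken.Encoding:
    name = model.split("/")[-1]  # drop a provider prefix such as "openai/"
    if name.startswith(O200K_PREFIXES):
        return tiktoken.get_encoding("o200k_base")
    return tiktoken.get_encoding("cl100k_base")  # gpt-4, gpt-3.5-turbo

# Example: count tokens for a short string with the gpt-4o encoding
print(len(get_encoding_for_model("openai/gpt-4o").encode("Hello, world!")))
```
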
## Token Counting Features

### 1. Text Token Counting

```python
from src.utils.token_counter import token_counter

# Count text tokens
tokens = token_counter.count_text_tokens("Hello, world!", "openai/gpt-4o")
print(f"Text uses {tokens} tokens")
```

### 2. Image Token Counting

Images consume tokens based on their dimensions and detail level:

#### Low Detail
- **85 tokens** (fixed cost)

#### High Detail
- **Base cost**: 170 tokens
- **Tile cost**: 170 tokens per 512x512 tile
- Images are scaled to fit 2048x2048
- Shortest side scaled to 768px
- Divided into 512x512 tiles

```python
# Count image tokens from Discord URL
tokens = await token_counter.count_image_tokens(
    image_url="https://cdn.discordapp.com/attachments/...",
    detail="auto"
)
print(f"Image uses {tokens} tokens")

# Count image tokens from bytes
with open("image.png", "rb") as f:
    image_data = f.read()
tokens = await token_counter.count_image_tokens(
    image_data=image_data,
    detail="high"
)
```

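The high-detail cost can also be estimated directly from the image dimensions. The helper below is a minimal sketch using the base and tile costs documented above; the scaling and rounding details inside `token_counter.py` may differ.

```python
import math

def high_detail_image_tokens(width: int, height: int) -> int:
    """Sketch of the tile-based cost using the constants documented above."""
    # Downscale to fit within 2048x2048 (never upscale)
    scale = min(1.0, 2048 / max(width, height))
    width, height = int(width * scale), int(height * scale)
    # Downscale so the shortest side is at most 768px
    scale = min(1.0, 768 / min(width, height))
    width, height = int(width * scale), int(height * scale)
    # Count the 512x512 tiles the scaled image covers
    tiles = math.ceil(width / 512) * math.ceil(height / 512)
    return 170 + 170 * tiles  # base cost + per-tile cost

# Example: a 1024x1024 image scales to 768x768 -> 4 tiles -> 170 + 4 * 170 = 850 tokens
print(high_detail_image_tokens(1024, 1024))
```
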
### 3. Message Token Counting

Count tokens for complete message arrays including text and images:

```python
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
]

token_counts = await token_counter.count_message_tokens(messages, "openai/gpt-4o")
print(f"Total: {token_counts['total_tokens']} tokens")
print(f"Text: {token_counts['text_tokens']} tokens")
print(f"Images: {token_counts['image_tokens']} tokens")
```

### 4. Context Limit Checking

Check whether the messages fit within the model's context window:

```python
context_check = await token_counter.check_context_limit(
    messages=messages,
    model="openai/gpt-4o",
    max_output_tokens=4096
)

if not context_check["within_limit"]:
    print(f"⚠️ Messages too large: {context_check['input_tokens']} tokens")
    print(f"Maximum: {context_check['max_tokens']} tokens")
else:
    print(f"✅ Within limit. Available for output: {context_check['available_output_tokens']} tokens")
```

## Discord Image Handling

### Image Storage in MongoDB

When users send images in Discord:

1. **Image URL Captured**: Discord CDN URL is stored
2. **Timestamp Added**: Current datetime is recorded
3. **Saved to History**: Stored in message content array

```python
content = [
    {"type": "text", "text": "Look at this image"},
    {
        "type": "image_url",
        "image_url": {
            "url": "https://cdn.discordapp.com/attachments/...",
            "detail": "auto"
        },
        "timestamp": "2025-10-01T12:00:00"  # Added automatically
    }
]
```

### 24-Hour Expiration

Discord CDN links expire after ~24 hours. The system:

1. **Filters Expired Images**: When loading history, images older than 23 hours are removed
2. **Token Counting Skips Expired**: Token counter checks timestamps and skips expired images
3. **Automatic Cleanup**: Database handler filters expired images on every `get_history()` call

```python
# In db_handler.py
def _filter_expired_images(self, history: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Filter out image links that are older than 23 hours"""
    current_time = datetime.now()
    expiration_time = current_time - timedelta(hours=23)

    # Checks timestamp and removes expired images
    # ...
```

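The elided body above comes down to dropping image parts whose timestamp is at or before the cutoff. Below is a minimal, self-contained sketch of that idea; it is illustrative only, and the actual `_filter_expired_images()` in `db_handler.py` may differ.

```python
from datetime import datetime, timedelta
from typing import Any, Dict, List

def filter_expired_images(history: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Illustrative sketch: drop image parts older than 23 hours, keep everything else."""
    expiration_time = datetime.now() - timedelta(hours=23)
    filtered = []
    for message in history:
        content = message.get("content")
        if isinstance(content, list):
            kept_parts = []
            for part in content:
                if part.get("type") == "image_url":
                    timestamp_str = part.get("timestamp")
                    if timestamp_str and datetime.fromisoformat(timestamp_str) <= expiration_time:
                        continue  # skip expired image parts
                kept_parts.append(part)
            message = {**message, "content": kept_parts}
        filtered.append(message)
    return filtered
```
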
### Token Counter Expiration Handling

The token counter automatically skips expired images:

```python
# In token_counter.py count_message_tokens()
timestamp_str = part.get("timestamp")
if timestamp_str:
    timestamp = datetime.fromisoformat(timestamp_str)
    if timestamp <= expiration_time:
        logging.info(f"Skipping expired image (added at {timestamp_str})")
        continue  # Don't count tokens for expired images
```

## Cost Estimation

Calculate costs based on token usage:

```python
cost = token_counter.estimate_cost(
    input_tokens=1000,
    output_tokens=500,
    model="openai/gpt-4o"
)
print(f"Estimated cost: ${cost:.6f}")
```

### Model Pricing (per 1M tokens)

| Model | Input | Output |
|-------|-------|--------|
| gpt-4o | $5.00 | $20.00 |
| gpt-4o-mini | $0.60 | $2.40 |
| gpt-4.1 | $2.00 | $8.00 |
| gpt-4.1-mini | $0.40 | $1.60 |
| gpt-4.1-nano | $0.10 | $0.40 |
| gpt-5 | $1.25 | $10.00 |
| gpt-5-mini | $0.25 | $2.00 |
| gpt-5-nano | $0.05 | $0.40 |
| o1-preview | $15.00 | $60.00 |
| o1-mini | $1.10 | $4.40 |

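Because the table is quoted per 1M tokens, the cost is a linear scaling of the input and output counts. The sketch below shows that arithmetic; the dictionary shape is an assumption, and the real `MODEL_PRICING` in `token_counter.py` may be structured differently.

```python
# Illustrative only: per-1M-token pricing converted to a dollar cost.
MODEL_PRICING = {
    "openai/gpt-4o": {"input": 5.00, "output": 20.00},      # USD per 1M tokens
    "openai/gpt-4o-mini": {"input": 0.60, "output": 2.40},
}

def estimate_cost(input_tokens: int, output_tokens: int, model: str) -> float:
    pricing = MODEL_PRICING.get(model, {"input": 0.0, "output": 0.0})
    return (input_tokens / 1_000_000) * pricing["input"] + \
           (output_tokens / 1_000_000) * pricing["output"]

# 1,000 input + 500 output tokens on gpt-4o: $0.005 + $0.01 = $0.015
print(f"${estimate_cost(1000, 500, 'openai/gpt-4o'):.6f}")
```
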
## Database Token Tracking

### Save Token Usage

```python
await db_handler.save_token_usage(
    user_id=user_id,
    model="openai/gpt-4o",
    input_tokens=1000,
    output_tokens=500,
    cost=0.0125,
    text_tokens=950,
    image_tokens=50
)
```

### Get User Statistics

```python
# Get total usage
stats = await db_handler.get_user_token_usage(user_id)
print(f"Total input: {stats['total_input_tokens']}")
print(f"Total text: {stats['total_text_tokens']}")
print(f"Total images: {stats['total_image_tokens']}")
print(f"Total cost: ${stats['total_cost']:.6f}")

# Get usage by model
model_usage = await db_handler.get_user_token_usage_by_model(user_id)
for model, usage in model_usage.items():
    print(f"{model}: {usage['requests']} requests, ${usage['cost']:.6f}")
    print(f"  Text: {usage['text_tokens']}, Images: {usage['image_tokens']}")
```

## Integration Example

Complete example of using token counting in a command:

```python
from datetime import datetime

from src.utils.token_counter import token_counter

# db_handler, openai_client, and DEFAULT_MODEL are assumed to be provided
# by the bot's own modules (database handler, OpenAI client, config).

async def process_user_message(interaction, user_message, image_urls=None):
    user_id = interaction.user.id
    model = await db_handler.get_user_model(user_id) or DEFAULT_MODEL
    history = await db_handler.get_history(user_id)

    # Build message content
    content = [{"type": "text", "text": user_message}]

    # Add images with timestamps
    if image_urls:
        for url in image_urls:
            content.append({
                "type": "image_url",
                "image_url": {"url": url, "detail": "auto"},
                "timestamp": datetime.now().isoformat()
            })

    # Add to messages
    messages = history + [{"role": "user", "content": content}]

    # Check context limit
    context_check = await token_counter.check_context_limit(messages, model)
    if not context_check["within_limit"]:
        await interaction.followup.send(
            f"⚠️ Context too large: {context_check['input_tokens']:,} tokens. "
            f"Maximum: {context_check['max_tokens']:,} tokens.",
            ephemeral=True
        )
        return

    # Count input tokens
    input_count = await token_counter.count_message_tokens(messages, model)

    # Call API
    response = await openai_client.chat.completions.create(
        model=model,
        messages=messages
    )

    reply = response.choices[0].message.content

    # Get actual usage from API
    usage = response.usage
    actual_input = usage.prompt_tokens if usage else input_count['total_tokens']
    actual_output = usage.completion_tokens if usage else token_counter.count_text_tokens(reply, model)

    # Calculate cost
    cost = token_counter.estimate_cost(actual_input, actual_output, model)

    # Save to database
    await db_handler.save_token_usage(
        user_id=user_id,
        model=model,
        input_tokens=actual_input,
        output_tokens=actual_output,
        cost=cost,
        text_tokens=input_count['text_tokens'],
        image_tokens=input_count['image_tokens']
    )

    # Send response with cost
    await interaction.followup.send(f"{reply}\n\n💰 Cost: ${cost:.6f}")
```

## Best Practices

### 1. Always Check Context Limits
Before making API calls, check if the messages fit within the model's context window.

### 2. Add Timestamps to Images
When storing images from Discord, always add a timestamp:
```python
"timestamp": datetime.now().isoformat()
```

### 3. Filter History on Load
The database handler automatically filters expired images when loading history.

### 4. Count Before API Call
Count tokens before calling the API to provide accurate estimates and warnings.

### 5. Use Actual Usage from API
Prefer `response.usage` over estimates when available:
```python
actual_input = usage.prompt_tokens if usage else estimated_tokens
```

### 6. Track Text and Image Separately
Store both text_tokens and image_tokens for detailed analytics.

### 7. Show Cost to Users
Always display the cost after operations so users are aware of usage.

## Context Window Limits

| Model | Context Limit |
|-------|--------------|
| gpt-4o | 128,000 tokens |
| gpt-4o-mini | 128,000 tokens |
| gpt-4.1 | 128,000 tokens |
| gpt-4.1-mini | 128,000 tokens |
| gpt-4.1-nano | 128,000 tokens |
| gpt-5 | 200,000 tokens |
| gpt-5-mini | 200,000 tokens |
| gpt-5-nano | 200,000 tokens |
| o1 | 200,000 tokens |
| o1-mini | 128,000 tokens |
| o3 | 200,000 tokens |
| o3-mini | 200,000 tokens |
| gpt-4 | 8,192 tokens |
| gpt-3.5-turbo | 16,385 tokens |

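If you want to reproduce the limit check yourself, the table above can be treated as a simple lookup. This is a hedged sketch of the idea behind `check_context_limit`; the dictionary, fallback value, and helper name are assumptions, not the bot's actual code.

```python
# Illustrative only: context limits as a lookup table (values from the table above).
MODEL_CONTEXT_LIMITS = {
    "openai/gpt-4o": 128_000,
    "openai/gpt-5": 200_000,
    "openai/gpt-4": 8_192,
}

def check_context_limit_sketch(input_tokens: int, model: str, max_output_tokens: int = 4096) -> dict:
    max_tokens = MODEL_CONTEXT_LIMITS.get(model, 128_000)  # assumed fallback
    return {
        "within_limit": input_tokens + max_output_tokens <= max_tokens,
        "input_tokens": input_tokens,
        "max_tokens": max_tokens,
        "available_output_tokens": max(0, max_tokens - input_tokens),
    }
```
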
## Troubleshooting

### Image Token Count Seems Wrong
- Check if the image was downloaded successfully
- Verify the image dimensions
- Remember: high-detail images use the tile-based calculation

### Expired Images Still Counted
- Check that timestamps are in ISO format
- Verify the expiration threshold (23 hours)
- Ensure `_filter_expired_images()` is called

### Cost Calculation Incorrect
- Verify the model name matches MODEL_PRICING keys exactly
- Check that pricing is per 1M tokens
- Ensure input/output tokens are correct

### Context Limit Exceeded
- Trim conversation history (keep the last N messages; see the sketch below)
- Reduce image detail level to "low"
- Remove old images from history
- Use a model with a larger context window

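A minimal sketch of the history-trimming approach from the first bullet above; the helper name and message shape are hypothetical, so adapt it to the bot's actual history format.

```python
from typing import Any, Dict, List

def trim_history(messages: List[Dict[str, Any]], keep_last: int = 20) -> List[Dict[str, Any]]:
    """Keep any system messages plus only the most recent `keep_last` other messages."""
    system_messages = [m for m in messages if m.get("role") == "system"]
    other_messages = [m for m in messages if m.get("role") != "system"]
    return system_messages + other_messages[-keep_last:]
```
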
## Cleanup

Don't forget to close the token counter session when shutting down:

```python
await token_counter.close()
```

This is typically done in the bot's cleanup/shutdown handler.