ChatGPT-Discord-Bot/docs/TOKEN_COUNTING_GUIDE.md
cauvang32 9c180bdd89 Refactor OpenAI utilities and remove Python executor
- Removed the `analyze_data_file` function from tool definitions to streamline functionality.
- Enhanced the `execute_python_code` function description to clarify auto-installation of packages and file handling.
- Deleted the `python_executor.py` module to simplify the codebase and improve maintainability.
- Introduced a new `token_counter.py` module for efficient token counting for OpenAI API requests, including support for Discord image links and cost estimation.
2025-10-02 21:49:48 +07:00

# Token Counting Guide
## Overview
This bot implements comprehensive token counting for both text and images, with special handling for Discord image links stored in MongoDB with 24-hour expiration.
## Token Encoding by Model
### o200k_base (200k vocabulary) - Newer Models
Used for:
- **gpt-4o** and **gpt-4o-mini**
- **gpt-4.1**, **gpt-4.1-mini**, **gpt-4.1-nano** (NEW!)
- **gpt-5**, **gpt-5-mini**, **gpt-5-nano**, **gpt-5-chat**
- **o1**, **o1-mini**, **o1-preview**
- **o3**, **o3-mini**
- **o4**, **o4-mini**
### cl100k_base (100k vocabulary) - Older Models
Used for:
- **gpt-4** (original, not 4o or 4.1)
- **gpt-3.5-turbo**
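The mapping above can be expressed as a small prefix lookup. The sketch below is illustrative only: `encoding_name_for_model` is a hypothetical helper, not necessarily how `token_counter.py` selects encodings, and it assumes model names may carry a provider prefix such as `openai/`, as in the examples in this guide. Falling back to `o200k_base` for unknown models is also an assumption.

```python
# Hypothetical helper: pick an encoding name from a model name.
# Prefix lists mirror the two groups listed in this guide.
O200K_PREFIXES = ("gpt-4o", "gpt-4.1", "gpt-5", "o1", "o3", "o4")
CL100K_PREFIXES = ("gpt-4", "gpt-3.5-turbo")


def encoding_name_for_model(model: str) -> str:
    """Return "o200k_base" or "cl100k_base" for a model name."""
    # Strip a provider prefix such as "openai/" if present.
    name = model.split("/", 1)[-1].lower()
    # Check newer families first: "gpt-4o" and "gpt-4.1" also start with "gpt-4".
    if name.startswith(O200K_PREFIXES):
        return "o200k_base"
    if name.startswith(CL100K_PREFIXES):
        return "cl100k_base"
    # Assumption: unknown models default to the newer encoding.
    return "o200k_base"
```

The returned name could then be passed to `tiktoken.get_encoding()` to obtain the actual encoder.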
## Token Counting Features
### 1. Text Token Counting
```python
from src.utils.token_counter import token_counter
# Count text tokens
tokens = token_counter.count_text_tokens("Hello, world!", "openai/gpt-4o")
print(f"Text uses {tokens} tokens")
```
### 2. Image Token Counting
Images consume tokens based on their dimensions and detail level:
#### Low Detail
- **85 tokens** (fixed cost)
#### High Detail
- **Base cost**: 85 tokens
- **Tile cost**: 170 tokens per 512x512 tile
- Images are scaled to fit 2048x2048
- Shortest side scaled to 768px
- Divided into 512x512 tiles
```python
# Count image tokens from Discord URL
tokens = await token_counter.count_image_tokens(
    image_url="https://cdn.discordapp.com/attachments/...",
    detail="auto"
)
print(f"Image uses {tokens} tokens")

# Count image tokens from bytes
with open("image.png", "rb") as f:
    image_data = f.read()
tokens = await token_counter.count_image_tokens(
    image_data=image_data,
    detail="high"
)
```
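The high-detail scaling and tiling steps can be sketched as a pure function of the image dimensions. This is a minimal sketch, not the code in `token_counter.py`: `estimate_image_tokens` is a hypothetical name, the defaults follow OpenAI's published constants for GPT-4o-class models (85 base tokens plus 170 per tile), and it assumes images are only ever scaled down, never up. Pass different `base_tokens` or `tile_tokens` if the bot uses other values. Any `detail` value other than `"low"` is treated as high detail here.

```python
import math


def estimate_image_tokens(width: int, height: int, detail: str = "high",
                          base_tokens: int = 85, tile_tokens: int = 170) -> int:
    """Estimate tokens for one image from its pixel dimensions."""
    if detail == "low":
        return base_tokens  # fixed cost regardless of size

    # Step 1: scale down to fit within 2048x2048, keeping the aspect ratio.
    longest = max(width, height)
    if longest > 2048:
        scale = 2048 / longest
        width, height = round(width * scale), round(height * scale)

    # Step 2: scale down so the shortest side is 768px (assumption: no upscaling).
    shortest = min(width, height)
    if shortest > 768:
        scale = 768 / shortest
        width, height = round(width * scale), round(height * scale)

    # Step 3: count the 512x512 tiles needed to cover the scaled image.
    tiles = math.ceil(width / 512) * math.ceil(height / 512)
    return base_tokens + tile_tokens * tiles
```

Under these assumptions a 1024x1024 image scales to 768x768, needs 4 tiles, and costs 85 + 4 × 170 = 765 tokens.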
### 3. Message Token Counting
Count tokens for complete message arrays including text and images:
```python
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
]
token_counts = await token_counter.count_message_tokens(messages, "openai/gpt-4o")
print(f"Total: {token_counts['total_tokens']} tokens")
print(f"Text: {token_counts['text_tokens']} tokens")
print(f"Images: {token_counts['image_tokens']} tokens")
```
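Counting a message array means first separating text from image parts, since content may be either a plain string or a list of typed parts. The helper below is a hypothetical sketch of that separation step only (it does not count tokens) and is not taken from `token_counter.py`.

```python
from typing import Any, Dict, List, Tuple


def split_message_parts(
    messages: List[Dict[str, Any]]
) -> Tuple[List[str], List[Dict[str, Any]]]:
    """Collect text strings and image parts from a chat message array."""
    texts: List[str] = []
    images: List[Dict[str, Any]] = []
    for message in messages:
        content = message.get("content")
        if isinstance(content, str):
            # Plain-string content is all text.
            texts.append(content)
        elif isinstance(content, list):
            # Multi-part content mixes "text" and "image_url" entries.
            for part in content:
                if part.get("type") == "text":
                    texts.append(part.get("text", ""))
                elif part.get("type") == "image_url":
                    images.append(part)
    return texts, images
```

Each collected text could then be passed to `count_text_tokens()` and each image part to `count_image_tokens()`.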
### 4. Context Limit Checking
Check whether the messages fit within the model's context window:
```python
context_check = await token_counter.check_context_limit(
    messages=messages,
    model="openai/gpt-4o",
    max_output_tokens=4096
)
if not context_check["within_limit"]:
    print(f"⚠️ Messages too large: {context_check['input_tokens']} tokens")
    print(f"Maximum: {context_check['max_tokens']} tokens")
else:
    print(f"✅ Within limit. Available for output: {context_check['available_output_tokens']} tokens")
```
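The arithmetic behind such a check is simple enough to sketch. The function below is hypothetical and only mirrors the result keys used above. The rule it applies (input tokens plus the reserved output budget must fit in the context window) is an assumption about how `check_context_limit()` decides, not a description of its actual code.

```python
from typing import Dict, Union


def check_limit(input_tokens: int, context_limit: int,
                max_output_tokens: int = 4096) -> Dict[str, Union[int, bool]]:
    """Report whether a request fits in a context window of context_limit tokens."""
    # Tokens left for the reply once the input is accounted for (never negative).
    available = max(context_limit - input_tokens, 0)
    return {
        # Assumed rule: input plus the reserved output budget must fit.
        "within_limit": input_tokens + max_output_tokens <= context_limit,
        "input_tokens": input_tokens,
        "max_tokens": context_limit,
        "available_output_tokens": available,
    }
```

For example, 125,000 input tokens against a 128,000-token window leaves 3,000 tokens, which is less than the 4,096 reserved for output, so the check fails.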
## Discord Image Handling
### Image Storage in MongoDB
When users send images in Discord:
1. **Image URL Captured**: Discord CDN URL is stored
2. **Timestamp Added**: Current datetime is recorded
3. **Saved to History**: Stored in message content array
```python
content = [
    {"type": "text", "text": "Look at this image"},
    {
        "type": "image_url",
        "image_url": {
            "url": "https://cdn.discordapp.com/attachments/...",
            "detail": "auto"
        },
        "timestamp": "2025-10-01T12:00:00"  # Added automatically
    }
]
```
### 24-Hour Expiration
Discord CDN links expire after ~24 hours. The system:
1. **Filters Expired Images**: When loading history, images older than 23 hours are removed
2. **Token Counting Skips Expired**: Token counter checks timestamps and skips expired images
3. **Automatic Cleanup**: Database handler filters expired images on every `get_history()` call
```python
# In db_handler.py
def _filter_expired_images(self, history: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Filter out image links that are older than 23 hours"""
    current_time = datetime.now()
    expiration_time = current_time - timedelta(hours=23)
    # Checks timestamp and removes expired images
    # ...
```
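Since the excerpt above elides the filtering loop, here is one way it could look. This is a standalone sketch, not the code in `db_handler.py`: `filter_expired_images` and its `now` and `max_age_hours` parameters are hypothetical, timestamps are assumed to be naive ISO strings as produced by `datetime.now().isoformat()`, and keeping parts with missing or unparseable timestamps is an assumption.

```python
from datetime import datetime, timedelta
from typing import Any, Dict, List, Optional


def filter_expired_images(history: List[Dict[str, Any]],
                          now: Optional[datetime] = None,
                          max_age_hours: int = 23) -> List[Dict[str, Any]]:
    """Return a copy of history without image parts older than max_age_hours."""
    cutoff = (now or datetime.now()) - timedelta(hours=max_age_hours)
    filtered = []
    for message in history:
        content = message.get("content")
        if not isinstance(content, list):
            filtered.append(message)  # plain-text message: nothing to filter
            continue
        kept = []
        for part in content:
            if part.get("type") == "image_url" and part.get("timestamp"):
                try:
                    added = datetime.fromisoformat(part["timestamp"])
                except ValueError:
                    kept.append(part)  # assumption: keep unparseable timestamps
                    continue
                if added <= cutoff:
                    continue  # older than the threshold: drop the image
            kept.append(part)
        filtered.append({**message, "content": kept})
    return filtered
```

Passing `now` explicitly makes the function easy to test without depending on the wall clock.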
### Token Counter Expiration Handling
The token counter automatically skips expired images:
```python
# In token_counter.py count_message_tokens(), inside the loop over content parts
# expiration_time is the current time minus 23 hours, as in db_handler.py
timestamp_str = part.get("timestamp")
if timestamp_str:
    timestamp = datetime.fromisoformat(timestamp_str)
    if timestamp <= expiration_time:
        logging.info(f"Skipping expired image (added at {timestamp_str})")
        continue  # Don't count tokens for expired images
```
## Cost Estimation
Calculate costs based on token usage:
```python
cost = token_counter.estimate_cost(
    input_tokens=1000,
    output_tokens=500,
    model="openai/gpt-4o"
)
print(f"Estimated cost: ${cost:.6f}")
```
### Model Pricing (per 1M tokens)
| Model | Input | Output |
|-------|-------|--------|
| gpt-4o | $5.00 | $20.00 |
| gpt-4o-mini | $0.60 | $2.40 |
| gpt-4.1 | $2.00 | $8.00 |
| gpt-4.1-mini | $0.40 | $1.60 |
| gpt-4.1-nano | $0.10 | $0.40 |
| gpt-5 | $1.25 | $10.00 |
| gpt-5-mini | $0.25 | $2.00 |
| gpt-5-nano | $0.05 | $0.40 |
| o1-preview | $15.00 | $60.00 |
| o1-mini | $1.10 | $4.40 |
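A per-million price table translates into a cost with one multiplication per direction. The sketch below copies the rates from the table above into a hypothetical `PRICING_PER_MILLION` dictionary. The real `MODEL_PRICING` keys and the behaviour of `estimate_cost()` for unknown models may differ, and stripping an `openai/` prefix and raising `ValueError` are assumptions made for this example.

```python
# USD per 1M tokens as (input, output), copied from the table in this guide.
PRICING_PER_MILLION = {
    "gpt-4o": (5.00, 20.00),
    "gpt-4o-mini": (0.60, 2.40),
    "gpt-4.1": (2.00, 8.00),
    "gpt-4.1-mini": (0.40, 1.60),
    "gpt-4.1-nano": (0.10, 0.40),
    "gpt-5": (1.25, 10.00),
    "gpt-5-mini": (0.25, 2.00),
    "gpt-5-nano": (0.05, 0.40),
    "o1-preview": (15.00, 60.00),
    "o1-mini": (1.10, 4.40),
}


def estimate_cost_usd(input_tokens: int, output_tokens: int, model: str) -> float:
    """Estimate the USD cost of one request from its token counts."""
    name = model.split("/", 1)[-1]  # assumption: drop a provider prefix
    if name not in PRICING_PER_MILLION:
        raise ValueError(f"No pricing configured for model {model!r}")
    input_rate, output_rate = PRICING_PER_MILLION[name]
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000
```

With these rates, 1,000 input tokens and 500 output tokens on gpt-4o come to $0.005 + $0.010 = $0.015.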
## Database Token Tracking
### Save Token Usage
```python
await db_handler.save_token_usage(
    user_id=user_id,
    model="openai/gpt-4o",
    input_tokens=1000,
    output_tokens=500,
    cost=0.015,  # 1,000 x $5.00/1M + 500 x $20.00/1M, per the pricing table
    text_tokens=950,
    image_tokens=50
)
```
### Get User Statistics
```python
# Get total usage
stats = await db_handler.get_user_token_usage(user_id)
print(f"Total input: {stats['total_input_tokens']}")
print(f"Total text: {stats['total_text_tokens']}")
print(f"Total images: {stats['total_image_tokens']}")
print(f"Total cost: ${stats['total_cost']:.6f}")
# Get usage by model
model_usage = await db_handler.get_user_token_usage_by_model(user_id)
for model, usage in model_usage.items():
    print(f"{model}: {usage['requests']} requests, ${usage['cost']:.6f}")
    print(f"  Text: {usage['text_tokens']}, Images: {usage['image_tokens']}")
```
## Integration Example
Complete example of using token counting in a command:
```python
from datetime import datetime

from src.utils.token_counter import token_counter

# db_handler, openai_client, and DEFAULT_MODEL are assumed to be provided by the bot.


async def process_user_message(interaction, user_message, image_urls=None):
    user_id = interaction.user.id
    model = await db_handler.get_user_model(user_id) or DEFAULT_MODEL
    history = await db_handler.get_history(user_id)

    # Build message content
    content = [{"type": "text", "text": user_message}]

    # Add images with timestamps
    if image_urls:
        for url in image_urls:
            content.append({
                "type": "image_url",
                "image_url": {"url": url, "detail": "auto"},
                "timestamp": datetime.now().isoformat()
            })

    # Add to messages
    messages = history + [{"role": "user", "content": content}]

    # Check context limit
    context_check = await token_counter.check_context_limit(messages, model)
    if not context_check["within_limit"]:
        await interaction.followup.send(
            f"⚠️ Context too large: {context_check['input_tokens']:,} tokens. "
            f"Maximum: {context_check['max_tokens']:,} tokens.",
            ephemeral=True
        )
        return

    # Count input tokens
    input_count = await token_counter.count_message_tokens(messages, model)

    # Call API
    response = await openai_client.chat.completions.create(
        model=model,
        messages=messages
    )
    reply = response.choices[0].message.content

    # Get actual usage from API, falling back to local estimates
    usage = response.usage
    actual_input = usage.prompt_tokens if usage else input_count['total_tokens']
    actual_output = usage.completion_tokens if usage else token_counter.count_text_tokens(reply, model)

    # Calculate cost
    cost = token_counter.estimate_cost(actual_input, actual_output, model)

    # Save to database
    await db_handler.save_token_usage(
        user_id=user_id,
        model=model,
        input_tokens=actual_input,
        output_tokens=actual_output,
        cost=cost,
        text_tokens=input_count['text_tokens'],
        image_tokens=input_count['image_tokens']
    )

    # Send response with cost
    await interaction.followup.send(f"{reply}\n\n💰 Cost: ${cost:.6f}")
```
## Best Practices
### 1. Always Check Context Limits
Before making API calls, check if the messages fit within the model's context window.
### 2. Add Timestamps to Images
When storing images from Discord, always add a timestamp:
```python
"timestamp": datetime.now().isoformat()
```
### 3. Filter History on Load
The database handler automatically filters expired images when loading history.
### 4. Count Before API Call
Count tokens before calling the API to provide accurate estimates and warnings.
### 5. Use Actual Usage from API
Prefer `response.usage` over estimates when available:
```python
actual_input = usage.prompt_tokens if usage else estimated_tokens
```
### 6. Track Text and Image Separately
Store both `text_tokens` and `image_tokens` for detailed analytics.
### 7. Show Cost to Users
Always display the cost after operations so users are aware of usage.
## Context Window Limits
| Model | Context Limit |
|-------|--------------|
| gpt-4o | 128,000 tokens |
| gpt-4o-mini | 128,000 tokens |
| gpt-4.1 | 128,000 tokens |
| gpt-4.1-mini | 128,000 tokens |
| gpt-4.1-nano | 128,000 tokens |
| gpt-5 | 200,000 tokens |
| gpt-5-mini | 200,000 tokens |
| gpt-5-nano | 200,000 tokens |
| o1 | 200,000 tokens |
| o1-mini | 128,000 tokens |
| o3 | 200,000 tokens |
| o3-mini | 200,000 tokens |
| gpt-4 | 8,192 tokens |
| gpt-3.5-turbo | 16,385 tokens |
## Troubleshooting
### Image Token Count Seems Wrong
- Check if image was downloaded successfully
- Verify image dimensions
- Remember: high detail images use tile-based calculation
### Expired Images Still Counted
- Check that timestamps are in ISO format
- Verify expiration threshold (23 hours)
- Ensure `_filter_expired_images()` is called
### Cost Calculation Incorrect
- Verify that the model name matches the `MODEL_PRICING` keys exactly
- Check that pricing is per 1M tokens
- Ensure input/output tokens are correct
### Context Limit Exceeded
- Trim conversation history (keep last N messages)
- Reduce image detail level to "low"
- Remove old images from history
- Use a model with a larger context window
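Trimming history to the last N messages, the first remedy listed above, can be sketched as follows. `trim_history` is a hypothetical helper, not part of the bot. Keeping every system message regardless of position is an assumption made so the assistant's instructions survive trimming.

```python
from typing import Any, Dict, List


def trim_history(messages: List[Dict[str, Any]],
                 keep_last: int = 10) -> List[Dict[str, Any]]:
    """Keep all system messages plus the last keep_last other messages."""
    system = [m for m in messages if m.get("role") == "system"]
    others = [m for m in messages if m.get("role") != "system"]
    if keep_last <= 0:
        return system  # avoid others[-0:], which would keep everything
    return system + others[-keep_last:]
```

After trimming, the context check can be run again to confirm that the shortened history now fits.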
## Cleanup
Don't forget to close the token counter session when shutting down:
```python
await token_counter.close()
```
This is typically done in the bot's cleanup/shutdown handler.