# Token Counting Guide

## Overview
This bot implements comprehensive token counting for both text and images, with special handling for Discord image links stored in MongoDB with 24-hour expiration.
## Token Encoding by Model

### o200k_base (200k vocabulary) - Newer Models
Used for:
- ✅ gpt-4o and gpt-4o-mini
- ✅ gpt-4.1, gpt-4.1-mini, gpt-4.1-nano (NEW!)
- ✅ gpt-5, gpt-5-mini, gpt-5-nano, gpt-5-chat
- ✅ o1, o1-mini, o1-preview
- ✅ o3, o3-mini
- ✅ o4, o4-mini
### cl100k_base (100k vocabulary) - Older Models
Used for:
- ✅ gpt-4 (original, not 4o or 4.1)
- ✅ gpt-3.5-turbo
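
Both encodings come from `tiktoken`. A minimal sketch of how a model id might be mapped to an encoding, based on the lists above (the provider-prefix handling and fallback are assumptions; the real `token_counter` may do this differently):

```python
import tiktoken

def get_encoding_for_model(model: str) -> tiktoken.Encoding:
    """Map a model name to its tiktoken encoding (illustrative sketch only)."""
    # Strip a provider prefix such as "openai/" if present (assumption about this bot's model ids)
    name = model.split("/")[-1]
    # Families listed above under o200k_base
    if name.startswith(("gpt-4o", "gpt-4.1", "gpt-5", "o1", "o3", "o4")):
        return tiktoken.get_encoding("o200k_base")
    # Older models (gpt-4, gpt-3.5-turbo) fall back to cl100k_base
    return tiktoken.get_encoding("cl100k_base")
```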
## Token Counting Features

### 1. Text Token Counting

```python
from src.utils.token_counter import token_counter

# Count text tokens
tokens = token_counter.count_text_tokens("Hello, world!", "openai/gpt-4o")
print(f"Text uses {tokens} tokens")
```
### 2. Image Token Counting
Images consume tokens based on their dimensions and detail level:
#### Low Detail
- 85 tokens (fixed cost)
#### High Detail
- Base cost: 170 tokens
- Tile cost: 170 tokens per 512x512 tile
- Images are scaled to fit 2048x2048
- Shortest side scaled to 768px
- Divided into 512x512 tiles
```python
# Count image tokens from Discord URL
tokens = await token_counter.count_image_tokens(
    image_url="https://cdn.discordapp.com/attachments/...",
    detail="auto"
)
print(f"Image uses {tokens} tokens")

# Count image tokens from bytes
with open("image.png", "rb") as f:
    image_data = f.read()

tokens = await token_counter.count_image_tokens(
    image_data=image_data,
    detail="high"
)
```
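
For intuition, the high-detail rules above boil down to a small calculation. This is a minimal sketch using the constants listed in this guide; the actual `token_counter` implementation may round or clamp differently:

```python
import math

def estimate_high_detail_tokens(width: int, height: int) -> int:
    """Rough sketch of the high-detail tile math described above.
    Uses the constants from this guide, not the canonical implementation."""
    # Scale the image down to fit within 2048x2048
    if max(width, height) > 2048:
        scale = 2048 / max(width, height)
        width, height = int(width * scale), int(height * scale)
    # Scale so the shortest side is at most 768px
    if min(width, height) > 768:
        scale = 768 / min(width, height)
        width, height = int(width * scale), int(height * scale)
    # Count 512x512 tiles, then apply base + per-tile cost
    tiles = math.ceil(width / 512) * math.ceil(height / 512)
    return 170 + 170 * tiles
```

For example, a 1024x1024 image scales to 768x768, which is four 512x512 tiles, or 170 + 4 × 170 = 850 tokens under these constants.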
### 3. Message Token Counting
Count tokens for complete message arrays including text and images:
```python
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
]

token_counts = await token_counter.count_message_tokens(messages, "openai/gpt-4o")
print(f"Total: {token_counts['total_tokens']} tokens")
print(f"Text: {token_counts['text_tokens']} tokens")
print(f"Images: {token_counts['image_tokens']} tokens")
```
### 4. Context Limit Checking

Check whether the messages fit within the model's context window:
```python
context_check = await token_counter.check_context_limit(
    messages=messages,
    model="openai/gpt-4o",
    max_output_tokens=4096
)

if not context_check["within_limit"]:
    print(f"⚠️ Messages too large: {context_check['input_tokens']} tokens")
    print(f"Maximum: {context_check['max_tokens']} tokens")
else:
    print(f"✅ Within limit. Available for output: {context_check['available_output_tokens']} tokens")
```
## Discord Image Handling

### Image Storage in MongoDB
When users send images in Discord:
- Image URL Captured: Discord CDN URL is stored
- Timestamp Added: Current datetime is recorded
- Saved to History: Stored in message content array
```python
content = [
    {"type": "text", "text": "Look at this image"},
    {
        "type": "image_url",
        "image_url": {
            "url": "https://cdn.discordapp.com/attachments/...",
            "detail": "auto"
        },
        "timestamp": "2025-10-01T12:00:00"  # Added automatically
    }
]
```
### 24-Hour Expiration
Discord CDN links expire after ~24 hours. The system:
- Filters Expired Images: When loading history, images older than 23 hours are removed
- Token Counting Skips Expired Images: The token counter checks timestamps and skips expired images
- Automatic Cleanup: The database handler filters expired images on every `get_history()` call
```python
# In db_handler.py
def _filter_expired_images(self, history: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Filter out image links that are older than 23 hours"""
    current_time = datetime.now()
    expiration_time = current_time - timedelta(hours=23)
    # Checks timestamp and removes expired images
    # ...
```
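
The elided filtering logic might look roughly like this standalone sketch (not the actual `db_handler` method; here, image parts without a timestamp are dropped as well):

```python
from datetime import datetime, timedelta
from typing import Any, Dict, List

def filter_expired_images(history: List[Dict[str, Any]], max_age_hours: int = 23) -> List[Dict[str, Any]]:
    """Illustrative stand-in for db_handler._filter_expired_images (not the real code)."""
    expiration_time = datetime.now() - timedelta(hours=max_age_hours)
    filtered = []
    for message in history:
        content = message.get("content")
        if isinstance(content, list):
            content = [
                part for part in content
                if part.get("type") != "image_url"  # keep text and other parts as-is
                or (part.get("timestamp")           # images need a timestamp...
                    and datetime.fromisoformat(part["timestamp"]) > expiration_time)  # ...that is still fresh
            ]
            message = {**message, "content": content}
        filtered.append(message)
    return filtered
```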
### Token Counter Expiration Handling
The token counter automatically skips expired images:
```python
# In token_counter.py count_message_tokens()
timestamp_str = part.get("timestamp")
if timestamp_str:
    timestamp = datetime.fromisoformat(timestamp_str)
    if timestamp <= expiration_time:
        logging.info(f"Skipping expired image (added at {timestamp_str})")
        continue  # Don't count tokens for expired images
```
## Cost Estimation
Calculate costs based on token usage:
```python
cost = token_counter.estimate_cost(
    input_tokens=1000,
    output_tokens=500,
    model="openai/gpt-4o"
)
print(f"Estimated cost: ${cost:.6f}")
```
### Model Pricing (per 1M tokens)
| Model | Input | Output |
|---|---|---|
| gpt-4o | $5.00 | $20.00 |
| gpt-4o-mini | $0.60 | $2.40 |
| gpt-4.1 | $2.00 | $8.00 |
| gpt-4.1-mini | $0.40 | $1.60 |
| gpt-4.1-nano | $0.10 | $0.40 |
| gpt-5 | $1.25 | $10.00 |
| gpt-5-mini | $0.25 | $2.00 |
| gpt-5-nano | $0.05 | $0.40 |
| o1-preview | $15.00 | $60.00 |
| o1-mini | $1.10 | $4.40 |
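
The arithmetic behind `estimate_cost` is just the per-1M-token price times the token counts. A minimal sketch, assuming a `MODEL_PRICING` dict keyed by short model name (the key format and prefix handling are assumptions; see `token_counter.py` for the real table):

```python
# Hypothetical shape of the pricing table; the real MODEL_PRICING lives in token_counter.py
MODEL_PRICING = {
    "gpt-4o": {"input": 5.00, "output": 20.00},       # USD per 1M tokens
    "gpt-4o-mini": {"input": 0.60, "output": 2.40},
    "gpt-5": {"input": 1.25, "output": 10.00},
}

def estimate_cost(input_tokens: int, output_tokens: int, model: str) -> float:
    """Sketch of per-1M-token pricing; unknown models cost nothing here."""
    pricing = MODEL_PRICING.get(model.split("/")[-1], {"input": 0.0, "output": 0.0})
    return (input_tokens * pricing["input"] + output_tokens * pricing["output"]) / 1_000_000
```

With the table's gpt-4o rates, 1,000 input tokens and 500 output tokens cost (1000 × 5.00 + 500 × 20.00) / 1,000,000 = $0.015.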
## Database Token Tracking

### Save Token Usage
```python
await db_handler.save_token_usage(
    user_id=user_id,
    model="openai/gpt-4o",
    input_tokens=1000,
    output_tokens=500,
    cost=0.0125,
    text_tokens=950,
    image_tokens=50
)
```
### Get User Statistics
```python
# Get total usage
stats = await db_handler.get_user_token_usage(user_id)
print(f"Total input: {stats['total_input_tokens']}")
print(f"Total text: {stats['total_text_tokens']}")
print(f"Total images: {stats['total_image_tokens']}")
print(f"Total cost: ${stats['total_cost']:.6f}")

# Get usage by model
model_usage = await db_handler.get_user_token_usage_by_model(user_id)
for model, usage in model_usage.items():
    print(f"{model}: {usage['requests']} requests, ${usage['cost']:.6f}")
    print(f" Text: {usage['text_tokens']}, Images: {usage['image_tokens']}")
```
## Integration Example
Complete example of using token counting in a command:
```python
from src.utils.token_counter import token_counter

async def process_user_message(interaction, user_message, image_urls=None):
    user_id = interaction.user.id
    model = await db_handler.get_user_model(user_id) or DEFAULT_MODEL
    history = await db_handler.get_history(user_id)

    # Build message content
    content = [{"type": "text", "text": user_message}]

    # Add images with timestamps
    if image_urls:
        for url in image_urls:
            content.append({
                "type": "image_url",
                "image_url": {"url": url, "detail": "auto"},
                "timestamp": datetime.now().isoformat()
            })

    # Add to messages
    messages = history + [{"role": "user", "content": content}]

    # Check context limit
    context_check = await token_counter.check_context_limit(messages, model)
    if not context_check["within_limit"]:
        await interaction.followup.send(
            f"⚠️ Context too large: {context_check['input_tokens']:,} tokens. "
            f"Maximum: {context_check['max_tokens']:,} tokens.",
            ephemeral=True
        )
        return

    # Count input tokens
    input_count = await token_counter.count_message_tokens(messages, model)

    # Call API
    response = await openai_client.chat.completions.create(
        model=model,
        messages=messages
    )
    reply = response.choices[0].message.content

    # Get actual usage from API
    usage = response.usage
    actual_input = usage.prompt_tokens if usage else input_count['total_tokens']
    actual_output = usage.completion_tokens if usage else token_counter.count_text_tokens(reply, model)

    # Calculate cost
    cost = token_counter.estimate_cost(actual_input, actual_output, model)

    # Save to database
    await db_handler.save_token_usage(
        user_id=user_id,
        model=model,
        input_tokens=actual_input,
        output_tokens=actual_output,
        cost=cost,
        text_tokens=input_count['text_tokens'],
        image_tokens=input_count['image_tokens']
    )

    # Send response with cost
    await interaction.followup.send(f"{reply}\n\n💰 Cost: ${cost:.6f}")
```
## Best Practices

### 1. Always Check Context Limits
Before making API calls, check if the messages fit within the model's context window.
### 2. Add Timestamps to Images

When storing images from Discord, always add a timestamp:

```python
"timestamp": datetime.now().isoformat()
```
### 3. Filter History on Load
The database handler automatically filters expired images when loading history.
### 4. Count Before API Call
Count tokens before calling the API to provide accurate estimates and warnings.
### 5. Use Actual Usage from API

Prefer `response.usage` over estimates when available:

```python
actual_input = usage.prompt_tokens if usage else estimated_tokens
```
### 6. Track Text and Image Tokens Separately

Store both `text_tokens` and `image_tokens` for detailed analytics.
### 7. Show Cost to Users
Always display the cost after operations so users are aware of usage.
## Context Window Limits
| Model | Context Limit |
|---|---|
| gpt-4o | 128,000 tokens |
| gpt-4o-mini | 128,000 tokens |
| gpt-4.1 | 128,000 tokens |
| gpt-4.1-mini | 128,000 tokens |
| gpt-4.1-nano | 128,000 tokens |
| gpt-5 | 200,000 tokens |
| gpt-5-mini | 200,000 tokens |
| gpt-5-nano | 200,000 tokens |
| o1 | 200,000 tokens |
| o1-mini | 128,000 tokens |
| o3 | 200,000 tokens |
| o3-mini | 200,000 tokens |
| gpt-4 | 8,192 tokens |
| gpt-3.5-turbo | 16,385 tokens |
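
These limits drive `check_context_limit`. A rough sketch of how the check might combine the table above with message token counts (the `CONTEXT_LIMITS` mapping name and the default are assumptions; the real helper may differ):

```python
from src.utils.token_counter import token_counter

# Hypothetical mapping with values taken from the table above
CONTEXT_LIMITS = {
    "gpt-4o": 128_000,
    "gpt-5": 200_000,
    "gpt-4": 8_192,
}
DEFAULT_CONTEXT_LIMIT = 128_000

async def check_context_limit(messages, model, max_output_tokens=4096):
    """Sketch of how the limit check could be assembled from the pieces in this guide."""
    counts = await token_counter.count_message_tokens(messages, model)
    input_tokens = counts["total_tokens"]
    max_tokens = CONTEXT_LIMITS.get(model.split("/")[-1], DEFAULT_CONTEXT_LIMIT)
    return {
        "within_limit": input_tokens + max_output_tokens <= max_tokens,
        "input_tokens": input_tokens,
        "max_tokens": max_tokens,
        "available_output_tokens": max(0, max_tokens - input_tokens),
    }
```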
## Troubleshooting

### Image Token Count Seems Wrong
- Check if image was downloaded successfully
- Verify image dimensions
- Remember: high detail images use tile-based calculation
### Expired Images Still Counted
- Check that timestamps are in ISO format
- Verify expiration threshold (23 hours)
- Ensure `_filter_expired_images()` is called
### Cost Calculation Incorrect

- Verify model name matches `MODEL_PRICING` keys exactly
- Check that pricing is per 1M tokens
- Ensure input/output tokens are correct
### Context Limit Exceeded

- Trim conversation history (keep the last N messages; see the sketch below)
- Reduce image detail level to "low"
- Remove old images from history
- Use a model with larger context window
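
As an illustration of the first suggestion, a naive trimming loop might look like this (a rough sketch built on the `check_context_limit` helper described earlier; the bot may trim differently):

```python
from src.utils.token_counter import token_counter

async def trim_history_to_fit(messages, model, max_output_tokens=4096):
    """Drop the oldest non-system messages until the conversation fits.
    A rough sketch, not the bot's actual trimming strategy."""
    while len(messages) > 1:
        check = await token_counter.check_context_limit(
            messages=messages,
            model=model,
            max_output_tokens=max_output_tokens
        )
        if check["within_limit"]:
            break
        # Remove the oldest non-system message first
        for i, message in enumerate(messages):
            if message.get("role") != "system":
                del messages[i]
                break
        else:
            break  # nothing left to trim
    return messages
```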
## Cleanup
Don't forget to close the token counter session when shutting down:
```python
await token_counter.close()
```
This is typically done in the bot's cleanup/shutdown handler.