
Token Counting Guide

Overview

This bot implements comprehensive token counting for both text and images, with special handling for Discord image links, which are stored in MongoDB and expire from Discord's CDN after roughly 24 hours.

Token Encoding by Model

o200k_base (200k vocabulary) - Newer Models

Used for:

  • gpt-4o and gpt-4o-mini
  • gpt-4.1, gpt-4.1-mini, gpt-4.1-nano (NEW!)
  • gpt-5, gpt-5-mini, gpt-5-nano, gpt-5-chat
  • o1, o1-mini, o1-preview
  • o3, o3-mini
  • o4, o4-mini

cl100k_base (100k vocabulary) - Older Models

Used for:

  • gpt-4 (original, not 4o or 4.1)
  • gpt-3.5-turbo
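
These encoding names come from the tiktoken library, which token_counter.py appears to build on. If you ever need the encodings directly (outside of token_counter), a minimal sketch:

import tiktoken

# Newer models (gpt-4o, gpt-4.1, gpt-5, o-series) use the 200k vocabulary
o200k = tiktoken.get_encoding("o200k_base")
# Older models (gpt-4, gpt-3.5-turbo) use the 100k vocabulary
cl100k = tiktoken.get_encoding("cl100k_base")

text = "Hello, world!"
print(len(o200k.encode(text)), "tokens with o200k_base")
print(len(cl100k.encode(text)), "tokens with cl100k_base")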

Token Counting Features

1. Text Token Counting

from src.utils.token_counter import token_counter

# Count text tokens
tokens = token_counter.count_text_tokens("Hello, world!", "openai/gpt-4o")
print(f"Text uses {tokens} tokens")

2. Image Token Counting

Images consume tokens based on their dimensions and detail level:

Low Detail

  • 85 tokens (fixed cost)

High Detail

  • Base cost: 170 tokens
  • Tile cost: 170 tokens per 512x512 tile
  • Images are first scaled to fit within 2048x2048
  • The shortest side is then scaled to 768px
  • The result is divided into 512x512 tiles (see the worked sketch after the usage example below)

# Count image tokens from Discord URL
tokens = await token_counter.count_image_tokens(
    image_url="https://cdn.discordapp.com/attachments/...",
    detail="auto"
)
print(f"Image uses {tokens} tokens")

# Count image tokens from bytes
with open("image.png", "rb") as f:
    image_data = f.read()
tokens = await token_counter.count_image_tokens(
    image_data=image_data,
    detail="high"
)
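
To illustrate the high-detail rules above, here is a standalone sketch of the tile-based arithmetic using the costs listed in this guide (170 base plus 170 per 512x512 tile). It is an approximation for reasoning about sizes, not a call into token_counter itself:

import math

def estimate_high_detail_image_tokens(width: int, height: int) -> int:
    """Approximate high-detail image tokens from the rules above."""
    # Scale down to fit within 2048x2048
    if max(width, height) > 2048:
        scale = 2048 / max(width, height)
        width, height = width * scale, height * scale
    # Scale so the shortest side is 768px
    if min(width, height) > 768:
        scale = 768 / min(width, height)
        width, height = width * scale, height * scale
    # Count 512x512 tiles
    tiles = math.ceil(width / 512) * math.ceil(height / 512)
    return 170 + tiles * 170  # base cost + per-tile cost

# A 1024x1024 image becomes 768x768 -> 2x2 tiles -> 170 + 4 * 170 = 850 tokens
print(estimate_high_detail_image_tokens(1024, 1024))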

3. Message Token Counting

Count tokens for complete message arrays including text and images:

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
]

token_counts = await token_counter.count_message_tokens(messages, "openai/gpt-4o")
print(f"Total: {token_counts['total_tokens']} tokens")
print(f"Text: {token_counts['text_tokens']} tokens")
print(f"Images: {token_counts['image_tokens']} tokens")

4. Context Limit Checking

Check if messages fit within model's context window:

context_check = await token_counter.check_context_limit(
    messages=messages,
    model="openai/gpt-4o",
    max_output_tokens=4096
)

if not context_check["within_limit"]:
    print(f"⚠️ Messages too large: {context_check['input_tokens']} tokens")
    print(f"Maximum: {context_check['max_tokens']} tokens")
else:
    print(f"✅ Within limit. Available for output: {context_check['available_output_tokens']} tokens")

Discord Image Handling

Image Storage in MongoDB

When users send images in Discord:

  1. Image URL Captured: Discord CDN URL is stored
  2. Timestamp Added: Current datetime is recorded
  3. Saved to History: Stored in the message content array

content = [
    {"type": "text", "text": "Look at this image"},
    {
        "type": "image_url",
        "image_url": {
            "url": "https://cdn.discordapp.com/attachments/...",
            "detail": "auto"
        },
        "timestamp": "2025-10-01T12:00:00"  # Added automatically
    }
]

24-Hour Expiration

Discord CDN links expire after ~24 hours. The system:

  1. Filters Expired Images: When loading history, images older than 23 hours are removed
  2. Token Counting Skips Expired: Token counter checks timestamps and skips expired images
  3. Automatic Cleanup: The database handler filters expired images on every get_history() call

# In db_handler.py
def _filter_expired_images(self, history: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Filter out image links that are older than 23 hours"""
    current_time = datetime.now()
    expiration_time = current_time - timedelta(hours=23)
    
    # Checks timestamp and removes expired images
    # ...
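
The body is elided above. As a rough illustration of the idea only (not the actual code in db_handler.py), a standalone filter could look like this, assuming the content structure shown in the previous section:

from datetime import datetime, timedelta
from typing import Any, Dict, List

def filter_expired_images(history: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Illustrative sketch; the real implementation lives in db_handler.py."""
    expiration_time = datetime.now() - timedelta(hours=23)
    filtered = []
    for message in history:
        content = message.get("content")
        if isinstance(content, list):
            kept = []
            for part in content:
                if part.get("type") == "image_url" and part.get("timestamp"):
                    if datetime.fromisoformat(part["timestamp"]) <= expiration_time:
                        continue  # drop expired Discord image links
                kept.append(part)
            message = {**message, "content": kept}
        filtered.append(message)
    return filtered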

Token Counter Expiration Handling

The token counter automatically skips expired images:

# In token_counter.py count_message_tokens()
# expiration_time is the 23-hour cutoff described above (now - 23 hours)
timestamp_str = part.get("timestamp")
if timestamp_str:
    timestamp = datetime.fromisoformat(timestamp_str)
    if timestamp <= expiration_time:
        logging.info(f"Skipping expired image (added at {timestamp_str})")
        continue  # Don't count tokens for expired images

Cost Estimation

Calculate costs based on token usage:

cost = token_counter.estimate_cost(
    input_tokens=1000,
    output_tokens=500,
    model="openai/gpt-4o"
)
print(f"Estimated cost: ${cost:.6f}")

Model Pricing (per 1M tokens)

Model          Input     Output
gpt-4o         $5.00     $20.00
gpt-4o-mini    $0.60     $2.40
gpt-4.1        $2.00     $8.00
gpt-4.1-mini   $0.40     $1.60
gpt-4.1-nano   $0.10     $0.40
gpt-5          $1.25     $10.00
gpt-5-mini     $0.25     $2.00
gpt-5-nano     $0.05     $0.40
o1-preview     $15.00    $60.00
o1-mini        $1.10     $4.40
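
As a sanity check against the table, the cost formula is simply (tokens / 1,000,000) x price per 1M tokens, summed over input and output. For example, 1,000 input tokens and 500 output tokens on gpt-4o:

# gpt-4o: $5.00 per 1M input tokens, $20.00 per 1M output tokens
input_cost = 1000 / 1_000_000 * 5.00    # $0.005
output_cost = 500 / 1_000_000 * 20.00   # $0.010
total_cost = input_cost + output_cost   # $0.015
print(f"${total_cost:.6f}")             # $0.015000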

Database Token Tracking

Save Token Usage

await db_handler.save_token_usage(
    user_id=user_id,
    model="openai/gpt-4o",
    input_tokens=1000,
    output_tokens=500,
    cost=0.0125,
    text_tokens=950,
    image_tokens=50
)

Get User Statistics

# Get total usage
stats = await db_handler.get_user_token_usage(user_id)
print(f"Total input: {stats['total_input_tokens']}")
print(f"Total text: {stats['total_text_tokens']}")
print(f"Total images: {stats['total_image_tokens']}")
print(f"Total cost: ${stats['total_cost']:.6f}")

# Get usage by model
model_usage = await db_handler.get_user_token_usage_by_model(user_id)
for model, usage in model_usage.items():
    print(f"{model}: {usage['requests']} requests, ${usage['cost']:.6f}")
    print(f"  Text: {usage['text_tokens']}, Images: {usage['image_tokens']}")

Integration Example

Complete example of using token counting in a command:

from datetime import datetime

from src.utils.token_counter import token_counter

async def process_user_message(interaction, user_message, image_urls=None):
    user_id = interaction.user.id
    model = await db_handler.get_user_model(user_id) or DEFAULT_MODEL
    history = await db_handler.get_history(user_id)
    
    # Build message content
    content = [{"type": "text", "text": user_message}]
    
    # Add images with timestamps
    if image_urls:
        for url in image_urls:
            content.append({
                "type": "image_url",
                "image_url": {"url": url, "detail": "auto"},
                "timestamp": datetime.now().isoformat()
            })
    
    # Add to messages
    messages = history + [{"role": "user", "content": content}]
    
    # Check context limit
    context_check = await token_counter.check_context_limit(messages, model)
    if not context_check["within_limit"]:
        await interaction.followup.send(
            f"⚠️ Context too large: {context_check['input_tokens']:,} tokens. "
            f"Maximum: {context_check['max_tokens']:,} tokens.",
            ephemeral=True
        )
        return
    
    # Count input tokens
    input_count = await token_counter.count_message_tokens(messages, model)
    
    # Call API
    response = await openai_client.chat.completions.create(
        model=model,
        messages=messages
    )
    
    reply = response.choices[0].message.content
    
    # Get actual usage from API
    usage = response.usage
    actual_input = usage.prompt_tokens if usage else input_count['total_tokens']
    actual_output = usage.completion_tokens if usage else token_counter.count_text_tokens(reply, model)
    
    # Calculate cost
    cost = token_counter.estimate_cost(actual_input, actual_output, model)
    
    # Save to database
    await db_handler.save_token_usage(
        user_id=user_id,
        model=model,
        input_tokens=actual_input,
        output_tokens=actual_output,
        cost=cost,
        text_tokens=input_count['text_tokens'],
        image_tokens=input_count['image_tokens']
    )
    
    # Send response with cost
    await interaction.followup.send(f"{reply}\n\n💰 Cost: ${cost:.6f}")

Best Practices

1. Always Check Context Limits

Before making API calls, check if the messages fit within the model's context window.

2. Add Timestamps to Images

When storing images from Discord, always add a timestamp:

"timestamp": datetime.now().isoformat()

3. Filter History on Load

The database handler automatically filters expired images when loading history.

4. Count Before API Call

Count tokens before calling the API to provide accurate estimates and warnings.

5. Use Actual Usage from API

Prefer response.usage over estimates when available:

actual_input = usage.prompt_tokens if usage else estimated_tokens

6. Track Text and Image Separately

Store both text_tokens and image_tokens for detailed analytics.

7. Show Cost to Users

Always display the cost after operations so users are aware of usage.

Context Window Limits

Model           Context Limit
gpt-4o          128,000 tokens
gpt-4o-mini     128,000 tokens
gpt-4.1         128,000 tokens
gpt-4.1-mini    128,000 tokens
gpt-4.1-nano    128,000 tokens
gpt-5           200,000 tokens
gpt-5-mini      200,000 tokens
gpt-5-nano      200,000 tokens
o1              200,000 tokens
o1-mini         128,000 tokens
o3              200,000 tokens
o3-mini         200,000 tokens
gpt-4           8,192 tokens
gpt-3.5-turbo   16,385 tokens

Troubleshooting

Image Token Count Seems Wrong

  • Check if image was downloaded successfully
  • Verify image dimensions
  • Remember: high detail images use tile-based calculation

Expired Images Still Counted

  • Check that timestamps are in ISO format
  • Verify expiration threshold (23 hours)
  • Ensure _filter_expired_images() is called

Cost Calculation Incorrect

  • Verify model name matches MODEL_PRICING keys exactly
  • Check that pricing is per 1M tokens
  • Ensure input/output tokens are correct

Context Limit Exceeded

  • Trim conversation history (keep the last N messages; see the sketch below)
  • Reduce image detail level to "low"
  • Remove old images from history
  • Use a model with a larger context window
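
A minimal sketch of the history-trimming mitigation, assuming the check_context_limit() helper documented above (the exact trimming strategy is up to you):

from src.utils.token_counter import token_counter

async def trim_history_to_fit(messages, model, max_output_tokens=4096):
    """Drop the oldest non-system messages until the conversation fits."""
    while len(messages) > 1:
        check = await token_counter.check_context_limit(
            messages=messages,
            model=model,
            max_output_tokens=max_output_tokens
        )
        if check["within_limit"]:
            break
        # Preserve the system prompt at index 0; drop the oldest turn after it
        drop_index = 1 if messages[0].get("role") == "system" else 0
        messages.pop(drop_index)
    return messages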

Cleanup

Don't forget to close the token counter session when shutting down:

await token_counter.close()

This is typically done in the bot's cleanup/shutdown handler.
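
For example, with discord.py this can be hooked into an overridden close() on the bot class (a sketch; the class name and shutdown wiring here are illustrative, not the bot's actual code):

from discord.ext import commands
from src.utils.token_counter import token_counter

class ChatBot(commands.Bot):
    async def close(self):
        # Release the token counter's HTTP session before the bot shuts down
        await token_counter.close()
        await super().close()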