# Quick Reference: Token Counting System

## Import
```python
from src.utils.token_counter import token_counter
```

## Text Tokens
```python
tokens = token_counter.count_text_tokens("Hello!", "openai/gpt-4o")
```

## Image Tokens
```python
# From URL (Discord CDN)
tokens = await token_counter.count_image_tokens(
    image_url="https://cdn.discordapp.com/...",
    detail="auto"  # or "low" or "high"
)

# From bytes
tokens = await token_counter.count_image_tokens(
    image_data=image_bytes,
    detail="auto"
)
```

## Message Tokens
```python
messages = [
    {"role": "system", "content": "You are helpful."},
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "Look at this"},
            {
                "type": "image_url",
                "image_url": {"url": "https://...", "detail": "auto"},
                "timestamp": "2025-10-01T12:00:00"  # Add for 24h expiration
            }
        ]
    }
]

counts = await token_counter.count_message_tokens(messages, "openai/gpt-4o")
# Returns: {
#     "text_tokens": 50,
#     "image_tokens": 500,
#     "total_tokens": 550
# }
```

## Context Check
```python
check = await token_counter.check_context_limit(messages, "openai/gpt-4o")

if not check["within_limit"]:
    print(f"⚠️ Too large: {check['input_tokens']} tokens")
    print(f"Max: {check['max_tokens']} tokens")
else:
    print(f"✅ OK! {check['available_output_tokens']} tokens available")
```

## Cost Estimate
```python
cost = token_counter.estimate_cost(
    input_tokens=1000,
    output_tokens=500,
    model="openai/gpt-4o"
)
print(f"Cost: ${cost:.6f}")
```
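
For intuition about what `estimate_cost` returns, the arithmetic can be sketched as below. This is only an illustration. `EXAMPLE_PRICING`, its per-million-token layout, and the $5/$15 rates are assumptions, chosen because they reproduce the `cost=0.0125` figure used in the Save Usage example. The bot's actual `MODEL_PRICING` table may be structured and priced differently.

```python
# Hypothetical pricing table in USD per 1M tokens (not the bot's real MODEL_PRICING).
EXAMPLE_PRICING = {
    "openai/gpt-4o": {"input": 5.00, "output": 15.00},
}


def sketch_estimate_cost(input_tokens: int, output_tokens: int, model: str) -> float:
    """Return an estimated cost in USD from per-million-token rates."""
    rates = EXAMPLE_PRICING[model]
    input_cost = input_tokens / 1_000_000 * rates["input"]
    output_cost = output_tokens / 1_000_000 * rates["output"]
    return input_cost + output_cost


print(f"Cost: ${sketch_estimate_cost(1000, 500, 'openai/gpt-4o'):.6f}")
```

With these assumed rates, 1,000 input tokens cost $0.005 and 500 output tokens cost $0.0075, so the sketch prints `Cost: $0.012500`.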

## Save Usage (Database)
```python
await db_handler.save_token_usage(
    user_id=123456789,
    model="openai/gpt-4o",
    input_tokens=1000,
    output_tokens=500,
    cost=0.0125,
    text_tokens=950,
    image_tokens=50
)
```

## Get User Stats
```python
# Total usage
stats = await db_handler.get_user_token_usage(user_id)
print(f"Total: ${stats['total_cost']:.6f}")
print(f"Text: {stats['total_text_tokens']:,}")
print(f"Images: {stats['total_image_tokens']:,}")

# By model
model_usage = await db_handler.get_user_token_usage_by_model(user_id)
for model, usage in model_usage.items():
    print(f"{model}: ${usage['cost']:.6f}, {usage['requests']} reqs")
```

## Model Encodings

### o200k_base (200k vocabulary)
- gpt-4o, gpt-4o-mini
- **gpt-4.1, gpt-4.1-mini, gpt-4.1-nano** ⭐
- gpt-5 (all variants)
- o1, o3, o4 (all variants)

### cl100k_base (100k vocabulary)
- gpt-4 (original)
- gpt-3.5-turbo
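
The two lists above amount to a prefix lookup, which might be sketched as follows. `encoding_name_for` and the prefix tuples are hypothetical names for illustration, not the bot's actual code, and raising on an unknown model is an assumption about how unlisted models should be handled.

```python
# Prefixes taken from the lists above. The o200k prefixes are checked first,
# because "gpt-4o" and "gpt-4.1" would otherwise match the plain "gpt-4" prefix.
O200K_PREFIXES = ("gpt-4o", "gpt-4.1", "gpt-5", "o1", "o3", "o4")
CL100K_PREFIXES = ("gpt-4", "gpt-3.5-turbo")


def encoding_name_for(model: str) -> str:
    """Map a model name such as "openai/gpt-4o" to its encoding name."""
    name = model.split("/", 1)[-1]  # drop a provider prefix like "openai/"
    if name.startswith(O200K_PREFIXES):
        return "o200k_base"
    if name.startswith(CL100K_PREFIXES):
        return "cl100k_base"
    raise ValueError(f"No encoding known for model: {model}")
```

If the counter is built on the `tiktoken` package (an assumption), the returned name is what `tiktoken.get_encoding` expects.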

## Image Token Costs

| Detail | Cost |
|--------|------|
| Low | 85 tokens |
| High | 85 + (170 × tiles) |

Tiles = ceil(width/512) × ceil(height/512), computed after the image is first scaled down to fit within 2048×2048 and then scaled down so that its shortest side is 768px.
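
As a worked example of the tile rule, here is a minimal sketch. `estimate_image_tokens` is a hypothetical helper, not the bot's `count_image_tokens`. The constants are parameters whose defaults (85 for low detail, 85 base plus 170 per tile for high detail) follow OpenAI's published formula, and whether the bot's counter uses the same values is an assumption. Treating any detail other than `"low"` as high is also an assumption.

```python
import math


def estimate_image_tokens(width, height, detail="high",
                          low_tokens=85, base_tokens=85, tile_tokens=170):
    """Estimate image tokens from pixel dimensions (sketch, see assumptions above)."""
    if detail == "low":
        return low_tokens

    # Step 1: scale down (never up) so the image fits within 2048x2048.
    scale = min(1.0, 2048 / max(width, height))
    w, h = width * scale, height * scale

    # Step 2: scale down (never up) so the shortest side is at most 768px.
    scale = min(1.0, 768 / min(w, h))
    w, h = w * scale, h * scale

    # Step 3: count the 512px tiles needed to cover the scaled image.
    tiles = math.ceil(w / 512) * math.ceil(h / 512)
    return base_tokens + tile_tokens * tiles
```

With the default constants, a 1024×1024 image scales to 768×768, needs 4 tiles, and costs 85 + 170 × 4 = 765 tokens. A 2048×4096 image scales to 768×1536, needs 6 tiles, and costs 1,105 tokens.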

## Context Limits

| Model | Tokens |
|-------|--------|
| gpt-4o, gpt-4o-mini, gpt-4.1* | 128,000 |
| gpt-5*, o1-mini, o1-preview | 128,000-200,000 |
| o1, o3, o4 | 200,000 |
| gpt-4 | 8,192 |
| gpt-3.5-turbo | 16,385 |

## Discord Image Timestamps

Always add a timestamp when storing images:
```python
from datetime import datetime

image_item = {
    "type": "image_url",
    "image_url": {"url": discord_url, "detail": "auto"},
    "timestamp": datetime.now().isoformat()  # ← Important!
}
```

Images older than 23 hours are filtered out automatically, which leaves a margin before the 24-hour expiration.
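
The age-based filtering can be sketched as follows. `drop_expired_images` and `MAX_IMAGE_AGE` are hypothetical names, not the actual `db_handler` code. Dropping images whose timestamp is missing or unparsable is an assumption, as is the use of naive local timestamps like those produced by `datetime.now().isoformat()`.

```python
from datetime import datetime, timedelta

MAX_IMAGE_AGE = timedelta(hours=23)  # cutoff stated above


def drop_expired_images(content, now=None):
    """Return the content items, omitting image items older than MAX_IMAGE_AGE."""
    now = now or datetime.now()
    kept = []
    for item in content:
        if item.get("type") != "image_url":
            kept.append(item)  # text and other items pass through unchanged
            continue
        try:
            age = now - datetime.fromisoformat(item["timestamp"])
        except (KeyError, TypeError, ValueError):
            continue  # assumption: images without a usable timestamp are dropped
        if age <= MAX_IMAGE_AGE:
            kept.append(item)
    return kept
```

Passing `now` explicitly keeps the function easy to test. In normal use it can be omitted.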

## Complete Integration Pattern

```python
from datetime import datetime


async def handle_message(interaction, text, image_urls=None):
    user_id = interaction.user.id
    model = await db_handler.get_user_model(user_id) or "openai/gpt-4o"
    history = await db_handler.get_history(user_id)

    # Build content
    content = [{"type": "text", "text": text}]
    if image_urls:
        for url in image_urls:
            content.append({
                "type": "image_url",
                "image_url": {"url": url, "detail": "auto"},
                "timestamp": datetime.now().isoformat()
            })

    messages = history + [{"role": "user", "content": content}]

    # Check context
    check = await token_counter.check_context_limit(messages, model)
    if not check["within_limit"]:
        await interaction.followup.send(
            f"⚠️ Too large: {check['input_tokens']:,} tokens",
            ephemeral=True
        )
        return

    # Count tokens
    input_count = await token_counter.count_message_tokens(messages, model)

    # Call API
    response = await openai_client.chat.completions.create(
        model=model,
        messages=messages
    )

    reply = response.choices[0].message.content or ""  # content can be None

    # Get usage
    usage = response.usage
    actual_in = usage.prompt_tokens if usage else input_count['total_tokens']
    actual_out = usage.completion_tokens if usage else token_counter.count_text_tokens(reply, model)

    # Calculate cost
    cost = token_counter.estimate_cost(actual_in, actual_out, model)

    # Save
    await db_handler.save_token_usage(
        user_id=user_id,
        model=model,
        input_tokens=actual_in,
        output_tokens=actual_out,
        cost=cost,
        text_tokens=input_count['text_tokens'],
        image_tokens=input_count['image_tokens']
    )

    # Respond
    await interaction.followup.send(f"{reply}\n\n💰 ${cost:.6f}")
```

## Cleanup

At bot shutdown:
```python
await token_counter.close()
```

## Key Points

- ✅ **Always add timestamps** to Discord images
- ✅ **Check context limits** before API calls
- ✅ **Use actual usage** from the API response when available
- ✅ **Track text and image tokens separately** for analytics
- ✅ **Show cost** to users
- ✅ **Expired images are filtered** automatically (done by db_handler)

## Troubleshooting

**Tokens seem wrong?**
→ Check model name and encoding

**Images not counted?**
→ Verify URL is accessible and timestamp is valid

**Context errors?**
→ Trim history or use "low" detail for images

**Cost incorrect?**
→ Check MODEL_PRICING and use actual API usage
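
For the "Context errors?" item above, trimming history might look like the sketch below. `trim_history` is a hypothetical helper. `count_tokens` stands in for whatever synchronous counting function is available, and keeping all system messages while moving them to the front is an assumption about the desired behavior.

```python
def trim_history(messages, max_tokens, count_tokens):
    """Drop the oldest non-system messages until the total fits max_tokens."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    while rest and count_tokens(system + rest) > max_tokens:
        rest.pop(0)  # discard the oldest turn first
    return system + rest


# Usage with a toy counter that treats each whitespace-separated word as one token.
def word_count(msgs):
    return sum(len(m["content"].split()) for m in msgs)


history = [
    {"role": "system", "content": "be brief"},
    {"role": "user", "content": "one two three"},
    {"role": "assistant", "content": "four five"},
    {"role": "user", "content": "six"},
]
trimmed = trim_history(history, max_tokens=5, count_tokens=word_count)
```

Here the oldest user message is dropped, leaving the system message plus the two most recent turns. In the bot, the budget would come from `check_context_limit` and the count from the `total_tokens` value of `count_message_tokens`.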