# cost-limit – Token Budget Enforcement

Severity: CRITICAL · Auto-fix: No · Category: 💰 Cost
## What It Does
Fires at CRITICAL level when the prompt exceeds the configured `token_limit`. Use this to catch accidentally bloated prompts before they hit API rate limits or overflow your context window.
## Example
Prompt: 850 tokens, `token_limit: 800`

```
[ CRITICAL ] cost-limit (line -)
Prompt exceeds token limit: 850/800 tokens.
Consider removing redundant content or splitting into smaller prompts.
```

## Configuration
```yaml
token_limit: 800 # Default
rules:
  cost_limit: true
```

## Choosing a Limit
Set `token_limit` to 60–70% of your model's context window to leave room for the completion:
| Model | Context window | Recommended token_limit |
|---|---|---|
| GPT-4o | 128k | 500–2000 (system prompt budget) |
| GPT-4 | 8k | 500–1000 |
| GPT-3.5 Turbo | 16k | 500–1500 |
| Claude 3.5 Sonnet | 200k | 500–5000 |
| Gemini 1.5 Pro | 1M | 500–10000 |
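The 60–70% guideline can be expressed as a small helper. This is an illustrative sketch, not part of the linter; the function name and the 0.65 default are assumptions:

```python
def recommended_token_limit(context_window: int, fraction: float = 0.65) -> int:
    # Budget a fraction of the context window for the prompt,
    # reserving the remainder (~35% here) for the completion.
    return int(context_window * fraction)
```

For an 8k-context model this yields 5200; in practice you would often set a much tighter budget (as in the table above) when the rule is guarding a system prompt rather than the whole conversation.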
## Disabling
```yaml
rules:
  cost_limit: false
```

Or raise the limit:
```yaml
token_limit: 2000
```