tokens_in and tokens_out. It counts them with tiktoken and the cl100k_base vocabulary.
On the free tier, each org gets 1M tokens in and 5M tokens out per UTC day. If either limit is exceeded, the API returns 429. Every remember and recall response includes a usage object. The SDK helpers tex.usage.today() and tex.usage.summary() show the same totals as the dashboard.
Free tier
tokens_in
1,000,000 / day
tokens_out
5,000,000 / day
RateLimitError (HTTP 429).
Usage in code
Per response
Everyremember and recall returns usage for that call:
Per org (dashboards, cron jobs)
Cost knobs
| Lever | Why it helps |
|---|---|
Lower top_k | Defaults are 15 / 25. Live chat often needs only 5-8. |
Stay on active | deep mode costs more time and tokens than active. |
| Trim noisy writes | Skip one-word acks and redundant system spam in remember. |
| Batch turns | Send many turns in one remember instead of dozens of calls. |
| Quota-aware routing | Fall back to “no memory” for non-critical paths when you are near the cap. |
Alerting
There is no hosted email alert yet. Polltex.usage.today() from your own monitor. Page your team when either usage column crosses ~90% of quota. Server-side emails near 80% are on the roadmap.
Pricing (later)
Billing will be pay-as-you-go once pricing is published. Daily caps stay in place as safety rails. Until the billing docs change, treat today’s429 behavior as the source of truth.
Install SDK
pip install tex-sdk
