Tex tracks two numbers: tokens_in and tokens_out. It counts them with tiktoken and the cl100k_base vocabulary. On the free tier, each org gets 1M tokens in and 5M tokens out per UTC day. If either limit is exceeded, the API returns 429. Every remember and recall response includes a usage object. The SDK helpers tex.usage.today() and tex.usage.summary() show the same totals as the dashboard.
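To get a rough client-side estimate of how many tokens a payload will consume before sending it, the same tokenizer can be run locally. The sketch below is an illustration, not part of the Tex SDK: `cl100k_encoder` and `estimate_tokens` are hypothetical helper names, and the server-side count may differ from this estimate if Tex adds any overhead of its own.

```python
from typing import Callable, Iterable, List


def cl100k_encoder() -> Callable[[str], List[int]]:
    """Return the encode function for tiktoken's cl100k_base vocabulary."""
    # Imported lazily so the rest of this module works without tiktoken installed.
    import tiktoken  # third-party: pip install tiktoken

    return tiktoken.get_encoding("cl100k_base").encode


def estimate_tokens(texts: Iterable[str], encode: Callable[[str], List[int]]) -> int:
    """Sum the token counts of several strings using the given encoder."""
    return sum(len(encode(text)) for text in texts)
```

A call such as `estimate_tokens(["first turn", "second turn"], cl100k_encoder())` then gives an approximate count to compare against the remaining daily quota.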

Free tier

tokens_in: 1,000,000 / day
tokens_out: 5,000,000 / day
Both reset at 00:00 UTC. Crossing either limit raises RateLimitError (HTTP 429).
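Because the quotas reset at a fixed time, a caller that hits the limit may prefer to wait for the reset rather than retry in a tight loop. A minimal sketch of that calculation, using only the standard library (`seconds_until_reset` is a hypothetical helper, not an SDK function):

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def seconds_until_reset(now: Optional[datetime] = None) -> float:
    """Seconds from `now` until the next 00:00 UTC boundary.

    `now` should be timezone-aware; it defaults to the current UTC time.
    """
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    # Move to tomorrow's date, then truncate to midnight.
    next_midnight = (now + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return (next_midnight - now).total_seconds()
```

One possible use is inside an `except RateLimitError:` handler, to decide whether to queue the work until the reset or to continue without memory.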

Usage in code

Per response

Every remember and recall returns usage for that call:
hits = tex.recall(q="...", session_id=sid)
print(hits.usage.tokens_in, hits.usage.tokens_out)
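
If you want a running total for a single process or request, the per-response values can be summed locally. This is a sketch under the assumption that every usage object exposes `tokens_in` and `tokens_out` as shown above; `UsageTally` is a hypothetical name, not an SDK class.

```python
from dataclasses import dataclass


@dataclass
class UsageTally:
    """Running total of per-response usage within one process."""

    tokens_in: int = 0
    tokens_out: int = 0
    calls: int = 0

    def add(self, usage) -> None:
        """Add one response's usage object (needs tokens_in and tokens_out)."""
        self.tokens_in += usage.tokens_in
        self.tokens_out += usage.tokens_out
        self.calls += 1
```

Calling `tally.add(hits.usage)` after each remember or recall keeps a local count. The org-level totals remain the authoritative numbers, since other processes in the same org also consume quota.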

Per org (dashboards, cron jobs)

today = tex.usage.today()         # daily totals + quota headroom
month = tex.usage.summary()       # current calendar month
march = tex.usage.summary("2026-03")
The Dashboard Usage page shows the same numbers.

Cost knobs

| Lever | Why it helps |
| --- | --- |
| Lower top_k | Defaults are 15 / 25. Live chat often needs only 5-8. |
| Stay on active | deep mode costs more time and tokens than active. |
| Trim noisy writes | Skip one-word acks and redundant system spam in remember. |
| Batch turns | Send many turns in one remember instead of dozens of calls. |
| Quota-aware routing | Fall back to “no memory” for non-critical paths when you are near the cap. |
# Inside a request handler: skip memory once 90% of the daily tokens_in quota is used.
if tex.usage.today().tokens_in_used / 1_000_000 > 0.9:
    return generate_without_memory(query)
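
The "Trim noisy writes" and "Batch turns" levers can be combined in a small client-side buffer. The sketch below is one possible shape, not SDK code: `TurnBatcher`, `is_noise`, and the `ACKS` list are hypothetical, and the turn format (a dict with a `content` key) is an assumption. The `send` callable is left to the caller because the exact remember payload is not specified here.

```python
from typing import Callable, Dict, List

Turn = Dict[str, str]

# Hypothetical list of low-value acknowledgements; tune it for your traffic.
ACKS = {"ok", "okay", "thanks", "yes", "no", "k"}


def is_noise(turn: Turn) -> bool:
    """True for empty turns and one-word acknowledgements."""
    words = turn["content"].strip().lower().rstrip(".!").split()
    return len(words) == 0 or (len(words) == 1 and words[0] in ACKS)


class TurnBatcher:
    """Buffer turns and hand them to `send` in one call per batch."""

    def __init__(self, send: Callable[[List[Turn]], None], max_turns: int = 20):
        self.send = send
        self.max_turns = max_turns
        self.buffer: List[Turn] = []

    def add(self, turn: Turn) -> None:
        if is_noise(turn):
            return  # trimmed: never written to memory
        self.buffer.append(turn)
        if len(self.buffer) >= self.max_turns:
            self.flush()

    def flush(self) -> None:
        """Send whatever is buffered; does nothing when the buffer is empty."""
        if self.buffer:
            self.send(self.buffer)
            self.buffer = []
```

Here `send` would wrap a single remember call for the whole batch. Call `flush()` at the end of a session so a partially filled buffer is not lost.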

Alerting

There is no hosted email alert yet. Poll tex.usage.today() from your own monitor. Page your team when either usage column crosses ~90% of quota. Server-side emails near 80% are on the roadmap.
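
The threshold check for such a monitor might look like the sketch below. It hard-codes the free-tier quotas stated on this page; `columns_over_threshold` is a hypothetical helper, and how you page your team is left out.

```python
from typing import Dict, List

# Free-tier daily quotas stated on this page.
DAILY_QUOTA: Dict[str, int] = {"tokens_in": 1_000_000, "tokens_out": 5_000_000}


def columns_over_threshold(used: Dict[str, int], threshold: float = 0.9) -> List[str]:
    """Return the usage columns whose used/quota ratio is at or above threshold."""
    return [
        name
        for name, quota in DAILY_QUOTA.items()
        if used.get(name, 0) / quota >= threshold
    ]
```

A cron job could build the `used` dict from `tex.usage.today()` and alert when the returned list is non-empty. The `tokens_in_used` field appears earlier on this page; a matching `tokens_out_used` field is an assumption by analogy and should be checked against the actual response.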

Pricing (later)

Billing will be pay-as-you-go once pricing is published. Daily caps stay in place as safety rails. Until the billing docs change, treat today’s 429 behavior as the source of truth.

Install SDK

pip install tex-sdk