New here? Start with the Quickstart. Then read this overview and Authentication before you ship.
Most chat apps make you choose between two bad options. You either send the whole chat history to the model, or you lose memory when the page refreshes. Tex gives you a simpler path. Store each turn as it happens. When the user asks the next question, ask Tex for the few memories that matter. Then call your model with that smaller context. Your app still runs the model, routes, and UI. Tex handles storage, search, ranking, and usage tracking.

API

Call      When
remember  Store new turns, plus optional metadata.
recall    Before you call the model, ask for the most relevant memories.
Need access? Create an account in the dashboard, copy the API key once, then follow the Quickstart. Locally, set TEX_API_KEY or pass api_key= to the client.
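A minimal sketch of resolving the key in application code before you build the client. The helper name `resolve_api_key` is hypothetical and not part of tex-sdk, and the precedence shown (an explicit key wins over `TEX_API_KEY`) is an assumption about how you might want your own app to behave.

```python
import os


def resolve_api_key(explicit_key=None, env=None):
    """Pick the API key to hand to the client.

    Assumed precedence: an explicitly passed key first, then the
    TEX_API_KEY environment variable. Raises if neither is set.
    """
    env = os.environ if env is None else env
    key = explicit_key or env.get("TEX_API_KEY")
    if not key:
        raise RuntimeError(
            "No API key found: set TEX_API_KEY or pass api_key= to the client."
        )
    return key
```

Failing fast here gives a clear startup error instead of an authentication failure on the first request.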

Start

Quickstart

Install tex-sdk, store one turn, recall it, and print the score.

Benchmarks

LoCoMo and LongMemEval_S results with splits, latency, and token counts.

Benchmarks

LoCoMo · 93.3%

Full-system benchmark. Tex is ahead of EverMemOS (92.3%), MemMachine v0.2 (91.7%), Zep (~85%), and Mem0 (~66%). See Benchmarks for splits and methodology.

LongMemEval_S · 92.2%

Active retrieval track. Tex is ahead of Emergence AI (86.0%), Supermemory (81.6%), and Zep (71.2%). See Benchmarks for per-ability tables.

Loop

1. Remember

tex.conversations.remember(session_id="chat-1", turns=[
  {"role": "user", "text": "I'm allergic to shellfish.", "timestamp": "..."},
])

2. Recall

hits = tex.recall(q=user_msg, session_id="chat-1")
context = "\n".join(h.text for h in hits.hits.turns)

3. Generate

Put context where your model reads it. Answer the user. Store the new turns.
A low confidence score at the start usually means the session has very little memory yet. Store more real turns and the score becomes more useful.
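The three steps can be tried end to end without a network call. The sketch below uses `FakeMemory`, a hypothetical in-memory stand-in written for this example. It is not the Tex client, and its word-overlap scoring is only a placeholder for real ranking. `Turn`, `answer`, and the `generate` callback are also illustrative names.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Turn:
    role: str
    text: str


def _words(text):
    # Lowercase word tokens, so "shellfish." matches "shellfish".
    return set(re.findall(r"[a-z']+", text.lower()))


@dataclass
class FakeMemory:
    """Local stand-in for a memory service (NOT the Tex client)."""
    sessions: dict = field(default_factory=dict)

    def remember(self, session_id, turns):
        self.sessions.setdefault(session_id, []).extend(turns)

    def recall(self, q, session_id, top_k=3):
        # Placeholder ranking: count words shared with the query.
        query = _words(q)
        scored = []
        for turn in self.sessions.get(session_id, []):
            overlap = len(query & _words(turn.text))
            if overlap:
                scored.append((overlap, turn))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [turn for _, turn in scored[:top_k]]


def answer(memory, session_id, user_msg, generate):
    """Recall, generate, then store the new turns."""
    hits = memory.recall(q=user_msg, session_id=session_id)
    context = "\n".join(h.text for h in hits)
    reply = generate(context, user_msg)  # your model call goes here
    memory.remember(
        session_id,
        [Turn("user", user_msg), Turn("assistant", reply)],
    )
    return reply
```

Swapping `FakeMemory` for the real client changes the two memory calls only. The model call and the order of the steps stay the same.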

More

Latency: active write vs background work

The active write in remember returns quickly, and new turns are usually recallable within about 150 ms. Tex then continues background work, such as observations, entities, and timeline updates. The diagrams and timing notes are in How memory works.
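If your app recalls immediately after a write, a short retry can cover that window. This is a sketch under assumptions. `recall_with_retry` is a hypothetical helper, `recall_fn` is any zero-argument callable that returns a list of hits, and the defaults (4 attempts, 50 ms apart) are illustrative values chosen to span roughly 150 ms. An empty result can also mean that nothing relevant is stored, so use this only right after a write.

```python
import time


def recall_with_retry(recall_fn, attempts=4, delay_s=0.05):
    """Call recall_fn until it returns hits or attempts run out.

    Returns the last result, which is empty if every attempt was empty.
    """
    hits = []
    for attempt in range(attempts):
        hits = recall_fn()
        if hits:
            break
        # Sleep between attempts, but not after the final one.
        if attempt < attempts - 1:
            time.sleep(delay_s)
    return hits
```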

Isolation between customers

Use org_id, user_id, and session_id to keep memory separated. Scopes and multi-tenancy shows how to map those fields to your users and tenants.
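One way to keep the mapping consistent is to derive all three fields in a single place. The sketch below is illustrative only. `MemoryScope` and `scope_for` are hypothetical names, and the choice of which application identifiers feed each field is an assumption, so check Scopes and multi-tenancy for the actual rules.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class MemoryScope:
    org_id: str
    user_id: str
    session_id: str


def scope_for(tenant_id, account_id, conversation_id):
    """Map your app's identifiers to the three scope fields.

    Rejects missing values so a write never lands in a shared,
    unscoped bucket by accident.
    """
    parts = {
        "tenant_id": tenant_id,
        "account_id": account_id,
        "conversation_id": conversation_id,
    }
    missing = [name for name, value in parts.items() if value is None or value == ""]
    if missing:
        raise ValueError("missing scope fields: " + ", ".join(missing))
    return MemoryScope(
        org_id=str(tenant_id),
        user_id=str(account_id),
        session_id=str(conversation_id),
    )
```

Two tenants that reuse the same conversation ID still get different scopes, because `org_id` differs.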

Python vs raw HTTP

Use the Python SDK if you want token exchange and refresh handled for you. Use the REST API from another language, or when your service already owns HTTP calls.

Quotas and billing

Tex meters tokens_in and tokens_out with daily caps. Usage, quotas, and billing explains what counts and when limits reset.

Docs

How memory works

What lands in storage after remember.

Recall and ranking

Modes, top_k, confidence.

Python SDK

Install and client setup.

Cookbook

Apps, agents, production patterns.