Use this guide if you store chat history in Redis, Postgres, or Mongo and send too much of it to the model on every request.

Before (Redis log)

before.py
import json

# Load the last 50 turns into the prompt
history = [json.loads(x) for x in redis.lrange(f"hist:{sid}", -50, -1)]
prompt = build_prompt(history, user_msg)
reply = llm(prompt)

# Append every turn once the reply exists
redis.rpush(f"hist:{sid}", json.dumps({"role": "user", "text": user_msg}))
redis.rpush(f"hist:{sid}", json.dumps({"role": "assistant", "text": reply}))
Pain points:
  • Hits the LLM context limit fast.
  • “Last 50” is a guess. Older relevant context gets evicted.
  • Redis memory grows with every turn, because nothing ever trims or expires the list.

After (Tex)

after.py
hits = tex.recall(q=user_msg, session_id=sid, top_k=8)
prompt = build_prompt_with_memory([h.text for h in hits.hits.turns], user_msg)
reply = llm(prompt)
tex.conversations.remember(session_id=sid, turns=[
    {"role":"user","text": user_msg, "timestamp": now_iso()},
    {"role":"assistant","text": reply, "timestamp": now_iso()},
])
Three things change:
  1. Bounded prompts. You pull the 8 most relevant turns, however many exist.
  2. Cross-session continuity. Use f"chat-{user_id}" as the session_id to share memory across a user's conversations.
  3. Less prompt trimming. You stop guessing which recent turns fit in context.
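This guide does not define build_prompt_with_memory. A minimal sketch of what such a helper might look like, assuming recalled turns arrive as plain strings and your model takes a single text prompt. The prompt layout and the max_chars guard are illustrative assumptions, not part of Tex:

```python
def build_prompt_with_memory(memories, user_msg, max_chars=4000):
    """Hypothetical helper: place recalled turns ahead of the new user message.

    `memories` is a list of strings (for example h.text for each recalled turn).
    `max_chars` is an illustrative guard that keeps the memory block bounded.
    """
    lines = []
    used = 0
    for text in memories:
        if used + len(text) > max_chars:
            break  # stop before the memory block exceeds the budget
        lines.append(f"- {text}")
        used += len(text)
    memory_block = "\n".join(lines) if lines else "(none)"
    return (
        "Relevant earlier turns:\n"
        f"{memory_block}\n\n"
        f"User: {user_msg}\n"
        "Assistant:"
    )


def shared_session_id(user_id):
    """Session id that shares memory across one user's conversations."""
    return f"chat-{user_id}"
```

Because top_k caps the number of recalled turns, the prompt size depends on top_k and the length of each turn, not on how long the conversation has run.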

Backfill plan

1. Dump Redis with a script

For each hist:* key, fetch all turns with their original timestamps.
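A sketch of the dump step, assuming a redis-py client (scan_iter and lrange are redis-py methods) and the hist:{sid} key layout from before.py. The dump_history name is hypothetical:

```python
import json


def dump_history(client, pattern="hist:*"):
    """Return {session_id: [turn, ...]} for every key matching `pattern`.

    `client` is assumed to be a redis-py client, or anything exposing
    scan_iter and lrange with the same behavior.
    """
    history = {}
    for key in client.scan_iter(match=pattern):
        # redis-py returns bytes unless decode_responses=True was set.
        name = key.decode() if isinstance(key, bytes) else key
        sid = name.split(":", 1)[1]
        raw_turns = client.lrange(key, 0, -1)  # 0..-1 reads the full list
        history[sid] = [json.loads(raw) for raw in raw_turns]
    return history
```

The turns written by before.py carry only role and text. If your log has no timestamp field, you need another source for the original timestamps before the bulk-remember step.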

2. Bulk-remember

for sid, turns in redis_history.items():
    formatted = [
        {"role": t["role"], "text": t["text"], "timestamp": t["ts"]}
        for t in turns
    ]
    # Big batches are fine. Pass all turns at once.
    tex.conversations.remember(session_id=sid, turns=formatted)

3. Verify

Pick 10 sessions. For each, run a known query and compare retrieved turns against your Redis log. Look for:
  • All turns are present (the active_fragment_ids count matches the number of input turns)
  • recall(q=<a known phrase>) finds the right turn
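The known-phrase check can be sketched as follows. verify_session and the recall_texts adapter are hypothetical names. The adapter is assumed to wrap tex.recall and return the recalled turn texts as strings:

```python
def verify_session(sid, expected_turns, known_phrase, recall_texts, top_k=8):
    """Check that a phrase known to be in the Redis log is retrievable.

    `recall_texts(q, session_id, top_k)` is a hypothetical adapter that calls
    Tex and returns the recalled turn texts as a list of strings.
    """
    source_texts = [t["text"] for t in expected_turns]
    if not any(known_phrase in text for text in source_texts):
        raise ValueError(f"{known_phrase!r} is not in the Redis log for {sid}")
    recalled = recall_texts(q=known_phrase, session_id=sid, top_k=top_k)
    found = any(known_phrase in text for text in recalled)
    return {"session_id": sid, "input_turns": len(source_texts), "phrase_found": found}
```

An adapter that mirrors after.py would be lambda q, session_id, top_k: [h.text for h in tex.recall(q=q, session_id=session_id, top_k=top_k).hits.turns].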

4. Shadow mode

Run both read paths in production for a week, still serving prompts from Redis. Log every request where Tex’s confidence is below 0.2. If that rate stays below your tolerance, proceed.
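One way to track that rate is a small counter, sketched below. This guide does not show where the confidence value lives on the recall response, so the sketch takes it as a plain float. ShadowStats is a hypothetical name:

```python
import logging

logger = logging.getLogger("tex_shadow")


class ShadowStats:
    """Track how often shadow reads fall below a confidence threshold."""

    def __init__(self, threshold=0.2):
        self.threshold = threshold
        self.total = 0
        self.low = 0

    def record(self, session_id, confidence):
        """Count one shadow read and log it if confidence is below threshold."""
        self.total += 1
        if confidence < self.threshold:
            self.low += 1
            logger.warning(
                "low Tex confidence %.2f for session %s", confidence, session_id
            )

    def low_rate(self):
        """Fraction of recorded reads below the threshold (0.0 if none yet)."""
        return self.low / self.total if self.total else 0.0
```

Call record once per shadow read and compare low_rate() against your tolerance at the end of the week.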

5. Cut over the read path

Switch the prompt to use Tex hits. Keep the Redis writes for one more week as backup.

6. Cut over the write path

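Stop appending to Redis. Then delete the hist:* keys (or drop the table, if your history lives in Postgres), unless you keep that log for audit purposes.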
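Steps 5 and 6 can be sketched as two flags on the request handler. handle_turn, read_from_tex, and write_to_redis are hypothetical names, and now_iso stands in for the helper used in after.py. The Tex calls are the ones shown in after.py:

```python
import json
from datetime import datetime, timezone


def now_iso():
    """Current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


def handle_turn(sid, user_msg, *, tex, redis, llm, build_prompt,
                read_from_tex=True, write_to_redis=True):
    """Handle one chat turn during cutover; all collaborators are passed in."""
    if read_from_tex:
        hits = tex.recall(q=user_msg, session_id=sid, top_k=8)
        context = [h.text for h in hits.hits.turns]
    else:
        # Old read path: the last 50 turns from the Redis list.
        raw = redis.lrange(f"hist:{sid}", -50, -1)
        context = [json.loads(x)["text"] for x in raw]
    reply = llm(build_prompt(context, user_msg))

    turns = [
        {"role": "user", "text": user_msg, "timestamp": now_iso()},
        {"role": "assistant", "text": reply, "timestamp": now_iso()},
    ]
    tex.conversations.remember(session_id=sid, turns=turns)
    if write_to_redis:
        # Backup writes; turn this flag off in step 6.
        for t in turns:
            redis.rpush(
                f"hist:{sid}", json.dumps({"role": t["role"], "text": t["text"]})
            )
    return reply
```

With both flags left on, reads come from Tex while Redis still receives every turn, which matches the one-week backup window. Setting write_to_redis=False completes the write-path cutover.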

Edge cases

  • Streaming responses. Persist with remember after the stream completes. Run it in the background so the next request is not delayed.
  • System messages. Do not migrate them. They consume tokens and add little recall value.
  • Tool calls. Store a text summary of the tool result as an assistant turn, not the raw JSON, because recall returns text.
  • Audit log. Tex is not an audit store. Keep Redis or another append-only log for compliance, and use Tex for retrieval.
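The streaming case can be sketched with a thread pool from the standard library. stream_and_remember is a hypothetical name, chunks is assumed to be any iterable of text pieces from your LLM client, and remember is the persistence callable (for example tex.conversations.remember):

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime, timezone

# Small shared pool so persistence never blocks the request thread.
_executor = ThreadPoolExecutor(max_workers=2)


def now_iso():
    """Current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


def stream_and_remember(sid, user_msg, chunks, remember, on_chunk=print):
    """Forward chunks to the caller, then persist the full reply off-thread.

    Returns the Future from the executor so callers can inspect errors.
    """
    user_ts = now_iso()
    parts = []
    for chunk in chunks:
        parts.append(chunk)
        on_chunk(chunk)  # deliver each piece to the client as it arrives
    reply = "".join(parts)
    turns = [
        {"role": "user", "text": user_msg, "timestamp": user_ts},
        {"role": "assistant", "text": reply, "timestamp": now_iso()},
    ]
    # Persist only after the stream has completed, in the background.
    return _executor.submit(remember, session_id=sid, turns=turns)
```

Exceptions raised inside remember stay on the returned Future, so check or log it somewhere if a silently lost turn matters to you.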