Multi-tenant SaaS pattern - Tex | Memory API for agents

There are two common ways to isolate tenants. Most teams should start with Pattern A.

A - one key, put the user in `session_id`

Share a cached Tex client

deps.py

from functools import cache
from tex import Tex
import os

@cache
def shared_tex() -> Tex:
    return Tex(
        api_key=os.environ["TEX_API_KEY"],
        base_url=os.environ["TEX_BASE_URL"],
    )

Derive session ids per user + conversation

chat.py

from datetime import datetime, timezone

from fastapi import APIRouter, Header
from pydantic import BaseModel

from .deps import shared_tex

router = APIRouter()

class ChatBody(BaseModel):
    text: str
    session_id: str

@router.post("/chat")
def chat(body: ChatBody, x_user_id: str = Header(...)):
    tex = shared_tex()
    # Prefix the session id with the user id to scope memory per user.
    sid = f"u_{x_user_id}-{body.session_id}"

    hits = tex.recall(q=body.text, session_id=sid)
    answer = your_llm(body.text, memory=hits)  # placeholder for your model call

    # Stamp both turns with the current UTC time rather than a fixed date.
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    tex.conversations.remember(
        session_id=sid,
        turns=[
            {"role": "user", "text": body.text, "timestamp": now},
            {"role": "assistant", "text": answer, "timestamp": now},
        ],
    )
    return {"answer": answer}

Replace your_llm(...) with your model call. Reuse the same turn format you already store.

Trait	Pattern A
Tex keys you operate	1
Bills	1 (yours)
Dashboard for end-users	You build it
Isolation	Strong, as long as you do not collide `session_id`
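The isolation guarantee depends on session ids never colliding. A plain `u_{user}-{session}` join can be ambiguous when ids themselves contain `-` (user `a-b` with session `c` and user `a` with session `b-c` produce the same string). A minimal sketch of one way to avoid that, assuming Tex accepts arbitrary strings as `session_id`; the helper name `tenant_session_id` is hypothetical and not part of the SDK:

```python
def tenant_session_id(user_id: str, conversation_id: str) -> str:
    """Build a session id that stays unique per (user, conversation) pair.

    Hypothetical helper, not part of the Tex SDK. The user id is
    length-prefixed, so the boundary between the two parts is unambiguous
    even when either part contains '-' or '_'.
    """
    if not user_id or not conversation_id:
        raise ValueError("user_id and conversation_id must be non-empty")
    return f"u{len(user_id)}_{user_id}-{conversation_id}"


if __name__ == "__main__":
    # The naive join would map both of these to "u_a-b-c".
    print(tenant_session_id("a-b", "c"))
    print(tenant_session_id("a", "b-c"))
```

The length prefix keeps the id human-readable; hashing the pair would also work if opaque ids are acceptable.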

The SDK accepts `user_id` in the constructor, but that scope applies to the whole client instance today, so isolating users this way would mean one Tex client, and one connection pool, per end user. Until per-call `user_id` ships in SDK 1.2, keep the tenant in `session_id`.

B - one key per customer org

Mint a fresh org when you onboard

signup.py

import httpx

def onboard_end_user(end_user_email: str) -> str:
    resp = httpx.post(
        "https://api.getmetacognition.com/signup",
        json={"name": end_user_email},
    )
    resp.raise_for_status()
    data = resp.json()
    db.users.update(
        end_user_email,
        tex_api_key=data["api_key"],
        tex_org_id=data["org_id"],
    )
    return data["api_key"]

Look up the right client per request

import os
from tex import Tex

def tex_for_user(end_user_id: str) -> Tex:
    row = db.users.get(end_user_id)
    return Tex(api_key=row.tex_api_key, base_url=os.environ["TEX_BASE_URL"])

Cache instances in a TTL map for about 1 hour. That keeps warm connections without keeping every customer client forever.
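The TTL map can be sketched as below. This is an illustration under assumptions, not SDK functionality: `TTLClientCache` is a hypothetical helper, the factory stands in for something like `tex_for_user`, and the injectable clock exists only to make expiry testable.

```python
import threading
import time
from typing import Callable, Dict, Generic, Tuple, TypeVar

T = TypeVar("T")


class TTLClientCache(Generic[T]):
    """Hypothetical helper: caches one client per key for a limited time."""

    def __init__(
        self,
        factory: Callable[[str], T],
        ttl_seconds: float = 3600.0,  # about 1 hour, as suggested above
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._factory = factory
        self._ttl = ttl_seconds
        self._clock = clock
        self._lock = threading.Lock()
        self._entries: Dict[str, Tuple[float, T]] = {}

    def get(self, key: str) -> T:
        """Return the cached client for key, rebuilding it once the TTL has passed."""
        now = self._clock()
        with self._lock:
            entry = self._entries.get(key)
            if entry is not None and now < entry[0]:
                return entry[1]
            # The factory runs under the lock: simple, but it briefly
            # blocks lookups for other keys while a client is built.
            client = self._factory(key)
            self._entries[key] = (now + self._ttl, client)
            return client

    def purge_expired(self) -> int:
        """Drop expired entries so idle customers do not stay in memory."""
        now = self._clock()
        with self._lock:
            stale = [k for k, (expires, _) in self._entries.items() if now >= expires]
            for k in stale:
                del self._entries[k]
            return len(stale)
```

In an app this might be wired up as `clients = TTLClientCache(tex_for_user)` and then `clients.get(end_user_id)` per request, with `purge_expired()` called from a periodic job. Expired entries are only replaced on access, so without the purge an idle customer's client stays referenced.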

Trait	Pattern B
Tex keys	One per paying customer
Bills	Per customer org
Dashboard	Each customer can log into Tex directly
Isolation	Hard boundary at org level

Which pattern to choose

Pattern A

You ship an app on top of Tex. Infrastructure is shared, ops are simpler, and metering stays in your product.

Pattern B

You resell Tex and customers expect their own bill and console.

Shared quota (A only)

Daily quotas are per Tex org. Under Pattern A, every user shares your quota. One noisy tenant can affect everyone. Add these controls in your own app:

Track per-user bytes and tokens yourself (usage is returned on every response).
Soft-cap heavy users, for example by switching off memory once a user consumes 10% of your daily budget.
Poll tex.usage.today() and degrade gracefully once the org passes roughly 90% of its daily quota.
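The controls above could be combined in a small in-process meter. This is a sketch under assumptions: `UsageMeter` is a hypothetical class, the unit (bytes or tokens) is whatever you read from each response, and `org_used` is a number you derive yourself from `tex.usage.today()`, whose return shape is not shown here.

```python
from typing import Dict


class UsageMeter:
    """Hypothetical per-user meter for a shared daily Tex quota."""

    def __init__(
        self,
        daily_budget: int,
        user_share: float = 0.10,  # soft cap per user: 10% of the budget
        org_share: float = 0.90,   # degrade for everyone at about 90%
    ) -> None:
        self.user_cap = daily_budget * user_share
        self.org_cap = daily_budget * org_share
        self._per_user: Dict[str, int] = {}

    def record(self, user_id: str, units: int) -> None:
        """Add the usage reported on one response to the user's running total."""
        self._per_user[user_id] = self._per_user.get(user_id, 0) + units

    def memory_allowed(self, user_id: str, org_used: int) -> bool:
        """False once the org is near its quota or this user has hit the soft cap."""
        if org_used >= self.org_cap:
            return False
        return self._per_user.get(user_id, 0) < self.user_cap

    def reset(self) -> None:
        """Clear the counters; call this when the daily quota window rolls over."""
        self._per_user.clear()


if __name__ == "__main__":
    meter = UsageMeter(daily_budget=1000)
    meter.record("heavy", 150)
    meter.record("light", 20)
    print(meter.memory_allowed("heavy", org_used=170))  # over the per-user cap
    print(meter.memory_allowed("light", org_used=170))  # still under both limits
```

When `memory_allowed` returns false, the request handler can skip the recall and remember calls and answer without memory. With more than one app process, the counters would need to live in shared storage instead of a local dict.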
