Use cases

Customer support agents that remember every ticket

Give your support agent memory of every past ticket, the customer's plan, and what's already been tried, so it stops asking people to repeat themselves.

There's a particular kind of frustration that comes from explaining your problem to a support bot, getting nowhere, and then having to explain the whole thing again to the next agent. The bot didn't remember. It never does. Each conversation starts from nothing.

The model is perfectly capable. What it lacks is somewhere to put what it learns, so it learns nothing. This page is about fixing that, by giving a support agent real memory of the customer, the account, and everything that's already been tried.

The problem

A support agent without memory is stuck in the present tense. It can read the current ticket, but it has no idea this is the third time the same customer has raised the same issue. It doesn't know they upgraded to enterprise last week, so it keeps offering them workarounds for limits they no longer have. It doesn't know that the fix it's about to suggest was already tried and didn't work.

The usual first fix is to stuff the conversation history into the prompt. That holds up for about a day. Then the history gets too long for the context window, facts start contradicting each other, and the agent has no way to tell which version is current. Storing transcripts in Postgres gives you persistence, but persistence isn't memory. A pile of logs doesn't tell the agent that the customer's plan changed, only that two conflicting statements both exist somewhere.

The other half of the problem is that the answer often isn't in memory at all. It's in your help docs, your past resolved tickets, your product guides, and the agent needs to find the relevant passage, not the document that happens to share the most keywords. Plain keyword search misses anything phrased differently from how the customer asked it.

And underneath all of it: every customer's data has to stay separate from every other customer's. One support tenant's memory leaking into another's isn't a bug, it's an incident.

What Exabase unlocks

Put proper infrastructure underneath and the agent stops behaving like a goldfish.

A customer writes in about a sync failure. The agent already knows this account is on the Team plan, that they reported something similar in March, and that the March fix was a token refresh that worked. It opens with that context instead of asking them to describe their setup from scratch. The customer feels recognised, which is most of what good support actually is.

A week after a customer upgrades from free to enterprise, they ask why a feature is locked. A memory-less agent reasons from the stale plan tier and gives them the free-plan answer, which is wrong and a little insulting. An agent with self-organising memory knows the relationship changed. The upgrade superseded the old fact, so it answers them as the enterprise customer they now are.

A customer asks a question whose answer lives in a help article nobody's looked at in months. The agent searches across all of it by meaning, finds the exact paragraph, and answers with it, rather than returning a list of articles for the customer to read themselves.

None of this requires a smarter model. It requires the model to have somewhere to remember, somewhere to search, and a guarantee that one customer's world stays sealed off from the next.

How it works

Three Exabase primitives do the work here. You don't need the whole platform for a support agent. You need these.

Memory

Exabase Memory is where the agent keeps what it learns about a customer: their plan, their past issues, what's been tried, how they like to be spoken to. You send in the conversation and Exabase extracts the facts automatically, so you don't have to decide in advance what's worth keeping.

The important part is what happens when facts change. When a customer upgrades, the old plan tier doesn't sit alongside the new one waiting to confuse things; Exabase resolves the contradiction. It's a process worth understanding properly if you've been burned by stale context before (see entity resolution and memory drift). The memory gets more accurate the more the customer interacts, not noisier. This is the bit that vector search alone can't do, and it's why a vector database isn't a memory system.
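To make the idea of one fact superseding another concrete, here is a minimal in-memory sketch. It is not the Exabase API and says nothing about how Exabase resolves contradictions internally; ToyMemory, Fact, remember, and current are hypothetical names made up for illustration. It only shows the behaviour described above: a newer value for the same subject and attribute retires the older one instead of sitting beside it.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Fact:
    subject: str      # e.g. "account:acme"
    attribute: str    # e.g. "plan"
    value: str
    recorded_at: datetime
    superseded: bool = False


@dataclass
class ToyMemory:
    """Hypothetical in-memory stand-in for a memory store; not the Exabase API."""

    facts: list = field(default_factory=list)

    def remember(self, subject: str, attribute: str, value: str) -> None:
        # Retire any earlier value for the same (subject, attribute) pair.
        for fact in self.facts:
            if fact.subject == subject and fact.attribute == attribute:
                fact.superseded = True
        self.facts.append(Fact(subject, attribute, value, datetime.now(timezone.utc)))

    def current(self, subject: str, attribute: str) -> Optional[str]:
        # Walk newest-first and return the first value that is still live.
        for fact in reversed(self.facts):
            if (fact.subject, fact.attribute) == (subject, attribute) and not fact.superseded:
                return fact.value
        return None


memory = ToyMemory()
memory.remember("account:acme", "plan", "free")
memory.remember("account:acme", "plan", "enterprise")
print(memory.current("account:acme", "plan"))  # enterprise
```

The old "free" fact is still stored, which keeps the history inspectable, but the agent only ever reasons from the live value.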

Bases

A Base is an isolated environment, with its own memory, its own storage, and its own search index. For a support product, you create one Base per customer account (or per end user, depending on how you're structured). Everything that agent remembers about that customer is scoped to that Base and can't leak into another. You get multi-tenancy from a single API call instead of building partitioning logic and praying it holds.

Every memory carries creation and modification timestamps too, so when someone asks why the agent said what it said, there's an actual trail to point at.
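The isolation property can be sketched with a toy registry. Everything here is an assumption for illustration: ToyBases and its methods are hypothetical and stand in for whatever calls the real API exposes. The point is the shape: every read and write takes a Base ID, nothing looks outside it, and the Base ID is stored against the account record.

```python
import uuid
from collections import defaultdict
from datetime import datetime, timezone


class ToyBases:
    """Hypothetical in-memory stand-in for per-customer Bases; not the Exabase API."""

    def __init__(self) -> None:
        self._known = set()
        self._memories = defaultdict(list)

    def create_base(self) -> str:
        base_id = str(uuid.uuid4())
        self._known.add(base_id)
        return base_id

    def add_memory(self, base_id: str, text: str) -> None:
        self._require(base_id)
        now = datetime.now(timezone.utc)
        # Each memory carries creation and modification timestamps.
        self._memories[base_id].append(
            {"text": text, "created_at": now, "updated_at": now}
        )

    def list_memories(self, base_id: str) -> list:
        # Reads are keyed by Base ID and never look outside it.
        self._require(base_id)
        return [m["text"] for m in self._memories[base_id]]

    def _require(self, base_id: str) -> None:
        if base_id not in self._known:
            raise KeyError(f"unknown base: {base_id}")


bases = ToyBases()

# Store the Base ID against each account record at account creation.
accounts = {
    "acme": {"base_id": bases.create_base()},
    "globex": {"base_id": bases.create_base()},
}

bases.add_memory(accounts["acme"]["base_id"], "Plan: enterprise")
bases.add_memory(accounts["globex"]["base_id"], "Plan: free")

print(bases.list_memories(accounts["acme"]["base_id"]))  # ['Plan: enterprise']
```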

Deep Search

Deep Search is how the agent finds answers that aren't in memory, across help docs, past resolved tickets, product guides, whatever you've stored. It searches inside content at the paragraph level, semantically rather than by keyword, so a customer asking about "my files won't go up" matches an article titled "troubleshooting upload errors." Results come back as specific chunks with a relevance score, ready to hand to the model as context.

It's hybrid by default, combining semantic with typo-tolerant keyword matching, so exact matches like error codes still land while natural-language questions still work.
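As a rough illustration of why blending the two signals helps, the sketch below scores chunks with a weighted mix of cosine similarity and a typo-tolerant keyword match. This is not how Deep Search is implemented; the weighting, the 0.8 fuzzy threshold, the function names, and the hand-made three-dimensional vectors (standing in for real embeddings) are all assumptions.

```python
import math
from difflib import SequenceMatcher


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def fuzzy_keyword_score(query, text, threshold=0.8):
    """Fraction of query tokens that have a near-identical token in the text."""
    query_tokens = query.lower().split()
    text_tokens = text.lower().split()
    if not query_tokens:
        return 0.0
    hits = sum(
        1
        for q in query_tokens
        if any(SequenceMatcher(None, q, t).ratio() >= threshold for t in text_tokens)
    )
    return hits / len(query_tokens)


def rank(query, query_vec, chunks, alpha=0.5):
    """Return (score, text) pairs, best first. alpha weights the semantic side."""
    scored = [
        (alpha * cosine(query_vec, vec) + (1 - alpha) * fuzzy_keyword_score(query, text), text)
        for text, vec in chunks
    ]
    return sorted(scored, reverse=True)


# Hand-made vectors standing in for real embeddings (assumption).
chunks = [
    ("Troubleshooting upload errors", [0.8, 0.2, 0.0]),
    ("Updating billing details", [0.0, 0.1, 0.9]),
]

# No shared keywords, so the semantic side carries the match.
print(rank("my files won't go up", [0.9, 0.1, 0.0], chunks)[0][1])
# Troubleshooting upload errors

# Misspelled keywords still land through the fuzzy keyword side.
print(fuzzy_keyword_score("uplaod error", "Troubleshooting upload errors"))  # 1.0
```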

Example architecture

Here's how you'd actually wire this up. It's deliberately simple; that's the point.

On account creation, create a Base for the customer and hold onto the Base ID against their account record.

On every support conversation, two things happen. Before the agent replies, you search memory scoped to that Base for relevant context (plan, history, preferences) and run a Deep Search across your help content for anything matching the question. Both sets of results go into the prompt. After the conversation ends, you send the transcript to Memory and Exabase extracts whatever's worth remembering for next time.

Your help content, meaning docs, guides, and resolved-ticket write-ups, lives as Resources in a shared Base that every customer's agent can search, kept separate from the per-customer memory.

So customer data flows into a per-account Base, shared knowledge sits in a common Base, and on each turn the agent reads from both and writes back to the per-account one. That's the whole shape of it.
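The read-from-both, write-to-one loop can be sketched against two small interfaces. The MemoryStore and KnowledgeSearch protocols, the function names, and the fakes below are hypothetical; a real integration would put the actual Exabase client calls behind them, and the model call is stubbed with an echo function.

```python
from typing import Protocol


class MemoryStore(Protocol):
    def search(self, base_id: str, query: str) -> list: ...
    def add(self, base_id: str, transcript: str) -> None: ...


class KnowledgeSearch(Protocol):
    def search(self, base_id: str, query: str) -> list: ...


def build_prompt(question, memories, passages):
    return "\n".join(
        ["What we know about this customer:", *memories,
         "Relevant help content:", *passages,
         "Customer question:", question]
    )


def handle_turn(question, customer_base_id, shared_base_id, memory, knowledge, llm):
    memories = memory.search(customer_base_id, question)    # per-account Base
    passages = knowledge.search(shared_base_id, question)   # shared help-content Base
    return llm(build_prompt(question, memories, passages))


def close_conversation(transcript, customer_base_id, memory):
    # After the conversation ends, hand the transcript to memory for extraction.
    memory.add(customer_base_id, transcript)


class FakeMemory:
    """Stub that returns everything stored for a Base; ignores the query."""

    def __init__(self):
        self.by_base = {}

    def search(self, base_id, query):
        return list(self.by_base.get(base_id, []))

    def add(self, base_id, transcript):
        self.by_base.setdefault(base_id, []).append(transcript)


class FakeKnowledge:
    """Stub that returns every passage regardless of the query."""

    def __init__(self, passages):
        self.passages = passages

    def search(self, base_id, query):
        return list(self.passages)


memory = FakeMemory()
memory.add("base-acme", "Plan: enterprise")
knowledge = FakeKnowledge(["Sync failures are often resolved by a token refresh."])

reply = handle_turn(
    "Why is sync failing?", "base-acme", "base-shared",
    memory, knowledge, llm=lambda prompt: prompt,  # echo stands in for a model call
)
close_conversation("Customer: Why is sync failing? ...", "base-acme", memory)
print(reply)
```

Keeping the two stores behind separate interfaces makes it hard to write customer data into the shared Base by accident, since only the memory store has a write method.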

What compounds over time

A support agent built this way is worth more in month six than in month one, and the reason is structural rather than hopeful.

Every conversation adds to what the agent knows about that customer, and because the memory self-organises, more interactions make it sharper rather than messier. Contradictions get resolved, repeated facts get consolidated, stale ones fade. Reliable memory and context have been shown to cut hallucinations by around 28%, and that effect grows as the memory fills out.

Your shared knowledge base compounds too. Every resolved ticket you write back becomes searchable for the next customer with the same problem, so the agent's hit rate on "we've seen this before" climbs steadily. The DIY version of this, a vector store you maintain, a memory layer you debug, an isolation scheme you hope is watertight, gets harder to run as it grows. Infrastructure that improves with use is the opposite bet, and it's the one worth making.

Who's building this

Teams building support copilots, helpdesk automation, and in-product assistants are the natural fit here, anywhere an agent talks to the same people repeatedly and looks foolish when it forgets them.

If you want to see the memory pattern in a smaller, self-contained form first, the memory-powered personal assistant example walks through the same add-and-retrieve loop a support agent uses, without the multi-tenancy.

Get started

The fastest way in is the getting started guide, then creating memories and memory scoping to see how per-customer isolation works. If you're weighing this against rolling your own, the comparison of agent memory platforms is a reasonable place to start, and there's a free tier to build against.

FAQs

How does the agent know a customer's plan changed?

You send the upgrade through as you would any other fact, and Exabase Memory resolves it against what it already knows. The old plan tier is treated as superseded rather than kept alongside the new one, so the agent reasons from the current state. This is handled by entity resolution, not by you writing reconciliation logic.

Can one customer's memory leak into another's?

No, provided you scope each customer to their own Base. Data in one Base is fully isolated from data in another. Memory operations scoped to a Base ID only ever see that Base's memories.

Do I need to decide what the agent should remember?

Not unless you want to. Send raw conversation text and Exabase extracts facts, preferences, and events automatically. If you do want precise control, you can write specific memories directly instead.

What's the difference between this and just doing RAG over our help docs?

RAG retrieves documents; it doesn't remember the customer. Searching your help content is half of it, and Deep Search covers that. The other half is persistent memory of who this customer is and what's already happened, which RAG, on its own, doesn't give you. The two are complementary, as the RAG vs agent memory piece lays out.

How fast is memory retrieval? Will it slow the agent down?

Simple retrieval runs around 200ms, which is fast enough to sit inside a live agent loop without the customer noticing a pause.

Can the agent search past resolved tickets, not just docs?

Yes. Store resolved tickets as Resources and they're indexed for Deep Search like anything else. A resolved ticket from six months ago becomes the answer to today's identical question.

Is there an audit trail of what the agent knew?

Every memory has creation and modification timestamps, so you can reconstruct what the agent knew at the point it responded, useful for both debugging and accountability.
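A minimal sketch of that reconstruction, assuming each memory is available as a record with its text and the two timestamps. The MemoryRecord type and both helper functions are hypothetical, not part of any documented API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class MemoryRecord:
    text: str
    created_at: datetime
    updated_at: datetime


def known_at(records, when):
    """Memories that already existed at the given moment."""
    return [r for r in records if r.created_at <= when]


def changed_since(records, when):
    """Memories that existed then but were modified afterwards."""
    return [r for r in records if r.created_at <= when < r.updated_at]


def utc(year, month, day):
    return datetime(year, month, day, tzinfo=timezone.utc)


records = [
    MemoryRecord("March sync issue fixed by token refresh", utc(2024, 3, 1), utc(2024, 3, 1)),
    MemoryRecord("Plan: enterprise", utc(2024, 1, 10), utc(2024, 6, 1)),
    MemoryRecord("Prefers email follow-ups", utc(2024, 7, 1), utc(2024, 7, 1)),
]

response_time = utc(2024, 4, 1)
print([r.text for r in known_at(records, response_time)])
print([r.text for r in changed_since(records, response_time)])
```

With only creation and modification timestamps to go on, a memory modified after the response shows its current text, not what it said at the time, so the second helper flags the records that need a closer look.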