Customer onboarding automation
Create a Base per customer, pre-populate it with their documents, and let Workers keep it current, so each customer's onboarding agent has a ready-made knowledge base from day one.

Onboarding a customer well usually means giving them something that already knows their situation: an assistant that has their documents, understands their setup, and can answer their questions from day one. Done by hand, that's a setup project per customer: someone gathers the documents, loads them somewhere, wires up an assistant, and then keeps it current. That works for ten customers and falls apart at a thousand, because manual setup doesn't scale.
This page is for developers building onboarding that scales without that per-customer effort. The pattern is straightforward: create a Base per customer, pre-populate it with their documents through Extract, set up a Worker to keep it current, and point their onboarding agent at it. Each customer gets a ready-made, isolated, self-maintaining knowledge base from the moment they sign up, and creating the thousandth is the same as creating the first.
The problem
Good onboarding is personalised, and personalisation usually means manual work. An onboarding assistant that actually helps needs to know this customer (their documents, their configuration, their context), not just generic product information. Assembling that per customer is a real task: gather their materials, extract the content, store it somewhere the assistant can use, and connect it all up. Multiply by every new customer and it becomes a bottleneck that either limits how fast you can onboard or forces you to drop the personalisation that made onboarding good.
There's an isolation requirement layered on top. Each customer's onboarding environment has to contain only their data, with no chance of one customer's documents showing up in another's onboarding. Building that separation by hand, per customer, is the kind of partitioning work that's easy to get subtly wrong and serious when you do.
And the knowledge base can't be a one-time snapshot. A customer's situation changes during and after onboarding: documents get updated and new ones arrive, and an onboarding assistant working from stale material gives stale answers. Keeping each customer's environment current is yet more recurring work, and it's the part most likely to be skipped, so onboarding knowledge bases tend to be accurate at creation and decreasingly so afterward. Together, per-customer provisioning, isolation, and upkeep are what stand between you and onboarding that scales.
What Exabase unlocks
With provisioning, isolation, and upkeep handled by infrastructure, onboarding a customer becomes a repeatable operation rather than a setup project.
A new customer gets a ready-made environment. On signup, you create their Base and pre-populate it with their documents through extraction, so their onboarding agent has everything it needs from day one without anyone assembling it by hand. The setup that used to be a per-customer task becomes an automated step in your signup flow.
Each environment is isolated by construction. Because every customer has their own Base, their onboarding data is structurally separated from every other customer's, with no partitioning logic for you to build and no risk of cross-contamination. The isolation comes free with the provisioning.
And the knowledge base maintains itself. A Worker keeps each customer's environment current, re-processing updated documents and adding new ones, so the onboarding agent works from the present state rather than a snapshot taken at signup. The environment is ready-made, isolated, and self-maintaining, all three from the start, which is what lets onboarding scale without the manual effort multiplying alongside the customer count.
How it works
Four primitives compose into the onboarding pattern: Bases provision and isolate, Extract pre-populates, Resources store, and Workers maintain.
Bases
A Base is the per-customer environment, created with a single API call and fully isolated. Creating one on signup gives the customer their own sealed space for memory, documents, and search, and because it's one call, provisioning scales without any change in approach from the first customer to the ten-thousandth. This is the multi-tenant SaaS pattern applied to onboarding: the Base is both the customer's environment and the boundary that keeps it separate.
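As a rough illustration of provisioning on signup, the sketch below keeps a customer-to-Base mapping and creates a Base only the first time a customer is seen, so a retried signup does not create a second environment. `FakeBaseClient`, `create_base`, and `Provisioner` are hypothetical names invented for this example; they stand in for whatever call your integration uses to create a Base and are not the actual SDK.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class FakeBaseClient:
    """In-memory stand-in for a Base-creation API (hypothetical, not the real SDK)."""
    _ids: count = field(default_factory=lambda: count(1))
    bases: dict[str, str] = field(default_factory=dict)  # base_id -> display name

    def create_base(self, name: str) -> str:
        base_id = f"base_{next(self._ids)}"
        self.bases[base_id] = name
        return base_id


@dataclass
class Provisioner:
    client: FakeBaseClient
    by_customer: dict[str, str] = field(default_factory=dict)  # customer_id -> base_id

    def base_for(self, customer_id: str) -> str:
        """Return the customer's Base, creating it on first use (idempotent)."""
        if customer_id not in self.by_customer:
            name = f"onboarding-{customer_id}"
            self.by_customer[customer_id] = self.client.create_base(name)
        return self.by_customer[customer_id]
```

Calling `base_for` from the signup handler means every later step can look up the customer's Base by customer ID, and each customer resolves to exactly one Base.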
Extract
Extract pre-populates the environment. The customer's documents, whatever format they arrive in, are run through extraction into clean, searchable text, so the Base is populated with their actual content rather than starting empty. This is what makes the knowledge base ready from day one instead of something the customer has to fill themselves. The general case is document extraction at scale.
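A minimal sketch of the pre-population loop, assuming extraction takes raw bytes and returns text. `extract_text` is a hypothetical stand-in that only decodes UTF-8 and tidies whitespace; it is not the Extract API, which handles real document formats. What the sketch shows is the shape of the loop: every document is attempted, and failures are recorded rather than aborting the customer's provisioning.

```python
from dataclasses import dataclass


@dataclass
class Extracted:
    name: str
    text: str


def extract_text(raw: bytes) -> str:
    """Hypothetical stand-in for Extract: decode UTF-8 and normalise whitespace."""
    return " ".join(raw.decode("utf-8").split())


def prepopulate(documents: dict[str, bytes]) -> tuple[list[Extracted], list[str]]:
    """Run every document through extraction; return (results, names that failed)."""
    results: list[Extracted] = []
    failed: list[str] = []
    for name, raw in documents.items():
        try:
            results.append(Extracted(name, extract_text(raw)))
        except UnicodeDecodeError:
            # Keep going so one unreadable file doesn't block the whole signup.
            failed.append(name)
    return results, failed
```

Returning the failed names separately is a design choice for this example: the Base can be populated with whatever succeeded, and the failures can be retried or surfaced later.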
Resources
Resources are the customer's knowledge base inside their Base: the extracted documents, stored and indexed for Deep Search so the onboarding agent can find what it needs by meaning. This is the store the agent draws on to answer the customer's questions from their own materials.
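To make the scoping concrete, here is a toy in-memory store in which every read and write names a Base, so a query against one customer's Base cannot return another customer's documents. `ResourceStore` and its methods are hypothetical, and the word-overlap ranking is a deliberately crude placeholder: Deep Search matches by meaning, which this sketch does not attempt.

```python
from collections import defaultdict


class ResourceStore:
    """In-memory stand-in: Resources are keyed by Base, so search never crosses Bases."""

    def __init__(self) -> None:
        self._by_base: dict[str, dict[str, str]] = defaultdict(dict)

    def put(self, base_id: str, name: str, text: str) -> None:
        self._by_base[base_id][name] = text

    def search(self, base_id: str, query: str, limit: int = 3) -> list[str]:
        """Rank one Base's Resources by shared words (toy scoring, not semantic search)."""
        terms = set(query.lower().split())
        scored = [
            (len(terms & set(text.lower().split())), name)
            for name, text in self._by_base.get(base_id, {}).items()
        ]
        # Drop non-matches, then order by score (descending) and name (for stability).
        ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: (-s[0], s[1]))
        return [name for _, name in ranked[:limit]]
```

The agent-facing call takes a `base_id` on every query, which is the property the sketch is meant to show: there is no code path that searches across customers.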
Workers
Workers keep each customer's environment current. A Worker re-processes updated documents and adds new ones on a schedule, so the onboarding knowledge base stays accurate as the customer's situation evolves rather than freezing at signup. This is the same autonomous-upkeep engine behind self-maintaining knowledge bases, applied per customer.
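How a Worker decides what has changed is not specified here, so the following is only one common approach: fingerprint each document with a content hash and compare against the fingerprints recorded on the previous run. `plan_sync` and `fingerprint` are hypothetical helper names; the scheduling itself is left to the Worker.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Stable content hash used to detect changed documents."""
    return hashlib.sha256(content).hexdigest()


def plan_sync(seen: dict[str, str], current: dict[str, bytes]) -> dict[str, list[str]]:
    """Compare stored fingerprints with the current documents and decide what to do.

    `seen` maps document name -> fingerprint from the last run and is updated in place.
    """
    plan: dict[str, list[str]] = {"add": [], "reprocess": [], "unchanged": []}
    for name, content in sorted(current.items()):
        digest = fingerprint(content)
        if name not in seen:
            plan["add"].append(name)
        elif seen[name] != digest:
            plan["reprocess"].append(name)
        else:
            plan["unchanged"].append(name)
        seen[name] = digest  # remember this version for the next run
    return plan
```

On each scheduled run, only the `add` and `reprocess` entries need to go back through extraction, which keeps upkeep proportional to what actually changed.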
Example architecture
The pattern is a provisioning flow that runs on signup and a Worker that runs thereafter.
On signup, provision. Create a Base for the customer, run their documents through Extract, and store the results as Resources in their Base, indexed for Deep Search. The environment is ready before the customer's first interaction.
Set up maintenance. Create a Worker for the Base to re-process changed documents and add new ones on a schedule, so the knowledge base stays current on its own.
Point the agent at it. The onboarding agent runs Deep Search over the customer's Resources to answer from their materials, and can use Memory scoped to the Base to remember the customer's onboarding progress.
Repeat per customer, automatically. Because each step is an API call, the whole provisioning runs as an automated flow on signup rather than a manual setup, so onboarding the next customer takes no more effort than the last.
Each customer is provisioned a ready, isolated, self-maintaining Base on signup, and the onboarding agent works from it immediately. The per-customer setup becomes an automated step.
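The flow above can be sketched as a single signup handler. Each step is passed in as a callable so the ordering is visible without committing to any particular client library; `OnboardingSteps`, `on_signup`, and the four hook names are assumptions made for this example, and in a real integration each hook would wrap the corresponding API call.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class OnboardingSteps:
    """Hypothetical hooks; each would wrap one API call in a real integration."""
    create_base: Callable[[str], str]                # customer_id -> base_id
    extract: Callable[[bytes], str]                  # raw document -> text
    store_resource: Callable[[str, str, str], None]  # base_id, name, text
    create_worker: Callable[[str], str]              # base_id -> worker_id


def on_signup(
    steps: OnboardingSteps, customer_id: str, documents: dict[str, bytes]
) -> dict[str, str]:
    """Provision, pre-populate, then schedule upkeep, in that order."""
    base_id = steps.create_base(customer_id)
    for name, raw in documents.items():
        steps.store_resource(base_id, name, steps.extract(raw))
    worker_id = steps.create_worker(base_id)
    # The agent only needs the base_id to scope its searches and memory.
    return {"customer_id": customer_id, "base_id": base_id, "worker_id": worker_id}
```

Because the handler has no per-customer branches, running it for the next signup is the same call with different arguments, which is the property the pattern relies on.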
What compounds over time
The value here is in how the cost behaves as you grow, which is the inverse of manual onboarding.
Manual per-customer setup gets more expensive in total as you add customers, because each one is its own effort, so onboarding becomes either a growing operational burden or a bottleneck on growth. Automated provisioning flips that: the effort to onboard a customer is the same whether they're your tenth or ten-thousandth, so the cost per customer stays flat and the total doesn't balloon. The infrastructure absorbs the scale that manual setup couldn't.
Each customer's environment also compounds in the ordinary way once created. Their knowledge base stays current as Workers maintain it, and if the onboarding agent uses memory, it gets better at helping that customer as it learns about them, because the memory self-organises. So you get flat provisioning cost across customers and compounding value within each one, which is the combination that makes personalised onboarding viable at scale rather than a luxury you ration.
Who's building this
This pattern suits teams building onboarding for any product where customers benefit from an assistant that knows their specific situation: B2B SaaS, professional services platforms, or anything with document-heavy or configuration-heavy onboarding. It fits developers who are automating what is currently a manual, per-customer setup process.
This pattern composes several others: it's multi-tenant memory for SaaS for the isolation, document extraction at scale for the pre-population, and self-maintaining knowledge bases for the upkeep, combined into a provisioning flow. If the onboarding agent needs to remember customers over time, long-term memory for any agent covers that, and many onboarding assistants are essentially a RAG pipeline over the customer's own documents.
Get started
Start with the getting started guide, then move on to creating Bases for provisioning, submitting jobs for pre-population, and creating Workers for upkeep. There's a free tier to build against.
FAQs
How does a new customer get a ready knowledge base on signup?
Your signup flow creates a Base for the customer, runs their documents through Extract, and stores the results as searchable Resources in that Base. Because each step is an API call, this runs automatically as part of onboarding rather than as manual setup.
How is one customer's onboarding data kept separate from another's?
Each customer has their own Base, which is fully isolated. Their documents and the agent's operations are scoped to that Base and can't reach another customer's, so isolation comes with the provisioning rather than being separate work. It's the multi-tenant SaaS pattern.
Does the knowledge base stay current after onboarding, or is it a snapshot?
It stays current. A Worker re-processes updated documents and adds new ones on a schedule, so the customer's environment reflects their present situation rather than freezing at signup. This is the self-maintaining knowledge bases pattern applied per customer.
Does this scale to a large number of customers?
Yes. Creating a Base and provisioning its contents is the same set of API calls for every customer, so the effort to onboard the ten-thousandth is the same as the first. There's no per-customer manual setup that grows with your customer count.
Can the onboarding agent remember each customer's progress?
Yes. Scope Memory to the customer's Base and the agent can remember where they are in onboarding, what they've asked, and what they still need, alongside searching their documents. The long-term memory use case covers this.
How is this different from just building a knowledge base per customer?
It is a knowledge base per customer, but provisioned, isolated, and maintained automatically rather than by hand. The point is that the per-customer setup, isolation, and upkeep are handled by infrastructure, so it scales without the manual effort that building each one by hand would require.







