
AI infrastructure for publishing

Submit manuscripts and submissions through one API, get text, metadata, and structure out automatically, and let Workers process new ones on a schedule.

Publishing runs on two kinds of memory. The first is the catalogue: every manuscript, title, and submission the house has handled. The second is editorial knowledge: the style guides, the past decisions, and the preferences of particular authors. Both tend to be hard to access. The catalogue is a pile of documents no one can search by meaning, and the editorial knowledge lives in editors' heads and scattered notes. If you're building tools for publishers, the infrastructure underneath has to handle both: extraction and search across the catalogue, and memory that holds editorial context.

This page is for teams building editorial and catalogue tools. Exabase gives you manuscript and submission processing through Extract, meaning-based search across the library through Deep Search, and editorial memory that holds style guides, past decisions, and author preferences. The catalogue becomes searchable and the editorial knowledge becomes something the house keeps rather than something that walks out with an editor.

What you can build

Publishing tools tend to take one of a few shapes, each built on infrastructure that already exists.

An editorial agent that remembers a house's style guide, past editorial decisions, and an author's preferences across a relationship, so it applies consistent judgement rather than starting fresh with each manuscript. This is long-term memory holding editorial context.

A manuscript and submission processor that turns the flood of incoming documents into clean, searchable text at scale. This is the document extraction at scale pattern.

A catalogue search tool that lets editors search an entire back catalogue by meaning, finding titles, passages, and themes regardless of phrasing. This is Deep Search over the library.

A decision-aware editorial tool that captures why editorial calls were made, not just what was decided, so the reasoning is queryable later. This is the decision trace infrastructure pattern.
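To make the decision-aware shape concrete, here is a minimal sketch of what a decision trace record might hold. It is an illustration only, not the Exabase API: `DecisionTrace`, `find_by_tag`, and the sample titles are hypothetical names, and the exact-match tag lookup stands in for the meaning-based retrieval a real system would use.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DecisionTrace:
    """Hypothetical record: stores the reasoning alongside the decision."""
    manuscript: str
    decision: str   # what was decided
    rationale: str  # why it was decided
    tags: list[str] = field(default_factory=list)


def find_by_tag(traces: list[DecisionTrace], tag: str) -> list[DecisionTrace]:
    """Return traces carrying the tag (exact match, a stand-in for search by meaning)."""
    return [t for t in traces if tag in t.tags]


traces = [
    DecisionTrace("Title A", "cut the prologue",
                  "it delayed the inciting event", ["structure", "pacing"]),
    DecisionTrace("Title B", "keep dialect spelling",
                  "the author's voice depends on it", ["voice"]),
]

for trace in find_by_tag(traces, "pacing"):
    print(f"{trace.manuscript}: {trace.decision} because {trace.rationale}")
```

The point of the shape is that the rationale is a first-class field, so a later query returns the reasoning and not only the outcome.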

Publishing problems, solved

The problems publishing builders run into are specific, and each has an answer.

Editorial knowledge that lives in people. Style guides, house conventions, and the accumulated judgement of past decisions tend to be tacit. Memory holds them as house knowledge, and keeps them current through contradiction resolution, so when a style decision changes the agent applies the current one rather than the superseded version. The editorial knowledge stops depending on which editor is on the manuscript.
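The idea behind contradiction resolution can be sketched in a few lines. This is a toy model under stated assumptions, not how Exabase implements it: `StyleDecision` and `HouseStyleMemory` are hypothetical names, and "the most recent ruling wins" is an assumed resolution rule chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class StyleDecision:
    topic: str        # e.g. "serial comma"
    ruling: str       # the convention the house settled on
    decided_on: date  # when the ruling was made


class HouseStyleMemory:
    """Toy in-memory store: keeps the full history, serves the latest ruling per topic."""

    def __init__(self) -> None:
        self._history: dict[str, list[StyleDecision]] = {}

    def record(self, decision: StyleDecision) -> None:
        self._history.setdefault(decision.topic, []).append(decision)

    def current(self, topic: str) -> StyleDecision | None:
        decisions = self._history.get(topic, [])
        # Assumed rule: the most recent ruling supersedes earlier ones on the same topic.
        return max(decisions, key=lambda d: d.decided_on, default=None)

    def superseded(self, topic: str) -> list[StyleDecision]:
        latest = self.current(topic)
        return [d for d in self._history.get(topic, []) if d is not latest]


memory = HouseStyleMemory()
memory.record(StyleDecision("serial comma", "omit", date(2019, 3, 1)))
memory.record(StyleDecision("serial comma", "always use", date(2024, 6, 15)))

print(memory.current("serial comma").ruling)
```

The older ruling is kept in history rather than deleted, so the agent applies the current convention while the house can still see what it replaced.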

Author preferences across a relationship. A house works with an author over many titles. Memory holds an author's preferences and the history of working with them, so an editorial agent picks up that context rather than rebuilding it each project.

Manuscripts and submissions at volume. Incoming documents arrive constantly, in every format. Extract turns manuscripts, submissions, and back-catalogue titles into clean, searchable text through one API, including scans, at the volume a publisher handles.

A catalogue nobody can search by meaning. Deep Search finds by meaning across the whole library, so an editor searching for a theme or a passage finds it regardless of exact wording, and search holds quality across a large catalogue where naive search collapses.

The infrastructure underneath

Four primitives carry most publishing tools. Extract turns manuscripts, submissions, and catalogue titles into searchable text at scale. Deep Search searches the library by meaning. Memory holds style guides, editorial decisions, and author preferences. Resources store the catalogue as searchable content. A fifth, Bases, isolates data per imprint or per client if you're building this as a product for multiple houses. All of it sits behind one API key, rather than you assembling extraction, search, and memory yourself.
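Per-imprint isolation is the easiest of these to picture in code. The sketch below is a conceptual illustration, not the Exabase API: `IsolatedStore` and the base names are hypothetical, and substring matching stands in for real search. What it shows is that every read and write is scoped to one base, so one imprint's content never surfaces in another's results.

```python
from collections import defaultdict


class IsolatedStore:
    """Toy store: every document belongs to exactly one base (e.g. one imprint)."""

    def __init__(self) -> None:
        self._bases: defaultdict[str, list[str]] = defaultdict(list)

    def add(self, base: str, document: str) -> None:
        self._bases[base].append(document)

    def search(self, base: str, term: str) -> list[str]:
        # Only the named base is consulted; other bases are never read.
        return [d for d in self._bases.get(base, []) if term.lower() in d.lower()]


store = IsolatedStore()
store.add("imprint-a", "A novel set along the river")
store.add("imprint-b", "A cookbook of mountain recipes")

print(store.search("imprint-a", "river"))
print(store.search("imprint-b", "river"))
```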

A catalogue and an editorial memory that compound

A publishing tool on this foundation builds value in both directions. The catalogue grows into a more complete searchable library with every title processed, paid for once at extraction and useful indefinitely, and because Deep Search holds quality at scale, a large back catalogue stays as searchable as a small one. At the same time the editorial memory deepens: every decision and author interaction adds to the house's accumulated judgement, and because the memory self-organises, it stays coherent and current rather than becoming a contradictory pile. The editorial knowledge that used to leave with departing editors becomes a durable asset, and the catalogue that used to be an inert archive becomes queryable, both improving as the house grows.

Get started

Start with the getting started guide, then the use-case pages that match what you're building: document extraction at scale for manuscripts, long-term memory for any agent for editorial context, and decision trace infrastructure for capturing editorial reasoning. There's a free tier to build against.

FAQs

Can an editorial agent remember our style guide and past decisions?

Yes. Memory holds style guides, house conventions, and past editorial decisions as house knowledge, and keeps them current through contradiction resolution, so the agent applies the current convention rather than a superseded one.

Does it remember an author's preferences across titles?

Yes. Memory holds an author's preferences and the history of working with them, so an editorial agent carries that context across projects rather than rebuilding it each time.

Can it process manuscripts and submissions at scale?

Yes. Extract turns manuscripts, submissions, and catalogue titles into clean, searchable text through one API, handling scans as well as native formats, at publishing volume.

Can editors search the whole back catalogue by meaning?

Yes. Deep Search finds by meaning across the library, so a search for a theme or passage surfaces it regardless of exact wording, and it holds quality across a large catalogue.

Can it capture why editorial decisions were made?

Yes. Decision trace infrastructure captures the reasoning behind decisions as searchable memory, so the house's editorial reasoning is queryable later rather than lost.

Is this a finished publishing product or something I build on?

Something you build on. Exabase is the infrastructure (extraction, catalogue search, and editorial memory), and you build the editorial or catalogue tool on top.