Retrieving extraction results
Getting a single job
Endpoint: GET /v2/extract/{jobId}
Returns the current state of an extraction job. If the job is still processing, call again until state reaches completed, or use a webhook to avoid polling entirely.
import { Exabase } from "@exabase/sdk";
const api = new Exabase({
  apiKey: process.env.EXABASE_API_KEY,
});
const job = await api.extract.get({ jobId: "job-id" });
if (job.state === "completed") {
  console.log(job.extraction?.common?.mimeType);
  console.log(job.extraction?.common?.chunkCount);
  console.log(job.extraction?.document?.pages);
}
curl https://api.exabase.io/v2/extract/<jobId> \
-H 'X-Api-Key: <EXABASE_API_KEY>'
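The polling described above can be sketched as a small helper. This is only an illustration: `waitForCompletion`, `JobLike`, `intervalMs`, and `maxAttempts` are hypothetical names that are not part of the SDK, and the only state value taken from this page is `completed`. A real implementation would probably also stop early on whatever failure state the API reports, which this section does not name.

```typescript
// Minimal shape of the fields this sketch relies on; the real job object has more.
interface JobLike {
  id: string;
  state: string;
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Calls fetchJob repeatedly until the job's state is "completed" or the
// attempt budget is exhausted. fetchJob is injected so the helper can wrap
// the SDK call or a plain HTTP request.
async function waitForCompletion<T extends JobLike>(
  fetchJob: () => Promise<T>,
  options: { intervalMs?: number; maxAttempts?: number } = {},
): Promise<T> {
  const { intervalMs = 2000, maxAttempts = 30 } = options;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const job = await fetchJob();
    if (job.state === "completed") return job;
    // Wait before the next attempt, but not after the last one.
    if (attempt < maxAttempts) await sleep(intervalMs);
  }
  throw new Error(`Job did not complete after ${maxAttempts} attempts`);
}

// Possible usage with the client from the example above:
// const job = await waitForCompletion(() => api.extract.get({ jobId: "job-id" }));
```

Capping the number of attempts keeps a stuck job from blocking the caller indefinitely. For long-running extractions, the webhook route avoids the loop altogether.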
Example response for a completed PDF job:
{
  "id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "workspaceId": "...",
  "userId": "...",
  "kind": "document",
  "name": "Q1 Report",
  "url": null,
  "state": "completed",
  "createdAt": "2025-06-01T10:00:00.000Z",
  "extraction": {
    "common": {
      "mimeType": "application/pdf",
      "size": 204800,
      "thumbnail": "https://cdn.exabase.io/...",
      "chunkCount": 42
    },
    "document": {
      "pages": 18,
      "author": "Jane Smith",
      "title": "Q1 Report",
      "creationDate": "2025-03-15T08:00:00.000Z",
      "pdfRender": { "url": "https://cdn.exabase.io/..." }
    }
  },
  "links": {
    "chunks": "https://api.exabase.io/v2/extract/<jobId>/chunks?start=1&end=20",
    "download": "https://api.exabase.io/v2/extract/<jobId>/download"
  }
}
Listing jobs
Endpoint: GET /v2/extract
Returns jobs in reverse-chronological order. Use nextCursor to page through results. You can filter by state or kind.
let cursor: string | null = null;
do {
  const page = await api.extract.list({
    limit: 50,
    ...(cursor && { cursor }),
    state: "completed",
  });
  for (const job of page.items) {
    console.log(job.id, job.name, job.state);
  }
  cursor = page.nextCursor;
} while (cursor);
curl 'https://api.exabase.io/v2/extract?limit=50&state=completed' \
-H 'X-Api-Key: <EXABASE_API_KEY>'
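If several call sites need to walk the full job list, the cursor loop can be wrapped in an async generator. The sketch below is one way to do that, not an SDK feature: `iterateAll` and `Page` are hypothetical names, and it assumes, as the loop above suggests, that `nextCursor` is null (or otherwise falsy) on the last page.

```typescript
// Shape of one page of results, matching the fields used in the loop above.
interface Page<T> {
  items: T[];
  nextCursor: string | null;
}

// Yields every item across all pages. listPage receives the cursor for the
// page to fetch (undefined for the first page) and returns that page.
async function* iterateAll<T>(
  listPage: (cursor?: string) => Promise<Page<T>>,
): AsyncGenerator<T> {
  let cursor: string | undefined;
  do {
    const page: Page<T> = await listPage(cursor);
    yield* page.items;
    // Assumption: a null nextCursor marks the final page.
    cursor = page.nextCursor ?? undefined;
  } while (cursor);
}

// Possible usage with the client from the examples above:
// for await (const job of iterateAll((cursor) =>
//   api.extract.list({ limit: 50, state: "completed", ...(cursor && { cursor }) }),
// )) {
//   console.log(job.id, job.name, job.state);
// }
```

Because the generator fetches pages lazily, a caller that stops iterating early never requests the remaining pages.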
Reading text chunks
When extraction.common.chunkCount is set, the text content has been split into searchable chunks that you can retrieve in pages.
Endpoint: GET /v2/extract/{jobId}/chunks
const result = await api.extract.getChunks({
  jobId: "job-id",
  start: 1,
  end: 20,
});
for (const chunk of result.items) {
  console.log(chunk.sequence, chunk.text);
  // chunk.pageNumber is set for document chunks
  // chunk.timeStart and chunk.timeEnd are set for audio/video chunks
}
curl 'https://api.exabase.io/v2/extract/<jobId>/chunks?start=1&end=20' \
-H 'X-Api-Key: <EXABASE_API_KEY>'
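To read every chunk of a job, the start and end values for each request can be derived from chunkCount. The helper below is a sketch under one assumption that this page does not state outright: that start and end are 1-based and inclusive, as the `start=1&end=20` link in the example response suggests. `chunkRanges` and `ChunkRange` are hypothetical names.

```typescript
interface ChunkRange {
  start: number;
  end: number;
}

// Splits 1..chunkCount into consecutive inclusive ranges of at most
// pageSize chunks each. Assumes 1-based, inclusive start/end values.
function chunkRanges(chunkCount: number, pageSize = 20): ChunkRange[] {
  if (!Number.isInteger(pageSize) || pageSize < 1) {
    throw new RangeError("pageSize must be a positive integer");
  }
  const ranges: ChunkRange[] = [];
  for (let start = 1; start <= chunkCount; start += pageSize) {
    ranges.push({ start, end: Math.min(start + pageSize - 1, chunkCount) });
  }
  return ranges;
}

// For the example job with 42 chunks this produces three ranges:
// 1-20, 21-40, and 41-42.
console.log(chunkRanges(42));

// Possible usage with the client from the examples above:
// for (const range of chunkRanges(job.extraction?.common?.chunkCount ?? 0)) {
//   const result = await api.extract.getChunks({ jobId: job.id, ...range });
// }
```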
Downloading attachments
Download all files associated with a job (original file, thumbnail, screenshot, transcript, PDF render, etc.) as a single ZIP archive. Available files depend on the extraction type (see the extraction data table on the About page).
Endpoint: GET /v2/extract/{jobId}/download
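import { createWriteStream } from "node:fs";
import { Readable } from "node:stream";
import { pipeline } from "node:stream/promises";
const jobId = "job-id";
const response = await fetch(
  `https://api.exabase.io/v2/extract/${jobId}/download`,
  { headers: { "X-Api-Key": process.env.EXABASE_API_KEY! } },
);
if (!response.ok || !response.body) {
  throw new Error(`Download failed with HTTP ${response.status}`);
}
// pipeline() resolves once the archive is fully written and rejects on stream errors.
await pipeline(Readable.fromWeb(response.body), createWriteStream("attachments.zip"));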
curl https://api.exabase.io/v2/extract/<jobId>/download \
-H 'X-Api-Key: <EXABASE_API_KEY>' \
--output attachments.zip
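Because the curl command writes whatever the server returns to attachments.zip, an error body can end up saved under that name. A quick sanity check is to confirm that the file starts with a ZIP signature before unpacking it. The sketch below relies only on the ZIP file format itself; `looksLikeZip` is a hypothetical helper, not part of the SDK.

```typescript
// ZIP signatures: "PK\x03\x04" starts a regular archive,
// "PK\x05\x06" starts an empty one.
const ZIP_SIGNATURES: number[][] = [
  [0x50, 0x4b, 0x03, 0x04],
  [0x50, 0x4b, 0x05, 0x06],
];

// Returns true when the first four bytes match a known ZIP signature.
// Inputs shorter than four bytes never match.
function looksLikeZip(bytes: Uint8Array): boolean {
  return ZIP_SIGNATURES.some((signature) =>
    signature.every((byte, i) => bytes[i] === byte),
  );
}

// Possible usage in Node.js:
// import { readFile } from "node:fs/promises";
// const ok = looksLikeZip(await readFile("attachments.zip"));
```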