Retrieving extraction results

Getting a single job

Endpoint: GET /v2/extract/{jobId}

Returns the current state of an extraction job. If the job is still processing, poll this endpoint until state reaches completed, or use a webhook to avoid polling entirely.

import { Exabase } from "@exabase/sdk";

const api = new Exabase({
  apiKey: process.env.EXABASE_API_KEY,
});

const job = await api.extract.get({ jobId: "job-id" });

if (job.state === "completed") {
  console.log(job.extraction?.common?.mimeType);
  console.log(job.extraction?.common?.chunkCount);
  console.log(job.extraction?.document?.pages);
}

curl https://api.exabase.io/v2/extract/<jobId> \
  -H 'X-Api-Key: <EXABASE_API_KEY>'

Example response for a completed PDF job:

{
  "id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "workspaceId": "...",
  "userId": "...",
  "kind": "document",
  "name": "Q1 Report",
  "url": null,
  "state": "completed",
  "createdAt": "2025-06-01T10:00:00.000Z",
  "extraction": {
    "common": {
      "mimeType": "application/pdf",
      "size": 204800,
      "thumbnail": "https://cdn.exabase.io/...",
      "chunkCount": 42
    },
    "document": {
      "pages": 18,
      "author": "Jane Smith",
      "title": "Q1 Report",
      "creationDate": "2025-03-15T08:00:00.000Z",
      "pdfRender": { "url": "https://cdn.exabase.io/..." }
    }
  },
  "links": {
    "chunks": "https://api.exabase.io/v2/extract/<jobId>/chunks?start=1&end=20",
    "download": "https://api.exabase.io/v2/extract/<jobId>/download"
  }
}
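
The polling approach described above can be sketched as a small helper. This is a minimal sketch, not part of the Exabase SDK: pollUntilCompleted, intervalMs, and maxAttempts are hypothetical names, and the only state it checks for is completed, since that is the only state value documented here. If the API also reports a terminal failure state, you would want to stop early on it as well.

```typescript
// Hypothetical helper, not part of the Exabase SDK.
// It only assumes that the job object has a `state` field.
type JobLike = { state: string };

async function pollUntilCompleted<T extends JobLike>(
  fetchJob: () => Promise<T>,
  { intervalMs = 2000, maxAttempts = 30 }: { intervalMs?: number; maxAttempts?: number } = {},
): Promise<T> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const job = await fetchJob();
    if (job.state === "completed") {
      return job;
    }
    // Wait before the next request, but not after the final attempt.
    if (attempt < maxAttempts) {
      await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Job did not complete after ${maxAttempts} attempts`);
}

// Usage with the client from the example above:
//   const job = await pollUntilCompleted(() => api.extract.get({ jobId: "job-id" }));
```

A fixed interval with an attempt cap keeps the sketch simple. For long-running jobs, a webhook avoids the repeated requests altogether.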

Listing jobs

Endpoint: GET /v2/extract

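Returns jobs in reverse-chronological order. To page through results, pass the nextCursor value from each response as cursor in the next request; when nextCursor is null, there are no more pages. You can filter by state or kind.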

let cursor: string | null = null;

do {
  const page = await api.extract.list({
    limit: 50,
    ...(cursor && { cursor }),
    state: "completed",
  });

  for (const job of page.items) {
    console.log(job.id, job.name, job.state);
  }

  cursor = page.nextCursor;
} while (cursor);

curl 'https://api.exabase.io/v2/extract?limit=50&state=completed' \
  -H 'X-Api-Key: <EXABASE_API_KEY>'
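
If you page through jobs in several places, the cursor loop can be wrapped in an async generator. The sketch below is illustrative only: paginate and the Page type are hypothetical and not part of the SDK. It assumes each page has the items and nextCursor fields used in the example above, and that a null nextCursor marks the last page.

```typescript
// Hypothetical helper, not part of the Exabase SDK.
// Assumes each page exposes `items` and `nextCursor`, as in the listing example.
type Page<T> = { items: T[]; nextCursor: string | null };

async function* paginate<T>(
  fetchPage: (cursor: string | null) => Promise<Page<T>>,
): AsyncGenerator<T> {
  let cursor: string | null = null;
  do {
    const page: Page<T> = await fetchPage(cursor);
    // Yield the items of this page one by one.
    yield* page.items;
    cursor = page.nextCursor;
  } while (cursor);
}

// Usage with the client from the example above
// (the `kind` filter name is assumed from the prose, not verified):
//   for await (const job of paginate((cursor) =>
//     api.extract.list({ limit: 50, ...(cursor && { cursor }), kind: "document" }),
//   )) {
//     console.log(job.id, job.name, job.state);
//   }
```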

Reading text chunks

When extraction.common.chunkCount is set, the text content has been split into searchable chunks that you can retrieve in pages.

Endpoint: GET /v2/extract/{jobId}/chunks

const result = await api.extract.getChunks({
  jobId: "job-id",
  start: 1,
  end: 20,
});

for (const chunk of result.items) {
  console.log(chunk.sequence, chunk.text);
  // chunk.pageNumber — set for document chunks
  // chunk.timeStart / chunk.timeEnd — set for audio/video chunks
}

curl 'https://api.exabase.io/v2/extract/<jobId>/chunks?start=1&end=20' \
  -H 'X-Api-Key: <EXABASE_API_KEY>'
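
To read every chunk of a job, you can derive the start and end values for each request from chunkCount. The helper below is a sketch under an assumption that is not confirmed on this page: start and end are treated as 1-based and inclusive, which is what the start=1&end=20 example suggests. chunkRanges and pageSize are hypothetical names.

```typescript
// Hypothetical helper, not part of the Exabase SDK.
// Assumes `start` and `end` are 1-based and inclusive.
function chunkRanges(
  chunkCount: number,
  pageSize = 20,
): Array<{ start: number; end: number }> {
  const ranges: Array<{ start: number; end: number }> = [];
  for (let start = 1; start <= chunkCount; start += pageSize) {
    // The last range is clamped so it never goes past chunkCount.
    ranges.push({ start, end: Math.min(start + pageSize - 1, chunkCount) });
  }
  return ranges;
}

// For the example job above (chunkCount: 42) this yields
// { start: 1, end: 20 }, { start: 21, end: 40 }, { start: 41, end: 42 }.
```

Each range can then be passed to api.extract.getChunks together with the jobId.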

Downloading attachments

Download all files associated with a job (original file, thumbnail, screenshot, transcript, PDF render, etc.) as a single ZIP archive. Available files depend on the extraction type (see the extraction data table on the About page).

Endpoint: GET /v2/extract/{jobId}/download

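import { createWriteStream } from "fs";
import { Readable } from "stream";
import { pipeline } from "stream/promises";

const jobId = "job-id";
const response = await fetch(
  `https://api.exabase.io/v2/extract/${jobId}/download`,
  { headers: { "X-Api-Key": process.env.EXABASE_API_KEY! } },
);
if (!response.ok || !response.body) {
  throw new Error(`Download failed with HTTP ${response.status}`);
}

// Stream the ZIP to disk and wait until the write has finished.
await pipeline(Readable.fromWeb(response.body), createWriteStream("attachments.zip"));

curl https://api.exabase.io/v2/extract/<jobId>/download \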
  -H 'X-Api-Key: <EXABASE_API_KEY>' \
  --output attachments.zip