Submitting extraction jobs

You can submit either a file or a URL. Either way, the call returns an ExtractJob object immediately, with state set to "pending".

Submitting a file

Using the SDK

The simplest approach is the SDK, which streams the file directly instead of buffering it in memory.

Endpoint: POST /v2/extract

import { Exabase } from "@exabase/sdk";
import { createReadStream } from "fs";

const api = new Exabase({
  apiKey: process.env.EXABASE_API_KEY,
});

const job = await api.extract.createFromFile({
  file: createReadStream("./report.pdf"),
  name: "Q1 Report",
});

console.log(job.id);    // extraction job id
console.log(job.state); // "pending"

curl https://api.exabase.io/v2/extract \
  -X POST \
  -H 'X-Api-Key: <EXABASE_API_KEY>' \
  -F 'file=@report.pdf' \
  -F 'name=Q1 Report'

Using the upload API directly

If you need more control (for example, to track upload progress or integrate with your own storage pipeline), you can upload the file yourself using a pre-signed URL and then register the job.

Step 1: Get an upload URL

Endpoint: GET /v2/extract/upload

const upload = await api.extract.getUploadUrl({
  filename: "report.pdf",
});
// upload.url         — PUT target
// upload.headers     — headers to include in the PUT request
// upload.storagePath — pass this to step 3

curl 'https://api.exabase.io/v2/extract/upload?filename=report.pdf' \
  -H 'X-Api-Key: <EXABASE_API_KEY>'
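If you call this endpoint from TypeScript without the SDK, the filename has to be encoded into the query string. The sketch below builds the step 1 request from the endpoint and the X-Api-Key header shown above. `buildUploadUrlRequest` and `UploadUrlRequest` are hypothetical names for illustration, not part of the SDK.

```typescript
// Hypothetical helper (not part of the SDK): builds the step 1 request by hand.
const EXABASE_BASE_URL = "https://api.exabase.io";

interface UploadUrlRequest {
  url: string;
  headers: Record<string, string>;
}

function buildUploadUrlRequest(apiKey: string, filename: string): UploadUrlRequest {
  const url = new URL("/v2/extract/upload", EXABASE_BASE_URL);
  // searchParams.set encodes the value, so spaces and "&" in filenames are safe.
  url.searchParams.set("filename", filename);
  return { url: url.toString(), headers: { "X-Api-Key": apiKey } };
}

// Usage: pass the result to any HTTP client, for example
//   const { url, headers } = buildUploadUrlRequest(apiKey, "Q1 report & notes.pdf");
//   const response = await fetch(url, { headers });
```

Building the URL with `URL` and `searchParams` avoids hand-written escaping, which is where filenames with spaces or reserved characters usually go wrong.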

Step 2: Upload the file

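import { readFileSync } from "fs";

// PUT the raw bytes to the pre-signed URL, sending the headers returned in step 1.
// readFileSync loads the whole file into memory, which is fine for small files.
const res = await fetch(upload.url, {
  method: "PUT",
  headers: upload.headers,
  body: readFileSync("./report.pdf"),
});
if (!res.ok) {
  throw new Error(`Upload failed with status ${res.status}`);
}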

# Include the headers returned in step 1 (Content-Type is shown here as an example)
curl '<upload.url>' \
  --request PUT \
  --header 'Content-Type: application/pdf' \
  --data-binary @report.pdf
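One reason to upload the file yourself is progress tracking. A minimal sketch, assuming a Node.js environment: a pass-through stream that counts bytes as they are read toward the PUT request. `progressStream` and `ProgressCallback` are hypothetical names. How you attach the stream to the request depends on your HTTP client, and some storage providers require a known Content-Length for pre-signed uploads, so treat this as a starting point only.

```typescript
import { Transform } from "node:stream";

type ProgressCallback = (sentBytes: number, totalBytes: number) => void;

// Hypothetical helper: forwards every chunk unchanged and reports running totals.
function progressStream(totalBytes: number, onProgress: ProgressCallback): Transform {
  let sentBytes = 0;
  return new Transform({
    transform(
      chunk: Buffer,
      _encoding: BufferEncoding,
      callback: (error?: Error | null, data?: unknown) => void,
    ) {
      sentBytes += chunk.length;
      onProgress(sentBytes, totalBytes);
      callback(null, chunk); // pass the data through untouched
    },
  });
}

// Usage sketch:
//   const total = statSync("./report.pdf").size;
//   const body = createReadStream("./report.pdf").pipe(
//     progressStream(total, (sent, all) => console.log(`${sent}/${all} bytes`)),
//   );
```

The counter reflects bytes handed to the HTTP client, not bytes acknowledged by the storage service, so the final confirmation should still come from the PUT response status.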

Step 3: Create the extraction job

Endpoint: POST /v2/extract

const job = await api.extract.create({
  storagePath: upload.storagePath,
  name: "Q1 Report",
});

console.log(job.id);    // extraction job id
console.log(job.state); // "pending"

curl https://api.exabase.io/v2/extract \
  -X POST \
  -H 'X-Api-Key: <EXABASE_API_KEY>' \
  -H 'Content-Type: application/json' \
  -d '{
    "storagePath": "<storagePath from step 1>",
    "name": "Q1 Report"
  }'
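The three steps can also be wrapped in a single function. The sketch below is illustrative rather than part of the SDK: the `ExtractApi` interface mirrors only the two SDK calls shown above, `uploadAndExtract` and `HttpPut` are hypothetical names, and the PUT is injected so that any HTTP client can be used. Failing fast on a rejected upload is an assumption about the error handling you want.

```typescript
// Minimal shapes mirroring the SDK calls used in steps 1 and 3 (assumed, not exhaustive).
interface UploadTarget {
  url: string;
  headers: Record<string, string>;
  storagePath: string;
}
interface ExtractJob {
  id: string;
  state: string;
}
interface ExtractApi {
  getUploadUrl(args: { filename: string }): Promise<UploadTarget>;
  create(args: { storagePath: string; name: string }): Promise<ExtractJob>;
}
// Hypothetical PUT function; resolves with the HTTP status code.
type HttpPut = (
  url: string,
  headers: Record<string, string>,
  body: Uint8Array,
) => Promise<number>;

async function uploadAndExtract(
  extract: ExtractApi,
  put: HttpPut,
  file: { filename: string; name: string; body: Uint8Array },
): Promise<ExtractJob> {
  // Step 1: ask for a pre-signed upload URL.
  const upload = await extract.getUploadUrl({ filename: file.filename });
  // Step 2: PUT the bytes with the headers returned in step 1.
  const status = await put(upload.url, upload.headers, file.body);
  if (status < 200 || status >= 300) {
    throw new Error(`Upload failed with status ${status}`);
  }
  // Step 3: register the job against the uploaded object.
  return extract.create({ storagePath: upload.storagePath, name: file.name });
}

// Usage sketch with fetch as the PUT implementation:
//   const put: HttpPut = async (url, headers, body) =>
//     (await fetch(url, { method: "PUT", headers, body })).status;
//   const job = await uploadAndExtract(api.extract, put, {
//     filename: "report.pdf", name: "Q1 Report", body: readFileSync("./report.pdf"),
//   });
```

Checking the PUT status before step 3 avoids registering a job for a storage path that never received the file.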

Submitting a URL

Pass a URL directly. Exabase fetches the content and determines how to process it: documents and media files are downloaded and processed as files, while plain web pages are processed as bookmarks.

const job = await api.extract.create({
  url: "https://example.com/research-paper.pdf",
});

console.log(job.id);
console.log(job.kind); // "document", "image", "bookmark", etc.

curl https://api.exabase.io/v2/extract \
  -X POST \
  -H 'X-Api-Key: <EXABASE_API_KEY>' \
  -H 'Content-Type: application/json' \
  -d '{ "url": "https://example.com/research-paper.pdf" }'

Reprocessing a failed job

If a job reaches the failed state, you can trigger a new processing attempt without re-uploading the file.

Endpoint: POST /v2/extract/{jobId}/reprocess

await api.extract.reprocess({ jobId: "job-id" });

curl https://api.exabase.io/v2/extract/<jobId>/reprocess \
  -X POST \
  -H 'X-Api-Key: <EXABASE_API_KEY>'
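A small guard can keep reprocessing from being triggered for jobs that have not failed. A sketch under stated assumptions: `ReprocessApi` mirrors only the SDK call shown above, `reprocessIfFailed` and `JobSummary` are hypothetical names, the failed state is assumed to be reported as the string "failed", and the return value of `reprocess` is not documented here, so it is ignored.

```typescript
interface JobSummary {
  id: string;
  state: string;
}
interface ReprocessApi {
  reprocess(args: { jobId: string }): Promise<unknown>;
}

// Returns true when a reprocess request was sent, false when the job was skipped.
async function reprocessIfFailed(extract: ReprocessApi, job: JobSummary): Promise<boolean> {
  if (job.state !== "failed") {
    return false; // only failed jobs are retried (assumed state string)
  }
  await extract.reprocess({ jobId: job.id });
  return true;
}

// Usage sketch: const retried = await reprocessIfFailed(api.extract, job);
```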