Index Documents

Indexing API (BETA)

Overview

Once your datasource is configured, push documents with POST /documents. Each document is identified by an id that is unique within its datasource; indexing a document with an existing id updates it. The endpoint accepts either a single document object or an array in the same request body; there is no separate bulk route. Wrap your payload in documents:

{ "documents": { ... } }            // single
{ "documents": [ { ... }, { ... } ] }  // bulk
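The two shapes above differ only in whether documents holds an object or an array. A minimal Python sketch of a wrapper that accepts either form is shown here; wrap_documents is a hypothetical helper name, not part of the API.

```python
from typing import Any, Dict, List, Union

Document = Dict[str, Any]


def wrap_documents(docs: Union[Document, List[Document]]) -> Dict[str, Any]:
    """Wrap one document (dict) or several (list of dicts) under "documents"."""
    if isinstance(docs, dict):
        return {"documents": docs}  # single form
    if isinstance(docs, list) and all(isinstance(d, dict) for d in docs):
        return {"documents": docs}  # bulk form
    raise TypeError("expected a document dict or a list of document dicts")
```

The returned dict can be passed as the JSON request body in either case.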

For searchable content, you choose one of two approaches per document:

Send file binaries --- Post the file as Base64-encoded bytes (binary_base64) with the correct mime_type. amberSearch handles content extraction server-side.
Send text directly --- Post text_content with the content you want indexed.

Details: Content extraction (two options).

Index a single document

Send a POST request to the /documents endpoint:

curl -X POST "https://customerDomain.ambersearch.de/api/indexing/documents" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "documents": {
      "datasource": "internal_wiki",
      "id": "wiki-getting-started",
      "title": "Getting Started with the Internal Wiki",
      "body": {
        "mime_type": "text/plain",
        "text_content": "This guide will help you navigate and contribute to our internal wiki..."
      },
      "file_type": "wiki",
      "path_preview": "https://wiki.company.com/getting-started",
      "path": "Engineering / Runbooks / Getting Started",
      "author": "[email protected]",
      "last_modified": "2026-04-09T12:00:00Z",
      "data_source_sub": "engineering",
      "data_source_sub_sub": "runbooks",
      "permissions": { "allow_anonymous_access": true },
      "custom_properties": [
        { "name": "department", "value": "Engineering" }
      ]
    }
  }'

import requests

response = requests.post(
    "https://customerDomain.ambersearch.de/api/indexing/documents",
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json",
    },
    json={
        "documents": {
            "datasource": "internal_wiki",
            "id": "wiki-getting-started",
            "title": "Getting Started with the Internal Wiki",
            "body": {
                "mime_type": "text/plain",
                "text_content": "This guide will help you navigate and contribute to our internal wiki...",
            },
            "file_type": "wiki",
            "path_preview": "https://wiki.company.com/getting-started",
            "path": "Engineering / Runbooks / Getting Started",
            "author": "[email protected]",
            "last_modified": "2026-04-09T12:00:00Z",
            "data_source_sub": "engineering",
            "data_source_sub_sub": "runbooks",
            "permissions": {"allow_anonymous_access": True},
            "custom_properties": [
                {"name": "department", "value": "Engineering"}
            ],
        }
    },
)
print(response.json())

const axios = require("axios");

axios
    .post(
        "https://customerDomain.ambersearch.de/api/indexing/documents",
        {
            documents: {
                datasource: "internal_wiki",
                id: "wiki-getting-started",
                title: "Getting Started with the Internal Wiki",
                body: {
                    mime_type: "text/plain",
                    text_content:
                        "This guide will help you navigate and contribute to our internal wiki...",
                },
                file_type: "wiki",
                path_preview: "https://wiki.company.com/getting-started",
                path: "Engineering / Runbooks / Getting Started",
                author: "[email protected]",
                last_modified: "2026-04-09T12:00:00Z",
                data_source_sub: "engineering",
                data_source_sub_sub: "runbooks",
                permissions: { allow_anonymous_access: true },
                custom_properties: [
                    { name: "department", value: "Engineering" },
                ],
            },
        },
        {
            headers: {
                Authorization: "Bearer YOUR_API_KEY",
                "Content-Type": "application/json",
            },
        }
    )
    .then((res) => console.log(res.data));

{
  "status": "ok",
  "datasource": "internal_wiki",
  "documents_received": null,
  "message": "document sent to the indexing pipeline.",
  "results": [
    {
      "document_id": "wiki-getting-started",
      "outcome": "sent",
      "message": "document sent to the indexing pipeline."
    }
  ]
}

Per-document outcomes

The pipeline returns one of these outcomes for each document so you can act on duplicates and skipped items:

`outcome`	Meaning
`sent`	New or changed --- queued into the indexing pipeline.
`already_indexed`	The same `id` is already in Solr with the same `last_modified` and the same `allow_token_document` set; nothing was queued.
`skipped`	Pipeline rejected the document (e.g. computed file suffix longer than 10 chars). The `message` explains why.
`failed`	Validation or runtime error. Only appears in bulk responses; single-document errors come back as HTTP errors.
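One way to act on these outcomes is to group the document ids in a response by outcome. The sketch below relies only on the results, document_id, and outcome fields shown in the sample responses; group_by_outcome is a hypothetical helper name.

```python
from collections import defaultdict
from typing import Any, Dict, List


def group_by_outcome(response_body: Dict[str, Any]) -> Dict[str, List[str]]:
    """Map each outcome ("sent", "failed", ...) to the document ids that got it."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for entry in response_body.get("results", []):
        grouped[entry["outcome"]].append(entry["document_id"])
    return dict(grouped)
```

For example, the ids under "failed" or "skipped" can be logged together with their message, while "already_indexed" ids usually need no action.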

Response codes for `POST /documents`

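The endpoint returns HTTP 200 for any well-formed request it accepts. In the bulk form this holds even if individual documents fail, so inspect results[*].outcome for per-document state. In the single-document form, an unknown datasource or a schema violation is returned as an HTTP error instead.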

Code	When
200 OK	Request accepted. Per-doc state lives in `results[*].outcome`.
404 Not Found	(single-doc form only) the referenced `datasource` does not exist. In the bulk form this becomes a per-doc `failed` entry.
422 Unprocessable Entity (service envelope)	(single-doc form only) custom-property or sub-reference schema violation — see Schema validation & errors. In the bulk form this becomes a per-doc `failed` entry.

Index multiple documents

Send an array under documents. The server processes them sequentially and returns one result per document. A failure on one document does not abort the rest — each entry’s outcome is reported individually.

curl -X POST "https://customerDomain.ambersearch.de/api/indexing/documents" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "documents": [
      {
        "datasource": "internal_wiki",
        "id": "doc-002",
        "title": "Second Document",
        "body": { "mime_type": "text/plain", "text_content": "Content of doc two." },
        "file_type": "wiki",
        "path_preview": "https://wiki.company.com/doc-002",
        "last_modified": "2026-04-09T12:00:00Z"
      },
      {
        "datasource": "internal_wiki",
        "id": "doc-003",
        "title": "Third Document",
        "body": { "mime_type": "text/plain", "text_content": "Content of doc three." },
        "file_type": "wiki",
        "path_preview": "https://wiki.company.com/doc-003",
        "last_modified": "2026-04-09T13:00:00Z"
      }
    ]
  }'

{
  "status": "ok",
  "documents_received": 2,
  "message": "2 sent, 0 already indexed, 0 skipped, 0 failed.",
  "results": [
    { "document_id": "doc-002", "outcome": "sent", "message": "document sent to the indexing pipeline." },
    { "document_id": "doc-003", "outcome": "sent", "message": "document sent to the indexing pipeline." }
  ]
}
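The bulk example above is given in curl only. As a rough Python equivalent, the sketch below builds the same request with the standard library; build_bulk_request is a hypothetical helper, and the function only constructs the request without sending it.

```python
import json
import urllib.request
from typing import Any, Dict, List

API_URL = "https://customerDomain.ambersearch.de/api/indexing/documents"


def build_bulk_request(
    documents: List[Dict[str, Any]], api_key: str
) -> urllib.request.Request:
    """Build (but do not send) a bulk POST with the array under "documents"."""
    payload = json.dumps({"documents": documents}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
```

Passing the result to urllib.request.urlopen sends it; the response body can then be decoded with json.load.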

Document fields

For the full field reference (types, required/optional, descriptions) see Datasource & properties --- Standard document fields. A few field-level reminders:

file_type is required and must be one of the supported values listed under File types (or fetched from GET /file-types).
data_source_sub / data_source_sub_sub must be slug keys that match a sub declared on the datasource (word_word form, e.g. engineering, runbooks). amberSearch builds the compound storage values ({datasource}__{sub_key} etc.) for you.
author is a free-form string (an email or display name); it is not an object.
path is a breadcrumb string for display (e.g. Engineering / Runbooks / Onboarding). For the URL to open the document, use path_preview.
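The reminders above can be checked on the client before sending. The following is a sketch only: preflight_issues is a hypothetical helper, the slug pattern is an assumption about what "word_word form" means, and the supported file types are expected to be supplied by the caller (for example from GET /file-types). The server remains the authority on validation.

```python
import re
from typing import Any, Dict, Iterable, List

# Assumed slug shape: lowercase alphanumeric words joined by underscores.
SLUG_RE = re.compile(r"^[a-z0-9]+(?:_[a-z0-9]+)*$")


def preflight_issues(
    doc: Dict[str, Any], supported_file_types: Iterable[str]
) -> List[str]:
    """Return human-readable problems found in a document; empty list if none."""
    issues: List[str] = []
    file_type = doc.get("file_type")
    if file_type is None:
        issues.append("file_type is required")
    elif file_type not in set(supported_file_types):
        issues.append(f"file_type {file_type!r} is not a supported value")
    if "author" in doc and not isinstance(doc["author"], str):
        issues.append("author must be a string, not an object")
    for key in ("data_source_sub", "data_source_sub_sub"):
        value = doc.get(key)
        if value is not None and not (
            isinstance(value, str) and SLUG_RE.match(value)
        ):
            issues.append(f"{key} should be a slug key such as 'engineering'")
    return issues
```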

Content extraction (two options)

Approach	What you send	Who extracts searchable text
Server processing	`binary_base64` + `mime_type`	amberSearch processes the file and derives text for the index.
Client-supplied text	`text_content`	You supply the content to be indexed.

Send either binary_base64 or text_content in body, never both.

Server processing --- Encode the file with standard Base64. Set mime_type to the real media type for that file. There is no closed list of supported types; use the correct IANA media type for whatever you upload.
Client-supplied text --- Pass text_content with the content you want indexed.

import base64

with open("quarterly-report.pdf", "rb") as f:
    binary_base64 = base64.standard_b64encode(f.read()).decode("ascii")

# Server processing: { "mime_type": "application/pdf", "binary_base64": binary_base64 }
# Client text:       { "mime_type": "text/plain",      "text_content": "..." }

Per-document binary cap: 100 MB (decoded) by default. The cap is enforced by the downstream indexing service: the public /api/indexing proxy does not decode binary_base64 itself, so an oversize payload reaches upstream and is rejected there, and the resulting error is forwarded verbatim. The limit is controlled by indexing_api_max_binary_bytes; ask your operator if you need it raised.
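The either/or rule and the size cap can both be enforced on the client before any bytes are uploaded. This is a sketch under assumptions: build_body is a hypothetical helper, and MAX_BINARY_BYTES assumes the 100 MB default with 1 MB taken as 1024 * 1024 bytes, which may not match your deployment's configured limit.

```python
import base64
from typing import Dict, Optional

# Assumption: default cap of 100 MB, counting 1 MB as 1024 * 1024 bytes.
MAX_BINARY_BYTES = 100 * 1024 * 1024


def build_body(
    mime_type: str,
    *,
    raw_bytes: Optional[bytes] = None,
    text: Optional[str] = None,
) -> Dict[str, str]:
    """Build the "body" object from file bytes or from text, never both."""
    if (raw_bytes is None) == (text is None):
        raise ValueError("provide exactly one of raw_bytes or text")
    if text is not None:
        return {"mime_type": mime_type, "text_content": text}
    if len(raw_bytes) > MAX_BINARY_BYTES:  # the cap applies to decoded size
        raise ValueError("file exceeds the per-document binary cap")
    encoded = base64.standard_b64encode(raw_bytes).decode("ascii")
    return {"mime_type": mime_type, "binary_base64": encoded}
```

Checking the size locally avoids sending a large payload that the downstream service would reject anyway.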

Custom properties on documents

If your datasource has property definitions, send matching values in custom_properties on each document. Each entry is a { "name", "value" } pair where name matches a property definition key.

curl -X POST "https://customerDomain.ambersearch.de/api/indexing/documents" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "documents": {
      "datasource": "internal_wiki",
      "id": "onboarding",
      "title": "Onboarding",
      "body": { "mime_type": "text/plain", "text_content": "..." },
      "file_type": "wiki",
      "path_preview": "https://wiki.company.com/onboarding",
      "last_modified": "2026-03-25T10:00:00Z",
      "custom_properties": [
        { "name": "department", "value": "Engineering" },
        { "name": "ticket_id", "value": "WIKI-42" }
      ],
      "permissions": { "allow_anonymous_access": true }
    }
  }'

Values are validated against the datasource schema --- the property name must exist in the definitions, and the value must match the declared property_type. See Schema validation & errors for details.
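If property values live in a plain mapping on the client, they can be converted to the { "name", "value" } list and checked against the known definition names first. The helper to_custom_properties is hypothetical, and defined_names is assumed to be a set of property names you already fetched or configured for the datasource.

```python
from typing import Any, Dict, Iterable, List


def to_custom_properties(
    values: Dict[str, Any], defined_names: Iterable[str]
) -> List[Dict[str, Any]]:
    """Turn {"department": "Engineering"} into [{"name": ..., "value": ...}]."""
    known = set(defined_names)
    unknown = sorted(name for name in values if name not in known)
    if unknown:
        # The server would reject these names; fail early on the client.
        raise ValueError(f"properties not defined on the datasource: {unknown}")
    return [{"name": name, "value": value} for name, value in values.items()]
```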

Updating a document

To update an existing document, re-index it with the same id. The entire document is replaced with the new payload.

The pipeline re-indexes a document only when last_modified has changed or its access tokens have changed (any change to the permissions block — allow_anonymous_access, allowed_users, or allowed_groups). Edits to other fields (e.g. title, body, custom_properties) without bumping last_modified or changing permissions are treated as no-ops, and the response reports outcome: "already_indexed". Bump last_modified whenever you want content edits to take effect.

curl -X POST "https://customerDomain.ambersearch.de/api/indexing/documents" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "documents": {
      "datasource": "internal_wiki",
      "id": "wiki-getting-started",
      "title": "Getting Started with the Internal Wiki (Updated)",
      "body": {
        "mime_type": "text/plain",
        "text_content": "Updated content for the getting started guide..."
      },
      "file_type": "wiki",
      "path_preview": "https://wiki.company.com/getting-started",
      "last_modified": "2026-05-01T09:00:00Z",
      "permissions": { "allow_anonymous_access": true }
    }
  }'

If neither last_modified nor the access tokens have changed, nothing is re-queued and the response reports outcome: "already_indexed".
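The re-queue rule can be approximated on the client to catch edits that would be ignored. This is only a sketch: will_requeue is a hypothetical helper, and it compares the raw last_modified string and the permissions block, whereas the server compares its own stored timestamp and the derived access tokens.

```python
from typing import Any, Dict


def will_requeue(previous: Dict[str, Any], updated: Dict[str, Any]) -> bool:
    """Approximate the rule: re-queue only if last_modified or permissions differ."""
    timestamp_changed = previous.get("last_modified") != updated.get("last_modified")
    permissions_changed = previous.get("permissions") != updated.get("permissions")
    return timestamp_changed or permissions_changed
```

If this returns False for a payload whose title or body changed, bump last_modified before sending.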

Deleting a document

curl -X DELETE "https://customerDomain.ambersearch.de/api/indexing/documents/internal_wiki/wiki-getting-started" \
  -H "Authorization: Bearer YOUR_API_KEY"

Response codes for DELETE /documents/{datasource}/{document_id}

Code	When
200 OK	Delete-by-id issued to Solr — idempotent, so this is returned even if no document with that `id` existed.
404 Not Found	Datasource not found.
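A Python counterpart to the curl call might look like the sketch below. build_delete_request is a hypothetical helper that only constructs the request; percent-encoding the path segments is a defensive assumption, since the set of characters allowed in ids is not stated here.

```python
import urllib.request
from urllib.parse import quote

BASE_URL = "https://customerDomain.ambersearch.de/api/indexing"


def build_delete_request(
    datasource: str, document_id: str, api_key: str
) -> urllib.request.Request:
    """Build (but do not send) DELETE /documents/{datasource}/{document_id}."""
    url = "{}/documents/{}/{}".format(
        BASE_URL, quote(datasource, safe=""), quote(document_id, safe="")
    )
    return urllib.request.Request(
        url,
        method="DELETE",
        headers={"Authorization": f"Bearer {api_key}"},
    )
```

Because the delete is idempotent, retrying the same request after a timeout should be safe.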

Indexing a document does not make it immediately searchable. The server controls when new and updated documents are committed to the search index. This can take up to 2 hours depending on system load and commit scheduling. Do not rely on instant availability after a successful POST.

Schema validation & errors

Every document is validated against the datasource’s schema (object definitions, property definitions, and declared sub keys) before it is persisted. If validation fails the API returns HTTP 422 with this envelope:

{
  "status": "error",
  "details": {
    "error_code": "UNKNOWN_PROPERTY",
    "message": "Property 'priority' is not defined on datasource 'internal_wiki'. Valid properties: ['department', 'ticket_id']."
  }
}

Pydantic field-level errors (e.g. missing required fields, invalid id characters) use a different shape — the standard FastAPI HTTPValidationError, wrapped in {"detail": [...]} with one entry per offending field. The error_code envelope above is only used for the service-layer schema-validation cases listed below.

Error codes

Error code	When it is returned
`NO_PROPERTY_DEFINITIONS`	The datasource has no property definitions, but the document includes `custom_properties`. Define properties on the datasource first.
`UNKNOWN_PROPERTY`	A `custom_properties` entry has a `name` that does not match any `property_definitions` on the datasource.
`INVALID_PROPERTY_TYPE`	The value of a custom property does not match the declared `property_type`. For example, sending a string for an `INTEGER` property or a non-boolean for a `BOOLEAN` property.
`INVALID_DATA_SOURCE_SUB`	`data_source_sub` (or `data_source_sub_sub` when no subs are declared at all) does not match a top-level sub key of the datasource.
`INVALID_DATA_SOURCE_SUB_SUB`	`data_source_sub_sub` is not a child of the supplied `data_source_sub`, or is unknown across the whole datasource when sent without a parent.
`AMBIGUOUS_DATA_SOURCE_SUB_SUB`	`data_source_sub_sub` was sent without `data_source_sub`, but the same child key exists under multiple parent subs. Set `data_source_sub` explicitly.

Expected value types

`property_type`	Accepted JSON values
`TEXT`	string
`INTEGER`	number (integer or float)
`DATE`	number (epoch seconds) or ISO 8601 string
`BOOLEAN`	`true` / `false`
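The table can be mirrored in a client-side check to catch type mismatches before sending. This is a sketch, not the server's validator: value_matches is a hypothetical helper, rejecting booleans for INTEGER and DATE is an assumption, and datetime.fromisoformat accepts only a subset of ISO 8601.

```python
from datetime import datetime
from typing import Any


def value_matches(property_type: str, value: Any) -> bool:
    """Check a custom property value against the accepted JSON types."""
    if property_type == "TEXT":
        return isinstance(value, str)
    if property_type == "BOOLEAN":
        return isinstance(value, bool)
    # bool is a subclass of int in Python, so exclude it explicitly (assumption).
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    if property_type == "INTEGER":
        return is_number
    if property_type == "DATE":
        if is_number:  # epoch seconds
            return True
        if isinstance(value, str):
            try:
                datetime.fromisoformat(value.replace("Z", "+00:00"))
                return True
            except ValueError:
                return False
        return False
    raise ValueError(f"unknown property_type: {property_type}")
```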

In bulk requests, validation runs per document and failures are reported in results with outcome: "failed" and a message describing the cause. The other documents in the same batch are still processed.

Envelope summary

Source	Envelope shape
FastAPI `HTTPException` (most 400 / 404 / 409 / 503)	`{"detail": "<message>"}`
Pydantic body / path validation (422)	`{"detail": [{"loc": [...], "msg": "...", "type": "..."}]}`
Service-layer schema validation (422)	`{"status": "error", "details": {"error_code": "<CODE>", "message": "<message>"}}`
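Since three envelope shapes can come back, client code may want to normalize them into a single message. The sketch below handles the shapes listed in the table; describe_error is a hypothetical helper that takes the already-parsed JSON body.

```python
from typing import Any, Dict


def describe_error(body: Dict[str, Any]) -> str:
    """Reduce any of the three error envelopes to one readable string."""
    details = body.get("details")
    if isinstance(details, dict):
        # Service-layer schema validation envelope.
        return f"{details.get('error_code')}: {details.get('message')}"
    detail = body.get("detail")
    if isinstance(detail, list):
        # Pydantic validation: one entry per offending field.
        return "; ".join(
            f"{'.'.join(str(part) for part in entry.get('loc', []))}: {entry.get('msg')}"
            for entry in detail
        )
    if isinstance(detail, str):
        # Plain HTTPException envelope.
        return detail
    return "unrecognized error envelope"
```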

For 401 / 422-Pydantic / 500 responses that can come from any endpoint, see Common HTTP responses on the overview page.

Next steps

Permissions

Configure fine-grained access controls for your documents.

Datasource & custom properties

Adjust object types, sub-datasources, and property schemas on the datasource.
