Overview

Once your datasource is configured, you can push documents to POST /documents. Each document is identified by a unique id within its datasource. Indexing a document with an existing id will update it. The endpoint accepts either a single document object or an array in the same request body — there is no separate bulk route. Wrap your payload in documents:
{ "documents": { ... } }            // single
{ "documents": [ { ... }, { ... } ] }  // bulk
For searchable content, you choose one of two approaches per document:
  1. Send file binaries --- Post the file as Base64-encoded bytes (binary_base64) with the correct mime_type. amberSearch handles content extraction server-side.
  2. Send text directly --- Post text_content with the content you want indexed.
Details: Content extraction (two options).

Index a single document

Send a POST request to the /documents endpoint:
curl -X POST "https://customerDomain.ambersearch.de/api/indexing/documents" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "documents": {
      "datasource": "internal_wiki",
      "id": "wiki-getting-started",
      "title": "Getting Started with the Internal Wiki",
      "body": {
        "mime_type": "text/plain",
        "text_content": "This guide will help you navigate and contribute to our internal wiki..."
      },
      "file_type": "wiki",
      "path_preview": "https://wiki.company.com/getting-started",
      "path": "Engineering / Runbooks / Getting Started",
      "author": "[email protected]",
      "last_modified": "2026-04-09T12:00:00Z",
      "data_source_sub": "engineering",
      "data_source_sub_sub": "runbooks",
      "permissions": { "allow_anonymous_access": true },
      "custom_properties": [
        { "name": "department", "value": "Engineering" }
      ]
    }
  }'
{
  "status": "ok",
  "datasource": "internal_wiki",
  "documents_received": null,
  "message": "document sent to the indexing pipeline.",
  "results": [
    {
      "document_id": "wiki-getting-started",
      "outcome": "sent",
      "message": "document sent to the indexing pipeline."
    }
  ]
}
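The same request can be assembled in Python. This is a minimal sketch using only the standard library; `build_index_request` and `BASE_URL` are hypothetical names chosen for illustration, and the host is the placeholder from the curl example. The function builds the request without sending it.

```python
import json
import urllib.request
from typing import Any, Dict, List, Union

# Placeholder host copied from the curl example; replace with your own domain.
BASE_URL = "https://customerDomain.ambersearch.de/api/indexing"


def build_index_request(
    docs: Union[Dict[str, Any], List[Dict[str, Any]]], api_key: str
) -> urllib.request.Request:
    """Build (but do not send) a POST /documents request."""
    # A dict is sent as the single form, a list as the bulk form.
    body = json.dumps({"documents": docs}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/documents",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


request = build_index_request(
    {
        "datasource": "internal_wiki",
        "id": "wiki-getting-started",
        "title": "Getting Started with the Internal Wiki",
        "body": {"mime_type": "text/plain", "text_content": "This guide will help you..."},
        "file_type": "wiki",
        "path_preview": "https://wiki.company.com/getting-started",
        "last_modified": "2026-04-09T12:00:00Z",
    },
    api_key="YOUR_API_KEY",
)
```

To send it, pass `request` to `urllib.request.urlopen` and parse the response body as JSON.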

Per-document outcomes

The pipeline returns one of these outcomes for each document so you can act on duplicates and skipped items:
| outcome | Meaning |
| --- | --- |
| sent | New or changed: queued into the indexing pipeline. |
| already_indexed | The same id is already in Solr with the same last_modified and the same allow_token_document set; nothing was queued. |
| skipped | Pipeline rejected the document (e.g. computed file suffix longer than 10 chars). The message explains why. |
| failed | Validation or runtime error. Only appears in bulk responses; single-document errors come back as HTTP errors. |
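One way to act on these outcomes is to group the results by outcome after each request. The sketch below assumes the response has already been parsed into a dict; `group_by_outcome` is a hypothetical helper, and the sample response is illustrative data rather than real server output.

```python
from collections import defaultdict
from typing import Any, Dict, List


def group_by_outcome(response: Dict[str, Any]) -> Dict[str, List[str]]:
    """Map each outcome to the document ids that received it."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for result in response.get("results", []):
        grouped[result["outcome"]].append(result["document_id"])
    return dict(grouped)


# Illustrative response; the messages are placeholders.
sample = {
    "status": "ok",
    "results": [
        {"document_id": "doc-002", "outcome": "sent", "message": "..."},
        {"document_id": "doc-003", "outcome": "already_indexed", "message": "..."},
        {"document_id": "doc-004", "outcome": "skipped", "message": "..."},
    ],
}
grouped = group_by_outcome(sample)

# Skipped and failed documents are the ones that usually need a closer look.
needs_attention = grouped.get("skipped", []) + grouped.get("failed", [])
```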

Response codes for POST /documents

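For a well-formed bulk request the HTTP response is 200 even if individual documents fail; inspect results[*].outcome for per-document state. In the single-document form, a missing datasource or a schema violation is returned as an HTTP error instead.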
| Code | When |
| --- | --- |
| 200 OK | Request accepted. Per-doc state lives in results[*].outcome. |
| 404 Not Found | (single-doc form only) The referenced datasource does not exist. In the bulk form this becomes a per-doc failed entry. |
| 422 Unprocessable Entity (service envelope) | (single-doc form only) Custom-property or sub-reference schema violation; see Schema validation & errors. In the bulk form this becomes a per-doc failed entry. |

Index multiple documents

Send an array under documents. The server processes them sequentially and returns one result per document. A failure on one document does not abort the rest — each entry’s outcome is reported individually.
curl -X POST "https://customerDomain.ambersearch.de/api/indexing/documents" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "documents": [
      {
        "datasource": "internal_wiki",
        "id": "doc-002",
        "title": "Second Document",
        "body": { "mime_type": "text/plain", "text_content": "Content of doc two." },
        "file_type": "wiki",
        "path_preview": "https://wiki.company.com/doc-002",
        "last_modified": "2026-04-09T12:00:00Z"
      },
      {
        "datasource": "internal_wiki",
        "id": "doc-003",
        "title": "Third Document",
        "body": { "mime_type": "text/plain", "text_content": "Content of doc three." },
        "file_type": "wiki",
        "path_preview": "https://wiki.company.com/doc-003",
        "last_modified": "2026-04-09T13:00:00Z"
      }
    ]
  }'
{
  "status": "ok",
  "documents_received": 2,
  "message": "2 sent, 0 already indexed, 0 skipped, 0 failed.",
  "results": [
    { "document_id": "doc-002", "outcome": "sent", "message": "document sent to the indexing pipeline." },
    { "document_id": "doc-003", "outcome": "sent", "message": "document sent to the indexing pipeline." }
  ]
}
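Because a failure on one document does not abort the batch, a client may want to pull out only the failed documents for inspection or a later retry. The sketch below is one possible approach; `failed_documents` is a hypothetical helper, it assumes each result's document_id equals the id that was sent (as in the example above), and the failure message is invented for illustration.

```python
from typing import Any, Dict, List


def failed_documents(
    sent: List[Dict[str, Any]], response: Dict[str, Any]
) -> List[Dict[str, Any]]:
    """Return the documents from `sent` whose result outcome is "failed"."""
    failed_ids = {
        result["document_id"]
        for result in response.get("results", [])
        if result["outcome"] == "failed"
    }
    return [doc for doc in sent if doc["id"] in failed_ids]


batch = [
    {"datasource": "internal_wiki", "id": "doc-002"},
    {"datasource": "internal_wiki", "id": "doc-003"},
]
# Illustrative bulk response in which the second document failed.
bulk_response = {
    "status": "ok",
    "documents_received": 2,
    "results": [
        {"document_id": "doc-002", "outcome": "sent", "message": "..."},
        {"document_id": "doc-003", "outcome": "failed", "message": "..."},
    ],
}
to_review = failed_documents(batch, bulk_response)
```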

Document fields

For the full field reference (types, required/optional, descriptions) see Datasource & properties --- Standard document fields. A few field-level reminders:
  • file_type is required and must be one of the supported values listed under File types (or fetched from GET /file-types).
  • data_source_sub / data_source_sub_sub must be slug keys that match a sub declared on the datasource (word_word form, e.g. engineering, runbooks). amberSearch builds the compound storage values ({datasource}__{sub_key} etc.) for you.
  • author is a free-form string (an email or display name); it is not an object.
  • path is a breadcrumb string for display (e.g. Engineering / Runbooks / Onboarding). For the URL to open the document, use path_preview.

Content extraction (two options)

| Approach | What you send | Who extracts searchable text |
| --- | --- | --- |
| Server processing | binary_base64 + mime_type | amberSearch processes the file and derives text for the index. |
| Client-supplied text | text_content | You supply the content to be indexed. |
Send either binary_base64 or text_content in body, never both.
  • Server processing --- Encode the file with standard Base64. Set mime_type to the real media type for that file. There is no closed list of supported types; use the correct IANA media type for whatever you upload.
  • Client-supplied text --- Pass text_content with the content you want indexed.
import base64

with open("quarterly-report.pdf", "rb") as f:
    binary_base64 = base64.standard_b64encode(f.read()).decode("ascii")

# Server processing: { "mime_type": "application/pdf", "binary_base64": binary_base64 }
# Client text:       { "mime_type": "text/plain",      "text_content": "..." }
Per-document binary cap: 100 MB (decoded). The cap is enforced by the downstream indexing service — the public /api/indexing proxy does not decode binary_base64 itself, so an oversize payload reaches upstream and is rejected there; the resulting error is forwarded verbatim. The default is controlled by indexing_api_max_binary_bytes; ask your operator if you need it raised.
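Since an oversize payload is only rejected upstream, checking the file size before encoding can save a large upload. The following is a minimal sketch: `build_binary_body` and `MAX_BINARY_BYTES` are hypothetical names, reading "100 MB" as 100 * 1024 * 1024 bytes is an assumption, and guessing the media type from the file extension is a convenience that you may prefer to replace with an explicit value.

```python
import base64
import mimetypes
import tempfile
from pathlib import Path
from typing import Dict

# Assumption: "100 MB" is read as 100 * 1024 * 1024 bytes. The effective limit
# is set server-side by indexing_api_max_binary_bytes.
MAX_BINARY_BYTES = 100 * 1024 * 1024


def build_binary_body(path: str, max_bytes: int = MAX_BINARY_BYTES) -> Dict[str, str]:
    """Return a body dict with mime_type and binary_base64 for a local file."""
    file_path = Path(path)
    size = file_path.stat().st_size  # decoded size, i.e. raw bytes on disk
    if size > max_bytes:
        raise ValueError(f"{file_path.name} is {size} bytes, over the {max_bytes}-byte cap")
    mime_type, _ = mimetypes.guess_type(file_path.name)
    if mime_type is None:
        raise ValueError(f"cannot guess a media type for {file_path.name}; set it explicitly")
    encoded = base64.standard_b64encode(file_path.read_bytes()).decode("ascii")
    return {"mime_type": mime_type, "binary_base64": encoded}


# Demonstration with a small temporary file.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "notes.txt"
    sample.write_bytes(b"hello")
    body = build_binary_body(str(sample))
```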

Custom properties on documents

If your datasource has property definitions, send matching values in custom_properties on each document. Each entry is a { "name", "value" } pair where name matches a property definition key.
curl -X POST "https://customerDomain.ambersearch.de/api/indexing/documents" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "documents": {
      "datasource": "internal_wiki",
      "id": "onboarding",
      "title": "Onboarding",
      "body": { "mime_type": "text/plain", "text_content": "..." },
      "file_type": "wiki",
      "path_preview": "https://wiki.company.com/onboarding",
      "last_modified": "2026-03-25T10:00:00Z",
      "custom_properties": [
        { "name": "department", "value": "Engineering" },
        { "name": "ticket_id", "value": "WIKI-42" }
      ],
      "permissions": { "allow_anonymous_access": true }
    }
  }'
Values are validated against the datasource schema --- the property name must exist in the definitions, and the value must match the declared property_type. See Schema validation & errors for details.

Updating a document

To update an existing document, re-index it with the same id. The entire document is replaced with the new payload.
The pipeline re-indexes a document only when last_modified has changed or its access tokens have changed (any change to the permissions block: allow_anonymous_access, allowed_users, or allowed_groups). Edits to other fields (e.g. title, body, custom_properties) without bumping last_modified or changing permissions are treated as no-ops, and the response reports outcome: "already_indexed". Bump last_modified whenever you want content edits to take effect, as in this request:
curl -X POST "https://customerDomain.ambersearch.de/api/indexing/documents" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "documents": {
      "datasource": "internal_wiki",
      "id": "wiki-getting-started",
      "title": "Getting Started with the Internal Wiki (Updated)",
      "body": {
        "mime_type": "text/plain",
        "text_content": "Updated content for the getting started guide..."
      },
      "file_type": "wiki",
      "path_preview": "https://wiki.company.com/getting-started",
      "last_modified": "2026-05-01T09:00:00Z",
      "permissions": { "allow_anonymous_access": true }
    }
  }'
If neither last_modified nor the access tokens have changed, nothing is re-queued and the response reports outcome: "already_indexed".
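A client that edits content can guard against silent no-ops by refreshing last_modified before re-sending. The sketch below shows one way to do that; `with_bumped_last_modified` is a hypothetical helper, and the timestamp format is copied from the examples on this page. Where the source system records a real modification time, using that value is likely the better choice.

```python
import copy
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def with_bumped_last_modified(
    doc: Dict[str, Any], now: Optional[datetime] = None
) -> Dict[str, Any]:
    """Return a copy of `doc` whose last_modified is set to `now` in UTC.

    `now` should be timezone-aware; it defaults to the current UTC time.
    """
    moment = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    updated = copy.deepcopy(doc)  # leave the caller's document untouched
    updated["last_modified"] = moment.strftime("%Y-%m-%dT%H:%M:%SZ")
    return updated


original = {
    "datasource": "internal_wiki",
    "id": "wiki-getting-started",
    "title": "Getting Started with the Internal Wiki (Updated)",
    "last_modified": "2026-04-09T12:00:00Z",
}
fixed_time = datetime(2026, 5, 1, 9, 0, 0, tzinfo=timezone.utc)
updated = with_bumped_last_modified(original, now=fixed_time)
```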

Deleting a document

curl -X DELETE "https://customerDomain.ambersearch.de/api/indexing/documents/internal_wiki/wiki-getting-started" \
  -H "Authorization: Bearer YOUR_API_KEY"
Response codes for DELETE /documents/{datasource}/{document_id}
| Code | When |
| --- | --- |
| 200 OK | Delete-by-id issued to Solr. The call is idempotent, so this is returned even if no document with that id existed. |
| 404 Not Found | Datasource not found. |
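The DELETE call can be built the same way in Python. This is a sketch under assumptions: `build_delete_request` and `BASE_URL` are hypothetical names, the host is the placeholder from the curl example, and percent-encoding the path segments is a defensive choice whose handling by the server is not confirmed here.

```python
import urllib.request
from urllib.parse import quote

# Placeholder host copied from the curl example; replace with your own domain.
BASE_URL = "https://customerDomain.ambersearch.de/api/indexing"


def build_delete_request(
    datasource: str, document_id: str, api_key: str
) -> urllib.request.Request:
    """Build (but do not send) a DELETE /documents/{datasource}/{document_id} request."""
    # safe="" percent-encodes "/" as well, so each value stays one path segment.
    path = f"/documents/{quote(datasource, safe='')}/{quote(document_id, safe='')}"
    return urllib.request.Request(
        url=BASE_URL + path,
        method="DELETE",
        headers={"Authorization": f"Bearer {api_key}"},
    )


delete_request = build_delete_request(
    "internal_wiki", "wiki-getting-started", api_key="YOUR_API_KEY"
)
```

Because the delete is idempotent, repeating the call after a timeout should be safe.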
Indexing a document does not make it immediately searchable. The server controls when new and updated documents are committed to the search index. This can take up to 2 hours depending on system load and commit scheduling. Do not rely on instant availability after a successful POST.

Schema validation & errors

Every document is validated against the datasource’s schema (object definitions, property definitions, and declared sub keys) before it is persisted. If validation fails the API returns HTTP 422 with this envelope:
{
  "status": "error",
  "details": {
    "error_code": "UNKNOWN_PROPERTY",
    "message": "Property 'priority' is not defined on datasource 'internal_wiki'. Valid properties: ['department', 'ticket_id']."
  }
}
Pydantic field-level errors (e.g. missing required fields, invalid id characters) use a different shape — the standard FastAPI HTTPValidationError, wrapped in {"detail": [...]} with one entry per offending field. The error_code envelope above is only used for the service-layer schema-validation cases listed below.

Error codes

| Error code | When it is returned |
| --- | --- |
| NO_PROPERTY_DEFINITIONS | The datasource has no property definitions, but the document includes custom_properties. Define properties on the datasource first. |
| UNKNOWN_PROPERTY | A custom_properties entry has a name that does not match any property_definitions on the datasource. |
| INVALID_PROPERTY_TYPE | The value of a custom property does not match the declared property_type. For example, sending a string for an INTEGER property or a non-boolean for a BOOLEAN property. |
| INVALID_DATA_SOURCE_SUB | data_source_sub (or data_source_sub_sub when no subs are declared at all) does not match a top-level sub key of the datasource. |
| INVALID_DATA_SOURCE_SUB_SUB | data_source_sub_sub is not a child of the supplied data_source_sub, or is unknown across the whole datasource when sent without a parent. |
| AMBIGUOUS_DATA_SOURCE_SUB_SUB | data_source_sub_sub was sent without data_source_sub, but the same child key exists under multiple parent subs. Set data_source_sub explicitly. |
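The three sub-reference codes follow a decision order that can be mirrored client-side before sending. The sketch below is an interpretation of the descriptions above, not the server's implementation; `check_subs` is a hypothetical helper, and representing the declared subs as a dict of parent key to child keys is an assumption. The keys "faq" and "sales" are made up for the example.

```python
from typing import Dict, List, Optional


def check_subs(
    subs: Dict[str, List[str]], sub: Optional[str], sub_sub: Optional[str]
) -> Optional[str]:
    """Return the error code the descriptions suggest, or None if the keys look valid."""
    if sub is not None:
        if sub not in subs:
            return "INVALID_DATA_SOURCE_SUB"
        if sub_sub is not None and sub_sub not in subs[sub]:
            return "INVALID_DATA_SOURCE_SUB_SUB"
        return None
    if sub_sub is None:
        return None  # neither field was sent
    if not subs:
        # No subs declared at all: reported as INVALID_DATA_SOURCE_SUB.
        return "INVALID_DATA_SOURCE_SUB"
    parents = [parent for parent, children in subs.items() if sub_sub in children]
    if not parents:
        return "INVALID_DATA_SOURCE_SUB_SUB"
    if len(parents) > 1:
        return "AMBIGUOUS_DATA_SOURCE_SUB_SUB"
    return None


declared = {"engineering": ["runbooks", "faq"], "sales": ["faq"]}
ambiguous = check_subs(declared, None, "faq")
```

Sending data_source_sub explicitly, as in `check_subs(declared, "sales", "faq")`, removes the ambiguity.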

Expected value types

| property_type | Accepted JSON values |
| --- | --- |
| TEXT | string |
| INTEGER | number (integer or float) |
| DATE | number (epoch seconds) or ISO 8601 string |
| BOOLEAN | true / false |
In bulk requests, validation runs per document and failures are reported in results with outcome: "failed" and a message describing the cause. The other documents in the same batch are still processed.
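A client-side pre-check against the accepted value types can catch most property errors before a request is sent. This is a sketch, not the server's validator: `value_matches` and `first_error` are hypothetical helpers, passing the definitions as a dict of property name to property_type is an assumption, and `datetime.fromisoformat` accepts only a subset of ISO 8601 on older Python versions. The property names other than department are invented for the example.

```python
from datetime import datetime
from typing import Any, Dict, List, Optional


def value_matches(property_type: str, value: Any) -> bool:
    """Check a Python value against the accepted JSON values for a property_type."""
    if property_type == "TEXT":
        return isinstance(value, str)
    if property_type == "BOOLEAN":
        return isinstance(value, bool)
    # bool is a subclass of int in Python, but JSON true/false is not a number.
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    if property_type == "INTEGER":
        return is_number
    if property_type == "DATE":
        if is_number:
            return True  # epoch seconds
        if isinstance(value, str):
            try:
                datetime.fromisoformat(value.replace("Z", "+00:00"))
                return True
            except ValueError:
                return False
        return False
    return False  # unknown property_type


def first_error(
    definitions: Dict[str, str], custom_properties: List[Dict[str, Any]]
) -> Optional[str]:
    """Return the first error code that applies, or None."""
    if custom_properties and not definitions:
        return "NO_PROPERTY_DEFINITIONS"
    for entry in custom_properties:
        declared = definitions.get(entry["name"])
        if declared is None:
            return "UNKNOWN_PROPERTY"
        if not value_matches(declared, entry["value"]):
            return "INVALID_PROPERTY_TYPE"
    return None


definitions = {"department": "TEXT", "ticket_count": "INTEGER", "due": "DATE"}
ok = first_error(definitions, [{"name": "department", "value": "Engineering"}])
bad_type = first_error(definitions, [{"name": "ticket_count", "value": "42"}])
```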

Envelope summary

| Source | Envelope shape |
| --- | --- |
| FastAPI HTTPException (most 400 / 404 / 409 / 503) | {"detail": "<message>"} |
| Pydantic body / path validation (422) | {"detail": [{"loc": [...], "msg": "...", "type": "..."}]} |
| Service-layer schema validation (422) | {"status": "error", "details": {"error_code": "<CODE>", "message": "<message>"}} |
For 401 / 422-Pydantic / 500 responses that can come from any endpoint, see Common HTTP responses on the overview page.
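Since three envelope shapes can come back, a client typically needs a single function that tells them apart. The sketch below distinguishes them by structure alone; `describe_error` is a hypothetical helper, the labels "service", "pydantic", and "http" are names chosen here, and the sample messages are illustrative.

```python
from typing import Any, Dict, List, Tuple


def describe_error(body: Dict[str, Any]) -> Tuple[str, List[str]]:
    """Classify a parsed error body and return (kind, human-readable messages)."""
    details = body.get("details")
    if body.get("status") == "error" and isinstance(details, dict):
        # Service-layer schema validation envelope.
        return "service", [f'{details.get("error_code")}: {details.get("message")}']
    detail = body.get("detail")
    if isinstance(detail, list):
        # Pydantic validation: one entry per offending field.
        return "pydantic", [
            f'{".".join(str(part) for part in item.get("loc", []))}: {item.get("msg")}'
            for item in detail
        ]
    if isinstance(detail, str):
        # Plain FastAPI HTTPException.
        return "http", [detail]
    return "unknown", []


service_error = describe_error(
    {"status": "error", "details": {"error_code": "UNKNOWN_PROPERTY", "message": "not defined"}}
)
field_error = describe_error(
    {"detail": [{"loc": ["body", "documents", "id"], "msg": "field required", "type": "missing"}]}
)
http_error = describe_error({"detail": "Datasource not found"})
```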

Next steps

Permissions

Configure fine-grained access controls for your documents.

Datasource & custom properties

Adjust object types, sub-datasources, and property schemas on the datasource.