Indexing API
Index Documents
Push documents into amberSearch to make them searchable through the Indexing API.
Once your datasource is configured, you can push documents to POST /documents. Each document is identified by a unique id within its datasource. Indexing a document with an existing id will update it.
The endpoint accepts either a single document object or an array in the same request body; there is no separate bulk route. Wrap your payload in documents:
For searchable content, you choose one of two approaches per document:
Send file binaries --- Post the file as Base64-encoded bytes (binary_base64) with the correct mime_type. amberSearch handles content extraction server-side.
Send text directly --- Post text_content with the content you want indexed.
curl -X POST "https://customerDomain.ambersearch.de/api/indexing/documents" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "documents": {
      "datasource": "internal_wiki",
      "id": "wiki-getting-started",
      "title": "Getting Started with the Internal Wiki",
      "body": {
        "mime_type": "text/plain",
        "text_content": "This guide will help you navigate and contribute to our internal wiki..."
      },
      "file_type": "wiki",
      "path_preview": "https://wiki.company.com/getting-started",
      "path": "Engineering / Runbooks / Getting Started",
      "author": "[email protected]",
      "last_modified": "2026-04-09T12:00:00Z",
      "data_source_sub": "engineering",
      "data_source_sub_sub": "runbooks",
      "permissions": { "allow_anonymous_access": true },
      "custom_properties": [
        { "name": "department", "value": "Engineering" }
      ]
    }
  }'

{
  "status": "ok",
  "datasource": "internal_wiki",
  "documents_received": null,
  "message": "document sent to the indexing pipeline.",
  "results": [
    {
      "document_id": "wiki-getting-started",
      "outcome": "sent",
      "message": "document sent to the indexing pipeline."
    }
  ]
}
The HTTP response is 200 for any well-formed request, even if individual documents fail — inspect results[*].outcome for per-document state.
| Code | When |
| --- | --- |
| 200 OK | Request accepted. Per-doc state lives in results[*].outcome. |
| 404 Not Found | (single-doc form only) The referenced datasource does not exist. In the bulk form this becomes a per-doc failed entry. |
| 422 Unprocessable Entity (service envelope) | (single-doc form only) Custom-property or sub-reference schema violation; see Schema validation & errors. In the bulk form this becomes a per-doc failed entry. |
Send an array under documents. The server processes them sequentially and returns one result per document. A failure on one document does not abort the rest — each entry’s outcome is reported individually.
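The bulk form and its per-document results can be sketched in Python as follows. This is a minimal illustration, not an official client: the helper names build_bulk_request and summarize_results are hypothetical, the base URL and API key are the placeholders from the curl example, and only the standard library is used. The request is prepared but not sent.

```python
import json
import urllib.request
from collections import Counter

# Placeholder host copied from the curl example; replace with your own domain.
BASE_URL = "https://customerDomain.ambersearch.de/api/indexing"


def build_bulk_request(documents, api_key):
    """Return a prepared POST /documents request; nothing is sent here."""
    payload = json.dumps({"documents": list(documents)}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/documents",
        data=payload,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def summarize_results(response_body):
    """Count outcomes and collect the failed entries of a parsed response."""
    results = response_body.get("results") or []
    counts = Counter(entry.get("outcome") for entry in results)
    failed = [entry for entry in results if entry.get("outcome") == "failed"]
    return counts, failed
```

To send the request, pass it to urllib.request.urlopen and parse the body with json.load. Because a 200 response can still contain failed entries, check the failed list from summarize_results rather than relying on the status code alone.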
file_type is required and must be one of the supported values listed under File types (or fetched from GET /file-types).
data_source_sub / data_source_sub_sub must be slug keys that match a sub declared on the datasource (word_word form, e.g. engineering, runbooks). amberSearch builds the compound storage values ({datasource}__{sub_key} etc.) for you.
author is a free-form string (an email or display name); it is not an object.
path is a breadcrumb string for display (e.g. Engineering / Runbooks / Onboarding). For the URL to open the document, use path_preview.
Server processing (binary_base64): amberSearch processes the file and derives text for the index.
Client-supplied text (text_content): You supply the content to be indexed.
Send either binary_base64 or text_content in body, never both.
Server processing --- Encode the file with standard Base64. Set mime_type to the real media type for that file. There is no closed list of supported types; use the correct IANA media type for whatever you upload.
Client-supplied text --- Pass text_content with the content you want indexed.
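A small helper can enforce the either/or rule on the client before a request is built. The sketch below is illustrative only; make_body is a hypothetical name, and the field names are the ones used in this page.

```python
import base64


def make_body(mime_type, *, text=None, data=None):
    """Build the `body` object from either text or raw bytes, never both."""
    if (text is None) == (data is None):
        raise ValueError("pass exactly one of text= or data=")
    if text is not None:
        return {"mime_type": mime_type, "text_content": text}
    # Standard Base64 alphabet with padding, converted to str so it fits in JSON.
    encoded = base64.b64encode(data).decode("ascii")
    return {"mime_type": mime_type, "binary_base64": encoded}
```

For example, make_body("application/pdf", data=open("report.pdf", "rb").read()) yields a body for server-side extraction, while make_body("text/plain", text="...") yields a client-supplied text body.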
Per-document binary cap: 100 MB (decoded). The cap is enforced by the downstream indexing service — the public /api/indexing proxy does not decode binary_base64 itself, so an oversize payload reaches upstream and is rejected there; the resulting error is forwarded verbatim. The default is controlled by indexing_api_max_binary_bytes; ask your operator if you need it raised.
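Since the proxy does not decode the payload itself, a client-side size check can avoid uploading a file that upstream will reject. The following is a sketch under one stated assumption: "100 MB" is read as 100 * 1024 * 1024 bytes, which this page does not confirm, and your deployment may use a different indexing_api_max_binary_bytes. The function names are hypothetical.

```python
import base64

# Assumption: 100 MB interpreted as binary megabytes; adjust to your deployment.
MAX_BINARY_BYTES = 100 * 1024 * 1024


def decoded_size(binary_base64):
    """Return the decoded byte length of a padded standard Base64 string."""
    padding = binary_base64[-2:].count("=")
    return len(binary_base64) // 4 * 3 - padding


def encode_within_cap(raw, limit=MAX_BINARY_BYTES):
    """Base64-encode raw bytes, refusing input above the decoded-size limit."""
    if len(raw) > limit:
        raise ValueError(f"file is {len(raw)} bytes, limit is {limit}")
    return base64.b64encode(raw).decode("ascii")
```

The cap applies to the decoded size, so the check is made on the raw bytes; the Base64 string itself is roughly one third larger.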
If your datasource has property definitions, send matching values in custom_properties on each document. Each entry is a { "name", "value" } pair where name matches a property definition key.
Values are validated against the datasource schema --- the property name must exist in the definitions, and the value must match the declared property_type. See Schema validation & errors for details.
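The same checks can be mirrored on the client to catch mistakes before sending. This sketch assumes you keep a local copy of the property definitions as a plain dict; INTEGER and BOOLEAN are the types named on this page, while STRING is an assumed type name used for illustration. The server remains the authority on validation.

```python
# Hypothetical local copy of a datasource's property definitions.
DEFINITIONS = {"department": "STRING", "ticket_id": "INTEGER", "archived": "BOOLEAN"}


def _matches(value, property_type):
    if property_type == "BOOLEAN":
        return isinstance(value, bool)
    if property_type == "INTEGER":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, int) and not isinstance(value, bool)
    if property_type == "STRING":
        return isinstance(value, str)
    return True  # types not modelled here are left for the server to judge


def to_custom_properties(values, definitions):
    """Turn {name: value} into the custom_properties list, checking locally."""
    entries = []
    for name, value in values.items():
        if name not in definitions:
            raise ValueError(f"UNKNOWN_PROPERTY: {name!r}")
        expected = definitions[name]
        if not _matches(value, expected):
            raise ValueError(f"INVALID_PROPERTY_TYPE: {name!r} expects {expected}")
        entries.append({"name": name, "value": value})
    return entries
```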
To update an existing document, simply re-index it with the same id. The entire document is replaced with the new payload:
The pipeline re-indexes a document only when last_modified has changed or its access tokens have changed (any change to the permissions block — allow_anonymous_access, allowed_users, or allowed_groups). Edits to other fields (e.g. title, body, custom_properties) without bumping last_modified or changing permissions are treated as no-ops, and the response reports outcome: "already_indexed". Bump last_modified whenever you want content edits to take effect.
curl -X POST "https://customerDomain.ambersearch.de/api/indexing/documents" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "documents": {
      "datasource": "internal_wiki",
      "id": "wiki-getting-started",
      "title": "Getting Started with the Internal Wiki (Updated)",
      "body": {
        "mime_type": "text/plain",
        "text_content": "Updated content for the getting started guide..."
      },
      "file_type": "wiki",
      "path_preview": "https://wiki.company.com/getting-started",
      "last_modified": "2026-05-01T09:00:00Z",
      "permissions": { "allow_anonymous_access": true }
    }
  }'
If neither last_modified nor the access tokens have changed, nothing is re-queued and the response reports outcome: "already_indexed".
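The re-index rule can be approximated on the client to predict whether a re-post will take effect. This is a sketch, not the pipeline's actual logic: it compares last_modified as a plain string and the permissions block as a whole, which is an assumption about how the server detects changes. Both function names are hypothetical.

```python
from datetime import datetime, timezone


def will_reindex(previous, incoming):
    """Predict whether re-posting `incoming` would trigger re-indexing."""
    if previous.get("last_modified") != incoming.get("last_modified"):
        return True
    return previous.get("permissions") != incoming.get("permissions")


def bump_last_modified(document):
    """Return a copy of the document stamped with the current UTC time."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {**document, "last_modified": now}
```

If will_reindex returns False for an edited document, calling bump_last_modified before sending is one way to make the edit take effect.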
Response codes for DELETE /documents/{datasource}/{document_id}
| Code | When |
| --- | --- |
| 200 OK | Delete-by-id issued to Solr. The operation is idempotent, so this is returned even if no document with that id existed. |
| 404 Not Found | Datasource not found. |
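A delete call for the route above can be prepared as follows. This is a hedged sketch: the base URL is the placeholder from the curl examples, the Bearer header is assumed to work the same way as for POST /documents, and build_delete_request is a hypothetical helper name. The request is prepared but not sent.

```python
import urllib.parse
import urllib.request

# Placeholder host copied from the curl examples; replace with your own domain.
BASE_URL = "https://customerDomain.ambersearch.de/api/indexing"


def build_delete_request(datasource, document_id, api_key):
    """Return a prepared DELETE /documents/{datasource}/{document_id} request."""
    # safe="" percent-encodes "/" so an id cannot introduce extra path segments.
    path = "/".join(
        urllib.parse.quote(part, safe="") for part in (datasource, document_id)
    )
    return urllib.request.Request(
        f"{BASE_URL}/documents/{path}",
        method="DELETE",
        headers={"Authorization": f"Bearer {api_key}"},
    )
```

Because the delete is idempotent, a 200 response does not confirm that the document existed; only a 404 tells you something, namely that the datasource is unknown.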
Indexing a document does not make it immediately searchable. The server controls when new and updated documents are committed to the search index. This can take up to 2 hours depending on system load and commit scheduling. Do not rely on instant availability after a successful POST.
Every document is validated against the datasource’s schema (object definitions, property definitions, and declared sub keys) before it is persisted. If validation fails the API returns HTTP 422 with this envelope:
{
  "status": "error",
  "details": {
    "error_code": "UNKNOWN_PROPERTY",
    "message": "Property 'priority' is not defined on datasource 'internal_wiki'. Valid properties: ['department', 'ticket_id']."
  }
}
Pydantic field-level errors (e.g. missing required fields, invalid id characters) use a different shape — the standard FastAPI HTTPValidationError, wrapped in {"detail": [...]} with one entry per offending field. The error_code envelope above is only used for the service-layer schema-validation cases listed below.
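A client has to handle both error shapes. The sketch below flattens either one into a list of (label, message) pairs; describe_error is a hypothetical helper, and the loc and msg keys are the ones FastAPI's standard validation errors carry.

```python
def describe_error(body):
    """Flatten either 422 error shape into a list of (label, message) pairs."""
    details = body.get("details")
    if isinstance(details, dict) and "error_code" in details:
        # Service-layer envelope: a single error_code with a message.
        return [(details["error_code"], details.get("message", ""))]
    problems = []
    detail = body.get("detail")
    if isinstance(detail, list):
        # FastAPI HTTPValidationError: one entry per offending field.
        for item in detail:
            location = ".".join(str(part) for part in item.get("loc", []))
            problems.append((location, item.get("msg", "")))
    return problems
```

An empty list means the body matched neither shape, in which case logging the raw response is the safest fallback.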
The datasource has no property definitions, but the document includes custom_properties. Define properties on the datasource first.
UNKNOWN_PROPERTY: A custom_properties entry has a name that does not match any property_definitions on the datasource.
INVALID_PROPERTY_TYPE: The value of a custom property does not match the declared property_type. For example, sending a string for an INTEGER property or a non-boolean for a BOOLEAN property.
INVALID_DATA_SOURCE_SUB: data_source_sub (or data_source_sub_sub when no subs are declared at all) does not match a top-level sub key of the datasource.
INVALID_DATA_SOURCE_SUB_SUB: data_source_sub_sub is not a child of the supplied data_source_sub, or is unknown across the whole datasource when sent without a parent.
AMBIGUOUS_DATA_SOURCE_SUB_SUB: data_source_sub_sub was sent without data_source_sub, but the same child key exists under multiple parent subs. Set data_source_sub explicitly.
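The three sub-reference codes follow from a small decision procedure, sketched here for illustration. It assumes the declared subs are available locally as a dict mapping each top-level key to its child keys; the function name check_subs and the order of the checks are assumptions, not the service's implementation.

```python
def check_subs(sub_tree, sub=None, sub_sub=None):
    """Return None if the pair looks valid, else the matching error code.

    sub_tree maps each top-level sub key to a list of its child keys.
    """
    if sub is not None and sub not in sub_tree:
        return "INVALID_DATA_SOURCE_SUB"
    if sub_sub is None:
        return None
    if not sub_tree:
        # A sub_sub was sent but the datasource declares no subs at all.
        return "INVALID_DATA_SOURCE_SUB"
    if sub is not None:
        return None if sub_sub in sub_tree[sub] else "INVALID_DATA_SOURCE_SUB_SUB"
    # No parent supplied: look the child up across every parent.
    parents = [key for key, children in sub_tree.items() if sub_sub in children]
    if not parents:
        return "INVALID_DATA_SOURCE_SUB_SUB"
    if len(parents) > 1:
        return "AMBIGUOUS_DATA_SOURCE_SUB_SUB"
    return None
```

With a tree such as {"engineering": ["runbooks", "guides"], "sales": ["guides"]}, sending only data_source_sub_sub "guides" is ambiguous, while "runbooks" resolves to a single parent.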
In bulk requests, validation runs per document and failures are reported in results with outcome: "failed" and a message describing the cause. The other documents in the same batch are still processed.