Skip to content

Vector Operations API


Note: To add vectors to the database, use the Transactions API. The endpoints below are for querying and retrieving vectors only.


Query Vectors by DocumentId

Retrieves all vectors associated with a specific document ID in a collection. This is especially useful when you want to fetch all the vectorized chunks of a document (such as a PDF split into pages or sections) for downstream processing, analysis, or display.

Endpoint: GET /vectordb/collections/{collection_id}/vectors?document_id={document_id}

URL Parameters:

ParameterTypeRequiredDescription
collection_idstringYesID of the collection containing the vectors.

Query Parameters:

ParameterTypeRequiredDescription
document_idstringYesDocument ID to query vectors for.

Response:

[
{
"id": "vector_id_1",
"document_id": "doc123",
"dense_values": [0.1, 0.2, 0.3, ...],
"sparse_values": null,
"text": "Optional text content"
},
{
"id": "vector_id_2",
"document_id": "doc123",
"dense_values": [0.4, 0.5, 0.6, ...],
"sparse_values": null,
"text": "Optional text content"
}
]

Status Codes:

CodeDescription
200Success. Returns the vectors associated with the document ID.
400Bad Request. Collection not found or other validation error.
500Server Error. Internal database or server error.

Use Case: Fetching All Vectors for a Document (Chunked PDF Example)

When working with documents such as PDFs, it’s common to split (“chunk”) the document into smaller sections such as pages, paragraphs, or sentences. Each chunk is then processed through an embedding model to generate a vector representation, and these vectors are indexed in the Cosdata vector database. All vectors generated from the same document are associated with a shared document_id.

Why is this useful?

  • Semantic Search: Enables searching or analyzing specific parts of a document at a fine-grained level.
  • Reconstruction: Allows you to retrieve all vectorized chunks for a document, which can be used to reconstruct or display the original content in context.
  • Debugging & Auditing: Useful for inspecting how a document was split and embedded, or for troubleshooting embedding/modeling issues.

Example Scenario:

Suppose you have a PDF with 3 pages. Each page is chunked and embedded, resulting in multiple vectors (one per chunk). All these vectors are stored with the same document_id (e.g., doc123). If you want to fetch all the vectors for this document (to analyze, re-embed, or display them), you can use the API below.

Example using curl:

Terminal window
curl -X GET \
"https://localhost:8443/vectordb/collections/my_collection/vectors?document_id=doc123" \
-H "Authorization: Bearer <YOUR_API_KEY>"

Replace my_collection with your collection ID, doc123 with your document ID, and <YOUR_API_KEY> with your actual API key.

Retrieves all vectors associated with a specific document ID in a collection. This is especially useful when you want to fetch all the vectorized chunks of a document (such as a PDF split into pages or sections) for downstream processing, analysis, or display.

Get Vector by ID

Retrieves a specific vector by its ID.

Endpoint: GET /vectordb/collections/{collection_id}/vectors/{vector_id}

URL Parameters:

ParameterTypeRequiredDescription
collection_idstringYesID of the collection containing the vector.
vector_idstringYesID of the vector to retrieve.

Response:

{
"id": "vector_id_1",
"document_id": "doc123",
"dense_values": [0.1, 0.2, 0.3, ...],
"sparse_values": null,
"text": "Optional text content"
}

Status Codes:

CodeDescription
200Success. Returns the requested vector.
400Bad Request. Collection not found or other validation error.
404Not Found. Vector with the specified ID not found.
500Server Error. Internal database or server error.

Check Vector Existence

Checks if a vector exists in a collection without retrieving its data.

Endpoint: HEAD /vectordb/collections/{collection_id}/vectors/{vector_id}

URL Parameters:

ParameterTypeRequiredDescription
collection_idstringYesID of the collection to check.
vector_idstringYesID of the vector to check.

Response:

No response body is returned for HEAD requests.

Status Codes:

CodeDescription
200Success. The vector exists.
404Not Found. The vector does not exist.
500Server Error. Internal database or server error.