
ZeusDB

Similarity Search

Perform similarity search to find the most similar vectors in your index.

HNSWIndex.search(
  vector: list[float] | list[list[float]] | np.ndarray,
  filter: dict[str, str] | None = None,
  top_k: int = 10,
  ef_search: int | None = None,
  return_vector: bool = False
)

Query the index using a new vector and retrieve the top-k nearest neighbors. Supports both single vector queries and batch searches with multiple vectors. You can also filter by metadata or return the original stored vectors.

Parameters

vector : list[float], list[list[float]], or np.ndarray, required

The query vector (single: list[float]) or batch of query vectors (list[list[float]] or 2D np.ndarray) to compare against the index. Must match the index dimension.
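Because the query must match the index dimension, it can help to check shapes before calling search. The sketch below is a hypothetical helper (as_query_batch is not part of ZeusDB) that accepts the three input forms listed above and verifies each vector's length, assuming you already know the index dimension dim.

```python
def as_query_batch(vector, dim):
    """Return (batch, is_single) after checking every query has length dim.

    Hypothetical helper for illustration; not part of the ZeusDB API.
    """
    # NumPy arrays expose tolist(); convert so the checks below see plain lists.
    if hasattr(vector, "tolist"):
        vector = vector.tolist()
    if len(vector) == 0:
        raise ValueError("query is empty")
    # A single query is a flat list of numbers; a batch is a list of lists.
    is_single = not isinstance(vector[0], (list, tuple))
    batch = [list(vector)] if is_single else [list(row) for row in vector]
    for row in batch:
        if len(row) != dim:
            raise ValueError(
                f"query dimension {len(row)} does not match index dimension {dim}"
            )
    return batch, is_single


batch, is_single = as_query_batch([0.1, 0.2, 0.3], dim=3)
print(len(batch), is_single)  # 1 True
```

Knowing whether the input was a single vector also tells you which return shape to expect from search.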

filter : dict[str, str] or None, default None

Optional metadata filter. Only vectors with matching key-value metadata pairs will be considered in the search.

top_k : int, default 10

Number of nearest neighbors to return for each query vector.

ef_search : int or None, default None

Search complexity parameter. Higher values improve accuracy at the cost of speed. Defaults to max(2 × top_k, 100) when not specified.
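The default can be reproduced with a one-line calculation. The helper name default_ef_search below is illustrative only and not part of ZeusDB; it simply evaluates the formula stated above.

```python
def default_ef_search(top_k):
    """Evaluate the documented default: max(2 * top_k, 100)."""
    return max(2 * top_k, 100)


for k in (10, 50, 80):
    print(k, default_ef_search(k))
# 10 100
# 50 100
# 80 160
```

By this formula, the default only rises above 100 once top_k exceeds 50.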

return_vector : bool, default False

If True, the result objects will include the original embedding vector. Useful for downstream processing like re-ranking or hybrid search.


Returns

Single Query

Returns list[dict] where each dict contains:

  • id - The vector ID

  • score - Similarity score, reported as a distance (lower = more similar)

  • metadata - Associated metadata dictionary

  • vector - Original embedding vector (only if return_vector=True)

Batch Query

Returns list[list[dict]] - a list of result lists, one for each input query vector.
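Code that accepts both return shapes can normalize them first. The following is a minimal sketch with a hypothetical top_ids helper and made-up sample values shaped like the results described above; it assumes each result list is ordered most-similar first, as in the example outputs on this page.

```python
def top_ids(results):
    """Return the best-match id per query, for single or batch results."""
    if not results:
        return []
    # A single query yields list[dict]; a batch yields list[list[dict]].
    batches = [results] if isinstance(results[0], dict) else results
    # Assumes hits are sorted most-similar first; None marks a query with no hits.
    return [hits[0]["id"] if hits else None for hits in batches]


# Illustrative data only, not real search output.
single = [
    {"id": "doc_37", "score": 0.0169, "metadata": {}},
    {"id": "doc_33", "score": 0.0199, "metadata": {}},
]
batch = [single, [{"id": "b", "score": 0.0, "metadata": {}}]]

print(top_ids(single))  # ['doc_37']
print(top_ids(batch))   # ['doc_37', 'b']
```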

Examples


🔍 Search Example 1 - Basic (Returning Top 2 most similar)

results = index.search(vector=query_vector, top_k=2)
print(results)

Output

[
  {'id': 'doc_37', 'score': 0.016932480037212372, 'metadata': {'index': '37', 'split': 'test'}}, 
  {'id': 'doc_33', 'score': 0.019877362996339798, 'metadata': {'split': 'test', 'index': '33'}}
]

🔍 Search Example 2 - Query with metadata filter

This filters on the given metadata after conducting the similarity search.

query_vector = [0.1, 0.2, 0.3, 0.1, 0.4, 0.2, 0.6, 0.7]
results = index.search(vector=query_vector, filter={"author": "Alice"}, top_k=5)
print(results)

Output

[
  {'id': 'doc_001', 'score': 0.0, 'metadata': {'author': 'Alice'}}, 
  {'id': 'doc_003', 'score': 0.0009883458260446787, 'metadata': {'author': 'Alice'}}, 
  {'id': 'doc_005', 'score': 0.0011433829786255956, 'metadata': {'author': 'Alice'}}
]
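To make the key-value matching in this example concrete, the sketch below applies an exact-match rule to a list of result dicts in plain Python. It only illustrates the semantics, using made-up sample data and a hypothetical matches helper; it does not show how ZeusDB evaluates filters internally.

```python
def matches(metadata, flt):
    """True if every filter key is present in metadata with an equal value."""
    return all(key in metadata and metadata[key] == value
               for key, value in flt.items())


# Illustrative data only, not real search output.
hits = [
    {"id": "doc_001", "score": 0.0, "metadata": {"author": "Alice"}},
    {"id": "doc_002", "score": 0.0005, "metadata": {"author": "Bob"}},
    {"id": "doc_003", "score": 0.001, "metadata": {"author": "Alice"}},
]

kept = [h for h in hits if matches(h["metadata"], {"author": "Alice"})]
print([h["id"] for h in kept])  # ['doc_001', 'doc_003']
```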

🔍 Search Example 3 - Search results include vectors

You can optionally return the stored embedding vectors alongside metadata and similarity scores by setting return_vector=True. This is useful when you need access to the raw vectors for downstream tasks such as re-ranking, inspection, or hybrid scoring.

results = index.search(vector=query_vector, filter={"split": "test"}, top_k=2, return_vector=True)
print(results)

Output

[
  {'id': 'doc_37', 'score': 0.016932480037212372, 'metadata': {'index': '37', 'split': 'test'}, 'vector': [0.36544516682624817, 0.11984539777040482, 0.7143614292144775, 0.8995016813278198]}, 
  {'id': 'doc_33', 'score': 0.019877362996339798, 'metadata': {'split': 'test', 'index': '33'}, 'vector': [0.8367619514465332, 0.6394991874694824, 0.9291712641716003, 0.9777664542198181]}
]
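As one example of downstream use, the returned vectors can be re-scored locally. The sketch below re-ranks hits by cosine distance computed in plain Python. The rerank helper, the sample hits, and the choice of cosine distance are all assumptions for illustration, since the appropriate metric depends on how the index was created.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def rerank(query, hits):
    """Sort hits (each carrying a 'vector' key) by cosine distance to query."""
    return sorted(hits, key=lambda h: cosine_distance(query, h["vector"]))


# Illustrative data only, not real search output.
hits = [
    {"id": "x", "score": 0.5, "vector": [0.0, 1.0]},
    {"id": "y", "score": 0.6, "vector": [1.0, 0.0]},
]
print([h["id"] for h in rerank([1.0, 0.1], hits)])  # ['y', 'x']
```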

🔍 Search Example 4 - Batch Search with a list of vectors

Perform a similarity search on multiple query vectors simultaneously, returning results for each query.

query_vector = [
    [0.1, 0.2, 0.3],
    [0.4, 0.5, 0.6]
]
results = index.search(vector=query_vector, top_k=3)
print(results)

Output

[
[{'id': 'a', 'score': 4.999447078546382e-09, 'metadata': {'category': 'A'}}, {'id': 'b', 'score': 0.02536815218627453, 'metadata': {'category': 'B'}}, {'id': 'c', 'score': 0.04058804363012314, 'metadata': {'category': 'A'}}],
[{'id': 'b', 'score': 4.591760305316939e-09, 'metadata': {'category': 'B'}}, {'id': 'c', 'score': 0.0018091063247993588, 'metadata': {'category': 'A'}}, {'id': 'a', 'score': 0.025368161499500275, 'metadata': {'category': 'A'}}]
]

🔍 Search Example 5 - Batch Search with NumPy Array

Perform a similarity search on multiple query vectors from a NumPy array, returning results for each query.

import numpy as np

query_vector = np.array([
    [0.1, 0.2, 0.3],
    [0.7, 0.8, 0.9]
], dtype=np.float32)

results = index.search(vector=query_vector, top_k=3)
print(results)

🔍 Search Example 6 - Batch Search with metadata filter

Perform a similarity search on multiple query vectors with a metadata filter, returning the filtered results for each query.

results = index.search(
    [[0.1, 0.2, 0.3], [0.7, 0.8, 0.9]],
    filter={"category": "A"},
    top_k=3
)
print(results)

© Copyright 2025, ZeusDB.
