Create an Index

Create a new vector index for similarity search operations.

VectorDatabase.create(
    index_type: str = "hnsw",
    dim: int = 1536,
    space: str = "cosine",
    m: int = 16,
    ef_construction: int = 200,
    expected_size: int = 10000,
    quantization_config: dict | None = None
)

Creates and initializes a new vector index with the specified configuration. The index is optimized for fast similarity search on high-dimensional vector embeddings.

Parameters

index_type : str, default "hnsw"

The vector index algorithm to use. Currently only "hnsw" (Hierarchical Navigable Small World) is supported. The value is case-insensitive.

dim : int, default 1536

Dimensionality of the vectors to be indexed. All vectors added to this index must have exactly this number of dimensions. The default of 1536 matches OpenAI’s text-embedding-ada-002 model output dimensionality.
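Because every vector must match dim exactly, it can help to check lengths before adding data. The following is a minimal sketch in plain Python; check_dimension is a hypothetical helper written for illustration and is not part of the library.

```python
def check_dimension(vector, dim=1536):
    """Return the vector unchanged, or raise if its length differs from dim.

    Hypothetical helper for illustration only; not part of the library API.
    """
    if len(vector) != dim:
        raise ValueError(f"expected {dim} dimensions, got {len(vector)}")
    return vector


# A vector with the right length passes through unchanged.
embedding = check_dimension([0.0] * 1536, dim=1536)

# A vector with the wrong length is rejected before it reaches the index.
try:
    check_dimension([0.0] * 768, dim=1536)
except ValueError as err:
    print(err)
```

Checking up front keeps a mismatch error close to the code that produced the embedding, for example when two embedding models are mixed by accident.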

space : str, default "cosine"

Distance metric used for similarity calculations during search operations. Available options:

"cosine" - Cosine similarity (recommended for normalized embeddings)
"L2" - Euclidean distance
"L1" - Manhattan distance
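To make the three metrics concrete, here is a small pure-Python sketch of the textbook definitions. It assumes cosine is reported as a distance (1 minus cosine similarity) and L2 as the non-squared Euclidean distance; the library's internal implementation and exact return values may differ.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; 0 means same direction, 1 means orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def l2_distance(a, b):
    """Euclidean (straight-line) distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def l1_distance(a, b):
    """Manhattan distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


a = [1.0, 0.0]
b = [0.0, 1.0]
print(cosine_distance(a, b))  # 1.0 (orthogonal vectors)
print(l1_distance(a, b))      # 2.0
print(l2_distance(a, b))      # square root of 2
```

Cosine distance ignores vector length and compares direction only, which is why it suits normalized embeddings. L1 and L2 are sensitive to magnitude.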

m : int, default 16

Maximum number of bi-directional connections created for each node during graph construction. Higher values improve search recall at the cost of increased memory usage and longer build times. Typical range: 8-64.

ef_construction : int, default 200

Size of the dynamic candidate list used during index construction. Larger values result in better index quality but increase build time and memory consumption. Typical range: 100-800.

expected_size : int, default 10000

Estimated number of vectors that will be added to the index. Used to pre-allocate internal data structures. This is not a hard limit; you can add more vectors than this estimate.

quantization_config : dict or None, default None

Product Quantization configuration for memory-efficient vector compression. When provided, it reduces the index's memory footprint at the cost of some search accuracy. See the Product Quantization section for detailed configuration options.
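As a rough back-of-envelope illustration of the memory saving, the sketch below compares the size of a raw vector with the size of its PQ code. It assumes vectors are stored as 32-bit floats and that each subvector is replaced by a single code of the configured bit width. It ignores codebooks, graph links, and any raw vectors the index may retain, so real memory usage will differ. The helper names are hypothetical.

```python
import math


def raw_bytes_per_vector(dim, bytes_per_component=4):
    """Size of one uncompressed vector, assuming float32 components."""
    return dim * bytes_per_component


def pq_bytes_per_vector(subvectors, bits):
    """Size of one PQ code: one `bits`-wide code per subvector."""
    return math.ceil(subvectors * bits / 8)


dim, subvectors, bits = 1536, 8, 8

raw = raw_bytes_per_vector(dim)               # 1536 * 4 = 6144 bytes
code = pq_bytes_per_vector(subvectors, bits)  # 8 * 8 / 8 = 8 bytes

print(f"raw: {raw} B, PQ code: {code} B, ratio: {raw // code}x")
print(f"each subvector covers {dim // subvectors} dimensions")
```

Fewer subvectors or fewer bits shrink the code further but leave each code representing more of the original vector, which is where the accuracy loss comes from.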

Returns

HNSWIndex

A configured vector index ready for adding data and performing similarity searches.

Examples

First, import the module and create a VectorDatabase instance

# Import the vector database module
from zeusdb import VectorDatabase

# Instantiate the VectorDatabase class
vdb = VectorDatabase()

Example 1 - Create a basic index with default settings

index = vdb.create()

Example 2 - Create an index optimized for OpenAI embeddings

index = vdb.create(
    index_type="hnsw",
    dim=1536,  # OpenAI text-embedding-3-small dimension
    space="cosine"
)

Example 3 - Create a high-precision index for larger datasets

index = vdb.create(
    dim=3072,  # OpenAI text-embedding-3-large dimension
    m=32,
    ef_construction=400,
    expected_size=100000
)

Example 4 - Create a memory-optimized index with quantization

index = vdb.create(
    dim=1536,
    expected_size=50000,
    quantization_config={
        'type': 'pq',
        'subvectors': 8,
        'bits': 8
    }
)