Create an Index
Create a new vector index for similarity search operations.
VectorDatabase.create(
    index_type: str = "hnsw",
    dim: int = 1536,
    space: str = "cosine",
    m: int = 16,
    ef_construction: int = 200,
    expected_size: int = 10000,
    quantization_config: dict | None = None
)
Creates and initializes a new vector index with the specified configuration. The index is optimized for fast similarity search on high-dimensional vector embeddings.
Parameters
- index_type : str, default "hnsw"
The type of vector index algorithm to create. Currently, only "hnsw" (Hierarchical Navigable Small World) is supported. The value is case-insensitive.
- dim : int, default 1536
Dimensionality of the vectors to be indexed. All vectors added to this index must have exactly this number of dimensions. The default of 1536 matches OpenAI’s text-embedding-ada-002 model output dimensionality.
- space : str, default "cosine"
Distance metric used for similarity calculations during search operations. Available options:
  - "cosine": Cosine similarity (recommended for normalized embeddings)
  - "L2": Euclidean distance
  - "L1": Manhattan distance
- m : int, default 16
Maximum number of bi-directional connections created for each node during graph construction. Higher values improve search recall at the cost of increased memory usage and longer build times. Typical range: 8-64.
- ef_construction : int, default 200
Size of the dynamic candidate list used during index construction. Larger values result in better index quality but increase build time and memory consumption. Typical range: 100-800.
- expected_size : int, default 10000
Estimated number of vectors that will be added to the index. Used to pre-allocate internal data structures. This is not a hard limit; you can add more vectors than this estimate.
- quantization_config : dict or None, default None
Product Quantization configuration for memory-efficient vector compression. When provided, reduces memory footprint at the cost of slight accuracy loss. See the Product Quantization section for detailed configuration options.
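To make the three space options concrete, the sketch below computes each metric for a pair of small vectors in plain Python. It is illustrative only and is not ZeusDB's internal implementation: the function names are hypothetical, and it assumes that "cosine" is reported as a distance (1 minus cosine similarity) and that "L2" is the unsquared Euclidean distance. Some libraries report similarity or squared L2 instead.

```python
import math

def cosine_distance(a, b):
    """Return 1 - cosine similarity; 0.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def l2_distance(a, b):
    """Euclidean distance: square root of the summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def l1_distance(a, b):
    """Manhattan distance: sum of the absolute differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

# Two orthogonal unit vectors (hypothetical sample data)
a = [1.0, 0.0]
b = [0.0, 1.0]

print(cosine_distance(a, b))  # 1.0
print(l2_distance(a, b))      # sqrt(2), about 1.414
print(l1_distance(a, b))      # 2.0
```

For unit-length vectors, cosine and L2 produce the same nearest-neighbor ranking, so the choice of metric matters most when embeddings are not normalized.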
Returns
HNSWIndex
A configured vector index ready for adding data and performing similarity searches.
Examples
First, import the module and create a VectorDatabase instance
# Import the vector database module
from zeusdb import VectorDatabase
# Instantiate the VectorDatabase class
vdb = VectorDatabase()
Example 1 - Create a basic index with default settings
index = vdb.create()
Example 2 - Create an index optimized for OpenAI embeddings
index = vdb.create(
    index_type="hnsw",
    dim=1536,  # OpenAI text-embedding-3-small dimension
    space="cosine"
)
Example 3 - Create a high-precision index for larger datasets
index = vdb.create(
    dim=3072,  # OpenAI text-embedding-3-large dimension
    m=32,
    ef_construction=400,
    expected_size=100000
)
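A rough back-of-the-envelope calculation shows why larger dim and m values raise memory usage. The sketch below uses a rule of thumb commonly cited for HNSW implementations: about dim * 4 bytes per vector for float32 storage, plus about 2 * m * 4 bytes per vector for base-layer graph links. Both the helper name and the storage assumptions are hypothetical, and ZeusDB's actual footprint may differ.

```python
def estimate_hnsw_bytes(num_vectors, dim, m, bytes_per_float=4, bytes_per_link=4):
    """Rough HNSW memory estimate (assumed float32 vectors and 4-byte link ids)."""
    # Raw vector storage: one float per dimension
    vector_bytes = num_vectors * dim * bytes_per_float
    # Base-layer links: assumed up to 2 * m neighbors per node
    link_bytes = num_vectors * 2 * m * bytes_per_link
    return vector_bytes, link_bytes

# Same settings as the high-precision example: 100,000 vectors, dim=3072, m=32
vector_bytes, link_bytes = estimate_hnsw_bytes(100_000, 3072, 32)
print(f"vectors: {vector_bytes / 1e9:.2f} GB, links: {link_bytes / 1e6:.1f} MB")
# vectors: 1.23 GB, links: 25.6 MB
```

Under these assumptions the raw vectors dominate the footprint, which is the cost that quantization targets.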
Example 4 - Create a memory-optimized index with quantization
index = vdb.create(
    dim=1536,
    expected_size=50000,
    quantization_config={
        'type': 'pq',
        'subvectors': 8,
        'bits': 8
    }
)
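To get a feel for the savings, the sketch below estimates the per-vector size before and after compression. It assumes the standard Product Quantization meaning of the settings above: 'subvectors' is the number of sub-spaces the vector is split into, and 'bits' is the code size per sub-space. The helper is hypothetical and ignores codebook storage and any raw vectors an implementation may keep, so treat the result as an upper bound on the compression ratio, not a measurement of ZeusDB.

```python
import math

def pq_footprint(dim, subvectors, bits, bytes_per_float=4):
    """Return (raw_bytes, code_bytes) per vector under standard PQ assumptions."""
    # Standard PQ splits the vector into equal-sized sub-spaces
    if dim % subvectors != 0:
        raise ValueError("dim must be divisible by subvectors")
    raw_bytes = dim * bytes_per_float              # uncompressed float32 vector
    code_bytes = math.ceil(subvectors * bits / 8)  # one code per sub-space
    return raw_bytes, code_bytes

raw, code = pq_footprint(dim=1536, subvectors=8, bits=8)
print(raw, code, raw // code)  # 6144 8 768
```

Fewer subvectors or fewer bits shrink the codes further but typically lose more accuracy, so these values are usually tuned against recall on a sample of real queries.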