Logging#

ZeusDB Vector Database includes enterprise-grade structured logging that works automatically out of the box while providing extensive customization for advanced users.

The logging system is designed to be invisible when you don’t need it and powerful when you do. Most users will never need to configure anything, while enterprise users get full control over observability.

Basic Usage - it just works!#

For most users, logging works automatically out of the box. No need to do anything.

from zeusdb import VectorDatabase
# Logging is automatically configured - no setup required!

vdb = VectorDatabase()
index = vdb.create("hnsw", dim=1536)

# Operations are automatically logged with structured data
result = index.add({"vectors": vectors, "ids": ids})
results = index.search(query_vector, top_k=5)

What you get automatically:

  • Silent by default - Only errors and warnings in production

  • Environment detection - Appropriate defaults for dev/prod/testing

  • Structured JSON logs in production environments

  • Human-readable logs in development environments

  • Performance timing on all operations

  • Cross-platform compatibility

Smart Environment Detection#

This amazing feature automatically detects where your code is running and applies appropriate logging defaults!

  • 🏭 Production (ENVIRONMENT=production): ERROR level, JSON format, often file output

  • 💻 Development (ENVIRONMENT=development): WARNING level, human format, console output

  • 🧪 Testing (pytest, PYTEST_CURRENT_TEST): CRITICAL level, minimal output

  • 📓 Jupyter (JUPYTER_SERVER_ROOT): INFO level, human format, clean output

  • 🔄 CI/CD (CI, GITHUB_ACTIONS): WARNING level, human format for readability

How Environment Detection Works#

The system automatically checks for these indicators:

Environment

Detection Method

What It Finds

Testing

PYTEST_CURRENT_TEST

pytest automatically sets this

Testing

'pytest' in sys.modules

pytest imported

Jupyter

JUPYTER_SERVER_ROOT

Jupyter server running

Jupyter

JPY_PARENT_PID

Jupyter kernel process

Jupyter

'IPython' in sys.modules

IPython/Jupyter imported

CI/CD

CI=true

Most CI systems set this

CI/CD

GITHUB_ACTIONS=true

GitHub Actions

CI/CD

GITLAB_CI=true

GitLab CI

Production

KUBERNETES_SERVICE_HOST

Running in Kubernetes

Production

DOCKER_CONTAINER

Running in Docker

Explicit

ENVIRONMENT=production

User explicitly set

Real-World Examples#

🧪 In pytest:

$ pytest test_my_app.py
# Environment detected: 'testing'
# Applied: CRITICAL level, no console output, minimal format
# Result: Your tests run clean without log spam!

📓 In Jupyter:

# Cell 1
import zeusdb
# Environment detected: 'jupyter' (via JUPYTER_SERVER_ROOT)
# Applied: INFO level, human format, clean timestamps
# Result: Nice readable logs for exploration!

# Cell 2  
vdb = zeusdb.VectorDatabase()
# Logs: "Index created: dim=1536, vectors=0" (clean, readable)

🏭 In Docker production:

$ docker run my-app
# Environment detected: 'production' (via KUBERNETES_SERVICE_HOST)
# Applied: ERROR level, JSON format, structured output
# Result: Production-ready structured logs!

💻 On your laptop:

$ python my_script.py
# Environment detected: 'development' (default)
# Applied: WARNING level, human format, console output
# Result: Clean development experience!

Test Environment Detection Yourself#

# See what environment is detected
import zeusdb.logging_config as lc
env = lc._detect_environment()
print(f"Detected environment: {env}")

config = lc._get_smart_defaults(env)
print(f"Applied config: {config}")

Override Environment Detection#

import os

# Force production mode
os.environ['ENVIRONMENT'] = 'production'
import zeusdb  # Will use ERROR level, JSON format

# Force development mode  
os.environ['ENVIRONMENT'] = 'development'  
import zeusdb  # Will use WARNING level, human format

Intermediate Usage (Environment Variables)#

Control logging behavior with environment variables.

Quick Development Debugging

export ZEUSDB_LOG_LEVEL=debug
python your_app.py

Production JSON Logging

export ZEUSDB_LOG_LEVEL=error
export ZEUSDB_LOG_FORMAT=json
export ZEUSDB_LOG_TARGET=file
export ZEUSDB_LOG_FILE=/var/log/zeusdb/app.log
python your_app.py

Machine Learning Pipeline Debugging

# For ML workflows where you want detailed progress tracking
export ZEUSDB_LOG_LEVEL=info
export ZEUSDB_LOG_FORMAT=human
export ZEUSDB_LOG_CONSOLE=true
python train_embeddings.py

# Example output you'll see:
# 2025-01-15 14:30:15 - INFO - Starting PQ training: 5000 vectors
# 2025-01-15 14:30:19 - INFO - PQ training completed: 4.2s (compression: 192x)
# 2025-01-15 14:30:20 - INFO - Vector addition completed: 10000 vectors in 2.1s

Environment Variables Reference#

Variable

Options

Default

Description

ZEUSDB_LOG_LEVEL

trace, debug, info, warning, error, critical

warning (dev), error (prod)

Controls log verbosity

ZEUSDB_LOG_FORMAT

human, json

human (dev), json (prod)

Output format

ZEUSDB_LOG_TARGET

stdout, stderr, file

stderr

Where logs go

ZEUSDB_LOG_FILE

/path/to/file.log

zeusdb.log

Log file path (if target=file)

ZEUSDB_LOG_CONSOLE

true, false

Auto-detected

Force console output


Advanced Usage (Programmatic Control)#

For enterprise environments with existing logging infrastructure.

Option 1: Disable Auto-Configuration#

import os
os.environ["ZEUSDB_DISABLE_AUTO_LOGGING"] = "1"

# Now configure your own logging before importing ZeusDB
import logging
logging.basicConfig(level=logging.INFO, format='%(message)s')

from zeusdb_vector_database import VectorDatabase  # Will respect your existing logging setup

Option 2: Programmatic Initialization#

import os
os.environ["ZEUSDB_DISABLE_AUTO_LOGGING"] = "1"

import zeusdb

# Initialize with JSON to console 
success = zeusdb.init_logging(level="info")

# OR initialize with file logging
success = zeusdb.init_file_logging(
    log_dir="/var/log/myapp",
    level="debug", 
    file_prefix="zeusdb"
)

# Then use normally
vdb = zeusdb.VectorDatabase()

Option 3: Custom Logger Integration#

import logging
import os

# Disable auto-configuration
os.environ["ZEUSDB_DISABLE_AUTO_LOGGING"] = "1"

# Set up your own logger first
logger = logging.getLogger("myapp.zeusdb")
logger.setLevel(logging.INFO)

# Configure Rust logging to match
os.environ["ZEUSDB_LOG_LEVEL"] = "info"
os.environ["ZEUSDB_LOG_FORMAT"] = "json"

from zeusdb import VectorDatabase
# ZeusDB will integrate with your logging setup

📊 Log Output Examples#

Human-Readable (Development)#

2025-01-15 10:30:15 - zeusdb.vector - INFO - Index created: dim=1536, vectors=0
2025-01-15 10:30:16 - zeusdb.vector - INFO - Added 1000 vectors in 45ms
2025-01-15 10:30:16 - zeusdb.vector - DEBUG - Search completed: 5 results in 2ms

Structured JSON (Production)#

{"timestamp":"2025-01-15T10:30:15.123Z","level":"INFO","operation":"index_creation","dim":1536,"space":"cosine","duration_ms":12}
{"timestamp":"2025-01-15T10:30:16.456Z","level":"INFO","operation":"vector_addition","total_inserted":1000,"duration_ms":45}
{"timestamp":"2025-01-15T10:30:16.789Z","level":"DEBUG","operation":"search_complete","results_count":5,"duration_ms":2}

🔍 Monitoring and Observability#

Key Metrics to Monitor#

  • operation: Type of operation (index_creation, vector_addition, search, etc.)

  • duration_ms: Performance timing for all operations

  • total_inserted, total_errors: Success/failure rates

  • compression_ratio: Memory efficiency with quantization

  • training_progress: Quantization training status

Production Alerting Examples#

# Monitor error rates
grep '"level":"ERROR"' /var/log/zeusdb/app.log | wc -l

# Track performance degradation  
grep '"operation":"search"' /var/log/zeusdb/app.log | jq '.duration_ms' | avg

# Watch quantization training
grep '"operation":"pq_training"' /var/log/zeusdb/app.log | tail -f

🛠️ Troubleshooting#

Common Issues#

Logs not appearing?

# Check if auto-logging is disabled
echo $ZEUSDB_DISABLE_AUTO_LOGGING

# Verify log level
ZEUSDB_LOG_LEVEL=debug python -c "import zeusdb; print('Logging active')"

File logging not working?

# Check permissions
ls -la /path/to/log/directory

# Test with console first
ZEUSDB_LOG_TARGET=stderr ZEUSDB_LOG_LEVEL=info python your_app.py

Want to see Rust logs specifically?

# Enable trace level to see all Rust operations
ZEUSDB_LOG_LEVEL=trace python your_app.py

Performance Impact#

  • Minimal overhead: Structured logging adds <1% performance impact

  • Async file writing: File logging doesn’t block operations

  • Smart buffering: Logs are efficiently batched for performance

Best Practices#

Development#

export ZEUSDB_LOG_LEVEL=debug
export ZEUSDB_LOG_FORMAT=human

Staging#

export ZEUSDB_LOG_LEVEL=info
export ZEUSDB_LOG_FORMAT=json
export ZEUSDB_LOG_TARGET=file
export ZEUSDB_LOG_FILE=logs/zeusdb-staging.log

Production#

export ENVIRONMENT=production
export ZEUSDB_LOG_LEVEL=error  
export ZEUSDB_LOG_FORMAT=json
export ZEUSDB_LOG_TARGET=file
export ZEUSDB_LOG_FILE=/var/log/zeusdb/production.log

Logging stays out of the way when you don’t need it, but delivers full power and flexibility when you do. Most users never need to touch the settings, while enterprise teams can fine-tune every aspect of observability.