Alex Garcia is a developer focused on making vector search accessible and practical. As he puts it: "I'm a SQLite guy. I use SQLite for a lot of projects... I want an easier vector search thing that I don't have to install 10,000 dependencies to use."
Core Mantra: "Simple, Local, Scalable"
Why sqlite-vec?
"I didn't go along thinking, 'Oh, I want to build vector search, let me find a database for it.' It was much more like: I use SQLite for a lot of projects, I want something lightweight that works in my current workflow."
sqlite-vec builds on SQLite's row-oriented storage with some key design choices:
- Vectors are stored in large chunks (megabytes) as blobs
- Data is split across 4KB SQLite pages, which affects analytical performance
- Currently uses brute force linear search without ANN indexing
- Supports binary quantization for 32x size reduction
- Handles tens to hundreds of thousands of vectors efficiently
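Since search is currently a brute-force linear scan over blob-stored vectors, the core idea can be sketched in plain Python with the standard `sqlite3` module. This is an illustrative sketch, not sqlite-vec's actual implementation; the table and column names are hypothetical:

```python
import sqlite3
import struct
import math

def pack(vec):
    # Serialize floats as a little-endian float32 blob, the same
    # compact layout used for vector storage.
    return struct.pack(f"<{len(vec)}f", *vec)

def unpack(blob):
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE docs(id INTEGER PRIMARY KEY, embedding BLOB)")
db.executemany(
    "INSERT INTO docs(id, embedding) VALUES (?, ?)",
    [(1, pack([1.0, 0.0])), (2, pack([0.0, 1.0])), (3, pack([0.7, 0.7]))],
)

def search(query, k=2):
    # Brute-force linear scan: score every row, sort, take top-k.
    rows = db.execute("SELECT id, embedding FROM docs").fetchall()
    scored = [(cosine(query, unpack(blob)), rid) for rid, blob in rows]
    scored.sort(reverse=True)
    return [rid for _, rid in scored[:k]]

print(search([1.0, 0.1]))  # → [1, 3]
```

With no ANN index, every query touches every vector, which is exactly why the practical limits below are framed in scan time per vector count.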
Practical limits:
- 500ms search time for 500K vectors (768 dimensions)
- For a good user experience, aim for search times under 100ms
- Binary quantization enables scaling to ~1M vectors
- Metadata filtering and partitioning coming soon
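The 32x figure follows directly from the representation: a float32 component takes 4 bytes, while binary quantization keeps one bit per component. A quick check for 768 dimensions:

```python
dims = 768
float32_bytes = dims * 4   # 4 bytes per float32 component
binary_bytes = dims // 8   # 1 bit per component, packed into bytes
print(float32_bytes, binary_bytes, float32_bytes // binary_bytes)
# → 3072 96 32
```

At that rate, 1M quantized 768-dimensional vectors fit in roughly 96 MB, which is what makes the ~1M scale plausible.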
Key advantages:
- Fast writes for transactional workloads
- Simple single-file database
- Easy integration with existing SQLite applications
- Leverages SQLite's mature storage engine
Garcia's preferred tools for local AI:
- Sentence Transformers models converted to GGUF format
- Llama.cpp for inference
- Small models (30MB) for basic embeddings
- Larger models like Arctic Embed (hundreds of MB) for recent topics
- sqlite-lembed extension for text embeddings
- Transformers.js for browser-based implementations
1. Choose Your Storage
"There's two ways of storing vectors within sqlite-vec. One way is a manual way where you just store a JSON array... [second is] using a virtual table."
- Traditional row storage: Simple, flexible, good for small vectors
- Virtual table storage: Optimized chunks, better for large datasets
- Performance sweet spot: Up to 500K vectors with 500ms search time
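The "manual" approach Garcia describes needs nothing beyond plain SQLite: the vector is just a JSON array in an ordinary column. A minimal sketch (table and column names are hypothetical):

```python
import sqlite3
import json

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE items(id INTEGER PRIMARY KEY, embedding TEXT)")

# Manual storage: the vector is a JSON array in a regular TEXT column.
db.execute("INSERT INTO items VALUES (?, ?)", (1, json.dumps([0.1, 0.2, 0.3])))

row = db.execute("SELECT embedding FROM items WHERE id = 1").fetchone()
vec = json.loads(row[0])
print(vec)  # → [0.1, 0.2, 0.3]
```

The virtual-table alternative instead declares a dedicated table through the extension (sqlite-vec's `vec0` module), which stores vectors in the optimized chunked layout described above.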
2. Optimize Performance
"With binary quantization it's 1/32 of the space... and holds up at 95 percent quality"
- Binary quantization reduces storage 32x with 95% quality
- Default page size is 4KB - plan your vector storage accordingly
- Metadata filtering dramatically improves search speed
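Binary quantization keeps only the sign bit of each component, and similarity becomes a Hamming distance over the packed bits. A pure-Python sketch of the idea (illustrative only, not sqlite-vec's internal code):

```python
def binarize(vec):
    # Keep one sign bit per component: 1 if positive, else 0.
    bits = 0
    for x in vec:
        bits = (bits << 1) | (1 if x > 0 else 0)
    return bits

def hamming(a, b):
    # Number of differing bits; lower means more similar.
    return bin(a ^ b).count("1")

q = binarize([0.9, -0.1, 0.4, -0.7])    # query → bits 1010
d1 = binarize([0.8, -0.2, 0.5, -0.6])   # same signs as q
d2 = binarize([-0.8, 0.2, 0.5, -0.6])   # two flipped signs
print(hamming(q, d1), hamming(q, d2))   # → 0 2
```

Bitwise XOR plus popcount is far cheaper than floating-point dot products, which is why quantized scans hold up at much larger vector counts.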
3. Integration Patterns
"It's a single file, right? So you can like copy and paste it if you want to make a backup."
- Two storage approaches: manual columns or virtual tables
- Easy backups: single file database
- Cross-platform: desktop, mobile, IoT, browser (via WASM)
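Because the whole database is one file, backing it up can be a simple file copy; for a live database, Python's standard `sqlite3` backup API does the same thing safely. A small sketch (in practice `dest` would be a file path rather than `:memory:`):

```python
import sqlite3

src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE docs(id INTEGER PRIMARY KEY, body TEXT)")
src.execute("INSERT INTO docs VALUES (1, 'hello')")
src.commit()

# Online backup: copies the entire database, safe while in use.
dest = sqlite3.connect(":memory:")
src.backup(dest)

print(dest.execute("SELECT body FROM docs").fetchone()[0])  # → hello
```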
4. Real-World Tips
"I typically choose the really small model... it's 30 megabytes. It quantizes very easily... I like it because it's very small, quick and easy."
- Start with smaller, efficient models (30MB range)
- Use binary quantization before trying complex solutions
- Plan for partitioning when scaling beyond 100K vectors