Show HN: VectorVFS, your filesystem as a vector database
VectorVFS: Your Filesystem as a Vector Database
VectorVFS is a lightweight Python package that transforms your Linux filesystem into a vector database by
leveraging the native VFS (Virtual File System) extended attributes. Rather than maintaining a separate
index or external database, VectorVFS stores vector embeddings directly alongside each file—turning your
existing directory structure into an efficient and semantically searchable embedding store.
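As a rough illustration of the idea (not VectorVFS's own API), the sketch below packs a vector into bytes and attaches it to a file using Python's standard `os.setxattr` and `os.getxattr`, which are available on Linux only. The attribute name `user.example.embedding` and the float32 layout are assumptions made for this example; VectorVFS may use a different name and encoding.

```python
import os
import struct

# Hypothetical attribute name for this sketch; VectorVFS may use another one.
XATTR_NAME = "user.example.embedding"


def pack_embedding(vector):
    """Serialize a sequence of floats as little-endian float32 bytes."""
    return struct.pack(f"<{len(vector)}f", *vector)


def unpack_embedding(blob):
    """Inverse of pack_embedding: turn the bytes back into a list of floats."""
    count = len(blob) // 4  # each float32 occupies 4 bytes
    return list(struct.unpack(f"<{count}f", blob))


def store_embedding(path, vector):
    """Attach the embedding to the file as an extended attribute (Linux only)."""
    os.setxattr(path, XATTR_NAME, pack_embedding(vector))


def load_embedding(path):
    """Read the embedding back; raises OSError if the attribute is missing."""
    return unpack_embedding(os.getxattr(path, XATTR_NAME))
```

Calling `store_embedding("photo.jpg", vector)` followed by `load_embedding("photo.jpg")` would round-trip the vector, provided the underlying filesystem supports user extended attributes and the packed value fits within that filesystem's xattr size limit.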
VectorVFS supports Meta’s Perception Encoders (PE) [arxiv], which include image and video
encoders for vision-language understanding and outperform InternVL3, Qwen2.5VL, and SigLIP2
on zero-shot image tasks. We support both CPU and GPU, but if you have a large collection of
images, embedding all of them for the first time may take a while without a GPU.
Key Features
- Zero-overhead indexing
  Embeddings are stored as extended attributes (xattrs) on each file, eliminating the need for external
  index files or services.
- Seamless retrieval
  Perform searches across your filesystem, retrieving files by embedding similarity.
- Flexible embedding support
  Plug in any embedding model, from pre-trained transformers to custom feature extractors, and let
  VectorVFS handle storage and lookup.
- Lightweight and portable
  Built on native Linux VFS functionality, VectorVFS requires no additional daemons, background
  processes or databases.
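To make retrieval by embedding similarity concrete, here is a minimal sketch of a brute-force search: it walks a directory tree, reads a vector from each file's extended attributes, and ranks files by cosine similarity to a query vector. This is an assumption about how such a lookup could work, not VectorVFS's actual implementation; the attribute name `user.example.embedding`, the float32 encoding, and the function names are all hypothetical.

```python
import math
import os
import struct

# Hypothetical attribute name for this sketch; VectorVFS may use another one.
XATTR_NAME = "user.example.embedding"


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def read_embedding(path):
    """Return the stored vector, or None if the file carries no usable embedding."""
    try:
        blob = os.getxattr(path, XATTR_NAME)  # Linux only
    except OSError:
        return None
    if len(blob) % 4:  # not a whole number of float32 values
        return None
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


def rank(query, candidates, top_k=5):
    """Score (path, vector) pairs against the query and return the best top_k."""
    scored = [(cosine_similarity(query, vec), path) for path, vec in candidates]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]


def search(root, query, top_k=5):
    """Walk root and rank every file that has an embedding of matching length."""
    def candidates():
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                vec = read_embedding(path)
                if vec is not None and len(vec) == len(query):
                    yield path, vec

    return rank(query, candidates(), top_k)
```

A call such as `search("/home/me/photos", query_vector, top_k=10)` would return `(score, path)` pairs with the highest score first. Because there is no separate index, this sketch does a linear scan over every file, so its cost grows with the number of files under `root`.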
Source: vectorvfs.readthedocs.io