Inline filtering for vector sets #1890
Merged
c42adbb to 64a83a7
64a83a7 to 56d244c
Pull request overview
This PR updates Garnet’s vector set similarity search (VSIM) to support inline filtering via a DiskANN → C# callback (instead of relying on over-retrieval + post-filtering), and updates the public docs/tests around filtering and FILTER-EF.
Changes:
- Adds an unmanaged inline-filter callback wiring from DiskANN (Rust) into Garnet (C#) and threads per-query compiled filter state via `[ThreadStatic]`.
- Changes `FILTER-EF` semantics/limits (default `16`, range `[4, 256]`) and updates validation + tests accordingly.
- Introduces a documented binary attribute encoding/extraction path intended to accelerate filter evaluation, and adds a new design doc describing the end-to-end approach.
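
To make the first two bullets concrete, here is a minimal Rust sketch of the general shape: a search loop that asks a callback about each candidate as it is visited, with the per-query filter state held in thread-local storage. It is not Garnet's or DiskANN's actual code. Every name in it (`InlineFilter`, `year_filter`, `set_query_filter`, `search_inline`, `search_post_filter`, `parse_filter_ef`, the `YEARS` table) is hypothetical, the ranked slice stands in for a real graph traversal, `thread_local!` stands in for C#'s `[ThreadStatic]`, and it assumes out-of-range `FILTER-EF` values are rejected rather than clamped. Only the default of 16 and the range [4, 256] come from the overview above.

```rust
use std::cell::Cell;

// Limits taken from the PR overview; the constant names are hypothetical.
const FILTER_EF_DEFAULT: u32 = 16;
const FILTER_EF_MIN: u32 = 4;
const FILTER_EF_MAX: u32 = 256;

/// Validate an optional FILTER-EF argument.
/// Assumption: out-of-range values are rejected, not clamped.
fn parse_filter_ef(arg: Option<u32>) -> Result<u32, String> {
    match arg {
        None => Ok(FILTER_EF_DEFAULT),
        Some(v) if (FILTER_EF_MIN..=FILTER_EF_MAX).contains(&v) => Ok(v),
        Some(v) => Err(format!(
            "FILTER-EF {} is outside [{}, {}]",
            v, FILTER_EF_MIN, FILTER_EF_MAX
        )),
    }
}

/// Hypothetical callback shape: the search engine passes a candidate id and
/// the host answers whether that candidate satisfies the query's filter.
type InlineFilter = extern "C" fn(u32) -> bool;

thread_local! {
    // Per-query filter state, one copy per thread. A single threshold is
    // used here in place of a compiled filter program.
    static MIN_YEAR: Cell<u32> = Cell::new(0);
}

// Toy attribute table indexed by candidate id (illustrative data only).
const YEARS: [u32; 6] = [1999, 2010, 2021, 2005, 2023, 2018];

/// Install the filter state for the query running on the current thread.
fn set_query_filter(min_year: u32) {
    MIN_YEAR.with(|m| m.set(min_year));
}

/// Callback that reads the thread-local state and evaluates one candidate.
extern "C" fn year_filter(id: u32) -> bool {
    let min = MIN_YEAR.with(|m| m.get());
    YEARS.get(id as usize).map_or(false, |&year| year >= min)
}

/// Inline filtering: test each candidate while walking them in distance
/// order, and stop as soon as k passing results have been collected.
fn search_inline(ranked: &[u32], k: usize, filter: InlineFilter) -> Vec<u32> {
    ranked.iter().copied().filter(|&id| filter(id)).take(k).collect()
}

/// Over-retrieval + post-filtering: fetch k * over candidates first, then
/// filter. This can return fewer than k results even when more exist.
fn search_post_filter(ranked: &[u32], k: usize, over: usize, filter: InlineFilter) -> Vec<u32> {
    ranked
        .iter()
        .copied()
        .take(k * over)
        .filter(|&id| filter(id))
        .take(k)
        .collect()
}
```

With candidates ranked `[0, 3, 1, 5, 2, 4]` and a threshold of 2015, the inline version returns the two nearest passing ids, while the post-filtering version with a 2x over-retrieval examines only the first four candidates and finds one. That gap is the motivation the overview gives for moving the filter into the search itself.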
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| website/docs/dev/filtered-search-design.md | New end-to-end design doc for filtered vector search and inline filtering. |
| website/docs/commands/vector-sets.md | Updates VSIM option docs for FILTER-EF and inline filtering behavior. |
| test/standalone/Garnet.test.vectorset/RespVectorSetTests.cs | Adds/updates VSIM filter validation tests and new “bad filter” cases. |
| test/standalone/Garnet.test.extensions/DiskANN/DiskANNServiceTests.cs | Updates DiskANN index creation tests for the new callback parameter. |
| libs/server/Resp/Vector/VectorManager.Migration.cs | Passes the new inline filter callback when recreating indexes. |
| libs/server/Resp/Vector/VectorManager.Locking.cs | Passes the new inline filter callback when creating/recreating indexes. |
| libs/server/Resp/Vector/VectorManager.Filter.cs | Adds thread-static inline filter state + candidate evaluation logic. |
| libs/server/Resp/Vector/VectorManager.cs | Switches VSIM paths toward inline filtering setup and bitmap sizing helper. |
| libs/server/Resp/Vector/VectorManager.Callbacks.cs | Adds the unmanaged callback entrypoint to call into filter evaluation. |
| libs/server/Resp/Vector/VectorFilterExpression.cs | Simplifies ExprProgram by removing redundant length fields. |
| libs/server/Resp/Vector/RespServerSessionVectors.cs | Updates FILTER-EF parsing/validation defaults and bounds. |
| libs/server/Resp/Vector/ExprRunner.cs | Iterates using program.Instructions.Length instead of removed program.Length. |
| libs/server/Resp/Vector/DiskANNService.cs | Extends create/recreate index P/Invoke signature to include filter callback. |
| libs/server/Resp/Vector/AttributeExtractor.cs | Adds binary attribute conversion/extraction APIs and minor JSON parsing cleanup. |
| Directory.Packages.props | Bumps diskann-garnet package version. |
49f0836 to 4392922
4392922 to f024774
kevin-montrose approved these changes on Jun 24, 2026
Based on Haiyang's original two-queue work, this PR rebases that work onto the quantization branch and modifies it for the new inline filtering with adaptive L.