-
Notifications
You must be signed in to change notification settings - Fork 958
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add timeout support to AbstractVectorSimilarityQuery #13285
base: main
Are you sure you want to change the base?
Conversation
This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the dev@lucene.apache.org list. Thank you for your contribution! |
This seems sane to me. @vigyasharma what do you think? |
@kaivalnp could you update CHANGES as well? |
Thanks @benwtrent! Added an entry now.. |
# Conflicts: # lucene/core/src/java/org/apache/lucene/search/ByteVectorSimilarityQuery.java # lucene/core/src/java/org/apache/lucene/search/FloatVectorSimilarityQuery.java
Saw some merge conflicts after a recent commit and resolved those.. |
Hi @benwtrent @vigyasharma could you help review this? Thanks! |
// Return a lazy-loading iterator | ||
return VectorSimilarityScorer.fromAcceptDocs( | ||
this, | ||
boost, | ||
createVectorScorer(context), | ||
new BitSetIterator(acceptDocs, cardinality), | ||
resultSimilarity); | ||
} else if (results.scoreDocs.length == 0) { | ||
return null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't return null
any more whenm there are 0 results?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh never mind I see this got moved to VectorSimilarityScorer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it was common in a couple of places so I moved it there to reduce repetition
@@ -105,13 +116,16 @@ public Scorer scorer(LeafReaderContext context) throws IOException { | |||
LeafReader leafReader = context.reader(); | |||
Bits liveDocs = leafReader.getLiveDocs(); | |||
|
|||
QueryTimeout queryTimeout = searcher.getTimeout(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm what if there is no timeout? will queryTimeout
be null? In that case do we still want to create a TimeLimitingKnnCollectorManager
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will
queryTimeout
be null?
Yes, this is null
when a timeout isn't set
In this case, the TimeLimitingKnnCollectorManager
returns an unwrapped KnnCollector
which does not add overhead of time checking (even null
checks) during graph search (also visible in benchmarks)
This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the dev@lucene.apache.org list. Thank you for your contribution! |
Description
Along similar lines of #13202, adding timeout support for
AbstractVectorSimilarityQuery
which performs similarity-based vector searchesWhile the graph search happens inside
#scorer
, it may go over the configuredQueryTimeout
and we can early terminate it to return whatever partial results are found..One inherent benefit we have for exact search is that we return a lazy-loading iterator over all vectors, so this is inherently covered by the
TimeLimitingBulkScorer
(as opposed to exact search ofAbstractKnnVectorQuery
which manually goes over all vectors to retain the topK during#rewrite
)