You can further restrict the vector similarity search by providing a filter based on specific metadata criteria.
Queries with metadata filters return only the vectors whose metadata matches the filter.
Upstash Vector allows you to filter keys which have the following value types: string, number, boolean, object, and array.
Filtering is implemented as a combination of in-filtering and post-filtering. Every query is assigned a filtering budget, which determines the number of candidate vectors that can be compared against the filter during query execution. If this budget is exceeded, the system falls back to post-filtering. Therefore, with highly selective filters, fewer than topK vectors may be returned.
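The effect of post-filtering on the result count can be illustrated with a small local simulation. This is only a sketch of the general idea, not Upstash Vector's implementation: the candidate list, the scores, and the post_filter_query helper are all hypothetical.

```python
from typing import Callable

# Hypothetical candidates, already sorted by similarity score (best first).
CANDIDATES = [
    {"id": "a", "score": 0.99, "metadata": {"country": "Turkey"}},
    {"id": "b", "score": 0.97, "metadata": {"country": "Germany"}},
    {"id": "c", "score": 0.95, "metadata": {"country": "France"}},
    {"id": "d", "score": 0.90, "metadata": {"country": "Turkey"}},
    {"id": "e", "score": 0.85, "metadata": {"country": "Japan"}},
]


def post_filter_query(candidates: list, top_k: int,
                      predicate: Callable[[dict], bool]) -> list:
    """Take the top_k most similar candidates first, then drop non-matching ones."""
    nearest = candidates[:top_k]
    return [c for c in nearest if predicate(c["metadata"])]


# A selective filter: only one of the three nearest candidates matches.
results = post_filter_query(
    CANDIDATES, top_k=3, predicate=lambda m: m["country"] == "Turkey"
)
print([r["id"] for r in results])
```

Candidate d also matches the filter, but it lies outside the three nearest candidates, so the filter never sees it and the query returns fewer than top_k results.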
A filter has a syntax that resembles SQL, which consists of operators on object keys and boolean operators to combine them.
Given metadata stored with your vectors, you can query similar vectors with a filter written against its keys.
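As an illustration, the sketch below pairs a hypothetical metadata object with a filter over its keys. The metadata values, the filter, and the run_query helper are made up for this example. The helper assumes an index object whose query method accepts vector, top_k, include_metadata, and filter keyword arguments, which is how the upstash-vector Python SDK's Index.query is understood to work. The >= spelling of the greater than or equals operator is also an assumption.

```python
# Hypothetical metadata stored alongside a vector.
metadata = {
    "city": "Istanbul",
    "country": "Turkey",
    "is_capital": False,
    "population": 15460000,
    "geography": {"continent": "Asia"},
    "economy": {
        "currency": "TRY",
        "major_industries": ["Tourism", "Textiles", "Finance"],
    },
}

# Filter combining a numeric comparison with an equality check on a nested key.
FILTER = "population >= 1000000 AND geography.continent = 'Asia'"


def run_query(index, vector, top_k=5):
    """Query `index` for the vectors most similar to `vector` that match FILTER.

    `index` is assumed to expose a query(...) method with these keyword
    arguments; it is passed in so that the sketch stays self-contained.
    """
    return index.query(
        vector=vector, top_k=top_k, include_metadata=True, filter=FILTER
    )
```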
The equals operator filters keys whose values are equal to the given literal.
It is applicable to string, number, and boolean values.
The not equals operator filters keys whose values are not equal to the given literal.
It is applicable to string, number, and boolean values.
The less than operator filters keys whose values are less than the given literal.
It is applicable to number values.
The less than or equals operator filters keys whose values are less than or equal to the given literal.
It is applicable to number values.
The greater than operator filters keys whose values are greater than the given literal.
It is applicable to number values.
The greater than or equals operator filters keys whose values are greater than or equal to the given literal.
It is applicable to number values.
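The six comparison operators above can be emulated locally to make their semantics concrete. The sketch assumes they are spelled =, !=, <, <=, >, and >=, as in SQL. The compare helper and the sample metadata are hypothetical, and rejecting a type mismatch with an error is a choice made for this sketch, not a statement about how the service behaves.

```python
import operator

# Assumed operator spellings, mapped to their Python equivalents.
OPERATORS = {
    "=": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}
ORDERING = {"<", "<=", ">", ">="}


def is_number(value) -> bool:
    # bool is a subclass of int in Python, so it is excluded explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def compare(metadata: dict, key: str, op: str, literal) -> bool:
    """Apply one comparison operator to a top-level metadata key."""
    value = metadata[key]
    if op in ORDERING and not (is_number(value) and is_number(literal)):
        raise TypeError(f"{op} is only applicable to number values")
    return OPERATORS[op](value, literal)


sample = {"country": "Turkey", "population": 15460000, "is_capital": False}
print(compare(sample, "country", "=", "Turkey"))
print(compare(sample, "population", ">=", 1000000))
```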
The glob operator filters keys whose values match the given UNIX glob pattern.
It is applicable to string values.
It is a case-sensitive operator.
The glob operator supports the following wildcards:
* matches zero or more characters.
? matches exactly one character.
[] matches one character from the list:
  [abc] matches either a, b, or c.
  [a-z] matches one character in the range from a to z.
  [^abc] matches any one character other than a, b, or c.
  [^a-z] matches any one character outside the range from a to z.
For example, the pattern ?[sz]*[^m-z] would match only city names whose second character is s or z and whose last character is anything other than m to z.
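The wildcard rules above can be emulated by translating a glob pattern into a regular expression. This is a simplified local sketch, not the matcher Upstash Vector uses; glob_to_regex and glob_match are hypothetical helpers. Python's fnmatch module is not used here because it negates character classes with ! rather than ^.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a glob pattern with *, ?, [...] and [^...] into a regex."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "*":
            parts.append(".*")  # zero or more characters
        elif ch == "?":
            parts.append(".")  # exactly one character
        elif ch == "[":
            negate = pattern.startswith("^", i + 1)
            start = i + 2 if negate else i + 1
            end = pattern.find("]", start + 1)  # require at least one member
            if end == -1:
                parts.append(re.escape(ch))  # unterminated class: literal "["
            else:
                # Escape characters that are special inside a regex class.
                members = re.sub(r"([\\\[\]^])", r"\\\1", pattern[start:end])
                parts.append("[" + ("^" if negate else "") + members + "]")
                i = end
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts), re.DOTALL)


def glob_match(value: str, pattern: str) -> bool:
    """Case-sensitive match of the whole value against the glob pattern."""
    return glob_to_regex(pattern).fullmatch(value) is not None


for city in ["Istanbul", "Osaka", "Oslo", "Izmir", "ISTANBUL"]:
    print(city, glob_match(city, "?[sz]*[^m-z]"))
```

Istanbul and Osaka match. Oslo and Izmir fail on the last character, and ISTANBUL fails because matching is case-sensitive.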
The not glob operator filters keys whose values do not match the given UNIX glob pattern.
It is applicable to string values.
It has the same properties as the glob operator.
For example, the pattern A* used with the not glob operator would match only city names whose first character is anything other than A.
The in operator filters keys whose values are equal to any of the given literals.
It is applicable to string, number, and boolean values.
Semantically, it is equivalent to applying the equals operator to each of the given literals and joining the results with the OR boolean operator.
The not in operator filters keys whose values are not equal to any of the given literals.
It is applicable to string, number, and boolean values.
Semantically, it is equivalent to applying the not equals operator to each of the given literals and joining the results with the AND boolean operator.
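The two equivalences can be made concrete by expanding the shorthand into its longer form. The helpers below are hypothetical and only build filter strings. They assume the =, !=, and IN spellings and single-quoted string literals, and they do not handle quotes inside strings.

```python
def literal(value) -> str:
    """Render a Python string or number as a filter literal."""
    if isinstance(value, bool):
        raise TypeError("boolean literals are left out of this sketch")
    if isinstance(value, str):
        return "'" + value + "'"
    return str(value)


def in_filter(key: str, values: list) -> str:
    """Shorthand form: key IN (v1, v2, ...)."""
    return f"{key} IN ({', '.join(literal(v) for v in values)})"


def expand_in(key: str, values: list) -> str:
    """Equivalent form of in: equals checks joined with OR."""
    return " OR ".join(f"{key} = {literal(v)}" for v in values)


def expand_not_in(key: str, values: list) -> str:
    """Equivalent form of not in: not equals checks joined with AND."""
    return " AND ".join(f"{key} != {literal(v)}" for v in values)


print(in_filter("country", ["Germany", "Turkey", "France"]))
print(expand_in("country", ["Germany", "Turkey", "France"]))
print(expand_not_in("population", [1000, 2000]))
```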
The contains operator filters keys whose values contain the given literal.
It is applicable to array values.
The not contains operator filters keys whose values do not contain the given literal.
It is applicable to array values.
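A local emulation of the two array operators, for illustration only. The sample metadata and the helper names are hypothetical, and the comments show how the corresponding filters might be written, assuming the CONTAINS and NOT CONTAINS keywords and single-quoted string literals.

```python
# Hypothetical metadata with an array value.
sample = {"major_industries": ["Tourism", "Textiles", "Finance"]}


def contains(metadata: dict, key: str, literal) -> bool:
    """True if the array stored under `key` has an element equal to `literal`."""
    value = metadata[key]
    if not isinstance(value, list):
        raise TypeError("contains is only applicable to array values")
    return literal in value


def not_contains(metadata: dict, key: str, literal) -> bool:
    """Negation of contains, with the same applicability rule."""
    return not contains(metadata, key, literal)


# major_industries CONTAINS 'Tourism'
print(contains(sample, "major_industries", "Tourism"))
# major_industries NOT CONTAINS 'Steel'
print(not_contains(sample, "major_industries", "Steel"))
```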
The has field operator filters keys which have the given JSON field.
The has not field operator filters keys which do not have the given JSON field.
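The field checks can be emulated by walking the metadata object. The has_field helper and the sample metadata are hypothetical. The comments assume the operators are written as HAS FIELD and HAS NOT FIELD followed by the field name, and that a nested field is referenced with a dotted path.

```python
# Hypothetical metadata with a nested object.
sample = {
    "city": "Istanbul",
    "geography": {"continent": "Asia", "coordinates": {"latitude": 41.0082}},
}


def has_field(metadata: dict, path: str) -> bool:
    """True if every step of the dotted path exists in the metadata."""
    current = metadata
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return False
        current = current[part]
    return True


# HAS FIELD geography.coordinates
print(has_field(sample, "geography.coordinates"))
# HAS NOT FIELD geography.coordinates.longitude
print(not has_field(sample, "geography.coordinates.longitude"))
```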
Operators above can be combined with the AND and OR boolean operators to form compound filters.
Boolean operators can be grouped with parentheses to have higher precedence.
When no parentheses are provided in ambiguous filters, AND has higher precedence than OR. So, a filter of the form A AND B OR C, where A, B, and C are operator expressions, is equivalent to (A AND B) OR C.
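Python's and and or follow the same precedence rule, so they can be used to check the grouping over every combination of truth values. In this sketch a, b, and c are placeholders standing in for three arbitrary operator expressions.

```python
from itertools import product


def unparenthesized(a: bool, b: bool, c: bool) -> bool:
    return a and b or c  # and binds tighter than or


def and_first(a: bool, b: bool, c: bool) -> bool:
    return (a and b) or c


def or_first(a: bool, b: bool, c: bool) -> bool:
    return a and (b or c)


combos = list(product([False, True], repeat=3))

# The unparenthesized form always agrees with the AND-first grouping...
same = all(unparenthesized(*x) == and_first(*x) for x in combos)
# ...and disagrees with the OR-first grouping for some inputs.
differs = [x for x in combos if unparenthesized(*x) != or_first(*x)]

print(same)
print(differs)
```

The two groupings differ whenever a is false and c is true, which is why parentheses are needed when the OR should be evaluated first.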
It is possible to filter nested object keys by referencing them with the . accessor.
Nested objects can be at arbitrary depths, so more than one . accessor can be used in the same identifier.
Apart from the CONTAINS and NOT CONTAINS operators, individual array elements can also be filtered by referencing them by index with the [] accessor.
Indexing is zero-based.
It is also possible to index from the back using the # character with negative values.
# can be thought of as the number of elements in the array, so [#-1] references the last element.
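The accessors described above can be emulated with a small path resolver. The resolve helper, its tokenizer, and the sample metadata are hypothetical, and the sketch only shows how an identifier maps to a value, not how the service evaluates it.

```python
import re

# One step of a path: a key (optionally preceded by "."), or an index
# such as [0] or [#-1].
TOKEN = re.compile(r"\.?([a-zA-Z_][a-zA-Z_0-9]*)|\[(#-)?(\d+)\]")

# Hypothetical metadata with nested objects and an array.
sample = {
    "geography": {"coordinates": {"latitude": 41.0082}},
    "economy": {"major_industries": ["Tourism", "Textiles", "Finance"]},
}


def resolve(metadata: dict, path: str):
    """Follow the . and [] accessors in `path` and return the referenced value."""
    current = metadata
    pos = 0
    while pos < len(path):
        match = TOKEN.match(path, pos)
        if match is None:
            raise ValueError(f"cannot parse {path!r} at position {pos}")
        name, from_back, index = match.groups()
        if name is not None:
            current = current[name]  # nested object key
        elif from_back:
            current = current[len(current) - int(index)]  # "#" is the array length
        else:
            current = current[int(index)]  # zero-based index
        pos = match.end()
    return current


print(resolve(sample, "geography.coordinates.latitude"))
print(resolve(sample, "economy.major_industries[0]"))
print(resolve(sample, "economy.major_industries[#-1]"))
```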
Identifiers should be of the form [a-zA-Z_][a-zA-Z_0-9.[\]#-]*. In simpler terms, they should start with a character from the English alphabet or _, and can continue with the same characters plus numbers and accessors like ., [0], or [#-1].
Boolean literals can be written as 1 or 0.
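The identifier pattern above can be used directly as a client-side sanity check before sending a filter. The is_valid_identifier helper is hypothetical. The pattern is the one given above, with the inner [ escaped for Python's re module, which does not change what it matches.

```python
import re

# Identifier pattern from the text, with the inner "[" escaped.
IDENTIFIER = re.compile(r"[a-zA-Z_][a-zA-Z_0-9.\[\]#-]*")


def is_valid_identifier(name: str) -> bool:
    """True if the whole string matches the identifier pattern."""
    return IDENTIFIER.fullmatch(name) is not None


for name in ["city", "geography.coordinates.latitude",
             "economy.major_industries[#-1]", "1st_city", "city name"]:
    print(name, is_valid_identifier(name))
```

The pattern is a loose check: it only restricts which characters may appear, so it also accepts strings with unbalanced brackets.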