Query parser fails to parse a range query string when there are escaped brackets inside the range #13234

Open
marko-bekhta opened this issue Mar 28, 2024 · 2 comments

@marko-bekhta

Description

Assume a query parser has been created, e.g.:

Analyzer analyzer = new ClassicAnalyzer();
QueryParser queryParser = new QueryParser( "field", analyzer );

Parsing a simple range such as:

// simple range query with no escapes works fine:
Query query = queryParser.parse( "[ 1 TO 10 ]" );

works as expected: a TermRangeQuery is created and no exception is thrown.

But if a term inside the range contains escaped brackets, parsing fails with an exception.
Suppose the query parser is extended to work with date-time fields and needs to parse something like:

// another range query but now it has some escaping between the range brackets:
query = queryParser.parse( "[ 2024\\-01\\-01T01\\:01\\:01\\+01\\:00\\[Europe\\/Warsaw\\] TO 2025\\-01\\-01T01\\:01\\:01\\+01\\:00\\[Europe\\/Warsaw\\] ]" );

With all special characters escaped, this leads to:

Caused by: org.apache.lucene.queryparser.classic.ParseException: Encountered " "]" "] "" at line 1, column 50.
Was expecting:
    "TO" ...
    
	at org.apache.lucene.queryparser.classic.QueryParser.generateParseException(QueryParser.java:1004)
	at org.apache.lucene.queryparser.classic.QueryParser.jj_consume_token(QueryParser.java:867)
	at org.apache.lucene.queryparser.classic.QueryParser.Term(QueryParser.java:532)
	at org.apache.lucene.queryparser.classic.QueryParser.Clause(QueryParser.java:366)
	at org.apache.lucene.queryparser.classic.QueryParser.Query(QueryParser.java:251)
	at org.apache.lucene.queryparser.classic.QueryParser.TopLevelQuery(QueryParser.java:223)
	at org.apache.lucene.queryparser.classic.QueryParserBase.parse(QueryParserBase.java:137)
	... 4 more

Note that the idea for supporting date range queries is to do something along these lines:

QueryParser queryParser = new QueryParser( "field", analyzer ) {
	@Override
	protected Query newRangeQuery(String field, String part1, String part2, boolean startInclusive, boolean endInclusive) {
		var p1 = parseValue(part1);
		var p2 = parseValue(part2);

		return createRangeQueryForDates( field, p1, p2, startInclusive, endInclusive );
	}
};

but because of the parsing error described above, execution never reaches this point.
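
For illustration, the `parseValue` helper in the snippet above could be backed by `java.time`. This is only a minimal sketch: it assumes each range bound arrives as an unescaped ISO-8601 string with a bracketed zone ID, and that a `null` bound means an open-ended range. The class name `DateRangeValues` is made up for this example, and `createRangeQueryForDates` is left out because it depends on how the field is indexed.

```java
import java.time.ZonedDateTime;
import java.time.format.DateTimeParseException;

public class DateRangeValues {

    // Hypothetical helper: turns one range bound into a ZonedDateTime.
    // Assumes the bound is already unescaped, e.g.
    // 2024-01-01T01:01:01+01:00[Europe/Warsaw]
    static ZonedDateTime parseValue(String part) {
        if (part == null) {
            // Assumption: a missing bound stands for an open-ended range.
            return null;
        }
        try {
            // ZonedDateTime.parse uses DateTimeFormatter.ISO_ZONED_DATE_TIME.
            return ZonedDateTime.parse(part);
        } catch (DateTimeParseException e) {
            throw new IllegalArgumentException("Not a zoned date-time: " + part, e);
        }
    }

    public static void main(String[] args) {
        ZonedDateTime from = parseValue("2024-01-01T01:01:01+01:00[Europe/Warsaw]");
        ZonedDateTime to = parseValue("2025-01-01T01:01:01+01:00[Europe/Warsaw]");
        // Print both bounds as UTC instants.
        System.out.println(from.toInstant() + " .. " + to.toInstant());
    }
}
```

Wrapping the `DateTimeParseException` keeps the failure message tied to the offending bound, which may help when the value comes from user input.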

Version and environment details

Java version: 17.0.9, vendor: Amazon.com Inc.
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "6.7.10-200.fc39.x86_64", arch: "amd64", family: "unix"

Lucene 9.10.0

@benchaplin
Contributor

You can get around this by placing each range term in quotes:

query = queryParser.parse( "[ \"2024\\-01\\-01T01\\:01\\:01\\+01\\:00\\[Europe\\/Warsaw\\]\" TO \"2025\\-01\\-01T01\\:01\\:01\\+01\\:00\\[Europe\\/Warsaw\\]\" ]" );

In fact, once the terms are quoted, you don't need to escape anything other than the quotes themselves:

query = queryParser.parse( "[ \"2024-01-01T01:01:01+01:00[Europe/Warsaw]\" TO \"2025-01-01T01:01:01+01:00[Europe/Warsaw]\" ]" );

Both will be parsed to [2024-01-01t01:01:01+01:00[europe/warsaw] TO 2025-01-01t01:01:01+01:00[europe/warsaw]].

(I've added some tests showing this: #13323)
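
The quoting workaround could be wrapped in a small helper so that callers don't hand-escape values. This is a rough sketch, assuming that backslashes and double quotes are the only characters that need escaping inside a quoted range term. `RangeStrings`, `quote`, and `range` are hypothetical names, not part of Lucene.

```java
public class RangeStrings {

    // Wraps a value in double quotes, escaping embedded backslashes first
    // and then embedded double quotes.
    static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Builds an inclusive range expression of the form [ "lower" TO "upper" ].
    static String range(String lower, String upper) {
        return "[ " + quote(lower) + " TO " + quote(upper) + " ]";
    }

    public static void main(String[] args) {
        String q = range("2024-01-01T01:01:01+01:00[Europe/Warsaw]",
                         "2025-01-01T01:01:01+01:00[Europe/Warsaw]");
        // The resulting string is what would be handed to queryParser.parse(...).
        System.out.println(q);
    }
}
```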

@marko-bekhta
Author

Thanks for looking at this and for the suggestion! I've tested it and can confirm that it works.
I'll let you decide how to proceed with this ticket (judging by the linked PR, you are considering whether the parser should be updated to support more query string variations).
