MOD-5756: Modify FT.SEARCH to avoid unnecessary escaping #4433

Merged (118 commits) on May 26, 2024


nafraf
Collaborator

@nafraf nafraf commented Feb 10, 2024

Description

Auto-escape queries on TAG fields that use the @tag:{my_match} syntax.
The new behavior is implemented under DIALECT 5.

This requires:
  • #4125 (MOD-6186: Fix FT.EXPLAIN, FT.EXPLAINCLI: INFIX, SUFFIX)
  • #4604, which creates parser v3

Only a single tag is allowed inside the {}. Operators inside the braces (OR | and comma ,) are not supported; they are treated as part of the tag.

Characters with special meaning must be escaped to be treated as part of a tag:
  • * (prefix, suffix, infix)
  • w' (wildcard)
  • $ (parameter)
  • } (closing brace)
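
The escaping rule above can be sketched as a small client-side helper. This is a minimal sketch and not part of the PR: the names `escape_tag_dialect5` and `tag_query` are hypothetical, and it assumes that a single backslash before a special character makes it literal, as the examples in this description suggest. Escaping every `*`, `$`, and `}` (rather than only those in positions where they would be interpreted) and escaping the backslash itself are conservative assumptions, not behavior confirmed here.

```python
def escape_tag_dialect5(tag: str) -> str:
    """Backslash-escape characters assumed to be special inside @field:{...}.

    Assumptions (not confirmed by the PR text):
      - a backslash before a character makes that character literal;
      - over-escaping '*', '$' and '}' in any position is harmless;
      - a literal backslash is written as two backslashes.
    """
    out = []
    starts_like_wildcard = tag.startswith("w'")
    for i, ch in enumerate(tag):
        if ch in ("\\", "*", "$", "}"):
            out.append("\\" + ch)
        elif i == 0 and starts_like_wildcard:
            # Escape the leading 'w' so that w'...' is not read as a wildcard.
            out.append("\\" + ch)
        else:
            out.append(ch)
    return "".join(out)


def tag_query(field: str, tag: str) -> str:
    """Build a single-tag query string for use with DIALECT 5."""
    return "@%s:{%s}" % (field, escape_tag_dialect5(tag))
```

For instance, `tag_query("t", "abc*")` builds the string `@t:{abc\*}`, while a tag such as `abc:?-123` is passed through unchanged.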

Examples of queries using the new DIALECT 5
127.0.0.1:6379> FT.explaincli idx "@t:{abc*}" DIALECT 5
1) TAG:@t {
2)   PREFIX{abc*}
3) }
4)
127.0.0.1:6379> FT.explaincli idx "@t:{abc\\*}" DIALECT 5
1) TAG:@t {
2)   abc\*
3) }
4)
127.0.0.1:6379> FT.explaincli idx '@t:{abc\*}' DIALECT 5 
1) TAG:@t {
2)   abc\*
3) }
4) 
127.0.0.1:6379> FT.explaincli idx "@t:{w'*'}" DIALECT 5 
1) TAG:@t {
2)   WILDCARD{*}
3) }
4) 
127.0.0.1:6379> FT.explaincli idx "@t:{\\w'*'}" DIALECT 5
1) TAG:@t {
2)   \w'*'
3) }
4)
127.0.0.1:6379> FT.explaincli idx "@t:{$n}" PARAMS 2 n 'abc'  DIALECT 5
1) TAG:@t {
2)   abc
3) }
4) 
127.0.0.1:6379> FT.explaincli idx "@t:{\\$n}" PARAMS 2 n 'abc'  DIALECT 5
1) TAG:@t {
2)   \$n
3) }
4) 
127.0.0.1:6379> FT.explaincli idx "@t:{abc:?-123}"  DIALECT 5
1) TAG:@t {
2)   abc:?-123
3) }
4) 
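
The examples above differ in how many backslashes are typed because redis-cli processes backslash escapes inside double-quoted arguments, so `"\\*"` and `'\*'` both reach the server as `\*`. A client library that sends arguments directly therefore needs only the single query-level backslash. The following is a minimal sketch of building the argument list; the helper name `explaincli_args` is hypothetical, and the resulting list is assumed to be passed to a client call such as redis-py's `execute_command(*args)`.

```python
def explaincli_args(index, query, params=None, dialect=5):
    """Build the argument list for FT.EXPLAINCLI (hypothetical helper).

    The argument order mirrors the redis-cli examples in this description:
    index, query, optional PARAMS block, then DIALECT.
    """
    args = ["FT.EXPLAINCLI", index, query]
    if params:
        # PARAMS takes the total count of names plus values.
        args += ["PARAMS", str(2 * len(params))]
        for name, value in params.items():
            args += [name, value]
    args += ["DIALECT", str(dialect)]
    return args


# A literal '*' needs one backslash once redis-cli quoting is out of the picture.
literal_star = explaincli_args("idx", r"@t:{abc\*}")

# The tag value is supplied through a parameter instead of inline text.
with_param = explaincli_args("idx", "@t:{$n}", params={"n": "abc"})
```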

Which issues this PR fixes

  1. MOD-5756

Main objects this PR modified

  1. New v3 query parser

Mark if applicable

  • This PR introduces API changes
  • This PR introduces serialization changes

@nafraf nafraf marked this pull request as ready for review February 13, 2024 15:16

codecov bot commented Feb 13, 2024

Codecov Report

Attention: Patch coverage is 85.71429%, with 49 lines in your changes missing coverage. Please review.

Project coverage is 86.25%. Comparing base (adaf9e9) to head (80cfd91).

Files                         Patch %   Lines
src/query_parser/v3/lexer.c   85.71%    49 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4433      +/-   ##
==========================================
+ Coverage   86.17%   86.25%   +0.07%     
==========================================
  Files         190      190              
  Lines       34549    34822     +273     
==========================================
+ Hits        29774    30035     +261     
- Misses       4775     4787      +12     


Collaborator

@oshadmi oshadmi left a comment

Very nice! 👏🏼 👍🏼 😍

@oshadmi oshadmi enabled auto-merge May 25, 2024 22:12
@oshadmi oshadmi added this pull request to the merge queue May 25, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks May 25, 2024
@nafraf nafraf added this pull request to the merge queue May 26, 2024
Merged via the queue into master with commit e907ece May 26, 2024
10 checks passed
@nafraf nafraf deleted the nafraf_parser-v3 branch May 26, 2024 19:15

Backport failed for 2.10, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin 2.10
git worktree add -d .worktree/backport-4433-to-2.10 origin/2.10
cd .worktree/backport-4433-to-2.10
git switch --create backport-4433-to-2.10
git cherry-pick -x cb6a21000a03b99ba63366b897a1dc1a979b9e7a cc9a69ceb41b88c8687816ef9cfcf97e7035857f c007d7d8261f502406e13659d982fdc0a0290aed 347427144f9a51eb8618ed6d00286f085f58bdca 8b5ed653ea0eb4c3280ae1d762fdab1b57db4ebd e3e64a7024b0552b2929cdac06df0ec7ab9dedd3 8f34250688a0472e510a47061adc8f573667387a 688f2908e82d747a223cfc18527a6fc486e881df 9fda8bed47f768cb865a4ddde59e34c6170a3f1d 9bda6c4e88e57cf2bc511a0832f4c4934008bbe9 e17b55b501f1ce353bb4de796a8bee305efb6de8 74a05f8666a1c72be0fc4992a94afda6374855f9 673f8eb1f5b08e2d4855f85698d563d06222d9d8 90e7145d79b2ec07fbbe9f3a8ebb41a345c421a6 286e7eca43ac6461eb560c83d5951628640b83fa d50eaa746870a68ba5aa600eda5dba1663cb51cb 0e9dd4c153fc3acfbc2b4cf4f01ad549943e89aa fdcd031132107657c961d1f024bed134bf1bd1bc 68a2608295e4e7514f35b618db7dba052f6175f8 94fec6a7b460ac379486a0a3e6d106762140190b ad5ee34adafe2c401745090023afab106067040c 611e195c0b10109eae6711364b440744189e3417 ff0a2ef99942a1692f11710fc532ecd390870827 e079ac449494870351a6d77f3e0430de83d359e7 3195bcc6ddc17b86b5dbc3a694eaa64da5b3482c 7b2c786992fa6033cc5d4b0a60f39c2f61dd9e87 627bd28ef70f98bc87f8dc0ecef51b1b494e5ed6 1f743601155aeb894f098e6f280ddf759b19d349 321c6b05475ee7394e4701165fc325c7041b53bb 296874c4bc2a864d00a26d5ec6be8ee808cc9677 7fcc3f5c8a2d020f74c706aa9b7386f9e2b59b93 cc0e441505bcb2bb0f38f5e6846591080017f19e daec8d4e586f37cd08565d7436d547d3203cab66 6db124474441fe32ae86bb37f90afc59d1bfbd82 f65c8136f8738f7ba3c1d43d9a056a55a89c9353 21258e22d348f978ce784b13ef1aed31f59637e2 6fcf8f5c9517ed074310d698b3a8f2576721f902 7ec833ba137494d9568243f6cf0e375427c0c832 de2dca7bffca0529999f794fd0783f899cddfbcf 622eb71cfab393bcb344118cfd6ec28fb70ef528 8dc371ad304fb79d2ab2b5034dc64155d7f20f6f fac5ac40a5c05302fa5b4de37c9e45a9bcdcc4e0 a1143f2131f68a972e57a7f117e9bc06efbf454b d1451b6693c79838c6087076a7af06d991b8719b 245b47d3e7a064187a092729e87c178d4fd3bf24 4b2ad86d7f095779e3655b680641ef0b5f568687 08f2e60dee0d6411879abbd81e79d548270b5bda 2dd2996022b48987d131301dbbd93b35850a7ebb 
20d5b6422b5089986e7fcb462438b4b64fc9dc14 0c20118a50163cf8133afc2337e30c626c1ee659 7f7e875d72e2e30feed82053ed6938dc9c4a3250 a2c4703ebb4b0b8ef6127610afbcfa93320da368 2231353c7e730348c5c06d0e07f72807f06a79b9 1c04c8590c7be6ba4d76d2a015cd38e9faca6c57 0dae78b38953d3e9cdeefa47fcb1999897a2a118 60afcfdbebfdcec37d838c0cf2ed801ff7ddb6db d0abb94c37bc3d6c58160eb278c45da30c5b7d3a c9284931f5e5913270ec63f16e88ffbfb2ea2ebf 301dcaf867fc442c8cab2801572bffd0c2c9aa95 5acaed5afebb8d4d4b7d6d45c11e9329907fa82d 0d816fc3bd831a5aa2cf23785523df0f1b780919 572fe76992604eaaa3c00d4f8d99accad1d3837b 541d8bb03a456ba967c47b09333c2e2b24bc8cb9 81704b51a6617edf7e394ad520bc868e3fbcbc6b d8c9cec1e9cc392cdc4683d841802fb4bb8450b1 b1dada8b1dab3f6708da2d6a1e75e413ecc9a85d d3e147ae850eed22332e1c1166350012c1885aaa c05367bc6c6de9a5c7e05c46cc9a6e076e47fc82 4ea8fc857ddfc2aa58f447409b1a61f0a0199a3b c08118143d24011bdd97e26343d5e88ed3890801 b91f28ccaf8ff8aee23a4a869fd57387e77bfb6a f707099ca7cd2e838e9e6bd025d5829cc1905114 6fd07aa7673629fd8423c5544dfa63fcb8d56590 5f8a974dd31b76536bfd0d200d01947b5dd6614d db4f0c2fa4e505ed12acf142af1ab670e3922f6b 598d53ff32b5d7f0128860b1d644e0f29fb488d3 99c5e20057312e035cf4326f2ce69f7f6196c3ce 66002f17859fc1aa386671c9130e7a3d421c04f8 3196c6e295e63f78abe69025ee45eb765edcb205 535c73ad18ddf43a9e092a8d280492eab4be4124 538002dcc85754d6acf359eecdc5625ae40c5235 2ac2d2c4f7ebae6c18adb59a9b714d2c704ff3ea

nafraf added a commit that referenced this pull request May 27, 2024
* Add query_parser/v3

* Add more tag tests

* Add test for TEXT testExact

* autoescaping single tag between brackets

* Fix tests COORD=1

* Fix wildcard support

* wildcard + prefix/infix/suffix is invalid

* Test backward compatibility

* Test some uncovered cases

* Test prefix/infix/suffix with TEXT field

* Remove temporary debug messages

* Test: escape 'w' single_tag

* Add more tag tests

* WIP: Tests to increase coverage parser v3

* Add more tests

* Fix lexer.c v3

* Fix test invalid syntax

* Test punct and cntrl characters

* split unescaped_tag rule

* Add one more test UNESCAPED_TAG

* Fix expected test format  in cluster

* Update src/query_parser/v3/lexer.rl - Fix description

Co-authored-by: Omer Shadmi <76992134+oshadmi@users.noreply.github.com>

* Test pipe with dialect < 5, add comment about backslash escaping

* Test escaping $

* One more test escaping $

* lexer v3 - remove leading and trailing spaces

* Test short tags

* Add JSON tests

* Use comma separator for JSON tests

* Add test testTagUNF()

* Test tag autoescaping using DEFAULT_DIALECT 5

* More test using DEFAULT DIALECT 5 in test_search_params: test_geo, test_attr, test_binary_data.

* Revert changes to test_search_params:test_geo

* testTagUNF: Create index before hashes

* Revert change in QueryNode_DumpSds()

* Test aggregate with TAG autoescaping

* Fix testTagAutoescaping, remove additional right curly brace

* Update testDialect5InvalidSyntax()

* update parser/v3 taking latest parser/v2

* Create parser v3

* Test dialect: DEFAULT_DIALECT as module arg

* More tests for text queries

* Improve invalid syntax text

* Fix parser v3, unary op after field name

* Add missing test with modifierlist

* WIP: Test isempty() with DIALECT 5

* test_v1_vs_v2_vs_v5()

* Add tests to improve codecov using dialect 5

* One more test to improve codecov

* Add WITHCOUNT to fix test with DIALECT 5

* Fix testEmptyValueTags() for DIALECT > 2

* Fix test_tags

* Test float without leading zero

* Fix wrong float number test

* Test ragel minization at the end of compilation

* Revert "Test ragel minization at the end of compilation"

This reverts commit 0dae78b.

* Fix make-parser.mk

* cpp-test parser v3

* Update parser v3

* Update lexer

* Fix tests

* Fix float number syntax, leading zero is optional

* Test number format

* Support numbers with multiple signs

* Create macros in parser.y v3

* Use set_max_dialect

* Fix testEmptyValueTags()

* MOD-6750 Fix numeric range syntax (#4505)

Fix numeric range syntax

* MOD-6749: Querying numeric fields using simple operators (#4516)

* Simplify single_tag

* Remove unescaped_tag2

* lexer v3 - remove colon from tag expressions

* Create unescaped_tag2 to create UNESCAPED_TAG without escape

* Fix leading/trailing spaces deletion

* Validate tok.len

* Remove debugging code

* Fix lexer.rl v3 format

* Temp: Try to fix sanitizer

* Revert "Temp: Try to fix sanitizer"

This reverts commit 66002f1.

* Test tag with * as literal

* Fix parser: tag rules

* Update tests/pytests/test_tags.py - minor typo

---------

Co-authored-by: Omer Shadmi <76992134+oshadmi@users.noreply.github.com>
(cherry picked from commit e907ece)