
Enhancing Analyzer Accuracy for String Parsing and Special Characters Handling #51

Open
ksg97031 opened this issue Aug 27, 2023 · 1 comment

ksg97031 commented Aug 27, 2023

The current analyzer relies on basic string parsing (regular expressions, string splitting, and so on), so it is not guaranteed to be 100% accurate. Some characters, such as double quotes, can be misinterpreted as special characters by the analyzer.

For example, the following Python code:

from django.urls import path
from . import views

urlpatterns = [
    path("example\"'route", views.app2_index, name="index"), 
]

will not be parsed correctly by the analyzer because the escaped double quote (\") inside the route string is misinterpreted.
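To illustrate the failure mode, here is a minimal Python sketch. It is not the analyzer's actual code: both patterns and the names `NAIVE` and `ESCAPE_AWARE` are hypothetical. It compares a pattern that stops at the first double quote with one that consumes a backslash and the character after it as a single unit.

```python
import re

# The Django route line from the example above, kept verbatim in a raw string.
line = r'''path("example\"'route", views.app2_index, name="index"),'''

# Hypothetical naive pattern: stops at the first double quote, escaped or not.
NAIVE = re.compile(r'path\("([^"]*)"')

# Hypothetical escape-aware pattern: a backslash plus the next character
# is consumed together, so \" does not end the literal.
ESCAPE_AWARE = re.compile(r'path\("((?:\\.|[^"\\])*)"')

naive_route = NAIVE.search(line).group(1)
aware_route = ESCAPE_AWARE.search(line).group(1)

print(naive_route)  # route cut off at the escaped quote
print(aware_route)  # full route text, escape sequence left undecoded
```

The escape-aware pattern only covers double-quoted literals with backslash escapes; single-quoted, triple-quoted, and raw strings would each need their own handling.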

Similarly, the following Go code:

e.GET("/pet,comma", func(c echo.Context) error {
    return c.String(http.StatusOK, "Hello, Pet!")
})

will also not be parsed correctly because the comma (,) inside the route string is interpreted as a delimiter.
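The comma case can be sketched in Python as well. The helper `split_top_level` below is hypothetical and not part of the analyzer; it assumes double-quoted strings with backslash escapes and ignores Go raw strings (backticks) and rune literals. It splits an argument list only on commas that sit outside string literals and brackets.

```python
def split_top_level(args):
    """Split on commas that are outside double-quoted strings and brackets."""
    parts, current = [], []
    in_string = False   # currently inside a "..." literal
    escaped = False     # previous character was a backslash inside a literal
    depth = 0           # nesting depth of (), [], {}
    for ch in args:
        if in_string:
            current.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
        elif ch in "([{":
            depth += 1
        elif ch in ")]}":
            depth -= 1
        elif ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
            continue
        current.append(ch)
    parts.append("".join(current).strip())
    return parts


# Simplified argument list based on the Go example; `handler` is a placeholder.
args = '"/pet,comma", handler'
naive = [p.strip() for p in args.split(",")]
aware = split_top_level(args)

print(naive)  # the route is broken into two pieces
print(aware)  # the route stays intact as the first argument
```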

As this isn't a common scenario, I'm not sure whether to keep it as a known issue or to implement a shared lexer and parser that handles these cases more comprehensively.

hahwul commented Aug 27, 2023

You're right. Guaranteeing perfection is challenging because our tool relies on regular expressions and string matching for analysis. Creating a lexer/parser would involve abstracting the code, considering each language's syntax, and identifying endpoints. This would require significant changes to our current structure.

While I agree it's the right long-term direction, taking the first step is proving to be tough 😨
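As a rough idea of what a first step toward a shared lexer might look like, here is a minimal Python sketch. It is an assumption-laden illustration, not a proposal for the project's actual design: the token kinds, `tokenize`, and `first_string_arg` are all hypothetical names, and only quoted strings with backslash escapes are recognized.

```python
import re

# Hypothetical token kinds for a language-agnostic first pass.
TOKEN_RE = re.compile(r"""
    (?P<STRING>"(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*')   # quoted literal with escapes
  | (?P<IDENT>[A-Za-z_]\w*)                           # identifier or keyword
  | (?P<NUMBER>\d+)                                   # integer literal
  | (?P<PUNCT>[^\s\w])                                # any single punctuation character
""", re.VERBOSE)


def tokenize(source):
    """Return (kind, text) pairs; whitespace is skipped."""
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(source)]


def first_string_arg(source, callee):
    """Return the first string literal passed directly to `callee(`, or None."""
    tokens = tokenize(source)
    for i in range(len(tokens) - 2):
        if (tokens[i] == ("IDENT", callee)
                and tokens[i + 1] == ("PUNCT", "(")
                and tokens[i + 2][0] == "STRING"):
            # Strip the surrounding quotes; escape sequences are left undecoded.
            return tokens[i + 2][1][1:-1]
    return None


# The two problem lines from this issue.
django_line = r'''path("example\"'route", views.app2_index, name="index"),'''
go_line = 'e.GET("/pet,comma", func(c echo.Context) error {'

django_route = first_string_arg(django_line, "path")
go_route = first_string_arg(go_line, "GET")

print(django_route)
print(go_route)
```

Because strings are matched as whole tokens before anything else, quotes and commas inside a route never reach the endpoint-detection step. Per-language details (comments, raw strings, template literals) would still need separate rules.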
