Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

difference between protocol and scheme absolute/relative urls check logic, also node vs browser #8091

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

akhoury
Copy link
Member

@akhoury akhoury commented Dec 18, 2019

Scheme-absolute seems to win the people's consensus

BUT the behavior is different from the server (node environment) and the browser, so remember, there are 2 factors with 2 possible values each, here's a table

For example: //google.com

ENV/TYPE Protocol Absolute Scheme Absolute
Node TRUE FALSE
BROWSER TRUE TRUE

For example for ////google.com

ENV/TYPE Protocol Absolute Scheme Absolute
Node FALSE FALSE
BROWSER TRUE TRUE

In order to mitigate this in the results below, we replace all 3+ leading slashes with 2 slashes, url.replace(/^\/{3,}/, '//');

Reason of different behavior node vs browser

Since there is no equivalent, on the browser, using document.createElement('a'), to the slashesDenoteHost of URL.parse(str, parseQueryString, slashesDenoteHost)), on the browser, slashesDenoteHost is as always set to true.

Also, the browser treats 3+ leading i.e ////google.com slashes as 2 leading slashes, and sets a host value on the result from parseUrl(), while node, using URL.parse('////google.com', true, true) does not set a host value to it, so we think it's a relative url, while but res.redirect redirects it externally without an issue.

Why are we using protocol-absolute?

Unlike the others.

Because for our purposes, we want to consider any redirect attempts to //example.com or attempts to set an img src, to be external. Maybe we should use a different function name? i.e. isUrlExternal() instead?

in Node

url:                    c:\aaa.aaa
protocol-absolute:              FALSE
scheme-absolute:                FALSE

url:                    ./aaaa
protocol-absolute:              FALSE
scheme-absolute:                FALSE

url:                    ../aaaa
protocol-absolute:              FALSE
scheme-absolute:                FALSE

url:                    aaa...aaaa
protocol-absolute:              FALSE
scheme-absolute:                FALSE

url:                    aaa
protocol-absolute:              FALSE
scheme-absolute:                FALSE

url:                    aaa.aaa
protocol-absolute:              FALSE
scheme-absolute:                FALSE

url:                    /aaa.aaa
protocol-absolute:              FALSE
scheme-absolute:                FALSE

url:                    //aaa.aaa
protocol-absolute:              TRUE      <--- different here
scheme-absolute:                FALSE

url:                    ///aaa.aaa/bbb
protocol-absolute:              TRUE      <--- different here
scheme-absolute:                FALSE

url:                    ////aaa.aaa/bbb
protocol-absolute:              TRUE      <--- different here
scheme-absolute:                FALSE

url:                    ///////aaa.aaa/bbb
protocol-absolute:              TRUE      <--- different here
scheme-absolute:                FALSE

url:                    http://aaa.aa
protocol-absolute:              TRUE
scheme-absolute:                TRUE

url:                    https://aaa.aa///
protocol-absolute:              TRUE
scheme-absolute:                TRUE

url:                    ftp://aaa.aa
protocol-absolute:              TRUE
scheme-absolute:                TRUE

url:                    smb://aaa.aa
protocol-absolute:              TRUE
scheme-absolute:                TRUE

url:                    git+ssh://aaa.aa
protocol-absolute:              TRUE
scheme-absolute:                TRUE

in Browser

https://jsfiddle.net/hm129g0r/15/

url: c:\aaa.aaa
protocol-absolute: TRUE
scheme-absolute: TRUE

url: ./aaaa
protocol-absolute: FALSE
scheme-absolute: FALSE

url: ../aaaa
protocol-absolute: FALSE
scheme-absolute: FALSE

url: aaa...aaaa
protocol-absolute: FALSE
scheme-absolute: FALSE

url: aaa
protocol-absolute: FALSE
scheme-absolute: FALSE

url: aaa.aaa
protocol-absolute: FALSE
scheme-absolute: FALSE

url: /aaa.aaa
protocol-absolute: FALSE
scheme-absolute: FALSE

url: //aaa.aaa
protocol-absolute: TRUE       <--- not different here
scheme-absolute: TRUE

url: ///aaa.aaa/bbb
protocol-absolute: TRUE       <--- not different here
scheme-absolute: TRUE

url: ////aaa.aaa/bbb
protocol-absolute: TRUE       <--- not different here
scheme-absolute: TRUE

url: ///////aaa.aaa/bbb
protocol-absolute: TRUE       <--- not different here
scheme-absolute: TRUE

url: http://aaa.aa
protocol-absolute: TRUE
scheme-absolute: TRUE

url: https://aaa.aa///
protocol-absolute: TRUE
scheme-absolute: TRUE

url: ftp://aaa.aa
protocol-absolute: TRUE
scheme-absolute: TRUE

url: smb://aaa.aa
protocol-absolute: TRUE
scheme-absolute: TRUE

url: git+ssh://aaa.aa
protocol-absolute: TRUE
scheme-absolute: TRUE

@akhoury
Copy link
Member Author

akhoury commented Dec 18, 2019

also, tested with using a string parsing solution like: https://github.com/unshiftio/url-parse it does recognize //aaa.aaa as an external url, so it sets a host on the result, but not ///aaa.aaa, sooo we could just use that module, but we still have to strip 3+ leading slashes url.replace(/^\/{3,}/, '//');

@akhoury
Copy link
Member Author

akhoury commented Dec 18, 2019

OR we can have utils.isAbsoluteUrl, when on the browser, to always use the server to check via ajax or socket emit, this way we always rely on the server to decide

return utils.parseUrl(url, true, true);
},

parseUrl: function (url, parseQueryString, slashesDenoteHost) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recommend returning a runtime-independent object from this function instead of using isBrowser multiple places.

For instance, returning an object with booleans isSchemeAbsolute, isProtocolAbsolute etc

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, I would have to "parse" twice each time, since when you're testing for scheme-absolute you would have to set slashesDenoteHost=true and would have to strip the leading slashes

isProtocolAbsoluteUrl: function (url) {
    url = url.replace(/^\/{3,}/, '//');
    var a = utils.parseUrl(url, true, **true);

We don't need to do that when testing for scheme-absolute, so in order to get both results each time, we need to require('url').parse() twice

Which is doable I guess, but just letting you know.

public/src/utils.js Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants