Incorrect handling of empty host in absolute URL #821

Open
1 task done
kenballus opened this issue Feb 6, 2023 · 2 comments
Labels
bug, Hacktoberfest, help wanted

Comments

@kenballus
Contributor

Describe the bug

Absolute URLs are permitted to have empty hosts in RFC 3986.

Relevant grammar rules:

   host          = IP-literal / IPv4address / reg-name
   reg-name      = *( unreserved / pct-encoded / sub-delims )

Because reg-name is defined with *( ... ), it can match the empty string, so a host can be empty. Thus, a URL like a://:1 conforms to the standard.
However, yarl rejects this URL.
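
To make the grammar argument concrete, here is a minimal sketch that transcribes the reg-name rule into a regular expression. The names `REG_NAME` and `is_reg_name` are made up for this illustration and do not come from yarl or any other library.

```python
import re

# reg-name = *( unreserved / pct-encoded / sub-delims )   (RFC 3986, section 3.2.2)
REG_NAME = re.compile(
    r"(?:[A-Za-z0-9\-._~]"   # unreserved
    r"|%[0-9A-Fa-f]{2}"      # pct-encoded
    r"|[!$&'()*+,;=])*"      # sub-delims
)


def is_reg_name(text: str) -> bool:
    """Return True if text matches the reg-name production in full."""
    return REG_NAME.fullmatch(text) is not None


print(is_reg_name(""))             # the empty host
print(is_reg_name("example.com"))
print(is_reg_name("a b"))          # a space is not allowed
```

The trailing `*` is what lets the empty string through, which is the whole point of the report.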

urllib3, CPython urllib, rfc3986, furl, and hyperlink all correctly handle this situation.
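
For comparison, this is roughly how the same input behaves with CPython's `urllib.parse`, using only the standard library. It is a quick check of one of the libraries listed above, not a statement about how yarl should be implemented.

```python
from urllib.parse import urlsplit

parts = urlsplit("a://:1")

print(repr(parts.netloc))  # ':1'  (authority with an empty host)
print(parts.hostname)      # None  (urllib reports an empty host as None)
print(parts.port)          # 1
print(parts.geturl())      # a://:1
```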

To Reproduce

Try running the following snippet:

>>> yarl.URL("a://:1")

Expected behavior

The parse should have succeeded, resulting in URL('a://:1').

Logs/tracebacks

This is the output from the snippet:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.10/site-packages/yarl/_url.py", line 163, in __new__
    raise ValueError("Invalid URL: host is required for absolute urls")
ValueError: Invalid URL: host is required for absolute urls


Python Version

$ python --version
Python 3.10.9

multidict Version

$ python -m pip show multidict
Name: multidict
Version: 6.0.4
Summary: multidict implementation
Home-page: https://github.com/aio-libs/multidict
Author: Andrew Svetlov
Author-email: andrew.svetlov@gmail.com
License: Apache 2
Location: /home/bkallus/fuzzing/url_differential_fuzzing/url_fuzz_env/lib/python3.10/site-packages
Requires:
Required-by: yarl

yarl Version

$ python -m pip show yarl
Name: yarl
Version: 1.8.2
Summary: Yet another URL library
Home-page: https://github.com/aio-libs/yarl/
Author: Andrew Svetlov
Author-email: andrew.svetlov@gmail.com
License: Apache-2.0
Location: /home/bkallus/fuzzing/url_differential_fuzzing/url_fuzz_env/lib/python3.10/site-packages
Requires: idna, multidict
Required-by:

OS

Arch Linux

Additional context

No response

Code of Conduct

  • I agree to follow the aio-libs Code of Conduct
@kenballus kenballus added the bug label Feb 6, 2023
@webknjaz
Member

webknjaz commented Feb 6, 2023

Hi, if you could send a PR with just a test that could then be marked as xfail, it'd be useful.
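
Such a test might look something like the sketch below. The test name and the assertions are only illustrative and are not taken from the yarl test suite; `yarl` is imported inside the test so the module still loads where it is not installed.

```python
import pytest


@pytest.mark.xfail(
    raises=ValueError,
    reason="yarl rejects an empty host in an absolute URL (issue #821)",
)
def test_empty_host_with_port():
    from yarl import URL

    url = URL("a://:1")
    # Expected behavior from the report: the URL round-trips unchanged.
    assert str(url) == "a://:1"
    assert url.port == 1
```

With `raises=ValueError`, the test counts as an expected failure only while the current `ValueError` is raised; any other exception is reported as a regular failure.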

@kenballus
Contributor Author

> Hi, if you could send a PR with just a test that could then be marked as xfail, it'd be useful.

Done!

I'm not too familiar with the codebase, but I imagine the fix is to delete lines 187 and 188 in yarl/_url.py:

187:                if host is None:
188:                    raise ValueError("Invalid URL: host is required for absolute urls")

A host is always permitted to be empty, unless scheme-specific rules say otherwise.
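
To sketch what "scheme-specific rules" could mean in practice, here is a small standalone example built on `urllib.parse` rather than on yarl's internals. The function name `split_allowing_empty_host` and the `SCHEMES_REQUIRING_HOST` set are assumptions made up for this illustration; http and https are listed because the HTTP specification forbids generating those URIs with an empty host.

```python
from urllib.parse import SplitResult, urlsplit

# Illustrative only: schemes assumed here to forbid an empty host.
SCHEMES_REQUIRING_HOST = frozenset({"http", "https"})


def split_allowing_empty_host(url: str) -> SplitResult:
    """Split url, rejecting an empty host only for schemes that require one."""
    parts = urlsplit(url)
    if parts.scheme in SCHEMES_REQUIRING_HOST and not parts.hostname:
        raise ValueError(f"Invalid URL: host is required for {parts.scheme} URLs")
    return parts


print(split_allowing_empty_host("a://:1").netloc)  # accepted: generic scheme

try:
    split_allowing_empty_host("http://:80/")
except ValueError as exc:
    print(exc)  # rejected: http needs a host
```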

@webknjaz webknjaz added the help wanted and Hacktoberfest labels Nov 19, 2023