Panic when folder path with dot serves a webpage #178

Open
raphCode opened this issue Mar 15, 2022 · 3 comments
Labels
bug (Something isn't working), disk-writing (Issue regarding content-writing), good first issue (Good for newcomers)

Comments

@raphCode
Contributor

raphCode commented Mar 15, 2022

When a site serves one webpage at /folder/file1.html and another at /folder itself, the two conflict on disk:
In the first case, suckit creates a local directory named folder; in the second, it tries to save the webpage as a file at that same path and crashes:

[ERROR] Couldn't create fusor.net/old-boards/songs.com: Is a directory (os error 21)

thread '<unnamed>' panicked at 'Couldn't create fusor.net/old-boards/songs.com: Is a directory (os error 21)', src/logger.rs:42:9
stack backtrace:
   0: rust_begin_unwind
             at /rustc/1.58.1/library/std/src/panicking.rs:498:5
   1: core::panicking::panic_fmt
             at /rustc/1.58.1/library/core/src/panicking.rs:107:14
   2: core::panicking::panic_display
             at /rustc/1.58.1/library/core/src/panicking.rs:63:5
   3: suckit::logger::Logger::error
             at ./src/logger.rs:42:9
   4: suckit::disk::save_file
             at ./src/disk.rs:26:21
   5: suckit::scraper::Scraper::handle_url
             at ./src/scraper.rs:263:33
   6: suckit::scraper::Scraper::run::{{closure}}::{{closure}}
             at ./src/scraper.rs:313:33
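
The failure in the trace can be reproduced outside suckit. The following is a minimal sketch, assuming only the standard library; the scratch directory name and the function `create_file_over_dir` are hypothetical and not part of suckit. It creates a directory first (as would happen for a page below songs.com) and then tries to create a file at the same path.

```rust
use std::fs::{self, File};
use std::io;

/// Try to create a file at a path that already exists as a directory.
/// Hypothetical demo helper, not suckit code.
fn create_file_over_dir() -> io::Result<File> {
    // Scratch location in the system temp dir; the names are illustrative only.
    let root = std::env::temp_dir().join(format!("dir_conflict_demo_{}", std::process::id()));
    let dir = root.join("songs.com");

    // Step 1: a page below the folder was saved, so the directory exists.
    fs::create_dir_all(&dir)?;

    // Step 2: the page served at the folder URL itself is saved to the same path.
    let result = File::create(&dir);

    // Clean up before reporting the outcome.
    fs::remove_dir_all(&root)?;
    result
}

fn main() {
    match create_file_over_dir() {
        Ok(_) => println!("unexpectedly created a file"),
        Err(e) => println!("Couldn't create file: {}", e),
    }
}
```

On Linux the error is expected to be EISDIR, which would match the os error 21 in the log above; other platforms report a different error for the same situation.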
@raphCode raphCode changed the title Panic when folder serves a webpage Panic when folder path serves a webpage Mar 15, 2022
@raphCode raphCode changed the title Panic when folder path serves a webpage Panic when folder path with dot serves a webpage Mar 16, 2022
@raphCode
Contributor Author

The problem seems to be a combination of a dot in the folder name and a link to that folder without a trailing slash.
In url_helper.rs:28, the missing slash means the if branch is not taken, and the dot in the folder name is interpreted as a file extension, so the else branch is not taken either.
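
To illustrate the branches described above, here is a rough sketch of such a mapping. It is an assumption about the general shape of the logic, not a copy of url_helper.rs; the function name `url_path_to_local` and the `index.html` file name are hypothetical.

```rust
use std::path::Path;

/// Hypothetical stand-in for the URL-to-path mapping; not suckit's actual code.
fn url_path_to_local(url_path: &str) -> String {
    let trimmed = url_path.trim_start_matches('/');
    if url_path.ends_with('/') {
        // Trailing slash: treated as a directory, the page goes inside it.
        format!("{}index.html", trimmed)
    } else if Path::new(trimmed).extension().is_none() {
        // No dot in the last segment: also treated as a directory.
        format!("{}/index.html", trimmed)
    } else {
        // The last segment has an "extension": treated as a plain file.
        trimmed.to_string()
    }
}

fn main() {
    // The ".com" in the folder name is taken for an extension.
    let page = url_path_to_local("/old-boards/songs.com");
    let child = url_path_to_local("/old-boards/songs.com/file1.html");
    println!("{}", page);
    println!("{}", child);

    // The file path of the first page is the parent directory of the second.
    assert_eq!(Path::new(&child).parent(), Some(Path::new(page.as_str())));
}
```

With a mapping of this shape, songs.com has to be a file for one URL and a directory for the other, which is the conflict reported in the log.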

I believe this is why wget detects the document type from its content instead of its filename. As a consequence, it cannot convert links on the fly, only after all webpages have been downloaded, which is exactly the behavior observed.
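
As a sketch of what deciding by content rather than by filename could look like, the helper below inspects the response instead of the URL. It is an assumption for illustration only: `is_html` is a hypothetical function, it is neither wget's nor suckit's implementation, and the byte sniffing is deliberately crude.

```rust
/// Hypothetical helper: decide whether a response is an HTML page from the
/// response itself (Content-Type header first, then leading bytes), not the URL.
fn is_html(content_type: Option<&str>, body: &[u8]) -> bool {
    if let Some(ct) = content_type {
        // Ignore parameters such as "; charset=utf-8".
        let mime = ct.split(';').next().unwrap_or("").trim().to_ascii_lowercase();
        return mime == "text/html" || mime == "application/xhtml+xml";
    }
    // No header available: fall back to a crude look at the first bytes.
    let head_len = body.len().min(512);
    let head = String::from_utf8_lossy(&body[..head_len]).to_ascii_lowercase();
    let head = head.trim_start();
    head.starts_with("<!doctype html") || head.starts_with("<html")
}

fn main() {
    // A URL ending in ".com" says nothing about the type; the response does.
    let html = is_html(Some("text/html; charset=utf-8"), b"<html></html>");
    println!("songs.com is an HTML page: {}", html);
}
```

The trade-off matches the one described above: the type of a linked page is only known once its response has arrived, so the local path for a link cannot be fixed at the time the linking page is written.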

@Skallwar Skallwar added bug Something isn't working good first issue Good for newcomers disk-writing Issue regarding content-writing labels Mar 17, 2022
@CalderWhite

I also have this issue!

@Skallwar
Owner

I have limited bandwidth at the moment. I will have a look when I can, but in the meantime I encourage everyone who is facing the issue and wants it fixed to take a look and submit a PR if they can. I will make time for reviews.
