Website to PDF

A web crawler that saves the pages of a website as .pdf files.

🌐🕸️ ⏩ 📂📜

Requirements:

✔️ 🐍 Python 3.x environment

✔️ 📁 wkhtmltopdf installed on the system

✔️ 🐍 pdfkit library from PyPI. pdfkit is a Python wrapper for wkhtmltopdf.

✔️ 🐍 BeautifulSoup 4 library from PyPI (package name beautifulsoup4)
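The requirements above can be checked before running the script. The following is only a minimal sketch: the helper name check_requirements is hypothetical and not part of the repository, and it assumes the wkhtmltopdf executable is on PATH and that the two libraries import as pdfkit and bs4.

```python
import importlib.util
import shutil
import sys


def check_requirements():
    """Return a dict mapping each requirement name to True (found) or False."""
    return {
        # The interpreter running this check must be Python 3.x.
        "python3": sys.version_info.major == 3,
        # Assumption: the wkhtmltopdf executable is reachable through PATH.
        "wkhtmltopdf": shutil.which("wkhtmltopdf") is not None,
        # find_spec returns None when a top-level package is not installed.
        "pdfkit": importlib.util.find_spec("pdfkit") is not None,
        "bs4": importlib.util.find_spec("bs4") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_requirements().items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```

Anything reported as MISSING needs to be installed before website-to-pdf.py can produce output.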

How to use:

▶️ Set the list urls_to_parse to all URLs that should be saved as .pdf files.

urls_to_parse = ["<URL_1>", "<URL_2>", ..., "<URL_N>"] # Where URL_n is your desired URL.

The list can be collected in either of two ways:

🅰️ ➡️ Use the return value of get_url_list_from_site( <MY SITE, e.g. http://example.com> )

or

🅱️ ➡️ Use the return value of get_url_list_from_file( <MY FILE | DEFAULT = input/urls.txt> )
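To illustrate what the two collection options might do, here is a hedged sketch rather than the repository's actual implementation. The names read_urls_from_file and extract_same_site_links are hypothetical. The file format (one URL per line, with '#' comment lines ignored) is an assumption. Link extraction uses the standard library's html.parser in place of BeautifulSoup so that the example runs without extra dependencies, and it works on HTML that has already been downloaded.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


def read_urls_from_file(path="input/urls.txt"):
    """Read one URL per line, skipping blank lines and '#' comment lines."""
    with open(path, encoding="utf-8") as handle:
        lines = (line.strip() for line in handle)
        return [line for line in lines if line and not line.startswith("#")]


class _LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def extract_same_site_links(html, base_url):
    """Return absolute, de-duplicated links that stay on base_url's host."""
    collector = _LinkCollector()
    collector.feed(html)
    host = urlparse(base_url).netloc
    links = []
    for href in collector.hrefs:
        # Resolve relative links and drop any #fragment part.
        absolute = urljoin(base_url, href).split("#", 1)[0]
        if urlparse(absolute).netloc == host and absolute not in links:
            links.append(absolute)
    return links
```

Either function returns a plain list of strings, which is the shape urls_to_parse expects.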

▶️ Run website-to-pdf.py

▶️ Each URL is saved as a .pdf file in the output/ directory, located next to website-to-pdf.py
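The saving step can be sketched as follows. This is an illustration under stated assumptions, not the script's actual code: url_to_pdf_name and save_urls_as_pdf are hypothetical names, and the file-naming scheme is invented for the example. The only pdfkit call used is pdfkit.from_url(url, output_path), which requires wkhtmltopdf to be installed.

```python
import os
import re
from urllib.parse import urlparse


def url_to_pdf_name(url):
    """Build a filesystem-safe .pdf file name from a URL (hypothetical scheme)."""
    parsed = urlparse(url)
    raw = parsed.netloc + parsed.path
    # Collapse every run of non-alphanumeric characters into a single dash.
    safe = re.sub(r"[^A-Za-z0-9]+", "-", raw).strip("-")
    return (safe or "index") + ".pdf"


def save_urls_as_pdf(urls, output_dir="output"):
    """Render each URL into output_dir with pdfkit; return the paths written."""
    import pdfkit  # imported here so the naming helper works without pdfkit

    os.makedirs(output_dir, exist_ok=True)
    written = []
    for url in urls:
        target = os.path.join(output_dir, url_to_pdf_name(url))
        try:
            pdfkit.from_url(url, target)
            written.append(target)
        except OSError as error:
            # pdfkit raises OSError when wkhtmltopdf is missing or reports an error.
            print(f"Skipped {url}: {error}")
    return written
```

Catching the error per URL lets one failing page be skipped without stopping the rest of the list.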

License:

Released under the MIT license. The software is provided as is. All content is free to use and modify.

¹ GitHub shields provided by Shields.io
