Tumblr Downloader

Download and archive all the posts you have liked and all the posts from the blogs you follow on Tumblr, using the Tumblr API.

Install

Requires Python >= 3.5. You can install Python from the official website or install Anaconda 3 instead.

* You can use virtualenv, conda create or pipenv to create an isolated running environment.

Install the requirements with:

pip install -r requirements.txt

Or install the requirements manually:

Package Name
git+https://github.com/tumblr/pytumblr.git
requests
pyyaml
beautifulsoup4
lxml

Note that the released pytumblr package may not be the latest version, so it is better to run pip install git+https://github.com/tumblr/pytumblr.git to get the latest version of pytumblr.
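To confirm that the environment is ready before running the downloader, a small check like the one below can help. It is only a sketch: the helper name missing_modules is made up for illustration, and the list uses import names rather than pip names (pyyaml is imported as yaml, beautifulsoup4 as bs4).

```python
import importlib.util

# Import names differ from the pip names for two of the packages:
# pyyaml is imported as "yaml" and beautifulsoup4 as "bs4".
REQUIRED_MODULES = ["pytumblr", "requests", "yaml", "bs4", "lxml"]


def missing_modules(module_names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are installed.")
```

If anything is reported as missing, install it with pip inside the same environment and run the check again.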

Execution

In some regions you will need a proxy to use this downloader. If the downloader is not running with proxy, try to set the proxy Global Mode and re

Go to https://www.tumblr.com/oauth/apps to register an application and get an OAuth key. OAuth is how the downloader authenticates and gets access to the content of your blog via the Tumblr API.

Note that the Tumblr API has rate limits, so do not overuse it or share your OAuth key publicly.

Rate Limits

Newly registered consumers are rate limited to 1,000 requests per hour, and 5,000 requests per day. If your application requires more requests for either of these periods, please use the 'Request rate limit removal' link on an app above.
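To get a feel for these numbers: 5,000 requests per day spread evenly is about one request every 17 seconds, and a burst at the hourly cap would use up the daily allowance in five hours. The sketch below shows one way to track both caps locally. RequestBudget is a hypothetical helper, not part of this downloader or of pytumblr, and it only counts requests you report to it.

```python
import time
from collections import deque


class RequestBudget:
    """Track request times against hourly and daily caps (hypothetical helper)."""

    def __init__(self, per_hour=1000, per_day=5000):
        self.per_hour = per_hour
        self.per_day = per_day
        self._stamps = deque()  # timestamps of requests in the last 24 hours

    def allow(self, now=None):
        """Record a request and return True if both caps still permit it."""
        now = time.time() if now is None else now
        # Forget requests older than one day.
        while self._stamps and now - self._stamps[0] >= 86400:
            self._stamps.popleft()
        # Count the requests made within the last hour.
        last_hour = sum(1 for t in self._stamps if now - t < 3600)
        if last_hour >= self.per_hour or len(self._stamps) >= self.per_day:
            return False
        self._stamps.append(now)
        return True
```

Calling allow() before each API request and pausing when it returns False keeps a long run inside the quoted limits.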
After registration you will get an OAuth Consumer Key and a Secret Key.
Configure the functions you need in main.py. Currently three functions are provided:

Function Name	Explanation
download_likes()	Download all the posts you liked
download_following()	Download all the posts in the blogs you are following
download_blog(`name or url of the blog`)	Download all the posts in the blog you specified

The required parameter of download_blog is the name or URL of the blog. Taking the official support blog as an example, the blog name is support and the URL is support.tumblr.com.
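The relationship between the two forms can be sketched as follows. normalize_blog is a hypothetical helper written for illustration; whether the downloader performs a similar normalization internally is not stated here, so treat this as an assumption.

```python
from urllib.parse import urlparse


def normalize_blog(identifier):
    """Turn a blog name or URL into a bare hostname such as support.tumblr.com."""
    value = identifier.strip()
    if "://" in value:
        # Full URL: keep only the host part.
        value = urlparse(value).netloc
    value = value.split("/")[0]  # drop any trailing path
    if "." not in value:
        # Bare blog name: append the default Tumblr domain.
        value += ".tumblr.com"
    return value.lower()
```

With this sketch, support, support.tumblr.com and https://support.tumblr.com/ all map to the same hostname.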
download_blog has two optional parameters:
- before_timestamp is a Unix timestamp. All posts published before this timestamp are downloaded, from the newest to the oldest. If not specified (the default), the current time is used, which means all posts are downloaded. This parameter is useful when the script stops unexpectedly and you want to resume it.
- max_count limits the number of downloaded posts, in case one blog takes too much time to download. If not specified, all posts are downloaded.
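A Unix timestamp is the number of seconds since 1970-01-01 00:00:00 UTC. The sketch below converts between a calendar date and a timestamp using only the standard library. Both helper names are made up for illustration, and the dates are interpreted as UTC.

```python
from datetime import datetime, timezone


def to_unix_timestamp(year, month, day, hour=0, minute=0, second=0):
    """Return the Unix timestamp for the given UTC date and time."""
    moment = datetime(year, month, day, hour, minute, second, tzinfo=timezone.utc)
    return int(moment.timestamp())


def from_unix_timestamp(timestamp):
    """Return a readable UTC string for a Unix timestamp."""
    moment = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")
```

For example, to_unix_timestamp(2020, 1, 1) gives 1577836800. To resume an interrupted run, one option is to pass the timestamp of the last post that was downloaded as before_timestamp.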
download_likes has two optional parameters:
- before_timestamp is described above.
- rename controls whether all files are renamed as blog-{no.}. True is the default. If set to False, the original post's name is used.
download_following has three optional parameters:
- start_blog specifies which blog to start from. This is useful when the script stops unexpectedly and you want to resume it.
- start_page is the page number to start from.
- max_page is the maximum page number to download, in case one blog takes too much time to download. max_page cannot be larger than 50, since the downloader cannot access page 50 and beyond via the Tumblr API:
  
  When using the offset parameter the maximum limit on the offset is 1000. If you would like to get more results than that use either before or after.
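The limit of 50 pages follows from simple arithmetic if each page holds 20 posts: 50 × 20 = 1000, which is the offset ceiling quoted above. The sketch below makes that explicit. The page size of 20 and zero-based page numbering are assumptions, and page_offsets is a hypothetical helper, not a function of this downloader.

```python
MAX_OFFSET = 1000  # limit quoted from the Tumblr API documentation
PAGE_SIZE = 20     # assumption: posts requested per page


def page_offsets(start_page=0, max_page=50, page_size=PAGE_SIZE):
    """Yield (page, offset) pairs, stopping before the offset limit is reached."""
    for page in range(start_page, max_page):
        offset = page * page_size
        if offset >= MAX_OFFSET:
            # Page 50 and beyond would need an offset of 1000 or more.
            break
        yield page, offset
```

Under these assumptions the last reachable page is page 49 at offset 980. Anything older has to be fetched by timestamp instead, which is what before_timestamp is for.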
Set the downloader to not download reblog posts by setting downloader.reblog = False.
Set the downloader to not download content that has already been downloaded, say from a previous run, by setting downloader.redownload = False.
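The idea behind skipping already downloaded content can be sketched as a check on the target file. This is an illustration only: should_download is a hypothetical helper, and how the downloader actually detects existing content is not described here. The sketch treats an empty file as a failed download and fetches it again.

```python
import os


def should_download(path, redownload):
    """Decide whether a file needs to be fetched."""
    if redownload:
        return True
    # Skip files that already exist and are not empty.
    return not (os.path.isfile(path) and os.path.getsize(path) > 0)
```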

Run python main.py to start the first-time configuration. You will be taken to an interactive console provided by pytumblr.
1. First, enter the OAuth Consumer Key and Secret Key you got earlier.
2. The console will return an authorization URL that authorizes the downloader to access your own Tumblr account. Copy and paste it into a web browser and visit it. The page will ask you to authorize the application; allow it. You will then be redirected to another URL, which contains the oauth_verifier token. Copy and paste it back into the console.
3. Finally, the downloader will get the oauth_token and oauth_token_secret and continue to download your blog.
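If you want to see which part of the redirected URL carries the verifier in step 2, the sketch below pulls the oauth_verifier query parameter out of a URL with the standard library. The function name and the example URL are hypothetical, and the console may well do this parsing for you.

```python
from urllib.parse import urlparse, parse_qs


def extract_oauth_verifier(redirect_url):
    """Return the oauth_verifier query parameter from a redirect URL."""
    query = parse_qs(urlparse(redirect_url.strip()).query)
    values = query.get("oauth_verifier")
    if not values:
        raise ValueError("no oauth_verifier parameter found in the URL")
    return values[0]
```

For a URL such as https://example.com/callback?oauth_token=abc&oauth_verifier=xyz123 the function returns xyz123. Anything after a # in the URL is ignored.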

Reference

API | Tumblr

Applications | Tumblr

PyTumblr | Github
