Thesharing/tumblr-downloader

Tumblr Downloader

Download and archive all your likes and following in your tumblr blog using tumblr API.

Chinese version

Install

Requires Python >= 3.5. You can install Python from the official website or install Anaconda 3 instead.

* You can use virtualenv, conda create or pipenv to create an isolated running environment.

Install the requirements with:

pip install -r requirements.txt

Or install the requirements manually:

Package Name
git+https://github.com/tumblr/pytumblr.git
requests
pyyaml
beautifulsoup4
lxml

Note that the officially released pytumblr package may not be the latest version, so it is better to run pip install git+https://github.com/tumblr/pytumblr.git to get the latest version of pytumblr.
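Before running the downloader, it can help to confirm that the interpreter and the packages are in place. The following is a minimal sketch and not part of the repository: the helper name missing_modules is hypothetical, and the import names are the ones these packages normally provide (pyyaml is imported as yaml, beautifulsoup4 as bs4).

```python
import importlib.util
import sys

# Import names normally provided by the packages listed above.
# pyyaml is imported as "yaml" and beautifulsoup4 as "bs4".
REQUIRED_MODULES = ["pytumblr", "requests", "yaml", "bs4", "lxml"]


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    if sys.version_info < (3, 5):
        raise SystemExit("Python >= 3.5 is required")
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable")
```

If anything is reported as missing, rerun the pip command above inside the same environment.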

Execution

In some regions you will need a proxy to use this downloader. If the downloader does not work through the proxy, try setting the proxy to Global Mode and re
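One alternative to a global proxy setting is to export the standard proxy environment variables before any request is made. This is a sketch under assumptions: the requests library reads HTTP_PROXY and HTTPS_PROXY by default, the helper name use_proxy is hypothetical, and the address 127.0.0.1:1080 is a placeholder for your own proxy. Whether the downloader picks these settings up depends on it not overriding the requests defaults.

```python
import os
import urllib.request


def use_proxy(proxy_url):
    """Export proxy settings for HTTP and HTTPS via environment variables."""
    for scheme in ("http", "https"):
        # Set both spellings, since tools differ in which one they read.
        os.environ[scheme.upper() + "_PROXY"] = proxy_url
        os.environ[scheme + "_proxy"] = proxy_url


# Placeholder address; replace it with the address of your own proxy.
use_proxy("http://127.0.0.1:1080")

# The standard library reports the proxies it detects from the environment.
print(urllib.request.getproxies())
```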

  1. Go to https://www.tumblr.com/oauth/apps to register an application and get an OAuth key. OAuth is how the downloader authenticates with the Tumblr API and accesses the content of your blog. (Get to know about OAuth)

    Note that the Tumblr API has rate limits, so don't overuse it and don't share your OAuth key publicly.

    Rate Limits

    Newly registered consumers are rate limited to 1,000 requests per hour, and 5,000 requests per day. If your application requires more requests for either of these periods, please use the 'Request rate limit removal' link on an app above.

  2. After registration you will get an OAuth Consumer Key and a Secret Key.

  3. Configure the functions you need in main.py. Currently three functions are provided:

Function Name                            Explanation
download_likes()                         Download all the posts you liked
download_following()                     Download all the posts in the blogs you are following
download_blog(name or url of the blog)   Download all the posts in the blog you specified
  • The required parameter of download_blog is the name or URL of the blog. Taking the official support blog as an example, the blog name is support and the URL is support.tumblr.com.

  • download_blog has two optional parameters:

    • before_timestamp is a Unix timestamp. All posts published before this timestamp will be downloaded, from the newest to the oldest. If it is not specified (the default), the current time is used, which means all posts will be downloaded. This parameter is useful when the script breaks down and you want to resume it.
    • max_count limits the number of downloaded posts, in case downloading one blog takes too much time. If it is not specified, all posts will be downloaded.
  • download_likes has two optional parameters:

    • before_timestamp is described above.
    • rename controls whether all files are renamed as blog-{no.}. True is the default. If it is set to False, the original post's name is used.
  • download_following has three optional parameters:

    • start_blog specifies which blog to start from. This is useful when the script breaks down and you want to resume it.

    • start_page is the page number to start.

    • max_page is the maximum number of pages to download, in case downloading one blog takes too much time. max_page cannot be larger than 50, since the downloader cannot access more than 50 pages through the Tumblr API.

      When using the offset parameter the maximum limit on the offset is 1000. If you would like to get more results than that use either before or after.

  • Set the downloader to skip reblogged posts by setting downloader.reblog = False.

  • Set the downloader to not download content that has already been downloaded, say from a previous run, by setting downloader.redownload = False.
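Values for before_timestamp and max_page can be prepared with the standard library. The sketch below is illustrative only: the helper names are hypothetical, and the page size of 20 posts is an assumption which, together with the offset cap of 1000 quoted above, yields the 50-page bound.

```python
from datetime import datetime, timezone

MAX_OFFSET = 1000   # offset cap quoted from the Tumblr API documentation
PAGE_SIZE = 20      # assumed number of posts per page


def to_unix_timestamp(year, month, day):
    """Return the Unix timestamp for midnight UTC on the given date."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


def clamp_max_page(requested):
    """Limit a requested page count to what the offset cap allows."""
    return min(requested, MAX_OFFSET // PAGE_SIZE)


# Resume a broken run from posts made before 1 January 2019 (UTC).
before_timestamp = to_unix_timestamp(2019, 1, 1)
max_page = clamp_max_page(80)
print(before_timestamp, max_page)
```

The resulting values can then be supplied as the before_timestamp and max_page parameters described above.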

  1. Run python main.py to start the first-time configuration. You will be taken to an interactive console provided by pytumblr.
    1. First, enter the OAuth Consumer Key and Secret Key you obtained earlier.
    2. The console will return an authorization URL for granting the downloader access to your Tumblr account. Copy and paste it into a web browser and visit it. The page will ask you to authorize the application; allow it. You will then be redirected to another URL that contains the oauth_verifier token. Copy that token and paste it back into the console.
    3. Finally, the downloader will obtain the oauth_token and oauth_token_secret and go on to download your blog.
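The oauth_verifier can be copied by hand, or pulled out of the redirected URL with the standard library. This is a minimal sketch, assuming the verifier appears as a query parameter named oauth_verifier; the function name and the example URL are made up for illustration.

```python
from urllib.parse import parse_qs, urlparse


def extract_oauth_verifier(redirect_url):
    """Return the oauth_verifier query parameter, or None if it is absent."""
    query = parse_qs(urlparse(redirect_url).query)
    values = query.get("oauth_verifier")
    return values[0] if values else None


# Made-up redirect URL for illustration only.
example = "https://example.com/callback?oauth_token=abc&oauth_verifier=xyz123#_=_"
print(extract_oauth_verifier(example))
```

Parsing the URL instead of selecting the token by eye avoids accidentally copying a trailing fragment or a neighbouring parameter into the console.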

Reference

API | Tumblr

Applications | Tumblr

PyTumblr | GitHub
