This repository has been archived by the owner on Jun 10, 2024. It is now read-only.

v0.3.9

@binux binux released this 18 Mar 21:00
· 273 commits to master since this release

New features:

  • Support for Python 3.6.
  • Auto Pause: a project is paused for scheduler.PAUSE_TIME (default: 5 min) when its last scheduler.FAIL_PAUSE_NUM (default: 10) tasks have all failed. After scheduler.PAUSE_TIME, the scheduler dispatches scheduler.UNPAUSE_CHECK_NUM (default: 3) tasks, and the project resumes if any one of them succeeds.
  • Each callback now has a default 30s processing time limit (platform support required). @beader
  • New JavaScript render engine: Splash support, enabled by the fetch argument --splash-endpoint=http://splash:8050/execute
  • Python 3 webdav support.
  • Python 3 support for "from projects import project".
  • A link to the corresponding task is added to the webui debug page when debugging an existing task in webui.
  • New user_agent parameter in self.crawl; you can still set the user agent through headers.
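
The Auto Pause rule listed above can be modelled as a small state machine. The sketch below is illustrative only and is not pyspider's scheduler code: the AutoPause class, its state and record methods, and the state names are hypothetical. Only the three constant names and their defaults come from the release notes. It also assumes that a project is paused again when all of the check tasks fail, which the notes do not state.

```python
from collections import deque

# Names and defaults taken from the release notes.
PAUSE_TIME = 5 * 60        # seconds
FAIL_PAUSE_NUM = 10
UNPAUSE_CHECK_NUM = 3


class AutoPause:
    """Hypothetical model of the auto-pause rule for a single project."""

    def __init__(self):
        # Outcomes of the most recent tasks: True = success, False = failure.
        self.recent = deque(maxlen=FAIL_PAUSE_NUM)
        self.paused_at = None   # timestamp of the last pause, or None
        self.failed_checks = 0  # failed check tasks since the pause elapsed

    def state(self, now):
        """Return 'running', 'paused', or 'checking' at time `now`."""
        if self.paused_at is None:
            return "running"
        if now - self.paused_at < PAUSE_TIME:
            return "paused"
        return "checking"

    def record(self, success, now):
        """Record the outcome of one finished task."""
        state = self.state(now)
        if state == "paused":
            return  # nothing is dispatched while paused
        if state == "checking":
            if success:
                # Any successful check task resumes the project.
                self.paused_at = None
                self.failed_checks = 0
                self.recent.clear()
            else:
                self.failed_checks += 1
                if self.failed_checks >= UNPAUSE_CHECK_NUM:
                    # Assumption: all check tasks failed, so pause again.
                    self.paused_at = now
                    self.failed_checks = 0
            return
        # Running: pause once the last FAIL_PAUSE_NUM tasks have all failed.
        self.recent.append(success)
        if len(self.recent) == FAIL_PAUSE_NUM and not any(self.recent):
            self.paused_at = now
            self.failed_checks = 0
```

For example, after ten consecutive calls to record(False, now=0), state(0) returns "paused"; once 300 seconds have passed the project is in "checking", and a single record(True, ...) puts it back to "running".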

Fix several bugs:

  • New webui dashboard frontend framework, vue.js, which improves performance when there is a large number of tasks (e.g. http://demo.pyspider.org/).
  • Fix crawl_config not working in webui while debugging a script.
  • Fix CSS Selector Helper not working. @ackalker
  • Fix connection_timeout not working.
  • Fix need_auth option not being applied to webdav.
  • Fix "fix can't dump counter to file: scheduler.all" error.
  • Some other fixes.