is there alternative project like this project? #976

Open
socialpercon opened this issue Sep 19, 2021 · 4 comments

Comments

@socialpercon

Is there an alternative project like this one?

I don't understand why this project is no longer maintained. I think an alternative project might be more powerful, but I don't know of one.

@Chaffy-0

Chaffy-0 commented Dec 6, 2021

Scrapy

@JermellB

JermellB commented Dec 19, 2021

This project isn't maintained any more because its JavaScript rendering is done by PhantomJS, which is itself no longer maintained.

Like @Chaffy-0 said, Scrapy is likely the best option if you want to build a spider like this.

These days, Elasticsearch comes paired with a crawler, which may be enough if you are doing something simple and don't need to collect and process your own data from the wild.

Most places I've worked at use things like Selenium + Chrome or Firefox, paired with Beautiful Soup for parsing the rendered HTML. You can then keep track of where you've already spidered with something simple, like a Bloom filter implemented on top of Redis.
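To make the Bloom filter idea concrete, here is a minimal sketch of "have I visited this URL already?". It is only an illustration under some assumptions: the filter lives in an in-memory bytearray rather than in Redis, and the names `BloomFilter` and `crawl_order`, the filter size, and the hash count are all made up for this example.

```python
import hashlib


class BloomFilter:
    """Tiny in-memory Bloom filter (hypothetical helper for illustration)."""

    def __init__(self, size_bits: int = 1 << 20, num_hashes: int = 5) -> None:
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item: str):
        # Double hashing: derive num_hashes bit positions from two
        # 64-bit slices of a single SHA-256 digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.size_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # True may be a false positive; False is always correct.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def crawl_order(urls, seen=None):
    """Return URLs in first-seen order, skipping ones the filter has seen."""
    seen = seen if seen is not None else BloomFilter()
    fresh = []
    for url in urls:
        if url in seen:
            continue
        seen.add(url)
        fresh.append(url)
    return fresh
```

To back this with Redis instead, the same bit positions could be written and read with the `SETBIT` and `GETBIT` commands against a single key, so several crawler processes share one filter. The trade-off is the usual one for Bloom filters: a small chance of skipping a URL that was never actually visited, in exchange for a fixed memory footprint.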

But yeah, go with Scrapy if you don't feel like getting your hands too dirty.

@milahu

milahu commented Apr 18, 2022

Some active Python web scraper projects:
https://github.com/Gerapy/Gerapy
https://github.com/howie6879/ruia

@roniemartinez

Just in case people are interested in my project 🙇: https://github.com/roniemartinez/dude
