Skip to content

song940/node-crawler

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

node-crawler

simple crawler

Installation

$ npm install @song940/crawler --save

Example

const Crawler = require('@song940/crawler');

const engine = new Crawler({ 
    url: 'https://news.ycombinator.com'
});

engine.parse = async ($, commit) => {
    $('table.itemlist tr.athing').each((i, row) => {
        const title = $('td.title', row).text();
        commit({ title });
    });
};

engine.on('commit', posts => {
    console.log(posts);
});

for(var i=1;i<100;i++){
    engine.push(`${engine.url}/news?p=${i}`);
}

engine.on('end', () => {
    console.log('all job done');
});

engine.start();

Contributing

  • Fork this Repo first
  • Clone your Repo
  • Install dependencies by $ npm install
  • Checkout a feature branch
  • Feel free to add your features
  • Make sure your features are fully tested
  • Publish your local branch, Open a pull request
  • Enjoy hacking <3

ISC

This work is licensed under the ISC license.