Crawler

💫 Crawl URLs from a webpage and provide a DomCrawler through the Scraper library.

DomCrawler

The scraper uses the DomCrawler library, a Symfony component for navigating the DOM of HTML and XML documents. See the Symfony DomCrawler documentation for details.
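
To give an idea of what DomCrawler offers, here is a minimal sketch that uses the Symfony component directly, outside this library. The HTML snippet and the variable names are made up for illustration, and the sketch assumes symfony/dom-crawler is available through Composer. It does not show how this library exposes the DomCrawler object.

```php
<?php
// Assumes symfony/dom-crawler is installed through Composer.
require 'vendor/autoload.php';

// Aliased to avoid confusion with Mediashare\Crawler\Crawler.
use Symfony\Component\DomCrawler\Crawler as DomCrawler;

// Hypothetical HTML, used only for illustration.
$html = <<<'HTML'
<html><body>
  <h1>Example page</h1>
  <a href="/Kernel/">Kernel</a>
  <a href="/CodeSnippet/">Snippets</a>
</body></html>
HTML;

$dom = new DomCrawler($html);

// filterXPath() works with DomCrawler alone; filter() with CSS selectors
// additionally requires symfony/css-selector.
$title = $dom->filterXPath('//h1')->text();

// each() runs the closure on every matched node and collects the return values.
$links = $dom->filterXPath('//a')->each(function (DomCrawler $node) {
    return $node->attr('href');
});

var_dump($title, $links);
```

XPath is used here so the sketch needs no extra package; CSS selectors are usually shorter once symfony/css-selector is installed.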

Installation

composer require mediashare/crawler

Usage

<?php
require 'vendor/autoload.php';

use Mediashare\Crawler\Crawler;

$crawler = new Crawler("https://mediashare.fr");
$crawler->run();
dump($crawler);

With Config

<?php
require 'vendor/autoload.php';

use Mediashare\Crawler\Crawler;
use Mediashare\Crawler\Config;

$config = new Config();
$config->setWebspider(true); // Crawl the whole website, not only the given page
$config->setVerbose(true); // Display a progress bar
$config->setPathRequires(['/Kernel/']); // Only crawl paths matching this pattern
$config->setPathExceptions(['/CodeSnippet/']); // Do not crawl paths matching this pattern

$crawler = new Crawler("https://mediashare.fr", $config);
$crawler->run();
dump($crawler);
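
The two path options can be read as an allow list and a deny list. The sketch below shows one way such a filter could behave. It is not the library's implementation: `shouldCrawl` is a hypothetical helper, and treating the configured values as regular expressions passed to `preg_match` is an assumption based only on their slash-delimited form.

```php
<?php
// Hypothetical helper, not part of mediashare/crawler.
// Assumption: each configured value is a regular expression for preg_match().
function shouldCrawl(string $path, array $requires, array $exceptions): bool
{
    // Reject the path as soon as it matches any exception pattern.
    foreach ($exceptions as $pattern) {
        if (preg_match($pattern, $path) === 1) {
            return false;
        }
    }

    // With no required patterns, every remaining path is allowed.
    if ($requires === []) {
        return true;
    }

    // Otherwise the path must match at least one required pattern.
    foreach ($requires as $pattern) {
        if (preg_match($pattern, $path) === 1) {
            return true;
        }
    }

    return false;
}

$requires = ['/Kernel/'];
$exceptions = ['/CodeSnippet/'];

var_dump(shouldCrawl('/Kernel/Router', $requires, $exceptions));      // bool(true)
var_dump(shouldCrawl('/Kernel/CodeSnippet', $requires, $exceptions)); // bool(false)
var_dump(shouldCrawl('/Blog/', $requires, $exceptions));              // bool(false)
```

In this sketch exceptions are checked first, so a path that matches both lists is skipped. Whether the library resolves that conflict the same way is not stated in this README.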