PHP Class Spatie\Crawler\Crawler

Project: spatie/crawler

Protected Properties

Property Type Description
$baseUrl Spatie\Crawler\Url
$client GuzzleHttp\Client
$concurrency integer
$crawlObserver Spatie\Crawler\CrawlObserver
$crawlProfile Spatie\Crawler\CrawlProfile
$crawlQueue Spatie\Crawler\CrawlQueue
$crawledUrls Illuminate\Support\Collection

Public Methods

Method Description
__construct ( Client $client )
__construct ( Client $client, integer $concurrency = 10 )
create ( ) : static
create ( array $clientOptions = [] ) : static
setConcurrency ( integer $concurrency )
setCrawlObserver ( Spatie\Crawler\CrawlObserver $crawlObserver )
setCrawlProfile ( Spatie\Crawler\CrawlProfile $crawlProfile )
startCrawling ( Url | string $baseUrl )

Protected Methods

Method Description
addAllLinksToCrawlQueue ( string $html, Url $foundOnUrl )
crawlAllLinks ( string $html ) Crawl all links in the given html.
crawlUrl ( Url $url ) Crawl the given url.
extractAllLinks ( string $html ) : Collection
getAllLinks ( string $html ) : Url[] Get all links in the given html.
getCrawlRequests ( ) : Generator
handleResponse ( Psr\Http\Message\ResponseInterface | null $response, integer $index )
hasAlreadyCrawled ( Url $url ) : boolean Determine if the crawler has already crawled the given url.
normalizeUrl ( Url $url ) Normalize the given url.
normalizeUrl ( Url $url ) : Url
startCrawlingQueue ( )

Method Details

__construct() public method

public __construct ( Client $client )
$client GuzzleHttp\Client

__construct() public method

public __construct ( Client $client, integer $concurrency = 10 )
$client GuzzleHttp\Client
$concurrency integer

addAllLinksToCrawlQueue() protected method

protected addAllLinksToCrawlQueue ( string $html, Url $foundOnUrl )
$html string
$foundOnUrl Url

crawlUrl() protected method

Crawl the given url.
protected crawlUrl ( Url $url )
$url Url

create() public static method

public static create ( ) : static
Returns static

create() public static method

public static create ( array $clientOptions = [] ) : static
$clientOptions array
Returns static

getCrawlRequests() protected method

protected getCrawlRequests ( ) : Generator
Returns Generator

handleResponse() protected method

protected handleResponse ( Psr\Http\Message\ResponseInterface | null $response, integer $index )
$response Psr\Http\Message\ResponseInterface | null
$index integer
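
The signatures of getCrawlRequests() (a Generator) and handleResponse() (a response or null, plus an index) are consistent with feeding a request generator into Guzzle's Pool, although this page does not confirm that the class does so. The following is a minimal sketch under that assumption, not the library's implementation: the function name crawlPending, its parameters, and the returned status map are hypothetical, and guzzlehttp/guzzle must be installed with Composer's autoloader loaded before the function is called.

```php
<?php
use GuzzleHttp\Client;
use GuzzleHttp\Pool;
use GuzzleHttp\Psr7\Request;
use Psr\Http\Message\ResponseInterface;

// Hypothetical helper (not part of spatie/crawler): fetches the given URLs
// concurrently and returns a map of index => HTTP status code (null on failure).
function crawlPending(Client $client, array $pendingUrls, int $concurrency = 10): array
{
    $statuses = [];

    // Lazily yield one request per URL; each key is passed to the callbacks as $index.
    $requests = (function () use ($pendingUrls) {
        foreach ($pendingUrls as $index => $url) {
            yield $index => new Request('GET', $url);
        }
    })();

    $pool = new Pool($client, $requests, [
        'concurrency' => $concurrency,
        // Called for each successful response.
        'fulfilled' => function (ResponseInterface $response, $index) use (&$statuses) {
            $statuses[$index] = $response->getStatusCode();
        },
        // Called when a request fails; no usable response is recorded.
        'rejected' => function ($reason, $index) use (&$statuses) {
            $statuses[$index] = null;
        },
    ]);

    // Block until every request in the pool has settled.
    $pool->promise()->wait();

    return $statuses;
}
```

A generator keeps the request list lazy, so URLs discovered while earlier responses are being handled could be appended to the queue before the pool asks for its next request.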

hasAlreadyCrawled() protected method

Determine if the crawler has already crawled the given url.
protected hasAlreadyCrawled ( Url $url ) : boolean
$url Url
Returns boolean
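
A check like this is usually paired with URL normalization, so that trivially different spellings of one address count as the same page. The sketch below illustrates the idea with plain strings; it is not the library's code. The class VisitedUrls, its methods, and the normalization rule (drop the fragment and any trailing slash) are all assumptions made for illustration.

```php
<?php
// Hypothetical stand-in, not the library's implementation:
// tracks visited URLs as normalized plain strings.
final class VisitedUrls
{
    /** @var array<string, true> normalized URL => marker */
    private array $seen = [];

    // Assumed normalization rule: drop the #fragment and any trailing slash.
    public static function normalize(string $url): string
    {
        $withoutFragment = explode('#', $url, 2)[0];

        return rtrim($withoutFragment, '/');
    }

    // True if an equivalent URL was marked as crawled earlier.
    public function hasAlreadyCrawled(string $url): bool
    {
        return isset($this->seen[self::normalize($url)]);
    }

    // Record the URL under its normalized form.
    public function markCrawled(string $url): void
    {
        $this->seen[self::normalize($url)] = true;
    }
}

$visited = new VisitedUrls();
$visited->markCrawled('https://example.com/docs/#intro');
```

Using the normalized string as an array key makes the lookup a constant-time isset() check rather than a scan over every crawled URL.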

normalizeUrl() protected method

Normalize the given url.
protected normalizeUrl ( Url $url )
$url Url

normalizeUrl() protected method

protected normalizeUrl ( Url $url ) : Url
$url Url
Returns Url

setConcurrency() public method

public setConcurrency ( integer $concurrency )
$concurrency integer

setCrawlObserver() public method

public setCrawlObserver ( Spatie\Crawler\CrawlObserver $crawlObserver )
$crawlObserver Spatie\Crawler\CrawlObserver

setCrawlProfile() public method

public setCrawlProfile ( Spatie\Crawler\CrawlProfile $crawlProfile )
$crawlProfile Spatie\Crawler\CrawlProfile

startCrawling() public method

public startCrawling ( Url | string $baseUrl )
$baseUrl Url | string
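
Taken together, the public methods suggest a call sequence: create a crawler, configure it, then start at a base URL. The following is a minimal sketch of that sequence, not an excerpt from the project's documentation. It assumes spatie/crawler is installed and Composer's autoloader is loaded, and that the installed version provides every method used; runCrawl and its parameters are hypothetical names. The observer is passed in because this page does not list the methods a CrawlObserver must implement.

```php
<?php
use Spatie\Crawler\Crawler;
use Spatie\Crawler\CrawlObserver;

// Hypothetical helper: wires up a crawler using only public methods listed on this page.
// Assumes Composer's autoloader (vendor/autoload.php) has already been required.
function runCrawl(CrawlObserver $observer, string $startUrl, int $concurrency = 10): void
{
    // The no-argument call is valid for both listed create() signatures.
    $crawler = Crawler::create();

    $crawler->setConcurrency($concurrency);
    $crawler->setCrawlObserver($observer);

    // startCrawling() is documented as accepting a Url object or a string.
    $crawler->startCrawling($startUrl);
}
```

The setters are called as separate statements rather than chained, since this page does not state what they return.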

startCrawlingQueue() protected method

protected startCrawlingQueue ( )

Property Details

$baseUrl protected property

protected Spatie\Crawler\Url $baseUrl
Type Spatie\Crawler\Url

$client protected property

protected GuzzleHttp\Client $client
Type GuzzleHttp\Client

$concurrency protected property

protected int $concurrency
Type integer

$crawlObserver protected property

protected Spatie\Crawler\CrawlObserver $crawlObserver
Type Spatie\Crawler\CrawlObserver

$crawlProfile protected property

protected Spatie\Crawler\CrawlProfile $crawlProfile
Type Spatie\Crawler\CrawlProfile

$crawlQueue protected property

protected Spatie\Crawler\CrawlQueue $crawlQueue
Type Spatie\Crawler\CrawlQueue

$crawledUrls protected property

protected Illuminate\Support\Collection $crawledUrls
Type Illuminate\Support\Collection