PHP Class PicoFeed\Scraper\CandidateParser

Author: Frederic Guillot
Inheritance: implements PicoFeed\Scraper\ParserInterface
Open project: fguillot/picofeed

Public Methods

Method Description
__construct ( string $html ) Constructor.
execute ( ) : string Get the relevant content with the list of potential attributes.
findContentWithArticle ( ) : string Find the <article> tag.
findContentWithBody ( ) : string Find the <body> tag.
findContentWithCandidates ( ) : string Find content based on the list of tag candidates.
findNextLink ( ) : string Find link for next page of the article.
shouldRemove ( DomDocument $dom, DomNode $node ) : boolean Return false if the node should not be removed.
stripAttributes ( DomDocument $dom, DOMXPath $xpath ) Remove blacklisted attributes.
stripGarbage ( string $content ) : string Strip useless tags.
stripTags ( DOMXPath $xpath ) Remove blacklisted tags.

Method Details

__construct() public method

Constructor.
public __construct ( string $html )
$html string

execute() public method

Get the relevant content with the list of potential attributes.
public execute ( ) : string
Returns string

findContentWithArticle() public method

Find the <article> tag.
public findContentWithArticle ( ) : string
Returns string
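
To illustrate the idea, the sketch below locates the first <article> element with PHP's DOM extension and returns its markup. It is a minimal sketch under stated assumptions, not PicoFeed's implementation: the function name sketchFindArticle is hypothetical, and returning an empty string when nothing matches is an assumed convention.

```php
<?php
// Hypothetical helper (not part of PicoFeed): return the first <article>
// element of an HTML document as a string, or '' when there is none.
function sketchFindArticle(string $html): string
{
    // DOMDocument::loadHTML() rejects empty input, so bail out early.
    if (trim($html) === '') {
        return '';
    }

    $dom = new DOMDocument();

    // Collect libxml warnings (for example about HTML5 tags) instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $nodes = $xpath->query('//article');

    if ($nodes !== false && $nodes->length > 0) {
        // Serialize only the matched node, not the whole document.
        return (string) $dom->saveHTML($nodes->item(0));
    }

    return '';
}
```

Calling sketchFindArticle() on a page that wraps its main text in <article> returns that element and leaves out surrounding markup such as menus.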

findContentWithBody() public method

Find the <body> tag.
public findContentWithBody ( ) : string
Returns string

findContentWithCandidates() public method

Find content based on the list of tag candidates.
public findContentWithCandidates ( ) : string
Returns string
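
One way a candidate-based search could work is sketched below: walk an ordered list of class or id values and return the first matching element that holds enough text. This is an assumption about the general technique, not a copy of PicoFeed's code. The function name sketchFindByCandidates, the candidate values, and the $minLength threshold are all hypothetical.

```php
<?php
// Hypothetical helper (not part of PicoFeed): return the first element whose
// class contains, or whose id equals, one of the candidate values.
// Candidates are assumed to be simple tokens without quote characters.
function sketchFindByCandidates(string $html, array $candidates, int $minLength = 20): string
{
    if (trim($html) === '') {
        return '';
    }

    $dom = new DOMDocument();
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);

    // Earlier candidates win, so the list order expresses priority.
    foreach ($candidates as $candidate) {
        $query = sprintf('//*[contains(@class, "%1$s") or @id="%1$s"]', $candidate);
        $nodes = $xpath->query($query);

        if ($nodes === false) {
            continue;
        }

        foreach ($nodes as $node) {
            // Skip near-empty matches; the threshold is an arbitrary assumption.
            if (strlen(trim($node->textContent)) >= $minLength) {
                return (string) $dom->saveHTML($node);
            }
        }
    }

    return '';
}
```

For example, sketchFindByCandidates($html, ['articleBody', 'post-content']) tries the articleBody marker first and falls back to post-content only when the first yields nothing.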

shouldRemove() public method

Return false if the node should not be removed.
public shouldRemove ( DomDocument $dom, DomNode $node ) : boolean
$dom DomDocument
$node DomNode
Returns boolean

stripAttributes() public method

Remove blacklisted attributes.
public stripAttributes ( DomDocument $dom, DOMXPath $xpath )
$dom DomDocument
$xpath DOMXPath
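
Removing blacklisted attributes can be sketched with DOMXPath as follows. This is a minimal illustration rather than the library's code: the function name sketchStripAttributes is hypothetical, and the blacklist is passed in as a parameter, which is an assumption that differs from the signature documented above.

```php
<?php
// Hypothetical helper (not part of PicoFeed): remove every attribute whose
// name appears in $blacklist from all elements reachable through $xpath.
function sketchStripAttributes(DOMXPath $xpath, array $blacklist): void
{
    foreach ($blacklist as $attribute) {
        // Select every element that carries this attribute.
        $nodes = $xpath->query('//*[@' . $attribute . ']');

        if ($nodes === false) {
            continue;
        }

        // XPath results are a snapshot, so editing nodes while iterating is safe.
        foreach ($nodes as $node) {
            $node->removeAttribute($attribute);
        }
    }
}

// Usage: strip inline styles and click handlers, keep everything else.
$dom = new DOMDocument();
$dom->loadHTML('<html><body><p style="color:red" class="x" onclick="go()">Hi</p></body></html>');
sketchStripAttributes(new DOMXPath($dom), ['style', 'onclick']);
$cleaned = $dom->saveHTML();
```

After the call, $cleaned still contains class="x" but neither the style nor the onclick attribute.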

stripGarbage() public method

Strip useless tags.
public stripGarbage ( string $content ) : string
$content string
Returns string

stripTags() public method

Remove blacklisted tags.
public stripTags ( DOMXPath $xpath )
$xpath DOMXPath
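
As a rough illustration of tag blacklisting, the sketch below deletes every element whose tag name is on a list. It is not PicoFeed's implementation: the function name sketchStripTags is hypothetical, and passing the blacklist as a parameter is an assumption.

```php
<?php
// Hypothetical helper (not part of PicoFeed): remove every element whose tag
// name appears in $blacklist, together with all of its children.
function sketchStripTags(DOMXPath $xpath, array $blacklist): void
{
    foreach ($blacklist as $tag) {
        $nodes = $xpath->query('//' . $tag);

        if ($nodes === false) {
            continue;
        }

        // XPath results are a snapshot, so removing nodes while iterating is safe.
        foreach ($nodes as $node) {
            if ($node->parentNode !== null) {
                $node->parentNode->removeChild($node);
            }
        }
    }
}

// Usage: drop scripts and iframes, keep the paragraph.
$dom = new DOMDocument();
$dom->loadHTML('<html><body><p>Keep</p><script>alert(1)</script><iframe src="ad.html"></iframe></body></html>');
sketchStripTags(new DOMXPath($dom), ['script', 'iframe']);
$stripped = $dom->saveHTML();
```

Here $stripped keeps the <p>Keep</p> paragraph while the script and iframe elements are gone.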