PHP Class Symfony\Component\DomCrawler\Crawler

Author: Fabien Potencier
Inheritance: extends SplObjectStorage
Project: symfony/dom-crawler

Protected Properties

Property Type Description
$uri string The current URI

Public Methods

Method Description
__construct ( mixed $node = null, string $currentUri = null, string $baseHref = null )
add ( DOMNodeList | DOMNode | array | string | null $node ) Adds a node to the current list of nodes.
addContent ( string $content, null | string $type = null ) Adds HTML/XML content.
addDocument ( DOMDocument $dom ) Adds a \DOMDocument to the list of nodes.
addHtmlContent ( string $content, string $charset = 'UTF-8' ) Adds HTML content to the list of nodes.
addNode ( DOMNode $node ) Adds a \DOMNode instance to the list of nodes.
addNodeList ( DOMNodeList $nodes ) Adds a \DOMNodeList to the list of nodes.
addNodes ( array $nodes ) Adds an array of \DOMNode instances to the list of nodes.
addXmlContent ( string $content, string $charset = 'UTF-8', integer $options = LIBXML_NONET ) Adds XML content to the list of nodes.
attr ( string $attribute ) : string | null Returns the attribute value of the first node of the list.
children ( ) : Crawler Returns the child nodes of the current selection.
clear ( ) Removes all the nodes.
count ( ) : integer
each ( Closure $closure ) : array Calls an anonymous function on each node of the list.
eq ( integer $position ) : Crawler Returns a node given its position in the node list.
evaluate ( string $xpath ) : array | Crawler Evaluates an XPath expression.
extract ( array $attributes ) : array Extracts information from the list of nodes.
filter ( string $selector ) : Crawler Filters the list of nodes with a CSS selector.
filterXPath ( string $xpath ) : Crawler Filters the list of nodes with an XPath expression.
first ( ) : Crawler Returns the first node of the current selection.
form ( array $values = null, string $method = null ) : Form Returns a Form object for the first node in the list.
getBaseHref ( ) : string Returns base href.
getIterator ( ) : ArrayIterator
getNode ( integer $position ) : DOMElement | null
getUri ( ) : string Returns the current URI.
html ( ) : string Returns the first node of the list as HTML.
image ( ) : Symfony\Component\DomCrawler\Image Returns an Image object for the first node in the list.
images ( ) : Symfony\Component\DomCrawler\Image[] Returns an array of Image objects for the nodes in the list.
last ( ) : Crawler Returns the last node of the current selection.
link ( string $method = 'get' ) : Symfony\Component\DomCrawler\Link Returns a Link object for the first node in the list.
links ( ) : Symfony\Component\DomCrawler\Link[] Returns an array of Link objects for the nodes in the list.
nextAll ( ) : Crawler Returns the next sibling nodes of the current selection.
nodeName ( ) : string Returns the node name of the first node of the list.
parents ( ) : Crawler Returns the parent nodes of the current selection.
previousAll ( ) : Crawler Returns the previous sibling nodes of the current selection.
reduce ( Closure $closure ) : Crawler Reduces the list of nodes by calling an anonymous function.
registerNamespace ( string $prefix, string $namespace )
selectButton ( string $value ) : Crawler Selects a button by name or alt value for images.
selectImage ( string $value ) : Crawler Selects images by alt value.
selectLink ( string $value ) : Crawler Selects links by name or alt value for clickable images.
setDefaultNamespacePrefix ( string $prefix ) Overloads a default namespace prefix to be used with XPath and CSS expressions.
siblings ( ) : Crawler Returns the sibling nodes of the current selection.
slice ( integer $offset, integer $length = null ) : Crawler Slices the list of nodes by $offset and $length.
text ( ) : string Returns the node value of the first node of the list.
xpathLiteral ( string $s ) : string Converts string for XPath expressions.

Protected Methods

Method Description
sibling ( DOMElement $node, string $siblingDir = 'nextSibling' ) : array

Private Methods

Method Description
createDOMXPath ( DOMDocument $document, array $prefixes = [] ) : DOMXPath
createSubCrawler ( DOMElement | DOMElement[] | DOMNodeList | null $nodes ) : static Creates a crawler for some subnodes.
discoverNamespace ( DOMXPath $domxpath, string $prefix ) : string
filterRelativeXPath ( string $xpath ) : Crawler Filters the list of nodes with an XPath expression.
findNamespacePrefixes ( string $xpath ) : array
relativize ( string $xpath ) : string Make the XPath relative to the current context.

Method Details

__construct() public method

public __construct ( mixed $node = null, string $currentUri = null, string $baseHref = null )
$node mixed A Node to use as the base for the crawling
$currentUri string The current URI
$baseHref string The base href value

add() public method

This method uses the appropriate specialized add*() method based on the type of the argument.
public add ( DOMNodeList | DOMNode | array | string | null $node )
$node DOMNodeList | DOMNode | array | string | null A node
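
As a rough illustration of that dispatch, the sketch below passes three different argument types to add(). It assumes the package was installed with Composer (`composer require symfony/dom-crawler`) and that vendor/autoload.php is reachable from the script; the variable names are made up for the example.

```php
<?php
// Assumption: symfony/dom-crawler is installed via Composer.
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

// A string argument is parsed as HTML/XML content.
$fromString = new Crawler();
$fromString->add('<html><body><p>Hello</p></body></html>');

// A DOMNodeList argument adds every node it contains.
$dom = new DOMDocument();
$dom->loadHTML('<html><body><ul><li>one</li><li>two</li></ul></body></html>');
$fromList = new Crawler();
$fromList->add($dom->getElementsByTagName('li'));

// A single DOMNode argument is added on its own.
$fromNode = new Crawler();
$fromNode->add($dom->getElementsByTagName('ul')->item(0));

echo count($fromString), ' ', count($fromList), ' ', count($fromNode), PHP_EOL;
```

Separate Crawler instances are used on purpose: recent releases refuse to mix nodes from different documents in one crawler.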

addContent() public method

If the charset is not set via the content type, it is assumed to be ISO-8859-1, which is the default charset defined by the HTTP 1.1 specification.
public addContent ( string $content, null | string $type = null )
$content string A string to parse as HTML/XML
$type null | string The content type of the string

addDocument() public method

Adds a \DOMDocument to the list of nodes.
public addDocument ( DOMDocument $dom )
$dom DOMDocument A \DOMDocument instance

addHtmlContent() public method

The libxml errors are disabled when the content is parsed. To inspect parsing errors, enable internal errors with libxml_use_internal_errors(true), then read them with libxml_get_errors(). Be sure to clear the errors with libxml_clear_errors() afterward.
public addHtmlContent ( string $content, string $charset = 'UTF-8' )
$content string The HTML content
$charset string The charset
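
The error-handling sequence described above might look like the following sketch. It assumes a Composer install of symfony/dom-crawler; the malformed markup is invented for the example, and whether libxml reports anything for it depends on the libxml and Symfony versions in use.

```php
<?php
// Assumption: symfony/dom-crawler is installed via Composer.
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

// Collect libxml errors internally and remember the previous setting.
$previous = libxml_use_internal_errors(true);

$crawler = new Crawler();
$crawler->addHtmlContent('<p>Unclosed <b>tag</p></i>', 'UTF-8');

// Read whatever the parser recorded, if anything.
$errors = libxml_get_errors();
foreach ($errors as $error) {
    printf("line %d: %s\n", $error->line, trim($error->message));
}

// Clear the buffer and restore the previous error mode.
libxml_clear_errors();
libxml_use_internal_errors($previous);
```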

addNode() public method

Adds a \DOMNode instance to the list of nodes.
public addNode ( DOMNode $node )
$node DOMNode A \DOMNode instance

addNodeList() public method

Adds a \DOMNodeList to the list of nodes.
public addNodeList ( DOMNodeList $nodes )
$nodes DOMNodeList A \DOMNodeList instance

addNodes() public method

Adds an array of \DOMNode instances to the list of nodes.
public addNodes ( array $nodes )
$nodes array An array of \DOMNode instances

addXmlContent() public method

The libxml errors are disabled when the content is parsed. To inspect parsing errors, enable internal errors with libxml_use_internal_errors(true), then read them with libxml_get_errors(). Be sure to clear the errors with libxml_clear_errors() afterward.
public addXmlContent ( string $content, string $charset = 'UTF-8', integer $options = LIBXML_NONET )
$content string The XML content
$charset string The charset
$options integer Bitwise OR of the libxml option constants. LIBXML_PARSEHUGE is dangerous; see http://symfony.com/blog/security-release-symfony-2-0-17-released

attr() public method

Returns the attribute value of the first node of the list.
public attr ( string $attribute ) : string | null
$attribute string The attribute name
Returns string | null The attribute value or null if the attribute does not exist

children() public method

Returns the child nodes of the current selection.
public children ( ) : Crawler
Returns Crawler A Crawler instance with the child nodes

clear() public method

Removes all the nodes.
public clear ( )

count() public method

public count ( ) : integer
Returns integer

each() public method

The anonymous function receives the node, wrapped in a Crawler instance, and its position as arguments. Example: $crawler->filter('h1')->each(function ($node, $i) { return $node->text(); });
public each ( Closure $closure ) : array
$closure Closure An anonymous function
Returns array An array of values returned by the anonymous function
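
A slightly fuller version of the example above, as a sketch. It assumes symfony/dom-crawler and symfony/css-selector (needed by filter()) are installed with Composer; the markup is invented for the example.

```php
<?php
// Assumption: symfony/dom-crawler and symfony/css-selector are installed via Composer.
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

$crawler = new Crawler('<html><body><h1>First</h1><h1>Second</h1></body></html>');

// The closure gets each node as a Crawler, plus its zero-based position.
$titles = $crawler->filter('h1')->each(function (Crawler $node, $i) {
    return sprintf('%d: %s', $i, $node->text());
});

print_r($titles);
```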

eq() public method

Returns a node given its position in the node list.
public eq ( integer $position ) : Crawler
$position integer The position
Returns Crawler A new instance of the Crawler with the selected node, or an empty Crawler if it does not exist

evaluate() public method

Since an XPath expression might evaluate to either a simple type or a \DOMNodeList, this method returns either an array of simple types or a new Crawler instance.
public evaluate ( string $xpath ) : array | Crawler
$xpath string An XPath expression
Returns array | Crawler An array of evaluation results or a new Crawler instance
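
The two return shapes can be seen side by side in the sketch below. It assumes a Composer install of a symfony/dom-crawler release that provides evaluate(); the markup is invented for the example.

```php
<?php
// Assumption: symfony/dom-crawler (a release that has evaluate()) is installed via Composer.
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

$crawler = new Crawler('<html><body><ul><li>a</li><li>b</li><li>c</li></ul></body></html>');

// count() yields a simple type, so the result is an array with one value per context node.
$counts = $crawler->evaluate('count(//li)');

// A node-set expression yields a new Crawler instead.
$items = $crawler->evaluate('//li');

var_dump($counts);
var_dump($items instanceof Crawler);
```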

extract() public method

You can extract attributes and/or the node value (_text). Example: $crawler->filter('h1 a')->extract(array('_text', 'href'));
public extract ( array $attributes ) : array
$attributes array An array of attributes
Returns array An array of extracted values
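
The sketch below runs the same call against a small, made-up document to show the shape of the result when more than one attribute is requested. It assumes symfony/dom-crawler and symfony/css-selector are installed with Composer.

```php
<?php
// Assumption: symfony/dom-crawler and symfony/css-selector are installed via Composer.
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

$html = '<html><body>'
    . '<h1><a href="/a">Alpha</a></h1>'
    . '<h1><a href="/b">Beta</a></h1>'
    . '</body></html>';
$crawler = new Crawler($html);

// One row per matched node; each row holds the requested values in order.
$rows = $crawler->filter('h1 a')->extract(array('_text', 'href'));

print_r($rows);
```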

filter() public method

This method only works if you have installed the CssSelector Symfony Component.
public filter ( string $selector ) : Crawler
$selector string A CSS selector
Returns Crawler A new instance of Crawler with the filtered list of nodes
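
A minimal usage sketch, assuming both symfony/dom-crawler and symfony/css-selector were installed with Composer; the markup and class names are invented for the example.

```php
<?php
// Assumption: symfony/dom-crawler and symfony/css-selector are installed via Composer.
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

$crawler = new Crawler(
    '<html><body><h1>Title</h1><p class="intro">First</p><p>Second</p></body></html>'
);

// filter() returns a new Crawler; the original selection is left untouched.
$paragraphs = $crawler->filter('p');
$intro = $crawler->filter('p.intro');

echo count($paragraphs), ' paragraphs, intro: ', $intro->text(), PHP_EOL;
```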

filterXPath() public method

The XPath expression is evaluated in the context of the crawler, which is considered a fake parent of the elements inside it. This means that a child selector "div" or "./div" will match only the div elements of the current crawler, not their children.
public filterXPath ( string $xpath ) : Crawler
$xpath string An XPath expression
Returns Crawler A new instance of Crawler with the filtered list of nodes
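
The "fake parent" rule is easier to see on nested elements. The sketch below assumes a Composer install of a symfony/dom-crawler release that behaves as described above; the ids are invented for the example.

```php
<?php
// Assumption: symfony/dom-crawler is installed via Composer.
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

$crawler = new Crawler(
    '<html><body><div id="outer"><div id="inner"></div></div></body></html>'
);

// Select only the outer div: the direct child of <body>.
$outer = $crawler->filterXPath('//body/div');

// "div" is matched against the crawler's own elements, so it finds the outer div again.
$same = $outer->filterXPath('div');

// To reach the nested div, step down one level explicitly.
$nested = $outer->filterXPath('div/div');

echo $same->attr('id'), ' ', $nested->attr('id'), PHP_EOL;
```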

first() public method

Returns the first node of the current selection.
public first ( ) : Crawler
Returns Crawler A Crawler instance with the first selected node

form() public method

Returns a Form object for the first node in the list.
public form ( array $values = null, string $method = null ) : Form
$values array An array of values for the form fields
$method string The method for the form
Returns Form A Form instance

getBaseHref() public method

Returns the base href.
public getBaseHref ( ) : string
Returns string

getIterator() public method

public getIterator ( ) : ArrayIterator
Returns ArrayIterator

getNode() public method

public getNode ( integer $position ) : DOMElement | null
$position integer
Returns DOMElement | null

getUri() public method

Returns the current URI.
public getUri ( ) : string
Returns string

html() public method

Returns the first node of the list as HTML.
public html ( ) : string
Returns string The node HTML

image() public method

Returns an Image object for the first node in the list.
public image ( ) : Symfony\Component\DomCrawler\Image
Returns Symfony\Component\DomCrawler\Image An Image instance

images() public method

Returns an array of Image objects for the nodes in the list.
public images ( ) : Symfony\Component\DomCrawler\Image[]
Returns Symfony\Component\DomCrawler\Image[] An array of Image instances

last() public method

Returns the last node of the current selection.
public last ( ) : Crawler
Returns Crawler A Crawler instance with the last selected node

nextAll() public method

Returns the next sibling nodes of the current selection.
public nextAll ( ) : Crawler
Returns Crawler A Crawler instance with the next sibling nodes

nodeName() public method

Returns the node name of the first node of the list.
public nodeName ( ) : string
Returns string The node name

parents() public method

Returns the parent nodes of the current selection.
public parents ( ) : Crawler
Returns Crawler A Crawler instance with the parent nodes of the current selection

previousAll() public method

Returns the previous sibling nodes of the current selection.
public previousAll ( ) : Crawler
Returns Crawler A Crawler instance with the previous sibling nodes

reduce() public method

To remove a node from the list, the anonymous function must return false.
public reduce ( Closure $closure ) : Crawler
$closure Closure An anonymous function
Returns Crawler A Crawler instance with the selected nodes
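
As a sketch of that rule, the closure below returns false for odd positions, so only the nodes at even positions survive. It assumes symfony/dom-crawler and symfony/css-selector are installed with Composer and that the closure receives the node and its position, as each() does; the markup is invented for the example.

```php
<?php
// Assumption: symfony/dom-crawler and symfony/css-selector are installed via Composer.
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

$crawler = new Crawler(
    '<html><body><ul><li>a</li><li>b</li><li>c</li><li>d</li></ul></body></html>'
);

// Returning false drops the node; any other return value keeps it.
$kept = $crawler->filter('li')->reduce(function (Crawler $node, $i) {
    return ($i % 2) === 0;
});

$texts = $kept->each(function (Crawler $node) {
    return $node->text();
});

print_r($texts);
```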

registerNamespace() public method

public registerNamespace ( string $prefix, string $namespace )
$prefix string
$namespace string

selectButton() public method

Selects a button by name or alt value for images.
public selectButton ( string $value ) : Crawler
$value string The button text
Returns Crawler A new instance of Crawler with the filtered list of nodes

selectImage() public method

Selects images by alt value.
public selectImage ( string $value ) : Crawler
$value string The image alt
Returns Crawler A new instance of Crawler with the filtered list of nodes

setDefaultNamespacePrefix() public method

Overloads a default namespace prefix to be used with XPath and CSS expressions.
public setDefaultNamespacePrefix ( string $prefix )
$prefix string

sibling() protected method

protected sibling ( DOMElement $node, string $siblingDir = 'nextSibling' ) : array
$node DOMElement
$siblingDir string
Returns array

siblings() public method

Returns the sibling nodes of the current selection.
public siblings ( ) : Crawler
Returns Crawler A Crawler instance with the sibling nodes

slice() public method

Slices the list of nodes by $offset and $length.
public slice ( integer $offset, integer $length = null ) : Crawler
$offset integer
$length integer
Returns Crawler A Crawler instance with the sliced nodes

text() public method

Returns the node value of the first node of the list.
public text ( ) : string
Returns string The node value

xpathLiteral() public static method

Escaped characters are the double quote (") and the apostrophe ('). Examples:
    echo Crawler::xpathLiteral('foo " bar'); // prints 'foo " bar'
    echo Crawler::xpathLiteral("foo ' bar"); // prints "foo ' bar"
    echo Crawler::xpathLiteral('a\'b"c'); // prints concat('a', "'", 'b"c')
public static xpathLiteral ( string $s ) : string
$s string String to be escaped
Returns string Converted string
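
One place this helps is when a user-supplied string has to be embedded in an XPath expression. The sketch below assumes a Composer install of symfony/dom-crawler; the link text and URL are invented for the example.

```php
<?php
// Assumption: symfony/dom-crawler is installed via Composer.
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

$crawler = new Crawler(
    '<html><body><a href="/guide">Don\'t panic</a><a href="/other">Other</a></body></html>'
);

// The label contains an apostrophe, so it cannot be wrapped in single quotes by hand.
$label = "Don't panic";
$literal = Crawler::xpathLiteral($label);

// Embed the escaped literal in the expression.
$link = $crawler->filterXPath('//a[text()=' . $literal . ']');

echo $literal, ' -> ', $link->attr('href'), PHP_EOL;
```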

Property Details

$uri protected property

The current URI
protected $uri