PHP Class eZXMLInputParser

Project: ezsystems/ezpublish-legacy

Public Properties

Property Type Description
$AllowMultipleSpaces
$AllowNumericEntities
$DOMDocumentClass
$DetectErrorLevel
$Document
$InputTags Each array element describes a tag that comes from the input. The array index is the tag's name. Each element is an array that may contain the following members: 'name' - a string giving the new name of the tag; 'nameHandler' - the name of a function that returns the new tag name. Function format: function tagNameHandler( $tagName, &$attributes ). If neither 'name' nor 'nameHandler' is defined, the tag's original name is used. 'noChildren' - a boolean value that determines whether this tag can have child tags; the default value is false. Example: public $InputTags = array( 'original-name' => array( 'name' => 'new-name' ), 'original-name2' => array( 'nameHandler' => 'tagNameHandler', 'noChildren' => true ), ... );
$IsInputValid
$Messages
$Namespaces
$OutputTags Each array element describes a tag present in the output. The array index is the tag's name. Each element is an array that may contain the following members: 'parsingHandler' - "Parsing handler", called at pass 1 before processing the tag's children. 'initHandler' - "Init handler", called at pass 2 before processing the tag's children. 'structHandler' - "Structure handler", called at pass 2 after processing the tag's children but before the schema validity check. It can be used to implement structure transformations. 'publishHandler' - "Publish handler", called at pass 2 after the schema validity check, so it runs when the element has its guaranteed place in the DOM tree. 'attributes' - an array that describes attribute transformations. The array index is the original name of an attribute, and the value is the new name. 'requiredInputAttributes' - attributes that are required in the input tag. If any of them is missing, the invalid-input flag is raised. Example: public $OutputTags = array( 'custom' => array( 'parsingHandler' => 'parsingHandlerCustom', 'initHandler' => 'initHandlerCustom', 'structHandler' => 'structHandlerCustom', 'publishHandler' => 'publishHandlerCustom', 'attributes' => array( 'title' => 'name' ) ), ... );
$ParentStack
$ParseLineBreaks Options that depend on the parameters passed
$QuitProcess
$RemoveDefaultAttrs
$StrictHeaders
$TrimSpaces Options that depend on settings
$ValidateErrorLevel
$XMLSchema
$eZPublishVersion

Public Methods

Method Description
callInputHandler ( $handlerName, $tagName, &$attributes )
callOutputHandler ( $handlerName, $element, &$params )
convertNumericEntities ( $text )
createAndPublishElement ( $elementName, &$ret ) Creates a new element for further processing (the 'structure' and 'publish' handlers are called for it).
createRootNode ( )
eZXMLInputParser ( $validateErrorLevel = self::ERROR_NONE, $detectErrorLevel = self::ERROR_NONE, $parseLineBreaks = false, $removeDefaultAttrs = false ) The constructor.
entitiesDecode ( $text )
fixSubtree ( $element, $mainChild ) Remove only nodes that don't match schema (recursively)
getMessages ( )
handleError ( $type, $message )
isValid ( )
parseAttributes ( $attributeString )
parseTag ( &$data, &$pos, &$parent ) The main recursive function for pass 1
performPass1 ( &$data ) Pass 1: parses the source HTML string.
performPass2 ( ) Pass 2: processes the tree, runs handlers, rebuilds and validates.
process ( $text, $createRootNode = true ) Call this function to process your input.
processAttributesBySchema ( $element )
processBySchemaPresence ( $element ) Check if the element is allowed to exist in this document and remove it if not.
processBySchemaTree ( $element ) Check that element has a correct position in the tree and fix it if not.
processNewElements ( $createdElements )
processSubtree ( $element, &$lastHandlerResult ) The main recursive function for pass 2
removeAllAttributes ( DOMElement $element ) Removes all attribute nodes from element node $element
setAttributes ( $element, $attributes )
setDOMDocumentClass ( $DOMDocumentClass )
setParseLineBreaks ( $value )
setRemoveDefaultAttrs ( $value )
washText ( $textContent )

Protected Methods

Method Description
wordMatchSupport ( $newTagName, $attributes, $attributeString ) Returns the modified attributes parameter

Private Methods

Method Description
findEndOpeningTagPosition ( string $data, integer $tagBeginPos, integer $offset ) : integer | false Finds the position of the > character that marks the end of the opening tag starting at $tagBeginPos in $data.
isValidXmlTag ( string $code ) : boolean Checks whether $code can be considered as a valid XML excerpt. If not, it's probably because we found a '>' in the middle of an attribute.
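
The problem these two private methods address, a '>' that appears inside an attribute value, can be illustrated with a small standalone sketch. The function findTagEnd() below is hypothetical and is not the library's implementation: it tracks quote characters while scanning, whereas the descriptions above suggest the parser tests candidate excerpts with isValidXmlTag().

```php
<?php
// Hypothetical helper for illustration only; not part of eZXMLInputParser.
// Returns the position of the '>' that closes the opening tag starting at
// $tagBeginPos, ignoring any '>' found inside a quoted attribute value.
function findTagEnd($data, $tagBeginPos)
{
    $quote = null; // quote character currently open, or null when outside quotes
    $length = strlen($data);

    for ($i = $tagBeginPos + 1; $i < $length; $i++) {
        $char = $data[$i];

        if ($quote !== null) {
            // Inside an attribute value: only the matching quote ends it.
            if ($char === $quote) {
                $quote = null;
            }
            continue;
        }

        if ($char === '"' || $char === "'") {
            $quote = $char;
            continue;
        }

        if ($char === '>') {
            return $i;
        }
    }

    return false; // no closing '>' found
}

$html = '<a title="1 > 0" href="#">link</a>';
$end = findTagEnd($html, 0);
$openingTag = substr($html, 0, $end + 1);
```

Here the first '>' sits inside the title attribute, so a plain strpos() search would cut the tag short; the sketch skips it and stops at the '>' that really ends the opening tag.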

Method Details

callInputHandler() public method

public callInputHandler ( $handlerName, $tagName, &$attributes )

callOutputHandler() public method

public callOutputHandler ( $handlerName, $element, &$params )

convertNumericEntities() public method

public convertNumericEntities ( $text )

createAndPublishElement() public method

Creates a new element for further processing (the 'structure' and 'publish' handlers are called for it).
public createAndPublishElement ( $elementName, &$ret )

createRootNode() public method

public createRootNode ( )

eZXMLInputParser() public method

The constructor.
$validateErrorLevel - determines the types of errors that break input processing. Any error types can be combined by creating a bitmask of EZ_XMLINPUTPARSER_ERROR_* constants. A value of true means that all errors defined by the $detectErrorLevel parameter will break further processing.
$detectErrorLevel - determines the types of errors that will be detected and added to the error log ($Messages).
public eZXMLInputParser ( $validateErrorLevel = self::ERROR_NONE, $detectErrorLevel = self::ERROR_NONE, $parseLineBreaks = false, $removeDefaultAttrs = false )
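
The relationship between the two error-level parameters can be sketched with plain bitmask arithmetic. Everything in the block below is an assumption made for illustration: the ERR_* constants and their values are stand-ins for the parser's error-type constants, and the three helper functions are hypothetical, not methods of the class.

```php
<?php
// Hypothetical stand-ins for the parser's error-type constants.
// The names and numeric values are illustrative assumptions.
const ERR_NONE = 0;
const ERR_SYNTAX = 4;
const ERR_SCHEMA = 8;
const ERR_DATA = 16;

// Per the constructor description: true means "break on every detected error type".
function normalizeValidateLevel($validateErrorLevel, $detectErrorLevel)
{
    return $validateErrorLevel === true ? $detectErrorLevel : (int) $validateErrorLevel;
}

// An error type is logged when its bit is set in the detect mask.
function isDetected($type, $detectErrorLevel)
{
    return ($type & $detectErrorLevel) !== 0;
}

// An error type stops processing when its bit is set in the validate mask.
function breaksProcessing($type, $validateErrorLevel)
{
    return ($type & $validateErrorLevel) !== 0;
}

// Detect all three error types, but stop processing only on schema errors.
$detect = ERR_SYNTAX | ERR_SCHEMA | ERR_DATA;
$validate = normalizeValidateLevel(ERR_SCHEMA, $detect);

$syntaxLogged = isDetected(ERR_SYNTAX, $detect);          // logged
$syntaxBreaks = breaksProcessing(ERR_SYNTAX, $validate);  // does not stop processing
$schemaBreaks = breaksProcessing(ERR_SCHEMA, $validate);  // stops processing
```

With the real class, the equivalent call would pass such masks as the first two constructor arguments; which constants are available depends on the eZ Publish version in use.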

entitiesDecode() public method

public entitiesDecode ( $text )

fixSubtree() public method

Remove only nodes that don't match schema (recursively)
public fixSubtree ( $element, $mainChild )

getMessages() public method

public getMessages ( )

handleError() public method

public handleError ( $type, $message )

isValid() public method

public isValid ( )

parseAttributes() public method

public parseAttributes ( $attributeString )

parseTag() public method

The main recursive function for pass 1
public parseTag ( &$data, &$pos, &$parent )

performPass1() public method

Pass 1: parses the source HTML string.
public performPass1 ( &$data )

performPass2() public method

Pass 2: processes the tree, runs handlers, rebuilds and validates.
public performPass2 ( )

process() public method

Call this function to process your input.
public process ( $text, $createRootNode = true )

processAttributesBySchema() public method

public processAttributesBySchema ( $element )

processBySchemaPresence() public method

Check if the element is allowed to exist in this document and remove it if not.
public processBySchemaPresence ( $element )

processBySchemaTree() public method

Check that element has a correct position in the tree and fix it if not.
public processBySchemaTree ( $element )

processNewElements() public method

public processNewElements ( $createdElements )

processSubtree() public method

The main recursive function for pass 2
public processSubtree ( $element, &$lastHandlerResult )

removeAllAttributes() public method

Removes all attribute nodes from element node $element
public removeAllAttributes ( DOMElement $element )
$element DOMElement

setAttributes() public method

public setAttributes ( $element, $attributes )

setDOMDocumentClass() public method

public setDOMDocumentClass ( $DOMDocumentClass )

setParseLineBreaks() public method

public setParseLineBreaks ( $value )

setRemoveDefaultAttrs() public method

public setRemoveDefaultAttrs ( $value )

washText() public method

public washText ( $textContent )

wordMatchSupport() protected method

Returns the modified attributes parameter
protected wordMatchSupport ( $newTagName, $attributes, $attributeString )

Property Details

$AllowMultipleSpaces public property

public $AllowMultipleSpaces

$AllowNumericEntities public property

public $AllowNumericEntities

$DOMDocumentClass public property

public $DOMDocumentClass

$DetectErrorLevel public property

public $DetectErrorLevel

$Document public property

public $Document

$InputTags public property

Each array element describes a tag that comes from the input. The array index is the tag's name. Each element is an array that may contain the following members:
'name' - a string giving the new name of the tag.
'nameHandler' - the name of a function that returns the new tag name. Function format: function tagNameHandler( $tagName, &$attributes )
'noChildren' - a boolean value that determines whether this tag can have child tags; the default value is false.
If neither 'name' nor 'nameHandler' is defined, the tag's original name is used.
Example:
public $InputTags = array(
    'original-name' => array( 'name' => 'new-name' ),
    'original-name2' => array( 'nameHandler' => 'tagNameHandler', 'noChildren' => true ),
    ...
);
public $InputTags
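
How a rule in this array might translate an input tag name can be shown with a minimal standalone sketch. resolveTagName() and headingNameHandler() are hypothetical names introduced here, and the fallback for tags with no rule at all is an assumption; the parser's own lookup may differ in detail.

```php
<?php
// Hypothetical helper for illustration only; not part of eZXMLInputParser.
// Applies an $InputTags-style rule to an input tag name.
function resolveTagName(array $inputTags, $tagName, array &$attributes)
{
    if (!isset($inputTags[$tagName])) {
        return $tagName; // assumption: tags without a rule keep their name
    }

    $rule = $inputTags[$tagName];

    if (isset($rule['name'])) {
        return $rule['name']; // fixed rename
    }

    if (isset($rule['nameHandler'])) {
        // The handler receives the attributes by reference and may change them.
        $handler = $rule['nameHandler'];
        return $handler($tagName, $attributes);
    }

    return $tagName; // neither member defined: original name is used
}

// Hypothetical handler following the documented format
// function tagNameHandler( $tagName, &$attributes ).
function headingNameHandler($tagName, &$attributes)
{
    $attributes['level'] = substr($tagName, 1); // 'h2' gives level '2'
    return 'header';
}

$inputTags = array(
    'b'  => array('name' => 'strong'),
    'h2' => array('nameHandler' => 'headingNameHandler'),
    'br' => array('noChildren' => true),
);

$attributes = array();
$bold = resolveTagName($inputTags, 'b', $attributes);
$heading = resolveTagName($inputTags, 'h2', $attributes);
$lineBreak = resolveTagName($inputTags, 'br', $attributes);
```

In this sketch the handler is a global function for simplicity; in a parser subclass the handler name would more plausibly refer to a method of that subclass.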

$IsInputValid public property

public $IsInputValid

$Messages public property

public $Messages

$Namespaces public property

public $Namespaces

$OutputTags public property

Each array element describes a tag present in the output. The array index is the tag's name. Each element is an array that may contain the following members:
'parsingHandler' - "Parsing handler", called at pass 1 before processing the tag's children.
'initHandler' - "Init handler", called at pass 2 before processing the tag's children.
'structHandler' - "Structure handler", called at pass 2 after processing the tag's children but before the schema validity check. It can be used to implement structure transformations.
'publishHandler' - "Publish handler", called at pass 2 after the schema validity check, so it runs when the element has its guaranteed place in the DOM tree.
'attributes' - an array that describes attribute transformations. The array index is the original name of an attribute, and the value is the new name.
'requiredInputAttributes' - attributes that are required in the input tag. If any of them is missing, the invalid-input flag is raised.
Example:
public $OutputTags = array(
    'custom' => array(
        'parsingHandler' => 'parsingHandlerCustom',
        'initHandler' => 'initHandlerCustom',
        'structHandler' => 'structHandlerCustom',
        'publishHandler' => 'publishHandlerCustom',
        'attributes' => array( 'title' => 'name' ) ),
    ...
);
public $OutputTags
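
The 'attributes' and 'requiredInputAttributes' members can be illustrated with a short standalone sketch. mapInputAttributes() is a hypothetical helper, not the parser's code, and keeping unmapped attributes under their original names is an assumption made for the example.

```php
<?php
// Hypothetical helper for illustration only; not part of eZXMLInputParser.
// Applies the 'attributes' and 'requiredInputAttributes' members of one
// $OutputTags-style rule to the attributes found on an input tag.
function mapInputAttributes(array $rule, array $attributes, &$isInputValid)
{
    // A missing required attribute raises the invalid-input flag.
    if (isset($rule['requiredInputAttributes'])) {
        foreach ($rule['requiredInputAttributes'] as $required) {
            if (!isset($attributes[$required])) {
                $isInputValid = false;
            }
        }
    }

    // Rename attributes: index = original name, value = new name.
    $map = isset($rule['attributes']) ? $rule['attributes'] : array();
    $mapped = array();
    foreach ($attributes as $name => $value) {
        // Assumption: attributes without a mapping keep their original name.
        $newName = isset($map[$name]) ? $map[$name] : $name;
        $mapped[$newName] = $value;
    }

    return $mapped;
}

$outputTags = array(
    'custom' => array(
        'attributes' => array('title' => 'name'),
        'requiredInputAttributes' => array('title'),
    ),
);

// Input carries the required 'title', which is renamed to 'name'.
$isInputValid = true;
$mapped = mapInputAttributes(
    $outputTags['custom'],
    array('title' => 'factbox', 'class' => 'wide'),
    $isInputValid
);

// Input lacks 'title', so the flag is cleared.
$isSecondValid = true;
$second = mapInputAttributes($outputTags['custom'], array('class' => 'wide'), $isSecondValid);
```

The flag is passed by reference here to mirror the idea of a parser-wide validity state such as $IsInputValid; how the class itself records the flag is not shown in this reference.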

$ParentStack public property

public $ParentStack

$ParseLineBreaks public property

Options that depend on the parameters passed
public $ParseLineBreaks

$QuitProcess public property

public $QuitProcess

$RemoveDefaultAttrs public property

public $RemoveDefaultAttrs

$StrictHeaders public property

public $StrictHeaders

$TrimSpaces public property

Options that depend on settings
public $TrimSpaces

$ValidateErrorLevel public property

public $ValidateErrorLevel

$XMLSchema public property

public $XMLSchema

$eZPublishVersion public property

public $eZPublishVersion