PHP Class Html2Text\Html2Text

Project: soundasleep/html2text

Public Methods

Method and description

convert ( string $html ) : string
    Tries to convert the given HTML into a plain text format, best suited for e-mail display and similar uses.
fixMSEncoding ( DOMDocument $doc ) : DOMDocument
    Microsoft Exchange emails often include HTML which, when passed through html2text, results in lots of double line returns.
fixNewlines ( string $text ) : string
    Unifies newlines: \r\n becomes \n, and then \r becomes \n, so every newline style (Unix, Windows, classic Mac) ends up as \n.
isOfficeDocument ( $html )
    Can we guess that this HTML is generated by Microsoft Office?
iterateOverNode ( $node )
nextChildName ( $node )
prevChildName ( $node )

Method Details

convert() static public method

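Tries to convert the given HTML into a plain text format, best suited for e-mail display and similar uses. In particular, it tries to maintain the following features: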

  • Links are maintained, with the 'href' copied over
  • Information in the <head> is lost
static public convert ( string $html ) : string
$html string the input HTML
Returns string the HTML converted, as best as possible, to text
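
A minimal usage sketch for the signature above. It assumes the package is installed with Composer and that vendor/autoload.php sits next to the script; both the path and the sample HTML are illustrative. No expected output is shown, since the exact text formatting depends on the library version.

```php
<?php
// Assumption: the library was installed via Composer and the autoloader
// lives at this path. Adjust the path for your own project layout.
$autoload = __DIR__ . '/vendor/autoload.php';
if (is_file($autoload)) {
    require $autoload;
}

// Illustrative input: a paragraph containing a link.
$html = '<p>Hello, <a href="https://example.com/">world</a>!</p>';

if (class_exists('Html2Text\Html2Text')) {
    // convert() is documented above as a static method taking an HTML string.
    $text = \Html2Text\Html2Text::convert($html);
    echo $text, "\n";
} else {
    echo "Html2Text is not installed; run: composer require soundasleep/html2text\n";
}
```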

fixMSEncoding() static public method

Microsoft Exchange emails often include HTML which, when passed through html2text, results in lots of double line returns. To fix this, any element with a class name of msoNormal (the standard class name in any Microsoft export or Outlook for a paragraph that behaves like a line return) is changed to a line followed by a break. The cleaned-up document can then be processed as normal through Html2Text.
static public fixMSEncoding ( DOMDocument $doc ) : DOMDocument
$doc DOMDocument the document to clean up
Returns DOMDocument the modified document with fewer unnecessary paragraphs
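
The transformation described above can be sketched roughly as follows. This is not the library's code: sketchFixMsoNormal is a hypothetical helper, and the case-insensitive class match and the choice to unwrap the element's children are assumptions made for illustration.

```php
<?php
// Hypothetical helper illustrating the idea; not the library's implementation.
function sketchFixMsoNormal(DOMDocument $doc): DOMDocument
{
    $xpath = new DOMXPath($doc);
    // Assumption: match the class case-insensitively ("msoNormal" / "MsoNormal").
    $query = "//*[translate(@class, 'MSONRAL', 'msonral') = 'msonormal']";
    // Copy the matches into an array before changing the tree.
    $elements = iterator_to_array($xpath->query($query));

    foreach ($elements as $element) {
        $parent = $element->parentNode;
        // Move the element's children up so they sit directly in the parent.
        while ($element->firstChild !== null) {
            $parent->insertBefore($element->firstChild, $element);
        }
        // Replace the now-empty element with a single line break.
        $parent->replaceChild($doc->createElement('br'), $element);
    }

    return $doc;
}

$doc = new DOMDocument();
$doc->loadHTML(
    '<html><body><p class="MsoNormal">One</p><p class="MsoNormal">Two</p></body></html>'
);
$fixed = sketchFixMsoNormal($doc);
echo $fixed->saveHTML();
```

Each former paragraph becomes its text followed by one break, so a later text conversion sees single line returns rather than paragraph gaps.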

fixNewlines() static public method

Unifies newlines: \r\n becomes \n, and then \r becomes \n, so every newline style (Unix, Windows, classic Mac) ends up as \n.
static public fixNewlines ( string $text ) : string
$text string text with any number of \r, \r\n and \n combinations
Returns string the fixed text
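
The two-step replacement described above can be sketched in a few lines. The function name sketchFixNewlines is hypothetical; it mirrors the documented behaviour and is not copied from the library.

```php
<?php
// Hypothetical stand-in that follows the documented two-step rule.
function sketchFixNewlines(string $text): string
{
    // Windows sequences first, so "\r\n" does not turn into two newlines.
    $text = str_replace("\r\n", "\n", $text);
    // Then any remaining lone carriage returns (classic Mac).
    return str_replace("\r", "\n", $text);
}

$mixed = "unix\nwindows\r\nmac\rend";
var_dump(sketchFixNewlines($mixed) === "unix\nwindows\nmac\nend"); // bool(true)
```

The order matters: replacing \r first would turn each \r\n into \n\n and double the Windows line breaks.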

isOfficeDocument() static public method

Can we guess that this HTML is generated by Microsoft Office?
static public isOfficeDocument ( $html )
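
One plausible way to make such a guess is to look for the XML namespace that Microsoft Office writes into exported HTML. The helper below, sketchLooksLikeOffice, is hypothetical; the documentation here does not say which check the library actually performs.

```php
<?php
// Hypothetical heuristic; the library's real check is not documented here.
function sketchLooksLikeOffice(string $html): bool
{
    // Assumption: Office exports declare a namespace such as
    // xmlns:o="urn:schemas-microsoft-com:office:office" on the <html> tag.
    return strpos($html, 'urn:schemas-microsoft-com:office') !== false;
}

$officeHtml = '<html xmlns:o="urn:schemas-microsoft-com:office:office"><body></body></html>';
$plainHtml  = '<html><body><p>Hello</p></body></html>';

var_dump(sketchLooksLikeOffice($officeHtml)); // bool(true)
var_dump(sketchLooksLikeOffice($plainHtml));  // bool(false)
```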

iterateOverNode() static public method

static public iterateOverNode ( $node )

nextChildName() static public method

static public nextChildName ( $node )

prevChildName() static public method

static public prevChildName ( $node )