PHP Class PressBooks\Modules\Import\Html\Xhtml

Inheritance: extends PressBooks\Modules\Import\Import
Open project: pressbooks/pressbooks

Public Methods

Method Description
import ( array $current_import ) : boolean
kneadHtml ( string $html, string $type, string $domain ) : string Pummel the HTML into WordPress compatible dough.
kneadandInsert ( $html, string $post_type, integer $chapter_parent, string $domain ) Pummel then insert HTML into our database
setCurrentImportOption ( array $upload ) : boolean

Protected Methods

Method Description
extractCCLicense ( string $url ) : string Expects a URL string with Creative Commons domain similar in form to: http://creativecommons.org/licenses/by-sa/4.0/
fetchAndSaveUniqueImage ( string $url ) : string Extract url and load into WP using media_handle_sideload(). Will return an empty string if something went wrong.
getAuthors ( string $html ) : array Looks for meta data in the <head> section of an HTML document.
getLicenseAttribution ( string $html ) : array Looks for div class created by the license module in PB, returns author and license information.
regexSearchReplace ( string $html ) : string Cherry pick likely content areas, then cull known, unwanted content areas
scrapeAndKneadImages ( DOMDocument $doc, string $domain ) : DOMDocument Parse HTML snippet, save all found <img> tags using media_handle_sideload(), return the HTML with changed <img> paths.
scrapeAndKneadMeta ( DOMDocument $doc ) : array Extracts section/book author and section/book license if they exist.
tidy ( string $html ) : string Enforce compliance with XHTML standards and remove cruft generated by word processors

Method Details

extractCCLicense() protected method

Expects a URL string with Creative Commons domain similar in form to: http://creativecommons.org/licenses/by-sa/4.0/
protected extractCCLicense ( string $url ) : string
$url string
Returns string license meta value
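
As a rough illustration of the URL form described above, the following is a minimal sketch and not the Pressbooks implementation. The function name sketch_extract_cc_license and the "cc-" prefixed return format are assumptions made for this example; the real method's return values are not documented on this page.

```php
<?php
/**
 * Hypothetical helper (not part of Pressbooks): derives a license slug
 * from a Creative Commons license URL. The "cc-" prefix is an assumption.
 */
function sketch_extract_cc_license(string $url): string
{
    $host = parse_url($url, PHP_URL_HOST);
    $path = parse_url($url, PHP_URL_PATH);

    // Only accept URLs on creativecommons.org or one of its subdomains.
    if (!is_string($host) || !is_string($path)
        || !preg_match('/(^|\.)creativecommons\.org$/i', $host)) {
        return '';
    }

    // Paths look like /licenses/by-sa/4.0/ ; capture the license code.
    if (preg_match('#^/licenses/([a-z-]+)/#i', $path, $m)) {
        return 'cc-' . strtolower($m[1]);
    }

    return '';
}

// Usage: prints the slug for the example URL from the description.
echo sketch_extract_cc_license('http://creativecommons.org/licenses/by-sa/4.0/'), "\n";
```

Returning an empty string for unrecognized input mirrors the convention used by fetchAndSaveUniqueImage() in this class, but that choice is also an assumption here.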

fetchAndSaveUniqueImage() protected method

Extract url and load into WP using media_handle_sideload(). Will return an empty string if something went wrong.
See also: media_handle_sideload
protected fetchAndSaveUniqueImage ( string $url ) : string
$url string
Returns string $src

getAuthors() protected method

Priority is given to PB generated meta data.
protected getAuthors ( string $html ) : array
$html string
Returns array $authors
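
The idea of preferring one kind of meta data over another can be sketched as follows. This is an illustrative example only, not the Pressbooks code: the function name sketch_get_authors is hypothetical, and the meta name 'pb-authors' is an assumed stand-in for whatever names PB actually generates.

```php
<?php
/**
 * Hypothetical helper (not part of Pressbooks): reads author names from
 * <meta> elements in the document head, trying $names in priority order.
 * The default name 'pb-authors' is an assumption for illustration.
 */
function sketch_get_authors(string $html, array $names = ['pb-authors', 'author']): array
{
    if (trim($html) === '') {
        return []; // DOMDocument::loadHTML() rejects empty input.
    }

    $doc = new DOMDocument();
    // Collect parser warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $head = $doc->getElementsByTagName('head')->item(0);
    if ($head === null) {
        return [];
    }

    // Group non-empty content values by (lowercased) meta name.
    $found = [];
    foreach ($head->getElementsByTagName('meta') as $meta) {
        $name = strtolower($meta->getAttribute('name'));
        $content = trim($meta->getAttribute('content'));
        if ($content !== '' && in_array($name, $names, true)) {
            $found[$name][] = $content;
        }
    }

    // Return the values for the highest-priority name that is present.
    foreach ($names as $name) {
        if (!empty($found[$name])) {
            return $found[$name];
        }
    }

    return [];
}
```

Passing the priority list as a parameter keeps the assumed names out of the logic, so they can be swapped for the real ones without touching the function body.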

getLicenseAttribution() protected method

Looks for div class created by the license module in PB, returns author and license information.
protected getLicenseAttribution ( string $html ) : array
$html string
Returns array $meta

import() public method

public import ( array $current_import ) : boolean
$current_import array
Returns boolean

kneadHtml() public method

Pummel the HTML into WordPress compatible dough.
public kneadHtml ( string $html, string $type, string $domain ) : string
$html string
$type string front-matter, part, chapter, back-matter, ...
$domain string domain name of the webpage
Returns string

kneadandInsert() public method

Pummel then insert HTML into our database
public kneadandInsert ( $html, string $post_type, integer $chapter_parent, string $domain )
$post_type string
$chapter_parent integer
$domain string domain name of the webpage

regexSearchReplace() protected method

Cherry pick likely content areas, then cull known, unwanted content areas
protected regexSearchReplace ( string $html ) : string
$html string
Returns string $html

scrapeAndKneadImages() protected method

Parse HTML snippet, save all found <img> tags using media_handle_sideload(), return the HTML with changed <img> paths.
protected scrapeAndKneadImages ( DOMDocument $doc, string $domain ) : DOMDocument
$doc DOMDocument
$domain string domain name of the webpage
Returns DOMDocument
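
The path-rewriting step described above can be sketched without WordPress as follows. This is a simplified illustration under stated assumptions, not the Pressbooks code: sketch_rewrite_images is a hypothetical name, and the $fetch callable stands in for the real download step (which this page says uses media_handle_sideload()). Resolving relative URLs against $domain is left out.

```php
<?php
/**
 * Hypothetical helper (not part of Pressbooks): rewrites the src of every
 * <img> in $doc using $fetch, a callable that takes the old URL and
 * returns the new one, or '' on failure.
 */
function sketch_rewrite_images(DOMDocument $doc, callable $fetch): DOMDocument
{
    // Copy the live node list so later DOM changes cannot disturb iteration.
    $images = iterator_to_array($doc->getElementsByTagName('img'));

    foreach ($images as $img) {
        $old = $img->getAttribute('src');
        $new = ($old === '') ? '' : $fetch($old);

        // Assumption: when the fetch fails, the original src is left as-is.
        if ($new !== '') {
            $img->setAttribute('src', $new);
        }
    }

    return $doc;
}

// Usage with a fake fetcher that only "succeeds" for absolute http(s) URLs.
$doc = new DOMDocument();
$doc->loadHTML('<p><img src="http://example.com/a.png"><img src="b.png"></p>');

$fake = function (string $url): string {
    return preg_match('#^https?://#', $url)
        ? 'https://local.test/uploads/' . basename($url)
        : '';
};

sketch_rewrite_images($doc, $fake);
```

Injecting the fetcher as a callable keeps the DOM traversal testable on its own; in a WordPress context the callable would wrap the actual sideload call.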

scrapeAndKneadMeta() protected method

Focus is given to Creative Commons license information generated by PB.
protected scrapeAndKneadMeta ( DOMDocument $doc ) : array
$doc DOMDocument
Returns array $meta

setCurrentImportOption() public method

public setCurrentImportOption ( array $upload ) : boolean
$upload array
Returns boolean

tidy() protected method

Enforce compliance with XHTML standards and remove cruft generated by word processors
protected tidy ( string $html ) : string
$html string
Returns string $html