Method |
Description |
|
cleanUTF8 ( string $str, boolean $force_php = false ) : string |
Cleans a UTF-8 string for well-formedness and SGML validity. |
|
convertFromUTF8 ( string $str, HTMLPurifier_Config $config, HTMLPurifier_Context $context ) : string |
Converts a string from UTF-8 based on configuration. |
|
convertToASCIIDumbLossless ( string $str ) : string |
Lossless (character-wise) conversion of HTML to ASCII. |
|
convertToUTF8 ( string $str, HTMLPurifier_Config $config, HTMLPurifier_Context $context ) : string |
Converts a string to UTF-8 based on configuration. |
|
iconv ( string $in, string $out, string $text, integer $max_chunk_size = 8000 ) : string |
iconv wrapper which mutes errors and works around bugs. |
|
iconvAvailable ( ) : boolean |
|
|
muteErrorHandler ( ) |
Error-handler that mutes errors, alternative to shut-up operator. |
|
testEncodingSupportsASCII ( string $encoding, boolean $bypass = false ) : Array |
This expensive function tests whether or not a given character
encoding supports ASCII. 7/8-bit encodings like Shift_JIS will
fail this test, and require special processing. Variable width
encodings shouldn't ever fail. |
|
testIconvTruncateBug ( ) : integer |
glibc iconv has a known bug where it doesn't handle the magic
//IGNORE stanza correctly. In particular, rather than ignore
characters, it will return an EILSEQ after consuming some number
of characters, and expect you to restart iconv as if it were
an E2BIG. Old versions of PHP did not respect the errno, and
returned the fragment, so as a result you would see iconv
mysteriously truncating output. We can work around this by
manually chopping our input into segments of about 8000
characters, as long as PHP ignores the error code. If PHP starts
paying attention to the error code, iconv becomes unusable. |
|
unichr ( $code ) |
Translates a Unicode code point into its corresponding UTF-8 character. |
|
unsafeIconv ( string $in, string $out, string $text ) : string |
iconv wrapper which mutes errors, but doesn't work around bugs. |
|
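
The chunking workaround described under testIconvTruncateBug can be sketched roughly as follows. This is a minimal illustration, not the library's actual implementation: the names splitUtf8Chunks and chunkedIconv are hypothetical, the input is assumed to be UTF-8, and the chunk size is counted in bytes. The idea is to cut the input into pieces no larger than the chunk size, moving each cut back so that it never lands inside a multibyte character, and then to convert each piece separately.

```php
<?php
// Hypothetical helper (not part of HTML Purifier): split a UTF-8 string into
// chunks of at most $maxChunkSize bytes without cutting a multibyte character.
function splitUtf8Chunks(string $text, int $maxChunkSize = 8000): array
{
    if ($maxChunkSize < 4) {
        // A UTF-8 character is at most 4 bytes long.
        throw new InvalidArgumentException('maxChunkSize must be at least 4');
    }
    $chunks = [];
    $length = strlen($text);
    $offset = 0;
    while ($offset < $length) {
        $end = min($offset + $maxChunkSize, $length);
        // If the next chunk would start on a continuation byte (10xxxxxx),
        // move the cut back until it sits on a character boundary.
        while ($end < $length && $end > $offset
            && (ord($text[$end]) & 0xC0) === 0x80) {
            $end--;
        }
        if ($end === $offset) {
            // Malformed input (a long run of continuation bytes): cut hard.
            $end = min($offset + $maxChunkSize, $length);
        }
        $chunks[] = substr($text, $offset, $end - $offset);
        $offset = $end;
    }
    return $chunks;
}

// Hypothetical wrapper: convert UTF-8 text chunk by chunk with PHP's iconv().
// Assumes $in is 'UTF-8'; other input encodings need different boundary logic.
function chunkedIconv(string $in, string $out, string $text, int $maxChunkSize = 8000): string
{
    $result = '';
    foreach (splitUtf8Chunks($text, $maxChunkSize) as $chunk) {
        // The @ operator mutes the notice iconv() raises on illegal input.
        $converted = @iconv($in, $out, $chunk);
        if ($converted === false) {
            throw new RuntimeException('iconv failed on a chunk');
        }
        $result .= $converted;
    }
    return $result;
}
```

A call such as chunkedIconv('UTF-8', 'ISO-8859-1//IGNORE', $text) would then convert each chunk independently. If PHP treats the glibc error as a failure, this sketch throws instead of returning truncated output, which matches the caveat in the description that the workaround only helps while PHP ignores the error code.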