PHP Class HTMLPurifier_Encoder, yii

Project: yiisoft/yii

Public Methods

Method Description
cleanUTF8 ( string $str, boolean $force_php = false ) : string Cleans a UTF-8 string for well-formedness and SGML validity
convertFromUTF8 ( string $str, HTMLPurifier_Config $config, HTMLPurifier_Context $context ) : string Converts a string from UTF-8 based on configuration.
convertToASCIIDumbLossless ( string $str ) : string Lossless (character-wise) conversion of HTML to ASCII
convertToUTF8 ( string $str, HTMLPurifier_Config $config, HTMLPurifier_Context $context ) : string Convert a string to UTF-8 based on configuration.
iconv ( string $in, string $out, string $text, integer $max_chunk_size = 8000 ) : string iconv wrapper which mutes errors and works around bugs.
iconvAvailable ( ) : boolean Reports whether iconv is available for use.
muteErrorHandler ( ) Error handler that mutes errors; an alternative to the shut-up operator (@).
testEncodingSupportsASCII ( string $encoding, boolean $bypass = false ) : Array This expensive function tests whether or not a given character encoding supports ASCII. 7/8-bit encodings like Shift_JIS will fail this test, and require special processing. Variable width encodings shouldn't ever fail.
testIconvTruncateBug ( ) : integer glibc iconv has a known bug where it does not handle the magic //IGNORE stanza correctly. Rather than ignoring characters, it returns an EILSEQ after consuming some number of characters and expects you to restart iconv as if it were an E2BIG. Old versions of PHP did not respect the errno and returned the fragment, so iconv appeared to truncate output mysteriously. The workaround is to chop the input manually into segments of about 8000 characters, which works as long as PHP ignores the error code. If PHP starts paying attention to the error code, iconv becomes unusable.
unichr ( $code ) Translates a Unicode codepoint into its corresponding UTF-8 character.
unsafeIconv ( string $in, string $out, string $text ) : string iconv wrapper which mutes errors, but doesn't work around bugs.

Private Methods

Method Description
__construct ( ) Constructor throws fatal error if you attempt to instantiate class

Method Details

cleanUTF8() public static method

Parses the input as UTF-8 and returns a valid UTF-8 string with non-SGML codepoints excluded.
public static cleanUTF8 ( string $str, boolean $force_php = false ) : string
$str string The string to clean
$force_php boolean
return string
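
To illustrate what "well-formed and SGML-valid" can mean in practice, here is a minimal sketch that tests a string against a whitelist of codepoint ranges using PCRE in UTF-8 mode. The function name isCleanUtf8Sketch and the exact ranges are assumptions made for this example and are not documented on this page. Unlike cleanUTF8(), the sketch only reports whether a string is clean; it does not repair it.

```php
<?php
/**
 * Hypothetical helper (not part of HTMLPurifier_Encoder): returns true when
 * $str is well-formed UTF-8 and contains only codepoints from an assumed
 * SGML-safe whitelist (tab, LF, CR, printable ASCII, and non-ASCII
 * characters outside the C1 controls, surrogates, U+FFFE and U+FFFF).
 */
function isCleanUtf8Sketch(string $str): bool
{
    // With the /u modifier, preg_match() returns false (not 0) when the
    // subject is not valid UTF-8, so malformed input also fails the check.
    $pattern = '/^[\x{9}\x{A}\x{D}\x{20}-\x{7E}\x{A0}-\x{D7FF}'
             . '\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]*$/Du';

    return preg_match($pattern, $str) === 1;
}
```

A caller could use such a check as a fast path and fall back to a slower byte-by-byte repair only when it returns false.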

convertFromUTF8() public static method

Converts a string from UTF-8 based on configuration.
public static convertFromUTF8 ( string $str, HTMLPurifier_Config $config, HTMLPurifier_Context $context ) : string
$str string The string to convert
$config HTMLPurifier_Config
$context HTMLPurifier_Context
return string

convertToASCIIDumbLossless() public static method

Lossless (character-wise) conversion of HTML to ASCII
public static convertToASCIIDumbLossless ( string $str ) : string
$str string UTF-8 string to be converted to ASCII
return string ASCII-encoded string with non-ASCII characters converted to numeric entities
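
The character-wise conversion described above can be sketched as follows: every non-ASCII character in a UTF-8 string is replaced by a decimal numeric character reference, and ASCII passes through unchanged. The function names entityizeNonAsciiSketch and utf8CharToCodePoint are hypothetical, and the sketch assumes valid UTF-8 input (it throws otherwise), which may differ from how the real method treats malformed bytes.

```php
<?php
/**
 * Hypothetical helper: decodes one valid UTF-8 character (1 to 4 bytes)
 * into its Unicode codepoint.
 */
function utf8CharToCodePoint(string $ch): int
{
    $b = array_values(unpack('C*', $ch));
    switch (count($b)) {
        case 1:
            return $b[0];
        case 2:
            return (($b[0] & 0x1F) << 6) | ($b[1] & 0x3F);
        case 3:
            return (($b[0] & 0x0F) << 12) | (($b[1] & 0x3F) << 6) | ($b[2] & 0x3F);
        default:
            return (($b[0] & 0x07) << 18) | (($b[1] & 0x3F) << 12)
                 | (($b[2] & 0x3F) << 6) | ($b[3] & 0x3F);
    }
}

/**
 * Hypothetical helper: replaces each non-ASCII character with "&#NNN;".
 */
function entityizeNonAsciiSketch(string $utf8): string
{
    $result = preg_replace_callback(
        '/[^\x00-\x7F]/u',
        static function (array $m): string {
            return '&#' . utf8CharToCodePoint($m[0]) . ';';
        },
        $utf8
    );
    if ($result === null) {
        // preg_replace_callback() yields null when the subject is not valid UTF-8.
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    return $result;
}
```

Because every replaced character remains recoverable from its numeric reference, the result is lossless when rendered as HTML.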

convertToUTF8() public static method

Convert a string to UTF-8 based on configuration.
public static convertToUTF8 ( string $str, HTMLPurifier_Config $config, HTMLPurifier_Context $context ) : string
$str string The string to convert
$config HTMLPurifier_Config
$context HTMLPurifier_Context
return string

iconv() public static method

iconv wrapper which mutes errors and works around bugs.
public static iconv ( string $in, string $out, string $text, integer $max_chunk_size = 8000 ) : string
$in string Input encoding
$out string Output encoding
$text string The text to convert
$max_chunk_size integer
return string

iconvAvailable() public static method

Reports whether iconv is available for use.
public static iconvAvailable ( ) : boolean
return boolean

muteErrorHandler() public static method

Error handler that mutes errors; an alternative to the shut-up operator (@).
public static muteErrorHandler ( )
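
The general pattern behind a muting error handler can be sketched with PHP's set_error_handler() and restore_error_handler(): install a handler that does nothing, run the noisy call, then restore the previous handler. The wrapper name callWithMutedErrors is hypothetical and is only meant to show the technique, not how this class wires it up internally.

```php
<?php
/**
 * Hypothetical wrapper: runs $fn with PHP warnings and notices swallowed,
 * then restores whatever error handler was active before.
 */
function callWithMutedErrors(callable $fn, ...$args)
{
    // A handler that returns true tells PHP the error has been handled,
    // so nothing is printed or logged for it.
    set_error_handler(static function (): bool {
        return true;
    });

    try {
        return $fn(...$args);
    } finally {
        // Always undo the override, even if $fn throws.
        restore_error_handler();
    }
}

// Usage: the warning is suppressed and the return value still comes through.
$value = callWithMutedErrors(static function () {
    trigger_error('noisy warning', E_USER_WARNING);
    return 42;
});
```

Compared with the @ operator, this keeps the suppression scoped to one explicit call site and makes it easy to find in code review.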

testEncodingSupportsASCII() public static method

This expensive function tests whether or not a given character encoding supports ASCII. 7/8-bit encodings like Shift_JIS will fail this test, and require special processing. Variable width encodings shouldn't ever fail.
public static testEncodingSupportsASCII ( string $encoding, boolean $bypass = false ) : Array
$encoding string Encoding name to test, as per iconv format
$bypass boolean Whether or not to bypass the precompiled arrays.
return Array Map of UTF-8 characters to their corresponding ASCII characters, which can be used to "undo" any overzealous iconv action.

testIconvTruncateBug() public static method

glibc iconv has a known bug where it does not handle the magic //IGNORE stanza correctly. Rather than ignoring characters, it returns an EILSEQ after consuming some number of characters and expects you to restart iconv as if it were an E2BIG. Old versions of PHP did not respect the errno and returned the fragment, so iconv appeared to truncate output mysteriously. The workaround is to chop the input manually into segments of about 8000 characters, which works as long as PHP ignores the error code. If PHP starts paying attention to the error code, iconv becomes unusable.
public static testIconvTruncateBug ( ) : integer
return integer Error code indicating severity of bug.
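
The chunking workaround described above depends on splitting UTF-8 input without cutting a multibyte character in half. The following is a minimal sketch of that splitting step only; the function name splitUtf8IntoChunks is hypothetical, and the minimum chunk size of 4 bytes is an assumption chosen because a UTF-8 character occupies at most 4 bytes.

```php
<?php
/**
 * Hypothetical helper: splits a UTF-8 string into pieces of at most
 * $maxChunkSize bytes, always cutting on a character boundary.
 */
function splitUtf8IntoChunks(string $text, int $maxChunkSize = 8000): array
{
    if ($maxChunkSize < 4) {
        throw new InvalidArgumentException('maxChunkSize must be at least 4 bytes');
    }

    $chunks = [];
    $len = strlen($text);
    $i = 0;
    while ($i < $len) {
        $end = min($i + $maxChunkSize, $len);
        // Continuation bytes look like 10xxxxxx. If the cut would land on
        // one, move it back to the lead byte of that character.
        while ($end < $len && $end > $i && (ord($text[$end]) & 0xC0) === 0x80) {
            $end--;
        }
        if ($end === $i) {
            // Only reachable with malformed input; cut hard to guarantee progress.
            $end = min($i + $maxChunkSize, $len);
        }
        $chunks[] = substr($text, $i, $end - $i);
        $i = $end;
    }
    return $chunks;
}
```

Each chunk could then be converted separately and the results concatenated, which is the shape of the workaround the description refers to.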

unichr() public static method

Translates a Unicode codepoint into its corresponding UTF-8 character.
public static unichr ( $code )
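
As an illustration of the codepoint-to-UTF-8 translation, here is a minimal sketch of the standard encoding rules for 1 to 4 byte sequences. The function name codePointToUtf8Sketch is hypothetical, and returning an empty string for surrogates and out-of-range values is a choice made for this example rather than documented behavior of unichr().

```php
<?php
/**
 * Hypothetical helper: encodes a Unicode codepoint as a UTF-8 byte string.
 * Returns '' for values that are not valid Unicode scalar values.
 */
function codePointToUtf8Sketch(int $code): string
{
    if ($code < 0 || $code > 0x10FFFF || ($code >= 0xD800 && $code <= 0xDFFF)) {
        return '';
    }
    if ($code < 0x80) {
        // 0xxxxxxx
        return chr($code);
    }
    if ($code < 0x800) {
        // 110xxxxx 10xxxxxx
        return chr(0xC0 | ($code >> 6))
             . chr(0x80 | ($code & 0x3F));
    }
    if ($code < 0x10000) {
        // 1110xxxx 10xxxxxx 10xxxxxx
        return chr(0xE0 | ($code >> 12))
             . chr(0x80 | (($code >> 6) & 0x3F))
             . chr(0x80 | ($code & 0x3F));
    }
    // 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return chr(0xF0 | ($code >> 18))
         . chr(0x80 | (($code >> 12) & 0x3F))
         . chr(0x80 | (($code >> 6) & 0x3F))
         . chr(0x80 | ($code & 0x3F));
}
```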

unsafeIconv() public static method

iconv wrapper which mutes errors, but doesn't work around bugs.
public static unsafeIconv ( string $in, string $out, string $text ) : string
$in string Input encoding
$out string Output encoding
$text string The text to convert
return string