PHP Class TextAnalysis\Models\Wordnet\Synset

Author: yooper

Inheritance: use trait TextAnalysis\Traits\WordnetPointerSymbolMap

Project: yooper/php-text-analysis

Protected Properties

Property	Type	Description
$definition		Each synset contains a gloss. A gloss is represented as a vertical bar (|), followed by a text string that continues until the end of the line. The gloss may contain a definition, one or more example sentences, or both.
$frames		In data.verb only, a list of numbers corresponding to the generic verb sentence frames for words in the synset. frames is of the form: f_cnt + f_num w_num [ + f_num w_num...], where f_cnt is a two-digit decimal integer indicating the number of generic frames listed, f_num is a two-digit decimal integer frame number, and w_num is a two-digit hexadecimal integer indicating the word in the synset that the frame applies to. As with pointers, if this number is 00, f_num applies to all words in the synset. If non-zero, it applies only to the word indicated. Word numbers are assigned as described for pointers. Each f_num w_num pair is preceded by a +.
$lexFilenum		Two-digit decimal integer corresponding to the lexicographer file name containing the synset.
$lexIds		One-digit hexadecimal integer that, when appended onto lemma, uniquely identifies a sense within a lexicographer file. lex_id numbers usually start with 0 and are incremented as additional senses of the word are added to the same file, although there is no requirement that the numbers be consecutive or begin with 0. Note that a value of 0 is the default, and therefore is not present in lexicographer files.
$linkedSynsets		A pointer from this synset to another. ptr is of the form: pointerSymbol synsetOffset pos source/target, where synsetOffset is the byte offset of the target synset in the data file corresponding to pos.
$pCnt		Three-digit decimal integer indicating the number of pointers from this synset to other synsets. If p_cnt is 000, the synset has no pointers.
$srcWordIdx	integer
$synsetOffset		Current byte offset in the file, represented as an 8-digit decimal integer.
$targetWordIdx	integer
$wCnt		Two-digit hexadecimal integer indicating the number of words in the synset.
$words		ASCII form of a word as entered in the synset by the lexicographer, with spaces replaced by underscore characters (_). The text of the word is case-sensitive, in contrast to its form in the corresponding index.pos file, which contains only lower-case forms. In data.adj, a word is followed by a syntactic marker if one was specified in the lexicographer file. A syntactic marker is appended, in parentheses, onto word without any intervening spaces.

Public Methods

Method	Description
__construct ( $synsetOffset, $pos )
addLinkedSynset ( Synset &$synset ) : Synset
addWord ( string $word, integer $lexId )
getDefinition ( ) : string
getLinkedSynsets ( ) : Synset[]
getSrcWordIdx ( ) : integer
getTargetWordIdx ( ) : integer
getWords ( ) : string[]
setDefinition ( string $definition )
setSrcWordIdx ( integer $wordIdx )
setTargetWordIdx ( integer $wordIdx )
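
The methods listed above suggest a simple build-then-read workflow. The following is a minimal sketch of that workflow using a hypothetical stand-in class, SketchSynset, which mirrors only the signatures documented on this page. It is not the library class: its internals, the offsets and words used, and the assumption that addLinkedSynset() returns the receiving object are all illustrative.

```php
<?php
// Hypothetical stand-in mirroring the documented signatures only. It is NOT
// TextAnalysis\Models\Wordnet\Synset, and its internals are assumptions.
class SketchSynset
{
    protected $synsetOffset;
    protected $pos;
    protected $words = [];
    protected $lexIds = [];
    protected $definition = '';
    protected $linkedSynsets = [];

    public function __construct($synsetOffset, $pos)
    {
        $this->synsetOffset = $synsetOffset;
        $this->pos = $pos;
    }

    public function addWord(string $word, int $lexId): void
    {
        // Keep words and their lex ids in parallel arrays.
        $this->words[] = $word;
        $this->lexIds[] = $lexId;
    }

    public function getWords(): array
    {
        return $this->words;
    }

    public function setDefinition(string $definition): void
    {
        $this->definition = $definition;
    }

    public function getDefinition(): string
    {
        return $this->definition;
    }

    // Assumption: returns $this. The page only states that the return type is Synset.
    public function addLinkedSynset(SketchSynset $synset): SketchSynset
    {
        $this->linkedSynsets[] = $synset;
        return $this;
    }

    public function getLinkedSynsets(): array
    {
        return $this->linkedSynsets;
    }
}

// Made-up offsets and words, purely illustrative.
$synset = new SketchSynset(1000, 'n');
$synset->addWord('example_word', 0);
$synset->setDefinition('a made-up gloss');

$parent = new SketchSynset(2000, 'n');
$parent->addWord('broader_word', 0);
$synset->addLinkedSynset($parent);

echo implode(', ', $synset->getWords()), ': ', $synset->getDefinition(), PHP_EOL;
echo count($synset->getLinkedSynsets()), PHP_EOL;
```

With the real class, the same call sequence should apply, but note that getLinkedSynsets() is documented as returning synsets that are not fully hydrated.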

Method Details

__construct() public method

public __construct ( $synsetOffset, $pos )

addLinkedSynset() public method

public addLinkedSynset ( Synset &$synset ) : Synset
$synset	Synset
return	Synset

addWord() public method

public addWord ( string $word, integer $lexId )
$word	string
$lexId	integer

getDefinition() public method

public getDefinition ( ) : string
return	string

getLinkedSynsets() public method

public getLinkedSynsets ( ) : Synset[]
return	Synset[]	Returned synsets are not fully hydrated

getSrcWordIdx() public method

public getSrcWordIdx ( ) : integer
return	integer

getTargetWordIdx() public method

public getTargetWordIdx ( ) : integer
return	integer

getWords() public method

public getWords ( ) : string[]
return	string[]

setDefinition() public method

public setDefinition ( string $definition )
$definition	string

setSrcWordIdx() public method

public setSrcWordIdx ( integer $wordIdx )
$wordIdx	integer

setTargetWordIdx() public method

public setTargetWordIdx ( integer $wordIdx )
$wordIdx	integer

Property Details

$definition protected property

Each synset contains a gloss. A gloss is represented as a vertical bar (|), followed by a text string that continues until the end of the line. The gloss may contain a definition, one or more example sentences, or both.

protected $definition
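
As a rough illustration of the gloss layout described above, the sketch below splits a data-file line at the first vertical bar. The helper name splitGloss and the sample line are hypothetical, and this is not necessarily how the library itself extracts the definition.

```php
<?php
// Hypothetical helper: split a data-file line into its field part and its gloss.
function splitGloss(string $line): array
{
    // Everything before the first vertical bar is fields; the rest is the gloss.
    $parts = explode('|', $line, 2);
    $fields = trim($parts[0]);
    $gloss = isset($parts[1]) ? trim($parts[1]) : '';
    return [$fields, $gloss];
}

// Made-up line in the described layout, not copied from a real WordNet file.
$line = '00001000 03 n 01 example_word 0 000 | a made-up definition; "a made-up example sentence"';
[$fields, $gloss] = splitGloss($line);
echo $gloss, PHP_EOL;
```

Limiting explode() to two parts keeps any later vertical bar inside the gloss text rather than splitting on it.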

$frames protected property

In data.verb only, a list of numbers corresponding to the generic verb sentence frames for words in the synset. frames is of the form: f_cnt + f_num w_num [ + f_num w_num...], where f_cnt is a two-digit decimal integer indicating the number of generic frames listed, f_num is a two-digit decimal integer frame number, and w_num is a two-digit hexadecimal integer indicating the word in the synset that the frame applies to. As with pointers, if this number is 00, f_num applies to all words in the synset. If non-zero, it applies only to the word indicated. Word numbers are assigned as described for pointers. Each f_num w_num pair is preceded by a +.

protected $frames
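
The frames layout above mixes decimal and hexadecimal fields, which is easy to get wrong. The following sketch parses a frames string under the stated format. The function name parseFrames and the sample input are hypothetical, and the library may parse this field differently.

```php
<?php
// Hypothetical helper: parse "f_cnt + f_num w_num [+ f_num w_num ...]".
function parseFrames(string $frames): array
{
    $tokens = preg_split('/\s+/', trim($frames));
    $count = (int) array_shift($tokens); // f_cnt is decimal
    $result = [];
    for ($i = 0; $i < $count; $i++) {
        // Each pair occupies three tokens: "+", f_num, w_num.
        $base = $i * 3;
        if (($tokens[$base] ?? null) !== '+' || !isset($tokens[$base + 2])) {
            throw new InvalidArgumentException("Malformed frame pair at index $i");
        }
        $result[] = [
            'frame' => (int) $tokens[$base + 1],   // f_num is decimal
            'word'  => hexdec($tokens[$base + 2]), // w_num is hexadecimal; 0 means all words
        ];
    }
    return $result;
}

// Made-up input: two frames, the first for all words, the second for word 0a (10).
print_r(parseFrames('02 + 08 00 + 11 0a'));
```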

$lexFilenum protected property

Two digit decimal integer corresponding to the lexicographer file name containing the synset.

protected $lexFilenum

$lexIds protected property

One-digit hexadecimal integer that, when appended onto lemma, uniquely identifies a sense within a lexicographer file. lex_id numbers usually start with 0 and are incremented as additional senses of the word are added to the same file, although there is no requirement that the numbers be consecutive or begin with 0. Note that a value of 0 is the default, and therefore is not present in lexicographer files.

protected $lexIds

$linkedSynsets protected property

A pointer from this synset to another. ptr is of the form: pointerSymbol synsetOffset pos source/target, where synsetOffset is the byte offset of the target synset in the data file corresponding to pos.

protected $linkedSynsets
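
A pointer in the form given above can be broken into its four fields as sketched below. One assumption goes beyond this page: the source/target field is treated as four hexadecimal digits, the first two being the source word number and the last two the target word number, which follows the WordNet data-file format documentation as I understand it. The function name parsePointer and the sample pointer are hypothetical.

```php
<?php
// Hypothetical helper: parse "pointerSymbol synsetOffset pos source/target".
function parsePointer(string $ptr): array
{
    $t = preg_split('/\s+/', trim($ptr));
    if (count($t) !== 4 || strlen($t[3]) !== 4) {
        throw new InvalidArgumentException("Malformed pointer: $ptr");
    }
    return [
        'symbol'        => $t[0],
        'synsetOffset'  => $t[1], // kept as a string to preserve zero padding
        'pos'           => $t[2],
        // Assumption: source/target is two 2-digit hex word numbers.
        'srcWordIdx'    => hexdec(substr($t[3], 0, 2)),
        'targetWordIdx' => hexdec(substr($t[3], 2, 2)),
    ];
}

// Made-up pointer, not taken from a real data file.
print_r(parsePointer('@ 00002000 n 0102'));
```

The two parsed word numbers plausibly correspond to what getSrcWordIdx() and getTargetWordIdx() expose, though this page does not state that explicitly.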

$pCnt protected property

Three-digit decimal integer indicating the number of pointers from this synset to other synsets. If p_cnt is 000, the synset has no pointers.

protected $pCnt

$srcWordIdx protected property

protected int $srcWordIdx

$synsetOffset protected property

Current byte offset in the file, represented as an 8-digit decimal integer.

protected $synsetOffset
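
Because the offset is a byte position, a record can be fetched by seeking directly to it. The sketch below demonstrates the idea on an in-memory stream built from made-up lines. The helper readRecordAt is hypothetical and is not part of the library.

```php
<?php
// Build a small in-memory "data file" from made-up lines and record
// the byte offset at which each line starts, zero-padded to 8 digits.
$lines = [
    "first made-up record | gloss one\n",
    "second made-up record | gloss two\n",
];
$handle = fopen('php://memory', 'w+');
$offsets = [];
foreach ($lines as $line) {
    $offsets[] = sprintf('%08d', ftell($handle));
    fwrite($handle, $line);
}

// Hypothetical helper: read the single line that starts at the given offset.
function readRecordAt($handle, string $synsetOffset): string
{
    fseek($handle, (int) $synsetOffset); // the cast drops the zero padding
    $line = fgets($handle);
    return $line === false ? '' : rtrim($line, "\r\n");
}

echo readRecordAt($handle, $offsets[1]), PHP_EOL;
```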

$targetWordIdx protected property

protected int $targetWordIdx

$wCnt protected property

Two-digit hexadecimal integer indicating the number of words in the synset.

protected $wCnt
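
Since the word count is hexadecimal, a naive decimal cast would miscount synsets with ten or more words. The sketch below reads the count with hexdec() and then consumes that many word and lex_id pairs. It assumes the input starts at the w_cnt field and that words and lex_ids alternate, as in the WordNet data-file layout. The function name parseWords and the sample input are hypothetical.

```php
<?php
// Hypothetical helper: parse "w_cnt word lex_id [word lex_id ...]".
function parseWords(string $fields): array
{
    $t = preg_split('/\s+/', trim($fields));
    $wCnt = hexdec(array_shift($t)); // w_cnt is hexadecimal
    $words = [];
    for ($i = 0; $i < $wCnt; $i++) {
        if (!isset($t[2 * $i + 1])) {
            throw new InvalidArgumentException("Expected $wCnt word/lex_id pairs");
        }
        $words[] = [
            'word'  => str_replace('_', ' ', $t[2 * $i]), // underscores stand for spaces
            'lexId' => hexdec($t[2 * $i + 1]),            // lex_id is one hex digit
        ];
    }
    return $words;
}

// Made-up input: two words, the second with lex_id a (10).
print_r(parseWords('02 example_word 0 Sample a'));
```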

$words protected property

ASCII form of a word as entered in the synset by the lexicographer, with spaces replaced by underscore characters (_). The text of the word is case-sensitive, in contrast to its form in the corresponding index.pos file, which contains only lower-case forms. In data.adj, a word is followed by a syntactic marker if one was specified in the lexicographer file. A syntactic marker is appended, in parentheses, onto word without any intervening spaces.

protected $words
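
To make the syntactic marker rule concrete, the sketch below separates a trailing parenthesized marker from a word as it might appear in data.adj. The function name splitSyntacticMarker and the sample words are hypothetical, and the assumption that markers consist of lowercase letters is mine.

```php
<?php
// Hypothetical helper: split "word(marker)" into [word, marker].
// Returns null for the marker when the word has none.
function splitSyntacticMarker(string $word): array
{
    // Assumption: the marker is lowercase letters in parentheses at the very end.
    if (preg_match('/^(.+)\(([a-z]+)\)$/', $word, $m)) {
        return [$m[1], $m[2]];
    }
    return [$word, null];
}

// Made-up words in the described layout.
print_r(splitSyntacticMarker('sample_word(p)'));
print_r(splitSyntacticMarker('plain_word'));
```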