PHP Class TextAnalysis\Models\Wordnet\Synset

Author: yooper
Inheritance: use trait TextAnalysis\Traits\WordnetPointerSymbolMap
Project: yooper/php-text-analysis

Protected Properties

Property Type Description
$definition Each synset contains a gloss. A gloss is represented as a vertical bar (|), followed by a text string that continues until the end of the line. The gloss may contain a definition, one or more example sentences, or both.
$frames In data.verb only, a list of numbers corresponding to the generic verb sentence frames for words in the synset. frames is of the form: f_cnt + f_num w_num [ + f_num w_num...] where f_cnt is a two digit decimal integer indicating the number of generic frames listed, f_num is a two digit decimal integer frame number, and w_num is a two digit hexadecimal integer indicating the word in the synset that the frame applies to. As with pointers, if this number is 00, f_num applies to all words in the synset. If non-zero, it is applicable only to the word indicated. Word numbers are assigned as described for pointers. Each f_num w_num pair is preceded by a +.
$lexFilenum Two digit decimal integer corresponding to the lexicographer file name containing the synset.
$lexIds One digit hexadecimal integer that, when appended onto lemma, uniquely identifies a sense within a lexicographer file. lex_id numbers usually start with 0 and are incremented as additional senses of the word are added to the same file, although there is no requirement that the numbers be consecutive or begin with 0. Note that a value of 0 is the default, and therefore is not present in lexicographer files.
$linkedSynsets A pointer from this synset to another. ptr is of the form: pointerSymbol synsetOffset pos source/target, where synsetOffset is the byte offset of the target synset in the data file corresponding to pos.
$pCnt Three digit decimal integer indicating the number of pointers from this synset to other synsets. If p_cnt is 000 the synset has no pointers.
$srcWordIdx integer
$synsetOffset Current byte offset in the file represented as an 8 digit decimal integer.
$targetWordIdx integer
$wCnt Two digit hexadecimal integer indicating the number of words in the synset.
$words ASCII form of a word as entered in the synset by the lexicographer, with spaces replaced by underscore characters (_). The text of the word is case sensitive, in contrast to its form in the corresponding index.pos file, which contains only lower-case forms. In data.adj, a word is followed by a syntactic marker if one was specified in the lexicographer file. A syntactic marker is appended, in parentheses, onto word without any intervening spaces.

Public Methods

Method Description
__construct ( $synsetOffset, $pos )
addLinkedSynset ( Synset &$synset ) : Synset
addWord ( string $word, integer $lexId )
getDefinition ( ) : string
getLinkedSynsets ( ) : Synset[]
getSrcWordIdx ( ) : integer
getTargetWordIdx ( ) : integer
getWords ( ) : string[]
setDefinition ( string $definition )
setSrcWordIdx ( integer $wordIdx )
setTargetWordIdx ( integer $wordIdx )

Method Details

__construct() public method

public __construct ( $synsetOffset, $pos )

addLinkedSynset() public method

public addLinkedSynset ( Synset &$synset ) : Synset
$synset Synset
return Synset

addWord() public method

public addWord ( string $word, integer $lexId )
$word string
$lexId integer

getDefinition() public method

public getDefinition ( ) : string
return string

getLinkedSynsets() public method

public getLinkedSynsets ( ) : Synset[]
return Synset[] Returned synsets are not fully hydrated

getSrcWordIdx() public method

public getSrcWordIdx ( ) : integer
return integer

getTargetWordIdx() public method

public getTargetWordIdx ( ) : integer
return integer

getWords() public method

public getWords ( ) : string[]
return string[]

setDefinition() public method

public setDefinition ( string $definition )
$definition string

setSrcWordIdx() public method

public setSrcWordIdx ( integer $wordIdx )
$wordIdx integer

setTargetWordIdx() public method

public setTargetWordIdx ( integer $wordIdx )
$wordIdx integer

Property Details

$definition protected property

Each synset contains a gloss. A gloss is represented as a vertical bar (|), followed by a text string that continues until the end of the line. The gloss may contain a definition, one or more example sentences, or both.
protected $definition
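
The gloss layout described above can be sketched in a few lines of PHP. The helpers `extractGloss()` and `splitGloss()` are hypothetical names introduced for illustration and are not part of php-text-analysis. The sketch also assumes that example sentences are wrapped in double quotes, which the description above does not state.

```php
<?php
// Hypothetical helper (not part of php-text-analysis): returns the text
// after the vertical bar, or null when the line carries no gloss.
function extractGloss(string $line): ?string
{
    $barPos = strpos($line, '|');
    if ($barPos === false) {
        return null;
    }
    return trim(substr($line, $barPos + 1));
}

// Hypothetical helper: separates the definition from example sentences.
// Assumption: examples are wrapped in double quotes.
function splitGloss(string $gloss): array
{
    preg_match_all('/"([^"]*)"/', $gloss, $matches);
    // Everything before the first quote is treated as the definition.
    $quotePos = strpos($gloss, '"');
    $definition = $quotePos === false ? $gloss : substr($gloss, 0, $quotePos);
    return [
        'definition' => rtrim(trim($definition), '; '),
        'examples' => $matches[1],
    ];
}

// Made-up line for illustration only, not taken from a real data file.
$gloss = extractGloss('word 0 000 | a short definition; "a first example"; "a second example"');
$parts = splitGloss($gloss);
```

The result of `splitGloss()` could then be passed to `setDefinition()`, although how the library itself fills `$definition` is not shown on this page.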

$frames protected property

In data.verb only, a list of numbers corresponding to the generic verb sentence frames for words in the synset. frames is of the form: f_cnt + f_num w_num [ + f_num w_num...] where f_cnt is a two digit decimal integer indicating the number of generic frames listed, f_num is a two digit decimal integer frame number, and w_num is a two digit hexadecimal integer indicating the word in the synset that the frame applies to. As with pointers, if this number is 00, f_num applies to all words in the synset. If non-zero, it is applicable only to the word indicated. Word numbers are assigned as described for pointers. Each f_num w_num pair is preceded by a +.
protected $frames
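
As a rough illustration of the frames layout described above, the following sketch decodes a frames field into frame and word numbers. `parseFrames()` is a hypothetical helper, not a function of php-text-analysis, and the sample field is made up.

```php
<?php
// Hypothetical helper (not part of php-text-analysis): decodes a field of
// the form "f_cnt + f_num w_num [+ f_num w_num...]".
function parseFrames(string $field): array
{
    $tokens = preg_split('/\s+/', trim($field));
    $count = (int) array_shift($tokens); // f_cnt is decimal
    $frames = [];
    for ($i = 0; $i < $count; $i++) {
        $base = $i * 3; // each entry is "+", f_num, w_num
        if (!isset($tokens[$base + 2]) || $tokens[$base] !== '+') {
            throw new InvalidArgumentException("Malformed frames field: $field");
        }
        $frames[] = [
            'frame' => (int) $tokens[$base + 1],  // f_num is decimal
            'word' => hexdec($tokens[$base + 2]), // w_num is hexadecimal; 0 means all words
        ];
    }
    return $frames;
}

// Made-up field: frame 8 applies to all words, frame 26 only to word 1.
$frames = parseFrames('02 + 08 00 + 26 01');
```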

$lexFilenum protected property

Two digit decimal integer corresponding to the lexicographer file name containing the synset.
protected $lexFilenum

$lexIds protected property

One digit hexadecimal integer that, when appended onto lemma, uniquely identifies a sense within a lexicographer file. lex_id numbers usually start with 0 and are incremented as additional senses of the word are added to the same file, although there is no requirement that the numbers be consecutive or begin with 0. Note that a value of 0 is the default, and therefore is not present in lexicographer files.
protected $lexIds

$linkedSynsets protected property

A pointer from this synset to another. ptr is of the form: pointerSymbol synsetOffset pos source/target, where synsetOffset is the byte offset of the target synset in the data file corresponding to pos.
protected $linkedSynsets
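
The pointer layout above can be sketched as follows. `parsePointer()` is a hypothetical helper and not the library's own parser. It relies on the WordNet data-file convention that source/target is four hexadecimal digits, where the first two give the word number in the source synset and the last two the word number in the target synset, with 0000 marking a pointer between whole synsets. The array keys mirror the property names on this page for readability only.

```php
<?php
// Hypothetical helper (not part of php-text-analysis): splits one pointer
// of the form "pointerSymbol synsetOffset pos source/target".
function parsePointer(string $ptr): array
{
    $parts = preg_split('/\s+/', trim($ptr));
    if (count($parts) !== 4 || strlen($parts[3]) !== 4) {
        throw new InvalidArgumentException("Malformed pointer: $ptr");
    }
    [$symbol, $offset, $pos, $sourceTarget] = $parts;
    return [
        'pointerSymbol' => $symbol,
        'synsetOffset' => (int) $offset, // 8 digit decimal byte offset
        'pos' => $pos,
        'srcWordIdx' => hexdec(substr($sourceTarget, 0, 2)),
        'targetWordIdx' => hexdec(substr($sourceTarget, 2, 2)),
    ];
}

// Made-up pointers for illustration only.
$semantic = parsePointer('@ 00001740 n 0000');
$lexical = parsePointer('+ 01234567 v 0102');
```

Casting the offset to an integer drops its leading zeros; keep the original string if the value has to be written back in its 8 digit form.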

$pCnt protected property

Three digit decimal integer indicating the number of pointers from this synset to other synsets. If p_cnt is 000 the synset has no pointers.
protected $pCnt

$srcWordIdx protected property

protected int $srcWordIdx

$synsetOffset protected property

Current byte offset in the file represented as an 8 digit decimal integer.
protected $synsetOffset

$targetWordIdx protected property

protected int $targetWordIdx

$wCnt protected property

Two digit hexadecimal integer indicating the number of words in the synset.
protected $wCnt
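
Because the word count is hexadecimal, it has to be decoded before it can drive a loop. The sketch below reads the count and the word and lex_id pairs that follow it. `readWords()` is a hypothetical helper, and it assumes the token list starts at the w_cnt field, with each word immediately followed by its lex_id as in the WordNet data-file layout.

```php
<?php
// Hypothetical helper (not part of php-text-analysis): reads w_cnt and the
// word/lex_id pairs that follow it. $tokens must start at the w_cnt field.
function readWords(array $tokens): array
{
    if (!isset($tokens[0])) {
        throw new InvalidArgumentException('Missing w_cnt field');
    }
    $wCnt = hexdec($tokens[0]); // two digit hexadecimal count
    $words = [];
    for ($i = 0; $i < $wCnt; $i++) {
        if (!isset($tokens[2 + 2 * $i])) {
            throw new InvalidArgumentException('Fewer words than w_cnt announces');
        }
        $words[] = [
            'word' => $tokens[1 + 2 * $i],
            'lexId' => hexdec($tokens[2 + 2 * $i]), // one digit hexadecimal
        ];
    }
    return $words;
}

// Made-up tokens: two words, followed by the start of the pointer count.
$words = readWords(['02', 'dog', '0', 'domestic_dog', '0', '016']);
```

Each pair has the same shape as the arguments of `addWord ( string $word, integer $lexId )`.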

$words protected property

ASCII form of a word as entered in the synset by the lexicographer, with spaces replaced by underscore characters (_). The text of the word is case sensitive, in contrast to its form in the corresponding index.pos file, which contains only lower-case forms. In data.adj, a word is followed by a syntactic marker if one was specified in the lexicographer file. A syntactic marker is appended, in parentheses, onto word without any intervening spaces.
protected $words
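
A minimal sketch of how a stored word might be turned back into display text, under the rules described above. `parseWord()` is a hypothetical helper, not part of php-text-analysis. It assumes that a trailing parenthesised group of lower-case letters is the syntactic marker.

```php
<?php
// Hypothetical helper (not part of php-text-analysis): strips an optional
// syntactic marker and restores spaces in a stored word.
function parseWord(string $raw): array
{
    $marker = null;
    // Assumption: a trailing "(letters)" group is the syntactic marker.
    if (preg_match('/^(.*)\(([a-z]+)\)$/', $raw, $m)) {
        $raw = $m[1];
        $marker = $m[2];
    }
    return [
        'word' => str_replace('_', ' ', $raw), // case is preserved
        'marker' => $marker,
    ];
}

$adjective = parseWord('galore(ip)');
$collocation = parseWord('New_York');
```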