This tool is for thresholding infrequent words and lexical entry templates, and for expanding lexical entry templates by lexical rules.
lexrefine [options] rule_module orig_lexicon orig_template new_lexicon new_template | |
rule_module | LiLFeS program in which lexical rules are implemented |
orig_lexicon | input lexicon |
orig_template | input template database |
new_lexicon | refined lexicon |
new_template | refined template |
Options | |
-wf threshold | threshold of word frequency (default: 1) |
-tf threshold | threshold of the frequency of lexical entry templates (default: 0) |
-uwf threshold | threshold of the frequency of words to be regarded as unknown words (default: 1) |
-utf threshold | threshold of the frequency of lexical entry templates to be adopted for unknown words (default: 0) |
-v | print debug messages |
-vv | print many debug messages |
-vvv | print many many debug messages |
"lexrefine" refines a lexicon and a template database with the following operations.
First, remove lexical entry templates whose occurrence count is less than the threshold (the value specified by the "-tf" option).
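The template thresholding step can be illustrated with a minimal Python sketch. This is not the lexrefine implementation; the function name `threshold_templates`, the dictionary representation of the template database, and the template names are assumptions made only for illustration.

```python
def threshold_templates(template_counts, tf_threshold=0):
    """Keep templates whose occurrence count is not less than tf_threshold.

    template_counts: hypothetical mapping from template name to occurrence count.
    tf_threshold: plays the role of the "-tf" option (default 0, as documented).
    """
    return {
        name: count
        for name, count in template_counts.items()
        if count >= tf_threshold
    }


# Hypothetical template names and counts.
counts = {"tmpl_a": 12, "tmpl_b": 1, "tmpl_c": 3}
kept = threshold_templates(counts, tf_threshold=2)
print(kept)
```

With the default threshold of 0, no template is removed, because no occurrence count is less than 0.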
Next, apply lexical rules to the remaining templates, and make lexical entry templates for inflected words. Write lexical rules with the following interfaces defined in "mayz/lexrefine.lil".
expand_lexical_template(+$InTemplateName, +$InTemplate, +$Freq, -$LexRules, -$NewTemplate) | |
$InTemplateName | name of an input template |
$InTemplate | input lexical entry template |
$Freq | occurrence count of the template |
$LexRules | history of applied lexical rules |
$NewTemplate | derived lexical entry template |
Apply lexical rules to a lexical entry template of a lexeme, and make a new lexical entry template. |
To use derived lexical entry templates, implement the following interface.
expand_lexicon(+$InKey, +$TemplateName, -$NewKey) | |
$InKey | a key of an input word |
$TemplateName | name of a template ('lex_template' type) |
$NewKey | a new key |
From a lexicon key, $InKey, make a new key, $NewKey, to which the derived template should be assigned. For example, when a passive template is derived from a base-verb template, the new key "loved/VBN" is made from "love/VB". |
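The key mapping in the example above can be sketched in Python as follows. The real interface is a LiLFeS predicate that receives a template name; this sketch instead assumes that keys have the form "word/POS", that the applied rule is known by a hypothetical name such as "passive", and that a deliberately naive inflection function is good enough for illustration.

```python
def naive_past_participle(base):
    """Naive regular inflection (assumption): ignores irregular verbs and consonant doubling."""
    return base + "d" if base.endswith("e") else base + "ed"


# Hypothetical rule table: rule name -> (inflection function, POS of the new key).
RULES = {"passive": (naive_past_participle, "VBN")}


def expand_key(in_key, rule_name):
    """Derive the key that a derived template would be assigned to."""
    word, _pos = in_key.rsplit("/", 1)  # assumes keys look like "word/POS"
    inflect, new_pos = RULES[rule_name]
    return inflect(word) + "/" + new_pos


new_key = expand_key("love/VB", "passive")
print(new_key)  # loved/VBN
```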
Next, for each entry in the lexicon, if the occurrence count of the word (more precisely, of the key given by the third argument of "reduce_lexical_template/5") is less than the threshold (the value specified by the "-wf" option), the entry is removed from the lexicon. The other entries remain in the lexicon, but the templates deleted in the first step are automatically removed from them.
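A minimal Python sketch of this lexicon filtering step is given below. It is an illustration of the described behavior, not the actual implementation; the function name `refine_lexicon` and the dictionary-based lexicon are assumptions. What happens to an entry whose templates were all deleted is not stated here, so the sketch simply keeps it with an empty list.

```python
def refine_lexicon(lexicon, word_counts, kept_templates, wf_threshold=1):
    """Drop infrequent keys and strip templates removed in the first step.

    lexicon: hypothetical mapping from key (e.g. "love/VB") to a list of template names.
    word_counts: hypothetical mapping from key to occurrence count.
    kept_templates: names of templates that survived template thresholding.
    wf_threshold: plays the role of the "-wf" option (default 1, as documented).
    """
    refined = {}
    for key, templates in lexicon.items():
        if word_counts.get(key, 0) < wf_threshold:
            continue  # occurrence count below the threshold: remove the entry
        # Assumption: an entry left with no templates is kept as an empty list.
        refined[key] = [t for t in templates if t in kept_templates]
    return refined


# Hypothetical data.
lexicon = {"love/VB": ["tmpl_a", "tmpl_b"], "rare/JJ": ["tmpl_c"]}
word_counts = {"love/VB": 5, "rare/JJ": 1}
refined = refine_lexicon(lexicon, word_counts, {"tmpl_a", "tmpl_c"}, wf_threshold=2)
print(refined)
```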
In addition, a word is regarded as an "unknown word" if its occurrence count is less than the value of the "-uwf" option. That is, the templates assigned to the word are added to the template list for unknown words. The key of an unknown word is specified with "unknown_word_key/2", defined in "mayz/lexrefine.lil".
unknown_word_key(+$InKey, -$OutKey) | |
$InKey | key of an input word |
$OutKey | key of an unknown word |
Make a key for an unknown word. |
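The unknown-word handling can be sketched in Python as follows. Several points are assumptions rather than documented behavior: the Python function `unknown_word_key` is only a stand-in for the LiLFeS predicate and maps "word/POS" to a hypothetical "UNKNOWN/POS" key, and the "-utf" threshold is interpreted as applying to the overall occurrence count of each template.

```python
def unknown_word_key(in_key):
    """Hypothetical stand-in for unknown_word_key/2: keep the POS, drop the word."""
    _word, pos = in_key.rsplit("/", 1)  # assumes keys look like "word/POS"
    return "UNKNOWN/" + pos


def collect_unknown_templates(lexicon, word_counts, template_counts,
                              uwf_threshold=1, utf_threshold=0):
    """Pool the templates of infrequent words under unknown-word keys."""
    unknown = {}
    for key, templates in lexicon.items():
        if word_counts.get(key, 0) >= uwf_threshold:
            continue  # frequent enough: not regarded as an unknown word
        bucket = unknown.setdefault(unknown_word_key(key), set())
        for name in templates:
            # Assumption: "-utf" is compared with the template's overall count.
            if template_counts.get(name, 0) >= utf_threshold:
                bucket.add(name)
    return unknown


# Hypothetical data.
lexicon = {"love/VB": ["tmpl_a"], "abhor/VB": ["tmpl_b", "tmpl_c"], "glib/JJ": ["tmpl_d"]}
word_counts = {"love/VB": 10, "abhor/VB": 1, "glib/JJ": 1}
template_counts = {"tmpl_a": 20, "tmpl_b": 4, "tmpl_c": 1, "tmpl_d": 3}
unknown = collect_unknown_templates(lexicon, word_counts, template_counts,
                                    uwf_threshold=2, utf_threshold=2)
print(unknown)
```

In this sketch, "abhor/VB" and "glib/JJ" fall below the word threshold, so their sufficiently frequent templates are collected under the keys "UNKNOWN/VB" and "UNKNOWN/JJ".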