Lexical rules


Lexical rules generate lexical entries from the base forms of words (lexemes). Examples include the rule that derives the past tense form of a verb from its uninflected form, and the rule that derives the plural form of a noun from its singular form. The lexical rule types are defined in the file "enju/types.lil". Instances of these types are defined in "enju-devel/invlexrule.lil", "enju-devel/lexrule.lil" and "enju-devel/lexcommon.lil". Both the types and their instances are unique to ENJU.

Characteristics of Lexical Rules of ENJU

ENJU has two kinds of lexical rules: one converts lexemes to lexical entries, and the other converts lexical entries back to lexemes. Having both directions compensates for a weak point of corpus-oriented grammar development by increasing the number of lexical entries. Most lexical entries acquired from a corpus are the result of some kind of syntactic transformation. When extracting a grammar from a corpus, we therefore first apply lexical rules in the reverse direction, so that lexemes are obtained from the entries observed in the corpus. We then apply lexical rules in the normal direction to expand the dictionary. With this approach, more lexemes are acquired, which in turn increases the number of lexical entries that can be generated.
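The two-step procedure can be illustrated with a small Python sketch. This is not ENJU's implementation, which is written in LiLFeS and operates on feature structures. The names `Entry`, `FORWARD_RULES`, `INVERSE_RULES`, `to_lexeme`, `expand` and `build_dictionary`, as well as the category labels, are hypothetical and chosen only for illustration. The sketch also assumes that a lexeme is kept in the dictionary as an entry in its own right.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    base: str      # base form of the word
    category: str  # toy label standing in for a lexical template


# Hypothetical forward rules: lexeme category -> categories derivable from it.
FORWARD_RULES = {
    "verb_base": ["verb_past", "verb_3sg"],
    "noun_singular": ["noun_plural"],
}

# Inverse rules, obtained here by reversing the forward table:
# derived category -> lexeme category.
INVERSE_RULES = {
    derived: lexeme
    for lexeme, derived_list in FORWARD_RULES.items()
    for derived in derived_list
}


def to_lexeme(entry):
    """Apply an inverse rule; an entry that is already a lexeme is unchanged."""
    return Entry(entry.base, INVERSE_RULES.get(entry.category, entry.category))


def expand(lexeme):
    """Apply every forward rule to a lexeme; the lexeme itself is kept too."""
    derived = FORWARD_RULES.get(lexeme.category, [])
    return {lexeme} | {Entry(lexeme.base, cat) for cat in derived}


def build_dictionary(corpus_entries):
    """Step 1: reverse direction (entries -> lexemes).
    Step 2: normal direction (lexemes -> expanded dictionary)."""
    lexemes = {to_lexeme(e) for e in corpus_entries}
    dictionary = set()
    for lexeme in lexemes:
        dictionary |= expand(lexeme)
    return lexemes, dictionary


# Entries as they might be observed in a corpus (toy data).
observed = [
    Entry("walk", "verb_past"),
    Entry("jump", "verb_3sg"),
    Entry("dog", "noun_plural"),
]
lexemes, dictionary = build_dictionary(observed)
```

In this toy run, "jump" is observed only in its third-person singular form, yet the expanded dictionary also contains its past tense entry, because the inverse rule first recovers the lexeme and the forward rules are then applied to it.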

The Content of the Source File

Types of Lexical Rules

In general, lexical rules can be applied to all words. Rules marked with (*) are special rules that can be applied only to those verbs judged appropriate on the basis of the training corpus.


MIYAO Yusuke (yusuke@is.s.u-tokyo.ac.jp)