LiLFeS modules
In addition to the tools explained above, MAYZ provides LiLFeS
modules to support grammar development. These can be used by loading
them from lilfes (with the "-l" option) or from a parser.
- Modules for grammar development
- Modules for parsing
- Browsing grammar development or parsing
- Using a parser in applications
"mayz/markhead.lil" is a program for annotating each node in a tree
with a head, argument, or modifier mark. Once you implement several
marking rules, it automatically annotates all nodes in a tree.
The interfaces for marking heads are as follows. The first two refer
to the MOD feature, while the third refers to the SYM feature to
determine heads.
head_tag(+$Tag)
|
The node will be marked as a head if the MOD feature
includes $Tag.
|
nonhead_tag(+$Tag)
|
The node will be marked as a non-head
(argument or modifier) if the MOD feature includes $Tag.
Arguments and modifiers are distinguished by other rules.
|
head_table(+$Sym, +$Dir, +$SymList)
|
$Sym | Symbol of the parent node
|
$Dir | Direction of searching a head ("left" or "right")
|
$SymList | List of symbols that should be marked as a head
|
When the symbol of the parent node is $Sym, the child nodes
are searched in the direction $Dir ("left" means left-to-right, and
"right" means right-to-left), and the node labeled with the first
element of $SymList is marked as a head. If the first element is not
found among the child nodes, the search is repeated with the next
element. If an element of $SymList is itself a list, a node labeled
with any symbol in that list is marked as a head. If no symbol is
found, the leftmost node is marked as a head if $Dir is "left", and
the rightmost one if $Dir is "right".
|
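The search order described for head_table/3 can be modeled outside LiLFeS. The following Python sketch is only an illustration of that description, not the module's implementation: the function name find_head, the use of plain strings for node labels, and the example symbols are assumptions, and the list of children is assumed to be non-empty.

```python
def find_head(child_symbols, direction, sym_list):
    """Return the index of the child that would be marked as the head."""
    # "left" scans left-to-right; "right" scans right-to-left.
    indices = list(range(len(child_symbols)))
    if direction == "right":
        indices.reverse()
    for entry in sym_list:
        # An entry that is itself a list matches any symbol it contains.
        candidates = entry if isinstance(entry, list) else [entry]
        for i in indices:
            if child_symbols[i] in candidates:
                return i
    # Nothing matched: leftmost child for "left", rightmost for "right".
    return indices[0]


children = ["DT", "JJ", "NN", "PP"]
print(find_head(children, "right", ["NNS", ["NN", "NNP"]]))  # 2
```

Earlier elements of the symbol list take priority over the scan direction: the whole list of children is scanned for the first element before the second element is considered.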
The following predicate marks heads in a parse tree using the above
interfaces.
mark_head(+$Tree)
|
$Tree | parse tree
|
Annotates a head mark in a parse tree using the
following algorithm.
- If one of the daughters is assigned "head", exit.
- If one of the daughters is assigned a modifier tag specified in
'head_tag/1', mark the node as a head.
- If a daughter is assigned a modifier tag specified in
'nonhead_tag/1', the node is ignored.
- Determine a head according to 'head_table/3'.
|
The interfaces for marking modifiers and arguments are as follows.
The program assumes that head marks are already assigned. The first
two refer to the MOD feature, while the rest refer to the SYM
feature.
argument_tag(+$Tag)
|
If the MOD feature includes $Tag, the node is marked
as an argument.
|
modifier_tag(+$Tag)
|
If the MOD feature includes $Tag, the node is marked
as a modifier.
|
head_argument_table(+$HeadSym, +$SymList)
|
$HeadSym | symbol of the head
|
$SymList | list of symbols
|
If the symbol of the head is $HeadSym, a sibling
node is marked as an argument if its symbol is included in $SymList.
|
argument_table(+$Sym, +$SymList)
|
$Sym | symbol of the mother
|
$SymList | list of symbols
|
If the symbol of the mother is $Sym, a sibling
node is marked as an argument if its symbol is included in $SymList.
|
left_argument_table(+$Sym, +$SymList)
|
$Sym | symbol of the mother
|
$SymList | list of symbols
|
If the symbol of the mother is $Sym, a sibling node
is marked as an argument if the node is to the left of the head and
its symbol is included in $SymList.
|
right_argument_table(+$Sym, +$SymList)
|
$Sym | symbol of the mother
|
$SymList | list of symbols
|
If the symbol of the mother is $Sym, a sibling node
is marked as an argument if the node is to the right of the head and
its symbol is included in $SymList.
|
Using the above interface, the following predicate assigns argument or
modifier marks to all nodes in a parse tree.
mark_modifier(+$Tree)
|
$Tree | parse tree
|
Nodes in $Tree are marked as a modifier or an
argument using the following algorithm.
- If the node has a tag specified by 'argument_tag/1', it is
marked as "argument".
- If the node has a tag specified by 'modifier_tag/1', it is
marked as "modifier".
- Using 'head_argument_table/2', argument marks are assigned.
- Using 'argument_table/2', argument marks are assigned.
- Using 'left_argument_table/2', argument marks are assigned.
- Using 'right_argument_table/2', argument marks are assigned.
- All the remaining nodes are assigned "modifier".
|
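The rule order of mark_modifier/1 amounts to a priority cascade in which the first applicable rule decides the mark. The Python sketch below models that cascade for the daughters of a single node. The dictionary layout, the function name mark_modifiers, and the example tags and symbols are all assumptions made for illustration; the real module operates on LiLFeS feature structures.

```python
def mark_modifiers(mother_sym, children, rules):
    """Assign "argument" or "modifier" to each unmarked child, in place."""
    head_pos = next(i for i, c in enumerate(children) if c["mark"] == "head")
    head_sym = children[head_pos]["sym"]
    for i, child in enumerate(children):
        if child["mark"] is not None:
            continue  # nodes that already carry a mark are left alone
        sym, tags = child["sym"], child["tags"]
        if tags & rules["argument_tag"]:
            mark = "argument"
        elif tags & rules["modifier_tag"]:
            mark = "modifier"
        elif sym in rules["head_argument_table"].get(head_sym, ()):
            mark = "argument"
        elif sym in rules["argument_table"].get(mother_sym, ()):
            mark = "argument"
        elif i < head_pos and sym in rules["left_argument_table"].get(mother_sym, ()):
            mark = "argument"
        elif i > head_pos and sym in rules["right_argument_table"].get(mother_sym, ()):
            mark = "argument"
        else:
            mark = "modifier"  # default for everything left over
        child["mark"] = mark


# Hypothetical rule set and daughters of a "VP" node.
rules = {
    "argument_tag": {"SBJ"},
    "modifier_tag": {"TMP"},
    "head_argument_table": {},
    "argument_table": {},
    "left_argument_table": {},
    "right_argument_table": {"VP": ["NP"]},
}
children = [
    {"sym": "VBD", "tags": set(), "mark": "head"},
    {"sym": "NP", "tags": set(), "mark": None},
    {"sym": "NP", "tags": {"TMP"}, "mark": None},
    {"sym": "PP", "tags": set(), "mark": None},
]
mark_modifiers("VP", children, rules)
print([c["mark"] for c in children])
# ['head', 'argument', 'modifier', 'modifier']
```

Because tag rules are checked before the symbol tables, the second NP is marked as a modifier even though the right argument table lists "NP" for "VP".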
The above predicate ignores nodes that have already been assigned a
mark. This means that you can assign marks to exceptional
constructions before using the above tools. Users can also implement
the following interface to mark exceptional trees. It is called when
the above predicate tries to assign a mark to each node.
mark_exceptional(+$Tree)
|
$Tree | parse tree
|
A user marks $Tree.
|
"mayz/binarizer.lil" provides a tool to binarize a tree annotated with
head, modifier, and argument marks.
tree_binarize(+$Tree, -$BinTree)
|
$Tree | input tree
|
$BinTree | binarized tree
|
$Tree is binarized into $BinTree.
|
This predicate binarizes a tree around its head: the nodes to the
right of the head are placed in the lower part of the binarized tree,
and the nodes to the left of the head in the higher part. If you need
an exceptional binarization strategy, the following interface can be
used. It is called for each node in a tree.
binarizer_preprocess(+$Tree, -$BinTree)
|
$Tree | input tree
|
$BinTree | binarized tree
|
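The head-centered strategy of tree_binarize/2 can be pictured with a small Python model. This is a sketch of the description only: the function name binarize and the use of nested tuples for trees are assumptions, and attaching same-side siblings nearest-first is one plausible reading of the text. The labels of the intermediate nodes are not modeled.

```python
def binarize(children, head_pos):
    """Fold a flat list of daughters into nested pairs around the head."""
    node = children[head_pos]
    # Right siblings are attached first, so they end up lower in the tree.
    for right in children[head_pos + 1:]:
        node = (node, right)
    # Left siblings are attached afterwards, so they end up higher.
    for left in reversed(children[:head_pos]):
        node = (left, node)
    return node


print(binarize(["L1", "L2", "H", "R1", "R2"], 2))
# ('L1', ('L2', (('H', 'R1'), 'R2')))
```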
"mayz/treematch.lil" provides predicates for pattern matching of parse
trees. It is useful when you use "treetrans" to convert parse trees. You
can match and substitute parse trees using pattern rules.
A pattern of a parse tree is represented with the feature
structure representation of a parse tree (i.e., the 'tree' type). In
addition, you can use the 'tree_any' type, which matches zero or more
parse trees. For example, the following pattern,
(tree &
TREE_NODE\SYM\"S" &
TREE_DTRS\[tree_any,
(tree & TREE_NODE\SYM\"VP"),
tree_any])
matches a tree in which the top node is labeled with "S" and which has
at least one daughter labeled with "VP". The match succeeds whether or
not the tree has further daughters to the left and/or the right of
the "VP" tree. The trees matched by 'tree_any' are stored in
the feature ANY_TREES\.
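The wildcard behavior of 'tree_any' resembles matching a list of daughters against a pattern with gaps of arbitrary length. The Python sketch below models only that aspect, using plain symbols instead of feature structures. The names ANY and match_daughters are hypothetical, and trying the shortest gap first is an assumption; the actual module works by unification and records the matched trees in ANY_TREES\.

```python
ANY = object()  # stands in for a 'tree_any' element of the pattern


def match_daughters(pattern, daughters):
    """Return the lists captured by each ANY, or None if there is no match."""
    if not pattern:
        return [] if not daughters else None
    first, rest = pattern[0], pattern[1:]
    if first is ANY:
        # Try absorbing 0, 1, 2, ... daughters, shortest first.
        for n in range(len(daughters) + 1):
            tail = match_daughters(rest, daughters[n:])
            if tail is not None:
                return [daughters[:n]] + tail
        return None
    if daughters and daughters[0] == first:
        return match_daughters(rest, daughters[1:])
    return None


print(match_daughters([ANY, "VP", ANY], ["NP", "ADVP", "VP", "."]))
# [['NP', 'ADVP'], ['.']]
```

The returned lists play the role of the captured trees: one list per wildcard, possibly empty.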
The following predicates are provided for the matching and the
substitution of parse trees using patterns.
tree_match(+$Pattern, +$Tree)
|
$Pattern | pattern on a parse tree ('tree' or 'tree_any')
|
$Tree | input parse tree ('tree')
|
Succeeds when the pattern matches with the parse tree.
|
> ?- tree_match((tree &
TREE_NODE\SYM\"SBAR" &
TREE_DTRS\[TREE_NODE\(SYM\"RB" & WORD\SURFACE\"rather"),
TREE_NODE\(SYM\"IN" & WORD\SURFACE\"than"),
TREE_NODE\(SYM\"NP")]),
(tree &
TREE_DTRS\[tree_any & ANY_TREES\[_|_],
tree & TREE_NODE\(SYM\"IN" & WORD\SURFACE\"than"),
tree & TREE_NODE\HEAD_MARK\argument])).
yes
|
tree_substitution(+$OutPattern, -$OutTree)
|
$OutPattern | pattern on a parse tree ('tree' or 'tree_any')
|
$OutTree | output ('tree')
|
Convert a pattern on a parse tree (including
'tree_any') into an ordinary parse tree (without 'tree_any').
|
tree_subst(+$InPattern, +$OutPattern, +$InTree, -$OutTree)
|
$InPattern | pattern on an input parse tree ('tree' or 'tree_any')
|
$OutPattern | pattern on an output parse tree ('tree' or 'tree_any')
|
$InTree | input parse tree (tree)
|
$OutTree | output (tree)
|
An input pattern is matched with an input parse
tree, and if the match succeeds, the output pattern is converted into
an output parse tree. That is, it is equivalent to the following
operations.
tree_match($InPattern, $InTree),
tree_substitution($OutPattern, $OutTree).
See the manual of "treetrans" for an example.
|
"mayz/grammar.lil" provides tools for looking up a lexicon and
template in databases.
import_lexicon(+$LexiconFile, +$TemplateFile)
|
$LexiconFile | file name of a lexicon
|
$TemplateFile | file name of a template database
|
Imports a lexicon and a template database.
|
lookup_lexicon(+$Word, -$TempNameList)
|
$Word | input word
|
$TempNameList | list of lex_template
|
Looks up a lexicon, and returns a list of template
names.
|
lookup_template(+$TempName, -$Sign)
|
$TempName | lex_template
|
$Sign | feature structure
|
Looks up a template in a template database, and
returns a feature structure of a template.
|
To use lookup_lexicon/2, the following interface must be
implemented to get a database key from an input word.
lexicon_lookup_key(+$Word, -$Key)
|
$Word | input word
|
$Key | key of a lexicon database
|
unknown_word_lookup_key(+$Word, -$Key)
|
$Word | input word
|
$Key | key of a lexicon database for an unknown word
|
"mayz/tagger.lil" provides tools for using an external tagger. The
following predicates are used for the initialization and termination
of an external tagger.
initialize_external_tagger(+$Name, +$Arguments)
|
$Name | command name of a tagger (string)
|
$Arguments | command-line arguments of a tagger (list of strings)
|
Initializes an external tagger.
|
terminate_external_tagger
|
Terminates an external tagger.
|
is_external_tagger_initialized
|
Succeeds if a tagger is already initialized.
|
After the initialization, the following predicates are used for
turning on/off the tagger.
enable_external_tagger
|
Turns on the tagger.
|
disable_external_tagger
|
Turns off the tagger.
|
is_external_tagger_enabled
|
Succeeds if a tagger is turned on.
|
The following predicate passes an input sentence to a tagger and
returns the resulting string.
external_tagger(+$Input, -$Output)
|
$Input | input string
|
$Output | output string
|
When the tagger is turned on, $Input is passed to the
tagger, and the output of the tagger is returned. When the tagger is
off, $Input is returned unchanged as $Output.
|
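The on/off behavior of external_tagger/2 can be summarized with a toy Python model. Everything here is hypothetical: the class ExternalTaggerModel, the stand-in function toy_tagger, and in particular the choice that the tagger starts disabled, which the text above does not specify. The sketch only shows that a disabled tagger passes its input through untouched.

```python
class ExternalTaggerModel:
    """Toy model of the on/off switch; tag_function stands in for the tagger."""

    def __init__(self, tag_function):
        self.tag_function = tag_function
        self.enabled = False  # assumption: off until enable() is called

    def enable(self):
        self.enabled = True

    def disable(self):
        self.enabled = False

    def tag(self, text):
        # When on, return the tagger's output; when off, return the input as is.
        return self.tag_function(text) if self.enabled else text


def toy_tagger(text):
    # Hypothetical tagger: appends a dummy tag to every token.
    return " ".join(token + "/X" for token in text.split())


model = ExternalTaggerModel(toy_tagger)
print(model.tag("time flies"))  # time flies
model.enable()
print(model.tag("time flies"))  # time/X flies/X
```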
"mayz/morivtrans.lil" is a module for browsing the process of tree
transformation (treetrans) and lexicon extraction (lexextract). Using
a web browser supporting XHTML and XSLT (e.g. Firefox) or MoriV, you can
browse tree structures and feature structures in the process of
grammar development.
This module works as an HTTP server and a CGI. First, load this
module together with modules for tree transformation and lexicon
extraction.
% lilfes -l tree_transformation_module -l lexicon_extraction_module -l mayz/morivtrans
Next, invoke the "cgi" command.
> ?- cgi.
Then, an HTTP server starts and waits for a connection. From your
browser, access "/cgi-lilfes/moriv?" on port 27109 of the host where
lilfes is running.
http://server_host:27109/cgi-lilfes/moriv?
Input a Penn Treebank-style tree to the form, and press the "Input"
button. You will see a menu in the lower-left area, and a parse tree
in the lower-right area. You can browse trees and feature structures
using the lower-left menu.
"mayz/morivparser.lil" is a module for browsing the results of parsing
with a grammar and a disambiguation model developed with MAYZ. Using
a web browser supporting XHTML and XSLT (e.g. Firefox) or
MoriV (http://www-tsujii.is.s.u-tokyo.ac.jp/moriv/), you
can browse parse trees and signs of parse results.
To use this module, you need to implement the following interfaces in
order to give a symbol to show a brief parse tree of a parse result.
They are defined in "mayz/display.lil".
sign_label(+$Sign, -$Symbol)
|
$Sign | sign
|
$Symbol | string
|
Returns a symbol representing the sign.
|
lexname_label(+$LexName, -$Symbol)
|
$LexName | LEX_NAME (the 2nd argument of lexical_entry/3)
|
$Symbol | string
|
Returns a symbol representing LEX_NAME.
|
schema_edge_label_unary(+$SchemaName, -$Label)
|
$SchemaName | schema name
|
$Label | edge symbol
|
Returns a symbol assigned to the edge of unary
schema application.
|
schema_edge_label_binary(+$SchemaName,
-$LeftLabel, -$RightLabel)
|
$SchemaName | schema name
|
$LeftLabel | symbol of the left edge
|
$RightLabel | symbol of the right edge
|
Returns symbols assigned to the edges of binary
schema application.
|
schema_label(+$SchemaName, -$Label)
|
$SchemaName | schema name
|
$Label | symbol
|
Returns a symbol representing a schema name.
|
lex_template_label(+$LexTemplate, -$Label)
|
$LexTemplate | lex_template
|
$Label | symbol
|
Returns a symbol representing a template name.
|
word_label(+$Word, -$Label)
|
$Word | word
|
$Label | symbol
|
Returns a symbol representing a word.
|
extent_label(+$Extent, -$Label)
|
$Extent | extent
|
$Label | symbol
|
Returns a symbol representing an extent (an element
of the 2nd argument of 'sentence_to_word_lattice/2').
|
This module works as an HTTP server and a CGI. When you run a parser,
load "mayz/morivparser.lil", and execute the "cgi" command. For
example, when you use "mayzup",
% mayzup -l grammar_module -l mayz/morivparser -e cgi
Then, an HTTP server starts and waits for a connection. From your
browser, access "/cgi-lilfes/moriv?" on port 27109 of the host where
lilfes is running.
http://server_host:27109/cgi-lilfes/moriv?
Enter a sentence in the form, and press the "Input" button. You will
see the brief result of parsing and a menu in the lower-left area.
You can browse parse trees and feature structures using the menu.
"mayz/morivchart.lil" is a module for browsing a parse chart (CKY
table). Using a web browser supporting XHTML and XSLT (e.g. Firefox)
or MoriV,
you can browse internal parse results generated during parsing.
To use this module, you need to implement the interfaces for getting
the symbols of parse trees. The interfaces are defined in
"mayz/display.lil". For details, see Browsing
the results of parsing.
When you run a parser, load "mayz/morivchart", and execute the "cgi"
command to run an HTTP server. Then, access the server using your
browser. Enter a sentence in the form, and you will see the chart in
the lower-left area. By clicking a chart cell, you will get the edges
in the cell in the lower-right area.
"mayz/morivgrammar.lil" is a module for browsing a lexicon using a web
browser supporting XHTML and XSLT (e.g. Firefox) or MoriV. You can
browse a list of lexical entries assigned to a word and their feature
structures.
To use this module, you need to implement interfaces defined in
"display.lil". For details, see Browsing the
results of parsing.
When you run a parser, load "mayz/morivgrammar", and execute the "cgi"
command to run an HTTP server. Then, access the server using your
browser. Enter a word/POS in the form, and you will see a list of
lexical entries. Click a link in the list, and you will see the
feature structure of that lexical entry in the lower-right frame.
"mayz/coverage.lil" is a module to measure the coverage obtained by a
grammar developed with MAYZ. Together with a grammar module, load
"mayz/coverage.lil", and execute the following predicate.
eval_coverage(+$Lexbank, +$Lexicon, +$Template, +$OutputFile)
|
$Lexbank | name of a lexbank used for the evaluation
|
$Lexicon | file name of a lexicon
|
$Template | file name of a template database
|
$OutputFile | name of the file to which results are written
|
For the evaluation of coverage, a lexbank of an unseen corpus is
used. Before the evaluation, you need to make a lexbank using
"treetrans" and "lexextract".
"mayz/evalparse.lil" is a module for evaluating the accuracy of
parsing with a grammar and a probabilistic model developed with MAYZ.
By implementing an interface to measure the number of correct answers
for a sentence, you can measure the accuracy for the whole test
corpus.
For the evaluation, the following interface is required to be
implemented.
eval_parse(+$Best, +$Correct, +$TermList,
-$NumAnswers, -$NumOutputs, -$NumCorrects, -$NumPartials, -$Errors)
|
$Best | parse_tree output by a parser
|
$Correct | correct parse_tree
|
$TermList | list of terminal nodes of a derivation
(corresponding to a lexbank)
|
$NumAnswers | Number of answers
|
$NumOutputs | Number of outputs
|
$NumCorrects | Number of exactly correct outputs
|
$NumPartials | Number of partially correct outputs
|
$Errors | list of strings (each element is output to the
result file)
|
When you run a parser, load "mayz/evalparse.lil", and execute the
following predicate. The result of evaluation is output to a file.
eval_parse_file(+$Derivbank, +$OutputFile)
|
$Derivbank | name of a derivbank
|
$OutputFile | name of an output file
|
The accuracy of parsing is measured against
$Derivbank, and the result is output to $OutputFile.
|
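The per-sentence counts returned by eval_parse/8 are natural inputs for corpus-level scores. The Python sketch below shows one common way to combine such counts into precision and recall. This is an assumption about how the counts might be used, not a description of what eval_parse_file/2 writes to its output file; the function name corpus_scores and the example numbers are illustrative.

```python
def corpus_scores(per_sentence_counts):
    """Sum (answers, outputs, corrects) triples and derive precision/recall."""
    answers = sum(a for a, _, _ in per_sentence_counts)
    outputs = sum(o for _, o, _ in per_sentence_counts)
    corrects = sum(c for _, _, c in per_sentence_counts)
    # Guard against empty corpora or parsers that produced no output.
    precision = corrects / outputs if outputs else 0.0
    recall = corrects / answers if answers else 0.0
    return precision, recall


# Two sentences: (NumAnswers, NumOutputs, NumCorrects) for each.
print(corpus_scores([(12, 10, 8), (8, 6, 4)]))
# (0.75, 0.6)
```

Summing the counts before dividing weights long sentences more heavily than averaging per-sentence scores would.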
"mayz/parseall.lil" is a LiLFeS module to store parse results in a
LiLFeS database (lildb). Each line of the input text is parsed, and
the results are stored in a database. The key of the database is the
line number of the input. If parsing fails, the result shows the
reason for the failure with the type parse_error and its
subtypes.
In this module, the following predicates are available.
parse_all(+$Input, +$Output)
|
$Input | Name of input file
|
$Output | Name of database
|
Parse each line of the input file $Input, and store
the results in the database $Output.
|
parse_all(+$Output)
|
$Output | Name of database
|
Parse each line of the standard input, and store the
results in the database $Output.
|
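The storage scheme of parse_all can be modeled in Python with a dictionary standing in for the lildb database. This is a sketch under several assumptions: line numbers are taken to start at 1, the parser is a hypothetical function toy_parse that signals failure with ValueError, and a failure is recorded as a ("parse_error", message) pair rather than as a LiLFeS type.

```python
import io


def parse_all(lines, parse, database):
    """Parse each input line and store the result under its line number."""
    for line_number, line in enumerate(lines, start=1):
        sentence = line.rstrip("\n")
        try:
            database[line_number] = parse(sentence)
        except ValueError as error:
            # A failed parse is stored too, with the reason for the failure.
            database[line_number] = ("parse_error", str(error))


def toy_parse(sentence):
    # Hypothetical parser: splits on whitespace, fails on empty input.
    if not sentence:
        raise ValueError("empty input")
    return sentence.split()


db = {}
parse_all(io.StringIO("a b\n\nc\n"), toy_parse, db)
print(db)
# {1: ['a', 'b'], 2: ('parse_error', 'empty input'), 3: ['c']}
```

Keeping failed lines in the database means every input line has an entry, so results can be aligned with the input by line number.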
MIYAO Yusuke (yusuke@is.s.u-tokyo.ac.jp)