This tool makes amis-style data files from an unfiltered event file.
amisfilter model_name mask_module uevent_file count_file model_file event_file | |
model_name | name of a probabilistic model (also used in parsing) |
mask_module | LiLFeS program that implements the masks applied to unfiltered events |
uevent_file | input unfiltered event file (text or compressed (gz or bz) format) |
count_file | file to output feature counts (text format) |
model_file | output model file (AmisModel format) |
event_file | output event file (AmisEvent format) |
Options | |
-t threshold | threshold of feature frequency (default: 1) |
-n threshold | maximum number of events to output |
-v | print debug messages |
-vv | print many debug messages |
This tool makes amis-style data files from an unfiltered event file produced by "unimaker" or "forestmaker".
"unimaker" and "forestmaker" output unfiltered events in the following format, with fields separated by "//". The last field is a category name.
in//IN//vp[PPnp]//uni
Features of a maximum entropy model are generated by applying "0/1" masks to the fields of a string in the above format. A field whose mask value is 0 is replaced by "_". For example, applying the mask (0, 1, 1) to the above string yields the following feature.
_//IN//vp[PPnp]//uni
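The masking step can be sketched in Python as follows. This is only an illustration, not the amisfilter implementation: the function name `apply_mask` is hypothetical, and the sketch assumes that the mask covers every field except the final category name and that a masked-out field is written as "_", as in the example above.

```python
def apply_mask(event, mask, sep="//"):
    """Apply a 0/1 mask to an unfiltered event string (illustrative only)."""
    # Split off the last field, which is the category name.
    *fields, category = event.split(sep)
    if len(mask) != len(fields):
        raise ValueError("mask length must match the number of fields")
    # Keep a field where the mask is 1; replace it with "_" where it is 0.
    masked = [f if m == 1 else "_" for f, m in zip(fields, mask)]
    return sep.join(masked + [category])


print(apply_mask("in//IN//vp[PPnp]//uni", [0, 1, 1]))
# prints: _//IN//vp[PPnp]//uni
```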
Masks are specified with "feature_mask/3", defined in "amismodel.lil". A mask must be a list of 0s and 1s whose length equals the number of fields in an unfiltered event.
feature_mask(+$ModelName, -$Category, -$Mask) | |
$ModelName | name of a probabilistic model |
$Category | name of a category |
$Mask | a list of 0 or 1 that represents a mask |
Specify a mask to be applied to unfiltered events. |
"amisfilter" first generates features by applying the masks. Next, it counts the frequency of each feature in the observed events (the empirical frequency) and writes the counts to the count file. Finally, it adopts the features whose frequencies are above the threshold and writes the model and event files in amis-style.
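The three steps can be sketched in Python as below. This is a rough sketch under stated assumptions, not the tool's actual code: the names `mask_event` and `filter_features` are hypothetical, events and masks are held in memory rather than read from files, masks are assumed to be grouped by category name, and a feature is assumed to be adopted when its count is greater than or equal to the threshold. The real count, AmisModel, and AmisEvent file formats are not reproduced here.

```python
from collections import Counter


def mask_event(event, mask, sep="//"):
    # The last field is the category name; the mask covers the other fields.
    *fields, category = event.split(sep)
    masked = [f if m == 1 else "_" for f, m in zip(fields, mask)]
    return sep.join(masked + [category])


def filter_features(events, masks, threshold=1):
    """Generate features, count them, and keep the frequent ones."""
    counts = Counter()
    per_event = []
    for event in events:
        category = event.rsplit("//", 1)[-1]
        # Step 1: generate one feature per mask defined for this category.
        feats = [mask_event(event, m) for m in masks.get(category, [])]
        per_event.append(feats)
        # Step 2: accumulate empirical feature frequencies.
        counts.update(feats)
    # Step 3: adopt features that reach the threshold (assumed to mean >=).
    adopted = {f for f, c in counts.items() if c >= threshold}
    filtered = [[f for f in feats if f in adopted] for feats in per_event]
    return counts, adopted, filtered


events = ["in//IN//vp[PPnp]//uni", "on//IN//vp[PPnp]//uni"]
masks = {"uni": [[0, 1, 1], [1, 0, 0]]}  # hypothetical masks for category "uni"
counts, adopted, filtered = filter_features(events, masks, threshold=2)
print(sorted(adopted))
```

With a threshold of 2, only the feature shared by both events survives; the two word-specific features each occur once and are dropped.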