Man page - apertium-tagger(1)
APERTIUM-TAGGER (1) General Commands Manual APERTIUM-TAGGER (1)
NAME
apertium-tagger — part-of-speech tagger and trainer for Apertium
SYNOPSIS
apertium-tagger [ options ] -g serialized_tagger [ input [ output ]]
apertium-tagger [ options ] -r iterations corpus serialized_tagger
apertium-tagger [ options ] -s iterations dictionary corpus tagger_spec serialized_tagger tagged_corpus untagged_corpus
apertium-tagger [ options ] -s 0 dictionary tagger_spec serialized_tagger tagged_corpus untagged_corpus
apertium-tagger [ options ] -s 0 -u model serialized_tagger tagged_corpus
apertium-tagger [ options ] -t iterations dictionary corpus tagger_spec serialized_tagger
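The invocation forms above can also be assembled programmatically. The following is a minimal sketch, assuming a hypothetical helper named tagger_argv (it is not part of Apertium) and placeholder file names. It only builds an argument list that follows the synopsis and checks the number of file arguments; it does not run the tagger, and it does not cover the -s 0 -u form.

```python
from typing import List, Optional, Sequence


def tagger_argv(mode: str, files: Sequence[str],
                iterations: Optional[int] = None,
                options: Sequence[str] = ()) -> List[str]:
    """Build an argument list for one SYNOPSIS form.

    Hypothetical helper for illustration only; not part of Apertium.
    """
    if mode == "-g":
        # Tagging: serialized_tagger [input [output]], no iteration count.
        if iterations is not None:
            raise ValueError("-g does not take an iteration count")
        allowed, head = (1, 2, 3), [mode]
    elif mode in ("-r", "-s", "-t"):
        if iterations is None or iterations < 0:
            raise ValueError(f"{mode} needs a non-negative iteration count")
        head = [mode, str(iterations)]
        if mode == "-r":
            allowed = (2,)      # corpus serialized_tagger
        elif mode == "-t":
            allowed = (4,)      # dictionary corpus tagger_spec serialized_tagger
        else:
            # -s: the corpus argument may be omitted only when n = 0.
            allowed = (5, 6) if iterations == 0 else (6,)
    else:
        raise ValueError(f"unknown mode: {mode}")
    if len(files) not in allowed:
        raise ValueError(f"{mode} expects {allowed} file arguments, got {len(files)}")
    return ["apertium-tagger", *options, *head, *files]


# Placeholder file names, chosen only for this example.
print(tagger_argv("-g", ["en.prob"], options=["-z"]))
print(tagger_argv("-t", ["en.dic", "en.crp", "en.tsx", "en.prob"], iterations=8))
```

The resulting list could be passed to a process launcher such as subprocess.run; whether the files exist and are valid is left to the tagger itself.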
DESCRIPTION
apertium-tagger trains or runs the Apertium part-of-speech tagger, depending on the mode option given. It reads from standard input only in tagging mode ( -g , --tagger ), and only when no input file is given.
MODES
-g , --tagger
Tags input text using the Viterbi algorithm.
-r n , --retrain n
Retrains the model with n additional Baum-Welch iterations (unsupervised). This option is incompatible with -u ( --unigram ).
-s n , --supervised n
Initializes parameters from a hand-tagged text (supervised) using maximum likelihood estimation, then performs n iterations of the Baum-Welch training algorithm (unsupervised). The corpus argument (CRP) can be omitted only when n = 0.
-t n , --train n
Initializes parameters through Kupiec’s method (unsupervised), then performs n iterations of the Baum-Welch training algorithm (unsupervised).
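To illustrate what Viterbi decoding does in tagging mode, here is a minimal sketch of the textbook algorithm over a hidden Markov model. It is not Apertium's implementation: the function name, the tag set, and every probability below are invented for the example, and plain probabilities are used instead of whatever representation the tagger uses internally.

```python
from typing import Dict, List, Sequence


def viterbi(obs: Sequence[str], states: Sequence[str],
            start_p: Dict[str, float],
            trans_p: Dict[str, Dict[str, float]],
            emit_p: Dict[str, Dict[str, float]]) -> List[str]:
    """Return the most probable state (tag) sequence for obs."""
    if not obs:
        return []
    # best[t][s] = (probability of the best path ending in s at t, previous state)
    best = [{s: (start_p.get(s, 0.0) * emit_p[s].get(obs[0], 0.0), None)
             for s in states}]
    for t in range(1, len(obs)):
        column = {}
        for s in states:
            # Choose the predecessor that maximizes the path probability.
            column[s] = max(
                (best[t - 1][p][0] * trans_p[p].get(s, 0.0)
                 * emit_p[s].get(obs[t], 0.0), p)
                for p in states)
        best.append(column)
    # Backtrack from the most probable final state.
    path = [max(states, key=lambda s: best[-1][s][0])]
    for t in range(len(obs) - 1, 0, -1):
        path.append(best[t][path[-1]][1])
    path.reverse()
    return path


# Toy model: all values are made up for illustration.
states = ["DET", "NOUN", "VERB"]
start_p = {"DET": 0.8, "NOUN": 0.1, "VERB": 0.1}
trans_p = {
    "DET": {"NOUN": 0.9, "VERB": 0.1},
    "NOUN": {"VERB": 0.8, "NOUN": 0.2},
    "VERB": {"DET": 0.5, "NOUN": 0.5},
}
emit_p = {
    "DET": {"the": 1.0},
    "NOUN": {"dog": 0.6, "barks": 0.4},
    "VERB": {"barks": 0.7, "dog": 0.3},
}
print(viterbi(["the", "dog", "barks"], states, start_p, trans_p, emit_p))
# ['DET', 'NOUN', 'VERB']
```

The training modes ( -r , -s , -t ) estimate the tables that play the role of start_p, trans_p, and emit_p here; the sketch takes them as given.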
MODELS
-u , --unigram=MODEL
use unigram algorithm MODEL from <https://coltekin.net/cagri/papers/trmorph-tools.pdf>
-w , --sliding-window
use the Light Sliding Window algorithm
-x , --perceptron
use the averaged perceptron algorithm
OPTIONS
-d , --debug
Print error (if any) or debug messages while operating.
-e , --skip-on-error
Used with -xs to ignore certain types of errors with the training corpus.
-f , --first
Used in conjunction with -g ( --tagger ), makes the tagger output all lexical forms of each word, with the chosen one in first place (after the lemma).
-m , --mark
Mark disambiguated words.
-p , --show-superficial
Prints the superficial form of the word alongside the lexical form in the output stream.
-z , --null-flush
Used in conjunction with -g ( --tagger ) to flush the output after getting each null character.
--help
Display a help message.
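The null-flush behaviour of -z follows a general streaming pattern: treat each null character as the end of a unit of work, emit the result for that unit, and flush immediately so that a consumer on the other end of a pipe does not wait on buffered output. The sketch below shows that pattern only. The function name process_null_delimited and the handle callback are assumptions made for this example; the code does not reproduce the tagger's internals.

```python
import io
from typing import BinaryIO, Callable


def process_null_delimited(src: BinaryIO, dst: BinaryIO,
                           handle: Callable[[bytes], bytes]) -> None:
    """Process src in NUL-terminated blocks, flushing dst after each one.

    handle is a stand-in for the real per-block work (hypothetical).
    """
    buffer = bytearray()
    while True:
        byte = src.read(1)
        if not byte:
            break                      # end of input
        if byte == b"\0":
            # A null character ends the block: emit the result and flush.
            dst.write(handle(bytes(buffer)) + b"\0")
            dst.flush()
            buffer.clear()
        else:
            buffer += byte
    if buffer:
        # Trailing data without a terminating null character.
        dst.write(handle(bytes(buffer)))
        dst.flush()


# In-memory streams stand in for a pipe; bytes.upper stands in for tagging.
source = io.BytesIO(b"one\0two\0")
sink = io.BytesIO()
process_null_delimited(source, sink, bytes.upper)
print(sink.getvalue())
# b'ONE\x00TWO\x00'
```

Reading one byte at a time keeps the sketch short; a real reader would normally buffer its input while still flushing its output per block.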
FILES
These are the kinds of files used with each option:
dictionary
Full expanded dictionary file
corpus
Training text corpus file
tagger_spec
Tagger specification file, in XML format
serialized_tagger
Tagger data file, built in the training and used while tagging
tagged_corpus
Hand-tagged text corpus
untagged_corpus
Untagged text corpus: the morphological analysis of the hand-tagged corpus, used jointly with it by the -s option
input
Input file, stdin by default
output
Output file, stdout by default
SEE ALSO
apertium (1), lt-comp (1), lt-expand (1), lt-proc (1)
COPYRIGHT
Copyright © 2005, 2006 Universitat d’Alacant / Universidad de Alicante. This is free software. You may redistribute copies of it under the terms of the GNU General Public License: https://www.gnu.org/licenses/gpl.html.
BUGS
Many... lurking in the dark and waiting for you!
Apertium February 22, 2021 APERTIUM-TAGGER (1)