APERTIUM-DESHTML(1) General Commands Manual APERTIUM-DESHTML(1)

NAME

apertium-deshtml - HTML format processor for Apertium

SYNOPSIS

apertium-deshtml [ -hino ] [ input_file [ output_file ]]

DESCRIPTION

This tool is part of the Apertium open-source machine translation toolbox: https://apertium.org/.

apertium-deshtml is an HTML format processor. Pass data through it before piping the data to lt-proc(1). The program takes an HTML document as input and produces output suitable for processing with lt-proc(1). HTML tags and other format information are enclosed in square brackets so that lt-proc(1) treats them as whitespace between words.
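The bracketing idea can be illustrated without Apertium installed. The following is only a rough sketch: wrap_tags is a hypothetical helper, not part of Apertium, and it mimics just the "enclose tags in brackets" step with sed. The real apertium-deshtml does more (the options below mention a trailing sentence terminator, for instance), so its actual output will differ from what this sketch prints.

```shell
#!/bin/sh
# Hypothetical helper, not part of Apertium: wrap every HTML tag in
# square brackets and leave the text between tags untouched.
# This approximates the idea only, not the real deformatter output.
wrap_tags() {
    printf '%s\n' "$1" | sed -E 's/<[^>]+>/[&]/g'
}

wrap_tags '<b>gener</b>'
```

With the tags bracketed, only the word "gener" remains outside the brackets, which is the part a later analysis stage would treat as text.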

OPTIONS

-h, --help

Display this help.

-i

Makes the addition of a trailing sentence terminator ('.') unconditional, which often leads to duplicate terminators.

-n

Suppresses the addition of a trailing sentence terminator.

-o

Inserts a "❡" (U+2761 CURVED STEM PARAGRAPH SIGN ORNAMENT) at the end of <h[1-6]> and <title> tags.

EXAMPLES

You could write the following to show how the word "gener" is analysed:

echo "<b>gener</b>" | apertium-deshtml | lt-proc ca-es.automorf.bin

SEE ALSO

apertium(1), apertium-desrtf(1), apertium-destxt(1), lt-proc(1)

COPYRIGHT

Copyright © 2005, 2006 Universitat d'Alacant / Universidad de Alicante. This is free software. You may redistribute copies of it under the terms of the GNU General Public License: https://www.gnu.org/licenses/gpl.html.

BUGS

Many... lurking in the dark and waiting for you!

Apertium March 21, 2006 APERTIUM-DESHTML(1)