Man page - hxunent(1)
Manual pages in the same package:
- hxcite-mkbib(1)
- hxref(1)
- hxprune(1)
- hxincl(1)
- hxwls(1)
- xml2asc(1)
- hxunent(1)
- hxnum(1)
- hxnsxml(1)
- hxtabletrans(1)
- hxnormalize(1)
- hxaddid(1)
- hxcopy(1)
- hxname2id(1)
- hxclean(1)
- hxindex(1)
- hxunpipe(1)
- hxprintlinks(1)
- hxtoc(1)
- hxcite(1)
- hxmultitoc(1)
- hxpipe(1)
- hxunxmlns(1)
- hxselect(1)
- hxmkbib(1)
- hxremove(1)
- hxxmlns(1)
- hxcount(1)
- asc2xml(1)
- hxextract(1)
- hxuncdata(1)
apt-get install html-xml-utils
NAME
hxunent - replace HTML predefined character entities by UTF-8
SYNOPSIS
hxunent [ -b ] [ -f ] [ file ]
DESCRIPTION
The hxunent command reads the file (or standard input) and copies it to standard output with &-entities replaced by their equivalent character (encoded as UTF-8). E.g., &quot; is replaced by " and &lt; is replaced by <.
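The replacement described above can be approximated in a few lines of Python. This is only an illustrative sketch, not how hxunent is implemented: the function name unent is hypothetical, and Python's html.unescape follows the HTML5 entity table, so its handling of edge cases (such as entities written without a trailing semicolon) may differ from hxunent's.

```python
import html


def unent(text: str) -> bytes:
    """Replace HTML character entities in text and return UTF-8 bytes.

    Hypothetical helper that approximates the documented behaviour
    using the standard library; it does not call hxunent itself.
    """
    return html.unescape(text).encode("utf-8")


if __name__ == "__main__":
    sample = "&quot;caf&eacute;&quot; &lt; 5 &euro;"
    # Decode again only so the result can be printed as text.
    print(unent(sample).decode("utf-8"))
```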
OPTIONS
The following options are supported:
-b
    The five builtin entities of XML (&lt; &gt; &quot; &apos; &amp;) are not replaced but copied unchanged. This is necessary if the output has to be valid XML or SGML.

-f
    This option changes how unknown entities or lone ampersands are handled. Normally they are copied unchanged, but this option tries to "fix" them by replacing ampersands by &amp;. Often such stray ampersands are the result of copying and pasting URLs into a document, in which case this option indeed fixes them and makes the document valid.
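To make the effect of the two options concrete, the following Python sketch models them on a string. It is an assumption-laden illustration rather than the program's actual logic: the function unent, its keyword arguments keep_builtins (standing in for -b) and fix (standing in for -f), and the treatment of numeric character references are all hypothetical choices made for this example.

```python
import re
from html.entities import html5  # maps e.g. "eacute;" to the character

XML_BUILTINS = {"lt", "gt", "quot", "apos", "amp"}

# An ampersand, optionally followed by a name or numeric reference and ";".
TOKEN = re.compile(r"&(?:(#?[A-Za-z0-9]+);)?")


def decode(name):
    """Return the character for an entity name or numeric reference, else None."""
    if name.startswith("#"):
        digits = name[1:]
        try:
            if digits[0] in "xX":
                return chr(int(digits[1:], 16))
            return chr(int(digits))
        except (ValueError, OverflowError):
            return None
    return html5.get(name + ";")


def unent(text, keep_builtins=False, fix=False):
    """Model of the documented options: keep_builtins ~ -b, fix ~ -f."""
    def repl(match):
        name = match.group(1)
        if name is None:
            # Lone ampersand: copied unchanged unless "fixing" is requested.
            return "&amp;" if fix else "&"
        if keep_builtins and name in XML_BUILTINS:
            return match.group(0)
        char = decode(name)
        if char is not None:
            return char
        # Unknown entity: copied unchanged, or its ampersand is escaped.
        return "&amp;" + name + ";" if fix else match.group(0)

    return TOKEN.sub(repl, text)


if __name__ == "__main__":
    print(unent("a &lt; b &eacute;", keep_builtins=True))
    print(unent("http://example.org/?a=1&b=2", fix=True))
```

In this model a pasted URL such as http://example.org/?a=1&b=2 keeps its ampersand by default and gets &amp; when fix is set, which mirrors the copy-and-paste case mentioned for -f.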
DIAGNOSTICS
The program's exit value is 0 if all went well, otherwise:

1
    The input couldn't be read (file not found, file not readable...)

2
    Wrong command line arguments.
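A calling script can use these exit values to report failures. The Python sketch below is one possible wrapper, assuming hxunent is installed and on the PATH; the names EXIT_MEANINGS, describe_exit and run_hxunent are hypothetical, and only the options and exit values documented on this page are used.

```python
import subprocess

# Exit values as documented in the DIAGNOSTICS section.
EXIT_MEANINGS = {
    0: "success",
    1: "input could not be read",
    2: "wrong command line arguments",
}


def describe_exit(code):
    """Translate an exit value into a short message."""
    return EXIT_MEANINGS.get(code, "unexpected exit value %d" % code)


def run_hxunent(path, keep_builtins=False, fix=False):
    """Run hxunent on a file and return its output as text.

    Raises RuntimeError on a non-zero exit value. If hxunent is not
    installed, subprocess.run raises FileNotFoundError instead.
    """
    args = ["hxunent"]
    if keep_builtins:
        args.append("-b")
    if fix:
        args.append("-f")
    args.append(path)
    proc = subprocess.run(args, capture_output=True)
    if proc.returncode != 0:
        raise RuntimeError("hxunent: " + describe_exit(proc.returncode))
    return proc.stdout.decode("utf-8")
```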
SEE ALSO
asc2xml(1), xml2asc(1), UTF-8 (RFC 2279)
BUGS
The program assumes entities are as defined by HTML. It doesn't read a document's DTD to find the actual definitions in use in a document. With -f, it will even remove all entities that are not HTML entities.