Man page - hxunent(1)

Package: html-xml-utils
apt-get install html-xml-utils

HXUNENT

NAME

hxunent - replace HTML predefined character entities by UTF-8

SYNOPSIS

hxunent [ -b ] [ -f ] [ file ]

DESCRIPTION

The hxunent command reads the file (or standard input) and copies it to standard output with &-entities replaced by their equivalent character (encoded as UTF-8). E.g., &quot; is replaced by " and &lt; is replaced by <.
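The replacement described above can be illustrated with a minimal Python sketch. It is not hxunent's actual implementation: the function name unent is hypothetical, the entity table is Python's HTML 4 table (html.entities.name2codepoint) rather than whatever table hxunent uses, and numeric character references are simply left alone here.

```python
import re
from html.entities import name2codepoint  # HTML 4 entity names -> code points

# A named entity: '&', a name, ';'  (assumption: names are ASCII alphanumerics)
_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")

def unent(text: str) -> str:
    """Replace known named entities by their character; copy the rest unchanged."""
    def repl(match: re.Match) -> str:
        codepoint = name2codepoint.get(match.group(1))
        # Unknown names are copied through as they are.
        return chr(codepoint) if codepoint is not None else match.group(0)
    return _ENTITY.sub(repl, text)

def unent_utf8(text: str) -> bytes:
    """Same as unent(), with the result encoded as UTF-8 bytes."""
    return unent(text).encode("utf-8")
```

For example, unent("&quot;a &lt; b&quot;") yields the string "a < b" surrounded by double quotes, and unent_utf8("&eacute;") yields the two bytes 0xC3 0xA9.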

OPTIONS

The following options are supported:

	-b		The five builtin entities of XML (&lt; &gt; &quot; &apos; &amp;) are not replaced but copied unchanged. This is necessary if the output has to be valid XML or SGML.
	-f		This option changes how unknown entities or lone ampersands are handled. Normally they are copied unchanged, but this option tries to "fix" them by replacing ampersands by &amp;. Often such stray ampersands are the result of copying and pasting URLs into a document, and in that case this option indeed fixes them and makes the document valid.
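
The effect of the two options can be sketched in Python as follows. This is an illustration under stated assumptions, not the program's own code: the function convert and its parameters keep_builtin (for -b) and fix (for -f) are hypothetical names, the entity table is Python's HTML 4 table with apos added by hand, and numeric character references are passed through untouched.

```python
import re
from html.entities import name2codepoint

# Assumed entity table: Python's HTML 4 names, plus XML's 'apos'.
KNOWN = dict(name2codepoint)
KNOWN.setdefault("apos", 39)

# The five builtin entities of XML, kept as-is when keep_builtin is set.
BUILTIN = {"lt", "gt", "quot", "apos", "amp"}

# An ampersand, optionally followed by a numeric or named reference and ';'.
_AMP = re.compile(r"&(?:(#[0-9]+|#[xX][0-9A-Fa-f]+|[A-Za-z][A-Za-z0-9]*);)?")

def convert(text: str, keep_builtin: bool = False, fix: bool = False) -> str:
    def repl(match: re.Match) -> str:
        ref = match.group(1)
        if ref is not None and ref.startswith("#"):
            return match.group(0)           # numeric reference: left alone in this sketch
        if ref in KNOWN:
            if keep_builtin and ref in BUILTIN:
                return match.group(0)       # like -b: copy builtin entities unchanged
            return chr(KNOWN[ref])
        if fix:
            # like -f: escape the ampersand of an unknown entity or a lone '&'
            return "&amp;" + match.group(0)[1:]
        return match.group(0)               # default: copy unchanged
    return _AMP.sub(repl, text)
```

With this sketch, convert("x?a=1&b=2", fix=True) gives "x?a=1&amp;b=2", while convert("a &lt; b", keep_builtin=True) leaves the input as it is.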

DIAGNOSTICS

The program’s exit value is 0 if all went well, otherwise:

	1		The input couldn’t be read (file not found, file not readable...)
	2		Wrong command line arguments.
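
A calling script might check these exit values as in the following Python sketch. The helper names describe_exit and run_hxunent are hypothetical, and the sketch assumes hxunent is installed and on the PATH; only the -b and -f options and the exit values 0, 1 and 2 come from this manual.

```python
import subprocess

# Exit values as documented in the DIAGNOSTICS section.
EXIT_MEANINGS = {
    0: "all went well",
    1: "the input could not be read",
    2: "wrong command line arguments",
}

def describe_exit(code: int) -> str:
    """Return a readable description of an hxunent exit value."""
    return EXIT_MEANINGS.get(code, f"undocumented exit value {code}")

def run_hxunent(path: str, keep_builtin: bool = False, fix: bool = False) -> str:
    """Run hxunent on a file and return its output decoded as UTF-8.

    Raises FileNotFoundError if the hxunent program itself is not installed,
    and RuntimeError if it exits with a non-zero value.
    """
    cmd = ["hxunent"]
    if keep_builtin:
        cmd.append("-b")
    if fix:
        cmd.append("-f")
    cmd.append(path)
    result = subprocess.run(cmd, capture_output=True)
    if result.returncode != 0:
        raise RuntimeError(f"hxunent failed: {describe_exit(result.returncode)}")
    return result.stdout.decode("utf-8")
```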

BUGS

The program assumes entities are as defined by HTML. It doesn’t read a document’s DTD to find the definitions actually in use in that document. With -f, it will even remove all entities that are not HTML entities.