12 HTML and output
12.1 Formal HTML validation
TTH takes as its standard the HTML that
can be rendered by Netscape and IE browsers, versions 4 and higher
(with the caveats above). The formal standard that TTH-translated
documents follow is strictly HTML4.0 Transitional [1]. However, TTH does
not formally validate its documents, and can be made to violate the
standard by some TeX usage.
One reason for violation arises because HTML4.0 requires a
<title>...</title> for every document.
A title is constructed from LaTeX files that contain the \title{...}
command, in which case HTML conformance is ensured by putting the
\title command before any text (i.e. in the preamble, where it
belongs). If the \title command is not desired in the TeX
file, for example because it is a plain TeX document,
the author can provide a title for the HTML document by putting
a line like this at the top of the TeX file:
%%tth:\begin{html}<title>Put the title here</title>\end{html}
This line will be ignored by TeX. Actually, any raw HTML output at the
start of the file is assumed by TTH to indicate that the author has
explicitly output a title. If no title indication of any of the above
types is present, TTH attempts to construct a title from the first few
plain words in the document, in much the way that the first line can
become the title of a hymn.
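For the LaTeX case described above, a minimal file might look like the following sketch. The title, author, and body text are placeholders chosen for illustration, not taken from any real document.

```tex
\documentclass{article}
% \title is given in the preamble, before any text is output,
% which is the placement the manual recommends for HTML conformance.
\title{Put the title here}
\author{A. N. Author}   % placeholder author
\begin{document}
\maketitle
First paragraph of the document.
\end{document}
```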
If commands that output material to the HTML file, such as \item, occur
before the title has been constructed, the HTML title element will be
out of order and the formal standard will be violated.
In the case where the title construction fails, or if some other TeX
usage causes a violation of the formal standard, browsers will
still render the output correctly if this manual is followed.
12.2 HTML Styles
There are good reasons why the <head> and <body> tags
are by default omitted by TTH. See the FAQ [B.3] for a
brief discussion. However, the evolution of HTML standards (not yet
browsers) is towards imposing more restrictions on the freedom to omit
tags. For example, XHTML requires that containers have both
opening and closing tags. Therefore, TTH has a switch -w?
(where the question mark denotes an optional integer) that controls
its writing style as follows.
- Default: Construct title. Do not enter head and body tags.
- -w or -w0: Do not construct title or enter head/body tags.
- -w1: Enter head and body tags assuming that the title is the
dividing point.
- -w2: Use XHTML syntax.
- -w4: Don't use block level font size commands between paragraphs.
At present, in addition to the default style that
attempts to construct a title but does not enter head and body tags,
-w or equivalently -w0 prevents TTH from attempting to construct a
title or anything else in the way of head/body divisions. This style
is best used for documents where the author has explicitly entered the
required HTML tags. The switch -w1 invokes pedantic HTML style which
enters head and body tags under the assumption that the title
(possibly constructed automatically) is the last thing in the head
section. The style -w2 produces XHTML documents but requires
cascading style sheet (CSS) support in the browser; otherwise the
rendering will not be as satisfactory as the default.
Adding four to the writing style index (e.g. -w4) prevents
TTH from employing block-level font size commands when the size is changed
immediately after a \par or implied paragraph. The additional
CSS style sheet is not inserted and, of course, the browser need not
support CSS. The (now) default writing style accommodates tables
and equations inside sections of larger or smaller text in a manner
that will pass standards validation. According to the standard, HTML
font changing commands, like most others, are either of
inline type, in which case they are forbidden
to contain block level constructs like
tables, or of block type, in which case they force a new line and so
can't be used within a paragraph. The default can't universally work around
this unnecessarily restrictive requirement of the standard (which
most browsers wisely do not honor), because there are situations where
TeX usage is simply impossible to express in HTML. However, it does
handle the vast majority of sensible usages. The switch -w4 turns off
this approach, reverting to a less standards-compatible style.
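To make the inline-versus-block issue concrete, the hypothetical LaTeX fragment below changes the font size immediately after a paragraph break and then places a table inside the enlarged region. This is the kind of input the paragraph above is concerned with. The exact HTML that TTH emits for it, with or without -w4, is not shown and is not assumed.

```tex
\documentclass{article}
\begin{document}
A paragraph at normal size.

{\large
Larger text that starts right after a paragraph break.

% A block-level construct (a table) inside the size change.
\begin{tabular}{ll}
a & b \\
c & d
\end{tabular}

More text at the larger size.
\par}
\end{document}
```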