Should I translate TeX to HTML or not?
Ian Hutchinson
1 Page Representation Formats
TeX and LaTeX are well suited to producing electronically publishable
documents. However, it is important to realize the difference between
page layout and functional mark-up. TeX is capable of extremely
detailed page layout, specifying precisely where on the page symbols
go. HTML is not, because HTML is a functional mark-up language
(specifying primarily document structure) not a page layout
language. HTML's exact rendering is not specified by the document that
is published but is, to some degree, left to the discretion of the
browser. This is a deliberate choice. It recognizes that the window
size, resolution, or shape on which a document is viewed will vary
from reader to reader, and that therefore layout, font size, and other
choices for good readability should be at least partly up to the
reader, not the author. The result is that well-designed HTML is
excellent for browsing, but clumsy for printing.
Most authors are not used to such flexibility; they are used to
producing static documents whose appearance is the same for everyone,
because, for example, they are copies of a piece of paper. If
you require your readers to see an exact replication of what your
document looks like to you, then you cannot use HTML to transmit it,
no matter what format it starts in. That is true not just for
translated TeX but also for any authoring tool from which HTML is to
be produced. The only way to produce documents whose appearance is
completely controlled is to represent them in a page layout language
such as PDF or PostScript or, for that matter, DVI. These
formats are not as good as HTML for browsing, despite substantial
hyperlinking ability in PDF, but they are better for transmitting a
printable copy. Parenthetically, word processor formats
are less satisfactory for transmitting printable copy, hopeless for
browsing, and unreliable for archiving because of the instability of
the format.
2 Mathematics
TeX's excellent mathematical capabilities are absent from HTML and
browsers. There are then two main choices for representing equations
in HTML: using bit-mapped images, or using browser fonts and tables
for layout. The advantage of the bit-mapped approach is that it uses
capabilities that are essentially universal to every graphical
browser. Its disadvantages are that it requires a separate graphical
file for every equation, which becomes very cumbersome and slow to
download, and that the alignment and sizing of the graphical equations
are uncertain with respect to the rest of the text. The advantages of the
font and table approach used by TtH are that one HTML document
contains all the information, giving portability and speed of
download. The disadvantages are that it depends on having the symbol
font accessible on the browser, and that the equation layout is not as
compact or elegant as TeX's.
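To make the two choices concrete, here is a minimal sketch in Python that
emits both kinds of markup for a simple fraction. The function names, the
inline styles, and the file name eq1.png are assumptions made for
illustration only; this is not TtH's actual output, just the general shape
of each approach.

```python
from html import escape


def fraction_as_table(numerator: str, denominator: str) -> str:
    """Font-and-table style: the fraction is ordinary text in a two-row table.

    Hypothetical helper; the inline styles are one possible choice.
    """
    return (
        '<table style="display:inline-table; vertical-align:middle">'
        '<tr><td style="text-align:center; border-bottom:1px solid">'
        f"{escape(numerator)}</td></tr>"
        f'<tr><td style="text-align:center">{escape(denominator)}</td></tr>'
        "</table>"
    )


def fraction_as_image(image_file: str, tex_source: str) -> str:
    """Bit-mapped style: the equation lives in a separate, pre-rendered file.

    Hypothetical helper; the image itself must be produced by another tool.
    """
    return (
        f'<img src="{escape(image_file, quote=True)}" '
        f'alt="{escape(tex_source, quote=True)}">'
    )


if __name__ == "__main__":
    print(fraction_as_table("a+b", "c"))
    print(fraction_as_image("eq1.png", r"\frac{a+b}{c}"))
```

The table version carries the whole equation inside the HTML document,
whereas the image version only points at a second file that has to be
generated and downloaded separately.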
The MathML standard has been developed to represent mathematics in
electronic documents. MathML is not HTML. Popular browsers do not
currently (Mar 2003) render MathML without additional plugin software
or fonts. In any case, the standard supports MathML within XML, not
strictly within HTML. What is holding up wider adoption of MathML is
not the production of MathML, since translators such as TtM are fully
up to that job; rather, it is the weakness of support in
leading browsers. But even when and if MathML is routinely supported
by browsers out of the box, documents' appearance will still be in the
hands of the browser not the author.
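As an illustration of what MathML markup looks like, the following Python
sketch builds the fraction a/b as a MathML fragment with the standard
library's XML tools. The function name is hypothetical, and this is not a
claim about what TtM emits. Note that the markup names only the structure
(a fraction with two identifiers); nothing in it fixes sizes or positions.

```python
import xml.etree.ElementTree as ET

# Namespace URI defined by the MathML specification.
MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def mathml_fraction(numerator: str, denominator: str) -> str:
    """Return a MathML fragment for numerator/denominator (illustrative helper)."""
    math = ET.Element("math", {"xmlns": MATHML_NS})
    frac = ET.SubElement(math, "mfrac")
    # <mi> marks an identifier; the renderer decides how to draw it.
    ET.SubElement(frac, "mi").text = numerator
    ET.SubElement(frac, "mi").text = denominator
    return ET.tostring(math, encoding="unicode")


if __name__ == "__main__":
    print(mathml_fraction("a", "b"))
```

How the fraction bar, spacing, and fonts finally appear is left to whatever
renders the fragment, which is the point made above.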
3 Conclusion
So should I translate to HTML? If you want to provide the easiest
browsable format, yes. If you feel it is essential to control the
precise layout for aesthetic or other reasons, no. But notice that the answer has
nothing to do with whether the document starts as TeX.
File translated from TeX by TTHgold, version 3.70, on 28 Aug 2005, 18:01.