Computer typesetting overview
Last revision August 3, 2004
The basic idea is to re-arrange portions of a text document so that the output appears in a desired format. Typesetting systems also provide capabilities to select fonts, set equations, create tables, etc.
Typesetting systems can be categorized along two dimensions:
- physical (or visual) design vs. logical design
- direct manipulation of text with continuous typesetting vs. page markup language with deferred typesetting.
Physical (or visual) design
Physical design means that the user directly controls where each unit of his document is placed on the page: each line of text, each footnote, each paragraph, each table, etc.
In this system, the user has complete control of how the final page will be laid out, but must make many small decisions about spacing and placement.
A serious disadvantage of this system is that it does not work "top down" from a set of global formatting parameters and commands, so it can be difficult to make major changes that affect the entire document.
An extreme example would be the use of a drafting program (such as MacDraw) to produce a text document, with each paragraph or other document element positioned separately by the user.
Logical design
Logical design means the user thinks of his document in terms of logical units, such as paragraphs, sections, headers, footnotes, tables, etc.
The positioning of these units on the page, the type styles and sizes used for each, the special marks, indentations, etc., are all determined in advance as part of a consistent style.
A major advantage of this system is that the user can often make sweeping changes, such as adjusting margin widths or paragraph indentation, switching fonts, changing from single to double column mode, or numbering section headings with Roman rather than Arabic numerals, by modifying a single formatting command rather than each instance individually.
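The single-point-of-change idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular typesetting system works: the `Style` fields, the `render` function, and the unit names "section" and "paragraph" are all hypothetical.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Style:
    """Global formatting parameters (hypothetical names, for illustration only)."""
    indent: int = 4               # paragraph indentation, in spaces
    roman_headings: bool = False  # number sections with Roman numerals?

def to_roman(n):
    """Convert a positive integer to a Roman numeral."""
    pairs = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
             (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
             (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]
    out = []
    for value, numeral in pairs:
        while n >= value:
            out.append(numeral)
            n -= value
    return "".join(out)

def render(units, style):
    """Lay out a list of (kind, text) logical units according to one style."""
    lines = []
    section = 0
    for kind, text in units:
        if kind == "section":
            section += 1
            label = to_roman(section) if style.roman_headings else str(section)
            lines.append(f"{label}. {text}")
        elif kind == "paragraph":
            lines.append(" " * style.indent + text)
        else:
            raise ValueError(f"unknown logical unit: {kind}")
    return lines

# The document only names logical units; it carries no layout decisions.
document = [
    ("section", "Introduction"),
    ("paragraph", "Typesetting re-arranges text."),
    ("section", "Design"),
    ("paragraph", "Logical units follow a style."),
]

default_output = render(document, Style())
# Changing one parameter restyles every heading; the document itself is untouched.
roman_output = render(document, replace(Style(), roman_headings=True))
```

Here `default_output` and `roman_output` are produced from the same `document` list; only the style object differs, which is the property that makes global changes cheap under logical design.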
Books are usually typeset according to some kind of logical design prepared by an editor or designer; the author simply supplies the text, figures, etc. to fill in the design.
Direct formatting with continuous typesetting
Direct manipulation of text with continuous typesetting is often referred to as "What You See Is What You Get", or WYSIWYG. The user selects design characteristics from menus or by "highlighting" text with a mouse and sees an instant updating of the representation of the formatted page. Text editing and typesetting functions are combined into a single program.
The program keeps track of formatting changes such as type style or size changes, indentation changes, etc., using non-printing control character codes that are not directly visible or accessible to the user.
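As a rough sketch of what such hidden codes might look like, the Python fragment below stores bold markers as non-printing characters inside the text. The specific code values and the function names are assumptions made for illustration; real programs use their own private formats.

```python
# Hypothetical non-printing codes; real programs use their own private formats.
BOLD_ON = "\x01"
BOLD_OFF = "\x02"

# What the program stores internally: text with embedded control codes.
stored = f"Typesetting is {BOLD_ON}deferred{BOLD_OFF} until later."

def visible_text(data):
    """Return what the user sees: the text with control codes stripped."""
    return data.replace(BOLD_ON, "").replace(BOLD_OFF, "")

def bold_spans(data):
    """Return the substrings the program would display in bold."""
    spans = []
    start = None
    for i, ch in enumerate(data):
        if ch == BOLD_ON:
            start = i + 1
        elif ch == BOLD_OFF and start is not None:
            spans.append(data[start:i])
            start = None
    return spans
```

The user only ever sees the result of `visible_text(stored)` with the bold span rendered on screen; the codes themselves stay inside the program.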
This system requires an integrated high-speed, high-resolution graphics display, such as those found on the Macintosh or Windows PC, or on modern Unix workstations running the X Window System.
The chief advantage of this style is that it offers continuous visual feedback.
Page markup language
In a page markup language, the user delimits and selects physical or logical design elements by means of special command sequences that are embedded into the document. In other words, the user types in both the actual text of the document and the formatting commands as plain text. Command sequences are generally separated from ordinary text by use of a special character (such as \) or position (such as always at the start of the line).
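To make the separation of commands from ordinary text concrete, here is a minimal Python sketch of a tokenizer. It assumes a TeX-like convention in which a command is a backslash followed by letters; the command names in the sample input and the `tokenize` function are hypothetical and do not belong to any real markup language.

```python
import re

# A command is a backslash followed by letters (an assumed, TeX-like convention).
COMMAND = re.compile(r"\\([A-Za-z]+)")

def tokenize(source):
    """Split markup source into ("command", name) and ("text", chunk) tokens."""
    tokens = []
    pos = 0
    for match in COMMAND.finditer(source):
        if match.start() > pos:
            # Everything between two commands is ordinary text.
            tokens.append(("text", source[pos:match.start()]))
        tokens.append(("command", match.group(1)))
        pos = match.end()
    if pos < len(source):
        tokens.append(("text", source[pos:]))
    return tokens

tokens = tokenize(r"\section Overview \bold important \plain text")
```

Because both the text and the commands are plain characters, the whole input can be created and inspected with any text editor.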
The user gets no instant representation of the formatted output as he creates the document. Typesetting is deferred until the document (or a portion) is complete.
A plain text editor is used to create the document; another program does the typesetting. As the typesetting commands actually represent a specialized computer language, the user can usually combine primitive commands together into "macros" to create sophisticated effects.
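The way macros build on primitive commands can be sketched as a simple recursive expansion. The primitive and macro names below are invented for illustration, and the depth limit is just one assumed way to guard against a macro that refers to itself.

```python
# Primitive commands the (hypothetical) typesetter understands directly.
PRIMITIVES = {"bold", "plain", "newline", "indent"}

# User-defined macros: each name expands to a sequence of commands,
# which may themselves be macros.
MACROS = {
    "heading": ["newline", "bold"],
    "endheading": ["plain", "newline"],
    "para": ["newline", "indent"],
    "chapter": ["newline", "heading"],
}

def expand(name, depth=0):
    """Expand a command name into a flat list of primitive commands."""
    if depth > 20:
        raise RecursionError(f"macro {name!r} expands too deeply")
    if name in PRIMITIVES:
        return [name]
    if name not in MACROS:
        raise KeyError(f"undefined command: {name}")
    result = []
    for part in MACROS[name]:
        result.extend(expand(part, depth + 1))
    return result

chapter_primitives = expand("chapter")
```

In this sketch `chapter` is defined in terms of `heading`, which is itself a macro, so a user can layer new effects on top of existing ones without touching the primitives.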
The user does not need to work at a graphics display station to create his documents, but can use any ordinary computer terminal. Typeset output can be directed to a single graphics device serving many users, such as a laser printer.
This system is typical of time-sharing computers such as Unix systems.
The chief advantage of page markup languages is that they are self-documenting. That is, by examining the input document, the user can always see right away how each typesetting effect was produced.
Common typesetting implementations
Actual typesetting hardware/software combinations tend to cluster into two combinations of these dimensions:
- WYSIWYG representation with physical design on Macintosh or Windows PC microcomputers. Some of the more competent programs are incorporating logical design features, such as "styles" in Microsoft Word.
- Page markup languages that implement logical design.
TeX/LaTeX is the premier example of this approach, and Unix and other time-sharing systems tend to favor it. HTML, the language used for World Wide Web documents, is another, simpler example of a special-purpose page markup language.