CGI Developer's Guide
- General Structure
- Text Formatting
- Physical Formatting
- Inline Multimedia
- Special Characters
This book has assumed that you have at least a working knowledge of HyperText Markup Language (HTML). This appendix provides a complete reference to HTML as well as some conceptual information. This appendix is divided into subdivisions of HTML. Different sections document each tag and sometimes provide an example.
HTML is currently in a state of flux. The current HTML standard (v2.0) is documented in RFC1866, although it is already outdated. Many HTML tags are not part of the official standard, but they are definitely part of the adopted standard and will most likely become part of the new HTML specification currently being drafted. The current Internet Engineering Task Force (IETF) draft of the new (v3.0) specification has expired and has been replaced by several, more specialized drafts. For example, there are separate drafts for tables, file upload, and client-side imagemaps.
Because of the constantly evolving nature of HTML, this reference document will in all likelihood be outdated by the time you read it. However, the following are some good HTML references on the Web that are kept up to date:
- URL:http://werbach.com/barebones/ The Barebone Guide to HTML, a good HTML reference sheet.
- URL:http://www.w3.org/pub/WWW/MarkUp/ The W3 organization's Web pages on markup languages. This has links to all the important information regarding HTML.
Because I want to present HTML tags other than those in the official 2.0 standard, this appendix is inherently subjective. These tags should work properly in almost all browsers. Additionally, extended and new tags supported by Netscape and Microsoft's Internet Explorer are included.
The general format for all HTML documents is
<HTML>
<HEAD>
"metainformation" goes here
</HEAD>
<BODY>
displayed content goes here
</BODY>
</HTML>
Metatags go inside of the <HEAD> tag. The <META> tag is used to embed document meta-information not defined by other HTML tags. Information within the <META> tag can be extracted by servers and/or clients for use in cataloging, identifying, and indexing document meta-information.
<META NAME="..." HTTP-EQUIV="..." CONTENT="...">
This metatag simulates an HTTP header. NAME is used to name properties such as author or publication date. If the NAME attribute is absent, the name can be assumed to be the value of HTTP-EQUIV. HTTP-EQUIV binds the element to an HTTP response header, and the value of that header is given in the CONTENT attribute.
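As a brief sketch of how these attributes combine, the following head section names an author property and simulates an Expires header. The title, author name, and date are made-up placeholder values.

```html
<HEAD>
<TITLE>Sample Document</TITLE>
<!-- NAME/CONTENT pair: a named property of the document -->
<META NAME="author" CONTENT="Jane Doe">
<!-- HTTP-EQUIV/CONTENT pair: treated like the response header "Expires: ..." -->
<META HTTP-EQUIV="Expires" CONTENT="Wed, 01 Jan 1997 00:00:00 GMT">
</HEAD>
```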
The <TITLE> tag gives the title of the HTML document; it must occur within the head of the document. The contents of the <TITLE> tag usually appear somewhere on the browser window, typically in the title bar.
<ISINDEX> informs the HTML user agent that the document is an index document. The reader can read the document or enter a keyword search. The default URL used for processing queries can be overridden with the HREF attribute. (See Chapter 5, "Input," for more details.)
Netscape has made additional, backward-compatible tags and attributes available on top of the HTML 1.0 specification. Some of Netscape's extensions have been incorporated into HTML 2.0 and 3.0, respectively. The following attribute to the <ISINDEX> tag is a Netscape extension:
The PROMPT attribute enables you to change the default prompt supplied by the browser.
The <BASE> tag defines the base URL of the current document. All relative URLs in this document are relative to the URL in HREF. By default, the value of HREF depends on the current location of the document.
The following is the Netscape extension:
TARGET enables you to specify the appropriate Netscape browser window or frame to target the output of all links on the current document.
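A minimal sketch of <BASE> in use, assuming a hypothetical site at www.example.com and a hypothetical frame named "main":

```html
<HEAD>
<TITLE>Base Example</TITLE>
<!-- Relative URLs resolve against this directory; links open in the frame "main" -->
<BASE HREF="http://www.example.com/docs/" TARGET="main">
</HEAD>
<BODY>
<!-- Resolves to http://www.example.com/docs/chapter2.html -->
<A HREF="chapter2.html">Chapter 2</A>
</BODY>
```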
<LINK REV="..." REL="..." HREF="...">
<LINK> specifies links, documents, and information related to the current document. REL specifies the relationship this link has to the current document, and REV specifies the same relationship in reverse. HREF specifies the location of the link.
<LINK REL="next" HREF="chapter2.html">
<LINK REV="previous" HREF="chapter2.html">
<BODY> tags contain the information that is displayed in the Web browser. To make notes within the <BODY> tag that the browser ignores, you can use the comment tag:
<!-- ... -->
Netscape Navigator and Microsoft Internet Explorer support the following proprietary extensions to the <BODY> tag:
- BACKGROUND="...": A background image that will be tiled in the background of the browser.
- BGCOLOR="...": The background color. The format is #rrggbb, where rr is the hexadecimal code for the red value, gg is the code for the green value, and bb is the code for the blue value.
- TEXT="...": The text color.
- LINK="...": The color of links that have not yet been visited.
- VLINK="...": The color of links that have already been visited.
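For illustration, these attributes might be combined as follows. The image name paper.gif and the particular colors are only placeholders.

```html
<!-- White background, black text, blue unvisited links, purple visited links -->
<BODY BACKGROUND="paper.gif" BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF" VLINK="#800080">
<P>Displayed content goes here.</P>
</BODY>
```

Specifying BGCOLOR along with BACKGROUND gives the page a sensible color when the image is not loaded.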
For links with information about colors, see the following:
The following text blocking statements demonstrate how you can lay out the text in the body of your HTML page:
The <P> tag defines a paragraph. ALIGN specifies the alignment of the text within the <P> tags.
There are several different ways to present a list of items. In HTML, each list consists of a tag that specifies the kind of list and a tag (or series of tags) that specifies each item in the list.
<LI> stands for "list item." <LI> is the most common way of marking list items and is used in unordered lists, ordered lists, menus, and directory lists. The following are Netscape extensions that can be used with list item tags:
- TYPE=A|a|I|i|1: For ordered lists, defines the style of the label preceding the list item. A specifies capital letters, a specifies lowercase letters, I specifies capital Roman numerals, i specifies lowercase Roman numerals, and 1 specifies numbers.
- VALUE=n: For ordered lists, enables you to set the number of a list item, from which the following items continue. For example, to start a numbered list at 3, use the following on the first item:
<li value=3>item #3
Unordered lists differ from ordered lists in that, instead of each list item being labeled numerically, bullets are used.
The <UL> tag creates lists with generic bullets preceding each unordered list item.
- TYPE=DISC|CIRCLE|SQUARE: Defines the bullet type (disc, circle, or square) that precedes the list items in unordered lists.
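A short sketch of an unordered list using the Netscape TYPE extension (the item text is arbitrary):

```html
<UL TYPE=SQUARE>
<LI>First item
<LI>Second item
<!-- TYPE on an <LI> changes the bullet for this item -->
<LI TYPE=CIRCLE>Third item
</UL>
```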
List items in ordered lists are automatically preceded by a sequential number or letter, beginning with the number 1 or letter A and incrementing by 1 with each new list item.
Each item within the <OL> tag is labeled with a number or letter in sequence; the style of the label can vary with the nesting level of the list.
TYPE='...' is a Netscape extension that defines the default label type for the items in the list. See "<LI>."
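As a small sketch, the following ordered list combines the TYPE and VALUE extensions described under <LI>. The item text simply states the label each item should receive in a browser that supports these extensions.

```html
<OL TYPE=A>
<LI>Labeled A
<LI>Labeled B
<!-- VALUE=5 jumps the sequence ahead, so this item is labeled E -->
<LI VALUE=5>Labeled E
<LI>Labeled F
</OL>
```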
Menus are lists of items. The <MENU> tag indicates a menu of items; the output usually looks similar or equivalent to that of the <UL> tag. Menus cannot be nested.
Directories are specified with the <DIR> tag, which indicates a directory of items. The output usually looks similar or equivalent to that of the <UL> tag. Directories cannot be nested.
Definitions are specified with the definition list tag, <DL>. Within a definition list, <DT> is the name or title of the item to be defined and <DD> is the definition.
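A definition list might look like the following sketch; the terms and definitions are only sample content.

```html
<DL>
<DT>CGI
<DD>Common Gateway Interface, a protocol for running programs from a Web server.
<DT>HTML
<DD>HyperText Markup Language, the language used to write Web pages.
</DL>
```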
Source code and other preformatted text can be displayed with the <PRE> tag. The text within the <PRE> tags appears exactly as it is typed, with spaces and line breaks preserved, usually in a fixed-width font.
<DIV> (division) defines a container of information. ALIGN specifies the alignment of all information in that container. This tag is the preferred way of aligning elements in your HTML documents.
Text can be centered within the browser window with the <CENTER> tag.
The <CENTER> tag centers the content between the tags. This tag is a proprietary Netscape extension rather than part of the standard, although most, if not all, browsers support it. It is a good idea to use <DIV ALIGN=CENTER> instead of the <CENTER> tag for the benefit of newer browsers.
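For example, a centered block could be written with <DIV> roughly as follows (the heading and paragraph text are placeholders):

```html
<DIV ALIGN=CENTER>
<H2>A Centered Heading</H2>
<P>This paragraph is centered along with everything else in the container.</P>
</DIV>
```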
The following tags describe the element between the tags. The appearance of the element is not as important as the actual definition. For example, a user should be able to specify how his or her browser displays any heading in an HTML file; these headings should be defined as headings rather than as bold text in a large font.
Headings are usually used for section headings; the alignment is specified by ALIGN. There are six headings: <H1>, <H2>, <H3>, <H4>, <H5>, and <H6>.
The <EM> tag emphasizes the text between the tags. The emphasized text is usually (but not necessarily) displayed as italics.
The <STRONG> tag strongly emphasizes text between the tags. The emphasized text is usually (but not necessarily) displayed as bold.
You can block-quote selected text with the <BLOCKQUOTE> tag. This sets off the text between tags, usually by indenting or changing the margin and centering.
You use citations when you are referring to another printed document, such as a book. Text within the <CITE> tag is usually italic.
E-mail addresses are usually wrapped in the <ADDRESS> tag. The enclosed address is usually displayed in italics.
Computer source code is usually surrounded by the <CODE> tag. The enclosed source code excerpt is displayed in a fixed-width font.
Sample output of a program can be formatted with the <SAMP> tag.
The keyboard input tag, <KBD>, marks text that the user is to type on the keyboard. It is normally rendered in a fixed-width font.
The variable tag, <VAR>, marks a variable used in a mathematical formula or computer program. It is normally displayed in italics.
Definitions are usually formatted differently than other text. Use the <DFN> tag to display definitions.
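The following sketch shows several of these logical tags used together. The text, command name, and e-mail address are invented for the example; how each element is rendered depends on the browser.

```html
<H2 ALIGN=CENTER>Running the Program</H2>
<P>Type <KBD>make install</KBD> and <EM>wait</EM> for the message
<SAMP>Installation complete</SAMP>. Do <STRONG>not</STRONG> interrupt it.</P>
<P>The loop counter <VAR>i</VAR> is declared as <CODE>int i;</CODE>.
See <CITE>CGI Developer's Guide</CITE> for details.</P>
<BLOCKQUOTE>Quoted text is set off from the surrounding paragraphs.</BLOCKQUOTE>
<ADDRESS>webmaster@example.com</ADDRESS>
```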
Physical formatting tags specify exactly how text should look rather than what it means. They have become very popular because of this literal style.
Text can be rendered bold with the <B> tag.
Text can be displayed in italics with the <I> tag.
The typewriter tag, <TT>, displays text in a fixed-width, typewriter-style font.
Text can be underlined with the <U> tag.
Text can be displayed with a line through the middle with the <S> tag to indicate strikeout.
The subscript tag, <SUB>, renders the text below the baseline and usually smaller than the normal text.
The superscript tag, <SUP>, works the same way as the subscript tag, except that it renders the text above the baseline.
The <BLINK> tag makes the text within the tags blink. This is not recommended because <BLINK> is a Netscape extension that other browsers may not support.
The size and color attributes of the <FONT> tag define the size or color of the text. SIZE is a number between 1 and 7 (the default size is 3). You can specify the font to be relatively larger or smaller than the preceding font by preceding the number with either a plus (+) or minus sign (-).
<FONT SIZE=n|+n|-n COLOR="...">...</FONT>
<BASEFONT SIZE=n> defines the default size of the fonts. The default value is 3.
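As a rough illustration of absolute and relative sizes (the sizes and color are arbitrary choices):

```html
<BASEFONT SIZE=4>
<P>Text at the base size of 4.
<FONT SIZE=+2>Two sizes larger than the base font (size 6).</FONT>
<FONT SIZE=2 COLOR="#FF0000">Absolute size 2, in red.</FONT></P>
```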
Text can be linked to other text with a click of the mouse; text linked in this way is called hypertext.
When the user selects the link, the browser goes to the location specified by HREF. The value of HREF in the <A HREF="..."> tag can be either a URL or a relative path, which is resolved against the base URL of the current document.
The <A NAME="..."> tag sets a marker in the HTML page. NAME is the name of the marker. To reference that marker, append a number sign (#) and the marker name to the URL in HREF.
The following is the Netscape extension:
The TARGET attribute enables you to specify the Netscape browser window or frame in which the linked document is displayed.
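The anchor forms described above can be sketched as follows. The file names, the marker name "top", and the frame name "main" are hypothetical.

```html
<!-- Set a marker named "top" -->
<A NAME="top">Table of Contents</A>
<!-- Jump to the marker within the same document -->
<A HREF="#top">Back to top</A>
<!-- Jump to a marker in another document -->
<A HREF="appendix.html#top">Appendix contents</A>
<!-- Netscape extension: display the linked document in the window or frame named "main" -->
<A HREF="chapter2.html" TARGET="main">Chapter 2</A>
```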
The <IMG> tag places an inline image within an HTML document. The image is specified with the SRC attribute.
<IMG SRC="..." [ALIGN=TOP|BOTTOM|MIDDLE]
[ALT="..."] [ISMAP] [USEMAP="..."]>
SRC defines the location of that image, either a URL or a relative path. ALIGN specifies the alignment of both the image and the text or graphics following the image. ALT specifies alternative text to display if the image is not displayed. ISMAP is used for server-side imagemaps, and USEMAP is used for client-side imagemaps.
Client-side imagemaps are defined using the <MAP> tag.
<MAP NAME="...">
<AREA SHAPE="..." COORDS="..." HREF="..."|NOHREF>
</MAP>
NAME is the name of the map (similar to <A NAME="...">), and the <AREA> tags define the areas of the map. SHAPE is the shape of the area, and COORDS is a list of coordinates that define the area. HREF defines where to go if that area is selected. If you specify NOHREF, the browser does nothing when the user clicks in that region.
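A minimal client-side imagemap might look like this sketch. It assumes a hypothetical 200-by-50-pixel image named toolbar.gif and uses the RECT shape, whose coordinates are given as left, top, right, bottom.

```html
<IMG SRC="toolbar.gif" ALT="Navigation toolbar" USEMAP="#toolbar">
<MAP NAME="toolbar">
<!-- Left half of the image links to the home page -->
<AREA SHAPE="RECT" COORDS="0,0,99,49" HREF="index.html">
<!-- Right half is inactive -->
<AREA SHAPE="RECT" COORDS="100,0,199,49" NOHREF>
</MAP>
```

Note that USEMAP refers to the map by its name prefixed with a number sign, in the same way that HREF refers to a marker set with <A NAME="...">.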
The following are Netscape extensions to the <IMG> tag:
- WIDTH=n: Defines the width of the image in pixels.
- HEIGHT=n: Defines the height of the image in pixels.
You can insert line breaks using the <BR> tag. Using this tag is the same as pressing Enter to start a new line of text.
The <BR> tag indicates a line break. In Netscape, <NOBR>...</NOBR> prevents line breaks in the enclosed text, and <WBR> indicates where to break the line if needed.
The <HR> tag indicates a horizontal line, also known as a horizontal rule. Netscape extensions to the <HR> tag are the attributes SIZE=number, WIDTH=[number|percent], ALIGN=[left|right|center], and NOSHADE.
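The following sketch shows these tags together. The attribute values on <HR> are arbitrary, and <NOBR>, <WBR>, and the <HR> attributes are the Netscape extensions just described.

```html
<P>First line<BR>
Second line</P>
<NOBR>This text stays on one line, <WBR>except that it may break here if needed.</NOBR>
<!-- A solid rule, 4 pixels thick, half the window wide, centered -->
<HR SIZE=4 WIDTH="50%" ALIGN=CENTER NOSHADE>
```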
Forms can be used with the <FORM> tag to make your Web pages interactive with user-defined entries. For more detailed information on HTML forms, see Chapter 3, "HTML and Forms."
<FORM ACTION="..." METHOD=GET|POST ENCTYPE="...">...</FORM>
The ACTION, METHOD, and ENCTYPE attributes define the form action, method, and encoding type.
<INPUT TYPE="..." NAME="..." VALUE="..." SIZE=n MAXLENGTH=n
The other attributes are all dependent on the TYPE attribute (see Chapter 3). In HTML 2.0, TYPE is one of the following: TEXT, PASSWORD, CHECKBOX, RADIO, IMAGE, HIDDEN, SUBMIT, or RESET.
The <SELECT> tag lets you define a menu of items from which to select. The following is an example of the <SELECT> tag:
<SELECT NAME="..." SIZE=n [MULTIPLE]>
The <TEXTAREA> tag defines a textual area where the user can type in multiple lines of text.
<TEXTAREA NAME="..." ROWS=n COLS=n>...</TEXTAREA>
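Putting the form tags together, a simple comment form might be sketched as follows. The CGI program /cgi-bin/comments.cgi and the field names are hypothetical, and the <OPTION> tag used inside <SELECT> is standard HTML 2.0 for marking each menu item.

```html
<FORM ACTION="/cgi-bin/comments.cgi" METHOD=POST>
<P>Name: <INPUT TYPE="text" NAME="name" SIZE=30 MAXLENGTH=60></P>
<P>Topic:
<SELECT NAME="topic" SIZE=1>
<OPTION SELECTED>General
<OPTION>Bug report
</SELECT></P>
<P><TEXTAREA NAME="comments" ROWS=5 COLS=40>Type your comments here.</TEXTAREA></P>
<P><INPUT TYPE="submit" VALUE="Send"> <INPUT TYPE="reset" VALUE="Clear"></P>
</FORM>
```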
Tables are defined by rows and cells in those rows.
The <TABLE> tag defines a table. If you specify BORDER, a border will be drawn around the table.
The following are the Netscape extensions to the <TABLE> tag:
- BORDER=n: Specifies the width of the border in pixels.
- CELLSPACING=n: Defines the space between cells in pixels.
- CELLPADDING=n: Defines the margin between the content of each cell and the cell border in pixels.
- WIDTH=n|n%: Specifies the width of the entire table. Can be specified either as a pixel value or as a percentage of the width of the browser.
You can use the <TR> tag to specify table rows.
<TR [ALIGN=LEFT|RIGHT|CENTER] [VALIGN=TOP|MIDDLE|BOTTOM]>...</TR>
The preceding defines a row within the table. ALIGN specifies the horizontal alignment of the elements within the row and VALIGN specifies the vertical alignment.
You can specify the elements of a table cell with the <TD> tag as follows:
<TD [ALIGN=LEFT|RIGHT|CENTER] [VALIGN=TOP|MIDDLE|BOTTOM] [COLSPAN=n]
[ROWSPAN=n]>...</TD>
This code specifies a table cell within a row. Normally, a cell occupies one row and one column. However, you can have it extend into adjacent rows or columns using the ROWSPAN or COLSPAN attribute, where n is the number of rows or columns the cell spans.
Use the <TH> tag to place headings within a table.
<TH [ALIGN=LEFT|RIGHT|CENTER] [VALIGN=TOP|MIDDLE|BOTTOM] [COLSPAN=n]
[ROWSPAN=n]>...</TH>
<TH> tags are equivalent to <TD> except they are used as table headers. The contents of table heading tags are normally bold.
Captions can be inserted into a table with the <CAPTION> tag.
The text between the <CAPTION> tags is the caption for the table.
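As a sketch of how the table tags fit together, the following defines a small table with a caption, a header row, and one data row. The data is invented; the header cell with COLSPAN=2 spans the two numeric columns below it.

```html
<TABLE BORDER=1 CELLSPACING=2 CELLPADDING=4 WIDTH="80%">
<CAPTION>Quarterly Sales</CAPTION>
<TR>
<TH>Region</TH>
<TH COLSPAN=2>Sales</TH>
</TR>
<!-- ALIGN on the row is the default for its cells; the first cell overrides it -->
<TR ALIGN=CENTER>
<TD ALIGN=LEFT>East</TD>
<TD>10</TD>
<TD>12</TD>
</TR>
</TABLE>
```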
Frames are a Netscape enhancement that enables you to divide the browser window into several different components. For more detailed information on frames, see Chapter 14, "Proprietary Extensions."
The <FRAMESET> tag is the basic frame element. It defines either a row or a column of frames. You may nest <FRAMESET> tags within each other.
The <BODY> tag is replaced by the <FRAMESET> tag in a framed HTML page.
<FRAME [SRC="..."] [NAME="..."] [MARGINWIDTH=n] [MARGINHEIGHT=n]
[SCROLLING="yes|no|auto"] [NORESIZE]>
The <FRAME> tag defines a single frame within the <FRAMESET> tags. SRC is the location of the document that should appear in this frame. NAME is the name of the frame. SCROLLING defines whether or not to display a scrollbar. MARGINWIDTH and MARGINHEIGHT define the margin between the content of the frame and the frame in pixels. NORESIZE prevents the user from resizing the frame.
<NOFRAMES> defines the HTML to appear if the browser does not support frames. If the browser does support frames, everything within these tags is ignored.
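A framed page might be sketched as follows. The file names and frame names are hypothetical, and the COLS attribute of <FRAMESET> (Netscape's way of listing column widths, with * meaning the remaining space) is used here to define a column layout.

```html
<HTML>
<HEAD>
<TITLE>Framed Page</TITLE>
</HEAD>
<!-- Two columns: a 150-pixel menu and a content frame that takes the rest -->
<FRAMESET COLS="150,*">
<FRAME SRC="menu.html" NAME="menu" SCROLLING="no" NORESIZE>
<FRAME SRC="intro.html" NAME="main" MARGINWIDTH=10 MARGINHEIGHT=10>
<NOFRAMES>
<BODY>
<P>This page uses frames. See the <A HREF="menu.html">menu</A> instead.</P>
</BODY>
</NOFRAMES>
</FRAMESET>
</HTML>
```

Links in menu.html can then use TARGET="main" to load documents into the content frame.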
Table B.1 lists the HTML character entities that you insert into text for characters that are not usually on a 101-key keyboard.
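|Description|Numeric code|Entity name|
|General currency sign|&#164;|&curren;|
|Broken vertical bar|&#166;|&brvbar;|
|Left angle quote, guillemotleft|&#171;|&laquo;|
|Plus or minus|&#177;|&plusmn;|
|Right angle quote, guillemotright|&#187;|&raquo;|
|Inverted question mark|&#191;|&iquest;|
|Capital A, grave accent|&#192;|&Agrave;|
|Capital A, acute accent|&#193;|&Aacute;|
|Capital A, circumflex accent|&#194;|&Acirc;|
|Capital A, tilde|&#195;|&Atilde;|
|Capital A, dieresis or umlaut mark|&#196;|&Auml;|
|Capital A, ring|&#197;|&Aring;|
|Capital AE diphthong (ligature)|&#198;|&AElig;|
|Capital C, cedilla|&#199;|&Ccedil;|
|Capital E, grave accent|&#200;|&Egrave;|
|Capital E, acute accent|&#201;|&Eacute;|
|Capital E, circumflex accent|&#202;|&Ecirc;|
|Capital E, dieresis or umlaut mark|&#203;|&Euml;|
|Capital I, grave accent|&#204;|&Igrave;|
|Capital I, acute accent|&#205;|&Iacute;|
|Capital I, circumflex accent|&#206;|&Icirc;|
|Capital I, dieresis or umlaut mark|&#207;|&Iuml;|
|Capital Eth, Icelandic|&#208;|&ETH;|
|Capital N, tilde|&#209;|&Ntilde;|
|Capital O, grave accent|&#210;|&Ograve;|
|Capital O, acute accent|&#211;|&Oacute;|
|Capital O, circumflex accent|&#212;|&Ocirc;|
|Capital O, tilde|&#213;|&Otilde;|
|Capital O, dieresis or umlaut mark|&#214;|&Ouml;|
|Capital O, slash|&#216;|&Oslash;|
|Capital U, grave accent|&#217;|&Ugrave;|
|Capital U, acute accent|&#218;|&Uacute;|
|Capital U, circumflex accent|&#219;|&Ucirc;|
|Capital U, dieresis or umlaut mark|&#220;|&Uuml;|
|Capital Y, acute accent|&#221;|&Yacute;|
|Capital THORN, Icelandic|&#222;|&THORN;|
|Small sharp s, German (sz ligature)|&#223;|&szlig;|
|Small a, grave accent|&#224;|&agrave;|
|Small a, acute accent|&#225;|&aacute;|
|Small a, circumflex accent|&#226;|&acirc;|
|Small a, tilde|&#227;|&atilde;|
|Small a, dieresis or umlaut mark|&#228;|&auml;|
|Small a, ring|&#229;|&aring;|
|Small ae diphthong (ligature)|&#230;|&aelig;|
|Small c, cedilla|&#231;|&ccedil;|
|Small e, grave accent|&#232;|&egrave;|
|Small e, acute accent|&#233;|&eacute;|
|Small e, circumflex accent|&#234;|&ecirc;|
|Small e, dieresis or umlaut mark|&#235;|&euml;|
|Small i, grave accent|&#236;|&igrave;|
|Small i, acute accent|&#237;|&iacute;|
|Small i, circumflex accent|&#238;|&icirc;|
|Small i, dieresis or umlaut mark|&#239;|&iuml;|
|Small eth, Icelandic|&#240;|&eth;|
|Small n, tilde|&#241;|&ntilde;|
|Small o, grave accent|&#242;|&ograve;|
|Small o, acute accent|&#243;|&oacute;|
|Small o, circumflex accent|&#244;|&ocirc;|
|Small o, tilde|&#245;|&otilde;|
|Small o, dieresis or umlaut mark|&#246;|&ouml;|
|Small o, slash|&#248;|&oslash;|
|Small u, grave accent|&#249;|&ugrave;|
|Small u, acute accent|&#250;|&uacute;|
|Small u, circumflex accent|&#251;|&ucirc;|
|Small u, dieresis or umlaut mark|&#252;|&uuml;|
|Small y, acute accent|&#253;|&yacute;|
|Small thorn, Icelandic|&#254;|&thorn;|
|Small y, dieresis or umlaut mark|&#255;|&yuml;|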
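As a brief usage sketch, special characters are written in running text as an ampersand, the entity name or numeric code, and a semicolon. The example below uses the standard ISO Latin-1 entity names from HTML 2.0 (for example, &eacute; for small e, acute accent); the sentences themselves are arbitrary.

```html
<P>Fran&ccedil;ois asked, &laquo;&iquest;D&oacute;nde est&aacute; el caf&eacute;?&raquo;</P>
<!-- The numeric form &#233; is equivalent to the named form &eacute; -->
<P>The caf&#233; is 5 &plusmn; 1 minutes away.</P>
```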