CGI Developer's Guide

Appendix B -- HTML Guide


This book has assumed that you have at least a working knowledge of HyperText Markup Language (HTML). This appendix provides a complete reference to HTML as well as some conceptual information. It is organized by area of HTML; each section documents its tags and sometimes provides an example.

HTML is currently in a state of flux. The current HTML standard (v2.0) is documented in RFC 1866, although it is already outdated. Many HTML tags are not part of the official standard, but they are widely adopted in practice and will most likely become part of the new HTML specification currently being drafted. The current Internet Engineering Task Force (IETF) draft of the new (v3.0) specification has expired and has been replaced by several more specialized drafts. For example, there are separate drafts for tables, file upload, and client-side imagemaps.

Because of the constantly evolving nature of HTML, this reference document will in all likelihood be outdated by the time you read it. However, several good HTML references on the Web are kept up to date.

Because I want to present HTML tags other than those in the official 2.0 standard, this appendix is inherently subjective. These tags should work properly in almost all browsers. Additionally, extended and new tags supported by Netscape and Microsoft's Internet Explorer are included.

General Structure

The general format for all HTML documents is

<HTML>
<HEAD>
   "metainformation" goes here
</HEAD>
<BODY>
   displayed content goes here
</BODY>
</HTML>


Metatags go inside of the <HEAD> tag. The <META> tag is used to embed document meta-information not defined by other HTML tags. Information within the <META> tag can be extracted by servers and/or clients for use in cataloging, identifying, and indexing document meta-information.

<META NAME="..." HTTP-EQUIV="..." CONTENT="...">

This metatag simulates an HTTP header. NAME is used to name properties such as author or publication date. If the NAME attribute is absent, the name can be assumed to be the value of HTTP-EQUIV. HTTP-EQUIV binds the element to an HTTP response header, and the value of the header is given in the CONTENT attribute.
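As a minimal sketch of both uses, the first tag below names a property and the second binds to an HTTP response header. The author name and the expiry date are made-up placeholders, not values from this book.

```html
<HEAD>
<TITLE>Sample Document</TITLE>
<!-- NAME labels a property; the author shown is a placeholder -->
<META NAME="author" CONTENT="Jane Doe">
<!-- HTTP-EQUIV asks that this be treated like an "Expires:"
     response header; the date is illustrative only -->
<META HTTP-EQUIV="Expires" CONTENT="Wed, 04 Dec 1996 21:29:02 GMT">
</HEAD>
```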


The <TITLE> tag gives the title of the HTML document; it must occur within the head of the document. The contents of the <TITLE> tag usually appear somewhere on the browser window, typically in the title bar.


<ISINDEX> informs the HTML user agent that the document is an index document. The reader can read the document or enter a keyword search. The default URL used for processing queries can be overridden with the HREF attribute. (See Chapter 5, "Input," for more details.)

Netscape has made additional, backward-compatible tags and attributes available on top of the HTML 1.0 specification. Some of Netscape's extensions have since been incorporated into HTML 2.0 and 3.0. The following attribute to the <ISINDEX> tag is a Netscape extension:


<ISINDEX PROMPT="...">

The PROMPT attribute enables you to change the default prompt supplied by the browser.

<BASE HREF="...">

The <BASE> tag defines the base URL of the current document. All relative URLs in this document are relative to the URL in HREF. By default, the value of HREF depends on the current location of the document.

The following is the Netscape extension:


<BASE TARGET="...">

TARGET enables you to specify the Netscape browser window or frame in which the output of all links in the current document is displayed.

<LINK REV="..." REL="..." HREF="...">

<LINK> specifies links, documents, and information related to the current document. REL specifies the relationship this link has to the current document, and REV specifies the same relationship in reverse. HREF specifies the location of the link.

For example, suppose you had two chapters of a book: chapter1.html and chapter2.html. Chapter1.html might contain one of the following links:

<LINK REL="next" HREF="chapter2.html">


<LINK REV="previous" HREF="chapter2.html">


<BODY> tags contain the information that is displayed in the Web browser. To make notes within the <BODY> tag that the browser ignores, you can use the comment tag:

<!-- ... -->

Netscape Navigator and Microsoft Internet Explorer support the following proprietary extensions to the <BODY> tag:
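As a minimal sketch, assuming the widely supported BACKGROUND, BGCOLOR, TEXT, LINK, VLINK, and ALINK attributes, a <BODY> tag using these extensions might look like the following. The image file name and the color values are placeholders.

```html
<!-- Colors are hexadecimal RGB triplets; bg.gif is a placeholder file name -->
<BODY BACKGROUND="bg.gif" BGCOLOR="#FFFFFF" TEXT="#000000"
      LINK="#0000FF" VLINK="#800080" ALINK="#FF0000">
<P>Black text on a white page, with blue unvisited links.</P>
</BODY>
```

BGCOLOR is a reasonable fallback to set alongside BACKGROUND, because it is what the reader sees if the background image does not load.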



The following text blocking statements demonstrate how you can lay out the text in the body of your HTML page:


The <P> tag defines a paragraph. ALIGN specifies the alignment of the paragraph within the <P> tags.


There are several different ways to present a listing of items. In HTML, each list consists of a tag that specifies the kind of list and a tag (or series of tags) that specifies each item in the list.


<LI> stands for "list item." <LI> is the most common way of expressing list items; it is used in unordered lists, ordered lists, menus, and directory lists. The following are Netscape extensions that can be used with list item tags:

  <li value=3>item #3

Unordered Lists

Unordered lists differ from ordered lists in that, instead of each list item being labeled numerically, bullets are used.


The <UL> tag creates lists with generic bullets preceding each unordered list item.

Ordered Lists

List items in ordered lists are automatically preceded by a sequential number or letter, beginning with the number 1 or letter A and incrementing by 1 with each new list item.


Each item within the <OL> tag is assigned a sequential number or letter; the labeling style depends on the nesting level of the list.


TYPE='...' is a Netscape extension that defines the default bullet type. See "<LI>."
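The two list types and the Netscape TYPE and VALUE extensions can be sketched as follows. The item text is made up for illustration.

```html
<!-- Unordered list: each item is preceded by a bullet -->
<UL>
<LI>apples
<LI>oranges
</UL>

<!-- Ordered list labeled with capital letters (TYPE is a Netscape extension) -->
<OL TYPE=A>
<LI>first item, labeled A
<LI>second item, labeled B
<!-- VALUE resets the counter to 5, so this item is labeled E -->
<LI VALUE=5>this item is labeled E
<LI>this item is labeled F
</OL>
```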


Menus are a list of items. This tag has the same effect as the <UL> tag.


The <MENU> tag indicates a menu of items. The output usually looks similar or equivalent to <UL>. Menus cannot be nested.


Directories are specified with the <DIR> tag. The output is the same as the <UL> tag.


The <DIR> tag indicates a directory of items. The output usually looks similar or equivalent to <UL>. Directories cannot be nested.


Definitions are specified with the definition list tag. The following is a definition list, where <DT> is the name or title of the item to be defined and <DD> is the definition.
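A short sketch of a definition list follows; the terms and definitions are illustrative.

```html
<DL>
<DT>HTML
<DD>HyperText Markup Language, the markup used for Web pages.
<DT>CGI
<DD>Common Gateway Interface, a way for Web servers to run programs.
</DL>
```

Browsers usually render each <DD> indented beneath its <DT>.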



Source code and other text can be displayed in a monospaced font with the <PRE> tag. The text within the <PRE> tags appears exactly as it is typed, with spaces and line breaks preserved.



Division defines a container of information.


<DIV> defines a container of information. ALIGN specifies the alignment of all information in that container. This tag is the preferred way of aligning elements in your HTML documents.


Text can be centered within the browser window with the <CENTER> tag.


The <CENTER> tag centers the content between the tags. This tag is a proprietary solution, although most, if not all, browsers support it. It is a good idea to use <DIV ALIGN=CENTER> over the <CENTER> tag for the benefit of newer browsers.
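The two approaches can be compared side by side in this small sketch; both paragraphs should render centered in browsers that support the respective tag.

```html
<!-- Preferred: alignment expressed through <DIV> -->
<DIV ALIGN=CENTER>
<P>This paragraph is centered.</P>
</DIV>

<!-- Same visual result using the <CENTER> tag -->
<CENTER>
<P>This paragraph is centered too.</P>
</CENTER>
```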

Text Formatting

The following tags describe the element between the tags. The appearance of the element is not as important as the actual definition. For example, a user should be able to specify how his or her browser displays any heading in an HTML file; these headings should be defined as headings rather than as bold text in a large font.


Headings are usually used for section headings; the alignment is specified by ALIGN. There are six headings: <H1>, <H2>, <H3>, <H4>, <H5>, and <H6>.



The <EM> tag emphasizes the text between the tags. The emphasized text is usually (but not necessarily) displayed as italics.


Strong Emphasis

The <STRONG> tag strongly emphasizes text between the tags. The emphasized text is usually (but not necessarily) displayed as bold.


Block Quotes

You can block-quote selected text with the <BLOCKQUOTE> tag. This sets off the text between tags, usually by indenting or changing the margin and centering.



You use citations when you are referring to another printed document, such as a book. Text within the <CITE> tag is usually italic.



E-mail addresses are usually wrapped in the <ADDRESS> tag.


The <ADDRESS> tag defines an e-mail address, usually rendered in italics.

Source Code

Computer source code is usually surrounded by the <CODE> tag.


The <CODE> tag defines a source code excerpt, which is rendered in a fixed-width font.

Sample Output

Sample output of a program can be formatted with the <SAMP> tag.


The <SAMP> tag defines sample output from a program.

Keyboard Input

The <KBD> tag marks text that the user is to type on the keyboard. It is normally rendered in a fixed-width font.



The <VAR> tag marks a variable used in a mathematical formula or computer program. It is normally displayed in italics.



Definitions are usually formatted differently than other text. Use the <DFN> tag to display definitions.


Physical Formatting

Physical formatting tags are very popular because they are literal: each tag states how the text should look rather than what the text means.


Text can be rendered bold with the <B> tag.



Text can be displayed in italics with the <I> tag.



The <TT> (typewriter) tag displays text in a fixed-width, typewriter-style font.



Text can be underlined with the <U> tag.



Text can be displayed with a line through the middle with the <S> tag to indicate strikeout.



The <SUB> tag renders text as a subscript: lowered below the baseline and usually smaller than the normal font.



The <SUP> tag works like the subscript tag, except that the text is raised above the baseline. It is also usually displayed smaller than the normal text.


Netscape Extensions

The <BLINK> tag makes the text within the tags blink. This is not recommended because of the way it affects different browsers.


The SIZE and COLOR attributes of the <FONT> tag define the size and color of the text. SIZE is a number between 1 and 7 (the default size is 3). You can make the font larger or smaller relative to the preceding font by prefixing the number with a plus sign (+) or a minus sign (-).

<FONT SIZE=n|+n|-n COLOR="...">...</FONT>

<BASEFONT SIZE=n> defines the default size of the fonts. The default value is 3.
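A brief sketch of absolute and relative font sizes follows. The text and the color value are placeholders.

```html
<!-- Raise the default font size from 3 to 4 for the rest of the page -->
<BASEFONT SIZE=4>
<P>Text at the base size,
<FONT SIZE=+2>text two steps larger</FONT>,
<FONT SIZE=2 COLOR="#FF0000">text at absolute size 2 in red</FONT>.</P>
```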


Text can be linked to other text with a click of the mouse; text linked in this way is called hypertext.

<A HREF="...">...</A>

When the user selects the link, the browser goes to the location specified by HREF. The value of HREF can be either an absolute URL or a relative URL; relative URLs are resolved against the base URL of the current document.

<A NAME="...">

The <A NAME="..."> tag sets a marker in the HTML page. NAME is the name of the marker. To reference that marker, use the following:

<A HREF="filename.html#markername">...</A>

The following is the Netscape extension:


<A HREF="..." TARGET="...">...</A>

The TARGET attribute enables you to specify the Netscape browser window or frame in which the linked document is displayed.
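The marker syntax described above can be sketched as follows. The file name chapter1.html and the marker name summary are illustrative.

```html
<!-- In chapter1.html: set a marker named "summary" -->
<A NAME="summary">Summary</A>

<!-- Elsewhere in the same file: jump to the marker -->
<A HREF="#summary">Go to the summary</A>

<!-- From another file: name the file, then the marker -->
<A HREF="chapter1.html#summary">Chapter 1 summary</A>
```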

Inline Multimedia

The <IMG> tag places an inline image within an HTML document; the SRC attribute names the image.

<IMG SRC="..." [ALIGN="..."] [ALT="..."] [ISMAP] [USEMAP="..."]>

SRC defines the location of that image, either a URL or a relative path. ALIGN specifies the alignment of both the image and the text or graphics following the image. ALT specifies alternative text to display if the image is not displayed. ISMAP is used for server-side imagemaps, and USEMAP is used for client-side imagemaps.

Client-side imagemaps are defined using the <MAP> tag.

<MAP NAME="...">

NAME is the name of the map (similar to <A NAME="...">), and the <AREA> tags define the areas of the map. COORDS is a list of coordinates that define the area. HREF defines where to go if that area is selected. If you specify NOHREF, the browser ignores clicks in that region.
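A client-side imagemap might be put together as in the sketch below. The image and page names are hypothetical, and the SHAPE attribute is an assumption taken from the client-side imagemap draft rather than from the text above.

```html
<!-- USEMAP points at the map by name, prefixed with # -->
<IMG SRC="toolbar.gif" ALT="Navigation bar" USEMAP="#toolbar">

<MAP NAME="toolbar">
<!-- COORDS for a rectangle: left, top, right, bottom, in pixels -->
<AREA SHAPE="rect" COORDS="0,0,99,49" HREF="home.html">
<AREA SHAPE="rect" COORDS="100,0,199,49" HREF="search.html">
<!-- Clicks in this region do nothing -->
<AREA SHAPE="rect" COORDS="200,0,299,49" NOHREF>
</MAP>
```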

The following are Netscape extensions to the <IMG> tag:


You can insert line breaks using the <BR> tag. Using this tag is the same as pressing Enter to start a new line of text.


The <BR> tag indicates a line break. In Netscape, <NOBR> prevents line breaks and <WBR> indicates where to break the line if needed.


The <HR> tag indicates a horizontal line, also known as a horizontal rule. Netscape extensions to the <HR> tag are the attributes SIZE=number, WIDTH=[number|percent], ALIGN=[left|right|center], and NOSHADE.


You can use the <FORM> tag to make your Web pages interactive by accepting input from the user. For more detailed information on HTML forms, see Chapter 3, "HTML and Forms."


The ACTION, METHOD, and ENCTYPE attributes define the form's action (the URL that processes the form), its submission method, and the encoding type of the submitted data.


The other attributes are all dependent on the TYPE attribute (see Chapter 3). TYPE is one of the following:

The <SELECT> tag lets you define a menu of items from which to select. The following is an example of the <SELECT> tag:
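A minimal sketch of a menu, assuming a hypothetical field named flavor with made-up options:

```html
<SELECT NAME="flavor">
<OPTION VALUE="v">Vanilla
<!-- SELECTED marks the option chosen by default -->
<OPTION VALUE="c" SELECTED>Chocolate
<OPTION VALUE="s">Strawberry
</SELECT>
```

When the form is submitted, the browser sends the VALUE of the chosen option under the name flavor.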


The <TEXTAREA> tag defines a textual area where the user can type in multiple lines of text.
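Putting the form tags together, a simple comment form might look like the sketch below. The script path /cgi-bin/comment.cgi and the field names are hypothetical.

```html
<!-- ACTION names the program that receives the data; METHOD is GET or POST -->
<FORM ACTION="/cgi-bin/comment.cgi" METHOD="POST">
<P>Name: <INPUT TYPE="text" NAME="name" SIZE=30></P>
<P>Comments:<BR>
<TEXTAREA NAME="comments" ROWS=5 COLS=40>Type your comments here.</TEXTAREA></P>
<P><INPUT TYPE="submit" VALUE="Send"> <INPUT TYPE="reset" VALUE="Clear"></P>
</FORM>
```

Text placed between the <TEXTAREA> tags appears as the initial contents of the text area.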



Tables are defined by rows and cells in those rows.


The <TABLE> tag defines a table. If you specify BORDER, a border will be drawn around the table.

The following are the Netscape extensions to the <TABLE> tag:

Table Rows

You can use the <TR> tag to specify table rows.


The <TR> tag defines a row within the table. ALIGN specifies the horizontal alignment of the elements within the row, and VALIGN specifies the vertical alignment.

Table Data

You can specify the elements of a table cell with the <TD> tag as follows:


The <TD> tag specifies a table cell within a row. Normally, the cell occupies one row and one column. However, you can have it extend into another row or column using the COLSPAN=n or ROWSPAN=n attribute, where n is the number of columns or rows the cell spans.

Table Headings

Use the <TH> tag to place headings within a table.


<TH> tags are equivalent to <TD> except they are used as table headers. The contents of table heading tags are normally bold.


Captions can be inserted into a table as follows:


The <CAPTION> tag describes a caption for the table.
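The table tags described in this section fit together as in the following sketch. The data is made up for illustration.

```html
<TABLE BORDER>
<CAPTION>Fruit prices (made-up data)</CAPTION>
<TR>
  <TH>Fruit</TH>
  <TH>Price</TH>
  <TH>In stock</TH>
</TR>
<!-- Cells in this row are centered horizontally and aligned to the top -->
<TR ALIGN=CENTER VALIGN=TOP>
  <TD>Apple</TD>
  <TD>$0.50</TD>
  <TD>yes</TD>
</TR>
<TR>
  <TD>Mango</TD>
  <!-- This cell spans the last two columns -->
  <TD COLSPAN=2>not available</TD>
</TR>
</TABLE>
```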


Frames are a Netscape enhancement that enables you to divide the browser window into several different components. For more detailed information on frames, see Chapter 14, "Proprietary Extensions."

The <FRAMESET> tag is the basic frame element. It defines either a set of rows or a set of columns of frames. You may nest <FRAMESET> tags within each other.


Frame Elements

The <BODY> tag is replaced by the <FRAMESET> tag in a framed HTML page.


<FRAME SRC="..." [NAME="..."] [MARGINWIDTH=n] [MARGINHEIGHT=n]
       [SCROLLING="yes|no|auto"] [NORESIZE]>

The preceding tag defines the frame element within the <FRAMESET> tags. SRC is the location of the document that should appear in this frame. NAME is the name of the frame. SCROLLING defines whether or not to display a scrollbar. MARGINWIDTH and MARGINHEIGHT define the margin between the content of the frame and the frame in pixels. NORESIZE prevents the user from resizing the frame.


<NOFRAMES> defines the HTML to appear if the browser does not support frames. If the browser does support frames, everything within these tags is ignored.
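A complete framed page might look like the sketch below. The file names menu.html and main.html and the frame names are hypothetical, and the COLS value assumes Netscape's syntax in which * means "the remaining space."

```html
<HTML>
<HEAD>
<TITLE>Framed Page</TITLE>
</HEAD>
<!-- Two columns: a 150-pixel menu and a content frame taking the rest -->
<FRAMESET COLS="150,*">
  <FRAME SRC="menu.html" NAME="menu" SCROLLING="no" NORESIZE>
  <FRAME SRC="main.html" NAME="content" MARGINWIDTH=10 MARGINHEIGHT=10>
  <!-- Shown only by browsers that do not support frames -->
  <NOFRAMES>
  <P>This page uses frames. <A HREF="main.html">View the content directly.</A></P>
  </NOFRAMES>
</FRAMESET>
</HTML>
```

Naming the frames lets links in menu.html use TARGET="content" to load pages into the larger frame.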

Special Characters

Table B.1 lists the HTML entities you insert into text for characters that are reserved by HTML or are not usually found on a 101-key keyboard.

Table B.1. Non-alphanumeric characters.

Description                             Entity Name
Quotation mark                          &quot;
Less-than sign                          &lt;
Greater-than sign                       &gt;
Non-breaking space                      &nbsp;
Inverted exclamation                    &iexcl;
Cent sign                               &cent;
Pound sterling                          &pound;
General currency sign                   &curren;
Yen sign                                &yen;
Broken vertical bar                     &brvbar;
Section sign                            &sect;
Umlaut (dieresis)                       &uml;
Left angle quote, guillemotleft         &laquo;
Not sign                                &not;
Soft hyphen                             &shy;
Registered trademark                    &reg;
Macron accent                           &macr; (also &hibar;)
Degree sign                             &deg;
Plus or minus                           &plusmn;
Superscript two                         &sup2;
Superscript three                       &sup3;
Acute accent                            &acute;
Micro sign                              &micro;
Paragraph sign                          &para;
Middle dot                              &middot;
Superscript one                         &sup1;
Masculine ordinal                       &ordm;
Right angle quote, guillemotright       &raquo;
Fraction one-fourth                     &frac14;
Fraction one-half                       &frac12;
Fraction three-fourths                  &frac34;
Inverted question mark                  &iquest;
Capital A, grave accent                 &Agrave;
Capital A, acute accent                 &Aacute;
Capital A, circumflex accent            &Acirc;
Capital A, tilde                        &Atilde;
Capital A, dieresis or umlaut mark      &Auml;
Capital A, ring                         &Aring;
Capital AE diphthong (ligature)         &AElig;
Capital C, cedilla                      &Ccedil;
Capital E, grave accent                 &Egrave;
Capital E, acute accent                 &Eacute;
Capital E, circumflex accent            &Ecirc;
Capital E, dieresis or umlaut mark      &Euml;
Capital I, grave accent                 &Igrave;
Capital I, acute accent                 &Iacute;
Capital I, circumflex accent            &Icirc;
Capital I, dieresis or umlaut mark      &Iuml;
Capital Eth, Icelandic                  &ETH;
Capital N, tilde                        &Ntilde;
Capital O, grave accent                 &Ograve;
Capital O, acute accent                 &Oacute;
Capital O, circumflex accent            &Ocirc;
Capital O, tilde                        &Otilde;
Capital O, dieresis or umlaut mark      &Ouml;
Multiply sign                           &times;
Capital O, slash                        &Oslash;
Capital U, grave accent                 &Ugrave;
Capital U, acute accent                 &Uacute;
Capital U, circumflex accent            &Ucirc;
Capital U, dieresis or umlaut mark      &Uuml;
Capital Y, acute accent                 &Yacute;
Capital THORN, Icelandic                &THORN;
Small sharp s, German (sz ligature)     &szlig;
Small a, grave accent                   &agrave;
Small a, acute accent                   &aacute;
Small a, circumflex accent              &acirc;
Small a, tilde                          &atilde;
Small a, dieresis or umlaut mark        &auml;
Small a, ring                           &aring;
Small ae diphthong (ligature)           &aelig;
Small c, cedilla                        &ccedil;
Small e, grave accent                   &egrave;
Small e, acute accent                   &eacute;
Small e, circumflex accent              &ecirc;
Small e, dieresis or umlaut mark        &euml;
Small i, grave accent                   &igrave;
Small i, acute accent                   &iacute;
Small i, circumflex accent              &icirc;
Small i, dieresis or umlaut mark        &iuml;
Small eth, Icelandic                    &eth;
Small n, tilde                          &ntilde;
Small o, grave accent                   &ograve;
Small o, acute accent                   &oacute;
Small o, circumflex accent              &ocirc;
Small o, tilde                          &otilde;
Small o, dieresis or umlaut mark        &ouml;
Division sign                           &divide;
Small o, slash                          &oslash;
Small u, grave accent                   &ugrave;
Small u, acute accent                   &uacute;
Small u, circumflex accent              &ucirc;
Small u, dieresis or umlaut mark        &uuml;
Small y, acute accent                   &yacute;
Small thorn, Icelandic                  &thorn;
Small y, dieresis or umlaut mark        &yuml;
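
As a short sketch of how such entities are used in running text, the following lines rely only on standard Latin-1 entity names; the sentences themselves are made up.

```html
<!-- &lt; and &gt; let you show a tag literally instead of having it interpreted -->
<P>Use &lt;B&gt; for bold text.</P>
<!-- Accented letters and currency signs not found on most keyboards -->
<P>The caf&eacute; charges &pound;2 for tea and 50&cent; for refills.</P>
<!-- Quotation marks and the registered trademark sign -->
<P>&quot;Quoted&quot; text and a trademark&reg;.</P>
```

An entity begins with an ampersand and ends with a semicolon, and its name is case sensitive: &Eacute; and &eacute; are different characters.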