Teach Yourself Web Publishing with HTML 3.2 in 14 Days
Putting It All Online
- What Does a Web Server Do?
- Locating a Web Server
- Organizing and Installing Your HTML Files
- What's My URL?
- Test, Test, and Test Again
- Registering and Advertising Your Web Pages
- Site Indexes and Search Engines
- Announce Your Site via Usenet
- Business Cards, Letterheads, and Brochures
- Finding Out Who's Viewing Your Web Pages
For the past week you've been creating and testing your Web pages on your local machine, with your own browser. You may not even have had a network connection attached to your machine. At this point you most likely have a Web presentation put together with a well-organized structure and a reasonable number of meaningful images (each with carefully chosen ALT text). You've written your text with wit and care, used only relative links, and tested it extensively on your own system.
Now, on Day 8, it's finally time to publish it, to put it all online so that other people on the Web can see it and link their pages to yours. In this chapter and the next, you'll learn everything you need to get started publishing the work you've done. Today you'll learn about:
- What a Web server does and why you need one
- Where you can find a Web server on which to put your presentation
- How to install your Web presentation
- How to find out your URL
- How to test your Web pages
- Methods for advertising your presentation
- Using log files and counters to find out who's viewing your pages
To publish Web pages, you'll need a Web server. The Web server is a program that sits on a machine on the Internet, waiting for a Web browser to connect to it and make a request for a file. Once a request comes over the wire, the server locates and sends the file back to the browser. It's as easy as that.
Web servers and Web browsers communicate using the HyperText Transfer Protocol (HTTP), a special "language" created specifically for the request and transfer of hypertext documents over the Web. Because of this, Web servers are often called HTTPD servers.
The "D" stands for "daemon." A daemon is a UNIX term for a program that sits in the background and waits for requests. When it receives a request, it wakes up, processes that request, and then goes back to sleep. You don't have to be on UNIX for a program to act like a daemon, so Web servers on any platform are still called HTTPDs. Most of the time I call them Web servers.
Although the Web server's primary purpose is to answer requests from browsers, there are several other things a Web server is responsible for. Some of these things you'll learn about today; others you'll learn about later this week.
File and Media Types
In Chapter 9, "External Files, Multimedia, and Animation," you learned a bit about content-types and how browsers and servers use file extensions to determine the type of the file. Servers are responsible for telling the browser the kind of content that a file contains. You can configure a Web server to send different kinds of media, or to handle new and different files and extensions. You'll learn more about this later in this chapter.
The Web server is also responsible for very rudimentary file management, mostly determining where to find a file and keeping track of where it's gone. If a browser requests a file that doesn't exist, it's the Web server that sends back the page with the "404: File Not Found" message. Servers can also be configured to create aliases for files (the same file, but accessed with a different name), to redirect files to different locations (automatically pointing the browser to a new URL for files that have moved), and to return a default file or a directory listing if a browser requests a URL ending with a directory name.
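To make that file management concrete, here is a minimal sketch in Python of how a server might turn the path in a URL into a file on disk. It is not how any particular server is implemented; the function name resolve and the DEFAULT_INDEX setting are made up for this illustration, and real servers let the Webmaster configure the default file name.

```python
from pathlib import Path

DEFAULT_INDEX = "index.html"  # assumed default file name; servers differ


def resolve(document_root, url_path):
    """Return (status, body) for the path part of a requested URL."""
    root = Path(document_root).resolve()
    # Map the URL path onto the disk, relative to the document root.
    target = (root / url_path.lstrip("/")).resolve()
    # Refuse anything that escapes the document root (e.g. "/../secret").
    if root not in (target, *target.parents):
        return 404, b"404: File Not Found"
    # A URL that names a directory gets that directory's default file.
    if target.is_dir():
        target = target / DEFAULT_INDEX
    if not target.is_file():
        return 404, b"404: File Not Found"
    return 200, target.read_bytes()
```

Calling resolve with a document root and "/" returns the contents of that directory's index.html if one exists; asking for a file that isn't there produces the familiar 404 message.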
Finally, servers keep log files for how many times each file on the site has been accessed, including the site that accessed it, the date, and, in some servers, the type of browser and the URL of the page they came from.
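As a rough illustration of what you can do with those logs, the following Python sketch counts successful requests per file. It assumes the log uses the common log format written by many servers (host, date, request line, status, size); your server's format may differ, and the sample entries are invented for the example.

```python
import re
from collections import Counter

# Assumed layout: host ident user [date] "METHOD path protocol" status bytes
LOG_LINE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<date>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def count_hits(lines):
    """Count successful requests per file; unparseable lines are skipped."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("status") == "200":
            hits[match.group("path")] += 1
    return hits


# Invented sample entries, for illustration only.
SAMPLE = [
    'host1.example.com - - [10/Oct/1996:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    'host2.example.com - - [10/Oct/1996:13:57:02 -0700] "GET /index.html HTTP/1.0" 200 2326',
    'host1.example.com - - [10/Oct/1996:13:58:11 -0700] "GET /gone.html HTTP/1.0" 404 170',
]
```

Passing SAMPLE to count_hits tallies the two successful requests for /index.html and ignores the request that ended in a 404.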
CGI Scripts, Programs, and Forms Processing
One of the more interesting (and more complex) things that a server can do is to run external programs on the server machine based on input that your readers provide from their browsers. These special programs are most often called CGI scripts, and are the basis for creating interactive forms and clickable image maps (images that contain several "hot spots" and do different operations based on the location within the image that has been selected). CGI scripts can also be used to connect a Web server with a database or other information system on the server side.
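To give you a feel for what such a program looks like, here is a small sketch of a CGI script written in Python. The server passes the reader's input in the QUERY_STRING environment variable, and the script writes back a header, a blank line, and a document. The form field called name and the greeting are made up for this example, and your server may restrict which languages scripts can be written in.

```python
import os
from html import escape
from urllib.parse import parse_qs


def respond(environ):
    """Build a CGI response from the query string the server passes in."""
    # The server hands form input to the script through environment variables.
    fields = parse_qs(environ.get("QUERY_STRING", ""))
    name = fields.get("name", ["stranger"])[0]  # "name" is a made-up form field
    # Escape the input so it cannot inject markup into the page.
    body = "<HTML><BODY><P>Hello, %s!</P></BODY></HTML>" % escape(name)
    # A CGI script writes a header block, a blank line, then the document.
    return "Content-type: text/html\n\n" + body


if __name__ == "__main__":
    print(respond(os.environ))
```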
Server-Side File Processing
Some servers have the ability to process files before they send them along to the browser. On a simple level there are what are called server-side includes, which can insert a date or a chunk of boilerplate text into each page, or run a program (many of the access counters you see on pages are run in this way). Server-side processing can also be used in much more sophisticated ways to modify files on the fly for different browsers or to execute small bits of scripting code. You'll learn more about server-side processing in Chapter 27, "Web Server Hints, Tricks, and Tips."
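The idea behind server-side includes can be sketched in a few lines of Python. The marker syntax used here is hypothetical, invented for this example; each server that supports includes defines its own directives, so check your server's documentation for the real ones.

```python
import re
from datetime import date

# Hypothetical placeholder syntax for this sketch; real servers define their own.
DIRECTIVE = re.compile(r"<!--#(\w+)-->")


def process(page, today, boilerplate):
    """Expand include markers in a page before it is sent to the browser."""
    values = {"date": today.isoformat(), "footer": boilerplate}

    def expand(match):
        # Unknown markers are left untouched rather than guessed at.
        return values.get(match.group(1), match.group(0))

    return DIRECTIVE.sub(expand, page)
```

The browser never sees the markers themselves, only the text the server substituted for them.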
Authentication and Security
Some Web sites require you to register for their service, and make you log in using a name and password every time you visit their site. This is called authentication, and it's a feature most Web servers now include. Using authentication, you can set up users and passwords and restrict access to certain files and directories. You can also restrict access to files or to an entire site based on site names or IP addresses (for example, to prevent anyone outside your company from viewing files that are intended for internal use).
Authentication is the ability to protect files or directories on your Web server so they require your readers to enter a name and password before the files can be viewed.
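The two kinds of restriction can be sketched in Python as follows. This is only an illustration of the checks involved, not how any real server stores its settings: the user name, password, and network address are made up, and real servers keep encrypted passwords in their own configuration files.

```python
import hmac
import ipaddress

# Made-up values for illustration only.
USERS = {"reader": "secret"}
INTERNAL = ipaddress.ip_network("192.0.2.0/24")


def is_internal(client_ip):
    """True if the request comes from inside the (assumed) company network."""
    return ipaddress.ip_address(client_ip) in INTERNAL


def check_login(user, password):
    """True if the name is known and the password matches."""
    expected = USERS.get(user)
    if expected is None:
        return False
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected, password)
```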
For security, some servers now provide a mechanism for secure connections and transactions using Netscape's SSL protocol. SSL provides authentication of the server (to prove that the server is who it says it is) and an encrypted connection between the browser and the server so that sensitive information between the two is kept secret.
You'll learn about both authentication and server security in Chapter 28, "Web Server Security and Access Control."
Before you can put your Web presentation on the Web, you'll need to find a Web server that you can use. Depending on how you get your access to the Internet, this may be really easy or not quite so easy.
If you get your Internet connection through school or work, that organization will most likely allow you to publish Web pages on their own Web server. Given that these organizations usually have fast connections to the Internet, and special people to administer the site for you, this is an ideal situation if you have it.
If you're in this situation, you'll have to ask your system administrator, computer consultant, Webmaster or network provider if they have a Web server available, and, if so, what the procedures are for getting your pages installed. You'll learn more about what to ask later in this chapter.
If you pay for your access to the Internet through an Internet service provider (ISP) or a commercial online service, you may also be able to publish your Web pages using that service, although it may cost you extra to do so, and there may be restrictions on the kind of pages you can publish or whether you can run CGI scripts or not. Ask your provider's help line or online groups or conferences related to Internet services to see how they have set up Web publishing.
Several organizations have popped up in the last year that provide nothing but Web publishing services. These services will usually provide you with some method for transferring your files to their site (usually FTP), and they provide the disk space and the network connection for access to your files. They also have professional site administrators on site to make sure the servers are running well all the time. Generally you are charged a flat monthly rate, with some additional cost if you use a large amount of disk space or if you have especially popular pages that take up a lot of network bandwidth. Some services even allow CGI scripts for forms and image maps and will provide consulting to help you set them up; a few will even set up their server with your own host name so that it looks as if you've got your own server running on the Web. These features can make commercial Web sites an especially attractive option. Appendix A, "Sources for Further Information," includes pointers to lists of these sites.
Note that unlike your main Internet service provider, which you generally want to be in your city or somewhere close to minimize phone bills, services that publish Web pages can be located anywhere on the Internet, and you can shop for the cheapest prices and best service without having to worry about geographical location.
If all else fails and you cannot find a Web server, there is another option. If your work, school, or ISP doesn't provide a Web server but they do provide an anonymous FTP or Gopher server, you can use those to publish your Web pages. You'll have a different URL (an FTP or Gopher URL), and you won't have nearly the features of a real Web site (forms, scripts, image maps), but if it's all you've got, it'll work well enough for simple pages. And often it may be a cheaper option than a dedicated Web server.
For the ultimate in Web publishing, running your own Web site is the way to go. If you run your own site, you can not only publish as much as you want to and include any kind of content you want to, but you can also use forms, CGI scripts, imagemaps, and many other special options that most other Web publishing services won't let you do. However, the cost, maintenance time and technical background required to run your own server can be daunting, and running a server is definitely not for everyone. You'll learn more about running your own server in the next chapter.
Once you have access to a Web server, you can publish the Web presentation you've labored so hard to create. But before you actually move it into place on your server, it's best to have your files organized and a good idea of what goes where so you don't lose files or so your links don't break in the process.
The Webmaster is the person who runs your Web server; this may also be your system administrator, help desk administrator, or network administrator. Before you can publish your files, there are several things you should learn from that Webmaster about how the server is set up. This list will also help you later on in this book when it's time to figure out what you can and cannot do with your server.
- Find out where on the server to put your files. In many cases, your Webmaster may create a special directory for you. Know where that directory is and how to gain access to it.
In some other cases, particularly on UNIX machines, you may be able to just create a special directory in your home directory and store your files there. If that's the case, your Webmaster will tell you the name of that directory.
- Find out the URL of your top-level directory. That URL may be different than the actual pathname to your files.
- Find out the name of your system's default index file. This is the file that is loaded by default when a URL ends with a directory name. Usually it'll be index.html but may sometimes be default.html, Homepage.html, or something else.
- Find out if you can run CGI scripts. Depending on your server, the answer may be a flat-out "no," or you may be limited to certain programs and capabilities. For now you don't need extensive details about how to do CGI; you'll learn more about it later on.
- Find out if your site has limitations on what you can put up or how much. Some sites restrict pages to specific content (for example, only work-related pages) or allow you only a few pages on the system. They may prevent more than a certain number of people from accessing your pages at once or may have other restrictions on what sort of publishing you can do. Make sure you understand the limitations of the system and that you can work within those limitations.
Probably the easiest way to organize each of your presentations is to include all the files for that presentation in a single directory. If you have lots of extra files (for your images, for example), you can put those in a subdirectory of that main directory. Your goal is to contain all your files in a single place rather than scattering them around on your disk. Once you have your files contained, you can set all your links in your files to be relative to that one directory. If you follow these hints, you stand the best chance of being able to move that directory around to different servers without breaking the links.
Web servers usually have a default index file that's loaded when a URL ends with a directory instead of a filename. In the previous section, one of the things you should have asked your Webmaster is what the name of that file is. For most Web servers, this file is usually called index.html (index.htm for DOS). Your home page or top-level index for each presentation should be called by this name so the server knows which page to send as the default page. Each subdirectory, in turn, if it contains any HTML files, should also have a default file. Using this default filename will also allow the URL to that page to be shorter because you don't have to include the actual filename. So, for example, your URL might be http://www.myserver.com/www/ rather than http://www.myserver.com/www/index.html.
Each file should also have an appropriate extension indicating what kind of file it is so the server can map it to the appropriate file type. If you've been following along in the book so far, all your files should already have that special extension, so this should not be a problem. Table 15.1 shows a list of the common file extensions you should be using for your files and media, in case you've forgotten.
| Format | Extension |
| MPEG Video | .mpeg, .mpg |
If you're using special media in your Web presentation that are not part of this list, you may have to have your server specially configured to handle that file type. You'll learn more about this later in this chapter.
Got everything organized? Then all that's left is to move everything into place on the server. Once the server can access your files, you're officially published on the Web. That's all there is to it.
But where is the appropriate spot on your server? You should have found this out from your Webmaster. You should also have found out how to get to that special spot on the server, whether it's simply copying files, using FTP to put them on the server, or using some other method.
If you're using a Web server that has been set up by someone else, usually you'll have to move your Web files from your system to theirs using FTP, Zmodem transfer, or some other method. Although the HTML markup within your files is completely cross-platform, moving the actual files from one type of system to another sometimes has its gotchas. In particular, be careful to do the following:
- Transfer all files as binary.
Your FTP or file-upload program may give you an option to transfer files in binary or text mode (or may give you even different options altogether). Always transfer everything in binary format: all your HTML files, all your images, and all your media. This includes the files that are indeed text; you can transfer a text file in binary mode without any problems.
If you're on a Macintosh, your transfer program will most likely give you lots of options with names such as MacBinary, AppleDouble, or other strange names. Avoid all of these. The option you want is flat binary or raw data. If you transfer files in any other format, they may not work when they get to the other side.
- Watch out for filename restrictions.
If you're moving your files to or from DOS systems, you'll have to watch out for the dreaded 8.3: the DOS rule that says filenames can be at most eight characters long with three-character extensions. If your server is a PC and you've been writing your files on some other system, you may have to rename your files and the links to them to have the right file-naming conventions. (Moving files you've created on a PC to some other system is usually not a problem.)
Also, watch out if you're moving files from a Macintosh to other systems; make sure that your filenames do not have spaces or other funny characters in them. Keep your filenames as short as possible, use only letters and numbers, and you'll be fine.
- Be aware of carriage returns and line feeds.
Different systems use different methods for ending a line; the Macintosh uses carriage returns, UNIX uses line feeds, and DOS uses both. When you move files from one system to another, the vast majority of the time the end-of-line characters will be converted appropriately, but sometimes they aren't. This can result in your file coming out double-spaced or all on one single line on the system that it was moved to.
Most of the time it doesn't matter because browsers ignore spurious returns or line feeds in your HTML files. The existence or absence of either one is not terribly important. Where it might be an issue is in sections of text you've marked up with <PRE>; you may find that your well-formatted text that worked so well on one platform doesn't come out well formatted after it's been moved.
If you do have end-of-line problems, you have a couple of options for how to proceed. Many text editors allow you to save ASCII files in a format for another platform. If you know what platform you're moving to, you can prepare your files for that platform before moving them. If you're moving to a UNIX system, small filters for converting line feeds called dos2unix and unix2dos may exist on the UNIX or DOS systems. And, finally, Macintosh files can be converted to UNIX-style files using the following command line on UNIX:
tr '\015' '\012' < oldfile.html > newfile.html
In this example, oldfile.html is the original file with end-of-line problems, and newfile.html is the name of the new file.
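If you'd rather check your files before you upload them, the filename and end-of-line rules above can be sketched in Python like this. The function names are made up for this example, and the filename test is deliberately strict (letters and numbers only, at most eight characters plus an extension of at most three), which is more than most non-DOS servers require.

```python
import re

# Letters and digits only, at most 8 characters, a dot, at most 3 characters.
DOS_NAME = re.compile(r"[A-Za-z0-9]{1,8}\.[A-Za-z0-9]{1,3}")


def fits_dos(filename):
    """True if the name satisfies the 8.3 rule and uses only letters and digits."""
    return DOS_NAME.fullmatch(filename) is not None


def to_unix_newlines(data):
    """Convert DOS (CR LF) and Macintosh (CR) line endings to UNIX (LF)."""
    # Replace CR LF first so a DOS line ending does not become two line feeds.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
```

The second function works on the raw bytes of a file, so read and write the file in binary mode when you use it.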
At this point you have a server, your Web pages are installed and ready to go, and all that's left is to tell people that your presentation exists. All you need now is a URL.
If you're using a commercial Web server, or a server that someone else administers, you may be able to easily find out what your URL is by asking the administrator (and, in fact, this is one of the things you were supposed to ask your Webmaster). Otherwise, you'll have to figure it out yourself. Luckily, this isn't that hard.
As I noted in Chapter 4, "All About Links," URLs are made of three parts: the protocol, the host name, and the path to the file. To determine each of these parts, use the following questions:
- What are you using to serve the files?
If you're using a real Web server, your protocol is http. If you're using FTP or Gopher, the protocol is ftp or gopher, respectively. (Isn't this easy?)
- What's the name of your server?
This is the network name of the machine your Web server is located on, typically beginning with www; for example, www.mysite.com. If it doesn't start with www, don't worry about it; that doesn't affect whether or not people can get to your files. Note that the name you'll use is the fully qualified host name; that is, the name that people elsewhere on the Web would use to get to your Web server, which may not be the same name you use to get to your Web server. That name will usually have several parts and end with .com, .edu, or the code for your country (for example, .uk, .fr, and so on).
With some SLIP or PPP connections, you may not even have a network name, just a number, something like 22.214.171.124. You can use that as the network name.
If the server has been installed on a port other than 80, you'll need to know that number, too. Your Webmaster will know this.
- What's the path to your home page?
The path to your home page most often begins at the root of the directory where Web pages are stored (part of your server configuration), which may or may not be the top level of your file system. For example, if you've put files into the directory /home/www/files/myfiles, your pathname in the URL might just be /myfiles. This is a server-configuration question, so if you can't figure it out, you may have to ask your server administrator.
If your Web server has been set up so that you can use your home directory to store Web pages, you can use the UNIX convention of the tilde (~) to refer to the Web pages in your home directory. You don't have to include the name of the directory you created in the URL itself. So, for example, if I had the Web page home.html in a directory called public_html in my home directory (lemay), the path to that file in the URL would be
/~lemay/home.html
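Once you know these three things, you can construct a URL. You'll probably remember from Chapter 4 that a URL looks like this:
protocol://hostname/path
You should be able to plug your values for each of those elements into the appropriate places in the URL structure. For example, using the host name and paths mentioned earlier in this chapter:
http://www.mysite.com/myfiles/index.html
http://www.mysite.com/~lemay/home.html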
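Putting the three parts together can also be sketched as a small Python function. The function name build_url is made up for this example, and the port number 8080 used in the test values below is just an illustration; your Webmaster will tell you the real one if your server isn't on the default port.

```python
def build_url(protocol, host, path, port=None):
    """Assemble protocol://host[:port]/path from its parts."""
    if not path.startswith("/"):
        path = "/" + path
    # The port is only spelled out when you were told the server uses one.
    netloc = host if port is None else "%s:%d" % (host, port)
    return "%s://%s%s" % (protocol, netloc, path)
```

For instance, build_url("http", "www.mysite.com", "/~lemay/home.html") gives you the URL for the home-directory example, and adding port=8080 inserts the port number after the host name.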
Now that your Web pages are available on the Net, you can take the opportunity to test them on as many platforms using as many browsers as you possibly can. It is only when you've seen how your documents look on different platforms that you'll realize how important it is to design documents that can look good on as many platforms and browsers as possible.
Try it and see; you might be surprised at the results.
What happens if you upload all your files to the server, go to bring up your home page in your browser, and something goes wrong? Here's the first place to look.
If your browser can't even get to your server, this is most likely not a problem you can fix. Make sure that you have the right server name and that it's a complete hostname (usually ending in .com, .edu, .net, or some other common ending name). Make sure you haven't mistyped your URL and that you're using the right protocol. If your Webmaster told you your URL included a port number, make sure you're including that port number in the URL after the hostname.
Also make sure your network connection is working. Can you get to other Web servers? Can you get to the top-level home page for the site itself?
If none of these ideas are solving the problem, perhaps your server is down or not responding. Call your Webmaster and see if he or she can help.
What if all your files are showing up as Not Found or Forbidden? First, check your URL. If you're using a URL with a directory name at the end, try using an actual filename at the end and see if that works. Double-check the path to your files; remember that the path in the URL may be different from the path on the actual disk. Also, keep in mind that uppercase and lowercase are significant. If your file is MyFile.html, make sure you're not trying myfile.html or Myfile.html.
If the URL appears to be correct, the next thing to check is file permissions. On UNIX systems, all your directories should be world-executable, and all your files should be world-readable. You can make sure all the permissions are correct using these commands:
chmod 644 filename
chmod 755 directoryname
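If you have a lot of files, setting permissions one at a time gets tedious. Here is a sketch of doing the whole directory in one pass with Python on a UNIX system. It assumes files only need to be readable (mode 644) while directories also need the execute bit (mode 755); the function name is made up for this example, and you should check with your Webmaster before changing permissions on a shared server.

```python
import os
from pathlib import Path

FILE_MODE = 0o644  # owner read/write, everyone else read-only
DIR_MODE = 0o755   # owner full access, everyone else read and search


def publish_permissions(top):
    """Apply world-readable / world-executable modes beneath a directory."""
    top = Path(top)
    os.chmod(top, DIR_MODE)
    for path in top.rglob("*"):
        # Directories need the execute bit so the server can look inside them.
        os.chmod(path, DIR_MODE if path.is_dir() else FILE_MODE)
```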
You can get to your HTML files just fine, but all of your images are coming up as icons or broken icons. First of all, make sure your references to your images are correct. If you've used relative pathnames, this should not have been a problem. If you've used full pathnames or file URLs, the references to your images may very well have broken when you moved the files to the server (I warned you).
In some browsers, notably Netscape, if you select an image with the right mouse button (hold down the button on a Mac mouse), you'll get a popup menu. The View This Image menu item will try to load the image directly, which will give you the URL of the image where the browser thinks it's supposed to be (which may not be where you think it's supposed to be). You can often track down strange relative pathname problems this way.
If the references all look fine, and the images worked just fine on your local system, the only other place a problem could have occurred is in transferring the files from one system to another. As I mentioned earlier in this chapter, make sure you transfer all your image files in binary format. If you're on the Mac, make sure you transfer as raw data or just data; don't try to use MacBinary or AppleDouble format, or you'll get problems on the other side.
If your HTML and image files are working just fine, but your links don't work, you most likely used pathnames for those links that applied only to your local system; for example, you used absolute pathnames or file URLs to refer to the files you're linking to. As I mentioned for images, if you used relative pathnames and avoided file URLs, this should not be a problem.
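You can hunt for these troublemakers before you upload anything. The following Python sketch scans a page for HREF and SRC values that are file URLs or absolute pathnames. It uses a rough pattern rather than a real HTML parser, so treat it as a first pass only; the function name and the sample paths in the test values are made up.

```python
import re
from urllib.parse import urlsplit

# Rough pattern for quoted HREF and SRC attribute values; not a full HTML parser.
LINK = re.compile(r'(?:HREF|SRC)\s*=\s*"([^"]*)"', re.IGNORECASE)


def fragile_links(html):
    """Return links that are likely to break when the files move to a server."""
    fragile = []
    for target in LINK.findall(html):
        parts = urlsplit(target)
        # file: URLs and absolute pathnames only make sense on the local disk.
        if parts.scheme == "file" or (not parts.scheme and target.startswith("/")):
            fragile.append(target)
    return fragile
```

Relative links and full http URLs to other sites pass through untouched; anything the function returns is worth rewriting as a relative link.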
Say you've got an HTML file or a file in some media format that displays or links just fine on your local system. But once you upload the file to the server and try to view it, the browser gives you gobbledygook; for example, it displays the HTML code itself instead of the HTML file, or it tries to display an image or media file as text.
There are two cases where this could happen. The first is when you're not using the right file extensions for your files. Make sure that you're using one of the right file extensions with the right uppercase and lowercase.
The second case where this could happen is when your server is misconfigured to handle your files. For example, if you're working on a DOS system where all your HTML files have extensions of .htm, your server may not understand that .htm is an HTML file (most modern servers do, but some older ones don't). Or, you may be using a newer form of media that your server doesn't understand. In either case, your server may be using some default content-type for your files (usually text/plain), which your browser then tries to handle (and doesn't often succeed).
To fix this, you'll have to configure your server to handle the file extensions for the media you're working with. If you're working with someone else's server, you'll have to contact your Webmaster and have them set things up correctly. Your Webmaster will need two types of information to make this change: the file extensions you're using, and the content-type you want them to return. If you don't know the content-type you want, there's a listing of the most popular types in Appendix E, "MIME Types and File Extensions."
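The mapping your Webmaster maintains is essentially a table from file extensions to content-types, with a default for anything it doesn't recognize. Python's standard mimetypes module keeps a similar table, so it makes a handy stand-in for a sketch. The .nwm extension and the application/x-newmedia type below are invented for the example, and text/plain is used as the fallback because that's the usual default described above; your server's default may differ.

```python
import mimetypes

DEFAULT_TYPE = "text/plain"  # assumed fallback; servers may use another default


def content_type(table, filename):
    """Look up the content-type for a file, falling back to the default."""
    guessed, _encoding = table.guess_type(filename)
    return guessed or DEFAULT_TYPE


table = mimetypes.MimeTypes()
# Hypothetical new media format: the extension and type are made up.
table.add_type("application/x-newmedia", ".nwm")
```

Until the new extension is added, a file such as clip.nwm is reported as text/plain, which is exactly the situation that produces gobbledygook in the browser.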
The "build it, and they will come" motto from the movie Field of Dreams notwithstanding, people won't simply start to visit your site of their own accord after you've put it online. In fact, with more than 10 million Web pages online already and that number set to double again in the next year, it's highly unlikely that anyone could ever just stumble across your site by accident.
As a result, to get people to visit your Web site, you need to advertise its existence in as many ways as possible. After all, the higher the visibility, the greater the prospect of your site receiving lots of hits.
Hits is Web-speak for the number of visits your Web site receives. It does not differentiate between people, but instead is simply a record of the number of times a copy of your Web page has been downloaded.
In this section you'll learn about many of the avenues available for you to promote your site, including:
- Getting your site listed on the major WWW directories
- How to list your site with the major WWW indexes
- Using Usenet to announce your site
- Business cards, letterheads, and brochures
- More directories and related Web pages
Many people, when they first start working with the World Wide Web, find it hard to understand that there are numerous Web sites out there just itching for the chance to include a hyperlink for your Web pages as part of their own list. And what they find even harder to understand is that, for the most part, no cost is involved.
There is a simple reason for the existence of so many apparently philanthropic individuals. When the World Wide Web was young and fresh, the best way for a person to promote the existence of his site was by approaching other Web developers and asking them to list his site on their pages. In return for this favor, he would also list their site on his pages. Over time, this process has been refined somewhat, but today many sites will still be only too happy to include a link to your site. In fact, don't be surprised if you occasionally receive an e-mail from someone asking to be included in your list of sites.
This cooperative nature is a strikingly unique feature of the World Wide Web. Instead of competing for visitors with other similar sites, most Web pages actually include a list of their competitors.
Unfortunately, however, there is still a problem with just exchanging hyperlink references with other sites. As was originally the case, people still need to be able to locate a single site as a starting point. To this end, some sort of global Internet directory was needed. Currently, there is no single site on the World Wide Web that could be regarded as the Internet directory, but a few major directories and libraries come very close.
By far, the most well-known directory of Web sites is the Yahoo site (see Figure 15.1), created by David Filo and Jerry Yang at http://www.yahoo.com/. This site started in April 1994 as a small private list of David and Jerry's favorite Web sites. But since then, it has grown to be a highly regarded catalog and index of Web sites and is now its own company.
Yahoo uses an elegant multilevel catalog to organize all the sites it references. To view the contents of any level of the catalog, select the major category hyperlink that most closely represents the information you are interested in, and then follow the chain of associated pages to a list of related Web sites like the one shown in Figure 15.2. The page shown in the figure is one you should definitely take a look at. It contains a list of Announcement Services and related Web pages that can help you spread the word about your new Web site.
To add your site to the list maintained by Yahoo, return to the Yahoo home page at http://www.yahoo.com/, and select the category appropriate to your site. Work your way down the catalog through any subcategories until you locate a list of sites similar to your own. On this page, click the Add URL button (it's in the banner along the top of the page). Yahoo then displays a form like the one shown in Figure 15.3, where you can enter the URL and other details about your Web site.
After you submit the form, a new hyperlink is automatically added to the category you selected previously. In addition, your site is also listed in the daily and weekly Yahoo What's New list, which can be found at http://www.yahoo.com/New/.
The World Wide Web (W3) Virtual Library, located at http://www.w3.org/pub/DataSources/bySubject/Overview.html, is another very popular online catalog. Unlike Yahoo, which is operated by a single group of people, the W3 Virtual Library is a distributed effort. As such, the contents of each separate category are maintained by different people (all volunteers) and sometimes housed on different computers all over the world.
As a result, to submit your URL for inclusion in a category of the Virtual Library, you need to send an e-mail request to the person maintaining it. To obtain a list of the e-mail addresses for each maintainer, point your Web browser to http://www.w3.org/pub/DataSources/bySubject/Maintainers.html.
The top-level directory (see Figure 15.4) maintained by the W3 Consortium also contains a link to the Maintainers page, along with other links that describe the submission process in greater detail. In addition, you'll find information on this page that describes how people can add their own categories to the W3 Virtual Library and become maintainers themselves.
Another popular method of promoting your site is by registering it with the growing number of Yellow Pages directories that have begun to spring up on the World Wide Web. These sites can best be thought of as the electronic equivalent of your local telephone Yellow Pages directory.
As a rule, Yellow Pages sites are designed specially for commercial and business Web users who want to advertise their services and expertise. For this reason, most of the Yellow Pages sites offer both free and paid advertising space, with the paid listings including graphics, corporate logos, and advanced layout features. The free listings, on the other hand, tend to be little more than a hyperlink and a short comment. When you're starting out, however, free advertising is without a doubt the best advertising. Of the Yellow Pages sites currently in operation, these are the three most popular:
- WORLDWIDE Yellow Pages
- GTE SuperPages
- WWW Business Yellow Pages
WORLDWIDE Yellow Pages
As the name suggests, this site aims to be a global online Yellow Pages directory. It can store Web addresses, postal information, phone numbers, and information about the category your business falls under. To check out the WORLDWIDE Yellow Pages site, use http://www.yellow.com/. To submit an entry to this directory, point your Web browser to http://www.yellow.com/cgi-bin/online as shown in Figure 15.5.
When submitting an entry to WORLDWIDE Yellow Pages, be sure to include the geographic location of your business. This is especially important for commerce sites. After all, if your Online Pizza Delivery Service is based in downtown New York, there's not much chance of your making that "30 minutes or it's free" deadline to me over here in Palo Alto, California.
Like the WORLDWIDE Yellow Pages site, which focuses primarily on businesses, GTE SuperPages also focuses on businesses both on and off the Web. The page at http://www.superpages.com/ gives you access to two separate Yellow Pages-type directories: one for business information gleaned from actual United States Yellow Pages information (which includes businesses without actual Web sites), and one specifically for businesses with Web sites. Both are organized into categories, and both listings let you search for specific business names and locations.
To submit a site for inclusion in either GTE directory, see the page at http://yp.gte.net/add.phtml? (shown in Figure 15.6).
WWW Business Yellow Pages
The WWW Business Yellow Pages is not as large as the other two mentioned previously, but because it's free, no harm can be done by including an entry for your business site here. It's operated as a community service by the University of Houston, College of Business Administration, at http://www.cba.uh.edu/ylowpges/.
As with the other two Yellow Pages sites, if you want your site included at the WWW Business Yellow Pages, you need to submit an online form. The URL for the application form, as shown in Figure 15.7, is http://www.cba.uh.edu/cgi-bin/autosub.
A special type of Web site, called a What's New listing, was designed with one purpose in mind: to announce the arrival of new Web sites. The granddaddy of all the What's New listings is the one operated by the NCSA, creators of NCSA Mosaic (see Figure 15.8).
To submit your site for inclusion on the NCSA listing, follow the instructions outlined on the What's New home page at http://www.ncsa.uiuc.edu/SDG/Software/Mosaic/Docs/whats-new.html. Currently, you need to submit an e-mail request, but it's highly likely that this requirement will change in the future.
To complement the NCSA lists, various other groups, such as Netscape Communications, operate their own What's New lists as well. Unlike the NCSA site, the Netscape page does provide an online form, like the one shown in Figure 15.9. To use this form, point your Web browser to http://home.netscape.com/escapes/submit_new.html.
In addition to the broad mainstream Web directories, many private directories on the World Wide Web cater to more specific needs. Some of these directories deal with single issues, whereas others are devoted to such areas as online commerce, education, business, and entertainment.
The best way to locate most of these directories is by using an Internet search tool such as Lycos (http://lycos.cs.cmu.edu/) or WebCrawler (http://www.webcrawler.com/). Alternatively, most of these directories will already be listed in such places as Yahoo and the W3 Virtual Library, so a few minutes spent visiting the relevant catalogs at these sites is normally very beneficial.
The Internet Mall
For those of you who plan to operate online stores via the World Wide Web, a directory such as the Internet Mall (http://www.internet-mall.com/), which is shown in Figure 15.10, is a very good place to start. Listing your Web site on such a mall gives you instant visibility. That does not necessarily mean that people will start knocking down your doors immediately, but it does give your store a much greater chance of succeeding.
The main criterion for obtaining a listing on the Internet Mall is that your site must sell tangible products, and people must be able to place an order for them online. Apart from this, only a few types of commerce are not welcome, including these:
- Multilevel marketing schemes
- Products available through dealerships
- Franchise opportunities
- Web publishing or design services
- Marketing services
- Hotels, restaurants, and nonbusiness sites
If you want to lodge a request for the inclusion of your online store at the Internet Mall, point your Web browser to http://www.internet-mall.com/howto.htm for more information.
If you use one of Netscape's Web servers to operate your Web site, you can list your site on Netscape's own shopping mall, called the Netscape Galleria (see Figure 15.11).
In addition, if you rent space from a Web service provider that uses either of the Netscape servers, you might also qualify for a listing. For more information, visit the Netscape Galleria at http://home.netscape.com/escapes/galleria.html. You'll learn more about Netscape's Web servers in Chapter 16, "Setting Up Your Own Server."
After you have your new site listed on the major directories and maybe a few smaller directories, you next need to turn your attention to the indexing and search tools, such as Lycos, WebCrawler, and InfoSeek. Unlike directories, which contain a hierarchical list of Web sites that have been submitted for inclusion to the directory, indexes have search engines (sometimes called "spiders") that prowl the Web and store information about every page and site they find. The indexes then store a database of sites that can be searched using a form.
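The idea of an index as a searchable database of pages can be sketched in a few lines of Python. This is only an illustration of the general technique (an inverted index from words to URLs), not a description of how Lycos, WebCrawler, or InfoSeek actually work. The page texts and example.com URLs are made up, and a real spider would fetch pages over the network rather than read them from a dictionary.

```python
import re
from collections import defaultdict

def build_index(pages):
    """Map each lowercase word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index

def search(index, query):
    """Return the URLs that contain every word in the query, sorted."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return sorted(results)

# Hypothetical pages standing in for what a spider might have collected.
pages = {
    "http://www.example.com/pizza.html": "Online pizza delivery in New York",
    "http://www.example.com/books.html": "Order books online",
}
index = build_index(pages)
print(search(index, "online pizza"))
```

Searching for a single word such as "online" would return both pages, because each page is stored under every word it contains.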
Once you publish your site on the Web and other people link to your site, chances are a search engine will eventually get around to finding and exploring your site. However, you can tell these indexes that your site exists ahead of time and get indexed that much faster. Each of these search engines provides a mechanism that enables you to submit your site for inclusion as part of its index.
Four search engines vie for the title of most popular on the Web: AltaVista, Lycos, WebCrawler, and InfoSeek.
One of the most popular and fastest Web indexes is Digital Equipment's AltaVista index at http://www.altavista.digital.com/. AltaVista indexes a good portion of the Web but stands out by having an extremely fast search engine. So, looking up specific search terms on the Web is quick and thorough.
You can submit your page to AltaVista using the form at http://www.altavista.digital.com/cgi-bin/query?pg=addurl (shown in Figure 15.12).
Lycos was one of the earliest search engines and still claims to have the largest overall coverage of the Web. Lycos is located at http://www.lycos.com/ (see Figure 15.13), and its registration form for asking that your site be visited is at http://www.lycos.com/register.html.
Following its recent move onto the Internet, America Online has taken over the operation of the WebCrawler indexing system located at http://webcrawler.com/GNN/WebQuery.html (see Figure 15.14).
WebCrawler does not have as wide a coverage as Lycos or AltaVista, with less than an estimated 40 percent of the World Wide Web indexed, but it does have the advantage of being the Internet index system of choice for more than 3.5 million America Online and Global Network Navigator (GNN) users.
The Submit URL form for WebCrawler is located at http://webcrawler.com/WebCrawler/SubmitURLS.html.
PC Computing magazine recently voted InfoSeek at http://www.infoseek.com/ (shown in Figure 15.15) the Most Valuable Internet Tool for 1995. Like Lycos and WebCrawler, InfoSeek is a Web indexing tool, but what makes it even more powerful is its capability to search through many other services and databases in addition to the World Wide Web. Such functionality, however, does come at a cost: only the Web search engine can be used without charge.
The other main difference between InfoSeek and other search tools is that you send your URL submission request via e-mail to firstname.lastname@example.org.
Besides the four search tools already covered, there are about 15 other search engines with differing capabilities, and you'll need to make a separate submission to each to ensure that your site is indexed.
Instead of listing the URLs and details for each of these sites, however, this discussion will turn to two special Web pages that take much of the drudgery out of submitting Web sites to search indexes and directories.
The PostMaster site shown in Figure 15.16 and located at http://www.netcreations.com/postmaster/index.html is an all-in-one submission page that asks you to fill out all the details required for more than 25 Web indexes and directories, including Yahoo, Netscape's Escapes, JumpStation, Lycos, InfoSeek, WebCrawler, World Wide Web, GNN Whole Internet Catalog, World Wide Yellow Pages, and NCSA's What's New. After completing the form, PostMaster submits your information to all these sites at once, so you don't have to do each one individually.
PostMaster also offers a commercial version of its submission system that delivers announcements about your new site to more than 200 magazines, journals, and other periodicals, in addition to all the sites included in the free version. Using the commercial version, however, is an expensive exercise.
The Submit It! service provided by Scott Banister is a lot like PostMaster in that it also helps you submit your URL to different directories and search indexes. It supports just about all the same services, but what sets it apart is the way in which you submit your information. Figure 15.17 shows a list of all the search indexes and directories currently supported by Submit It!
Submit It! doesn't ask you to complete one enormous page, something that many people find daunting. Instead, after you've filled out some general information, you select only the sites you want to submit an entry to and then perform each submission one site at a time.
To learn more about Submit It!, point your Web browser to http://www.submit-it.com/.
The World Wide Web is not the only place on the Internet you can use to announce the launch of your new Web site. Many people make use of a small set of Usenet newsgroups that are designed especially for making announcements. To locate these newsgroups, look for newsgroup names that end with .announce. (Refer to the documentation that came with your Usenet newsreader for information about how this can be done.)
One newsgroup is even devoted just to World Wide Web-related announcements. The name of this newsgroup is comp.infosystems.www.announce (see Figure 15.18). If your browser supports reading Usenet news, and you've configured it to point to your news server, you can view articles submitted to this newsgroup, and add your own announcements, by entering the URL news:comp.infosystems.www.announce into the Document URL field.
One post in particular to look for in comp.infosystems.www.announce is an excellent FAQ called "FAQ: How to Announce Your New Web Site." This FAQ contains an up-to-date list of all the best and most profitable means of promoting your Web site. If you can't locate the FAQ in this newsgroup, you can view an online version at http://ep.com/faq/webannounce.html.
comp.infosystems.www.announce is a moderated newsgroup. As such, any submissions you make to it are approved by a moderator before they appear in the newsgroup listing. To ensure that your announcement is approved, you should read the charter document that outlines the announcement process. You can read this document by pointing your Web browser to http://boutell.com/%7Egrant/charter.html.
Although the Internet is a wonderful place to promote your new Web site, there is another great advertising method that many people fail to even consider.
Most businesses spend a considerable amount of money each year producing letterheads, business cards, and other promotional material. Very few, however, consider printing their e-mail addresses and home page URLs on them. But why not? With more than 35 million people on the Internet, chances are that some of your customers are already on the Internet, or will be within a few years.
By printing your e-mail address and home page URL on all your correspondence and promotional material, you can reach an entirely new group of potential site visitors. And who knows, maybe you'll even pick up new clients by spending time explaining to people what all your new address information means.
The bottom line with the promotion of your Web site is lateral thinking. You need to use every tool at your disposal if you want to have a successful and active site.
Welcome to being happily published. At this point you've got your pages up on the Web and ready to be viewed, you've advertised and publicized your site to the world, and people are (hopefully) flocking to your site in droves. Or are they? How can you tell? There are a number of ways to find out, including log files and access counters.
The best way to figure out how often your pages are being seen and by whom is to see if you can get access to your server's log files. The server keeps track of all this information and, depending on how busy the server is, may keep this information around for weeks or even months. Many commercial Web publishing providers have a mechanism for you to view your own Web logs or to get statistics about how many people are accessing your pages and from where. Ask your Webmaster for help.
If you do get access to the raw log files, you'll most likely see a whole lot of lines that look something like this (I've broken this one up onto two lines so it fits on the page):
vide-gate.coventry.ac.uk - - [17/Apr/1996:12:36:51 -0700]
"GET /index.html HTTP/1.0" 200 8916
What does this mean? This is the standard format for most log files. The first part of the line is the site that accessed the file (in this case, it was a site from the United Kingdom). The two dashes are placeholders for identity information: the first is the remote identity reported by the client machine (almost always just a dash), and the second is the user name of the person who logged in, if you have login names and passwords set up. The date and time the page was accessed is inside the brackets. The part in quotes is the request itself: GET is the actual HTTP command the browser used (you usually see GET here), followed by the filename that was accessed; here it's the index.html at the top level of the server. Finally, the last two numbers are the HTTP status code and the number of bytes transferred. The status code can be one of many things: 200 means the file was found and transferred correctly; 404 means the file was not found (yes, it's the same status code you get in error pages in your browser). The number of bytes transferred will usually be the same as the number of bytes in your actual file; if it's a smaller number, the reader interrupted the load in the middle.
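If you'd rather not pick these fields apart by eye, a short script can do it for you. The following is a minimal sketch in Python, assuming your server writes lines in exactly the format shown above; the pattern and the names parse_log_line and LOG_PATTERN are my own, and your server's log format may differ.

```python
import re

# Pattern for one log line: host, two identity fields, [date],
# "METHOD path protocol", status code, and bytes transferred.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<when>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_log_line(line):
    """Return a dict of fields for one log line, or None if it doesn't match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    # Some servers write "-" when no bytes were sent; treat that as 0.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry

# The sample line from the text, joined back onto one line.
sample = ('vide-gate.coventry.ac.uk - - [17/Apr/1996:12:36:51 -0700] '
          '"GET /index.html HTTP/1.0" 200 8916')
entry = parse_log_line(sample)
print(entry["host"], entry["method"], entry["path"], entry["status"], entry["size"])
```

Returning None for lines that don't match lets you skip over anything unexpected in the file instead of stopping with an error.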
If you're interested, you'll learn more about log files and how to deal with them in Chapter 27, "Web Server Hints, Tricks, and Tips."
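In the meantime, here is a rough sketch of how you might tally which pages are being seen and by whom from a raw log file. It's written in Python and assumes each line follows the format shown earlier, so that splitting on spaces puts the host first, the filename seventh, and the status code ninth. Apart from the first one, the sample lines and the host and file names in them are invented for illustration.

```python
from collections import Counter

def count_page_hits(log_lines):
    """Count successful (status 200) requests per page and per host."""
    pages = Counter()
    hosts = Counter()
    for line in log_lines:
        fields = line.split()
        if len(fields) < 10:
            continue  # skip lines that aren't in the expected format
        host, path, status = fields[0], fields[6], fields[8]
        if status == "200":
            pages[path] += 1
            hosts[host] += 1
    return pages, hosts

# The first line is the sample from the text; the others are made up.
log_lines = [
    'vide-gate.coventry.ac.uk - - [17/Apr/1996:12:36:51 -0700] "GET /index.html HTTP/1.0" 200 8916',
    'host1.example.com - - [17/Apr/1996:12:40:02 -0700] "GET /index.html HTTP/1.0" 200 8916',
    'host1.example.com - - [17/Apr/1996:12:40:09 -0700] "GET /missing.html HTTP/1.0" 404 170',
]
pages, hosts = count_page_hits(log_lines)
for path, hits in pages.most_common():
    print(path, hits)
```

Only requests with a 200 status are counted here, so pages that weren't found don't inflate the totals. You could just as easily count the 404 lines instead to find broken links.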
If you don't have access to your server's log files for whatever reason, and you'd like to know at least how many people are looking at your Web pages, you can install an access counter on your page. You've probably seen these a number of times in your Web browsing; they look like odometers or little meters that say "Since July 15, 1900, this page has been accessed 5,456,234,432 times."
Lots of Web counters are available, but most of them require you to install something on your server or have what are called server-side includes set up (in fact, later in this book, in Chapter 27, you'll learn how to create your own simple access counter). A few, including the following, provide access counters that don't require server setup (but may cost you some money).
The Web counter at http://www.digits.com/ is easy to set up and very popular. If you have a site without a lot of hits (fewer than 1,000 a day), the counter service is free. Otherwise you'll need to be part of the commercial plan, with the access counter costing $30 and up.
After you sign up for the digits.com counter service, you'll get a URL which you include on your pages as part of an <IMG> tag. Then when your page is hit, the browser goes and retrieves that URL at digits.com's server, which generates a new odometer image for you.
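To make the mechanism concrete, here is a small Python sketch of what a counter service does each time its URL is requested: it adds one to the count for that page and produces the digits to display. This is not how digits.com itself is implemented; the HitCounter class and the page name "mypages" are hypothetical, and a real service would draw the digits as an image and store the counts permanently rather than in memory.

```python
class HitCounter:
    """In-memory stand-in for a counter service (hypothetical, for illustration)."""

    def __init__(self, digits=6):
        self.digits = digits  # how many odometer digits to show
        self.counts = {}      # page identifier -> number of hits so far

    def hit(self, page_id):
        """Record one request for page_id and return the odometer text."""
        self.counts[page_id] = self.counts.get(page_id, 0) + 1
        return str(self.counts[page_id]).zfill(self.digits)

counter = HitCounter()
counter.hit("mypages")         # first visitor loads the page
print(counter.hit("mypages"))  # second visitor: prints 000002
```

Because the count goes up on every request for the image, reloading your own page while you test it counts as a hit too.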
For more information about access counters in general (as well as a huge archive of images for access counters), see the Digit Mania home page at http://cervantes.learningco.com/kevin/digits/index.html.
In this chapter, you've reached the final point in creating a Web presentation: publishing your work to the World Wide Web at large through the use of a Web server, either installed by you or available from a network provider. Here you learned what a Web server does and how to get one; how to organize your files and install them on the server; how to find out your URL and use it to test your pages; how to advertise your pages once they're available; and how to find out who's looking at those pages. In the next chapter you'll learn how to set up and use your own Web server.
From here on, everything you learn is icing on an already-substantial cake. You'll simply be adding more features (interactivity, forms) to the presentation you already have available on the Web. Congratulations! Have some ice cream.
|Q||I have my pages published at an ISP I really like; my URL is something like http://www.thebestisp.com/users/mypages/. Instead of this URL, I'd like to have my own hostname, something like http://www.mypages.com/. How can I do this?|
|A||You have two choices. The easiest way is to ask your ISP if they allow you to have your own domain name. Many ISPs have a method for setting this up so you can still use their services and work with them; only your URL changes. Note that this may cost more money, but if you really must have that URL, then this may be the way to go.
The other option is to set up your own server with your own domain name. This option could be significantly more expensive than working with an ISP, and it requires at least some background in basic network administration. You'll learn all about doing this in the next chapter.
|Q||I created all my image files on a Mac, uploaded them to my UNIX server using the Fetch FTP program, tested it all, and it all works fine. But now I'm getting e-mail from people saying none of my images are working. What's going on here?|
|A||Usually when you upload the files using Fetch, there'll be a pull-down menu you can choose from where the default is MacBinary. Make sure you change that to Raw Data.|
MacBinary files work fine when they're viewed on the Mac. And since I assume you're using a Mac to test your presentation, they'll work fine. But they won't work on any other system. To make sure your images work across platforms, upload them as Raw Data.
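One way to check an uploaded file without relying on a Mac is to look at its first few bytes. GIF files begin with the characters GIF87a or GIF89a, and JPEG files begin with the bytes FF D8, whereas a MacBinary upload puts a 128-byte header in front of the real image data. The Python sketch below illustrates the idea; the function name is my own, and the all-zero header is only a stand-in for a real MacBinary header.

```python
def looks_like_raw_image(data):
    """Return True if the bytes start with a GIF or JPEG signature."""
    return data.startswith((b"GIF87a", b"GIF89a")) or data.startswith(b"\xff\xd8")

# A tiny fake GIF: just the signature followed by filler bytes.
raw_gif = b"GIF89a" + b"\x00" * 10

# The same data with a 128-byte header in front of it, roughly what a
# MacBinary upload looks like to a browser on another platform.
wrapped_gif = b"\x00" * 128 + raw_gif

print(looks_like_raw_image(raw_gif))      # True
print(looks_like_raw_image(wrapped_gif))  # False
```

To test a real file, read its first few bytes in binary mode and pass them to the function.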
|Q||I created my files on a DOS system, using the .htm extension, like you told me to earlier in the book. Now I've published my files on a UNIX system provided by my job. The problem now is that when I try to get to my pages using my browser, I get the HTML code for those pages, not the formatted result! It all worked on my system at home. What went wrong?|
|A||Some older servers will have this problem. Your server has not been set up to recognize that files with a .htm extension are actually HTML files, so it sends them as the default content-type (text/plain) instead. Then, when your browser reads one of your files from the server, it reads that content-type and assumes you have a text file. So your server is messing everything up.
There are several ways you can fix this. By far the best way is to ask your Webmaster to change the server configuration so that .htm files are sent as HTML. This is usually a very simple step that will magically cause all your files to work properly from then on.
If you can't find your Webmaster, or for some strange reason he or she will not make this change, your only other option is to change all your filenames after you upload them to the UNIX system. Note that you'll have to change all the links within those files as well. (Finding a way to convince your Webmaster to fix this would be a much better solution.)
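To see why one configuration change fixes every file at once, it helps to picture how a server picks a content-type: it looks up the file's extension in a table and falls back to a default when the extension isn't listed. The Python sketch below imitates that lookup. The tables and the function name are hypothetical and are not taken from any particular server's configuration.

```python
import os

DEFAULT_TYPE = "text/plain"  # what the server sends for unknown extensions

def content_type_for(filename, type_map):
    """Look up the content-type for a filename by its extension."""
    extension = os.path.splitext(filename)[1].lower()
    return type_map.get(extension, DEFAULT_TYPE)

# An older setup that only knows about .html (hypothetical table).
old_server_types = {".html": "text/html", ".gif": "image/gif"}

# The same table after the Webmaster adds one entry for .htm.
fixed_server_types = {**old_server_types, ".htm": "text/html"}

print(content_type_for("index.htm", old_server_types))    # text/plain
print(content_type_for("index.htm", fixed_server_types))  # text/html
```

With the extra entry in place, every .htm file on the server is sent as HTML, which is why this fix beats renaming files and editing links one by one.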