Java Unleashed

Chapter 27
Introducing Java Network Programming
One of the best features of Java is its networking support. Java's classes range from ones that manage low-level TCP/IP connections to ones that provide instant access to resources on the World Wide Web. Even if you have never done any network programming before, Java makes it easy.
The following chapters introduce you to the networking classes and how to use them. A guide to what is covered by each chapter follows:
- Chapter 27, Introduction to Java Network Programming: The chapter you are reading contains an introduction to TCP/IP networking and a list of the concepts you should be familiar with before reading the rest of the networking section.
- Chapter 28, The java.net Package: This chapter is a tour of the classes that make up the Java networking package java.net. The exceptions raised by the networking classes also are covered, as are the interfaces specified by the package.
- Chapter 29, Network Programming: This chapter, the meat of the networking chapters, contains examples of how to use the networking classes. There also is a section about deciding which Java classes best suit your networking needs.
- Chapter 30, Overview of Content and Protocol Handlers: This chapter discusses what protocol and content handlers are and how they can be applied, and provides an introduction to writing your own.
- Chapter 31, Extending Java with Content and Protocol Handlers: Classes can be written to allow the URL class to deal with new protocols and content types. For example, a Web browser written in Java could be extended to deal with a new image format. This chapter details how to write and use these classes.
Prerequisites
Though networking with Java is fairly simple, there are a few concepts and classes from other packages that you should be familiar with before reading this part of the book. If you are only interested in writing an applet that interacts with an HTTP daemon, you probably can just concentrate on the URL class for now. For the other network classes, you will need at least a passing familiarity with the World Wide Web, java.io classes, threads, and TCP/IP networking.
World Wide Web Concepts
If you are using Java you probably already have a familiarity with the Web. Knowledge of how Uniform Resource Locators (URLs) work is needed to use the URL and URLConnection classes.
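As a quick refresher on how a URL breaks down, the following minimal sketch takes one apart with the java.net.URL accessor methods. The class name UrlParts, the method describe, and the example.com address are illustrative assumptions, and the URL is obtained through URI.create(...).toURL() only because newer JDKs deprecate the URL constructors.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

// Hypothetical helper class, used only for illustration.
class UrlParts {
    /** Builds a one-line summary of the pieces that make up a URL. */
    static String describe(String spec) throws MalformedURLException {
        URL url = URI.create(spec).toURL();
        int port = url.getPort();            // -1 when the URL names no port
        if (port == -1) {
            port = url.getDefaultPort();     // the protocol's default, 80 for http
        }
        return "protocol=" + url.getProtocol()
             + " host=" + url.getHost()
             + " port=" + port
             + " path=" + url.getPath()
             + " ref=" + url.getRef();
    }

    public static void main(String[] args) throws MalformedURLException {
        // The address below is a placeholder, not a real resource.
        System.out.println(describe("http://www.example.com/docs/index.html#top"));
    }
}
```

Nothing here touches the network; the URL object only parses the text it is given.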
java.io Classes
Once you have a network connection established using one of the low-level classes, you will be using java.io.InputStream and java.io.OutputStream objects or appropriate subclasses of the objects to communicate with the other endpoint. Also, many of the java.net classes throw java.io.IOException when they encounter a problem.
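To make the stream idea concrete, here is a small sketch that opens a TCP connection to itself over the loopback interface, writes a message to one end's OutputStream, and reads it from the other end's InputStream. The class and method names are invented for this example, and it assumes the machine allows loopback connections. It runs in a single thread only because the message is small enough to sit in the socket buffers; a real program would not have both endpoints in one method.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, used only for illustration.
class StreamDemo {
    /** Sends a short message across a loopback TCP connection and returns what arrived. */
    static String sendOverLoopback(String message) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the operating system for any free port.
        try (ServerSocket listener = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, listener.getLocalPort());
             Socket server = listener.accept()) {

            // One endpoint writes bytes to its OutputStream...
            OutputStream out = client.getOutputStream();
            out.write(message.getBytes(StandardCharsets.UTF_8));
            out.flush();
            client.shutdownOutput();   // tells the reader that no more data is coming

            // ...and the other endpoint reads them from its InputStream.
            InputStream in = server.getInputStream();
            ByteArrayOutputStream received = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            int n;
            while ((n = in.read(buffer)) != -1) {
                received.write(buffer, 0, n);
            }
            return new String(received.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(sendOverLoopback("hello, endpoint"));
    }
}
```

Every call that can fail here reports the problem as a java.io.IOException, which is why both methods declare it.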
Threads
Although not strictly needed for networking, threads make using the network classes easier. Why tie up your user interface waiting for a response from a server when a separate communication thread can wait rather than the interface thread? Server applications also can service several clients simultaneously by spawning off a new thread to handle each incoming connection.
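The thread-per-connection idea can be sketched roughly as follows. This is an illustrative echo server, not code from the later chapters: the class ThreadedEchoServer, its method names, and the "echo: " reply format are all assumptions made for the example, and it assumes loopback connections are permitted.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo class, used only for illustration.
class ThreadedEchoServer {
    /** Echoes each line back to one client until that client disconnects. */
    static void serve(Socket client) {
        try (Socket s = client;
             BufferedReader in = new BufferedReader(new InputStreamReader(s.getInputStream()));
             PrintWriter out = new PrintWriter(s.getOutputStream(), true)) {
            String line;
            while ((line = in.readLine()) != null) {
                out.println("echo: " + line);
            }
        } catch (IOException e) {
            System.err.println("client handler failed: " + e);
        }
    }

    /** Accepts connections until the listener is closed, one new thread per client. */
    static void acceptLoop(ServerSocket listener) {
        try {
            while (true) {
                Socket client = listener.accept();
                new Thread(() -> serve(client)).start();
            }
        } catch (IOException e) {
            // accept() fails once the listener is closed; treat that as shutdown.
        }
    }

    /** Starts a server on a free loopback port, sends one line, and returns the reply. */
    static String roundTrip(String text) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket listener = new ServerSocket(0, 50, loopback)) {
            Thread acceptor = new Thread(() -> acceptLoop(listener));
            acceptor.setDaemon(true);
            acceptor.start();
            try (Socket s = new Socket(loopback, listener.getLocalPort());
                 PrintWriter out = new PrintWriter(s.getOutputStream(), true);
                 BufferedReader in = new BufferedReader(new InputStreamReader(s.getInputStream()))) {
                out.println(text);
                return in.readLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("hello"));
    }
}
```

Because accepting and serving happen on separate threads, a slow client delays only its own handler; the accept loop stays free to take the next connection.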
TCP/IP Networking
Before using the networking facilities of Java, you need to be familiar with the terminology and concepts of the TCP/IP networking model. The last part of this chapter should serve to get you up to speed.
Internet Networking: A Quick Overview
TCP/IP (Transmission Control Protocol/Internet Protocol) is the set of networking protocols used by Internet hosts to communicate with other Internet hosts. If you have any experience with networks or network programming, you should be able to skim this section and check back when you find a term you are not familiar with. A list of references is given at the end of this chapter if you would like more detailed information.
TCP/IP and Networking Terms
Like any other technical field, computer networking has its own set of jargon. These definitions should clear up what the terms mean.
- host: An individual machine on a network. Each host on a TCP/IP network has at least one unique address (see IP number).
- hostname: A symbolic name that can be mapped into an IP number. Several methods exist for performing this mapping, such as DNS (Domain Name System) and Sun's NIS (Network Information Service).
- IETF: The Internet Engineering Task Force, a group responsible for maintaining Internet standards and defining new ones.
- internet: A network of networks. When capitalized as "the Internet" it refers to the globally interconnected network of networks.
- IP number: A unique address for each host on the Internet (unique in the sense that a given number may only be used by one particular machine, but a particular machine may be known by multiple IP numbers). This currently is a 32-bit number that consists of a network part and a host part. The network part identifies the network the host resides on and the host part identifies the specific host on that network. Sometimes the IP number is referred to as the IP address of a host.
- packet: A single message sent over a network. Sometimes a packet is referred to as a datagram, but the former term usually refers to data at the network layer and the latter refers to a higher-layer message.
- protocol: A set of data formats and messages used to transmit information. Different network entities must speak the same protocol in order to understand each other.
- protocol stack: Network services can be thought of as different layers that use lower-level services to provide services to higher-level services. This set of layers providing network functionality is known as a protocol stack.
- RFC: Request for Comments, the documents in which proposed Internet standards are released. Each RFC is issued a sequential number, by which it is usually referenced. Examples are RFC 791, which specifies the Internet Protocol (the IP of TCP/IP), and RFC 821, which specifies the protocol used for transferring e-mail between Internet hosts (SMTP).
- router: A host that knows how to forward packets between different networks. A router can be a specialized piece of network hardware or can be something as simple as a machine with two network interfaces (each on a different physical network).
- socket: A communications endpoint (that is, one end of a conversation). In the TCP/IP context, a socket usually is identified by a unique pair consisting of the source IP address and port number and the destination IP address and port number.
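The "IP number" entry above can be illustrated with a little arithmetic. The sketch below packs a dotted-quad address into a single 32-bit value and then separates the network part from the host part. How many bits belong to each part is not stated in this chapter, so the 24-bit network prefix used in main is purely an assumption for the example, as are the class name, the method names, and the sample address.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper class, used only for illustration.
class Ipv4Parts {
    /** Packs the four bytes of a dotted-quad IPv4 literal into one 32-bit value. */
    static long toNumber(String dottedQuad) throws UnknownHostException {
        // For a literal address, getByName only parses the text; no name lookup occurs.
        byte[] octets = InetAddress.getByName(dottedQuad).getAddress();
        if (octets.length != 4) {
            throw new IllegalArgumentException("not an IPv4 address: " + dottedQuad);
        }
        long n = 0;
        for (byte octet : octets) {
            n = (n << 8) | (octet & 0xFF);
        }
        return n;
    }

    /** Formats a 32-bit value back into dotted-quad notation. */
    static String toDottedQuad(long n) {
        return ((n >> 24) & 0xFF) + "." + ((n >> 16) & 0xFF) + "."
             + ((n >> 8) & 0xFF) + "." + (n & 0xFF);
    }

    /** Returns a mask with the top prefixLength bits set (prefixLength from 0 to 32). */
    static long mask(int prefixLength) {
        return (0xFFFFFFFFL << (32 - prefixLength)) & 0xFFFFFFFFL;
    }

    static long networkPart(long address, int prefixLength) {
        return address & mask(prefixLength);
    }

    static long hostPart(long address, int prefixLength) {
        return address & ~mask(prefixLength) & 0xFFFFFFFFL;
    }

    public static void main(String[] args) throws UnknownHostException {
        long address = toNumber("192.168.10.37");   // sample address, assumed for the example
        System.out.println("as a number:  " + address);
        System.out.println("network part: " + toDottedQuad(networkPart(address, 24)));
        System.out.println("host part:    " + hostPart(address, 24));
    }
}
```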
The Internet Protocols
TCP/IP is a set of communications protocols for communicating between different types of machines and networks (hence the name internet). The name TCP/IP comes from two of the protocols: the Transmission Control Protocol and the Internet Protocol. Other protocols in the TCP/IP suite are the User Datagram Protocol (UDP), the Internet Control Message Protocol (ICMP), and the Internet Group Management Protocol (IGMP).
These protocols define a standard format for exchanging information between machines (known as hosts) regardless of the physical connections between them. TCP/IP implementations exist for almost every type of hardware and operating system imaginable. Software exists to transmit IP datagrams over network hardware ranging from modems to fiber-optic cable.
TCP/IP Network Architecture
There are four layers in the TCP/IP network model. Each of the protocols in the TCP/IP suite provides for communication between entities in one of these layers. These lower-level layers are used by higher-level layers to get data from host to host. The layers are as follows, with examples of what protocols live at each layer:
- Physical (Ethernet, Token Ring, PPP)
- Network (IP)
- Transport (TCP, UDP)
- Application (Telnet, HTTP, FTP, Gopher)
Figure: The TCP/IP protocol stack.
Each layer in the stack takes data from the one above it and adds the information needed to get the data to its destination, using the services of the layer below. One way to think of this layering is like the layers of an onion. Each protocol layer adds a layer to the packet going down the protocol stack. When the packet is received, each layer peels off its addressing to determine where to send the packet next.
As an example, suppose that your Web browser wants to retrieve something from a Web server running on a host on the same physical network. The browser sends an HTTP request using the TCP layer. The TCP layer asks the IP layer to send the data to the proper host. The IP layer then would use the physical layer to send the data to the appropriate host.
At the receiving end, each layer strips off the addressing information that the sender added and determines what to do with the data. Continuing the example, the physical layer would pass the received IP packet to the IP layer. The IP layer would determine that the packet is a TCP packet and pass it to the TCP layer. The TCP layer would pass the packet to the HTTP daemon process. The HTTP daemon then processes the request and sends the data requested back through the same process to the other host.
Figure: Addressing information is added and removed at each layer.
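The onion picture can be mimicked with plain strings. In this toy sketch, each layer on the sending side prepends a bracketed header, and each layer on the receiving side checks for and strips its own header before handing the rest upward. The header contents, addresses, class name, and method names are all invented for illustration; real protocol headers are binary structures defined by the relevant RFCs.

```java
// Hypothetical toy model, used only for illustration.
class Encapsulation {
    /** Going down the stack: a layer prepends its own header to what it was handed. */
    static String wrap(String header, String payload) {
        return "[" + header + "]" + payload;
    }

    /** Going up the stack: a layer checks for its header and strips it off. */
    static String unwrap(String header, String packet) {
        String prefix = "[" + header + "]";
        if (!packet.startsWith(prefix)) {
            throw new IllegalArgumentException("expected header " + prefix);
        }
        return packet.substring(prefix.length());
    }

    // Made-up header values for the three lower layers.
    static final String TCP = "TCP dstport=80";
    static final String IP = "IP dst=192.0.2.10";
    static final String ETHERNET = "ETH dst=00:00:5e:00:53:01";

    /** Sender side: application data passes down through TCP, IP, and the physical layer. */
    static String send(String httpRequest) {
        String segment = wrap(TCP, httpRequest);
        String datagram = wrap(IP, segment);
        return wrap(ETHERNET, datagram);
    }

    /** Receiver side: the same layers remove their headers in the opposite order. */
    static String receive(String frame) {
        String datagram = unwrap(ETHERNET, frame);
        String segment = unwrap(IP, datagram);
        return unwrap(TCP, segment);
    }

    public static void main(String[] args) {
        String frame = send("GET /index.html");
        System.out.println("on the wire: " + frame);
        System.out.println("delivered:   " + receive(frame));
    }
}
```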
In a case where the hosts are not on the same physical network, the IP layer would handle routing the packet through the correct series of hosts (known as routers) until it reaches its destination. One of the nice features of the IP protocol is that individual hosts do not have to know how to reach every host on the Internet. The host simply passes to a default router any packets for networks it does not know how to reach.
For example, a university might only have one machine with a physical connection to the Internet. All of the campus routers would know to forward all packets destined for the Internet to this host. Similarly, any host on the Internet only has to know to get packets to this one router to reach any host at the university. The router would forward the packets to the appropriate local routers.
Figure: An example of IP routing.
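The default-router idea can be sketched as a small lookup table: a host keeps entries for the networks it knows about and hands everything else to one default router. The code below is a simplified model under several assumptions: networks are described by a prefix length, the most specific matching entry wins (a common convention that this chapter does not itself describe), and the router names and addresses are invented.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model, used only for illustration.
class RoutingTable {
    private static final class Route {
        final long network;
        final int prefixLength;
        final String nextHop;

        Route(long network, int prefixLength, String nextHop) {
            this.network = network;
            this.prefixLength = prefixLength;
            this.nextHop = nextHop;
        }
    }

    private final List<Route> routes = new ArrayList<>();
    private final String defaultRouter;

    RoutingTable(String defaultRouter) {
        this.defaultRouter = defaultRouter;
    }

    /** Minimal dotted-quad parser; it does not validate its input. */
    static long parse(String dottedQuad) {
        long n = 0;
        for (String part : dottedQuad.split("\\.")) {
            n = (n << 8) | Integer.parseInt(part);
        }
        return n;
    }

    static long mask(int prefixLength) {
        return (0xFFFFFFFFL << (32 - prefixLength)) & 0xFFFFFFFFL;
    }

    void add(String network, int prefixLength, String nextHop) {
        routes.add(new Route(parse(network) & mask(prefixLength), prefixLength, nextHop));
    }

    /** Returns the next hop for a destination, or the default router if no entry matches. */
    String nextHopFor(String destination) {
        long dest = parse(destination);
        Route best = null;
        for (Route r : routes) {
            boolean matches = (dest & mask(r.prefixLength)) == r.network;
            if (matches && (best == null || r.prefixLength > best.prefixLength)) {
                best = r;
            }
        }
        return best != null ? best.nextHop : defaultRouter;
    }

    public static void main(String[] args) {
        RoutingTable table = new RoutingTable("internet-gateway");
        table.add("10.1.0.0", 16, "campus-router");
        table.add("10.1.2.0", 24, "lab-router");
        System.out.println(table.nextHopFor("10.1.2.7"));
        System.out.println(table.nextHopFor("10.1.9.9"));
        System.out.println(table.nextHopFor("198.51.100.4"));
    }
}
```

The table never needs an entry for the last address; anything it does not recognize simply goes to the default router, which is the property the university example relies on.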
The Future: IP Version 6
Back when the TCP/IP protocols were being developed in the early 1970s, 32-bit IP numbers seemed more than capable of addressing all the hosts on an internet. Although IP numbers have not yet run out, the explosive growth of the Internet in recent years is rapidly consuming the remaining unassigned addresses. To head off this shortage, a new version of the IP protocol is being developed by the IETF. This new version, known as either IPv6 or IPng (IP Next Generation), will provide a much larger address space of 128 bits, allowing for approximately 3.4 x 10^38 different IP addresses.
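The size of each address space follows directly from the number of bits: an n-bit address allows 2^n distinct values. The short sketch below computes both figures with java.math.BigInteger; the class and method names are made up for the example.

```java
import java.math.BigInteger;

// Hypothetical helper class, used only for illustration.
class AddressSpace {
    /** Number of distinct addresses that fit in the given number of bits (2^bits). */
    static BigInteger addressCount(int bits) {
        return BigInteger.valueOf(2).pow(bits);
    }

    public static void main(String[] args) {
        System.out.println("32-bit addresses:  " + addressCount(32));
        System.out.println("128-bit addresses: " + addressCount(128));
    }
}
```

The 128-bit result is a 39-digit number beginning 340282..., which is where the figure of roughly 3.4 x 10^38 comes from.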
IPv6 will be backward compatible with current IP implementations to allow older clients to interoperate with newer ones. Other benefits of the new version are as follows:
- Improved support for multicasting (sending packets to several destinations at one time).
- Simplified packet header formats.
- Support for authentication and encryption of packet contents at the network layer.
- Support for designating a connection as a special flow which should be given special treatment (such as real-time audio data that needs quick delivery).
These enhancements to TCP/IP should allow the Internet to continue the phenomenal growth it has experienced over the past few years.
Where to Find More Information
This was not meant to completely cover the subject of TCP/IP. If your curiosity has been piqued, the following online documents and books might be of interest to you.
RFCs
The first and definitive sources of information on the IP protocol family are the Request for Comments documents that define the standards themselves. An index of all RFC documents is available through the Web at http://ds.internic.net/ds/rfc-index.html. This page has pointers to all currently available RFCs (organized in groups of 100) as well as a searchable index.
Table 27.1 gives the numbers of some relevant RFCs and what they cover. Keep in mind that a given RFC might have been made obsolete by a subsequent RFC. The InterNIC site's index notes in each description whether the document has been superseded.
Table 27.1. RFC documents.
| RFC Number | Topic |
|---|---|
| 791 | The Internet Protocol (IP) |
| 793 | The Transmission Control Protocol (TCP) |
| 768 | The User Datagram Protocol (UDP) |
| 894 | Transmission of IP Datagrams over Ethernet Networks |
| 1171 | The Point-to-Point Protocol (PPP) |
| 1883 | IP Version 6 |
| 1602 | The Internet Standards Process: How an RFC Becomes a Standard |
| 1880 | Current Internet Standards |
Books
A good introduction to TCP/IP is the book TCP/IP Network Administration by Craig Hunt (O'Reilly and Associates, ISBN 0-937175-82-X). Though written as a guide for system administrators of UNIX machines, the book contains an excellent introduction to all aspects of TCP/IP, such as routing and the Domain Name System (DNS).
Another book worth checking out is The Design and Implementation of the 4.3BSD UNIX Operating System by Samuel J. Leffler et al. (Addison-Wesley, ISBN 0-201-06196-1). In addition to covering how a UNIX operating system works, it contains a chapter on the TCP/IP implementation.
If you are a beginner, another way to get started with TCP/IP is by reading Teach Yourself TCP/IP in 14 Days by Timothy Parker (Sams Publishing, ISBN 0-672-30549-6).
Summary
This chapter is a roadmap to the networking chapters that follow. It has shown what concepts you need to be familiar with before you dive into network programming in Java. You should now be comfortable with how TCP/IP networking operates in general (or at least know where to look for more information).