This is a brief introduction to the Java Socket API. Java Sockets are a mechanism for communication over the Internet. All the classes discussed in this tutorial are in the java.net package.
A socket is an endpoint for communication. There are two kinds of socket, depending on whether one wishes to use a connectionless or a connection-oriented protocol. The connectionless communication protocol of the Internet is called UDP. The connection-oriented communication protocol of the Internet is called TCP. UDP sockets are also called datagram sockets.
Each socket is identified on the Internet with two numbers. The first number is the Internet address (or IP address), which is a 32-bit integer in IP version 4 (IPv4). The second number is a 16-bit integer called the port of the socket. The IP address identifies one machine (also called a host or node) on the Internet. The newer version of the Internet protocol (IP version 6 or IPv6) increases the size of the IP address to 128 bits. Java encapsulates the concept of an IP address with the class InetAddress. Java represents ports with 32-bit integers (the int type), even though a 16-bit integer would suffice.
While IP addresses uniquely identify machines, the reverse is not true. A machine can have several IP addresses. Note that each 16-bit port number actually represents two distinct ports: a UDP port and a TCP port.
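The addressing concepts above can be sketched in a few lines. This is a minimal illustration, not a definitive recipe: the class name AddressDemo, its helper methods, the documentation-only address 192.0.2.1 and the port 8080 are all assumptions made for the example. It also uses InetSocketAddress, a java.net class not discussed in this tutorial, to pair an address with a port.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

// Hypothetical demo class; the names are illustrative only.
class AddressDemo {
    // Wraps raw address bytes (4 for IPv4, 16 for IPv6) in an InetAddress.
    // getByAddress performs no name lookup, so no network access is needed.
    static InetAddress fromBytes(int... octets) throws UnknownHostException {
        byte[] raw = new byte[octets.length];
        for (int i = 0; i < octets.length; i++) {
            raw[i] = (byte) octets[i];
        }
        return InetAddress.getByAddress(raw);
    }

    // Size of the address in bits: 32 for IPv4, 128 for IPv6.
    static int bits(InetAddress address) {
        return address.getAddress().length * 8;
    }

    public static void main(String[] args) throws UnknownHostException {
        InetAddress v4 = fromBytes(192, 0, 2, 1); // address reserved for documentation
        InetAddress v6 = fromBytes(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1);
        int port = 8080; // assumed port; a plain int, although only 0..65535 is valid

        // An address plus a port names one socket endpoint.
        InetSocketAddress endpoint = new InetSocketAddress(v4, port);
        System.out.println(v4.getHostAddress() + " has " + bits(v4) + " bits");
        System.out.println(v6.getHostAddress() + " has " + bits(v6) + " bits");
        System.out.println("endpoint port: " + endpoint.getPort());
    }
}
```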
The simplest kind of socket is a UDP socket. Such a socket is analogous to a mailbox. Data is sent and received in units called datagrams (analogous to letters and parcels) from any UDP socket to any other UDP socket. For this reason UDP sockets are also called datagram sockets. Java encapsulates the concept of a UDP socket with the class DatagramSocket, and the concept of a datagram with the class DatagramPacket.
A DatagramPacket consists of a fixed-length array of bytes together with an IP address and port. When one sends a DatagramPacket, its array of bytes is sent to the socket with the specified IP address and port (if it exists). When one receives a DatagramPacket, the data is copied into its array of bytes, and the IP address and port of the sender are copied into the IP address and port of the DatagramPacket.
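As a rough sketch of the send and receive steps just described, the following example passes one datagram between two UDP sockets on the same machine. The class name UdpDemo, the method sendAndReceive, the 512-byte buffer and the two-second timeout are assumptions chosen for illustration; both sockets bind to port 0 so that the system picks unused ports.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are illustrative only.
class UdpDemo {
    // Sends one datagram from a sender socket to a receiver socket on the
    // loopback interface and returns the text that arrived.
    static String sendAndReceive(String message) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (DatagramSocket receiver = new DatagramSocket(0, loopback);
             DatagramSocket sender = new DatagramSocket(0, loopback)) {
            receiver.setSoTimeout(2000); // do not wait forever if the datagram is lost

            // Outgoing packet: the bytes plus the destination address and port.
            byte[] payload = message.getBytes(StandardCharsets.UTF_8);
            sender.send(new DatagramPacket(payload, payload.length,
                    loopback, receiver.getLocalPort()));

            // Incoming packet: an empty fixed-length buffer that receive() fills in.
            byte[] buffer = new byte[512];
            DatagramPacket incoming = new DatagramPacket(buffer, buffer.length);
            receiver.receive(incoming);

            // After receive(), the packet carries the sender's address and port.
            if (incoming.getPort() != sender.getLocalPort()) {
                throw new IOException("datagram came from an unexpected port");
            }
            return new String(incoming.getData(), incoming.getOffset(),
                    incoming.getLength(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(sendAndReceive("hello"));
    }
}
```

Because the buffer has a fixed length, a datagram longer than 512 bytes would be truncated in this sketch; a real program would size the buffer for the largest message it expects.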
A TCP socket is analogous to (one side of) a telephone connection. TCP sockets are of two kinds: ordinary sockets and server sockets. A server socket is never used for transmission of information; its sole purpose is to listen for incoming connection requests. When a client process wishes to make a connection with a server, it first constructs an ordinary socket and then asks for a connection with the server. When a server socket receives a connection request, it constructs an ordinary socket that completes the connection. This new socket uses the same local port number as the server socket; the connection is told apart from others by the address and port of the client at the other end. The server socket then goes back to listening for connections. Once the connection is established, the two connected sockets can communicate with each other using ordinary read and write operations in either direction.
Java encapsulates the concept of an ordinary TCP socket with the class Socket, and the concept of a server socket with the class ServerSocket. Reading from and writing to a socket are encapsulated in Java using the InputStream and OutputStream classes, respectively.
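The client and server roles can be illustrated with a small echo exchange on one machine. This is only a sketch under stated assumptions: the class name TcpEchoDemo, the method echoOnce, the line-based text protocol and the two-second timeout are invented for the example, and the server side runs in a background thread so that the whole exchange fits in one program.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are illustrative only.
class TcpEchoDemo {
    // Starts a one-shot echo server on the loopback interface, connects a
    // client to it, sends one line, and returns the line the server sent back.
    static String echoOnce(String line) throws IOException, InterruptedException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the system for any unused port.
        try (ServerSocket listener = new ServerSocket(0, 1, loopback)) {
            Thread server = new Thread(() -> {
                // accept() blocks until a client connects, then returns an ordinary Socket.
                try (Socket peer = listener.accept();
                     BufferedReader in = new BufferedReader(new InputStreamReader(
                             peer.getInputStream(), StandardCharsets.UTF_8));
                     PrintWriter out = new PrintWriter(new OutputStreamWriter(
                             peer.getOutputStream(), StandardCharsets.UTF_8), true)) {
                    out.println(in.readLine()); // echo one line back to the client
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            server.start();

            // The client constructs an ordinary socket and asks for a connection.
            try (Socket client = new Socket(loopback, listener.getLocalPort())) {
                client.setSoTimeout(2000); // do not wait forever for the reply
                PrintWriter out = new PrintWriter(new OutputStreamWriter(
                        client.getOutputStream(), StandardCharsets.UTF_8), true);
                BufferedReader in = new BufferedReader(new InputStreamReader(
                        client.getInputStream(), StandardCharsets.UTF_8));
                out.println(line);
                String reply = in.readLine();
                server.join();
                return reply;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(echoOnce("ping"));
    }
}
```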
One special case of a TCP socket is a socket that is communicating with a Web server. Web servers normally listen on TCP port 80. One could construct a socket and connect directly to a Web server if one knows its IP address. However, when one connects to a Web server, it is usually because one is interested in a document. Documents on the Internet are identified in a very different manner from the one used for identifying sockets on the Internet. The standard identifier for a document on the Internet is its Uniform Resource Locator (URL). Java encapsulates the concept of a URL with the class URL.
When one constructs a URL object, the URL is checked to make sure that it is a well-formed URL. Interacting with the URL requires that one first establish a connection with the Web server that is responsible for the document identified by the URL.
A connection object is obtained by invoking the openConnection method on the URL object. This method returns an object of type URLConnection, but it does not yet contact the Web server. The connection to the Web server is requested by calling connect on the URLConnection object; the name resolution necessary to determine the IP address of the Web server and the construction of the TCP socket take place at that point. Methods that need the connection, such as getInputStream, call connect implicitly if it has not been called already.
Reading from the document identified by the URL and writing to it (when defined) are encapsulated in Java using the InputStream and OutputStream classes, respectively.
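The sequence described above (construct the URL, call openConnection, call connect, then read through an InputStream) might look like the following sketch. To stay self-contained it reads a temporary local file through a file: URL rather than a document on a real Web server; that substitution, the class name UrlDemo, its helper methods and the sample URLs are all assumptions of the example. The URL objects are obtained through URI.toURL(), which is one way among several to construct them.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; the names are illustrative only.
class UrlDemo {
    // Reports whether a URL object can be constructed from the given text.
    static boolean isWellFormed(String spec) {
        try {
            URI.create(spec).toURL();
            return true;
        } catch (MalformedURLException | IllegalArgumentException e) {
            return false; // unknown protocol, or not a usable absolute URI
        }
    }

    // Fetches the whole document behind the URL as text.
    static String fetch(URL url) throws IOException {
        URLConnection connection = url.openConnection(); // obtain the connection object
        connection.connect();                            // request the connection
        try (InputStream in = connection.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        // A temporary file stands in for the document (assumption of this sketch).
        Path document = Files.createTempFile("document", ".txt");
        Files.writeString(document, "hello from a URL");
        try {
            System.out.println(isWellFormed("http://example.com/index.html"));
            System.out.println(isWellFormed("nosuchscheme://example.com/"));
            System.out.println(fetch(document.toUri().toURL()));
        } finally {
            Files.delete(document);
        }
    }
}
```

The same fetch method should work for an http URL as well, provided the machine can reach the server.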
If the URL specifies the http protocol, then the URLConnection will actually be an object of the subclass HttpURLConnection, which has additional methods specific to the HTTP protocol. Other protocols are also supported, but few have a public subtype associated with them; within java.net the only other one is JarURLConnection, for jar URLs.
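A program can check which kind of connection object it received before using the HTTP-specific methods. The sketch below is illustrative only: the class name HttpConnectionDemo, the helper asHttp and the sample URLs are assumptions. It inspects the returned objects without ever calling connect, so it should not need network access.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical demo class; the names are illustrative only.
class HttpConnectionDemo {
    // Returns the HTTP-specific view of the connection for this URL,
    // or null when the URL's protocol does not produce an HttpURLConnection.
    static HttpURLConnection asHttp(URL url) throws IOException {
        URLConnection connection = url.openConnection();
        return (connection instanceof HttpURLConnection)
                ? (HttpURLConnection) connection
                : null;
    }

    public static void main(String[] args) throws IOException {
        URL page = URI.create("http://example.com/index.html").toURL();
        HttpURLConnection http = asHttp(page);

        // setRequestMethod is one of the methods that only HttpURLConnection has.
        http.setRequestMethod("HEAD");
        System.out.println(http.getRequestMethod());

        // A file: URL yields a connection that is not an HttpURLConnection.
        URL file = URI.create("file:///tmp/example.txt").toURL();
        System.out.println(asHttp(file) == null);
    }
}
```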
Ken Baclawski