Internet socket

From Wikipedia, the free encyclopedia

In computer networking, an Internet socket (commonly called a network socket or simply a socket) is the endpoint of a bidirectional communication flow across an Internet Protocol-based computer network, such as the Internet. Internet sockets (in plural) are also an application programming interface (API) in an operating system, used for inter-process communication. Internet sockets constitute a mechanism for delivering incoming data packets to the appropriate application process or thread, based on a combination of local and remote IP addresses and port numbers. Each socket is mapped by the operating system to a communicating application process or thread.

A socket address is the combination of an IP address (the location of the computer) and a port (which is mapped to the application process) into a single identity, much like one end of a telephone connection is identified by the combination of a phone number and a particular extension line at that location.

An Internet socket is characterized by a unique combination of the following:

  • Protocol (TCP, UDP or raw IP). Consequently, TCP port 53 is not the same socket as UDP port 53.
  • Local socket address (Local IP address and port number)
  • Remote socket address (Only for established TCP sockets. As discussed in the Client-Server section below, this is necessary since a TCP server may serve several clients concurrently. The server creates one socket for each client, and these sockets share the same local socket address.)
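The role of the protocol in this list can be illustrated with a minimal Python sketch. It assumes the loopback address 127.0.0.1 is available and that the UDP port with the chosen number is not already taken by another program; the function name bind_tcp_and_udp_same_port is made up for this illustration.

```python
import socket

def bind_tcp_and_udp_same_port(host="127.0.0.1"):
    """Bind one TCP and one UDP socket to the same port number on one host.

    Hypothetical helper for illustration only.
    """
    tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        tcp.bind((host, 0))            # port 0 asks the OS for any free TCP port
        port = tcp.getsockname()[1]
        # Assumption: the same number is also free in the separate UDP port space.
        udp.bind((host, port))
        return tcp.getsockname(), udp.getsockname()
    finally:
        tcp.close()
        udp.close()

if __name__ == "__main__":
    tcp_addr, udp_addr = bind_tcp_and_udp_same_port()
    print("TCP socket bound to", tcp_addr)
    print("UDP socket bound to", udp_addr)
```

Both bind() calls succeed even though the two local socket addresses are identical, because the protocol is part of what distinguishes one socket from another.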

The operating system forwards incoming IP packets to the corresponding application or service process by extracting the socket address information from the IP and transport protocol headers.

Within the operating system and the application that created a socket, the socket is referred to by a unique integer number called socket identifier or socket number.
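As a rough illustration, Python exposes this integer through the fileno() method of a socket object. On Unix-like systems the value is a file descriptor; on Windows it is a socket handle. The function name socket_numbers is an assumption made for this sketch.

```python
import socket

def socket_numbers():
    """Create two sockets and return the integers the OS assigned to them."""
    a = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    b = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # fileno() returns the socket number used by the process and the OS.
        return a.fileno(), b.fileno()
    finally:
        a.close()
        b.close()

if __name__ == "__main__":
    first, second = socket_numbers()
    print("socket numbers:", first, second)
```

The two numbers differ while both sockets are open; after close() the number may be reused by the operating system for a later socket or file.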

In Internet standards, in many textbooks, as well as in this article, the term "socket" refers to an entity that is uniquely identified by the socket number. In other textbooks[1], the term "socket" refers to a local socket address, i.e. a "combination of an IP address and a port number". In the original definition of socket given in RFC 147, as it related to the ARPA network in 1971, a socket was "specified as a 32 bit number with even sockets identifying receiving sockets and odd sockets identifying sending sockets." Today, however, sockets are bidirectional.

On Unix-like and Windows NT based operating systems the netstat command line tool can list all currently listening and established sockets and related information.

Socket pairs

Communicating local and remote sockets are called socket pairs. Each socket pair is described by a unique 4-tuple consisting of source and destination IP addresses and port numbers, i.e. of local and remote socket addresses.[2][3] As seen in the discussion below, in the TCP case, each unique socket pair 4-tuple is assigned a socket number, while in the UDP case, each unique local socket address is assigned a socket number.
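The 4-tuple can be inspected from both ends of a TCP connection. The following Python sketch, which assumes a loopback connection within a single process and uses the made-up function name socket_pair_tuples, combines getsockname() (local socket address) and getpeername() (remote socket address) on each side.

```python
import socket

def socket_pair_tuples():
    """Return the 4-tuple as seen by the client and by the server side."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
        listener.listen(1)
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
            client.connect(listener.getsockname())
            server_side, _ = listener.accept()
            with server_side:
                # (local IP, local port, remote IP, remote port) on each side
                client_view = client.getsockname() + client.getpeername()
                server_view = server_side.getsockname() + server_side.getpeername()
    return client_view, server_view

if __name__ == "__main__":
    client_view, server_view = socket_pair_tuples()
    print("client side:", client_view)
    print("server side:", server_view)
```

Each side sees the same four values with the local and remote halves swapped.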

Socket types

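There are several Internet socket types:

  • Datagram sockets, also known as connectionless sockets, which use the User Datagram Protocol (UDP)
  • Stream sockets, also known as connection-oriented sockets, which use the Transmission Control Protocol (TCP)
  • Raw sockets, which bypass the transport layer and give the application direct access to IP packets, including their headers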

There are also non-Internet sockets, implemented over other transport protocols, such as Systems Network Architecture (SNA).[5] See also Unix domain sockets (UDS), for internal inter-process communication.

Socket states and the client-server model

Computer processes that provide application services are called servers, and create sockets on start up that are in listening state. These sockets wait for connection requests from client programs. For a listening TCP socket, the remote address presented by netstat may be denoted 0.0.0.0 and the remote port number 0.

A TCP server may serve several clients concurrently, by creating a child process for each client and establishing a TCP connection between the child process and the client. A unique dedicated socket is created for each connection. Such a socket is in the established state once a socket-to-socket virtual connection or virtual circuit (VC), also known as a TCP session, has been set up with the remote socket, providing a duplex byte stream.

Other possible TCP socket states presented by the netstat command are Syn-sent, Syn-Recv, Fin-wait1, Fin-wait2, Time-wait, Close-wait and Closed which relate to various start up and shutdown steps.[6]

A server may create several concurrently established TCP sockets with the same local port number and local IP address, each mapped to its own server-child process, serving its own client process. They are treated as different sockets by the operating system, since the remote socket address (the client IP address and/or port number) is different, i.e. since they have different socket pair tuples.
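A minimal Python sketch of this situation follows. For brevity it handles both connections in one process instead of creating a child process per client, and it assumes loopback connections; the function name accept_two_clients is hypothetical.

```python
import socket

def accept_two_clients():
    """Accept two clients and return the local addresses, remote addresses
    and socket numbers of the two established server-side sockets."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))
        listener.listen(2)
        server_addr = listener.getsockname()
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as c1, \
                socket.socket(socket.AF_INET, socket.SOCK_STREAM) as c2:
            c1.connect(server_addr)
            c2.connect(server_addr)
            conn1, _ = listener.accept()   # one new socket per connection
            conn2, _ = listener.accept()
            with conn1, conn2:
                local = [conn1.getsockname(), conn2.getsockname()]
                remote = [conn1.getpeername(), conn2.getpeername()]
                numbers = [conn1.fileno(), conn2.fileno()]
    return local, remote, numbers

if __name__ == "__main__":
    local, remote, numbers = accept_two_clients()
    print("local addresses: ", local)
    print("remote addresses:", remote)
    print("socket numbers:  ", numbers)
```

The two accepted sockets report the same local socket address but different remote socket addresses, and each has its own socket number.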

A UDP socket cannot be in an established state, since UDP is connectionless. Therefore, netstat does not show the state of a UDP socket. A UDP server does not create new child processes for every concurrently served client, but the same process handles incoming data packets from all remote clients sequentially through the same socket. This implies that UDP sockets are not identified by the remote address, but only by the local address, although each message has an associated remote address.
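The UDP case can be sketched in Python as follows. A single server socket receives datagrams from two clients and uses the remote address returned by recvfrom() to reply to each one. The sketch assumes loopback delivery within one process; the function name udp_echo_round and the two-second timeout are choices made for this illustration.

```python
import socket

def udp_echo_round(messages=(b"from A", b"from B")):
    """One UDP server socket echoes datagrams from several client sockets."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as server:
        server.bind(("127.0.0.1", 0))
        server.settimeout(2.0)             # avoid blocking forever on loss
        server_addr = server.getsockname()
        clients = []
        try:
            for payload in messages:
                c = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
                c.bind(("127.0.0.1", 0))   # give each client its own address
                c.settimeout(2.0)
                c.sendto(payload, server_addr)
                clients.append(c)
            senders = []
            for _ in clients:
                # Same socket for every client; the sender is reported per message.
                data, remote = server.recvfrom(2048)
                senders.append(remote)
                server.sendto(data, remote)
            replies = [c.recvfrom(2048)[0] for c in clients]
            client_addrs = [c.getsockname() for c in clients]
        finally:
            for c in clients:
                c.close()
    return client_addrs, senders, replies

if __name__ == "__main__":
    client_addrs, senders, replies = udp_echo_round()
    print("clients:", client_addrs)
    print("senders seen by server:", senders)
    print("replies:", replies)
```

No accept() call and no per-client socket is involved: the server distinguishes clients only by the remote address attached to each incoming datagram.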

Implementation issues

Sockets are usually implemented by an API library such as Berkeley sockets, first introduced in 1983. Most implementations are based on Berkeley sockets, for example Winsock, introduced in 1991. Other socket API implementations exist, such as the STREAMS-based Transport Layer Interface (TLI).

Development of application programs that utilize this API is called socket programming or network programming.

These are examples of functions or methods typically provided by the API library:

  • socket() creates a new socket of a certain socket type, identified by an integer number, and allocates system resources to it.
  • bind() is typically used on the server side, and associates a socket with a socket address structure, i.e. a specified local port number and IP address.
  • listen() is used on the server side, and causes a bound TCP socket to enter listening state.
  • connect() is used on the client side, and assigns a free local port number to a socket. In case of a TCP socket, it causes an attempt to establish a new TCP connection.
  • accept() is used on the server side. It accepts a received incoming attempt to create a new TCP connection from the remote client, and creates a new socket associated with the socket address pair of this connection.
  • send() and recv(), or write() and read(), or sendto() and recvfrom(), are used for sending and receiving data to/from a remote socket.
  • close() causes the system to release resources allocated to a socket. In case of TCP, the connection is terminated.
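The typical order of these calls can be sketched with Python's socket module, whose methods mirror the Berkeley function names. This is a minimal illustration rather than a production server: it assumes a loopback connection, serves a single client in a helper thread, and assumes the short message arrives in one recv() call. The names serve_one_client and run_exchange are made up for the example.

```python
import socket
import threading

def serve_one_client(listener):
    """Server side: accept one connection and reply with the data in upper case."""
    conn, _peer = listener.accept()        # accept(): new socket for this connection
    with conn:                             # leaving the block calls close()
        data = conn.recv(1024)             # recv()
        conn.sendall(data.upper())         # send(), repeated until all bytes are sent

def run_exchange(message=b"hello"):
    """Run one request/reply exchange and return the reply seen by the client."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)    # socket()
    with listener:
        listener.bind(("127.0.0.1", 0))    # bind(): port 0 lets the OS choose
        listener.listen(1)                 # listen(): enter listening state
        worker = threading.Thread(target=serve_one_client, args=(listener,))
        worker.start()
        client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # socket()
        with client:
            client.connect(listener.getsockname())  # connect(): OS assigns local port
            client.sendall(message)
            reply = client.recv(1024)
        worker.join()
    return reply

if __name__ == "__main__":
    print(run_exchange(b"hello"))
```

The server calls socket(), bind(), listen() and accept(), while the client needs only socket() and connect(); both sides then exchange data and finally close their sockets.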

Socket support in network equipment

Network equipment such as routers and switches traditionally do not deal with the socket identifiers of the routed or switched data. However, stateful network firewalls and Network Address Translation proxy servers automatically keep track of all active socket pairs, UDP as well as TCP, based on certain time-out settings. Also in fair queuing, layer 3 switching and Quality of Service support in routers, packet flows may be identified by extracting information about the socket pairs.

Raw sockets are typically available in network equipment, and used for routing protocols such as IGMP and OSPF, and in ICMP.
