Submitted to IEICE Journal, October 1999
The Internet originally relied on stateless end-to-end transparency to ensure robustness, and was unable to offer quality of service guarantees. The growth of business usage has threatened this transparency, which can best be restored by deployment of IPv6 and end-to-end IP security. New real-time services require the deployment of quality of service technology, which will be based on the Integrated and Differentiated services concepts.
Quality of service, Internet transparency, Internet architecture, IPv6, differentiated services
There are two major themes in this clash of cultures: transparency and quality of service, which we shall discuss in turn after a brief introduction to the classical view of the architecture of the Internet.
The original Internet architecture and its current problems
Surprisingly, although Internet technical specifications have been published as RFCs (originally meaning "requests for comments") since the 1970s, it was only in 1996 that an RFC attempting to capture the Internet architecture was published [Carpenter96]. Earlier academic publications did cover the ground, however [Saltzer, Clark]. One aspect of the architecture is well known: all data on the Internet are sent divided up into separate Internet Protocol (IP) packets, each with its own header including destination and source addresses. These packets find their way through the network completely independently. Not only is this different from connection-oriented networks such as ISDN or ATM; it is also quite different from the other major packet-switching protocol, X.25, in which all packets belonging to the same flow must follow the same path through the network. Thus the basic architectural principle of the original Internet design is that routers in the network are stateless, that is, they contain no information about the state of any individual communication session. A generalisation of this is the end-to-end principle: good design on the Internet requires that there be no single points of failure in a session except for the computers at the two ends, so that only a failure of one of the end systems can cause the session to fail. If all intermediate systems are stateless, the end-to-end principle is automatically respected, and the network is robust against partial failure.
This basic architecture has two major problems today. The first is that the end-to-end principle is widely ignored, because it requires the network to be transparent: a packet should travel from source to destination by any available route without being changed. Unfortunately this is less and less the case. The second problem is that increasing use of the Internet for e-business and for real time services such as voice and video is leading to market demand for better quality of service for certain types of traffic. Since stateless routers have no knowledge of individual traffic flows, this is also an architectural challenge.
The reasons for the progressive loss of transparency are explored in [Carpenter99] in some detail. They are mainly the result of corporate Intranets and their security firewalls, Web proxies and caches, and most seriously Network Address Translators. All these devices intercept and interfere with Internet packets at the Intranet/Internet border, preventing their free flow from source to destination. Sometimes this interference is sufficient to prevent applications from working (for example, a so-called transparent Web cache actually modifies the source address of packets, so that a Web server is unable to identify their true source, which may cause certain types of application to crash). Network address translators have a similar impact, and they also prevent the use of network-level IP security from end to end. In general, any device that interferes with the stream of packets automatically becomes a single point of failure: thus firewalls, intended to protect a corporate network, may even be its most vulnerable point.
By proper use of Web protocols, the impact of Web proxies and caches on transparency can be minimised. For the other causes of loss of transparency, however, only one remedy is known: upgrading the Internet to the new Version 6 of the basic Internet Protocol [IPv6], and generalising the use of end-to-end IP security in place of firewall-based security. This would eliminate the need for network address translation and for traditional firewalls that interfere with traffic. Only by this technique can the stateless principle, with its beneficial properties, be restored.
It is widely recognised that the public Internet needs continued improvement to its service quality if it is to become the universal infrastructure for business and society. However, in order to discuss improvements it is first necessary to understand and characterise the current status of Internet service quality in somewhat better than anecdotal terms. One difficulty here is that the parameters that define "good" quality are not self-evident in the case of the Internet. In the case of traditional analogue or constant bit rate connections, a few simple parameters suffice to completely identify and characterise a service. In the case of the Internet there are at least two extra dimensions to consider: some degree of packet loss and delay jitter is considered normal and acceptable, and the requirements of individual sessions may vary by many orders of magnitude from moment to moment. Further, some applications are much less tolerant than others of transitory defects in the service. All in all, the behaviour of the Internet is much more statistical in nature than that of traditional networks. There is a similar contrast between the road system (statistically determined performance) and the railway system (largely deterministic performance).
Of course, individual Internet Service Providers (ISPs) can engineer their networks to be adequate for the traffic they have contracted to carry. Although service quality may then be satisfactory when both ends of a communication use a single service provider, it is more realistic to consider the case of traffic flowing across multiple service providers. In this case there is ample objective evidence of inconsistent and unreliable performance, in addition to every Internet user's personal observations. For example Telcordia [Huitema] runs a daily survey of World Wide Web response time. For randomly selected Web servers an average delay of 5 seconds to fetch a page is observed, but 5% of pages take more than 20 seconds or arrive more slowly than 3200 bits/second. (Needless to say, this is a minute fraction of Telcordia's available capacity.) About 33% of the observed delay is the time taken by the remote server itself to access the content; two-thirds of the time is network delay. In addition, some 15% of the sites totally fail to respond. These figures have been roughly constant since late 1997.
Another readily accessible statistic is the rate of packet loss. This can be measured in an approximate way with the Internet Control Message Protocol (ICMP) echo request, also known as "ping". Although this is an intrusive measure, i.e., it injects extra traffic, a small amount of additional ICMP traffic will not introduce a substantial measurement error on a busy route. It is trivial to run a ping test and derive values for the instantaneous loss rate and an estimate of the round-trip time variance. A systematic survey of multiple intercontinental routes made by the particle physics community [Williams] showed long-term average packet loss rates of up to 20% on some routes.
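The loss-rate and round-trip-variance figures just mentioned can be derived from a series of ping samples. The following sketch (an illustration with invented sample data, not a tool from the survey cited above) treats each missing echo reply as a lost packet:

```python
# Estimate the instantaneous loss rate and round-trip-time variance
# from a series of ping (ICMP echo) samples. Each sample is the
# measured RTT in milliseconds, or None if no reply arrived (a loss).

def ping_statistics(samples):
    replies = [rtt for rtt in samples if rtt is not None]
    loss_rate = 1.0 - len(replies) / len(samples)
    mean = sum(replies) / len(replies)
    # Sample variance of the round-trip time: a simple jitter estimate.
    variance = sum((rtt - mean) ** 2 for rtt in replies) / (len(replies) - 1)
    return loss_rate, mean, variance

# Invented samples: two of the ten probes went unanswered.
samples = [120.0, 130.0, None, 125.0, 180.0, None, 122.0, 128.0, 126.0, 124.0]
loss, mean_rtt, var_rtt = ping_statistics(samples)
print("loss rate: %.0f%%, mean RTT: %.1f ms" % (loss * 100, mean_rtt))
```

A real measurement would of course send the probes with a raw ICMP socket or the system ping utility; only the arithmetic is shown here.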
Of course, the situation can be dramatically better on an over-provisioned network where congestion is unknown, meaning that loss rates and jitter are negligible. However, even with the cost of network capacity decreasing, and dramatic developments such as wave division multiplexing and terabit/second routers, it is illusory to imagine that universal over-provisioning will become the norm. Firstly, exponential growth in demand is likely to continue for years to come, yet capacity can only increase in finite jumps, so periods of under-provisioning are inevitable. Secondly, we are very unlikely to be clever enough to avoid introducing choke points into the network, for example during line outages. Thirdly, if the growth levels off, economic competition will reduce profit margins. This will penalise over-provisioning, since even if capacity is cheap, it will never be free. Thus, there will always be points of congestion in the network.
What we learn from the above is that the management of congestion is a key requirement for the Internet.
Since the Internet came to life in the days when 9600 baud modems were the norm for long distance transmission, congestion is hardly a new problem, and it was in fact the main driver for the original development of the Transmission Control Protocol (TCP) [Postel]. Also, the end-to-end principle requires error recovery to be confined to the end systems rather than depending on elements of the network itself. In the case of TCP, that means that the network does not (and cannot) signal the loss of packets. A TCP sender deduces the loss of packets from the absence of acknowledgements after a certain time has expired; it then retransmits the lost packets, but if more than one packet is lost within the round-trip time, it also slows down its transmission rate. Thus, when the network is congested, such that packets are being lost, TCP slows down automatically, with no need for explicit control of the sending rate by the network.
Furthermore, there are known techniques for penalising TCP senders that fail to slow down in this way (i.e., senders that attempt to cheat the rules of TCP). In this case it is network elements, in the form of routers, that apply the penalty [Floyd], but again with no element of signalling between the network and the end-systems involved.
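The penalty mechanism of [Floyd], Random Early Detection, can be sketched as follows. The router drops packets with a probability that rises with the smoothed average queue length, so the heaviest senders are statistically the most likely to be penalised; the thresholds and weight below are illustrative values, not those of the paper:

```python
# Sketch of the drop decision in a Random Early Detection (RED)
# gateway, in the spirit of [Floyd]. Parameter values are invented.

MIN_TH, MAX_TH, MAX_P, WEIGHT = 5.0, 15.0, 0.1, 0.2

def red_drop_probability(avg_queue):
    if avg_queue < MIN_TH:
        return 0.0            # queue short: never drop
    if avg_queue >= MAX_TH:
        return 1.0            # queue long: always drop
    # In between: probability grows linearly from 0 to MAX_P.
    return MAX_P * (avg_queue - MIN_TH) / (MAX_TH - MIN_TH)

def update_average(avg_queue, instantaneous):
    # Exponentially weighted moving average smooths out short bursts.
    return (1 - WEIGHT) * avg_queue + WEIGHT * instantaneous

avg = 0.0
for queue_length in [2, 8, 12, 18, 20]:
    avg = update_average(avg, queue_length)
print("avg %.2f, drop probability %.3f" % (avg, red_drop_probability(avg)))
```

Again, no explicit signalling is involved: a compliant TCP sender interprets the early drop as congestion and slows down, while a non-compliant sender simply loses more data.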
TCP's congestion response works, and the Internet entirely depends on it. Without it, the Hypertext Transfer Protocol (HTTP) and thus the Web would be unable to function. However, when loss rates reach the high levels mentioned above due to serious congestion, the effect on Web throughput can be very severe, since TCP may well spend most of its time in slowed-down mode.
The situation is even more serious for new real-time services such as voice or video. These services do not use TCP, which was designed for pure data transfer. Often they simply attempt to run over the User Datagram Protocol (UDP), which has no acknowledgement, retransmission or slow-down mechanisms. If loss rates are high, the real-time stream will be severely disturbed and the audio or video will break up. However, the sender knows nothing of this and will just keep on sending at full speed, exacerbating the congestion that is causing the loss in the first place. Some real-time applications now use protocols such as the Real Time Streaming Protocol (RTSP) [Schulzrinne98], which itself can make use of the Real-time Transport Protocol (RTP) [Schulzrinne96]. However, these techniques do not eliminate the underlying problem: only TCP-like protocols, whose throughput responds elastically to packet loss, can live happily in a congested network.
Since the problem is not new, it is unsurprising that there have been several attempts to solve it. The original design of the Internet Protocol included a "type of service" header octet with bits to signal "low delay", "high throughput" and "high reliability". Unfortunately there was no explanation of how these properties could be implemented across a network. Some later work [Almquist] expanded these definitions, but still without a recipe for implementation. In practice, although some software sets some of these bits, they have been of little use and are generally ignored. More practically, IBM's Systems Network Architecture has long distinguished several classes of service, e.g., to give interactive transactions priority over batch operations.
The Internet Stream Protocol, known as ST [Delgrossi], was an attempt to accommodate real time traffic streams in parallel with TCP/IP traffic, by adding a second connection-oriented network protocol in parallel with IP. This has failed to grow outside a limited experimental community.
Recently, there has been a major effort within the Internet Engineering Task Force (IETF) under the name of Integrated Services (IntServ), to specify a mechanism for supporting end-to-end sessions across the Internet that require a specific quality of service (such as a given peak capacity and transmission delay). The IntServ model [Braden] requires a module in every IP router along the path that reserves resources for each session, and then ensures that each data packet in transit is checked to see what resources it is entitled to receive. The reservations are requested using a specially designed resource reservation protocol known as RSVP. An important notion in IntServ, which we shall see again later, is admission control, i.e., a process that refuses to admit traffic to the network for which insufficient resources are available. If the RSVP request fails, the session will not start (or will do so in a degraded mode). IntServ also has the notion of traffic shaping at the input to the network; packets will be spaced out in time in such a way as to correspond to the resources reserved by RSVP.
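The admission-control idea can be illustrated in a few lines. The sketch below is an illustration of the concept only, not of the RSVP protocol itself, and the class and capacity figures are invented:

```python
# Sketch of IntServ-style admission control: a reservation succeeds
# only if the requested capacity fits within what remains on the link;
# otherwise the session is refused and would either not start or start
# in a degraded mode. Note that the router must hold per-session state.

class Link:
    def __init__(self, capacity_kbps):
        self.capacity = capacity_kbps
        self.reserved = 0

    def request(self, session, kbps):
        if self.reserved + kbps > self.capacity:
            return False          # refuse: insufficient resources
        self.reserved += kbps     # per-session state kept in the router
        return True

link = Link(capacity_kbps=1000)
print(link.request("voice-call-1", 64))   # accepted
print(link.request("video-1", 950))       # refused: would exceed capacity
```

The per-session state recorded here is precisely what makes the model expensive on high-speed trunks, as discussed next.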
IntServ has many attractions, but two disadvantages. One is that it requires new software or firmware in all routers along the network path concerned. The other is that if it were to be used on major ISP trunk connections, carrying millions of packets per second, the overhead per packet of implementing the necessary checks and resource management is widely believed to be unacceptable. For these reasons, it is expected that IntServ and RSVP will initially be limited to campus and small corporate networks.
Over the last two years there has been intensive work in the IETF and in the industry on a model complementary to IntServ, known as Differentiated Services (DiffServ). This is intended to offer a simpler scenario that can be scaled up massively.
The basic principle of DiffServ is that network traffic is divided up in a coarse way into a number of "behaviour aggregates," such that all the traffic in one aggregate is treated in the same way, i.e., it experiences a single class of service. For example it might be decided to treat all electronic mail traffic as a single aggregate, all Web browsing traffic as a second aggregate, and all audio and video traffic as a third. Alternatively, an Internet Service Provider might decide to treat all traffic from Customer A as one aggregate, and all traffic from Customer B as another.
All traffic in the same behaviour aggregate is distinguished by a particular value, known as the DiffServ Code Point, set in a field of every packet's header. This value is used to select a particular behaviour at each hop from router to router; this is referred to as a Per-Hop Behaviour (PHB) to emphasise that it has no global significance. The PHBs are essentially queue-handling algorithms applied by each router, chosen to suit the type of traffic that has been injected into each behaviour aggregate. Thus, all voice traffic could experience one behaviour, and all electronic mail another, regardless of source or destination.
This is intrinsically a very simple, efficient and highly scalable model, and it builds on many years of experience with router queuing mechanisms. It also builds on the classification and admission control mechanisms developed for IntServ. In the complete picture, all data packets are classified at or near their source according to a simple classification policy, and immediately marked with the appropriate DiffServ Code Point. At the same time, admission control is performed (i.e., excess traffic is either delayed or discarded). From this point on, each router along the data path applies the PHB that corresponds to the code point carried in each packet, in particular by delivering packets from different behaviour aggregates into different output queues, which receive better or worse access to the output link.
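The two halves of the picture, marking at the edge and queue selection at each hop, can be sketched as follows. The code-point values and traffic classes here are invented for illustration; the real code points are standardised by the IETF:

```python
# Sketch of DiffServ marking and per-hop behaviour selection.
# Code-point values and class names are invented for the example.

# Edge classifier: map a packet's application type to a code point.
CODE_POINTS = {"telephony": 46, "browsing": 26, "email": 0}

def mark(packet):
    packet["dscp"] = CODE_POINTS.get(packet["app"], 0)
    return packet

# Per-hop behaviour: each router maps the code point to an output
# queue. No per-session state is needed, only per-aggregate queues.
QUEUES = {46: "priority", 26: "assured", 0: "best-effort"}

def select_queue(packet):
    return QUEUES[packet["dscp"]]

pkt = mark({"app": "telephony", "payload": b"..."})
print(pkt["dscp"], select_queue(pkt))
```

Contrast this with the IntServ model: the lookup is a fixed, small table indexed by the code point, regardless of how many sessions the trunk carries.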
In the integrated services model, a given communications session gets its own individual share of network capacity. By contrast, in the differentiated services model, it is a behaviour aggregate that gets a specified share; all the sessions classified into the same aggregate must divide that share among themselves. This is the price of massive scalability. As a result, the end-to-end service seen by each individual session depends on the characteristics of the behaviour aggregate into which it is classified.
For example, one could arrange for all Web browsing traffic to be in one aggregate, all e-mail in a second, and all IP telephony in a third. Note that this separates TCP traffic from real-time traffic, and further separates interactive TCP (browsing) from background TCP (e-mail). If one also arranges that the telephony aggregate has enough capacity to support a predefined maximum number of calls, and that the interactive aggregate has the majority share of the remaining capacity, then two goals have been achieved. Firstly, the conflicting congestion responses of TCP and non-TCP traffic have been separated, making congestion control much more tractable. Secondly, network capacity has been shared out in a way that suits the different applications.
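The capacity split just described can be made concrete with a small worked example. All the numbers here are invented for illustration (a 10 Mbit/s link, 64 kbit/s per call, and an 80% majority share for the interactive aggregate):

```python
# Worked example of the three-aggregate capacity split described in
# the text, with invented figures: telephony is provisioned for a
# fixed maximum number of calls, interactive browsing takes the
# majority of the remainder, and e-mail takes what is left.

LINK_KBPS = 10_000        # total link capacity (illustrative)
CALL_KBPS = 64            # per-call rate (illustrative)
MAX_CALLS = 30            # admission-control limit on simultaneous calls
INTERACTIVE_SHARE = 0.8   # majority share of the remaining capacity

telephony = CALL_KBPS * MAX_CALLS            # fixed reservation for calls
remainder = LINK_KBPS - telephony            # capacity left for TCP traffic
interactive = INTERACTIVE_SHARE * remainder  # browsing aggregate
email = remainder - interactive              # background aggregate
print(telephony, interactive, email)
```

The telephony figure doubles as the admission-control threshold: the thirty-first simultaneous call would be refused rather than allowed to degrade the first thirty.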
Another way to utilise behaviour aggregates is in effect a generalisation of the concept of a virtual private network. An ISP might choose to define several classes of service which are technically identical, but sell them to individual major customers; the benefit will be similar to an IP-based VPN, but with capacity management more like that of Frame Relay or ATM. An obvious variant of this is for the ISP to offer alternative grades of service at different prices.
At the time of writing, all the basic standards for DiffServ have been defined by the IETF [DIFFSERV]. Work is also in progress on standards and technology to allow the management of a differentiated services network based on management policies stored in a central repository such as an LDAP directory. Discussion has also started on standards for authentication, authorisation and accounting in an environment where different users are entitled to use, and pay for, different grades of service. Additional work will be needed, once some practical experience has been gained, on standards for end-to-end service quality.
In many ways, the Internet is now facing the issues of large-scale service management that have existed for many years in circuit-oriented telecommunications. The issues are, however, complicated by the special nature of packet-based services, the highly competitive Internet environment, and the absence of a straightforward universal charging model. Precise service level agreements are needed, both between customers and service providers, and between service providers exchanging traffic among themselves. These issues are all interrelated, and there is much to be learnt about how to achieve such agreements in practice.
The use of the Internet for e-business, and its future use as the basic infrastructure for real-time audio and video services, are forcing a reconciliation of the traditional stateless, transparent model of the Internet Protocol with the fragmented world of Intranets and with the need for end-to-end service quality. The first point requires general adoption of IPv6 and of end-to-end IP security. The second is being tackled by a variety of techniques.
The Internet of course poses many of the same operational challenges as any telecommunications infrastructure, with respect to round-the-clock support, incident management, customer relationship management, and capacity planning. However, it also has unique challenges posed by the nature of connectionless packet-switching, in particular the inevitability of congestion, and competition for capacity between qualitatively different classes of traffic. Although TCP has provided effective congestion management for almost twenty years, new technologies - Integrated Services and Differentiated Services - are emerging to offer more sophisticated options at local and global scale.
Acknowledgements: Material from a paper presented by the author at the Telecom 99 Infrastructure Forum (Geneva, October 1999) has been included. Useful comments from James Kelly (IBM) and Joel Mambretti (iCAIR, Northwestern University) are gratefully acknowledged.
[Almquist] P. Almquist, Type of Service in the Internet Protocol Suite, RFC 1349, July 1992, available from http://www.rfc-editor.org/
[Braden] R. Braden, D. Clark, S. Shenker, Integrated Services in the Internet Architecture: an Overview, RFC 1633, June 1994, available from http://www.rfc-editor.org/
[Carpenter96] B.E. Carpenter (ed.), Architectural Principles of the Internet, RFC 1958, June 1996, available from http://www.rfc-editor.org/
[Carpenter99] B.E. Carpenter, Internet Transparency, 1999, work in progress, will be available from http://www.rfc-editor.org/
[Clark] D.D. Clark, The Design Philosophy of the DARPA Internet Protocols, Proc. SIGCOMM '88, ACM CCR Vol 18, Number 4, August 1988, pages 106-114 (reprinted in ACM CCR Vol 25, Number 1, January 1995, pages 102-111).
[Delgrossi] L. Delgrossi, L. Berger (eds.), Internet Stream Protocol Version 2 (ST2) Protocol Specification - Version ST2+, RFC 1819, August 1995, available from http://www.rfc-editor.org/
[DIFFSERV] For all current diffserv documents, see http://www.ietf.org/html.charters/diffserv-charter.html
[Floyd] Floyd, S., and Jacobson, V., Random Early Detection gateways for Congestion Avoidance. IEEE/ACM Transactions on Networking, Volume 1, Number 4, August 1993, pp. 397-413.
[Huitema] C. Huitema, Internet quality of service assessment, updated daily at ftp://ftp.telcordia.com/pub/huitema/stats/quality_today.html
[IPv6] See for example the web site http://playground.sun.com/pub/ipng/html/ipng-main.html
[Postel] J. Postel, Transmission Control Protocol, RFC 793, September 1981, available from http://www.rfc-editor.org/
[Saltzer] J.H. Saltzer, D.P. Reed, D.D. Clark, End-To-End Arguments in System Design, ACM TOCS, Vol 2, Number 4, November 1984, pp 277-288.
[Schulzrinne98] H. Schulzrinne, A. Rao, R. Lanphier, Real Time Streaming Protocol (RTSP), RFC 2326, April 1998, available from http://www.rfc-editor.org/
[Schulzrinne96] H. Schulzrinne, S. Casner, R. Frederick, V. Jacobson, RTP: A Transport Protocol for Real-Time Applications, RFC 1889, January 1996, available from http://www.rfc-editor.org/
[Williams] D.O. Williams, Status Report of ICFA Networking Task Force, CERN, July 1998, available at http://nicewww.cern.ch/~davidw/icfa/July98Report.html
Brian E. Carpenter is Program Director, Internet Standards and Technology, for IBM. He is currently based at iCAIR, the international Center for Advanced Internet Research, which is sponsored by IBM at Northwestern University in Evanston, Illinois. He is also Chief Architect of iCAIR and teaches a course at Northwestern.
Previously he led the networking group at CERN, the European Laboratory for Particle Physics, in Geneva, Switzerland, from 1985 to 1996. This followed ten years' experience in software for process-control systems at CERN, which was interrupted by three years teaching undergraduate computer science at Massey University in New Zealand.
He holds a first degree in physics and a Ph.D. in computer science, and is an M.I.E.E. He is Chair of the Internet Architecture Board and an active participant in the Internet Engineering Task Force, where he co-chairs the Differentiated Services working group. He is also a member of the Board of the Unicode Consortium.