DiSp: An Architecture for supporting
Differentiated Services in the Internet1
Anshul Kantawala (anshul at arl.wustl.edu)
Samphel Norden (samphel at arl.wustl.edu)
Ken Wong (kenw at arl.wustl.edu)
Guru Parulkar (guru at arl.wustl.edu)
Applied Research Laboratory
Washington University in St. Louis
St. Louis Mo. 63130, USA
Ph: 314-935-4855, Fax: 314-935-7302
1 This work was supported in part by NSF grant
ANI-9714698 and by Intel.
Abstract
In this paper, we propose DiSp (Differentiated Services over IP), a
new framework for supporting differentiated services over the Internet.
DiSp is different from the current IETF proposal for DiffServ but still
maintains the goals of DiffServ by moving complexity from the internal
routers out to the edge routers of DiffServ clouds and Autonomous Systems.
DiSp supports three classes of services: real-time, statistical bandwidth
and best-effort. The admission control policy for the real-time and statistical
flows allows fixed delay bound guarantees to be given to QoS applications.
We also discuss how our architecture can easily support important applications
such as Virtual Private Networks.
1. Introduction
The current Internet supports only best-effort service irrespective of
the characteristics of the application that uses the service. But applications
such as IP telephony, video-on-demand, video-conferencing and other real-time
applications require end-to-end QoS (Quality of Service) support. Furthermore,
different applications require different transmission guarantees. For example,
video-on-demand applications can tolerate large delays but require bandwidth
guarantees. However, IP telephony is delay intolerant and requires
more comprehensive guarantees on bandwidth and delay. Thus, there
is a need to support service discrimination by explicit resource allocation
and scheduling in the network. Current research on QoS based networks has
resulted in the development of the Integrated Services (IntServ) architecture
using the RSVP signaling protocol for signaling per-flow requirements to
the network. IntServ is used to quantify these QoS requirements using an
admission-control-based approach. However, IntServ suffers from
scalability,
complexity
and deployment problems.
These deficiencies have led to the development of the beginnings of
an alternative QoS delivery model known as Differentiated Services (DiffServ)
[1,
2]. To address the scalability
issue, DiffServ aggregates flows into service classes rather than maintaining
per flow state. Furthermore, QoS requirements are specified out-of-band
removing the necessity for a signaling protocol such as RSVP. Packet classification
is based on the setting of bits in the TOS byte of the IP header.
Flow aggregation in DiffServ has several beneficial consequences.
First, DiffServ routers map a large number of flows to a small number of
per-hop behaviors. Thus, instead of every router having to manage individual
flows, only the edge routers need to be concerned with QoS. Second,
aggregation facilitates the construction of ``end-to-end'' services by
linking multiple autonomous domains together using simplified service agreements
at the boundaries of the domains.
DiffServ is still in its infancy and has not yet matured into a service
framework that can satisfy the diverse application requirements. There
are many issues to be resolved: 1) precise service class definitions, 2)
admission control policies, 3) strategies for policing and shaping of aggregate
flows and 4) congestion handling mechanisms. Our architectural framework
attempts to tackle some of these issues and supports two fundamental service
enhancements: 1) receiver subscriptions, and 2) statistical bandwidth guarantees.
We describe a new architecture called DiSp (Differentiated Services
over IP), that builds on the basic DiffServ idea of flow aggregation to
provide user-controlled traffic services. DiSp has four key features: 1)
it has three service classes: real-time (RT), statistical-bandwidth (SB)
and best-effort (BE), with detailed profile specifications; 2) it has mechanisms
for policing and shaping aggregate flows; 3) although real-time flows are
treated in an aggregate manner, DiSp provides service guarantees on a per-flow
basis; (We note that, in keeping with the DiffServ ideals, DiSp does not
maintain per-flow state information in ANY router and uses simple priority
scheduling mechanisms among the three classes.) 4) DiSp uses efficient
monitoring mechanisms that can provide accurate feedback for congestion
control and overall network management. The use of our proposed signaling
protocol facilitates third party negotiations which are essential for network
configuration, management and provisioning.
Our goal is to define and support a model that allows the seamless integration
of our proposed DiffServ architecture with IntServ, since both these models
are complementary. We highlight the effectiveness of our approach by considering
a challenging task of resource allocation for real-time applications using
Virtual Private Networks (VPN).
The rest of the paper is organized as follows. In Section 2, we
will present and motivate our proposed approach. Section 3 deals with details
of our proposed architecture, followed by the details of the admission
control algorithm in Section 4. We discuss our congestion control policies
in Section 5 and support for individual real-time multicast flows in Section
6. We then describe issues regarding resource allocation for the
statistical bandwidth class and a target application, VPN, in Section 7.
We finally present related work and conclude the paper in Sections 8 and
9 respectively.
2. Overview of DiSp
DiSp supports two fundamental service enhancements: 1) receiver subscriptions,
and 2) statistical bandwidth guarantees. With receiver subscriptions, each
receiver (or set of receivers) negotiates for a fixed delay bound on a
flow originating from a remote source. Unlike other DiffServ proposals
[1, 4] which are based
on source reservations, our model emphasizes support for applications (e.g.
Video on Demand (VOD), stock quotes) where receiver reservations are more
appropriate. In these real-time applications, the sender should not be
forced to reserve and pay for resources when there are no receivers present.
Other motivations for adding receiver-based reservation control can be
found in [5].
With statistical bandwidth guarantees, each AS can negotiate an aggregate
bandwidth profile for its high bandwidth flows. One of the hard problems
for supporting such a service class is admission control and resource reservation
for the aggregate flows without prior knowledge of the routes taken by
individual flows within the aggregate. We envisage such a service class
to be useful for providing QoS support for VPNs. DiSp supports three service
classes:
-
The Real-Time (RT) class provides support for flows requiring fixed
delay bounds. Each edge router polices and smooths out real-time flows
using a modified token bucket with queue. Real-time flows are admitted
on a per-flow basis. DiSp uses QoS routing and admission control to ensure
that RT flows do not encounter congestion (unless there is a hardware failure).
DiSp's signaling protocol notifies receivers when there is a need for renegotiation
or re-admission. Renegotiation can occur in two situations:
-
When a flow cannot be admitted, and the application can tolerate a lower
QoS.
-
When a flow's delay bound cannot be met (because of congestion due to a
hardware failure).
-
The Statistical Bandwidth (SB) class provides support for delay-tolerant
applications requiring a minimum bandwidth reservation. Statistical bandwidth
flows are paced out at the aggregate bandwidth. A fixed size input queue
controls the maximum allowable burst.
-
The Best-Effort (BE) class provides best-effort service similar
to the current Internet. BE TCP flows are monitored on a per-link basis
and an explicit congestion notification message is used to notify the source
of congestion. This message includes a ``current congestion window size''
parameter which is calculated by the link monitor.
Flow scheduling is performed in strict priority order: 1) RT (highest),
2) SB, 3) BE. Thus, routers (edge and internal) need only maintain three
output queues per link and do not need complex fair queueing algorithms.
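The three-queue strict-priority service described above can be sketched as follows (a minimal Python sketch; the class and method names are ours, not part of any DiSp implementation):

```python
from collections import deque

class PriorityScheduler:
    """Strict-priority scheduler over the three DiSp classes."""
    CLASSES = ("RT", "SB", "BE")  # highest to lowest priority

    def __init__(self):
        self.queues = {c: deque() for c in self.CLASSES}

    def enqueue(self, cls, packet):
        self.queues[cls].append(packet)

    def dequeue(self):
        # Serve the highest-priority non-empty queue; BE traffic is
        # served only when both the RT and SB queues are empty.
        for c in self.CLASSES:
            if self.queues[c]:
                return self.queues[c].popleft()
        return None
```

Because BE is served only when RT and SB are empty, each router keeps just these three queues per output link and needs no per-flow fair-queueing state.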
3. Architecture
Our architecture is shown in Figure 1
and consists of:
-
Edge routers are responsible for policing and shaping flows in different
QoS classes and setting the TOS bytes accordingly.
-
Internal routers are responsible for scheduling packets in strict
priority among the service classes. Internal routers also monitor best-effort
traffic and generate explicit congestion notification messages during congestion.
-
The Network Operations Center (NOC) is responsible for admission control and renegotiation of
service levels.
The connecting Autonomous Systems (ASs) are responsible for marking
packets according to their respective classes. If an edge-router encounters
an unmarked packet, it is treated as part of a BE flow. Thus, DiSp provides
backward compatibility for legacy IP networks. Since DiSp handles policing
and shaping of flows in an aggregated manner, we rely on the connecting
ASs to provide flow isolation from misbehaving flows within the aggregate.
For example, an AS could be viewed as an IntServ cloud providing per flow
QoS internally to all its flows while negotiating aggregate profiles for
flows transiting through the DiffServ cloud.
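The default-to-BE treatment of unmarked packets can be illustrated with a small classification sketch (hypothetical Python; the two-bit codepoint layout below is purely illustrative, since DiSp does not fix a TOS bit assignment here):

```python
def classify(tos_byte):
    """Map a packet's TOS byte to a DiSp service class.

    Assumed layout (illustrative only): the two high-order bits
    select the class. Unmarked packets (e.g. tos_byte == 0) and any
    unrecognized codepoint default to best-effort, which is what
    gives DiSp its backward compatibility with legacy IP traffic.
    """
    codepoint = (tos_byte >> 6) & 0b11
    return {0b11: "RT", 0b10: "SB"}.get(codepoint, "BE")
```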
Signaling Protocol in DiSp : SPiD
A crucial component of the DiSp architecture is SPiD (pronounced ``speed''),
the signaling protocol employed by DiSp. The current IETF proposal for
DiffServ does not incorporate a signaling protocol. This decision was based
on scalability concerns. RSVP has a high overhead: its two-phase
approach sets up per-flow state information in each
router when performing resource reservation. There have been proposals
for modifying RSVP for use with aggregated flows [7].
However, aggregation introduces a host of issues (e.g. maintaining per-flow
guarantees, isolating flows) which would add to the complexity of RSVP.
Also, with regard to multicast, RSVP suffers from problems of handling
QoS reservations with heterogeneous reservation styles. SPiD is a lightweight,
efficient signaling protocol with the following key features:
-
Admission control between NOC and user. Renegotiation if the flow experiences
congestion.
-
Interaction between NOC and edge routers for setting up Service Level Agreements
(SLA).
-
Congestion notification messages for different traffic classes.
-
Cooperation with the source and/or receiver to allow a choice of traffic shaping
mechanisms to be used at the edge router for RT flows.
SPiD Control Messages
We envisage the following types of control messages that will be used by
SPiD.
-
Congestion notification to BE TCP flows with window size
-
Negotiation/renegotiation of SLA
-
Notification of congestion for RT flows to edge routers
-
Management related event notification to NOC
-
Monitoring instructions from NOC to routers
-
Sender and receiver oriented traffic management facilities
Thus, while SPiD is lightweight, it offers an enhanced set of features
to support both hard and soft bandwidth guarantees in DiffServ. In addition,
it provides support for network management which is another key component
of our DiSp architecture.
Control Traffic
SPiD has several control messages which must receive transmission guarantees
to prevent performance degradation. DiSp uses a separate minimum spanning
tree control network with statically reserved bandwidth to avoid delays.
Profile specification
Each AS can specify a profile for each RT flow and each SB flow aggregate.
Note that RT flows are delay sensitive whereas SB flows are bandwidth sensitive.
An RT profile specifies a delay bound for a particular flow through four
parameters:
dmax: Maximum tolerable delay between packets
RRT: Minimum bandwidth
Pmax: Maximum packet size
Ploss: Acceptable packet loss probability during severe congestion
An RT profile is specified for each flow and stored in the ingress
router of an ISP.
Each AS specifies a single aggregate profile for its SB flow
to an ISP. This profile is stored in the ingress router of the ISP receiving
the SB flow. An SB profile specifies the minimum bandwidth guarantee for
a flow aggregate (not an individual flow) through two parameters:
RSB: Minimum bandwidth
B: Maximum burst size
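For concreteness, the RT and SB profiles can be captured in simple records like the following (a Python sketch; the field names mirror the paper's parameters, and the token_interval helper anticipates the policer of the Edge Router Internals section, which issues one token every dmax/(hop count) seconds — the hop count itself would be supplied by admission control):

```python
from dataclasses import dataclass

@dataclass
class RTProfile:
    """Per-flow real-time profile (one per RT flow, stored at the ISP ingress)."""
    d_max: float   # maximum tolerable delay between packets (seconds)
    r_rt: float    # minimum bandwidth (bits/s)
    p_max: int     # maximum packet size (bits)
    p_loss: float  # acceptable loss probability during severe congestion

@dataclass
class SBProfile:
    """Aggregate statistical-bandwidth profile (one per AS, not per flow)."""
    r_sb: float    # minimum bandwidth for the aggregate (bits/s)
    b: int         # maximum burst size (bits)

def token_interval(profile: RTProfile, hop_count: int) -> float:
    """Token generation period for the edge-router policer:
    one token of size p_max every d_max / hop_count seconds."""
    return profile.d_max / hop_count
```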
State information storage
Each ingress router of an ISP stores profiles for each real-time flow and
each high bandwidth flow aggregate for each connected AS. Although policing
of real-time flows from a particular AS will be done in an aggregated manner
(as described in the Edge Router Internals
section), the edge-router has to adjust the policer according to the individual
flow's acceptable loss rate and selectively drop packets in times of severe
congestion when the guarantees cannot be met.
The NOC maintains a centralized database that is used for admission
control. The database maintains information about the reserved bandwidth,
delay and a list of real-time flows with their associated ingress edge-router
for each link of each router within the DiSp network. The parameters stored
for each link i, with capacity C and n RT flows include:
CRT |
Fraction of C reserved for RT flows |
CSB |
Fraction of C reserved for SB flows |
SiPimax |
Sum of max. packet lengths of all RT flows on the link |
{A1, A2, ... , An} |
List of RT flows associations Ai, where Ai =
< srci,dsti,routeri > |
Each RT flow association Ai identifies the flow (srci,dsti)
and the ingress router (routeri) hosting the flow.
Each internal router also stores a running count of number of active
best-effort flows on each link. This information is used by the signaling
protocol to provide explicit congestion window size feedback to the best-effort
sources in times of congestion.
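The NOC's per-link record described above might be represented as follows (a hypothetical Python sketch; the field names correspond to the parameters CRT, CSB, Σi Pimax and the association list {A1, ..., An}, but the record layout is our own assumption):

```python
from dataclasses import dataclass, field

@dataclass
class LinkState:
    """One entry per router link in the NOC's centralized admission database."""
    capacity: float   # C, link capacity (bits/s)
    c_rt: float       # fraction of C reserved for RT flows
    c_sb: float       # fraction of C reserved for SB flows
    sum_p_max: int = 0  # sum of max packet lengths of RT flows on this link
    rt_flows: list = field(default_factory=list)  # (src, dst, ingress_router)

    def add_rt_flow(self, src, dst, ingress_router, p_max):
        """Record a newly admitted RT flow association on this link."""
        self.rt_flows.append((src, dst, ingress_router))
        self.sum_p_max += p_max
```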
Edge-router internals
The main function of an edge-router is policing and shaping of real-time
and high bandwidth flows (Figure 2). For each
input link in the edge-router, a modified token bucket scheme is used to
police real-time flows. High bandwidth flows are policed using a fixed
size queue and a pacer. Each output link has three output queues, one for
each service class, which are served in strict priority order: 1)RT, 2)
SB, 3) BE.
-
Real-time flow policer: DiSp uses a token bucket with parameters
r and b (Figure 3). Here r denotes a set of timers in which the ith timer
expires every Ti seconds, where Ti = dimax/(hop count) for flow i, and
a queue of size b corrects any jitter that real-time packets may have
suffered in the AS network. Packets arriving from real-time flows are
queued in FIFO order if there is no token to service them instantly.
Incoming packets are dropped if the queue is full, thus policing RT flows
with respect to their reservations. When a packet is dispatched on link i,
a token of size Pimax, the largest packet in flow i, is removed from the
token bucket. In other words, a token of size Pimax is generated every
dimax/(hop count) seconds for a real-time flow i, the token supply is
reduced by Pimax every time a packet is sent out, and packets are sent
only if the token supply is at least Pimax. This scheme allows
the router to police and pace the real-time flows as an aggregate bundle
instead of being forced to use a separate token bucket and queue for each
real-time flow, thus reducing the overhead at edge-routers. If all flows
conform to their reservations, all RT delay guarantees will be met. The
only drawback of this scheme is that it cannot isolate non-conforming RT
flows, a responsibility of the AS egress routers.
-
Dynamic, flexible shaping of real-time flows: As part of an SLA,
an AS can negotiate a set of shaping policies for RT flows. These policies
are stored in a table at the edge-router which is indexed by some bits
in the TOS byte. For non-RT flows, these bits are ignored. For enhanced
flexibility, renegotiation mechanisms dynamically update shaping policies
for RT flows.
-
Statistical bandwidth flow policer/shaper: High bandwidth flows are
policed using a queue of size equal to B, the maximum allowable burst size.
Packets arriving when the queue is full are dropped, thus preventing high
bandwidth flows from exceeding their reservations during congestion. To
smooth out the burstiness in high bandwidth flows, packets are paced out
at rate R, the aggregate bandwidth for all high bandwidth flows. For example,
consider a reservation of R = 5Mb/s and B = 10MB for an input link with
a packet of size 16Kb at the head of the queue. After servicing the packet,
the pacer sets a timer to expire at t = tcurrent + 3.2 ms (3.2
ms = 16 Kb / 5 Mb/s). The next packet is serviced only after this timer
expires thus making sure that the traffic does not exceed the total allocated
bandwidth for that service class.
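The policing and pacing behavior described above can be sketched as follows. This is a simplified, single-flow Python sketch, not the aggregate implementation: time is passed in explicitly instead of read from a clock, and the class and function names are ours. RTPolicer implements the token-bucket-with-queue (one token of size Pimax per period dimax/(hop count)); sb_service_time reproduces the pacing arithmetic of the SB example (16 Kb at 5 Mb/s gives 3.2 ms):

```python
from collections import deque

class RTPolicer:
    """Token bucket with jitter-correction queue for one RT flow."""
    def __init__(self, p_max, period, queue_limit):
        self.p_max = p_max              # token size: largest packet (bits)
        self.period = period            # d_max / hop_count (seconds)
        self.queue_limit = queue_limit  # jitter-correction queue size b
        self.tokens = p_max             # start with one token's worth
        self.queue = deque()
        self.last = 0.0                 # time of the last token refill

    def _refill(self, now):
        # One token of size p_max is generated every `period` seconds.
        n = int((now - self.last) / self.period)
        self.tokens += n * self.p_max
        self.last += n * self.period

    def arrive(self, packet_bits, now):
        """Process one arrival; return the list of packets released now."""
        self._refill(now)
        out = []
        # Drain the FIFO backlog first.
        while self.queue and self.tokens >= self.p_max:
            out.append(self.queue.popleft())
            self.tokens -= self.p_max
        if self.tokens >= self.p_max:        # token available: send at once
            self.tokens -= self.p_max
            out.append(packet_bits)
        elif len(self.queue) < self.queue_limit:
            self.queue.append(packet_bits)   # wait in FIFO for a token
        # else: queue full -> packet dropped (flow is policed)
        return out

def sb_service_time(packet_bits, rate_bps):
    """SB pacing delay before the next packet may be served."""
    return packet_bits / rate_bps
```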
4. Admission Control
The DiSp admission control algorithm ensures that delay and bandwidth guarantees
can be met for all accepted connections. The admission control procedure
is almost the same for both real-time and statistical bandwidth flows.
The difference is that delay and jitter for statistical bandwidth flows are
not checked. When a new connection request is made to an edge router,
two tests are performed for RT flows. First, DiSp checks for sufficient
bandwidth along the route selected for the RT flow. Second, it checks
that the sum of the end-to-end delay and jitter bounds exceeds the total
worst-case delay accumulated by the packet over all hops along the route. Once
a flow satisfies these two checks, all flow associations < src, dst,
ingress-router > on the QoS route are added to the NOC database. The parameter
that maintains the overall bandwidth of the service class is updated. Finally,
the maximum packet size for the flow is computed and updated if larger.
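The two admission tests can be sketched as follows (hypothetical Python; links are represented as dicts carrying the NOC parameters C and CRT plus the currently reserved RT bandwidth, and the dict keys are our own naming, not part of the paper):

```python
def admit_rt_flow(route_links, flow_bw, per_hop_worst_delay,
                  delay_bound, jitter_bound):
    """Admission test for one RT flow along a selected QoS route.

    route_links: [{'capacity': C, 'c_rt': fraction, 'reserved_rt': bits/s}]
    per_hop_worst_delay: worst-case delay (s) at each hop on the route.
    """
    # Test 1: sufficient RT bandwidth on every link of the route.
    for link in route_links:
        available = link["capacity"] * link["c_rt"] - link["reserved_rt"]
        if flow_bw > available:
            return False
    # Test 2: the delay + jitter bound must cover the summed
    # worst-case per-hop delays along the route.
    if delay_bound + jitter_bound < sum(per_hop_worst_delay):
        return False
    # Accepted: commit the reservation on each link.
    for link in route_links:
        link["reserved_rt"] += flow_bw
    return True
```

On acceptance, the NOC would additionally record the flow association and update the per-link maximum packet size, as described above.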
5. Congestion Handling
Real-time and statistical bandwidth flows will not experience any congestion
during normal operation, but may experience some congestion when a link
goes down and the flows are re-routed. For this particular scenario, we
re-route the real-time and statistical bandwidth flows to the next hop
router using an alternate path. For example, suppose link i, which connects
router Rj to router Rk, goes down.
The NOC tries to find an alternate path from Rj to Rk
that can accommodate the bandwidth and delay requirements of all flows
on link i. If such a route cannot be found, the NOC will signal the originating
ASs of the respective flows indicating a need for renegotiation or readmission
of the affected flows.
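The NOC's failure handling reduces to a feasibility search over candidate alternate paths, sketched below (hypothetical Python; summarizing each candidate path by a (bandwidth, delay) pair is our assumption, since the paper does not specify how the NOC enumerates alternate routes):

```python
def find_alternate_path(candidate_paths, required_bw, required_delay):
    """Pick an alternate path from Rj to Rk for the failed link's flows.

    candidate_paths: {path_id: (available_bw_bps, delay_s)}.
    Returns the first path meeting the aggregate bandwidth and delay
    requirements, or None, in which case the NOC signals the
    originating ASs to renegotiate or re-admit the affected flows.
    """
    for path_id, (bw, delay) in candidate_paths.items():
        if bw >= required_bw and delay <= required_delay:
            return path_id
    return None
```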
Best-effort flows can encounter congestion even during normal operation
because of the aggressive windowing strategy employed by TCP. DiSp monitors
active best-effort flows and provides explicit window size parameter feedback
to flows going through a congested link. Using the Smart Port Card (SPC)
card with an embedded ATM Port Interconnect Chip (APIC) [8],
DiSp can snoop all BE flows on a link at gigabit rates and monitor the
currently available bandwidth for BE flows. Each router stores the
number of active best-effort flows and the source address of each flow
for each link. If the link experiences congestion, the router sends feedback
messages to the host indicating a smaller TCP congestion window size (the
currently available bandwidth divided by the number
of active flows). We plan to add this enhancement
to the current TCP protocol to be able to handle such feedback messages
and adjust the congestion window size accordingly. This scheme is an enhancement
of the ECN (Explicit Congestion Notification) mechanism proposed for TCP
[6]. We will also experiment with varying
the holding times of the new congestion window size before allowing TCP
to resume its normal congestion window control algorithm.
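A fair-share reading of this window-size computation is sketched below (hypothetical Python: each active BE flow is granted an equal share of the currently available bandwidth, converted to a window via the bandwidth-delay product; the RTT and MSS scaling are our assumptions and do not appear in the paper):

```python
def ecn_window_feedback(available_bw_bps, n_active_flows, rtt_s,
                        mss_bytes=1460):
    """Suggested TCP congestion window (in segments) for one BE flow.

    Each of the n active flows on the congested link gets an equal
    share of the available bandwidth; the per-flow window is that
    share's bandwidth-delay product, floored to whole segments and
    clamped to at least one segment.
    """
    share_bps = available_bw_bps / n_active_flows
    window_bytes = share_bps * rtt_s / 8   # bits/s * s -> bits -> bytes
    return max(1, int(window_bytes // mss_bytes))
```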
6. Multicast Support
Since IETF's DiffServ deals with aggregate flows, resource provisioning
for individual multicast flows is not supported. Using the receiver based
subscription enhancement, DiSp provides baseline support for multicast
RT flows. Consider a multicast group M with source S1 and receivers
D1 and D2, as shown in Figure
4. When a new receiver D3 wants to join M, only resources
from R4 to D3 are reserved, since there already exists
a virtual path from S1 to D1. If the QoS requirements
for D3 are different from the other receivers, the flow S1
to D3 will be considered as a new RT flow and go through the
admission control and resource reservation process. Non-RT multicast flows
will be treated as BE flows. Providing multicast support for aggregated
SB flows is an open issue.
7. Applications for Statistical Bandwidth Class
There are two major issues concerning the statistical bandwidth class:
-
What application(s) can make use of this class?
Currently most of the multimedia applications require a fixed delay
bound from the network which cannot be guaranteed using the statistical
bandwidth class. Other applications such as web surfing and ftp are very
bursty, short duration flows which are best served using the best-effort
model.
-
How do we provision resources for this class?
Since we admit flows of this class in an aggregated manner, we do not
have detailed source-destination information regarding each flow in the
aggregate. Flows belonging to one aggregate bundle will in most cases diverge
to different receivers and thus exit the network at different points. Thus,
reserving resources for an aggregate bundle is a hard problem. The obvious
solution of reserving bandwidth on all links for each aggregate leads to
an extremely inefficient solution!
One application that helps point towards a feasible solution to the above
issues is providing VPNs for corporations. Since a VPN is a fairly static
network connecting remote sites of a company, DiSp can use this information
to reserve resources along the paths connecting the sites. Currently, VPNs
only provide a secure network, but no QoS guarantees. Using the statistical
bandwidth class, DiSp can provide a VPN with some minimum bandwidth guarantee
between the remote sites. The issues that need to be addressed for such
an application are:
-
Provisioning bandwidth in the VPN
One possibility is to create one-to-many multicast groups between each
remote site and all the others, compute QoS routes for each multicast group
and reserve the minimum bandwidth as specified along these routes.
-
Efficient use of network resources
We plan to evaluate two approaches for admission control and resource
reservation for statistical bandwidth flows.
-
Over-booking Use an over-booking ratio while admitting new flows.
Under this scheme, new flows would specify peak bandwidth values but would
receive an average bandwidth equal to the (peak rate/overbooking ratio).
-
Empirical heuristic Measure effects of new flows (which have a random
set of destinations) on the link utilizations of the DiffServ cloud.
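The over-booking approach can be sketched as follows (hypothetical Python; testing admission against the SB share of a single bottleneck link is our simplification of the aggregate provisioning problem described above):

```python
def sb_effective_rate(peak_rate_bps, overbooking_ratio):
    """A new SB flow specifies its peak rate but is provisioned at
    peak / ratio, so the aggregate can admit more flows than strict
    peak-rate allocation would allow."""
    return peak_rate_bps / overbooking_ratio

def admit_sb_flow(link_capacity_sb, admitted_rates, peak_rate_bps, ratio):
    """Admit if the summed effective rates, including the new flow,
    still fit in the SB share of the link. admitted_rates is the
    mutable list of effective rates of already-admitted flows."""
    eff = sb_effective_rate(peak_rate_bps, ratio)
    if sum(admitted_rates) + eff > link_capacity_sb:
        return False
    admitted_rates.append(eff)
    return True
```

With an overbooking ratio of 4, for example, a 10 Mb/s SB share admits five flows of 8 Mb/s peak rate before refusing further requests.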
8. Related Work
Current research in DiffServ has resulted in a number of IETF drafts that
attempt to tackle the issues of defining service classes, per-hop behaviors,
integration of DiffServ with IntServ and so on. In this section we discuss
how our work complements and builds on the existing research in this area
and highlight some of the differences between our proposed scheme and the
traditional view of DiffServ.
Differences from traditional DiffServ proposals
IETF DiffServ was essentially developed to avoid any complex signaling
and to allow out-of-band negotiation. Thus the setting of bits in the ToS byte
implicitly decides the kind of service a flow receives at a router.
However, there is a need to use a signaling protocol that can perform the
negotiation between the user and ISP, allow the service profiles to be
disseminated to the various edge routers for indicating admitted flows,
allow users to renegotiate the profile (subscriptions), and perform
congestion notification. Our proposed signaling protocol
SPiD performs
the above functions in an efficient manner. We have also utilized an admission
policy based approach that can guarantee the services demanded by the various
flows.
DiffServ also does not clearly define the types of service classes that
are provided. While some of the proposals [4]
discuss Premium and Assured service classes, no characterizing
parameters are defined apart from bandwidth. We have proposed the use of
three service classes (Real-time, Statistical-Bandwidth and Best-effort)
and defined their QoS parameters to provide greater flexibility to the
user in terms of being able to specify delay and delay jitter, in addition
to the standard bandwidth parameter. We also propose the use of a separate
control network for which a static spanning tree route is maintained. All
control messages are thus guaranteed a minimum bandwidth and do not suffer
from problems of insufficient bandwidth due to admission of higher class
flows like real-time flows. Related to this, we also utilize QoS routes
for choosing the best possible route. By restricting ourselves to a simple
three queue approach, we are removing much of the complexity that is involved
in using some compute-intensive Weighted Fair Queueing mechanism at the
edge router for scheduling the flows.
Although edge routers in DiSp treat flows from different service classes
as aggregates (which is similar to traditional DiffServ), we enforce specification
and admission of individual real-time flows rather than flow aggregations.
In general, aggregation of diverse real-time flow specifications is meaningless.
However, we police and shape real-time flows as an aggregate.
Also, DiSp's admission control is explicit whereas IETF DiffServ's
is implicit (see Section 3).
Multicast flows are treated by DiSp in the same way as any other flow.
Thus, DiSp does not require any separate mechanism to handle multicast
flows. A new receiver joining a multicast group is handled similar to any
other new flow. With heterogeneous multicast traffic, diverse profiles
(based on receiver requirements) can be easily supported.
9. Conclusions
We have proposed a new framework for supporting differentiated services
over the Internet. This approach is different from the current IETF proposal
for DiffServ while still maintaining the goals of DiffServ. Our architectural
framework allows distribution of complexity to the edge routers as well
as the AS routers. We have proposed services to support real-time and statistical
service classes apart from the usual best-effort class. The admission control
policy for the real-time and statistical flows allows hard guarantees to
be given to QoS applications.
References
1. D. Clark and J. Wroclawski. "An approach to service
allocation in the internet", Internet Draft, July 1997.
2. S. Blake, D. Black, M. Carlson, E. Davies,
Z. Wang, and W. Weiss. "An architecture for differentiated services", Internet
Draft, August 1998.
3. G. Parulkar, D. Schmidt, E. Kraemer, J. Turner,
and A. Kantawala. "An architecture for monitoring, visualization and control
of gigabit networks", IEEE Network, 11(5):34-43, October 1997.
4. K. Nichols, V. Jacobson, and L. Zhang. "A
two-bit differentiated services architecture for the internet", Internet
Draft, November 1997.
5. B. Ohlman. "Receiver control in differentiated
services", Internet Draft, March 1998.
6. S. Floyd. "TCP and Explicit congestion notification", ACM Computer
Communications Review, 24(5):10-23, October 1994.
7. R. Guerin, S. Blake, and S. Herzog. "Aggregating RSVP based QoS requests",
Internet Draft, November 1997.
8. Z. Dittia, G. Parulkar, and J. R. Cox. "The
APIC Approach to High Performance Network Interface Design: Protected DMA
and Other Techniques," IEEE INFOCOM 97, Kobe, Japan, 1997.