Weighted Round-Robin Scheduling Module

Lou R. Rohan
Douglas R. Taube
CS/COE 535
Fall 2001
Washington University in Saint Louis

Introduction

This paper presents the design and implementation of a weighted round robin module for intended implementation on a Field-Programmable-Port-Extender (FPX) attached to a Washington University Gigabit Switch (WUGS). The queuing strategy used in this design is based upon input queuing (IQ). IQ structure has been demonstrated to require the least memory bandwidth for buffering ATM cells [1]. Two types of queuing implementations are active in the module. The queues are presented in First-In-First-Out (FIFO) modules to preserve cell order as they are passed through the module. Also, incoming cells are queued on a per-virtual-circuit (per-VC) basis. Queuing in this manner enables bandwidth allocation [1]. Secondly, the queues are assigned a weight to allow for priority scheduling. This method improves Quality of Service (QoS) enforcement and reduces head-of-line (HOL) blocking issues [1].

Background

Applications can be classified into two basic service groups, best-effort service and guaranteed service. A significant deployement of ATM network technology provides the basis of managed and guaranteed service but Internet traffic patterns do not adhere to the standard. Under most Internet traffic patterns, resources are not explicitly reserved, end systems do not pace their transmissions, and most network equipement does not enforce resource usage limits [2][3].

Description of the Project

This project implements a dynamic queuing scheme that intends to provide resource sharing, improve QoS enforcement and reduce HOL blocking. The queuing scheme is designed to register VC upon call establishment by means of a control cell. The VC registration follows in the implementation framework of the NCHARGE control software for FPX.

The module is designed for the FPX to provide IQ traffic per-VC. Each VC can have an assigned priority level (i.e. weight) and a weighted round-robin (WRR) scheduler is implemented to reduce HOL blocking.

Components from the course machine problem 5 were reused for this project. By reusing the SDRAM interface, the cell pacing component, and the push/pull cell components, the overall development of the project managed to the new modules introduced for WRR.

Major Components

Module Assignments

Details of Implementation

References

  1. H. Duan, J. W. Lockwood, S. M. Kang, J. D. Will, A High-performance OC-12/OC-48 Queue Design Prototype for Input-buffered ATM Switches, IEEE Infocom '97, Kobe, Japan, April 7-11, 1997, pp 20-28.
  2. Y. Chen, J. S. Turner, Design of a Weighted Fair Queuing Cell Scheduler for ATM Networks, GLOBECOM '98.
  3. S. Keshav, An Engineering Approach to Computer Networking, Addison-Wesley, pp. 238-255, 1997.
  4. Sarang Dharmapurikar and John Lockwood, Synthesizable Design of a Multi-Module Memory Controller, Washington University, Department of Computer Science, Technical Report WUCS-01-26, October, 2001.
  5. CS 535 course materials and notes, J. W. Lockwood, Washington University, Fall 2001.
  6. CS 533 course materials and notes, K. Wong, Washington University, Spring 2000.

Results

Application Results

The following results were generated in simulation using the ModelSim tool. A test file of ATM formatted cells was sent through the module. The input and output order of the cells is noted by VCI number. Per the goals of the project, traffic was divided by the module into separate queues. Each queue was registered with a priority (or weight) to determine the amount of service it would receive relative to other queues.

In the first example, only two VC were introduced to the module. The modules cell pacing rate was sent to allow all cells to be buffered before any cells were transmitted. Cells arriving on VCI 23 were assigned a weight of 2 while cells arriving on VCI 27 were given a weight of 8. Under the weighted round-robin implementation, VCI 27 will receive 8 rounds of service for every two rounds of service VCI 23 receives.

Under first-come-first-service (FCFS)scheduling conditions, the cells are scheduled to complete according to their arrival time. This would cause the higher priority cells to be blocked from service until the earlier arriving, lower priority cells complete transmission. The graphs demonstrate that the module time-slice divides the burst traffic based upon the weight of the queue in ues.

The next diagram shows seven VC transmitting cells to the module. The weights assigned to each VCI in the test file start at three on VCI 23 and increase by one for a weight of 9 on VCI 29.

Similar to the previous example, the bursts of traffic are divided according to the weight of the queue associated to the VCI. Again, FCFS scheduling would have protected the burst traffic by blocking other VC from transmitting and provided no means for priority service for specified VC.


Synthesis and Place-and-Route Results

The project was synthesized for the Xilinx Virtex-E XCV1000E device with a frequency target of 10 MHz. The synthesis indicates that the current design of the module requires 25% of the available slices on the FPGA device. The greatest delay in the module is 25 nanoseconds indicating a 40 MHz frequency of the module, which is within the 10MHz target. Additional development in the queue management component will quickly reduce both the size and delay of the implementation.

Conclusions

In this paper we have described the design, and implementation of a weighted round robin module for FPX. The module was designed for input to the WUGS and provides per-VC weighted round robin scheduling and reduces HOL blocking by using SDRAM based FIFO queues and a WRR scheduler. Dynamic scheduling constraints are managed by the modules support of the NCHARGE framework.

Computer simulation results verify the operation of the WRR scheduler to distribute link utilization to multiple queues based upon and establish priority.