Heap (Priority Queue) Management in Hardware
for Weighted Fair Queueing in High Speed Networks

Manolis Katevenis, Aggelos Ioannou, and Ioannis Mavroidis

Computer Architecture and VLSI Systems Lab,
Institute of Computer Science (ICS), FORTH
Science and Technology Park of Crete, P.O.Box 1385, Heraklion, Crete, GR 711 10 Greece

© copyright 1997-2001 by FORTH, Crete, Greece

In order to provide advanced Quality of Service (QoS) architectures in high speed networks, Weighted Fair Queueing is needed. We have been working since 1985 on weighted fair queueing, i.e. on: (i) per-flow queueing, and (ii) weighted round-robin scheduling. In weighted round-robin scheduling, we make a distinction between the case where the weight factors are picked from a relatively small ``menu'' of allowed factor values, and the general case of arbitrary weight factors. For the former case, we have proposed and designed a scheduler with per-class urgency counters. For the latter case, our work is as described below.

Several algorithms have been proposed in the literature for weighted round-robin scheduling in the general case of arbitrary weight factors. Their common substrate is that, for each flow or packet or cell, a priority value is maintained --oftentimes corresponding to the next time-to-service value-- and the scheduler has to choose the minimum among these values, corresponding e.g. to the closest next time-to-service. Maintaining a set of many thousands of (priority) values and quickly finding the smallest of them is best done using a Heap (Priority Queue) data structure. We looked at methods to manage heap data structures at high speed, using specialized hardware.

First, we analyzed implementations using an FPGA and external SRAM, and found that a few tens of clock cycles per heap operation are needed for priority queues containing several thousand elements. For example, a 16 K element heap needs 40 clock cycles per "increment" operation when implemented with one (32-bit wide) SRAM, or 24 clock cycles when two SRAMs are used; "insert" and "delete minimum" operations are less expensive: 16 clock cycles suffice, with either one or two SRAMs. Our detailed design was verified by simulation in Verilog, and is described in the TR-222 referenced below.

Second, we proposed (1997) a novel algorithm for top-to-bottom heap insertions, thus enabling one to manage a heap in a pipelined fashion, in order to achieve very high performance. In the traditional heap management algorithm, deletions (of the root) proceed from top to bottom, while insertions of new elements proceed in the reverse direction. In our modified algorithm, both operations proceed in the top-to-bottom direction; to achieve this, inserted elements are "steered", left or right, at each level of the tree, depending on which of the two subtrees the open slot to be occupied resides in.

This novel heap management algorithm can be used in ASIC implementation of WRR schedulers, where each level of the heap tree is stored in a separate memory, so that multiple heap operations can proceed in parallel, each of them operating on a different level of the tree. We have designed such a pipelined heap (priority queue) manager, as described below.


Pipelined Heap (Priority Queue) Management in Hardware

  • A. Ioannou, M. Katevenis: "Pipelined Heap (Priority Queue) Management for Advanced Scheduling in High Speed Networks", to appear in IEEE/ACM Transactions on Networking (ToN), 2007.
    - Preprint of June 2006 in PDF (330 KBytes); 12 pages - © Copyright 2006 IEEE: Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works, must be obtained from the IEEE.

    Other papers on the same topic:

    © Copyright 2000-2001 by IEEE or FORTH:
    These papers are protected by copyright. Permission to make digital/hard copies of all or part of this material without fee is granted provided that the copies are made for personal use, they are not made or distributed for profit or commercial advantage, the IEEE or FORTH copyright notice, the title of the publication and its date appear, and notice is given that copying is by permission of the Institute of Electrical and Electronics Engineers or of the Foundation for Research & Technology -- Hellas (FORTH). To copy otherwise, in whole or in part, to republish, to post on servers, or to redistribute to lists, requires prior specific written permission and/or a fee.


    The FPGA and external SRAM design, and the top-to-bottom heap insertion algorithm were described in:

    Heap Management in Hardware

    by Ioannis Mavroidis

    Technical Report FORTH-ICS/TR-222,
    Institute of Computer Science, FORTH, Heraklion, Crete, Greece
    Bachelor of Science Thesis, Department of Computer Science, University of Crete
    July 1998

    ABSTRACT:

    Critical time priority queues require hardware instead of software management. The heap data structure is an efficient way to organize a priority queue. In this report we present the various design issues and evaluate the hardware complexity and timing requirements of different possible hardware organizations of a heap using one FPGA and external memory. We also describe the organization and the datapaths we simulated in Verilog. Finally we present some more efficient techniques applicable only to ASIC implementations.

    The full Technical Report is available in:

    © Copyright 1998 by FORTH.
    Permission to make digital/hard copies of all or part of this material without fee is granted provided that the copies are made for personal use, they are not made or distributed for profit or commercial advantage, the FORTH copyright notice, the title of the publication and its date appear, and notice is given that copying is by permission of the Foundation for Research & Technology -- Hellas (FORTH). To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific written permission and/or a fee.


    Up to WRR Scheduling R&D at CARV-ICS-FORTH Last updated: June 2006, by M. Katevenis.