













































WAI TENG TANG, RICK SIOW MONG GOH, and IAN LI-JIN THNG
National University of Singapore
This article describes a new priority queue implementation for managing the pending event set in discrete event simulation. Extensive empirical results demonstrate that it consistently outperforms other currently popular candidates. This new implementation, called Ladder Queue (LadderQ), is also theoretically justified to have O(1) amortized access time complexity, as long as the mean jump parameter of the priority increment distribution is finite and greater than zero, regardless of its variance. Many practical priority increment distributions satisfy this condition, including unbounded-variance distributions like the Pareto distribution. This renders the LadderQ the ideal discrete event queue structure for stable O(1) performance even under practical queue distributions with infinite variance. Numerical simulations ranging from 100 to 10 million events affirm the O(1) property of LadderQ and show that it is a superior structure for large-scale discrete event simulation.
Categories and Subject Descriptors: E.1 [ Data ]: Data structures; F.2 [ Theory of Computation ]: Analysis of Algorithms and Problem Complexity; I.6.8 [ Simulation and Modeling ]: Types of Simulation— discrete event
General Terms: Algorithms, Performance
Additional Key Words and Phrases: Pending event set implementations, priority queue, calendar queue
The Sorted-discipline Calendar Queue (SCQ), as proposed by Brown [1988], is an important implementation of the pending event set (PES) in discrete event simulators such as GTW [Das et al. 1994], CSIM18 [Schwetman 1996], and Network Simulator v2 [Fall and Varadhan 2002]. The SCQ is far more popular than the unsorted bucket discipline calendar queue, or UCQ. This is because the number of operations required for a basic enqueue and dequeue operation is O(nsublist) and O(1), respectively, in an SCQ bucket, where nsublist represents
Authors’ address: Department of Electrical and Computer Engineering, National University of Singapore, 3 Engineering Drive 3, Blk E4A #05-06, Singapore 117576; email: {waitengtang,smgoh,eletlj}@nus.edu.sg. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or direct commercial advantage and that copies show this notice on the first page or initial screen of a display along with the full citation. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, to redistribute to lists, or to use any component of this work in other works requires prior specific permission and/or a fee. Permissions may be requested from Publications Dept., ACM, Inc., 1515 Broadway, New York, NY 10036 USA, fax: +1 (212) 869-0481, or [email protected]. © 2005 ACM 1049-3301/05/0700-0175 $5.
ACM Transactions on Modeling and Computer Simulation, Vol. 15, No. 3, July 2005, Pages 175–204.
176 • W. T. Tang et al.
the number of events in an SCQ bucket. For the UCQ, the number of operations required is higher, that is, O(1) for enqueue and a guaranteed nsublist operations for dequeue. Rönngren and Ayani [1997] provided empirical evidence that the SCQ achieves O(1) performance in benchmark scenarios where the number of events N in the queue and the mean of the jump random variable associated with the priority increment distribution, denoted as μ, remain constant. However, the SCQ sometimes degrades to O(n) when μ varies, for example, under the Camel and Change distributions, even though N is constant. Furthermore, when N varies, the performance of the SCQ becomes erratic, with many peaks that occur when the queue size fluctuates by a factor of 2, revealing that the resize operations can be costly. In addition, when N varies, the SCQ does not always achieve O(1) performance even though μ remains constant, for example, under the Triangular distribution. These observations translate to the following:
(a) The SCQ’s size-based resize trigger is an ineffective mechanism for handling skewed distributions where μ varies. It results in many events being enqueued into a few buckets with long sublists, alongside many empty buckets (i.e., the skewed distribution phenomenon). Long sublists make enqueue operations expensive since each enqueue entails a sequential search, whereas excessive traversal of empty buckets increases the cost of dequeue operations. This problem of the SCQ has been attributed to the size-based resize triggers, which are too rigid to react to the event distribution encountered, since a resize trigger occurs only if the queue size halves or doubles [Brown 1988].
(b) The size-based resize trigger is not suitable for simulation scenarios where N frequently varies by a factor of two. This form of trigger is inflexible: even when the SCQ is performing well with its existing operating parameters, it still has to resize whenever N varies by a factor of two.
(c) The sampling heuristic is inadequate for obtaining good operating parameters, namely the number of buckets and the bucketwidth, when skewed distributions are encountered. During each resize operation, the SCQ samples, at most, the first twenty-five events. This is clearly too simplistic because, for skewed distributions in which many events fall into a few buckets, the inter-event time-gaps of the first twenty-five events, which can span several buckets, and those in the few populated buckets may differ greatly. Simply increasing the number of events sampled is not a prudent approach unless the distribution is uniform. For most distributions, particularly skewed distributions such as the Camel distribution, sampling more events and then taking the mean or median is unlikely to be accurate since events are spread unevenly. Even if the most populated bucket is sampled, the SCQ will still perform poorly because events in that bucket will have a small average time-gap whereas other events have widely diverse time-gaps. If the bucketwidth is updated to this small time-gap, there will likely be numerous empty buckets, and skipping these empty buckets will lead to inferior performance. Furthermore, sampling more events inevitably leads to higher
Secondly, during enqueue operations, a resize operation involves resizing only the Bottom structure when it becomes too populated. The Bottom structure, a sorted linked list, is the center of activity of the LadderQ and essentially determines the performance of the queue. None of LadderQ’s other buckets are resized. Since the Bottom structure is limited to at most 50 events, this not only cuts the cost of the resize, it also limits the complexity of enqueue operations in Bottom, which rely on linear search, to at most 50 operations (i.e., O(1)) irrespective of queue size and μ.

Thirdly, during dequeue operations, the LadderQ refines the bucketwidth on a bucket-by-bucket basis, reducing the cost of resize operations. Furthermore, this bucketwidth-refining methodology, which results in LadderQ having multiple bucketwidths, handles event distributions with widely varying inter-event time-gaps well. In contrast, the SCQ adopts a “one size fits all” approach where only a single bucketwidth is used for the entire SCQ structure, resulting inevitably in costly resize operations.

Lastly, instead of unreliable and somewhat costly sampling heuristics, the LadderQ obtains good operating parameters dynamically, in accordance with the events in its structure, without sampling. This further reduces its overall management overhead.
2.1 Basic Structure of Ladder Queue
Figure 1 shows an example of the basic structure of a LadderQ. The name LadderQ arises from the resemblance of the structure to a ladder with rungs. The structure consists of three tiers: a sorted linked list called Bottom; the middle layer, called Ladder, consisting of several rungs of buckets where each bucket may contain an unsorted linked list; and a simple unsorted linked list called Top. In the example of Figure 1, a LadderQ with three rungs of buckets (Rung[1], Rung[2] and Rung[3]) is shown. The actual number of rungs may vary from distribution to distribution. Later, in Section 3, it will be demonstrated that the average number of rungs is bounded by a constant irrespective of N, as long as the mean jump parameter μ of the distribution is finite and greater than zero, regardless of its variance.

In the literature, there is an interesting priority queue known as the Lazy Queue (LazyQ) [Rönngren et al. 1993], which bears a striking structural similarity to the LadderQ described in this article. The LazyQ also has three tiers. However, the similarity ends there. The mechanisms the LazyQ employs to trigger the transfer of events between the various tiers are more convoluted. It requires the user to know a priori the values of numerous parameters for it to perform well. It also relies on a resize operation similar to that of the SCQ, and on top of that it comes with various resize criteria. We only mention the LazyQ briefly because it has already been shown that the LazyQ provides performance similar to the SCQ for most scenarios and worse for some scenarios [Rönngren and Ayani 1997]. As such, we have not included the LazyQ in our performance comparison. We have, on the other hand, empirically shown that the LadderQ significantly outperforms the SCQ in almost every scenario (see Section 5).
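The three-tier layout described above can be pictured with a small data-structure sketch. This is only an illustration under stated assumptions: Python lists stand in for the linked lists, events are reduced to bare timestamps, and the class and field names (`Rung`, `LadderQ`, `r_start`, `r_cur`) are hypothetical labels chosen to mirror the variable names used in the text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Rung:
    """One rung of Ladder: equal-width buckets, each an unsorted list."""
    bucketwidth: float
    r_start: float   # timestamp at which the first bucket begins (RStart)
    r_cur: float     # start of the current dequeue bucket (RCur)
    buckets: List[List[float]] = field(default_factory=list)


@dataclass
class LadderQ:
    """Three tiers: unsorted Top, rungs of buckets (Ladder), sorted Bottom."""
    top: List[float] = field(default_factory=list)     # unsorted
    ladder: List[Rung] = field(default_factory=list)   # Rung[1], Rung[2], ...
    bottom: List[float] = field(default_factory=list)  # kept sorted

    def enqueue_top(self, ts: float) -> None:
        # Events held in Top are appended without any sorting.
        self.top.append(ts)

    def size(self) -> int:
        # Total number of events across all three tiers.
        in_ladder = sum(len(b) for r in self.ladder for b in r.buckets)
        return len(self.top) + in_ladder + len(self.bottom)


q = LadderQ()
for ts in (0.6, 0.5, 3.1):
    q.enqueue_top(ts)
print(q.size(), q.top)
```

The sketch deliberately omits the transfer rules between tiers; it only fixes the shape of the state that those rules operate on.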
Ladder Queue • 179
Fig. 1. Basic structure of Ladder Queue.
2.2 The Algorithm
Before we present a detailed description of the LadderQ algorithm, Table I describes the operating variables associated with this three-tier structure.
2.2.1 Dequeue Operation. Initially, Top, Ladder and Bottom are empty and all incoming events are inserted into Top without any sorting. MaxTS, MinTS and NTop are updated accordingly. When the first dequeue operation is invoked, events in Top are transferred to the first rung of Ladder, that is, Rung[1]. The bucketwidth of Rung[1] is obtained using

$$\text{Bucketwidth}[1] = \frac{\text{MaxTS} - \text{MinTS}}{\text{NTop}} \tag{1}$$

where MaxTS is not equal to MinTS (see footnote 1). Upon obtaining Bucketwidth[1], the following variables are updated: RStart[1] and RCur[1] are set to MinTS, and TopStart is set to MaxTS + Bucketwidth[1]. Thereafter, events can be
Footnote 1: Note that if MaxTS is equal to MinTS, it means all the events in Top have the same timestamp. In this article we consider the mean jump to be finite and positive. Thus, the likelihood of this occurring is extremely low. See Section 2.4 for the practical aspect of this occurrence.
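As a minimal sketch of the parameter setup at the first dequeue, the function below computes Bucketwidth[1] and the associated variables from the events held in Top. The function name and the choice to raise an error when all timestamps coincide are assumptions made for illustration; the article handles that case separately in Section 2.4.

```python
def first_rung_params(top_timestamps):
    """Return (bucketwidth, r_start, r_cur, top_start) for Rung[1].

    Hypothetical helper: applies Eq. (1) and the variable updates that
    follow it in the text.
    """
    max_ts = max(top_timestamps)
    min_ts = min(top_timestamps)
    n_top = len(top_timestamps)
    if max_ts == min_ts:
        # All events share one timestamp; Eq. (1) would give a zero width.
        raise ValueError("MaxTS equals MinTS; bucketwidth is undefined")
    bucketwidth = (max_ts - min_ts) / n_top   # Eq. (1)
    r_start = min_ts                          # RStart[1]
    r_cur = min_ts                            # RCur[1]
    top_start = max_ts + bucketwidth          # TopStart
    return bucketwidth, r_start, r_cur, top_start


# Illustrative timestamps, not taken from the article.
print(first_rung_params([2.0, 10.0, 4.0, 6.0]))
```

With four events spanning 2.0 to 10.0, the span of 8.0 is divided by NTop = 4, so each bucket covers 2.0 time units and TopStart is placed one bucketwidth beyond MaxTS.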
Bottom. Thereafter, the first event from Bottom, which is the event with the highest priority, will be returned. However, if $N_{B_c}$ does exceed THRES, a spawning action will be initiated. Bc is converted into a Bsp (see Figure 1), and this process involves adding a new rung, that is, Rung[2], into which the events from Bc are then transferred. This spawning process is repeated until the sorted linked list Bottom is created and the highest priority event is dequeued. For this spawning process, different variables are updated as compared to the transfer of events from Top to Rung[1]. In the example given in Figure 1, the variables RStart[2] and RCur[2] are set to RCur[1]. Subsequently, RCur[1] is incremented by Bucketwidth[1], and Bucketwidth[2] is set using

$$\text{Bucketwidth}[i+1] = \frac{\text{Bucketwidth}[i]}{\text{THRES}} \tag{3}$$
Events held in the parent bucket in Rung[1] are then redistributed into Rung[2] using the bucket-index procedure given in Eq. (2) with i = 2. The rationale for using Eq. (3) to obtain the bucketwidth of Rung[2] is the assumption that the events in the Bsp of Rung[1] are uniformly distributed within its bucketwidth. If they are not, the spawning process continues, with more rungs being created. The spawning process terminates when the number of events in the lowest rung does not exceed THRES. It is shown later that if the jump distribution has a finite μ, then the average number of rungs created is bounded by a constant, irrespective of N.

The above describes all the possible operations that may result when a dequeue occurs. However, the majority of dequeue operations are much simpler once a previous dequeue operation has already created the sorted linked list Bottom. In such cases, since Bottom is a sorted linked list with a pointer that always references its first event, that event is simply removed with negligible access time. When all the events in Bottom have been dequeued, the buckets of the lowest rung are traversed to find the next dequeue bucket Bc and the whole process is repeated. When a child rung has been emptied of all events, the child structure is deleted and the dequeue operation shifts back to the parent rung.

Figure 2 illustrates an example of the dequeue operation just described. Initially, a series of enqueue operations inserts events with timestamps 0.6, 0.5, 3.1, 3.05, 3.3, 3.4, 3.0 and 4.5, in that order, into Top. On the first dequeue operation, events from Top are transferred to Rung[1] of Ladder as shown in the diagram, and the event with timestamp 0.5 is then dequeued. Subsequently, during the third dequeue operation, empty buckets are simply skipped and the sixth bucket is reached. For illustration purposes, it is assumed that having five events in the sixth bucket breaches the threshold THRES. The consequence of breaching THRES is that Rung[2] is spawned with a bucketwidth of 0.1. Next, the five events in the sixth bucket of Rung[1] are transferred to Rung[2]. Finally, the event with timestamp 3.0 is extracted at the third dequeue operation.
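The Figure 2 walk-through can be replayed in a few lines. The sketch below rests on several assumptions that are not stated in this excerpt: the bucket-index rule cited as Eq. (2) is taken to be floor((timestamp − RStart) / Bucketwidth); the illustrative threshold is taken to be THRES = 5, so that Eq. (3) reproduces the stated child bucketwidth of 0.1 (0.5 / 5); and a bucket holding exactly THRES events is treated as breaching it. Exact fractions are used instead of floats so that timestamps such as 3.3 do not fall into the wrong bucket through rounding.

```python
from fractions import Fraction

THRES = 5  # assumed illustrative threshold, not the practical value of 50


def bucket_index(ts, r_start, width):
    """Assumed form of the bucket-index rule cited as Eq. (2)."""
    return int((ts - r_start) // width)


def distribute(events, r_start, width, n_buckets):
    """Append each event, unsorted, to the bucket its timestamp maps to."""
    buckets = [[] for _ in range(n_buckets)]
    for ts in events:
        buckets[bucket_index(ts, r_start, width)].append(ts)
    return buckets


top = [Fraction(s) for s in
       ("0.6", "0.5", "3.1", "3.05", "3.3", "3.4", "3.0", "4.5")]

# First dequeue: Top -> Rung[1] using Eq. (1); N + 1 buckets hold MaxTS.
r_start1 = min(top)
width1 = (max(top) - min(top)) / len(top)
rung1 = distribute(top, r_start1, width1, len(top) + 1)

bottom = sorted(rung1[0])     # the first bucket is small, so it is sorted
rung1[0] = []
first = bottom.pop(0)         # first dequeue
second = bottom.pop(0)        # second dequeue, served directly from Bottom

# Third dequeue: skip empty buckets until a populated one is reached.
cur = next(i for i, b in enumerate(rung1) if b)
parent, rung1[cur] = rung1[cur], []

# The bucket is too full to sort, so Rung[2] is spawned using Eq. (3).
assert len(parent) >= THRES
r_start2 = r_start1 + cur * width1
width2 = width1 / THRES
rung2 = distribute(parent, r_start2, width2, THRES)

bottom = sorted(rung2[0])
rung2[0] = []
third = bottom.pop(0)         # third dequeue

print(float(width1), cur + 1, float(width2))
print(float(first), float(second), float(third))
```

The populated bucket found during the third dequeue is the sixth bucket of Rung[1], and the three dequeued timestamps come out in increasing order, matching the sequence described for Figure 2.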
2.2.2 Successive Dequeue Operations Create LadderQ Epochs. Finally, it should be noted that the LadderQ operates in epochs when successive dequeue
Fig. 2. An example illustrating the dequeue algorithm.
operations take place. The start of a LadderQ epoch is marked by the creation of the Ladder and Bottom structures. This occurs when all the events are held in Top (i.e., there is currently no Ladder structure and no Bottom structure) and the first dequeue operation is initiated (which creates the Ladder structure and the Bottom structure). The operating parameters of the epoch are determined using Eqs. (1), (2), and (3). After the epoch is started, subsequent enqueue
Continuing from the dequeue example described in Figure 2, suppose two events with timestamps 5.6 and 3.2 are to be inserted. The event with timestamp 5.6 is simply appended to Top since 5.6 ≥ TopStart (= 5.0). The other event cannot be inserted in Rung[1], since its timestamp is less than RCur[1], which is 3.5. Since this event’s timestamp satisfies 3.2 ≥ RCur[2] (= 3.1), it is enqueued in Rung[2]. Using Eq. (2) with i = 2, this event is then appended to the third bucket of Rung[2].
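The routing decision in this example can be sketched as a small function. It is a reading of the worked example rather than the article's enqueue procedure, which is not reproduced in this excerpt: the comparisons against TopStart and RCur[i] are assumed to be "greater than or equal", the bucket index is assumed to be floor((timestamp − RStart[i]) / Bucketwidth[i]), and the function name and return format are hypothetical.

```python
from fractions import Fraction


def route_enqueue(ts, top_start, rungs):
    """Decide which tier receives an event with timestamp `ts`.

    `rungs` lists (r_start, r_cur, bucketwidth) for Rung[1], Rung[2], ...
    Returns (tier, rung_number, bucket_number), numbering from 1.
    """
    if ts >= top_start:
        return ("Top", None, None)
    for i, (r_start, r_cur, width) in enumerate(rungs, start=1):
        if ts >= r_cur:
            return ("Rung", i, int((ts - r_start) // width) + 1)
    # Earlier than every rung's current bucket: goes into sorted Bottom.
    return ("Bottom", None, None)


# State after the third dequeue of the Figure 2 example.
F = Fraction
top_start = F("5.0")
rungs = [(F("0.5"), F("3.5"), F("0.5")),   # Rung[1]
         (F("3.0"), F("3.1"), F("0.1"))]   # Rung[2]

print(route_enqueue(F("5.6"), top_start, rungs))
print(route_enqueue(F("3.2"), top_start, rungs))
```

Under these assumptions the event at 5.6 is routed to Top, and the event at 3.2 to the third bucket of Rung[2], as in the text.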
2.3 Spawning vs Resize and Value of THRES
The rung spawning process in LadderQ can occur in both enqueue and dequeue operations. This is functionally different from the resize operation of the SCQ, which involves the transfer of all events from an old queue to a new one with a different bucketwidth and is therefore an expensive procedure. In contrast, the spawning operation in LadderQ only involves copying events from the current dequeue bucket Bc in Ladder, or from the single linked list in Bottom, to a new rung. In short, spawning affects only a bucket or a linked list and not the entire PES structure. Compared with the SCQ, this methodology is much more economical and efficient.

Recall that during an enqueue operation, it is possible for an event to be inserted in the Bottom linked list. If events are frequently enqueued in Bottom, it is possible for LadderQ to degenerate into a single linked list structure, thus degrading the performance of LadderQ severely. In such cases, it is therefore imperative for a new child rung to be spawned in Ladder and for Bottom to be recopied to it (Bc changes to Bsp). Hence, further enqueue operations in the lowest region will be distributed into smaller buckets, resulting in enqueues having O(1) complexity instead of the O(n) complexity of traversing a long sorted linked list.

The other spawning operation is triggered during dequeue operations when the number of events in the current dequeue bucket Bc exceeds THRES. When this happens, a new child rung is created (Bc changes to Bsp), and events are redistributed in the child rung.

Now, the value of THRES is independent of the event distribution and can be determined by comparing the average time required for sorting an event in Bottom with the average time required for spawning and copying Bottom events into a new child rung. A short piece of code that compares the time required per event under Linear Sort and Copy was written and run on an Intel Pentium 4 workstation. The simulation results in Figure 3 show that the threshold where Linear Sort intersects Copy is around 50. This means that during a dequeue operation, if the number of events is greater than 50, the time is better invested in copying the events to a new child rung. The value of 50 concurs with previous studies finding that linear sort is efficient only when the list holds fewer than 50 events [Marin 1997].
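A comparison of this kind can be sketched as follows. This is not the authors' benchmark code, which is not given in the article; the function names are hypothetical, Python lists stand in for linked lists, and timings from an interpreted language on modern hardware should not be expected to reproduce the crossover near 50 reported for Figure 3.

```python
import random
import timeit


def linear_sort(events):
    """Build a sorted list by linear-search insertion, as Bottom does."""
    out = []
    for ts in events:
        i = 0
        while i < len(out) and out[i] <= ts:
            i += 1                # sequential scan for the insert point
        out.insert(i, ts)
    return out


def copy_to_rung(events, n_buckets):
    """Copy events, unsorted, into the buckets of a new child rung."""
    lo, hi = min(events), max(events)
    width = (hi - lo) / n_buckets or 1.0   # guard against a zero width
    buckets = [[] for _ in range(n_buckets + 1)]
    for ts in events:
        buckets[int((ts - lo) / width)].append(ts)
    return buckets


def time_per_event(fn, events, repeat=200):
    """Average wall-clock seconds per event over `repeat` runs."""
    total = timeit.timeit(lambda: fn(events), number=repeat)
    return total / (repeat * len(events))


if __name__ == "__main__":
    rng = random.Random(0)
    for n in (10, 50, 100):
        events = [rng.random() for _ in range(n)]
        t_sort = time_per_event(linear_sort, events)
        t_copy = time_per_event(lambda ev: copy_to_rung(ev, n), events)
        print(n, t_sort, t_copy)
```

The design point being illustrated is that the per-event cost of linear-search insertion grows with the list length, whereas the per-event cost of copying into buckets stays roughly flat, so some list length exists beyond which copying is the cheaper choice.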
2.4 Practical Aspects of LadderQ—Infinite Rung Spawning and Reusing Ladder Structure
The following presents some practical aspects of LadderQ for use in practical queue scenarios where some modification of the original LadderQ structure is required.
Fig. 3. Comparison of average time required per element for each operation.
First, the practical LadderQ incorporates a maximum limit of eight rungs that can be spawned at any one time. This prevents infinite spawning of rungs in Ladder. In the extensive assortment of simulation scenarios considered in Section 5, the maximum number of rungs spawned did not exceed three. If there are already eight rungs, then events in Bc (the current dequeue bucket) associated with the eighth rung are sorted to create Bottom even though the number of events in Bottom may exceed THRES. Infinite spawning of rungs can occur if the number of events exceeds THRES and the timestamps of all these events are identical. This causes all the events to be enqueued in one bucket irrespective of the number of rungs spawned. Even though scenarios with as many as 50 events having the same timestamp are rarely encountered, this safeguard eliminates the possibility of infinite spawning that may jeopardize the performance of LadderQ. For other, more common priority increment distributions where the mean μ of the jump random variable is finite and greater than zero, the theoretical justifications in Section 3 demonstrate that the average number of rungs spawned by the LadderQ is bounded by a constant.

Second, at the start of the simulation, the practical LadderQ is pre-initialized with five rungs (irrespective of whether the distribution requires fewer than five rungs), which is greater than the maximum number of rungs under most distributions (see Section 5). After each epoch, the rungs are not deleted; rather, they are reused for subsequent epochs. The only difference is that the bucketwidth parameter of the rungs differs from epoch to epoch. By reusing the same rung structures, memory fragmentation is avoided and superior performance is obtained since the rungs are created only once.

With the exception of Rung[1], the number of buckets in Rung[i], i > 1, is 50, that is, the THRES value. Hence, pre-initializing Rung[2] to Rung[5] is straightforward. In contrast, the number of buckets in Rung[1] is a variable which is only determined by Eqs. (1) and (2) when a new epoch is started. To be able to reuse Rung[1] repeatedly for each epoch, we create more than enough buckets in Rung[1] when the first epoch starts. For example, if the actual number of buckets required in Rung[1] is M in the first epoch, then we create 2M
It should be noted that the majority of practical simulation scenarios, including those with infinite variance in the jump variable, fit into the category where, after the creation of an epoch, the epoch population becomes progressively smaller. Hence, the conventional LadderQ scenario is essentially a multi-epoch LadderQ where, within the duration of the simulation, epochs arise and then eventually die away to give rise to another epoch.

However, there are two (pathological) scenarios where the epoch population grows instead of decreasing. The first is an unstable queue situation where the number of events enqueued is always higher than the number of events dequeued. Hence, it is possible for the epoch population to increase rather than decrease. Complexity analysis of queues implicitly requires that there be some fixed N representing the target number of events in the queue for which a complexity measure is to be derived. A growing queue scenario is one where no target N can be defined, and hence it bears no significance for further analysis.

The other scenario where the epoch population increases is where the mean jump parameter μ is zero, so that every event has the same timestamp. In this scenario, uncontrolled rung spawning may occur. This is the reason for imposing a maximum eight-rung limit on the practical LadderQ structure (see Section 2.4) and for relaxing the usual 50-element limit on the Bottom structure (i.e., the sorted linked list) so that it can hold more than 50 elements. Even if such pathological scenarios are encountered, the LadderQ can easily be adapted to yield O(1) performance as follows: if there are already eight rungs and it is detected that all events are arriving with the same timestamp, then a special tail pointer for Bottom is initialized so that an enqueue does not require scanning events from the head of the queue. This makes LadderQ clearly an O(1) structure, that is, every dequeue is O(1) using Bottom’s usual head pointer and every enqueue is also O(1) using Bottom’s special tail pointer.

We now return to the more conventional LadderQ scenarios where multiple epochs are encountered. Such scenarios have the property that the priority increment distribution has a finite mean jump parameter μ and the queue size grows to some well-defined value N and then stays at that level (i.e., the numbers of dequeues and enqueues are roughly the same) for some time. Consequently, the LadderQ structure proceeds in epochs, which implicitly requires that the current epoch population decrease to zero at some point in time (if the epoch population does not decrease, then the LadderQ cannot proceed in epochs). Note also that during each epoch, the bucketwidth parameter $\delta_1$ stays constant.

In the following theoretical analysis, we demonstrate that the LadderQ is O(1) for conventional queue scenarios where the epoch population starts decreasing from the time it was created. However, in view of Lemma 3.1, it is also clear that the worst-case LadderQ complexity in a practical queue scenario is where the epoch population is at its maximum and equal to N. In other words, it is reasonable to consider a 1-epoch LadderQ where all event activities (i.e., enqueue and dequeue activities) are assumed to occur only in Bottom or Ladder and the epoch population is always a constant equal to N. In addition, the 1-epoch LadderQ has a bucketwidth parameter $\delta_1$ which remains constant throughout. The 1-epoch LadderQ also has some practical
significance in that it represents the initial state of the multi-epoch LadderQ during the time when the epoch was first created with maximum events. It is also during this initial time following the epoch creation that the LadderQ experiences maximum time complexity. If this 1-epoch LadderQ is O (1), then clearly, the multi-epoch LadderQ is also O (1) (by virtue of Lemma 3.1). Hence, we state the following corollary, which is a result of Lemma 3.1:
COROLLARY 3.1. The average time complexity of the multi-epoch LadderQ, that is, the conventional LadderQ, is no larger than the average time complexity of the 1-epoch LadderQ.
It should also be noted that the theoretical analysis does not consider the cost of rung creation since rung creation is done only once, as explained in Section 2.4. As for the number of buckets in Rung[1]: if N is the target size of the queue, then based on (1), where we set NTop = N, the total number of buckets required in Rung[1] is just N + 1 (where the one additional bucket accommodates the maximum-timestamp event), and this does not change any further during the epoch. Rung creation is just a fixed initial cost for each epoch, irrespective of whether we are considering a multi-epoch LadderQ or a 1-epoch LadderQ. This fixed cost is O(1) per event for each epoch since the cost of creating N buckets is a one-time cost incurred when transferring N events from Top to Ladder.

Finally, an important point for the theoretical analysis is that we do not impose a maximum limit of eight rungs, and the usual 50-element limit applies to Bottom irrespective of the number of rungs spawned. The practical LadderQ’s limit of eight rungs, as stated in Section 2.4, exists only for the specific and clearly pathological queue situation where all the event timestamps are equal. As long as the mean jump parameter μ is finite and not zero, and the number of events does not grow without bound, a maximum rung limit has no theoretical basis for its existence. This will be clear in the following sections.
3.2 Complexity of 1-epoch LadderQ
The complexity of tree-based priority queues (e.g., Splay Tree [Sleator and Tarjan 1985]) is often gauged by considering the average height of the growing tree structure. Similarly, the complexity of the 1-epoch LadderQ is closely related to the average number of rungs that are spawned, as will be seen later. We begin with the following proposition:
PROPOSITION 3.1. The 1-epoch LadderQ is O(1) if the average number of rungs for a priority event distribution is bounded by a constant.
Justification. The proposition can be justified by considering the cost of a dequeue operation and the cost of an enqueue operation in a 1-epoch LadderQ as the number of events N increases. If the costs of these two bread-and-butter operations in the 1-epoch LadderQ are O(1), then the structure is also O(1). We consider these two basic costs and analyze them in terms of their fixed-cost and variable-cost components.
Dequeue Cost. A fixed cost of dequeue is incurred when there is already a Bottom list. In this situation, since there is always a pointer that points to the
to belong to the spawning parent bucket. Hence, the distribution of events in Rung [1] of the Ladder portion is identical to the UCQ scenario considered in Erickson et al. [2000].
3.4 Useful Lemmas Applicable to the 1-Epoch LadderQ
Several lemmas, which are to be used later for showing the average number of rungs in a 1-epoch LadderQ is bounded by a constant, are now presented. For convenience, we define the following variables:
$B_i$ is the random variable giving the number of events, ranging from 0 to N, contained in the bucket Bc or Bsp in Rung[i].

$\delta_i$ represents the bucketwidth of Rung[i]. Based on (3) and for THRES = 50, we note that $\delta_{i+1} = \delta_i / 50 = \delta_1 / 50^{i}$. Note that the analysis is for the 1-epoch LadderQ, where $\delta_1$ is a constant throughout.

μ is the finite mean of the jump random variable that defines the priority increment distribution of the N events. The jump random variable has a cumulative distribution function denoted by F(x).

$q_{i,j}$ denotes the limiting probability that a bucket in Rung[i] has exactly j events enqueued inside.
LEMMA 3.2. For N ≥ 2 and all $\delta_1 > 0$, the N events are distributed amongst the one-year, UCQ-like, first rung structure of the 1-epoch LadderQ according to

$$P(B_1 = 0) = q_{1,0} = \frac{\mu}{\mu + N\delta_1}$$

and for j = 1, 2, ..., N, we have

$$q_{1,0}\,B(j) \le P(B_1 = j) = q_{1,j} \le \frac{q_{1,0}\,B(j)}{1 - F(\delta_1)}$$

where B(j) is the tail probability of a binomial distribution for N trials with “success” parameter $p(\delta_1)$:

$$B(j) = \sum_{k=j}^{N} \binom{N}{k}\, p^{k} (1 - p)^{N-k}, \tag{6}$$

$$p(\delta_1) = \frac{1}{\mu} \int_{0}^{\delta_1} [1 - F(x)]\,dx. \tag{7}$$
Remark. Lemma 3.2 provides a mathematical description of the UCQ distribution under a Hold scenario. The relevant proof is presented in Erickson et al. [2000] using a Markov chain model.
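To make the quantities in Lemma 3.2 concrete, the sketch below evaluates them numerically for one particular jump distribution. The exponential distribution, the parameter values, and all function names are assumptions chosen for illustration; for exponential jumps with mean μ, F(x) = 1 − exp(−x/μ), so the integral in Eq. (7) has the closed form 1 − exp(−δ1/μ).

```python
import math


def q10(mu, n, delta1):
    """P(B1 = 0): probability that a first-rung bucket is empty."""
    return mu / (mu + n * delta1)


def p_exponential(mu, delta1):
    """Eq. (7) evaluated for F(x) = 1 - exp(-x/mu)."""
    return 1.0 - math.exp(-delta1 / mu)


def binomial_tail(j, n, p):
    """Eq. (6): P(X >= j) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, k) * p**k * (1.0 - p)**(n - k)
               for k in range(j, n + 1))


def q1j_bounds(j, mu, n, delta1):
    """Lower and upper bounds on P(B1 = j) for exponential jumps."""
    f_delta = 1.0 - math.exp(-delta1 / mu)   # F(delta1)
    lower = q10(mu, n, delta1) * binomial_tail(j, n, p_exponential(mu, delta1))
    return lower, lower / (1.0 - f_delta)


mu, n, delta1 = 1.0, 100, 0.01   # illustrative values only
print(q10(mu, n, delta1))
for j in (1, 2, 5):
    print(j, q1j_bounds(j, mu, n, delta1))
```

For small δ1 relative to μ, F(δ1) is close to zero, so the two bounds nearly coincide and pin down the bucket occupancy probabilities tightly.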
The next lemma presents results on the probability distribution of events in the child rungs. For this purpose, it can be assumed, without loss of generality, that each bucket in Rung[1] will spawn m buckets as shown in Figure 4. For the 1-epoch LadderQ, m = 50. However, the proof is applicable for all m > 1. An event enqueued in Rung[1] can be seen to be virtually enqueued in Rung[2]. This virtual enqueue process can also be applied to higher
Fig. 4. Events in a parent bucket are virtually enqueued amongst m child buckets.
order rungs where virtual events in a parent bucket are seen to be virtually enqueued amongst m buckets in the child rung. With the virtual enqueue process in place, it is clear that each child Rung[i] can be considered to be a one-year UCQ with bucketwidth $\delta_1 / m^{i-1}$. Hence, Lemma 3.2 is also applicable to the successive child rungs, so that the following probability expression is valid:

$$P(B_i = 0) = q_{i,0} = \frac{\mu}{\mu + N\,\dfrac{\delta_1}{m^{i-1}}} \tag{8}$$
We now present Lemma 3.3 as follows:
LEMMA 3.3. As more levels of child rungs are spawned, the probability that a bucket in the lowest child rung, denoted as the Lth rung, has no element will asymptotically approach unity, that is, $q_{1,0} < q_{2,0} < \cdots < q_{L-1,0} < q_{L,0} < \cdots \to 1$.
PROOF. The proof follows from (8) by letting i increase to a large value.

Remark. Lemma 3.3 is also rather intuitive: as more child rungs are spawned, the events are spread out further amongst the child rungs, and thus the likelihood that any particular bucket in the lowest rung is empty becomes greater. This lemma also suggests that the number of child rungs spawned ought to be bounded.
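The trend stated in Lemma 3.3 can be seen numerically by evaluating the empty-bucket probability rung by rung. The sketch below assumes the form of q(i,0) in which the bucketwidth of Rung[i] is δ1 divided by m to the power i − 1; the parameter values and the function name are illustrative only.

```python
def q_i0(i, mu, n, delta1, m=50):
    """Empty-bucket probability for Rung[i] with bucketwidth delta1 / m**(i-1)."""
    return mu / (mu + n * delta1 / m ** (i - 1))


# Illustrative values: N = 1000 events, mean jump 1.0, first-rung width 0.001.
values = [q_i0(i, mu=1.0, n=1000, delta1=0.001) for i in range(1, 6)]
for i, q in enumerate(values, start=1):
    print(i, q)
```

Each additional rung divides the term N·δ1 in the denominator by m, so with m = 50 the probability climbs from one half at Rung[1] to within a fraction of a percent of unity by Rung[3].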
LEMMA 3.4. Let L represent the random variable that counts the number of rungs spawned in a 1-epoch LadderQ. Then for n > 1, $P(L = n) < (1 - q_{n-1,0})$.
PROOF. We provide proofs for P(L = 2) and P(L = 3) and then apply induction to obtain the desired result for P(L = n). It is noted that for the 1-epoch LadderQ, a new child rung is spawned when the number of events in a parent bucket has reached 50. Hence, we note that:

$$P(L = 1) = P(B_1 < 50)$$

$$P(L = 2) = P(B_2 < 50,\ B_1 \ge 50) \le P(B_1 \ge 50) = 1 - P(B_1 < 50) < 1 - q_{1,0}.$$
P ROOF. We begin with the usual expression for obtaining the average num- ber of rungs for a total of N events in the 1-epoch LadderQ:
$$E[L] = \sum_{j=1}^{\infty} j\,P(L = j) < 1 + 2(1 - q_{1,0}) + 3(1 - q_{2,0}) + 4(1 - q_{3,0}) + \cdots + T_j + T_{j+1} + \cdots \quad \text{(see Lemma 3.3)}$$
where $T_j$ and $T_{j+1}$ represent the higher-order terms of the series. The series converges as long as the limiting ratio $T_{j+1}/T_j$ is less than 1. Thus,
$$\lim_{j \to \infty} \frac{T_{j+1}}{T_j} = \lim_{j \to \infty} \frac{j+1}{j} \cdot \frac{1 - q_{j+1,0}}{1 - q_{j,0}} = \lim_{j \to \infty} \frac{m^{j-1}\mu + N\delta_1}{m^{j}\mu + N\delta_1} = \frac{1}{m} < 1.$$
Since the ratio test evaluates to a value that is less than unity, the average number of rungs must converge to some constant less than infinity.
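As a numerical illustration of the ratio test, the sketch below evaluates partial sums of the upper bound $1 + \sum_{j \ge 2} j(1 - q_{j-1,0})$ and the ratio $T_{j+1}/T_j$ in the form written above. It uses $1 - q_{j,0} = N\delta_1 / (m^{j-1}\mu + N\delta_1)$, which follows from the expression for $q_{i,0}$. The parameter values and all names are assumptions made for illustration, and the bound itself is loose; it only demonstrates finiteness.

```python
def one_minus_q(j, mu, n_events, delta1, m):
    """Return 1 - q_{j,0} = N*delta1 / (m**(j-1) * mu + N*delta1)."""
    span = n_events * delta1
    return span / (float(m) ** (j - 1) * mu + span)


def rung_bound(mu, n_events, delta1, m, terms):
    """Partial sum 1 + sum_{j=2}^{terms} j * (1 - q_{j-1,0})."""
    return 1.0 + sum(
        j * one_minus_q(j - 1, mu, n_events, delta1, m)
        for j in range(2, terms + 1)
    )


def term_ratio(j, mu, n_events, delta1, m):
    """Return ((j+1)/j) * (1 - q_{j+1,0}) / (1 - q_{j,0})."""
    return ((j + 1) / j) * (
        one_minus_q(j + 1, mu, n_events, delta1, m)
        / one_minus_q(j, mu, n_events, delta1, m)
    )


# Illustrative (assumed) parameters; only M = 50 is taken from the text.
MU, N, DELTA1, M = 1.0, 10_000, 1.0, 50

bound_30 = rung_bound(MU, N, DELTA1, M, terms=30)
bound_60 = rung_bound(MU, N, DELTA1, M, terms=60)
ratio_100 = term_ratio(100, MU, N, DELTA1, M)

print(f"bound with 30 terms: {bound_30:.9f}")
print(f"bound with 60 terms: {bound_60:.9f}")
print(f"T_101 / T_100:       {ratio_100:.6f} (1/m = {1 / M:.6f})")
```

Doubling the number of terms leaves the partial sum essentially unchanged, and the term ratio sits close to 1/m, which is consistent with the convergence argument.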
COROLLARY 3.2. The 1-epoch LadderQ has O(1) average time complexity.
PROOF. The proof is obtained by combining Proposition 3.1 with either Theorem 3.1 or Theorem 3.2.
COROLLARY 3.3. The 1-epoch LadderQ has O(N) total memory usage.
PROOF. As shown in Eq. (1), the Ladder’s first rung requires N + 1 (≈ N) buckets on a transfer of events from Top, and thus the first rung has O(N) memory usage. Each subsequent child rung requires O(THRES) memory space. Since the average number of rungs is bounded (see Theorem 3.1 or Theorem 3.2), the 1-epoch LadderQ’s memory consumption is therefore bounded by O(N).
COROLLARY 3.4. The average amortized dequeue cost incurred when events are transferred from Top to the multirung Ladder structure and then to the Bottom structure is O(1).
PROOF. Assume that the average number of rungs spawned is C. The worst-case total cost (note: the total cost, not the amortized cost) incurred is then O(N(C + 1)), where all the events in Top traverse C rungs before reaching Bottom (see Proposition 3.1). Hence, the cost incurred per event is O(C + 1) = O(1), since C is a constant independent of N, as given in Theorems 3.1 and 3.2.
3.6 Theorems for the Conventional LadderQ’s O(1) Amortized Complexity
For formality, we now have
COROLLARY 3.5. The conventional (multi-epoch) LadderQ is, theoretically, an O(1) priority event queue structure.
PROOF. Combine the results of Corollary 3.1 and Corollary 3.2.
LadderQ, with its theoretical O(1) amortized complexity, is clearly and significantly more robust than the O(1) priority queue structure currently proposed. It should be noted that the UCQ considered by Erickson et al. [2000] is O(1) only provided the bucketwidth can be kept at O(1/N). Therefore, for the UCQ to maintain O(1) in a dynamic queue situation where N varies, costly bucketwidth resizes must be initiated to adjust the bucketwidth, and hence the UCQ has never been widely used as the priority queue structure for practical simulators.
In comparison with the more widely implemented SCQ, the LadderQ is also superior. First, in the area of amortized complexity, the SCQ can at best be described as having expected O(1) complexity, meaning that there is no known theoretical proof of O(1) behavior, only evidence from a number of simulation examples. In fact, simulation studies conducted on the SCQ have shown that for certain scenarios where the priority increment is highly unstable, the SCQ exhibits O(N) characteristics due to either under- or over-triggering of resize operations.
To demonstrate the usefulness and superiority of the LadderQ for practical implementation, Sections 4 and 5 provide simulation studies on the performance of the LadderQ in comparison with previously proposed priority queue structures. It should be noted that most of the scenarios chosen for the simulation studies are those that have been previously proposed by well-known researchers on priority queues. Other scenarios presented have been suggested by the reviewers to incorporate more stringent tests. In fact, the LadderQ does not require any special scenarios to show its superiority, since it already has the distinction of being O(1) theoretically, irrespective of the bucketwidth parameter (and hence N). We note that in the numerical studies presented in Section 5, the maximum number of rungs spawned never exceeded three levels.
The performance of priority queues is often measured by the average access time to enqueue or dequeue an event under different load conditions. The parameters to be varied for each queue are the access pattern, the priority distribution, and the queue size. The access pattern models that have been proposed emulate either the steady-state or the transient phase of a typical simulation: the Classic Hold model [Jones 1986] and the Up/Down model [Rönngren et al. 1993], respectively. The priority increment distributions used for the benchmarking of priority queue structures are found in Table II, where rand() returns a random number [Park and Miller 1988] in the interval [0,1]. The Camel(x, y) distribution [Rönngren et al. 1993] represents a 2-hump heavily skewed distribution with x% of its mass concentrated in the two humps, and the duration of the two humps is y% of the total interval. The Change(A, B, x) distribution [Rönngren et al. 1993] was also used to test the sensitivity of the SCQ when exposed to drastic changes in priority increment distribution. The compound distribution Change(A, B, x) interleaves two different priority increment distributions A and B. Initially, x priority increments are drawn from A, followed by another x priority increments drawn from B, and so on. Change distributions can be used to model simulations where the priority increment distributions vary