Online Prediction of Traffic Load in Framed-ALOHA Networks using LSTM, Lecture notes of Statistics

This document addresses the challenge of predicting traffic load in framed-ALOHA networks, where the base station does not have access to the cardinality of collisions. The authors propose using Long Short-Term Memory (LSTM) networks to learn from historical data and improve prediction accuracy. They argue that relying solely on the latest observations ignores valuable historical information. The document focuses on modern IoT traffic scenarios with complex statistics, and compares the proposed method to Method of Moments (MOM)-based and Maximum-Likelihood (ML) estimators.

Typology: Lecture notes

2021/2022

Uploaded on 09/27/2022

Online Supervised Learning for Traffic Load
Prediction in Framed-ALOHA Networks
Nan Jiang, Yansha Deng, Osvaldo Simeone, and Arumugam Nallanathan
Abstract
Predicting the current backlog, or traffic load, in framed-ALOHA networks enables the optimization of resource
allocation, e.g., of the frame size. However, this prediction is made difficult by the lack of information about the
cardinality of collisions and by possibly complex packet generation statistics. Assuming no prior information about
the traffic model, apart from a bound on its temporal memory, this paper develops an online learning-based adaptive
traffic load prediction method that is based on Recurrent Neural Networks (RNN) and specifically on the Long
Short-Term Memory (LSTM) architecture. In order to enable online training in the absence of feedback on the exact
cardinality of collisions, the proposed strategy leverages a novel approximate labeling technique that is inspired by
Method of Moments (MOM) estimators. Numerical results show that the proposed online predictor considerably
outperforms conventional methods and is able to adapt to changing traffic statistics.
Index Terms
Traffic load prediction, framed-ALOHA, online supervised learning, recurrent neural network.
I. INTRODUCTION
Framed-ALOHA (f-ALOHA) has been widely adopted as a key component of multiple access protocols
in many state-of-the-art wireless communication systems, including Long-Term Evolution (LTE) and 5G
New Radio (NR). In f-ALOHA, time is organized into time frames, with each frame containing multiple
Random Access Opportunities (RAOs). RAOs refer to subsets of channel resources in time, frequency,
or/and code domain, e.g., random access preambles in the LTE system. In each frame, devices select RAOs
at random and transmit to the connected Base Station (BS) in an uncoordinated manner. Collisions cause
This work was supported by the Engineering and Physical Sciences Research Council (EPSRC), U.K., under Grant EP/R006466/1 and the
European Research Council (ERC) under the European Union Horizon 2020 research and innovative programme (grant agreement No 725731).
N. Jiang, and A. Nallanathan are with School of Electronic Engineering and Computer Science, Queen Mary University of London, London,
UK (e-mail: {nan.jiang, a.nallanathan}@qmul.ac.uk).
Y. Deng, and O. Simeone are with Department of Informatics, King’s College London, London, UK (e-mail: {yansha.deng, osvaldo.simeone}
@kcl.ac.uk).
arXiv:1907.11064v1 [cs.NI] 25 Jul 2019

Fig. 1: Timeline of an f-ALOHA protocol and target predictor based on historical data about collided, successful, and idle RAOs in previous frames.

devices to retransmit in following frames, increasing the backlog of packets to be transmitted in a frame beyond the load due to newly generated packets. Adapting the number of RAOs per frame to the estimated current backlog, or traffic load, is an important step to relieve network congestion and reduce access delays. However, prediction is made difficult by the fact that the BS does not have access to the cardinality of the collisions, that is, to the number of devices that have selected the same RAO, and by the possibly complex nature of the incoming traffic statistics (see Fig. 1). As an example, the incoming traffic may consist of mixtures of different traffic types, including periodic, event-driven (bursty), and multimedia streaming patterns [1, 2].

Estimating traffic backlog in f-ALOHA can only rely on the observation of the number of RAOs in each frame that are idle, collided, or successful (see Fig. 1). Classical works have proposed Method of Moments (MOM)-based estimators that aim at matching the average number of such RAOs to the current measurements [3]. More recent works proposed to predict bursty traffic for event-driven applications, i.e., for a massive number of devices being activated by an external event to request transmissions within a short period, using drift analysis [4], MOM [5], or Maximum-Likelihood (ML) estimation [6]. All these prior works estimate the current backlog based only on the latest observations of idle, collided, or successful RAOs, while ignoring historical data from prior frames. As argued in this work, information about RAOs in previous frames can be useful to learn features of the traffic statistics that enable an improved prediction.

In this work, we target modern Internet of Things (IoT) traffic scenarios with complex statistics, possibly encompassing mixtures of long- and short-memory processes, e.g., a mixture of random and periodic transmissions with long duty cycles [1, 2].
In order to capture the complex dynamics of the IoT traffic, we propose an online supervised learning method that adopts a Recurrent Neural Network (RNN) model based on the state-of-the-art Long Short-Term Memory (LSTM) architecture [7]. The most relevant prior work is [8], where the authors leverage LSTM to classify mobile encrypted traffic. To the best of our knowledge, the application of LSTM to traffic backlog prediction in f-ALOHA has not been considered

goal of this paper is to predict the forthcoming traffic value $N^{t+1}$ by learning a conditional distribution $P\{N^{t+1} = n \mid H^t\}$ and then solving the Maximum A Posteriori (MAP) problem

$$(\mathrm{P1}): \quad \hat{N}^{t+1} = \underset{n \in \{0, 1, \ldots, N_{\max}\}}{\arg\max} \; P\{N^{t+1} = n \mid H^t\}, \tag{1}$$

where $N_{\max}$ is an upper bound on the backlog. Unless one makes simplistic assumptions as in [4–6], problem (1) remains generally intractable even in the presence of realistic traffic models, further justifying the data-driven solutions developed in prior art and in this letter. We emphasize that we focus solely on the problem of prediction, and that we leave to future work the interplay between traffic prediction and overload control via, e.g., frame size selection and access barring.
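As a minimal sketch of the MAP rule in (1), assuming a learned posterior over $n \in \{0, \ldots, N_{\max}\}$ is already available as a vector of probabilities, the prediction is simply the index of the largest entry. The function name `map_predict` and the example posterior are hypothetical and used only for illustration.

```python
import numpy as np

def map_predict(posterior):
    """Return the backlog value n in {0, ..., N_max} with the largest posterior mass."""
    posterior = np.asarray(posterior, dtype=float)
    return int(np.argmax(posterior))

# Hypothetical posterior over n = 0..4 (i.e., N_max = 4), for illustration only.
example_posterior = np.array([0.05, 0.10, 0.50, 0.25, 0.10])
n_hat = map_predict(example_posterior)
```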

III. CONVENTIONAL BACKLOG PREDICTION

In this section, we review two conventional traffic prediction methods, namely MOM-based algorithms [3] (see also [9, 10, Sec. III] [5, Sec. V-A]), and the current state-of-the-art ML estimator of [6]. Both these methods use an estimate $\tilde{N}^t$ for the backlog $N^t$ in the current frame, given observations in frame $t$, as the prediction $\hat{N}^{t+1}$ for the backlog in frame $t+1$. We emphasize that, throughout this letter, given information $H^t$ available at the end of frame $t$, we use the notation $\tilde{N}^t$ to represent an estimate of the backlog in frame $t$, while $\hat{N}^{t+1}$ denotes a prediction for the backlog at the beginning of frame $t+1$.

A. MOM-Based Estimator

Given a backlog $N^t = n$ in any frame $t$, neglecting the possibility of detection errors, the expected numbers of RAOs in the idle, success, and collision states are given respectively as

$$E_i(n) = E[V_i \mid N^t = n] = F\left(1 - \frac{1}{F}\right)^{n}, \tag{2}$$

$$E_s(n) = E[V_s \mid N^t = n] = n\left(1 - \frac{1}{F}\right)^{n-1}, \tag{3}$$

$$E_c(n) = E[V_c \mid N^t = n] = F\left(1 - \left(1 - \frac{1}{F}\right)^{n} - \frac{n}{F}\left(1 - \frac{1}{F}\right)^{n-1}\right), \tag{4}$$

where we recall that $F$ is the number of RAOs. These expectations can be easily computed by noting that each of the $n$ active devices selects any of the $F$ RAOs with equal probability. MOM estimators $\tilde{N}^t_{\mathrm{MOM}}$ of the current backlog $N^t$ aim at matching one or more of the moments in (2), (3), and (4) to the current observations $V_i^t$, $V_s^t$, and $V_c^t$, respectively. A MOM estimator hence generally finds a value of $n$ that minimizes a measure of the discrepancy between the moments in (2), (3), and (4),

on the one hand, and the respective observations, on the other. For instance, one could consider the Mean Absolute Error (MAE)

$$\phi_t(n) = \frac{1}{3}\left(|E_i(n) - V_i^t| + |E_s(n) - V_s^t| + |E_c(n) - V_c^t|\right), \tag{5}$$

which would yield the MOM-based estimator

$$\hat{N}^{t+1}_{\mathrm{MOM}} = \tilde{N}^{t}_{\mathrm{MOM}} = \underset{n \in \{0, 1, \ldots, N_{\max}\}}{\arg\min} \; \phi_t(n). \tag{6}$$

Simplified, and generally less accurate, MOM-based estimators that enjoy closed-form solutions have been proposed in [5, 9]. As an example, in [9, Sec. 3], the traffic load $N^t$ is estimated by matching the single moment (2) with the given observation $V_i^t$. Imposing the equality $E_i(n) = V_i^t$ and using the rounded solution $n$ as the estimate $\tilde{N}^{t}$ of the backlog for the current frame yields the estimator

$$\hat{N}^{t+1}_{\mathrm{MOM}} = \tilde{N}^{t}_{\mathrm{MOM}} = \mathrm{round}\left\{\frac{\log\left(V_i^t / F\right)}{\log\left(1 - \frac{1}{F}\right)}\right\}, \tag{7}$$

where round{·} is the nearest integer function.
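The two MOM estimators above can be sketched as follows. This is an illustrative implementation under the same assumption of no detection errors. The function names are hypothetical, and the guard against a zero idle count in the closed-form estimator is an added assumption, since the logarithm in (7) is undefined in that case.

```python
import numpy as np

def expected_counts(n, F):
    """Expected idle, success, and collision RAO counts for backlog n, as in (2)-(4)."""
    n = np.asarray(n, dtype=float)
    q = 1.0 - 1.0 / F
    e_idle = F * q ** n
    e_succ = n * q ** (n - 1)
    e_coll = F - e_idle - e_succ  # the three expectations sum to F
    return e_idle, e_succ, e_coll

def mom_mae_estimate(v_idle, v_succ, v_coll, F, n_max):
    """Grid search over n = 0..n_max minimizing the MAE discrepancy in (5)."""
    n_grid = np.arange(n_max + 1)
    e_i, e_s, e_c = expected_counts(n_grid, F)
    phi = (np.abs(e_i - v_idle) + np.abs(e_s - v_succ) + np.abs(e_c - v_coll)) / 3.0
    return int(np.argmin(phi))

def mom_idle_estimate(v_idle, F):
    """Closed-form estimator matching only the idle moment, as in (7)."""
    if v_idle <= 0:
        # Assumption for this sketch: reject frames with no idle RAO.
        raise ValueError("closed-form estimator needs at least one idle RAO")
    return int(np.rint(np.log(v_idle / F) / np.log(1.0 - 1.0 / F)))
```

For example, with F = 54 RAOs and an observation of 53 idle, 1 successful, and 0 collided RAOs, both estimators return a backlog of 1.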

B. ML Estimator

A more complex estimator can be obtained by using the ML estimator $\tilde{N}^t_{\mathrm{ML}}$ of the current backlog [6]. This is given as

$$\hat{N}^{t+1}_{\mathrm{ML}} = \tilde{N}^{t}_{\mathrm{ML}} = \underset{n \in \{0, 1, \ldots, N_{\max}\}}{\arg\max} \; P\{O^t \mid N^t = n\}. \tag{8}$$

Solving problem (8) requires the computation of the probability $P\{O^t \mid N^t = n\}$ for each possible $n \in \{0, 1, \ldots, N_{\max}\}$ given the current observation $O^t$. Note that each value $P\{O^t \mid N^t = n\}$ represents the likelihood of a value $n$ given the current observation. Reference [6] proposes a numerical approach that computes the vector of probabilities, or likelihoods, $P\{O^t \mid N^t = n\}$ for all $n \in \{0, 1, \ldots, N_{\max}\}$ as the steady-state probability vector of a Markov chain. This Markov chain is defined by letting each device sequentially, and independently, select an RAO at each step. We refer to [6] for details on the numerical procedure.
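To illustrate the idea of letting devices select RAOs one at a time, the sketch below computes the likelihoods in (8) exactly by propagating the joint distribution of the idle and success counts as each device is added. This is not the procedure of [6], only a simple dynamic program in the same spirit. It neglects detection errors, and the function names are hypothetical.

```python
from collections import defaultdict

def occupancy_likelihoods(v_idle, v_succ, F, n_max):
    """Return [P{V_i = v_idle, V_s = v_succ | N = n} for n = 0..n_max]."""
    # State is (idle, success); the collision count is F - idle - success.
    dist = {(F, 0): 1.0}  # with zero devices, all F RAOs are idle
    likes = [dist.get((v_idle, v_succ), 0.0)]
    for _ in range(n_max):
        nxt = defaultdict(float)
        for (i, s), p in dist.items():
            c = F - i - s
            if i > 0:  # new device picks an idle RAO: it becomes a success
                nxt[(i - 1, s + 1)] += p * i / F
            if s > 0:  # picks a single-occupied RAO: it becomes a collision
                nxt[(i, s - 1)] += p * s / F
            if c > 0:  # picks an already collided RAO: counts are unchanged
                nxt[(i, s)] += p * c / F
        dist = nxt
        likes.append(dist.get((v_idle, v_succ), 0.0))
    return likes

def ml_estimate(v_idle, v_succ, F, n_max):
    """Backlog value in 0..n_max that maximizes the likelihood of the observation."""
    likes = occupancy_likelihoods(v_idle, v_succ, F, n_max)
    return max(range(n_max + 1), key=likes.__getitem__)
```

As a check, with F = 2 RAOs and two devices, both RAOs are successful with probability 1/2, which is the value the recursion returns for that observation at n = 2.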

IV. ONLINE SUPERVISED LEARNING-BASED BACKLOG PREDICTION

In this section, we propose an online supervised learning approach for the training of a predictor of the forthcoming traffic load $N^{t+1}$ given the observations $H^t$ available at the end of frame $t$. Unlike the existing methods presented above, the proposed scheme aims at capturing not only the information present in the most recent observation $O^t$, but also the historical information in the previous observations in $H^t$ in order to detect patterns in the traffic generation mechanism and in the communication protocol. To this end, the

stateful counterparts, in which the state of the LSTM layer is not re-initialized at each step [11]. In order to adapt the model parameter $\theta$, we adopt standard stochastic gradient descent implemented via BackPropagation Through Time (BPTT) [12]. In particular, at each frame $t+1$, the BS estimates the current backlog $\tilde{N}^{t+1}$ using one of the methods discussed in Sec. III. Then, it updates the weights $\theta$ in the direction of the negative gradient of the cross-entropy loss

$$L^{t}(\theta) = -\log P\{\hat{N}^{t+1} = \tilde{N}^{t+1} \mid O^{t}_{t-T_o+1}, \theta\}, \tag{9}$$

where we recall that the probability $P\{\hat{N}^{t+1} = \tilde{N}^{t+1} \mid O^{t}_{t-T_o+1}, \theta\}$ is defined by the LSTM and by the softmax layer. The gradient can be computed via BPTT using standard methods. In practice, rather than applying the gradient of Eq. (9) at frame $t+1$, it is preferable to consider a window, or random mini-batch, of $T_b$ previous values and evaluate the gradient of the average loss

$$L^{t}(\theta) = -\sum_{t'=t-T_b+1}^{t} \log P\{\hat{N}^{t'+1} = \tilde{N}^{t'+1} \mid O^{t'}_{t'-T_o+1}, \theta\}. \tag{10}$$

This can generally reduce the variance of the stochastic gradient and improve the stability of training [13].

In order to reduce the time and computational resources needed for convergence of LSTM training, it is useful to initialize the weights of the LSTM by first running offline experiments based on available traffic models, which can be mismatched to online traffic statistics. This may be considered as an example of meta-learning [14]. We will provide an example in the next section.
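The windowed cross-entropy loss with approximate labels can be sketched as follows. The sketch assumes that the network has already produced one row of unnormalized scores (logits) over $\{0, \ldots, N_{\max}\}$ for each frame in the window, and that the labels are backlog estimates obtained with one of the methods of Sec. III. The LSTM itself and BPTT are not implemented here, and all function names are hypothetical.

```python
import numpy as np

def log_softmax(logits):
    """Numerically stable log-softmax along the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def windowed_cross_entropy(logits, approx_labels):
    """Negative sum of log-probabilities assigned to the approximate labels.

    logits: array of shape (T_b, N_max + 1), one row per frame in the window.
    approx_labels: integer backlog estimates, shape (T_b,), used as labels.
    """
    logits = np.asarray(logits, dtype=float)
    approx_labels = np.asarray(approx_labels, dtype=int)
    logp = log_softmax(logits)
    rows = np.arange(len(approx_labels))
    return float(-logp[rows, approx_labels].sum())

def grad_wrt_logits(logits, approx_labels):
    """Gradient of the loss with respect to the logits: softmax minus one-hot label."""
    logits = np.asarray(logits, dtype=float)
    approx_labels = np.asarray(approx_labels, dtype=int)
    grad = np.exp(log_softmax(logits))
    grad[np.arange(len(approx_labels)), approx_labels] -= 1.0
    return grad
```

In a full implementation, the gradient with respect to the logits would be propagated back through the softmax layer and the LSTM by an automatic differentiation framework.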

V. NUMERICAL RESULTS

In this section, numerical experiments are conducted to evaluate the traffic load prediction accuracy of the proposed online supervised learning method. We mostly assume the presence of $N_u = 1000$ devices generating a packet in any frame independently with probability 0.005, as well as of additional $N_p = 20$ devices generating one packet every $T_p = 10$ frames in a deterministic (periodic) manner. This scenario captures the coexistence of services with both random and periodic traffic types [1, 2]. We will also consider random bursty traffic [2, Ch. 6.1], as detailed later. Unless stated otherwise, other parameters are set according to the 3GPP technical report for Machine-Type Communication [2] as follows: error detection probability $p_{ed} = 0.05$; retransmission constraint $\gamma_{\max} = 10$; and number of RAOs $F = 54$.

We compare the performance in terms of prediction error among the MOM predictor (7), the ML predictor (8), and three LSTM-based predictors. The first, referred to as “Offline LSTM”, trains the LSTM during an offline phase with $10^5$ frames of synthetically generated traffic. For the offline phase, the exact backlog $N^{t+1}$ can be used when training using the cross-entropy criterion. The statistics of the offline traffic are different from the online traffic in that the former only contains the underlying random traffic, while the latter

TABLE I: Supervised Learning Hyperparameters

Hyperparameter                     Value
Memory $T_o$                       20
RMSProp learning rate $\alpha$     0.
LSTM drop-out rate                 0.
Minibatch size                     64
Historical samples size $T_b$      1000

also models periodic traffic. This offline scheme is treated as the baseline, and its weights are transferred to the online LSTM-based predictors as initialization. The second scheme, referred to as “Online LSTM”, implements the proposed online scheme by using the MOM estimator (5) in the cross-entropy criterion (10), while adapting to the online traffic statistics. The third, referred to as “Genie-aided LSTM”, trains the LSTM by using criterion (10) with the correct value $N^{t+1}$. The performance of this scheme provides an upper bound, due to its use of ideal supervision in the form of the signal $N^{t+1}$.
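A traffic scenario of this kind can be simulated with a few lines of code. The sketch below is a simplified model and not the authors' simulator. It assumes that all periodic devices transmit in the same frame of each period, that every collided device retransmits in the next frame, and it omits detection errors and the retransmission limit. The function name and default arguments are hypothetical, with defaults taken from the parameter values stated in this section.

```python
import numpy as np

def simulate_f_aloha(num_frames, F=54, n_u=1000, p_gen=0.005, n_p=20, t_p=10, seed=0):
    """Simulate backlog and per-frame (idle, success, collision) RAO counts."""
    rng = np.random.default_rng(seed)
    retx = 0  # devices that collided in the previous frame
    backlog, obs = [], []
    for t in range(num_frames):
        # New packets: random traffic plus a periodic burst every t_p frames.
        new = rng.binomial(n_u, p_gen) + (n_p if t % t_p == 0 else 0)
        n = retx + new
        # Each backlogged device selects one of the F RAOs uniformly at random.
        choices = rng.integers(0, F, size=n)
        per_rao = np.bincount(choices, minlength=F)
        v_idle = int((per_rao == 0).sum())
        v_succ = int((per_rao == 1).sum())
        v_coll = F - v_idle - v_succ
        retx = n - v_succ  # collided devices try again in the next frame
        backlog.append(n)
        obs.append((v_idle, v_succ, v_coll))
    return np.array(backlog), np.array(obs)

backlog, obs = simulate_f_aloha(50)
```

The array `obs` plays the role of the observations available to the BS, while `backlog` is the ground truth that only a genie-aided scheme could use as a label.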



   

Fig. 3: Actual traffic and predicted backlog versus the number of frames after $10^5$ online training frames.

We start by illustrating the operation of the proposed and reference predictors in Fig. 3. This figure plots the actual and predicted backlog over time, after $10^5$ online training frames. We observe that the only method that is able to predict the backlog spikes due to periodic traffic is the proposed Online LSTM method (the Genie-aided LSTM scheme is not shown). In fact, neither MOM nor ML is capable of capturing historical trends in the traffic, and Offline LSTM has only observed data without packets from the periodic traffic.

Fig. 4 shows the evolution (averaged over 500 training trials) of the absolute prediction error $|\hat{N}^{t+1} - N^{t+1}|$ as a function of the frames observed in the online phase for the proposed Online LSTM scheme and the Genie-aided LSTM scheme. It is seen that the proposed online scheme is able to fairly quickly adapt to the

degradation as compared to the other strategies. This is due to the capability of the proposed scheme to adapt to the traffic statistics. It is also interesting to note that, even in the absence of statistical regularities in the traffic, i.e., with $N_p = 0$, the LSTM-based schemes outperform MOM and ML. This is because the random access communication protocol itself presents temporal correlations due to retransmissions of the same devices after collisions.

Fig. 6 shows the prediction error as a function of the packet generation period $T_p$ of the periodic traffic. Increasing the period $T_p$ results in a smaller traffic load, which improves the prediction accuracy of MOM, ML, and Offline LSTM. In contrast, as long as $T_p$ is not too large, the prediction accuracy of the proposed LSTM is not significantly affected by the change in $T_p$ due to its capability to adapt to the traffic statistics. Specifically, this is only true if $T_p \le T_o = 20$, that is, if the traffic periodicity does not exceed the memory of the LSTM predictor. In contrast, when $T_p > 20$, the prediction accuracy of LSTM suddenly degrades to the same level as that of the Offline LSTM scheme. This is because, with a memory equal to $T_o = 20$, the LSTM predictor cannot capture any traffic correlation pertaining to frames occurring more than 20 frames before the frame of interest. Therefore, if $T_p > 20$, online adaptation cannot improve the prediction accuracy. This degradation can be eliminated by increasing the memory $T_o$ of the LSTM, but at the cost of increasing the required computational and data resources for both training and prediction.

Fig. 6: Average prediction errors per frame versus the packet generation period of deterministic traffic.

We now consider a more general scenario, in which the $N_p = 100$ devices with periodic traffic generate packets at random according to the time-limited Beta profile [2, Ch. 6.1] with period $T_p = 10$. The time-limited Beta profile defines a probability of packet generation that peaks in the middle of each period with burstiness defined by the magnitude of parameters $(\alpha, \beta)$ [2, Ch. 6.1]. Note that the deterministic packet generation considered so far is a special case of the current model that is obtained by setting $\alpha = \beta \to \infty$. In Fig. 7, we plot the average prediction errors per frame as a function of the burstiness parameters $(\alpha, \beta)$. We observe that increasing $(\alpha, \beta)$ degrades the prediction accuracy of all methods, since a higher burstiness results in a heavier traffic accumulation, which makes prediction more difficult. The prediction accuracy of Offline LSTM is especially degraded, due to its lack of training observations under bursty traffic. In contrast, the proposed Online LSTM scheme only suffers from a minor performance degradation, demonstrating its capability to adapt to the traffic statistics.
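To make the role of $(\alpha, \beta)$ concrete, the sketch below discretizes a Beta-shaped profile over one period of $T_p$ frames. The exact definition of the time-limited Beta profile is given in [2]. This sketch only assumes the standard Beta density shape $x^{\alpha-1}(1-x)^{\beta-1}$ on the normalized period, evaluated at frame midpoints and renormalized to sum to one. The function name is hypothetical.

```python
def beta_frame_probs(alpha, beta, t_p):
    """Per-frame packet generation probabilities over one period of t_p frames."""
    weights = []
    for k in range(t_p):
        x = (k + 0.5) / t_p  # midpoint of frame k on the normalized period [0, 1]
        weights.append(x ** (alpha - 1) * (1 - x) ** (beta - 1))
    total = sum(weights)
    # Renormalize so that each device generates one packet per period.
    return [w / total for w in weights]

mild = beta_frame_probs(3, 3, 10)
sharp = beta_frame_probs(10, 10, 10)
```

With $\alpha = \beta$, the profile is symmetric around the middle of the period, and larger values concentrate more of the probability mass in the central frames, which is consistent with the deterministic case being recovered as $\alpha = \beta \to \infty$.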

Fig. 7: Average prediction errors per frame versus the burstiness parameter (α, β) for time-limited Beta distributed traffic.

VI. CONCLUSIONS

In this paper, we developed a traffic load prediction method based on online supervised learning in f-ALOHA networks. In the proposed method, LSTM RNNs are leveraged to capture temporal correlations due to traffic generation and protocol mechanisms. The scheme relies on a novel approximate labeling method based on backlog estimation. Numerical results demonstrate that the proposed method considerably outperforms conventional memoryless solutions, and that it can effectively adapt to new traffic statistics. A promising future direction is to develop lifelong learning and meta-learning techniques for online traffic prediction (see, e.g., [14]).