Recurrent Neural Networks for Prediction P11



Recurrent Neural Networks for Prediction
Authored by Danilo P. Mandic, Jonathon A. Chambers
Copyright © 2001 John Wiley & Sons Ltd
ISBNs: 0-471-49517-4 (Hardback); 0-470-84535-X (Electronic)

11 Some Practical Considerations of Predictability and Learning Algorithms for Various Signals

11.1 Perspective

In this chapter, predictability, detecting nonlinearity and performance with respect to the prediction horizon are considered. Methods for detecting nonlinearity of signals are first discussed. Then, different algorithms are compared for the prediction of nonlinear and nonstationary signals, such as real NO2 air pollutant and heart rate variability signals, together with a synthetic chaotic signal. Finally, bifurcations and attractors generated by a recurrent perceptron are analysed to demonstrate the ability of recurrent neural networks to model complex physical phenomena.

11.2 Introduction

When modelling a signal, a linear analysis is performed first, as linear models are relatively quick and easy to implement. The performance of these models can then determine whether more flexible nonlinear models are necessary to capture the underlying structure of the signal. One such standard model of linear time series, the auto-regressive integrated moving average, or ARIMA(p, d, q), model popularised by Box and Jenkins (1976), assumes that the time series $x_k$ is generated by a succession of ‘random shocks’ $\epsilon_k$, drawn from a distribution with zero mean and variance $\sigma^2$. If $x_k$ is non-stationary, then successive differencing of $x_k$ via the differencing operator $\nabla x_k = x_k - x_{k-1}$ can provide a stationary process. A stationary process $z_k = \nabla^d x_k$ can be modelled as an autoregressive moving average

$$z_k = \sum_{i=1}^{p} a_i z_{k-i} + \sum_{i=1}^{q} b_i \epsilon_{k-i} + \epsilon_k. \tag{11.1}$$

Of particular interest are pure autoregressive (AR) models, which have an easily understood relationship to the nonlinearity detection technique of DVS (deterministic versus stochastic) plots. Also, an ARMA(p, q) process can be accurately represented as a pure AR($p'$) process, where $p' \geq p + d$ (Brockwell and Davis 1991). Penalised likelihood methods such as AIC or BIC (Box and Jenkins 1976) exist for choosing the order of the autoregressive model to be fitted to the data; alternatively, the point where the autocorrelation function (ACF) essentially vanishes for all subsequent lags can be used. The autocorrelation function for a wide-sense stationary time series $x_k$ at lag $h$ gives the correlation between $x_k$ and $x_{k+h}$; clearly, a non-zero value for the ACF at a lag $h$ suggests that for modelling purposes at least the previous $h$ lags should be used ($p \geq h$).

For instance, Figure 11.1 shows a raw NO2 signal and its autocorrelation function (ACF) for lags of up to 40; the ACF does not vanish with lag and hence a high-order AR model is necessary to model the signal. Note the peak in the ACF at a lag of 24 hours and the rise to a smaller peak at a lag of 48 hours. This is evidence of seasonal behaviour, that is, the measurement at a given time of day is likely to be related to the measurement taken at the same time on a different day. The issue of seasonal time series is dealt with in Appendix J.

[Figure 11.1 The NO2 time series and its autocorrelation function. (a) The raw NO2 time series: measurements of NO2 level versus time scale in hours.]
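The differencing operator and the sample ACF discussed above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `difference` and `sample_acf`, the biased ACF estimator and the synthetic 24-hour test series are choices made here, not taken from the chapter, and only the Python standard library is used.

```python
import math

def difference(x, d=1):
    """Apply the differencing operator (x_k - x_{k-1}) d times."""
    for _ in range(d):
        x = [b - a for a, b in zip(x, x[1:])]
    return list(x)

def sample_acf(x, max_lag):
    """Sample autocorrelation at lags 0..max_lag (biased estimator)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(v * v for v in dev)
    return [sum(dev[k] * dev[k + h] for k in range(n - h)) / c0
            for h in range(max_lag + 1)]

# Hypothetical hourly series: a 24-hour cycle on top of a slow linear trend.
series = [0.01 * k + math.sin(2 * math.pi * k / 24) for k in range(24 * 30)]

# One differencing pass removes the trend; the daily cycle remains in the ACF.
acf = sample_acf(difference(series, d=1), max_lag=40)
peak_lag = max(range(12, 37), key=lambda h: acf[h])
print("lag of the largest ACF value between 12 and 36:", peak_lag)
```

On real data the order of the AR model would then be chosen from where the ACF dies out, or by a criterion such as AIC or BIC, as the text describes.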
[Figure 11.1 Cont. (b) The ACF of the NO2 series: ACF versus lag.]

11.2.1 Detecting Nonlinearity in Signals

Before deciding whether to use a linear or nonlinear model of a process, it is important to check whether the signal itself is linear or nonlinear. Various techniques exist for detecting nonlinearity in time series. Detecting nonlinearity is important because the existence of nonlinear structure in the series opens the possibility of highly accurate short-term predictions. This is not true for series which are largely stochastic in nature. Following the approach from Theiler et al. (1993), to gauge the efficacy of the techniques for detecting nonlinearity, a surrogate dataset is simulated from a high-order autoregressive model fit to the original series.

Two main methods exist to achieve this. The first involves fitting a finite-order ARMA(p, q) model (we use a high-order AR(p) model to fit the data). The model coefficients are then used to generate the surrogate series, with the surrogate residuals $\epsilon_k$ taken as random permutations of the residuals from the original series. The second method involves taking a Fourier transform of the series. The phases at each frequency are replaced randomly from the uniform $(0, 2\pi)$ distribution while keeping the magnitude of each frequency the same as for the original series. The surrogate series is then obtained by taking the inverse Fourier transform. This series will have approximately the same autocorrelation function as the original series, with the approximation becoming exact in the limit as $N \to \infty$. A discussion of the respective merits of the two methods of generating surrogate data is given in Theiler et al. (1993); the method used here is the former.

Evidence of nonlinearity from any method of detection is negated if the method gives a similar result when applied to the surrogate series, which is known to be linear (Theiler et al. 1993).

11.3 Overview

This chapter deals with some practical issues when performing prediction of nonlinear and nonstationary signals. Techniques for detecting nonlinearity and chaotic behaviour of signals are first introduced and a detailed analysis is provided for the NO2 air pollutant measurements taken at hourly intervals from the Leeds meteo station, UK. Various linear and nonlinear algorithms are compared for prediction of air pollutants, heart rate variability and chaotic signals. The chapter concludes with an insight into the capability of recurrent neural networks to generate and model complex nonlinear behaviour such as chaos.

11.4 Measuring the Quality of Prediction and Detecting Nonlinearity within a Signal

Existence and/or discovery of an attractor in the phase space demonstrates whether the system is deterministic, purely stochastic or contains elements of both. To reconstruct the attractor, examine plots in the m-dimensional space of $[x_k, x_{k-\tau}, \ldots, x_{k-(m-1)\tau}]^T$. It is critically important for the dimension of the space, $m$, in which the attractor resides, to be large enough to ‘untangle’ the attractor. This is known as the embedding dimension (Takens 1981). The value of $\tau$, the lag time or lag spacing, is also important, particularly with noise present. The first inflection point of the autocorrelation function is a possible starting value for $\tau$ (Beule et al. 1999). Alternatively, if the series is known to be sampled coarsely, the value of $\tau$ can be taken as unity (Casdagli and Weigend 1993).
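Returning to the first surrogate method of Section 11.2.1 (an AR fit whose residuals are randomly permuted), a minimal sketch might look as follows. The assumptions here are not from the chapter: the AR coefficients are estimated by Yule-Walker equations solved with the Levinson-Durbin recursion, the surrogate is started from the first p original samples, and the names `fit_ar` and `ar_surrogate`, the order p = 5 and the synthetic AR(1) test signal are hypothetical.

```python
import random

def autocov(x, max_lag):
    """Biased sample autocovariance of a zero-mean series."""
    n = len(x)
    return [sum(x[k] * x[k + h] for k in range(n - h)) / n
            for h in range(max_lag + 1)]

def fit_ar(x, p):
    """Yule-Walker AR(p) fit via Levinson-Durbin (x assumed zero-mean).
    Returns a with x_k ~ sum_i a[i] * x_{k-1-i}."""
    r = autocov(x, p)
    a, err = [], r[0]
    for m in range(p):
        k = (r[m + 1] - sum(a[i] * r[m - i] for i in range(m))) / err
        a = [a[i] - k * a[m - 1 - i] for i in range(m)] + [k]
        err *= 1.0 - k * k
    return a

def ar_surrogate(x, p, rng):
    """Linear surrogate: AR(p) model driven by permuted original residuals."""
    mean = sum(x) / len(x)
    z = [v - mean for v in x]
    a = fit_ar(z, p)
    resid = [z[k] - sum(a[i] * z[k - 1 - i] for i in range(p))
             for k in range(p, len(z))]
    rng.shuffle(resid)                      # random permutation of residuals
    s = z[:p]                               # assumption: reuse first p samples
    for e in resid:
        s.append(sum(a[i] * s[-1 - i] for i in range(p)) + e)
    return [v + mean for v in s], a

# Hypothetical test signal: a linear AR(1) process with coefficient 0.8.
rng = random.Random(0)
x = [0.0]
for _ in range(4999):
    x.append(0.8 * x[-1] + rng.gauss(0.0, 1.0))

surrogate, coeffs = ar_surrogate(x, p=5, rng=rng)
print("estimated AR coefficients:", [round(c, 3) for c in coeffs])
```

Any nonlinearity test is then run on both `x` and `surrogate`; a similar result on the two undermines the evidence for nonlinearity.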
A famous example of an attractor is given by the Lorenz equations (Lorenz 1963)

$$\begin{aligned} \dot{x} &= \sigma(y - x), \\ \dot{y} &= rx - y - xz, \\ \dot{z} &= xy - bz, \end{aligned} \tag{11.2}$$

where $\sigma$, $r$ and $b > 0$ are parameters of the system of equations. In Lorenz (1963) these equations were studied for the case $\sigma = 10$, $b = 8/3$ and $r = 28$. A Lorenz attractor is shown in Figure 11.13(a). The discovery of an attractor for an air pollution time series would demonstrate chaotic behaviour; unfortunately, the presence of noise makes such a discovery unlikely. More robust techniques are necessary to detect the existence of deterministic structure in the presence of substantial noise.
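As an illustration of attractor reconstruction on a noise-free signal, the sketch below integrates the Lorenz equations (11.2) with the parameter values quoted above and builds delay vectors $[x_k, x_{k-\tau}, \ldots, x_{k-(m-1)\tau}]$ from the $x$ component alone. The fourth-order Runge-Kutta integrator, the step size, the initial state and the choices m = 3 and τ = 10 are assumptions made for this example.

```python
def lorenz_rhs(state, sigma=10.0, r=28.0, b=8.0 / 3.0):
    """Right-hand side of the Lorenz equations (11.2)."""
    x, y, z = state
    return (sigma * (y - x), r * x - y - x * z, x * y - b * z)

def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(s + dt / 6.0 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def delay_vectors(x, m, tau):
    """[x_k, x_{k-tau}, ..., x_{k-(m-1)tau}] for every admissible k."""
    start = (m - 1) * tau
    return [[x[k - j * tau] for j in range(m)] for k in range(start, len(x))]

# Assumed initial state and step size; only x(t) is kept as the observed series.
state, dt = (1.0, 1.0, 1.0), 0.01
xs = []
for _ in range(3000):
    state = rk4_step(lorenz_rhs, state, dt)
    xs.append(state[0])

embedded = delay_vectors(xs, m=3, tau=10)
print(len(embedded), "delay vectors of dimension", len(embedded[0]))
```

Plotting the columns of `embedded` against each other gives the kind of reconstruction plot discussed in the text; with substantial measurement noise the structure would be much harder to see.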
11.4.1 Deterministic Versus Stochastic Plots

Deterministic versus stochastic (DVS) plots (Casdagli and Weigend 1993) display the (robust) prediction error $E(n)$ for local linear models against the number of nearest neighbours, $n$, used to fit the model, for a range of embedding dimensions $m$. The data are separated into a test set and a training set, where the test set is the last $M$ elements of the series. For each element $x_k$ in the test set, its corresponding delay vector in m-dimensional space

$$\mathbf{x}(k) = [x_{k-\tau}, x_{k-2\tau}, \ldots, x_{k-m\tau}]^T \tag{11.3}$$

is constructed. This delay vector is then examined against the set of all the delay vectors constructed from the training set. From this set the $n$ nearest neighbours are defined to be the $n$ delay vectors $\mathbf{x}(k')$ which have the shortest Euclidean distance to $\mathbf{x}(k)$. These $n$ nearest neighbours $\mathbf{x}(k')$, along with their corresponding target values $x_{k'}$, are used as the variables to fit a simple linear model. This model is then given $\mathbf{x}(k)$ as an input, which provides a prediction $\hat{x}_k$ for the target value $x_k$, with a robust prediction error of

$$|x_k - \hat{x}_k|. \tag{11.4}$$

This procedure is repeated for all the test set, enabling calculation of the mean robust prediction error,

$$E(n) = \frac{1}{M} \sum_{x_k \in T} |x_k - \hat{x}_k|, \tag{11.5}$$

where $T$ is the test set. If the optimal number of nearest neighbours $n$, taken to be the value giving the lowest prediction error $E(n)$, is at, or close to, the maximum possible $n$, then globally linear models perform best and there is no indication of nonlinearity in the signal. As this global linear model uses all possible length-$m$ vectors of the series, it is equivalent to an AR model of order $m$ when $\tau = 1$. A small optimal $n$ suggests local linear models perform best, indicating nonlinearity and/or chaotic behaviour.

11.4.2 Variance Analysis of Delay Vectors

Closely related to DVS plots is the nonlinearity detection technique introduced in Khalaf and Nakayama (1998).
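The DVS procedure of Section 11.4.1, equations (11.3) to (11.5), can be sketched as below. This is a rough illustration rather than the implementation behind the figures: the local linear model (with an intercept) is fitted through the normal equations with a tiny ridge term for numerical safety, and the helper names, the logistic-map test series and the values of m, n and M are assumptions made here.

```python
import math

def solve(A, b):
    """Solve A c = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][j] * x[j] for j in range(r + 1, n))) / M[r][r]
    return x

def fit_linear(X, y, ridge=1e-9):
    """Least-squares fit of y ~ c0 + c . x (ridge term is an assumption)."""
    rows = [[1.0] + list(v) for v in X]
    cols = len(rows[0])
    A = [[sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0)
          for j in range(cols)] for i in range(cols)]
    b = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(cols)]
    return solve(A, b)

def dvs_error(x, m, n, n_test, tau=1):
    """Mean absolute error E(n) of local linear models, as in (11.5)."""
    split = len(x) - n_test
    train = [([x[k - j * tau] for j in range(1, m + 1)], x[k])
             for k in range(m * tau, split)]
    total = 0.0
    for k in range(split, len(x)):
        query = [x[k - j * tau] for j in range(1, m + 1)]      # (11.3)
        nearest = sorted(train, key=lambda p: math.dist(p[0], query))[:n]
        coef = fit_linear([v for v, _ in nearest], [t for _, t in nearest])
        pred = coef[0] + sum(c * q for c, q in zip(coef[1:], query))
        total += abs(x[k] - pred)                               # (11.4)
    return total / n_test

# Hypothetical test series: the chaotic logistic map x_{k+1} = 4 x_k (1 - x_k).
x = [0.3]
for _ in range(499):
    x.append(4.0 * x[-1] * (1.0 - x[-1]))

n_max = len(x) - 50 - 1          # all training delay vectors for m = 1
errors = {n: dvs_error(x, m=1, n=n, n_test=50) for n in (10, 50, n_max)}
for n, e in errors.items():
    print(n, round(e, 4))
```

For this deterministic nonlinear map the error grows with n, which is the signature of nonlinearity described above; a linear stochastic series would be expected to favour n near its maximum.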
The general idea is not to fit models, linear or otherwise, using the nearest neighbours of a delay vector, but rather to examine the variability of the set of targets corresponding to groups of close (in the Euclidean distance sense) delay vectors. For each observation $x_k$, $k \geq m + 1$, construct the group, $\Omega_k$, of nearest neighbour delay vectors given by

$$\Omega_k = \{\mathbf{x}(k') : k' \neq k \;\&\; d_{kk'} \leq \alpha A_x\}, \tag{11.6}$$

where $\mathbf{x}(k') = \{x_{k'-1}, x_{k'-2}, \ldots, x_{k'-m}\}$, $d_{kk'} = \|\mathbf{x}(k') - \mathbf{x}(k)\|$ is the Euclidean distance, $0 < \alpha \leq 1$,

$$A_x = \frac{1}{N - m} \sum_{k=m+1}^{N} |x_k|$$

and $N$ is the length of the time series. If the series is linear, then the similar patterns $\mathbf{x}(k')$ belonging to a group $\Omega_k$ will map onto similar $x_k$s. For nonlinear series, the patterns $\mathbf{x}(k')$ will not map onto similar $x_k$s. This is measured by the variance $\sigma_k^2$ of each group $\Omega_k$,

$$\sigma_k^2 = \frac{1}{|\Omega_k|} \sum_{k'} (x_{k'} - \mu_k)^2, \quad \mathbf{x}(k') \in \Omega_k,$$

where $\mu_k$ is the mean of the targets in the group. The measure of nonlinearity is taken to be the mean of $\sigma_k^2$ over all the $\Omega_k$, denoted $\sigma_N^2$, normalised by dividing through by $\sigma_x^2$, the variance of the entire time series,

$$\sigma^2 = \frac{\sigma_N^2}{\sigma_x^2}.$$

The larger the value of $\sigma^2$, the greater the suggestion of nonlinearity (Khalaf and Nakayama 1998). A comparison with surrogate data is especially important with this method to get evidence of nonlinearity.

11.4.3 Dynamical Properties of NO2 Air Pollutant Time Series

The four time series generated from the NO2 dataset are given in Figure 11.2, with the deseasonalised series on the bottom and the simulated series on the right. The sine wave structure can clearly be seen in the raw (unaltered) time series (top left), evidence confirming the relationship between NO2 and temperature. Also note that once an air pollutant series has been simulated or deseasonalised, the condition that no readings can be below zero no longer holds.

[Figure 11.2 Time series plots for NO2 (NO2 level versus time in hours, k). Clockwise, starting from top left: raw, simulated, simulated deseasonalised, deseasonalised.]

The respective ACF plots for the NO2 series are given in Figure 11.3. The raw and simulated ACFs (top) are virtually identical, as should be the case: since the simulated time series is based on a linear AR(45) fit to the raw data, the correlations for the first 45 lags should be the same. Since generating the deseasonalised data involves application of the backshift operator, the autocorrelations are much reduced, although a ‘mini-peak’ can still be seen at a lag of 24 hours.

[Figure 11.3 ACF plots for NO2 (ACF versus lag). Clockwise, starting from top left: raw, simulated, simulated deseasonalised, deseasonalised.]

Nonlinearity detection in NO2 signal

Figure 11.4 shows the two-dimensional attractor reconstruction for the NO2 time series after it has been passed through a linear filter to remove some of the noise present. This graph shows little regularity and there is little to distinguish between the raw and the simulated plots. If an attractor does exist, then it is in a higher-dimensional space or is swamped by the random noise.

[Figure 11.4 Attractor reconstruction plots for NO2 ($x_{k+\tau}$ versus $x_k$). Clockwise, starting from top left: raw, simulated, simulated deseasonalised and deseasonalised.]

The DVS plots for NO2 are given in Figure 11.5; the DVS analysis of a related air pollutant can be found in Foxall et al. (2001). The optimal $n$ (that is, the value of $n$ corresponding to the minimum of $E(n)$) is clearly less than the maximum of $n$ for the raw data for each of the embedding dimensions ($m$) examined. However, the difference is not great and the minimum occurs quite close to the maximum $n$, so this only provides weak evidence for nonlinearity. The DVS plot for the simulated series obtains the optimal error measure at the maximum $n$, as is expected. The deseasonalised DVS plots follow the same pattern, except that the evidence for nonlinearity is weaker, and the best embedding dimension now is $m = 6$ rather than $m = 2$.

[Figure 11.5 DVS plots for NO2 ($E(n)$ versus $n$, for m = 2, 4, 6, 8, 10). Clockwise, starting from top left: raw, simulated, simulated deseasonalised and deseasonalised.]

Figure 11.6 shows the results from analysing the variance of the delay vectors for the NO2 series. The top two plots show lesser variances for the raw series, strongly suggesting nonlinearity. However, for the deseasonalised series (bottom) the variances are roughly equal, and indeed greater for higher embedding dimensions, suggesting that evidence for nonlinearity originated from the seasonality of the data.

To support the analysis, experiments on prediction of this signal were performed. The air pollution data represent hourly measurements of the concentration of nitrogen dioxide (NO2), over the period 1994–1997, provided by the Leeds meteo station.

Table 11.1 Performance of gradient descent algorithms in prediction of the NO2 time series

                      NGD    NNGD   Recurrent perceptron   NLMS
Predicted gain (dB)   5.78   5.81   6.04                   4.75
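The delay vector variance measure of Section 11.4.2, used for Figure 11.6, can be sketched directly from definition (11.6). The following is a simple quadratic-time illustration; skipping groups with fewer than two members, the function name and the two synthetic test series are assumptions made here, not details from the chapter.

```python
import math
import random

def delay_vector_variance(x, m, alpha):
    """Mean target variance over groups of close delay vectors,
    normalised by the variance of the whole series."""
    N = len(x)
    idx = range(m, N)                       # k = m+1, ..., N in 1-based terms
    vecs = {k: [x[k - j] for j in range(1, m + 1)] for k in idx}
    A_x = sum(abs(x[k]) for k in idx) / (N - m)
    group_vars = []
    for k in idx:
        # Targets x_{k'} of all delay vectors within alpha * A_x of x(k).
        targets = [x[j] for j in idx
                   if j != k and math.dist(vecs[j], vecs[k]) <= alpha * A_x]
        if len(targets) < 2:                # assumption: skip tiny groups
            continue
        mu = sum(targets) / len(targets)
        group_vars.append(sum((t - mu) ** 2 for t in targets) / len(targets))
    mean_x = sum(x) / N
    var_x = sum((v - mean_x) ** 2 for v in x) / N
    # Raises ZeroDivisionError if alpha is so small that every group is skipped.
    return (sum(group_vars) / len(group_vars)) / var_x

# Hypothetical test series: a deterministic map and white Gaussian noise.
logistic_map = [0.3]
for _ in range(299):
    logistic_map.append(4.0 * logistic_map[-1] * (1.0 - logistic_map[-1]))
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(300)]

dvv_map = delay_vector_variance(logistic_map, m=1, alpha=0.05)
dvv_noise = delay_vector_variance(noise, m=1, alpha=0.5)
print(round(dvv_map, 3), round(dvv_noise, 3))
```

When close delay vectors share close targets the normalised variance is small, and when the targets are unrelated to the delay vectors it approaches one. As the text stresses, the value for a real series only becomes informative when compared with the value for its surrogate.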
[Figure 11.6 Delay vector variance plots for NO2 ($\sigma^2$ versus $\alpha$, for m = 2, 4, 6, 8, 10). Clockwise, starting from top left: raw, simulated, simulated deseasonalised and deseasonalised.]

In the experiments the logistic function was chosen as the nonlinear activation function of a dynamical neuron (Figure 2.6). The quantitative performance measure was the standard prediction gain, a logarithmic ratio between the expected signal and error variances, $R_p = 10 \log(\hat{\sigma}_s^2 / \hat{\sigma}_e^2)$. The slope of the nonlinear activation function of the neuron $\beta$ was set to be $\beta = 4$. The learning rate parameter $\eta$ in the NGD algorithm was set to be $\eta = 0.3$ and the constant $C$ in the NNGD algorithm was set to be $C = 0.1$. The order of the feedforward filter $N$ was set to be $N = 10$. For simplicity, a NARMA(3,1) recurrent perceptron was used as a recurrent network. The summary of the performed experiments is given in Table 11.1. From Table 11.1, the nonlinear algorithms perform better than the linear one, confirming the analysis which detected nonlinearity in the signal.

To further support the analysis given in the DVS plots, Figure 11.7(a) shows prediction gains versus number of taps for linear and nonlinear feedforward filters trained by the NGD, NNGD and NLMS algorithms, whereas Figure 11.7(b) shows prediction performance of a recurrent perceptron (Foxall et al. 2001). Both the nonlinear filters trained by the NGD and NNGD algorithms outperformed the linear filter trained by the NLMS algorithm. For tap lengths up to $N = 10$, the NNGD outperformed the NGD; the worse performance of the NNGD relative to the NGD for $N > 10$ can be explained by the insufficient approximation of the remainder of the Taylor series expansion within the derivation of the algorithm for large $N$. The recurrent structure achieved better performance for a smaller number of tap inputs than the standard feedforward structures.

11.5 Experiments on Heart Rate Variability

Information about heart rate variability (HRV) is extracted from the electrocardiogram (ECG). There are different approaches to the assessment of HRV from the measured data, but most of them rely upon the so-called R–R intervals, i.e. the distance in time between two successive R waves in the HRV signal. Here, we use the R–R intervals that originate from ECG obtained from two patients. The first patient (A) was male, aged over 60, with a normal sinus rhythm, while patient (B) was also male, aged over 60, and had suffered a myocardial infarction. In order to examine predictability of HRV signals, we use various gradient-descent-based neural adaptive filters.

11.5.1 Experimental Results

Figure 11.8(a) shows the HRV for patient A, while Figure 11.8(b) shows HRV for patient B. Prediction was performed using a logistic activation function $\Phi$ of a dynamical neuron with $N = 10$. The quantitative performance measure was the standard prediction gain $R_p = 10 \log(\hat{\sigma}_s^2 / \hat{\sigma}_e^2)$. The slope of the nonlinear activation function of the neuron $\beta$ was set to be $\beta = 4$. Due to the saturation-type logistic nonlinearity, input data were prescaled to fit within the range of the neuron activation function. Both the standard NGD and the data-reuse modifications of the NGD algorithm were used. The number of data-reuse iterations $L$ was set to be $L = 10$.
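The experimental setup just described (a single logistic neuron with N = 10 taps, β = 4, η = 0.3, optional data reuse with L iterations, and the prediction gain $R_p$ as the performance measure) can be sketched as follows. This is an illustrative reading, not the authors' code: the data-reusing variant is assumed to repeat the gradient step L times on the same input and target, the logarithm is taken to base 10 since the gain is quoted in dB, and the prescaled noisy sinusoid stands in for the HRV data, which are not available here.

```python
import math
import random

def logistic(v, beta=4.0):
    """Logistic activation with slope beta."""
    return 1.0 / (1.0 + math.exp(-beta * v))

def prediction_gain(signal, errors):
    """R_p = 10 log10(var(signal) / var(errors)), in dB."""
    def var(u):
        mu = sum(u) / len(u)
        return sum((a - mu) ** 2 for a in u) / len(u)
    return 10.0 * math.log10(var(signal) / var(errors))

def ngd_predict(x, taps=10, eta=0.3, beta=4.0, reuse=1):
    """One-step prediction by a logistic neuron trained with NGD.
    reuse > 1 repeats the update on the same data (assumed data-reusing form)."""
    w = [0.0] * taps
    errors = []
    for k in range(taps, len(x)):
        u = x[k - taps:k][::-1]             # [x_{k-1}, ..., x_{k-taps}]
        y = logistic(sum(wi * ui for wi, ui in zip(w, u)), beta)
        errors.append(x[k] - y)             # a priori prediction error
        for _ in range(reuse):
            y = logistic(sum(wi * ui for wi, ui in zip(w, u)), beta)
            e = x[k] - y
            slope = beta * y * (1.0 - y)    # derivative of the logistic function
            w = [wi + eta * e * slope * ui for wi, ui in zip(w, u)]
    return errors

# Hypothetical stand-in signal, prescaled to lie inside the range (0, 1).
rng = random.Random(1)
x = [0.5 + 0.3 * math.sin(2 * math.pi * k / 50) + 0.01 * rng.gauss(0.0, 1.0)
     for k in range(3000)]

gains = {L: prediction_gain(x[10:], ngd_predict(x, reuse=L)) for L in (1, 10)}
for L, g in gains.items():
    print("L =", L, "prediction gain [dB]:", round(g, 2))
```

The prescaling matters because the logistic output is confined to (0, 1); a target outside that range could never be matched, which is the reason for the prescaling step mentioned in the text.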
The performance comparison between the NGD algorithm and a data-reusing NGD algorithm is shown in Figure 11.9. The plots show the prediction gain versus the tap length and the prediction horizon (number of steps ahead in prediction). In all the cases from Figure 11.9, the data-reusing algorithms outperformed the standard algorithms for short-term prediction. The standard algorithms showed better prediction results for long-term prediction. As expected, the performance deteriorates with the order of prediction ahead.

In the next experiment we compare the performance of a recurrent perceptron trained with the fixed learning rate $\eta = 0.3$ and a recurrent perceptron trained by the NRTRL algorithm on prediction of the HRV signal. In the experiment the MA and the AR part of the recurrent perceptron vary from 1 to 15, while the prediction horizon varies from 1 to 10. The results of the experiment are shown in Figures 11.10 and 11.11. From Figure 11.10, for relatively large input and feedback tap delay lines, there is a saturation in performance. This confirms that the recurrent structure was able to capture the dynamics of the HRV signal. The prediction performance deteriorates with the prediction step, and due to the recurrent nature of the filter, the performance is not good for a NARMA recurrent perceptron with a small order of the AR and MA part. Figure 11.11 shows the results of an experiment similar to the previous one, with the exception that the employed algorithm was the NRTRL algorithm. The NARMA(p, q) recurrent perceptron trained with this algorithm persistently outperformed the standard recurrent perceptron trained by the RTRL.

Figure 11.12 shows performance of the recurrent perceptron with fixed $\eta$ in prediction of HRV time series (patient B), for different prediction horizons. Similar arguments as for patient A are applicable.

[Figure 11.7 Performance comparison of various structures for prediction of NO2 series. (a) Performance of the NGD, NNGD and NLMS algorithms in the prediction of NO2 time series: prediction gain [dB] versus the tap length. (b) Performance of the recurrent perceptron in the prediction of NO2 time series: prediction gain [dB] versus the AR part and the MA part.]

[Figure 11.8 Heart rate variability signals for patients A and B. (a) HRV signal for patient A. (b) HRV signal for patient B. Axes: heart rate variability versus number of samples.]
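For readers who want to experiment with the recurrent perceptron and the prediction horizon, a minimal sketch with a fixed learning rate is given below. Several things are assumptions rather than details from the chapter: the "AR part" is taken to be the fed-back past outputs and the "MA part" the external input taps, a bias weight is added, the weights are adapted with a real-time recurrent learning style sensitivity recursion, an h-step horizon is modelled by withholding the h − 1 most recent samples from the input, and a noisy sinusoid replaces the HRV data. The NRTRL algorithm itself is not reproduced.

```python
import math
import random

def logistic(v, beta):
    """Logistic activation; the argument is clipped to avoid overflow."""
    v = max(-50.0, min(50.0, beta * v))
    return 1.0 / (1.0 + math.exp(-v))

def prediction_gain(signal, errors):
    """R_p = 10 log10(var(signal) / var(errors)), in dB."""
    def var(u):
        mu = sum(u) / len(u)
        return sum((a - mu) ** 2 for a in u) / len(u)
    return 10.0 * math.log10(var(signal) / var(errors))

def recurrent_perceptron(x, n_ar=3, n_ma=1, eta=0.3, beta=4.0, horizon=1):
    """Predict x[k] from x[k-horizon], ... and the neuron's own past outputs."""
    n_w = n_ar + n_ma + 1                   # feedback, input and bias weights
    w = [0.0] * n_w
    y_past = [0.0] * n_ar                   # y(k-1), ..., y(k-n_ar)
    pi_past = [[0.0] * n_w for _ in range(n_ar)]   # past output sensitivities
    errors = []
    for k in range(horizon + n_ma - 1, len(x)):
        u = y_past + [x[k - horizon - j] for j in range(n_ma)] + [1.0]
        y = logistic(sum(wi * ui for wi, ui in zip(w, u)), beta)
        e = x[k] - y
        errors.append(e)
        slope = beta * y * (1.0 - y)
        # Sensitivity of y(k) to each weight, including the feedback paths.
        pi = [slope * (u[n] + sum(w[i] * pi_past[i][n] for i in range(n_ar)))
              for n in range(n_w)]
        w = [wn + eta * e * pn for wn, pn in zip(w, pi)]
        y_past = ([y] + y_past)[:n_ar]
        pi_past = ([pi] + pi_past)[:n_ar]
    return errors

# Hypothetical stand-in signal, prescaled to lie inside the range (0, 1).
rng = random.Random(2)
x = [0.5 + 0.3 * math.sin(2 * math.pi * k / 50) + 0.01 * rng.gauss(0.0, 1.0)
     for k in range(3000)]

gains = {}
for h in (1, 2, 5, 10):
    err = recurrent_perceptron(x, horizon=h)
    gains[h] = prediction_gain(x[len(x) - len(err):], err)
    print("horizon", h, "prediction gain [dB]:", round(gains[h], 2))
```

Sweeping `n_ar` and `n_ma` over a grid, as well as `horizon`, would give surfaces of the kind plotted in Figures 11.10 to 11.12, although the numbers for this toy signal say nothing about the HRV results.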
[Figure 11.9 Performance comparison between standard and data-reusing algorithms for prediction of HRV signals (prediction gain [dB] versus the tap length and prediction horizon). (a) Performance of the NGD algorithm in prediction of HRV time series, patient A. (b) Performance of the NGD algorithm in prediction of HRV time series, patient B. (c) Performance of the data-reusing NGD algorithm in prediction of HRV time series, patient A, L = 10. (d) Performance of the data-reusing NGD algorithm in prediction of HRV time series, patient B, L = 10.]

[Figure 11.10 Performance of a NARMA recurrent perceptron on prediction of HRV signals for different prediction horizons (prediction gain [dB] versus the AR part and the MA part). Performance of the recurrent perceptron with fixed learning rate in prediction of HRV time series, patient A, for prediction horizons of (a) 1, (b) 2, (c) 5 and (d) 10.]

[Figure 11.11 Performance of the NRTRL algorithms on prediction of HRV, for different prediction horizons (prediction gain [dB] versus the AR part and the MA part). Performance of the recurrent perceptron trained with the NRTRL algorithm in prediction of HRV time series, patient A, for prediction horizons of (a) 1, (b) 2, (c) 5 and (d) 10.]

[Figure 11.12 Performance of a recurrent perceptron for prediction of HRV signals for different prediction horizons (prediction gain [dB] versus the AR part and the MA part). Performance of the recurrent perceptron with fixed learning rate in prediction of HRV time series, patient B, for prediction horizons of (a) 1 and (b) 2.]