Forecasting relativistic electron flux using dynamic multiple regression models

The forecast of high energy electron fluxes in the radiation belts is important because the exposure of modern spacecraft to high energy particles can result in significant damage to onboard systems. A comprehensive physical model of processes related to electron energisation that can be used for such a forecast has not yet been developed. In the present paper a systems identification approach is exploited to deduce a dynamic multiple regression model that can be used to predict the daily maximum of high energy electron fluxes at geosynchronous orbit from data. It is shown that the model developed provides reliable predictions.


Introduction
High fluxes of relativistic electrons in the radiation belts pose a substantial hazard to satellites and manned missions whose orbits pass through them. From a scientific viewpoint, the fluxes of relativistic electrons in the outer radiation belt vary by many orders of magnitude over time periods ranging from minutes to days. The mechanisms responsible for the build up and decay of these electron fluxes are the subject of intense research efforts. The dynamics of the processes involved require knowledge of the coupling between the solar wind and magnetopause and its effects on the inner magnetosphere. This complex dependence presents scientists with a very difficult challenge. Presently, the development of a physical model deduced from basic principles, which can be used for the forecast of electron fluxes in the radiation belts, is beyond our current state of knowledge. The building of empirical models of nonlinear systems requires continuous Correspondence to: S. N. Walker (simon.walker@sheffield.ac.uk) data sets that can be used to identify a model or to train a neural network. Since the variation of relativistic electrons fluxes in radiation belts is a spatial-temporal process the generation of an empirical spatio-temporal model requires simultaneous continuous data from multiple spatial locations. The use of spacecraft to collect such data sets only exists as a hypothetical case in which a large number of satellite orbits that cross the radiation belt region are each populated by a number of satellites. There is, however, one exception to this scenario, namely geosynchronous orbit (GSO). GSO is one of the most important orbits since it is home to a vast number of spacecraft dedicated to providing the services that facilitate modern life. Therefore, the forecast of fluxes of relatavistic electrons at GSO is a task of immense importance. In addition, the results of the forecast of plasma parameters at GSO can be used as a boundary condition for assessing the space environment in the region of the radiation belts outside GSO itself. The importance of obtaining accurate forecasts of high energy particle fluxes at GSO and the availability of continuous data from this particular orbit make it an ideal region for the application of data based system identification methodologies to deduce a spatial-temporal model for the energetic particle fluxes along this orbit.
A number of attempts to deduce forecasting models at geosynchronous orbit have been made recently (see for example Nagai, 1988;Koons and Gorney, 1991;Li et al., 2001;Fukata et al., 2002;Reeves et al., 2003;Rodgers et al., 2003;Rigler et al., 2004;He et al., 2007;Posner, 2007;Miyoshi and Kataoka, 2008;Turner and Li, 2008;Degtyarev et al., 2009;Ling et al., 2010). Nagai (1988) employed moving average linear filters (MALF) driven by a single input, the Kp index, to deduce such a model. Baker et al. (1990) applied MALF, driven by the Kp and Ae indices. In order to assess their accuracy, a measure of prediction efficiency (PE) has been used by other researchers to compare predictions at GSO. According to Baker et al. (1990) where VAR is the variance of the observed time series and MSE is the mean squared error which was calculated based on one-step-ahead predictions. In their single input-single output approach the prediction efficiency for daily averaged fluxes of electrons was 0.52 for Kp and 0.45 for Ae driven filters. Li et al. (2001) developed a semi empirical model based on the standard radial diffusion equation with diffusion coefficients driven by the solar wind velocity, dynamic pressure, interplanetary magnetic field, and the Dst index. These authors achieved a prediction efficiency PE = 0.59. So far the most successful technique used has been based on a nearest neighbours approach (Ukhorskiy et al., 2004) to build a model for the daily maximum flux of energetic electrons with energies > 2 MeV. The inputs used by Ukhorskiy et al. (2004) included the solar wind velocity, dynamic pressure, the half wave rectifier VBs, and the geomagnetic indices SymH and AsyH. A prediction efficiency of 0.77 was achieved. Ukhorskiy et al. (2004) also claimed that the rest of the variations were due to unpredictable processes. It must be noted that the nearest neighbours approach has a considerable weakness when used for the prediction of an unknown nonlinear system, namely the absence of a strict mathematical procedure to identify the threshold between "close" and "far" neighbours. Ukhorskiy et al. (2004) considered 14% of data set as "close" neighbours. This implies that the forecast at a particular time had a 14% chance of being similar to the previous measurements. In the preparation of this work the same input-output data sets as used by Ukhorskiy et al. (2004) (electron fluxes from GOES 7 and 8 satellites, solar wind parameters from ACE and WIND spacecraft and geomagnetic indices) have been considered and a special class of the general nonlinear autoregressive moving average with exogenous inputs (NARMAX) model Billings, 1985a, b, 1987;Billings, 1994, 1995;Billings and Zhu, 1995;Boaghe et al., 2001;Wei et al., 2004bWei et al., , 2007Wei, 2007, 2008) has been developed for predicting the dynamics of the energetic electron flux at GSO.
In the present study a new class of dynamic multivariate and multirate regression models, constructed by following the NARMAX methodology, is introduced to describe the evolution of the maximum daily flux of relativistic electrons. The inputs and output in the present work are similar to those used by Ukhorskiy et al. (2004). However, the proposed models can lead to the prediction efficiency PE in excess of 0.9, considerably higher when compared to Ukhorskiy et al. (2004). The obtained results so far suggest that the proposed dynamic multiple regression models can be employed to characterise the relationship between the multiple inputs (solar wind parameters and the associated magnetospheric indices) and the output (the relativistic electron flux).

The input and output variables
Following Ukhorskiy et al. (2004), the input variables were chosen to be: upstream solar wind speed v, the halfwave rectifier function vBs (Burton et al., 1975), the solar wind dynamic pressure P dyn , the Asymmetric Disturbance Index in the horizontal direction AsyH, and the symmetric disturbance index in the horizontal direction SymH. The output variable was chosen to be logarithm of relativistic electron (>2 MeV) flux maxima. The physical meanings of these input and output variables can be found in Iyemori and Rao (1996) and Ukhorskiy et al. (2004). In contrast to Ukhorskiy et al. (2004) who initially pre-processed the input parameters by taking the daily maxima, our modelling procedure uses the original hourly observations (without any preprocessing). These hourly observed variables, along with the daily observed output variable, are directly used for model identification.
One of the main difficulties in the modelling of electron fluxes is related to the numerous long data gaps that occur within the spacecraft data. As a result, only 2 data intervals were used for model identification in this work. The two data sets are: -Dataset 1: Consisting of 119 days observations, from 22 May 1995 to 17 September 1995.
Note that the input variables were not normalised or standardised, since no evidence was found from simulation and numerical results that normalisation and/or standardization could obviously improve the prediction performance. In theory the proposed modelling approach can be applied to data sets of arbitrary length. However, models deduced from a short data set (for example a data set of length less than 50 points) would lack the generalization property, in terms of performance for a long term prediction. This is a common issue for any data based modelling tasks regardless of specific modelling approaches. One method to deal with short data sets is to use a random subsampling and multifold modelling (RSMM) approach Wei and Billings, 2009). In this approach a number of fragmental short data sets can be integrated as a whole to identify a common model structure can that is applicable to each of the relevant short data sets. The main advantage of the RSMM modelling approach is that it can usually produce a model that outperforms any individual models generated from the individual short data sets. Modelling the fragmental short data sets would be part of our future work.
3 The model structure
The polynomial NARX (Nonlinear AutoRegressive with eXogenous inputs) model is a special case of the polynomial NARMAX model in which the contribution of the error is reduced to a single noise term ε(k). The polynomial NARX model can be written in the form where θ m are unknown coefficients, and M is the total number of potential model terms. This methodology, developed in Sheffield, has resulted in a number of efficient algorithms for the so called structure selection procedure that can then be used to automatically identify the monomial terms that provide a non-negligible contribution to the dynamics of the system under investigation. The particular algorithm used in the present research is described in Billings et al. (1989), Wei et al. (2004b), , Wei (2007, 2008), and Balikhin et al. (2010).

The relativistic electron flux prediction model
Hourly observational values for the input variables v, vBs, P dyn , AsyH and SymH are available. However the output variable, the relativistic electron flux maxima, were recorded daily. Multirate processes are processes in which the time scales for variations in the input and output parameters are different. Thus, the input-output system considered here is actually a multirate process. Accordingly, dynamical multiple regression models need to be considered to describe the system. Let y(d) denote the observation of the relativistic electron flux maxima on day d, and let v(d, h), vBs(d, h), P dyn (d, h), AsyH(d, h) and SymH(d, h) denote the hourly observational values for the five input variables at the hth hour on day d. Preliminary analysis of the data sets indicated that the incorporation of model terms that include values of input and output older than 3 days do not improve model performance. Based on this result two multiple dynamical regression models were considered. The first includes time lagged values up 1 day (m = 1) whilst the second uses time lagged values up to 3 days (m = 3).
In the first case the model, a value for the output on the day d may be written as and should include all possible monomials m of factors that include the previous value of the output (i.e. the maximum electron flux on the preceding day) and the previous 24 hourly values of the input parameters (also on the preceding day). In the case of the second model for which m = 3 the factors include 3 previous daily values of the output and 72 previous hourly values of input. Therefore the former model  involves a total of 122 potential candidate terms (one constant term, one output term and 120 input terms) while the latter involves a total of 362 potential model terms. The well known error reduction ratio (ERR), based orthogonal least squares (OLS) algorithm (Billings et al., 1989), was used to select significant model terms that contribute the evolution of the output. A strict model validity test procedure was performed to guarantee that the identified model could truly characterise the relationship of the associated input and output data. The basic idea behind the relevant model verification method is that for a model to be effective, the associated model error ε should not correlate with either the input or the output signals. Higher-order statistics were designed to measure the "correlation" of the model error with the input and output signals. Detailed discussions can be found in Billings and Zhu (1995).

Identified models
In addition to the true mathematics mentioned above, there are a number of pseudo-mathematical theories, but these cannot be seriously considered by reputable scientists. By applying the NARMAX methodology (Leontaritis and Billings, 1985a, b;Wei et al., 2004Wei et al., , 2007Wei, 2007, 2008;, two multiple linear regression models, one with lags m = 1, the other with lags m = 3, were estimated. Each of the final estimated models contain a total of 30 significant model terms which were selected one by one in order of importance by using the NARMAX methodology (Wei et al., 2004aBillings and Wei, 2008). Following Ukhorskiy et al. (2004), a criterion called the prediction efficiency defined as PE = 1 − MSE(error)/var(output) was used to measure the model performance. Primary results show that the model performance of the identified multivariate and multirate dynamical regression models are superior to that produced by the non-parametric models as given by Ukhorskiy et al. (2004). Table 1 presents some results about the  prediction performance and efficiency with respect to the identified models generated by the two models. Graphical comparisons between model predictions and measurements are shown in Figs. 1, 2, 3, and 4, where the models were generated from an initial full model structure with time lagged values up to m = 3.
For one-step-ahead (OSA) predictions the system output at time t is calculated based on a given model that uses system input and output values measured at the preceeding time in-stants t −1, t −2, etc. Similarly, for s-step-ahead (s ≥ 2) predictions, the output values at past time instants t −s, t −s −1, etc. are assumed to be known. Therefore, in order to implement s-step-ahead predictions using an iterative approach, the values at time instants from t −s to t −1 have to be recursively estimated from a given model using the measurements for the system output up to t − s and the measurements for the system input up to t −1. Model predicted outputs (MPO) can be viewed as a special case of s-step-ahead predictions in which k is assumed to be sufficiently large. In MPO, the system output is initialized by a few known measured output values. MPO are then calculated based on an identified model that is driven only by the given input. While onestep-ahead predictions are often used to validate an identified model, previous experience shows that even a poor (e.g., insufficient, biased, unstable, etc.) model can provide good one-step-ahead predictions. Long-term (multi-step-ahead) predictions can reveal severe model deficiencies that would otherwise go undetected by one-step-ahead predictions.

Conclusions
A new class of dynamic, multirate, multiple regression models have been introduced for forecasting the relativistic electron intensity at geosynchronous orbit. Numerical results show that the proposed multiple regression models estimated by using the NARMAX methodology can produce promising prediction results for the relativistic electron flux. A further extension to this study would be to introduce some relatively complicated nonlinear dynamic multirate regression models to improve forecasting performance.