**Annales Geophysicae**
An interactive open-access journal of the European Geosciences Union

**Regular paper**
31 Jan 2019


Extending the regional ionosphere map coverage area

- School of Aerospace and Mechanical Engineering, Korea Aerospace University, Goyang-si, 10540, Korea

Abstract

The coverage of regional ionosphere maps is determined by the
distribution of ground-based monitoring stations, e.g., GNSS receivers. Since
ionospheric delay has a high spatial correlation, ionosphere map coverage
can be extended using spatial extrapolation methods. This paper proposes a
support vector machine (SVM) to extrapolate the ionosphere map data with
solar and geomagnetic parameters. One year of IGS ionospheric delay map data
over South Korea is used to train the SVM algorithm. Subsequently, 1 month
of ionospheric delay data outside the input data region is estimated. In
addition to solar and geomagnetic environmental parameters, the ionospheric
delay data from the inner data region are used to estimate the ionospheric
delay data for the outside region. The accuracy evaluation is performed at
three levels of range: 5, 10, and 15^{∘}
outside the inner data regions. The extrapolation errors are 0.33 TECU (total electron content unit) for
the 5^{∘} region and 1.95 TECU for the 15^{∘} region. These
values are substantially lower than the GPS Klobuchar model error values.
Comparison with another machine learning extrapolation method, the neural
network, shows a substantial improvement of up to 26.7 %.

How to cite


Kim, M. and Kim, J.: Extending the coverage area of regional ionosphere maps using a support vector machine algorithm, Ann. Geophys., 37, 77-87, https://doi.org/10.5194/angeo-37-77-2019, 2019.

1 Introduction

Ionospheric delay is one of the main error sources for single-frequency global navigation satellite system (GNSS) receivers. Ionosphere models or ionosphere maps can be used to correct for ionospheric delay. For real-time applications, a regional ionosphere map using regional GNSS monitoring stations can be used to provide highly accurate corrections. The regional ionosphere map coverage is determined by the distribution of GNSS ground-based monitoring stations. Since ionospheric delay has a high spatial correlation, ionosphere map coverage may be extended by using spatial extrapolation methods. In addition to the spatial correlations, time variables such as observation hour and day number, and solar and geomagnetic indices can serve as input parameters for the extrapolation.

A series of research studies have been conducted on the temporal
extrapolation (prediction) of regional ionosphere maps using past
observations. With respect to using machine learning algorithms, Kumluca et
al. (1999) applied the neural network (NN) method to forecast ionospheric
critical plasma frequencies, *fo*F2. McKinnell and Friedrich (2007)
used a NN to predict the lower ionosphere in the aurora zone. Okoh et al. (2016)
developed a regional vertical total electron content (VTEC) model for Nigeria based on observational
data from 12 stations and tested temporal and spatial extrapolation
performance. Unlike previous studies, the extrapolation performance was
improved by adding the International Reference Ionosphere (IRI) as an input.
Razin and Voosoghi (2016) applied a wavelet NN with particle swarm
optimization to predict the total electron content (TEC) over Iran. Huang and Yuan (2014) used time
and temporal variation of the TEC values as radial-basis function (RBF)
network inputs for temporal extrapolation. A support vector machine (SVM)
model has been used to predict the ionospheric *fo*F2 above Chinese
stations (Ban et al., 2011; Chen et al., 2010). Akhoondzadeh (2013) used a SVM
to predict the TEC and to detect seismo-ionospheric anomalous variations.

On the other hand, research on the spatial extrapolation of the ionosphere map is sparse. Wielgosz et al. (2003) used the kriging and multiquadric methods to produce instantaneous TEC maps near the Ohio continuously operating reference stations (CORS) in near-real time. Kim and Kim (2014) applied a biharmonic spline method to extend a small ionospheric correction coverage area. Ionospheric delay observations were used as the input parameters, and the ionospheric delay outside the coverage area was extrapolated. Leandro and Santos (2006) used geographical information as inputs of a NN model for spatial extrapolation of TEC over Brazil. For spatial extrapolation, Jayapal and Zain (2016) used a NN with time and solar or geomagnetic indices. In addition to these environmental parameters, Kim and Kim (2016) used the ionospheric delay of the inner area to improve the performance of spatial extrapolation.

In addition to the NN method, a SVM algorithm can be considered for spatial extrapolation. In training, a SVM solves a convex quadratic programming problem that optimizes the margin, so its solution is both optimal and unique. On the other hand, a NN finds the weights between layers through the gradient descent method, and the solution may fall into a local minimum in this process. A NN is based on empirical risk minimization (ERM), which minimizes learning errors during the learning process. On the other hand, a SVM is based on structural risk minimization (SRM), so it has excellent generalization performance (Gunn, 1998). SVMs have been widely used as predictive models in various fields. Huang et al. (2015) successfully performed stock market movement predictions using a SVM. Mohandes et al. (2014) performed wind speed predictions using a SVM and compared the performance against the NN method. The results showed that the SVM achieved superior prediction performance.

This paper proposes a SVM algorithm to extend ionosphere map coverage by applying temporal and environmental parameters and ionospheric observations. The IGS ionosphere map is used as a reference map, and the extrapolation accuracy of the SVM is evaluated by comparing it to the IGS map data. The extrapolation accuracies are compared with the GPS Klobuchar model and the NN model.

2 Parameter modeling

Three types of input parameters are used for the extrapolation of a regional
ionosphere map – temporal parameters, environmental parameters, and
ionospheric delay observations. An extrapolated ionospheric delay, ID_{ext},
may be represented as a function of these three parameters.

$$\begin{array}{}\text{(1)}& {\mathrm{ID}}_{\mathrm{ext}}=f\left({x}_{t},\;{x}_{\mathrm{e}},\;{x}_{\mathrm{obs}}\right),\end{array}$$

where *x*_{t} and *x*_{e} are the time and the environmental parameters,
respectively, and *x*_{obs} is ionospheric delay observations in the inner area.
The inner area is defined as a geographical area where ionospheric delay
information or observations are available. The outer area is defined as a
geographical area where ionospheric delay will be estimated.

The ionospheric variation is correlated with the diurnal and seasonal time variation, and the ionospheric delay above the locations involved in the study reaches its maximum around 14:00 local time (LT) and its minimum around 02:00 LT (Wu et al., 2012). Also, the daily mean ionospheric delay is higher in spring and autumn, and lower in summer and winter (Wu et al., 2012; Mansoori et al., 2015). In order to adopt these correlations, time parameters are included in the extrapolation model. The diurnal variation is represented by an hour number (00:00–23:00 LT), and the seasonal variation is represented by a day number (0–365). To represent the repeatability of these variations, the time parameters are modeled as sinusoidal functions.

$$\begin{array}{}\text{(2)}& {x}_{t}=\left[{S}_{\mathrm{D}}\quad {C}_{\mathrm{D}}\quad {S}_{\mathrm{H}}\quad {C}_{\mathrm{H}}\right],\end{array}$$

where *S*_{D} and *C*_{D} are the sine and cosine, respectively, of the day
number, and *S*_{H} and *C*_{H} are the sine and cosine, respectively, of
the hour numbers. The periods used for the sinusoidal functions are set to
24 h and 365.25 days for the diurnal and seasonal parameters,
respectively. The ionosphere activity is also highly correlated with solar
and geomagnetic activity. Three parameters are selected to reflect the space
environment – the F10.7 index, geomagnetic index Kp, and sunspot number
(SSN).

$$\begin{array}{}\text{(3)}& {x}_{\mathrm{e}}=\left[\mathrm{F10.7}\quad \mathrm{Kp}\quad \mathrm{SSN}\right]\end{array}$$

Although SSN has a similarity with F10.7 in representing solar activity, use of both parameters yielded slightly better estimation accuracy than use of single parameter. Therefore, both F10.7 and SSN are adopted for the environmental parameters. Experiments on the selection of optimal solar activity indices will be discussed in Sect. 5.

Disturbance storm time (Dst) may replace Kp for ionosphere storm detection, but it was not selected. The response performance of Dst depends on the ionosphere storm driver. Dst is efficient for storms driven by coronal mass ejections (CMEs), but it is less effective for storms driven by corotating interaction regions (CIRs) or coronal hole high speed streams (CH HSSs) (Borovsky and Denton, 2006; Denton et al., 2006). After a series of numerical experiments on selecting Dst or Kp, Kp was selected as the parameter because of its better estimation performance. The numerical experiments are discussed in Sect. 4.

Past inner-area ionospheric delays are used to train the machine learning
algorithms, and current inner-area delays are used for the extrapolation.
The observation data set for the *N* observation points is derived as follows.

$$\begin{array}{}\text{(4)}& {x}_{\mathrm{obs}}=\left[{\mathrm{ID}}_{\mathrm{obs}}^{1}\quad {\mathrm{ID}}_{\mathrm{obs}}^{2}\quad \cdots \quad {\mathrm{ID}}_{\mathrm{obs}}^{N}\right]\end{array}$$

The proposed algorithm uses fixed locations for both input and output, and it does not require a spatial structure. Other researchers' works on ionosphere prediction used raw GPS TEC measurements at varying ionospheric pierce points (IPPs), so the measurement locations had to be included in the input. Our algorithm uses a grid-based ionosphere map with fixed grid points, and their location information is not required as a model input.
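As an illustration of Eqs. (2)-(4), the input vector could be assembled as in the following minimal sketch. The function names `time_features` and `build_input` and all numerical values are hypothetical and are not taken from the paper; only the sinusoidal periods (24 h and 365.25 days) and the ordering of the three parameter groups follow the text.

```python
import math

def time_features(day_number, hour_lt):
    """Return [S_D, C_D, S_H, C_H] as in Eq. (2)."""
    day_angle = 2.0 * math.pi * day_number / 365.25  # seasonal period: 365.25 days
    hour_angle = 2.0 * math.pi * hour_lt / 24.0      # diurnal period: 24 h
    return [math.sin(day_angle), math.cos(day_angle),
            math.sin(hour_angle), math.cos(hour_angle)]

def build_input(day_number, hour_lt, f107, kp, ssn, inner_delays):
    """Concatenate x_t, x_e, and x_obs into one input vector."""
    x_t = time_features(day_number, hour_lt)  # Eq. (2)
    x_e = [f107, kp, ssn]                     # Eq. (3)
    x_obs = list(inner_delays)                # Eq. (4), one value per inner grid point
    return x_t + x_e + x_obs

# Illustrative values only (assumed, not from the paper's data set).
x = build_input(day_number=300, hour_lt=14, f107=150.0, kp=2.0, ssn=80.0,
                inner_delays=[30.1, 28.4, 25.9])
```

Because the grid points are fixed, the position of each delay value within `x_obs` identifies its location, so no coordinates appear in the vector.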

In the event of high temporal or geographical decorrelation due to geomagnetic storms, two inputs are affected: the solar or geomagnetic parameters and the ionosphere input data in the inner region. Because of observation latency, the solar or geomagnetic parameters may not be available in real time. However, the ionosphere input data may be available in real time from GPS observations, which allows the estimation algorithm to respond to a geomagnetic storm in real time.

3 Extrapolation methods

The SVM method is a machine learning theory that was proposed by Vapnik in 1995. It
uses an algorithm to find a hyperplane that maximizes the margin (Gunn,
1998). It is used in data classification and regression problems, and SVMs
used in regression are referred to as support vector regression (SVR). A SVM sets the regression function, *f*(*x*_{svm}), such that target *y*_{svm}
is in the following range.

$$\begin{array}{ll}\text{(5)}& f\left({x}_{\mathrm{svm}}\right)={\widehat{y}}_{\mathrm{svm}}={w}^{\mathrm{T}}{x}_{\mathrm{svm}}+b,\\ \text{(6)}& f\left({x}_{\mathrm{svm}}\right)-\mathit{\epsilon}\le {y}_{\mathrm{svm}}\le f\left({x}_{\mathrm{svm}}\right)+\mathit{\epsilon},\quad \mathit{\epsilon}>0,\end{array}$$

where *x*_{svm} is the input that contains [*x*_{t} *x*_{e} *x*_{obs}], and *w*^{T} is the transposed weighting
matrix;
*y*_{svm} is the target that represents the true ionospheric delay in the
extrapolation region, and *ε* is the allowable error level for
*y*_{svm}. In many practical cases, *y*_{svm} is not in the range of
$\left(f\left({x}_{\mathrm{svm}}\right)-\mathit{\epsilon},\,f\left({x}_{\mathrm{svm}}\right)+\mathit{\epsilon}\right)$, and *y*_{svm} is
frequently adjusted to the range of $\left(f\left({x}_{\mathrm{svm}}\right)-\mathit{\xi},\,f\left({x}_{\mathrm{svm}}\right)+\mathit{\xi}\right)$,
where *ξ* is a slack variable. The optimal regression function is
determined when the total magnitude of the slack variable, ∑_{i}*ξ*_{i}, is minimized. Also, the distance between *f*(*x*_{svm}) and the
support vector, which is called the margin, should be maximized.
Therefore, the optimal regression function minimizes ∥w∥
and *ξ* to achieve the maximum margin (Gunn, 1998).

$$\begin{array}{ll}\text{(7)}& \min \frac{{\Vert w\Vert }^{2}}{2}+C\sum _{i=1}^{n}\left({\mathit{\xi}}_{i}^{-}+{\mathit{\xi}}_{i}^{+}\right),\\ & \text{subject to}\quad {y}_{\mathrm{svm}}-f\left({x}_{i,\mathrm{svm}}\right)-{\mathit{\xi}}_{i}\le \mathit{\epsilon}\quad \text{if}\quad {y}_{\mathrm{svm}}-f\left({x}_{i,\mathrm{svm}}\right)\ge \mathit{\epsilon},\\ \text{(8)}& {y}_{\mathrm{svm}}-f\left({x}_{i,\mathrm{svm}}\right)+{\mathit{\xi}}_{i}\ge -\mathit{\epsilon}\quad \text{if}\quad {y}_{\mathrm{svm}}-f\left({x}_{i,\mathrm{svm}}\right)\le -\mathit{\epsilon}\end{array}$$
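The objective in Eq. (7) can be evaluated numerically for a given linear regression function, as in the following sketch. It assumes that each slack variable equals the amount by which the residual leaves the ±*ε* tube (the usual ε-insensitive interpretation); the function name `svr_primal_objective` and the sample data are hypothetical.

```python
def svr_primal_objective(w, b, inputs, targets, eps, C):
    """Evaluate ||w||^2 / 2 + C * sum(xi_minus + xi_plus) for f(x) = w^T x + b."""
    total_slack = 0.0
    for x, y in zip(inputs, targets):
        f_x = sum(w_k * x_k for w_k, x_k in zip(w, x)) + b
        residual = y - f_x
        xi_plus = max(0.0, residual - eps)    # target above the upper boundary
        xi_minus = max(0.0, -residual - eps)  # target below the lower boundary
        total_slack += xi_plus + xi_minus
    margin_term = 0.5 * sum(w_k * w_k for w_k in w)
    return margin_term + C * total_slack

# Toy data (assumed): one point inside the tube, two outside.
inputs = [[0.0], [1.0], [2.0]]
targets = [0.05, 1.5, 1.0]
objective = svr_primal_objective(w=[1.0], b=0.0, inputs=inputs,
                                 targets=targets, eps=0.1, C=10.0)
```

With these toy values the slack sum is 0.4 + 0.9 = 1.3, so a larger *C* makes the slack term dominate the objective and a smaller *C* makes the ∥w∥² term dominate.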

In Eq. (7), the superscript – denotes a lower boundary and + denotes an
upper boundary. The slack variable disappears while expanding equations. *C*
is the penalty set by users. As the *C* value approaches zero, the weight for
the slack variable decreases and the relative weight for ∥w∥^{2} increases. Therefore, the regression function that maximizes
the margin can be calculated. This implies that the regression function
differs from *y*_{svm}. As *C* increases, the weight for the slack
variable sum increases rather than maximizing the margin magnitude.
Therefore, a regression function is calculated in a form similar to
*y*_{svm}. Equation (7) can be modified using a dual problem, as
follows.

$$\begin{array}{ll}{\displaystyle}& {\displaystyle}\mathrm{arg}\underset{\mathit{\beta}}{min}{\displaystyle \frac{\mathrm{1}}{\mathrm{2}}}{\mathit{\beta}}^{\mathrm{T}}K\left({x}_{i,\mathrm{SVM}},\phantom{\rule{0.25em}{0ex}}\phantom{\rule{0.25em}{0ex}}\phantom{\rule{0.25em}{0ex}}{x}_{j,\mathrm{SVM}}\right)\mathit{\beta}-{f}^{\mathrm{T}}\mathit{\beta}\\ \text{(9)}& {\displaystyle}& {\displaystyle}\phantom{\rule{0.25em}{0ex}}\phantom{\rule{0.25em}{0ex}}\phantom{\rule{0.25em}{0ex}}\phantom{\rule{0.25em}{0ex}}f=-{y}_{\mathrm{svm}}+\mathit{\epsilon},\end{array}$$

where *β* is ${\mathit{\alpha}}^{-}-{\mathit{\alpha}}^{+}$ and *α* is the Lagrange
multiplier. *K* is a kernel function that maps input data *x*_{svm} to a
higher dimension. Several kernel functions are available, including linear
and polynomial functions. The most commonly used functions are Gaussian
kernel functions (Cristianini, 2001).

$$\begin{array}{}\text{(10)}& K\left({x}_{\mathrm{svm}},\,{y}_{\mathrm{svm}}\right)=\mathrm{exp}\left(-\frac{{\Vert {x}_{\mathrm{svm}}-{y}_{\mathrm{svm}}\Vert }^{2}}{2{\mathit{\sigma}}^{2}}\right)\end{array}$$

After mapping *x*_{svm} to feature space, one can determine the optimal
*β* by using quadratic programming (QP). The optimal regression
function can be computed by using the following equation (Gunn, 1998).

$$\begin{array}{ll}{\displaystyle}& {\displaystyle}f\left({x}_{\mathrm{svm}}\right)={w}^{\mathrm{T}}x+b={\sum}_{i=\mathrm{1}}^{N}{\mathit{\beta}}^{\mathrm{T}}K\left({x}_{i,\mathrm{SVM}},\phantom{\rule{0.125em}{0ex}}\phantom{\rule{0.125em}{0ex}}\phantom{\rule{0.125em}{0ex}}{x}_{j,\mathrm{SVM}}\right)+{\displaystyle \frac{\mathrm{1}}{n}}\\ \text{(11)}& {\displaystyle}& {\displaystyle}\phantom{\rule{0.25em}{0ex}}\phantom{\rule{0.25em}{0ex}}\phantom{\rule{0.25em}{0ex}}{\sum}_{i=\mathrm{1}}^{N}{\sum}_{j=\mathrm{1}}^{N}\left\{{y}_{i,\mathrm{SVM}}-{\mathit{\beta}}_{j}^{\ast}K\left({x}_{i,\mathrm{SVM}},\phantom{\rule{0.125em}{0ex}}\phantom{\rule{0.125em}{0ex}}\phantom{\rule{0.125em}{0ex}}{x}_{j,\mathrm{SVM}}\right)\right\}\end{array}$$

The flow chart of the SVM training process is shown in Fig. 1. The input
variables consist of temporal and environmental parameters and ionospheric
delays in the observation region, and these inputs are identical for each
extrapolation point. Targets include the true ionospheric delay at the *j*-th
extrapolation point. After the input and output of the SVM are defined, a kernel
matrix is generated for each input. Then, the training is performed to find
the optimal coefficients and bias of the regression function, *f*(*x*_{svm}).
The kernel function is calculated for the epoch of each input so that the
size of the matrix becomes *N*×*N*, where *N* is the number of epochs.
As the input increases, the computational time and memory usage also
increase. Therefore, the elements of the kernel matrix that involve the oldest
epoch are deleted, and the kernel functions of the most recent epoch are
included in the matrix. After defining the kernel function and the boundary
of the regression function, the optimal weights and biases are calculated
using the interior point method (Ferris and Munson, 2004). When the initial
training is completed, the extrapolation and update of the kernel function
are repeated.
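The kernel matrix update described above can be sketched as follows, assuming the Gaussian kernel of Eq. (10) and a fixed window of *N* epochs. The function names and the one-dimensional toy inputs are hypothetical; the paper does not specify how the update is implemented.

```python
import math

def gaussian_kernel(a, b, sigma):
    """Gaussian kernel of Eq. (10) for two input vectors."""
    sq_dist = sum((a_k - b_k) ** 2 for a_k, b_k in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def slide_kernel_matrix(K, window, new_input, sigma):
    """Drop the oldest epoch from the N x N matrix K and append the newest one."""
    kept = window[1:]  # inputs that remain after deleting the oldest epoch
    new_row = [gaussian_kernel(new_input, x, sigma) for x in kept]
    # Remove the first row and first column, then add the new column entry.
    K_new = [row[1:] + [new_row[i]] for i, row in enumerate(K[1:])]
    # Add the new row, including the diagonal element K(new, new).
    K_new.append(new_row + [gaussian_kernel(new_input, new_input, sigma)])
    return K_new, kept + [new_input]

# Toy window of three epochs with one-dimensional inputs (assumed values).
window = [[0.0], [1.0], [2.0]]
K = [[gaussian_kernel(a, b, 1.0) for b in window] for a in window]
K2, window2 = slide_kernel_matrix(K, window, [3.0], 1.0)
```

Only one new row and column are computed per epoch, so the matrix size stays at *N*×*N* and the cost of each update grows linearly with *N* rather than quadratically.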

A NN is a statistical learning model similar to a biological neural network. It consists of neurons, or perceptrons, and synapses. Neurons are interconnected by synapses, which store weights. A NN can solve problems such as pattern recognition and regression by calculating the weights from the learning of the neurons (Habarulema et al., 2011).

Several types of NNs exist – e.g., back-propagation neural network (BPNN), recurrent neural network (RNN), and time delay neural network (TDNN). This study implements a BPNN, which is one of the most commonly used NN algorithms. It is a feed-forward, multi-layer perceptron (MLP), supervised learning network (Jwo et al., 2004). In the hidden layer, activation functions determine whether the values from the previous layer are activated or not. Training is generally performed using the gradient descent method.

Figure 2 shows a flow chart of the BPNN used for the regional ionosphere map
extrapolation. The input layer includes the network inputs, *x*_{NN}, shown
in Eqs. (2), (3), and (4). The network inputs and targets are the same as
those used in the SVM. An input neuron multiplied by a weight can be
computed through the hidden layer towards the output neuron, as follows.

$$\begin{array}{}\text{(12)}& {\widehat{y}}_{\mathrm{NN}}={f}^{n}\left({W}^{n,n-1}{f}^{n-1}\left({W}^{n-1,n-2}{f}^{n-2}\left(\cdots {f}^{1}\left({W}^{1,0}{x}_{\mathrm{NN}}+{b}^{1}\right)\cdots +{b}^{n-2}\right)+{b}^{n-1}\right)+{b}^{n}\right),\end{array}$$

where *b* is the network bias, *n* represents the *n*th layer, and
${W}^{n,n-\mathrm{1}}$ is the weight from *n*−1 to the *n*th layer; *x*_{NN} is the
network input, which includes the three input parameters for extrapolation,
${\widehat{y}}_{\mathrm{NN}}$ is the network output, and *f* is an activation function. The
hyperbolic tangent sigmoid function is implemented, which is the most widely
used method. The network is trained using the BPNN algorithm with true
ionospheric delays and three input parameter sets to find the optimal
weights and biases.
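The nested form of Eq. (12) corresponds to a simple layer-by-layer loop, sketched below. For simplicity the sketch applies the hyperbolic tangent at every layer, including the output layer, which is an assumption; the function name `forward` and the toy weights are hypothetical.

```python
import math

def forward(x, weights, biases):
    """Feed-forward pass of Eq. (12).

    weights[k] is the weight matrix into layer k+1 (a list of rows) and
    biases[k] is the corresponding bias vector.
    """
    activation = list(x)
    for W, b in zip(weights, biases):
        activation = [
            math.tanh(sum(w_ij * a_j for w_ij, a_j in zip(row, activation)) + b_i)
            for row, b_i in zip(W, b)
        ]
    return activation

# Toy network (assumed): 2 inputs, 2 hidden neurons, 1 output.
weights = [[[0.5, -0.5], [1.0, 1.0]], [[1.0, -1.0]]]
biases = [[0.0, 0.0], [0.0]]
y_hat = forward([0.2, 0.2], weights, biases)
```

Training by back-propagation would adjust `weights` and `biases` so that `y_hat` approaches the true ionospheric delay; that step is not shown here.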

The network data are generally divided into training, validation, and test sets. The training set is used to calculate and update the weights. The validation set is used to verify the training results. The test set is finally used to calculate the extrapolation error. This paper uses three data sets divided by 70 %, 15 %, and 15 %, respectively. A detailed implementation of the NN can be found in Kim and Kim (2016).
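A 70 %, 15 %, and 15 % division can be sketched as follows. The sketch assumes a random assignment of epochs to the three sets with a fixed seed; the paper does not state how the epochs are assigned, and the function name `split_indices` is hypothetical.

```python
import random

def split_indices(n_samples, fractions=(0.70, 0.15, 0.15), seed=0):
    """Return shuffled index lists for the training, validation, and test sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed for reproducibility
    n_train = int(round(fractions[0] * n_samples))
    n_val = int(round(fractions[1] * n_samples))
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]      # remainder goes to the test set
    return train, val, test

train_idx, val_idx, test_idx = split_indices(100)
```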

4 Data processing

An IGS global ionosphere map (GIM) is used to acquire reference ionospheric delay data because of its high accuracy and global coverage (IGS, 2019). Regional ionospheric delay time series are generated with the GIM data, and they are used to train the extrapolation algorithms. The extrapolated ionospheric delays outside the observation area are compared with the GIM data to evaluate the accuracy. The IGS GIM grid size is $\mathrm{2.5}{}^{\circ}\times \mathrm{5}{}^{\circ}$, but other regional ionosphere maps such as the space-based augmentation system (SBAS) ionosphere corrections have an equal latitude–longitude grid size. Therefore, a $\mathrm{5}{}^{\circ}\times \mathrm{5}{}^{\circ}$ grid size is used for the regional ionosphere map in this research.

The estimation interval is the same as the ionosphere input data interval. In this research, a 2 h interval was used because the 2 h interval IGS global map is used for the inner map. If an inner map with a shorter interval is used, e.g., a 5 min SBAS map or a real-time GPS-derived map, the estimation interval becomes shorter. The proposed algorithm is not a time-prediction algorithm, as in preceding research, and the estimation interval is not an important factor in determining the accuracy.

Figure 3 illustrates the observation and extrapolation grid points. The
observation regions (blue) are set with a radius of 2650 km centered on
South Korea, and the extrapolation regions (red) are set with a radius of
4500 km in order to include the 15^{∘} extended grid point from
South Korea. Therefore, the latitude of the observation area ranges from
15 to 55^{∘} N, and the longitude ranges from
105 to 150^{∘} E. The accuracy evaluation points are
selected to perform the extrapolation. In order to accommodate the
directional characteristics of the extrapolation performance, the evaluation
point set is selected for each direction (north, south, east, and west). In
each direction, three points are selected with different distances from the
inner observation region: 5, 10, and 15^{∘}.
All the locations of the extrapolation points are represented in Table 1.

In the case of the environmental parameters (i.e., F10.7, Kp, and SSN), real-time data may not exist at the extrapolation epoch due to data latency. In order to simulate this data latency, the previous one-epoch (2 h) values are used instead of the current values during the extrapolation process. This time interval is not large because the method is not a temporal prediction method, but a spatial extrapolation method. The influence of the time interval on the estimation performance is much smaller than that of the ionosphere input data. True environmental parameters are used in the training process, but the previous one-epoch values are used in the extrapolation process. A correlation analysis between the current and previous one-epoch values supports this substitution: the correlation coefficients between two adjacent epochs of data for F10.7, Kp, and SSN are 0.930, 0.863, and 0.852, respectively. Since the IGS GIM uses 2 h intervals, the Kp, which is provided every 3 h, is interpolated at intervals of 2 h.
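The two preprocessing steps described above, resampling the 3 h Kp series to the 2 h map epochs and replacing current values with previous one-epoch values, can be sketched as follows. Linear interpolation is an assumption, since the paper does not name the interpolation method, and the function names, time stamps, and Kp values are hypothetical.

```python
def interpolate_linear(t_src, v_src, t_query):
    """Piecewise-linear interpolation; t_src must be sorted in ascending order."""
    result = []
    for t in t_query:
        if t <= t_src[0]:
            result.append(v_src[0])   # clamp before the first sample
            continue
        if t >= t_src[-1]:
            result.append(v_src[-1])  # clamp after the last sample
            continue
        for k in range(len(t_src) - 1):
            t0, t1 = t_src[k], t_src[k + 1]
            if t0 <= t <= t1:
                frac = (t - t0) / (t1 - t0)
                result.append(v_src[k] + frac * (v_src[k + 1] - v_src[k]))
                break
    return result

def lag_one_epoch(series):
    """Use the previous epoch's value; the first epoch reuses its own value."""
    return [series[0]] + series[:-1]

# Toy 3 h Kp series (assumed values) resampled to 2 h epochs.
kp_hours = [0, 3, 6, 9, 12]
kp_values = [1.0, 2.0, 2.0, 3.0, 1.0]
map_hours = [0, 2, 4, 6, 8, 10, 12]
kp_2h = interpolate_linear(kp_hours, kp_values, map_hours)
kp_for_extrapolation = lag_one_epoch(kp_2h)
```

In this arrangement the unlagged series `kp_2h` would be used for training and the lagged series for extrapolation, which mirrors the latency simulation described in the text.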

Previous research showed that extrapolation errors have a high correlation with the ionospheric delay magnitude and variation (Kim and Kim, 2014). Therefore, the high ionospheric delay season is more appropriate when evaluating the extrapolation algorithm than the low ionospheric delay season. It means that if the magnitude of the ionospheric delay and variation is small, all the extrapolation values and errors are small. In this case, it is difficult to compare the extrapolation performance for each model. The training period is set to 1 year from 1 October 2013 to 30 September 2014. In this period, the minimum and maximum ionospheric delays are 5.1 and 112.2 TECU (total electron content unit), respectively, as shown in Fig. 4. The extrapolation period is set to 1 month from 1 to 31 October 2014. The region analyzed in this paper is located around the midlatitudes. In this region, the ionospheric spatial gradient is large in the north–south direction. Also, since the southern area is close to the geomagnetic equator, its ionospheric variation is very large.

The training and extrapolation performance depend on user parameters. In the
case of the NN, extrapolation performance mainly depends on the number of
hidden neurons. If the number of hidden neurons is too high, over-fitting
may occur, and the calculation time is long. Since there are no criteria for
determining the number of hidden neurons, the optimal number of hidden
neurons must be found by analyzing the extrapolation error variation due to
the number of neurons. The model parameters with the lowest test error are
adopted as the optimal values. In Figs. 5 and 6, test errors are computed by
the mean RMS extrapolation errors at the 5^{∘} extrapolation regions.
In the case of the NN, the number of hidden neurons was set to 80, where the
error reaches its minimum. In the case of the SVM, the extrapolation result
also varies with the model parameters. This paper sets the penalty, *C*, to
10^{6} (Fig. 6), which causes the regression function to almost equal *y*.
The Gaussian function, which is widely used in SVMs, is used as the kernel
function. The values of *σ* and *ε* are selected via trial and error to
obtain the lowest extrapolation error; they are set to 10^{−6} and 10^{−7},
respectively.
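The trial-and-error selection can be organized as an exhaustive search over candidate values, as in the sketch below. The function `select_parameters`, the candidate grid, and in particular `toy_error` are hypothetical: `toy_error` is a placeholder that stands in for training the model and computing the mean RMS test error, and it is constructed so that its minimum sits at the values reported in the text.

```python
import math
from itertools import product

def select_parameters(grid, test_error):
    """Return the parameter combination with the lowest test error."""
    names = list(grid)
    best_params, best_error = None, float("inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        error = test_error(params)  # in practice: train, then evaluate on test data
        if error < best_error:
            best_params, best_error = params, error
    return best_params, best_error

# Candidate values (assumed); the paper reports only the selected ones.
grid = {"C": [1e2, 1e4, 1e6], "sigma": [1e-6, 1e-4], "epsilon": [1e-7, 1e-5]}

def toy_error(params):
    # Placeholder error surface, NOT a real model evaluation.
    return (abs(math.log10(params["C"]) - 6.0)
            + abs(math.log10(params["sigma"]) + 6.0)
            + abs(math.log10(params["epsilon"]) + 7.0))

best_params, best_error = select_parameters(grid, toy_error)
```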

In order to select an ionospheric storm-related input parameter between Kp
and Dst, a series of experiments was performed by replacing Kp with Dst.
The experiments concluded that Kp is better for our estimation algorithm
than Dst. After replacing Kp with Dst, both the SVM and NN estimation
accuracies were degraded. At 5^{∘} extrapolation points, the SVM
estimation error was increased from 0.33 TECU (Kp) to 0.44 TECU (Dst) and
the NN estimation error was increased from 0.45 to 0.63 TECU. Similar
levels of error increases were observed at both the 10 and
15^{∘} points. The NN accuracy degradation with Dst was more
significant during high ionospheric disturbance periods, when Dst < −25 nT
(9, 19–21, 28 October). However, only 1 month of data is tested in
this research. One month may not be sufficient for evaluating the estimation
performance under various ionosphere conditions, e.g., CME-, CIR-, or CH
HSS-related ionospheric disturbances. Comprehensive analysis with a longer
data period, e.g., multiple years, can be a further research topic.

5 Results

The regional ionosphere map extrapolation is performed using the SVM, and the IGS GIM is used as a truth value. The SVM extrapolation results are compared with the NN and Klobuchar model results. Hourly variations of the extrapolation results are analyzed with 1-day data, and then daily variations of the results are analyzed with 1-month data.

The variations of the ionospheric delay and the extrapolation results are analyzed for the data from 28 October 2014, when the daily ionospheric delay magnitude reaches its maximum for the extrapolation period (October 2014).

Figure 7 shows the ionospheric delay variations of the IGS GIM and Klobuchar
model on 28 October 2014. Data from two evaluation points, 5^{∘}
north and south, are presented. Universal time (UT) is used. The ionospheric
delay reaches its maximum at 15:00 LT (06:00 UT) and then decreases. There
are large differences between the ionospheric delays at the north and south
points because of the ionospheric spatial gradient (Kim et al., 2014). The
north–south difference produced by the Klobuchar model is significantly
smaller than the IGS GIM.

Figure 8 shows the extrapolation results for 28 October 2014. Two
extrapolation points, north 5^{∘} (N5) and south 5^{∘} (S5),
are selected. In the case of N5, the extrapolation RMS errors of the SVM and
NN are 0.23 and 0.63 TECU, respectively. The SVM outperforms the NN
with a 63.5 % error reduction. The NN error increase at 06:00 UT corresponds
to the ionosphere maximum at 06:00 UT in Fig. 7, and the overall NN error variation
at S5 follows the ionospheric delay variation. The NN error at N5 and SVM
errors at S5 and N5 do not follow the ionospheric delay variation.
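The RMS error and the percentage error reduction quoted for N5 can be reproduced with the following sketch. The function names are hypothetical; the two error values are those stated in the text.

```python
import math

def rms(errors):
    """Root mean square of a sequence of extrapolation errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def percent_reduction(baseline_error, improved_error):
    """Error reduction of the improved model relative to the baseline, in percent."""
    return 100.0 * (baseline_error - improved_error) / baseline_error

# N5 on 28 October 2014: NN error 0.63 TECU, SVM error 0.23 TECU.
n5_reduction = percent_reduction(0.63, 0.23)
print(round(n5_reduction, 1))  # 63.5
```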

Figure 9 compares the RMS errors of four 5^{∘} extrapolation points
(N5, S5, E5, and W5) on 28 October 2014. The error magnitude is the largest
at the south point where the ionospheric delay magnitude is the largest. The
SVM shows similar error levels for the north, east, and west points.
However, the NN shows larger errors than the SVM even at the north point.
This difference in extrapolation accuracy may be explained via the
ionospheric spatial gradient. The spatial gradient along the north–south
direction is significantly greater than the gradient along the east–west
direction (Kim et al., 2014; Vuković and Kos, 2016). The large gradient
increases the geographical ionospheric delay difference and frequently
causes the NN error increase. However, the SVM is more robust for this large
amount of gradient data. In general ionosphere estimation errors increase
at low geomagnetic latitude (Song et al., 2018). However, the errors at E5 and
W5 are smaller than those at N5 point even though E5 and W5 are located to
the south of N5. This is because the input of the model includes the
internal ionospheric delay for solving a spatial extrapolation problem. It
implies that the ionospheric spatial gradient is the main factor of the
extrapolation performances.

Figure 10 compares the RMS errors of four 10^{∘} extrapolation points
(N10, S10, E10, and W10) on 28 October 2014. Unlike the 5^{∘}
results in Fig. 9, there is little difference between the two models for the
northern area. However, the difference between the two models in the
southern region is increased to 0.63 TECU. It means that the difference in
extrapolation performance between the SVM and the NN model is larger for the
high ionospheric variation region. The extrapolation errors of the east and west region are
not significantly different from those in Fig. 9.

Figure 11 compares the RMS errors of four 15^{∘} extrapolation points
(N15, S15, E15, and W15). The overall error level increases from that of the
5^{∘} points, but the SVM still outperforms the NN, particularly at
the south and north points. The SVM error at the south point is 3.24 TECU,
and the error reduction over the NN is 1.40 TECU, or 30.2 %. As the
extrapolation points move farther from the ionosphere input data points,
the efficiency of the extrapolation algorithm diminishes. Therefore, the
accuracy difference between the SVM and NN is reduced.
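
The relation between the absolute and relative error reductions quoted for S15 can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the percentage is defined as the reduction divided by the NN error; the function name `error_reduction` is illustrative and is not taken from the paper.

```python
def error_reduction(svm_rms, nn_rms):
    """Return (absolute reduction in TECU, relative reduction in percent).

    Assumption: the relative reduction is taken with respect to the NN error.
    """
    absolute = nn_rms - svm_rms
    relative = 100.0 * absolute / nn_rms
    return absolute, relative


# S15 on 28 October 2014: the text gives the SVM error (3.24 TECU) and the
# reduction (1.40 TECU), so the implied NN error is 3.24 + 1.40 = 4.64 TECU.
svm_s15 = 3.24
nn_s15 = svm_s15 + 1.40

absolute, relative = error_reduction(svm_s15, nn_s15)
print(f"{absolute:.2f} TECU, {relative:.1f} %")
```

Under this assumption the implied NN error at S15 is 4.64 TECU, which reproduces the quoted 30.2 %.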

The spatial extrapolations are performed for the 1-month period from 1 to 31 October 2014. As with the single-day extrapolation, the 1-year data from October 2013 to September 2014 are used for the training process.

Figure 12 shows the daily extrapolation errors for the south 10^{∘}
extrapolation point (S10) in October 2014. The 1-month means of the daily
RMS errors are 1.89 TECU for the SVM and 2.54 TECU for the NN. During the 31 days,
the SVM achieved better performance than the NN for 26 days
(83.9 %). During low ionospheric delay periods, the difference in
extrapolation performance between the two methods is not significant (e.g.,
9 and 10 October). However, during high ionospheric delay periods, the
difference becomes significant (e.g., 28 October).
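
The monthly statistics above (daily RMS errors and the share of days on which the SVM is better) could be computed along the following lines. This is only a sketch: the helper names `rms` and `win_rate` are hypothetical, and the daily values are placeholders chosen to reproduce the 26-of-31 ratio, not the paper's data.

```python
import math


def rms(errors):
    """Root mean square of a sequence of estimation errors (TECU)."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def win_rate(svm_daily, nn_daily):
    """Percentage of days on which the SVM daily RMS error is below the NN's."""
    wins = sum(1 for s, n in zip(svm_daily, nn_daily) if s < n)
    return 100.0 * wins / len(svm_daily)


# Placeholder daily RMS errors for a 31-day month (NOT the paper's data):
# the SVM is better on the first 26 days and worse on the last 5.
svm_daily = [1.0] * 31
nn_daily = [1.5] * 26 + [0.8] * 5

print(f"SVM better on {win_rate(svm_daily, nn_daily):.1f} % of days")

# A daily RMS value is itself computed from that day's epoch errors, e.g.:
print(f"{rms([3.0, -4.0]):.3f} TECU")
```

With 26 better days out of 31, `win_rate` returns the 83.9 % quoted in the text.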

In order to analyze the hourly extrapolation performance, the 1-month mean of each 2 h time interval is presented in Fig. 13. The time unit is Universal Time (UT). Both the SVM and NN show an increase in extrapolation errors at 06:00 UT. During the high ionospheric variation period, 04:00–08:00 UT, the mean of the SVM error is 0.88 TECU lower than the error of the NN. Even during the low ionospheric variation period, 18:00–22:00 UT, the SVM error is 0.88 TECU lower than that of the NN. These results indicate that the extrapolation performance of the SVM model is better for both large and small ionospheric delays. A correlation analysis with the geomagnetic index, Kp, is performed by computing statistics for each Kp value. (This is not shown as a figure.) Over all Kp values, the SVM outperforms the NN with the same level of improvement. The only exception is Kp = 5 at 12:00 UT on 5 October, where the NN outperforms the SVM. However, this high Kp occurs only once among 360 epochs, and a generalized conclusion requires further research.
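
The per-interval and per-Kp statistics described above amount to grouping the epoch errors by a key and averaging within each group. A minimal sketch of that grouping is shown here; the record layout (`hour_ut`, `kp`, `svm`, `nn`) and the function name `mean_error_by` are assumptions made for illustration, and the three records are placeholders rather than data from the study.

```python
from collections import defaultdict


def mean_error_by(records, key):
    """Average the 'svm' and 'nn' errors of the records grouped by records[key]."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec)
    return {
        k: {m: sum(r[m] for r in g) / len(g) for m in ("svm", "nn")}
        for k, g in sorted(groups.items())
    }


# Placeholder epochs (NOT the paper's data): 2 h interval start time in UT,
# Kp index, and the absolute errors of both models in TECU.
records = [
    {"hour_ut": 6, "kp": 2, "svm": 2.0, "nn": 3.0},
    {"hour_ut": 6, "kp": 3, "svm": 3.0, "nn": 4.0},
    {"hour_ut": 20, "kp": 2, "svm": 1.0, "nn": 1.5},
]

by_hour = mean_error_by(records, "hour_ut")  # basis for a Fig. 13 style plot
by_kp = mean_error_by(records, "kp")         # basis for the Kp statistics

for hour, means in by_hour.items():
    print(f"{hour:02d}:00 UT  SVM {means['svm']:.2f}  NN {means['nn']:.2f}")
```

The same function serves both analyses because only the grouping key changes; a group containing a single epoch, such as the Kp = 5 case, yields a mean that rests on one sample.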

Table 2 summarizes the extrapolation errors for all evaluation points in
October 2014. The 1-month mean of the errors from four directions, north,
south, east, and west, and three ranges, 5, 10, and
15^{∘}, are presented. The Klobuchar model of the GPS navigation
message (Klob.) is also shown for comparison. In all ranges, even at the
15^{∘} points, both the SVM and NN outperform the Klobuchar model.
This indicates that the extrapolation methods are useful even over large areas.
At the east and west points, where the ionospheric spatial gradient is small,
the accuracy improvement provided by the SVM is not significant because the
ionospheric delay there can be generalized well from the internal ionospheric
delay information. The SVM error is 11.8 % smaller than that of the NN in
the W15 region. In the south region, the extrapolation error is very large
due to the large ionospheric variation, and this results in the largest
improvement provided by the SVM. In particular, the S10 region contains the
largest error difference, at approximately 0.65 TECU. The average error
difference between the SVM and NN for each region is the largest at the
10^{∘} extrapolation region.

The difference may mainly result from the better generalization performance of the SVM model compared with the NN for ionospheric variations. Since the ionospheric environment depends on geomagnetic location, the performance of the proposed extrapolation algorithm might differ at other locations. If the estimation region is changed, a new training and optimization process should be performed.

In order to determine the optimal parameters between F10.7 and SSN, two more cases are tested: F10.7 only and SSN only. The optimal estimator structure changes with the selection of input parameters. Before comparing the single parameter (F10.7 only or SSN only) results with the dual parameter (F10.7 and SSN) results, the same types of parameter optimizations as those in Figs. 5 and 6 are performed for each single parameter case. The SVM C value is set to 10 000 for both cases. The optimal numbers of hidden neurons are selected as 55 for the F10.7 case and 45 for the SSN case.

The extrapolation RMS errors of the single (F10.7 or SSN) and dual (F10.7 + SSN) parameters are presented in Table 3 (SVM) and Table 4 (NN). The total mean errors of the single parameter cases are greater than those of the dual parameter case at all extrapolation points for both estimation models. The increases in the NN errors with the single parameters at the north and south points are significant. The effects of F10.7 and SSN may be complementary to each other during geomagnetic storm days (19–22 October). In this period, the estimation error reductions by the dual parameters are 26 % for the SVM model and 22 % for the NN model.

6 Conclusions

The coverage area of a regional ionosphere map is determined by the distribution of GNSS ground stations. This paper proposes a spatial extrapolation algorithm to extend the ionosphere map coverage using an SVM. One year of IGS GIM ionospheric delay data over South Korea and environmental parameters are used as input data sets to train the SVM algorithm. From the training results, 1 month of ionospheric delay data outside the input data region is estimated. In addition to solar and geomagnetic environmental parameters, current ionospheric delay data in the inner data region are used to estimate the ionospheric delay data in the outside region.

The estimation accuracy is evaluated at 12 points: four directions (north,
south, east, and west) and three distances (5, 10,
and 15^{∘}). The accuracy of the SVM is compared with that of
the NN. The 1-month mean of the estimation error produced by the SVM is
0.33 TECU for the 5^{∘} region, 1.01 TECU for the 10^{∘}
region, and 1.95 TECU for the 15^{∘} region. The improvement levels
over the NN for the 5, 10, and 15^{∘} regions
are 26.7, 17.9, and 5.3 %, respectively. The error reduction by
the SVM over NN is more significant at near points than at remote points.
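
The SVM errors and improvement percentages quoted above also determine the corresponding NN errors, which makes it easy to see how the absolute SVM-NN difference varies with range. The sketch below assumes that each percentage is the reduction relative to the NN error of the same region, which is an interpretation and is not stated explicitly in the text; the function name `implied_nn_error` is illustrative. Table 2 holds the actual values, so this is only a consistency check.

```python
def implied_nn_error(svm_error, improvement_pct):
    """NN error implied by an SVM error and a relative improvement (percent).

    Assumption: improvement_pct = 100 * (nn - svm) / nn.
    """
    return svm_error / (1.0 - improvement_pct / 100.0)


# Range in degrees -> (1-month mean SVM error in TECU, improvement over NN in %)
monthly = {5: (0.33, 26.7), 10: (1.01, 17.9), 15: (1.95, 5.3)}

differences = {}
for rng, (svm, pct) in monthly.items():
    nn = implied_nn_error(svm, pct)
    differences[rng] = nn - svm
    print(f"{rng:2d} deg: SVM {svm:.2f}, implied NN {nn:.2f}, "
          f"difference {differences[rng]:.2f} TECU")

# Range at which the absolute SVM-NN difference is largest
largest = max(differences, key=differences.get)
print(f"largest absolute difference at {largest} deg")
```

Under this assumption the relative improvement falls steadily with distance, while the absolute difference between the two models peaks at the 10^{∘} range.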

Among the four directions, the error in the south region is the largest. The ionospheric delay and its variation in the north region are usually smaller than those in either the east or west, but the extrapolation error in the north region is even larger than in the east or west. A larger spatial gradient along the south–north direction than along the east–west direction may explain this difference. This dependency on the ionospheric spatial gradient can be explained by the inherent nature of extrapolation. A large gradient along the south–north direction implies more sensitivity to the data along the south–north direction. The north point data are more sensitive to the southern region's input data than to the western or eastern regions' input data. Since the southern region's input data have a larger variation than those of the other regions, this variation directly affects the north point estimate and increases the error.

Although artificial neural networks are the most widely used machine learning algorithm for classification and regression problems, an SVM model is also powerful for prediction problems because of its generalization performance. Because an SVM is defined by a convex optimization problem, it has no local minima. As the SVM is based on structural risk minimization, it shows excellent generalization performance. In the case of our ionosphere extrapolation problem, the SVM demonstrates a better performance than the NN.

Data availability

The IGS global ionosphere map data are available in the IGS data center. Ionosphere map data used in the analysis can be freely accessed at ftp://cddis.nasa.gov/pub/gps/products/ionex/ (IGS, 2019).

Competing interests

The authors declare that they have no conflict of interest.

Acknowledgements

This research was supported by the Space Core Technology Development Program
funded by the Ministry of Science and Information and Communications
Technology (ICT) (NRF-2016M1A3A3A02016943).

Edited by: Dalia Buresova

Reviewed by: three anonymous referees

References

Akhoondzadeh, M.: Support vector machines for TEC seismo-ionospheric anomalies detection, Ann. Geophys., 31, 173–186, https://doi.org/10.5194/angeo-31-173-2013, 2013.

Ban, P. P., Sun, S. J., Chen, C., and Zhao, Z. W.: Forecasting of
low-latitude storm-time ionospheric f_{0}F2 using support vector machine,
Radio Sci., 46, 1–9, https://doi.org/10.1029/2010RS004633, 2011.

Borovsky, J. E. and Denton, M. H.: Differences between CME-driven storms and CIR-driven storms, J. Geophys. Res., 111, A07S08, https://doi.org/10.1029/2005JA011447, 2006.

Chen, C., Wu, Z. S., Ban, P. P., Sun, S. J., Xu, Z. W., and Zhao, Z. W.:
Diurnal specification of the ionospheric f_{0}F2 parameter using a support
vector machine, Radio Sci., 45, 1–13, https://doi.org/10.1029/2010RS004393, 2010.

Cristianini, N.: Support vector and kernel machines, Tutorial at the 18th Int. Conf. Mach. Learn., 2001.

Denton, M. H., Borovsky, J. E., Skoug, R. M., Thomsen, M. F., Lavraud, B., Henderson, M. G., McPherron, R. L., Zhang, J. C., and Liemohn, M. W.: Geomagnetic storm driven by ICME- and CIR-dominated solar wind, J. Geophys. Res., 111, A07S07, https://doi.org/10.1029/2005JA011436, 2006.

Ferris, M. C. and Munson, T. S.: Interior-point methods for massive support vector machines, SIAM J. Optim., 13, 783–804, https://doi.org/10.1137/S1052623400374379, 2004.

Gunn, S. R.: Support vector machines for classification and regression, ISIS Technical Report, 14, 1998.

Habarulema, J. B., McKinnell, L. A., and Opperman, B. D. L.: Regional GPS TEC modeling; Attempted spatial and temporal extrapolation of TEC using neural networks, J. Geophys. Res., 116, 1–14, https://doi.org/10.1029/2010JA016269, 2011.

Huang, W., Nakamori, Y., and Wang, S. Y.: Forecasting stock market movement direction with support vector machine, Comput. Operat. Res., 32, 2513–2522, https://doi.org/10.1016/j.cor.2004.03.016, 2015.

Huang, Z. and Yuan, H.: Ionospheric single-station TEC short-term forecast using RBF neural network, Radio Sci., 49, 283–292, https://doi.org/10.1002/2013RS005247, 2014.

International GNSS Service (IGS): Global Ionosphere Map Data Archive, available at: ftp://cddis.nasa.gov/pub/gps/products/ionex/, last access: 28 January 2019.

Jayapal, V. and Zain, A. F. M.: Interpolation and extrapolation techniques based neural network in estimating the missing ionospheric TEC data, in: Progress in Electromagnetic Research Symposium (PIERS), 695–699, 2016.

Jwo, D. J., Lee, T. S., and Tseng, Y. W.: ARMA neural networks for prediction DGPS pseudorange correction, J. Navi., 57, 275–286, https://doi.org/10.1017/S0373463304002656, 2004.

Kim, J. and Kim, M.: Extending ionospheric correction coverage area by using extrapolation methods, J. Kor. Soci. Aeros. Sci. Fli. Operat., 22, 74–81, https://doi.org/10.12985/ksaa.2014.22.3.074, 2014.

Kim, J., Lee, S. W., and Lee, H. K.: An annual variation analysis of the ionospheric spatial gradient over a regional area for GNSS applications, Adv. Spa. Res., 54, 333–341, https://doi.org/10.1016/j.asr.2014.03.024, 2014.

Kim, M. and Kim, J.: Extending ionospheric correction coverage area by using a neural network method, Int. J. Aero. Spa. Sci., 17, 64–72, https://doi.org/10.5139/IJASS.2016.17.1.64, 2016.

Kumluca, A., Tulunay, E., and Topalli, I.: Temporal and spatial forecasting of ionospheric critical frequency using neural networks, Radio Sci., 34, 1497–1506, https://doi.org/10.1029/1999RS900070, 1999.

Leandro, R. F. and Santos, M. C.: A neural network approach for regional vertical total electron content modelling, Stud. Geophys. Geod., 51, 279–292, https://doi.org/10.1007/s11200-007-0015-6, 2006.

Mansoori, A. A., Khan, P. A., Bhardwaj, S., Atulkar, R., and Purohit, P. K.: Ionospheric irregularity influences on GPS time delay, Russian J. Earth Sci., 15, 1–9, https://doi.org/10.2205/2015ES000555, 2015.

McKinnell, L. A. and Friedrich, M.: A neural network-based ionospheric model for the auroral zone, J. Atmos. Sol.-Terr. Phy., 69, 1459–1470, https://doi.org/10.1016/j.jastp.2007.05.003, 2007.

Mohandes, M. A., Halawani, T. O., Rehman, S., and Hussain, A.: Support vector machines for wind speed prediction, Renewable Ener., 29, 939–947, https://doi.org/10.1016/j.renene.2003.11.009, 2014.

Okoh, D., Owolabi, O., Ekechukwu, C., Folarin, O., Arhiwo, G., Agbo, J., Bolaji, S., and Rabiu, B.: A regional GNSS-VTEC model over Nigeria using neural networks: A novel approach, Geode. Geodyn., 7, 19–31, https://doi.org/10.1016/j.geog.2016.03.003, 2016.

Razin, M. R. G. and Voosoghi, B.: Wavelet neural networks using particle swarm optimization training in modeling regional ionospheric total electron content, J. Atmos. Sol.-Terr. Phy., 149, 21–30, https://doi.org/10.1016/j.jastp.2016.09.005, 2016.

Song, R., Zhang, X., Zhou, C., Liu, J., and He, J.: Predicting TEC in China based on the neural networks optimized by genetic algorithm, Adv. Space Res., 62, 745–759, https://doi.org/10.1016/j.asr.2018.03.043, 2018.

Vuković, J. and Kos, T.: Ionospheric spatial and temporal gradients for disturbance characterization, Proceeding of 2016 European Navigation Conference (ENC), Helsinki, 1–4, https://doi.org/10.1109/EURONAV.2016.7530564, 2016.

Wielgosz, P., Grejner-Brzezinska, D., and Kashani, I.: Regional ionosphere mapping with kriging and multiquadric methods, J. GPS., 2, 48–55, https://doi.org/10.5081/jgps.2.1.48, 2003.

Wu, Y. W., Liu, R. Y., Zhang, B. C., Wu, Z. S., Ping, J. S., Liu, J. M., and Hu, Z. J.: Variations of the ionospheric TEC using simultaneous measurements from the China Crustal Movement Observation Network, Ann. Geophys., 30, 1423–1433, https://doi.org/10.5194/angeo-30-1423-2012, 2012.
