Transfer entropy and cumulant-based cost as measures of nonlinear causal relationships in space plasmas: applications to D_st

Johnson, Jay R.; Wing, Simon; Camporeale, Enrico

doi:https://doi.org/10.5194/angeo-36-945-2018

Articles | Volume 36, issue 4

ANGEO Communicates | Highlight paper | 02 Jul 2018

Abstract

It is well known that the magnetospheric response to the solar wind is nonlinear. Information theoretical tools such as mutual information, transfer entropy, and cumulant-based analysis are able to characterize the nonlinearities in the system. Using cumulant-based cost, we show that nonlinear significance of D_st peaks at 3–12 h lags that can be attributed to VB_s, which also exhibits similar behavior. However, the nonlinear significance that peaks at lags 25, 50, and 90 h can be attributed to internal dynamics, which may be related to the relaxation of the ring current. These peaks are absent in the linear and nonlinear self-significance of VB_s. Our analysis with mutual information and transfer entropy shows that both methods can establish that there are strong correlations and transfer of information from V_sw to D_st at a timescale that is consistent with that obtained from the cumulant-based analysis. However, mutual information also shows that there is a strong correlation in the backward direction, from D_st to V_sw, which is counterintuitive. In contrast, transfer entropy shows that there is no or little transfer of information from D_st to V_sw, as expected because it is the solar wind that drives the magnetosphere, not the other way around. Our case study demonstrates that these information theoretical tools are quite useful for space physics studies because these tools can uncover nonlinear dynamics that cannot be seen with the traditional analyses and models that assume linear relationships.

Dates

Received: 19 Jan 2018 – Discussion started: 23 Jan 2018 – Revised: 25 Apr 2018 – Accepted: 31 May 2018 – Published: 02 Jul 2018

1 Introduction

One of the most practically important concepts in dynamical systems is the notion of causality. It is particularly useful to organize observational datasets according to causal relationships in order to identify variables that drive the dynamics. Understanding causal dependencies can also help to simplify descriptions of highly complex physical processes because it constrains the coupling functions between the dynamical variables. Analysis of those coupling functions can lead to simplification of the underlying physical processes that are most important for driving the system. It is particularly useful from a practical standpoint to understand causal dependencies in systems involving natural hazards because monitoring of causal variables is closely linked with warning.

A common method to establish causal dependencies in a data stream of two variables, e.g., [a(t)] and [b(t)], is to apply linear correlation studies such as Strangeway et al. (2005), which showed the relationship between the downward Poynting flux and ion outflows. Causal relationships are typically identified by considering a time-shifted correlation function

\begin{matrix} (1) & λ_{a b} (τ) ≜ \frac{〈 a (t) b (t + τ) 〉 - 〈 a 〉 〈 b 〉}{\sqrt{〈 a^{2} 〉 - 〈 a 〉^{2}} \sqrt{〈 b^{2} 〉 - 〈 b 〉^{2}}}, \end{matrix}

where 〈…〉 is an ensemble average obtained by drawing samples at a set of measurement times, $\{t_{0}, t_{1}, \dots, t_{N}\}$. For example, Borovsky et al. (1998) used such a method to identify relationships between solar wind variables and plasma sheet variables. The causal dependency that the plasma sheet responds to changes in the solar wind can be identified from the time-shift of the peak of the cross-correlation indicating a response time. From this type of analysis it can be found that the plasma sheet generally responds from the tail to the inner magnetosphere, consistent with the notion of earthward convection. Such analysis has been particularly useful to help understand plasma sheet transport.
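The time-shifted correlation of Eq. (1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `lagged_correlation` is hypothetical, the averages are taken over the overlapping samples at each lag rather than over the full record, and the two series are synthetic (a noisy copy delayed by 5 samples), not data from this study.

```python
import numpy as np

def lagged_correlation(a, b, tau):
    """Estimate lambda_ab(tau) of Eq. (1): correlate a(t) with b(t + tau)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if tau >= 0:
        x, y = a[:len(a) - tau], b[tau:]
    else:
        x, y = a[-tau:], b[:len(b) + tau]
    # Averages are taken over the overlapping samples only (an assumption).
    x = x - x.mean()
    y = y - y.mean()
    return float((x * y).mean() / (x.std() * y.std()))

# Synthetic illustration: b is a copy of a delayed by 5 samples, plus noise.
rng = np.random.default_rng(0)
a = rng.standard_normal(5000)
b = np.roll(a, 5) + 0.5 * rng.standard_normal(5000)

lags = np.arange(-20, 21)
corr = np.array([lagged_correlation(a, b, int(k)) for k in lags])
best_lag = int(lags[np.argmax(corr)])  # lag at which the correlation peaks
```

In this toy case the peak of `corr` sits at the imposed delay, which is the sense in which the peak location is read as a response time.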

However, the procedure of detecting causal relationships based on linear cross-correlation suffers from a number of limitations. First, it should be noted that the statistical accuracy of the correlation function is limited by the resolution and length of the data stream. Second, the linear time series analysis ignores nonlinear correlations, which may be important for energy transfer in the magnetospheric system. For example, substorms are believed to involve storage and release of energy in the magnetotail, which is a highly nonlinear response. Similarly, magnetosphere–ionosphere coupling may also be highly nonlinear, involving the nonlinear development of accelerating potentials along auroral field lines and nonlinear current–voltage relationships. Third, the cross-correlation may not be a particularly clear measure when there are multiple peaks or if there is little or no asymmetry in the forward (i.e., λ_ab(τ)) and backward directions (i.e., $λ_{b a} (τ) = λ_{a b} (- τ)$). Finally, the cross-correlation does not provide any way to clearly distinguish two variables that are passively correlated because of a common driver from two variables that are causally related.

In the remainder of this paper, we will discuss other methods to identify causal relationships based on entropy-based discriminating statistics such as mutual information and transfer entropy. We will also discuss the cumulant-based method. We will illustrate the shortcomings and strengths of the various methods for studying causality with examples from nonlinear dynamics and space physics.

2 Linear vs. nonlinear dependency

It is well known that the magnetosphere responds to variation in the solar wind parameters (Baker et al., 1983; Clauer et al., 1981; Crooker and Gringauz, 1993; Johnson and Wing, 2015; Papitashvili et al., 2000; Wing and Johnson, 2015; Wing et al., 2016), and it has been established that the magnetosphere has a significant linear response to the solar wind. However, it is also expected that the magnetosphere has a nonlinear response (Balikhin et al., 2011; Klimas et al., 1998; Tsurutani et al., 1990; Valdivia et al., 2013; Vassiliadis et al., 1990). The nonlinear response may be driven by internal dynamics rather than being driven externally (Johnson and Wing, 2005; Wing et al., 2005). For example, the internal dynamics associated with loading and unloading of magnetic energy associated with storms and substorms is nonlinear (Johnson and Wing, 2014). Indeed, the data analysis of Bargatze et al. (1985) indicated that the dynamical response of the magnetosphere to solar wind input could not be entirely understood using linear prediction filters.

Suppose that we consider a set of variables a and b, which could be vectors of variables measured in time, and we would like to measure their dependency. Instead of considering the covariance matrix or correlation function, a more general measure of dependency between an input and output is obtained by considering whether

\begin{matrix} (2) & P (a, b) \overset{?}{=} P (a) P (b), \end{matrix}

where P(a,b) is the joint probability of input a and output b, while P(a) and P(b) are the probability of a and b respectively. If the relationship holds, then the variables a and b are independent. For all other cases, there is some measure of dependency. In the case where the system output is completely known given the input, $P (a, b) = P (a)$. The advantage of considering Eq. (2) is that it is possible to detect the presence of higher order nonlinear dependencies between the input and output even in the absence of linear dependencies (Gershenfeld, 1998).
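A small synthetic example may help to show why Eq. (2) is a stronger test than linear correlation. The sketch below assumes a toy output b = a² (chosen only for illustration): the output is completely determined by the input, yet the linear correlation is close to zero, while the joint probability of a pair of coarse events clearly differs from the product of the marginals.

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.standard_normal(100_000)
b = a**2  # fully determined by a, yet linearly uncorrelated with it

# Linear correlation coefficient: close to zero for this pair.
linear_corr = float(np.corrcoef(a, b)[0, 1])

# Check Eq. (2) on one coarse pair of events: A = {|a| > 1}, B = {b > 1}.
in_a = np.abs(a) > 1
in_b = b > 1
p_a = float(np.mean(in_a))
p_b = float(np.mean(in_b))
p_ab = float(np.mean(in_a & in_b))

# Independence would require p_ab == p_a * p_b; here p_ab equals p_a instead,
# the limit in which the output is completely known given the input.
independence_gap = p_ab - p_a * p_b
```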

2.1 Mutual information and cumulant-based cost

Mutual information and cumulant-based cost are two useful measures that quantify Eq. (2). Mutual information has the advantage that in the limit of Gaussian joint probability distributions, it may be simply related to the correlation coefficient λ_ab(τ) defined in Eq. (1) (Li, 1990). Cumulants have the advantage of good statistics for limited datasets and noisy systems (Deco and Schürmann, 2000). Moreover, for high-dimensional systems it is more efficient to compute moments of the data rather than try to construct the probability density function.

Correlation studies also only detect linear correlations, so if the feedback involves nonlinear processes (highly likely in this case) then their usefulness may be seriously limited. Alternatively, entropy-based measures such as mutual information (Materassi et al., 2011; Prichard and Theiler, 1995) and cumulants (Johnson and Wing, 2005) are useful for detecting linear as well as nonlinear correlations. The mutual information is constructed from the probability distribution function of the variables and may be computed using a quantization procedure where data are binned such that the samples [a(t)] are assigned discrete values $\hat{a} \in \{a_{1}, a_{2}, \dots, a_{n}\}$ of an alphabet ℵ₁ and [b(t)] is assigned discrete values $\hat{b} \in \{b_{1}, b_{2}, \dots, b_{m}\}$ of an alphabet ℵ₂. The ad hoc time-shifted mutual information

\begin{matrix} (3) & M_{a b} (τ) ≜ \sum_{\hat{a} \in ℵ_{1}, \hat{b} \in ℵ_{2}} p (\hat{a} (t + τ), \hat{b} (t)) \log \left( \frac{p (\hat{a} (t + τ), \hat{b} (t))}{p (\hat{a}) p (\hat{b})} \right) \end{matrix}

has been used as an indicator of causality, but suffers from the same problems as the time-shifted cross-correlation when it has multiple peaks and long-range correlations.
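The binned estimate of Eq. (3) can be sketched as follows. This is a minimal plug-in estimator assuming equal-width bins and natural logarithms; the function names, the bin count of 16, and the synthetic series (a quadratic response with a delay of 3 samples) are illustrative assumptions rather than the procedure used in this study.

```python
import numpy as np

def mutual_information(x, y, n_bins=16):
    """Plug-in estimate of mutual information (in nats) from a 2-D histogram."""
    counts, _, _ = np.histogram2d(x, y, bins=n_bins)
    p_xy = counts / counts.sum()
    p_x = p_xy.sum(axis=1, keepdims=True)   # marginal of x, shape (n_bins, 1)
    p_y = p_xy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, n_bins)
    mask = p_xy > 0                          # empty cells contribute nothing
    return float(np.sum(p_xy[mask] * np.log(p_xy[mask] / (p_x @ p_y)[mask])))

def lagged_mutual_information(a, b, tau, n_bins=16):
    """M_ab(tau) following Eq. (3): pairs a(t + tau) with b(t), for tau >= 0."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return mutual_information(a[tau:], b[:len(b) - tau], n_bins)

# Synthetic illustration: a responds quadratically to b after 3 samples.
rng = np.random.default_rng(2)
b = rng.standard_normal(20000)
a = np.roll(b, 3) ** 2 + 0.1 * rng.standard_normal(20000)

mi = [lagged_mutual_information(a, b, k) for k in range(10)]
peak_lag = int(np.argmax(mi))  # lag with the largest shared information
```

Because the dependence in this toy case is purely quadratic, a linear correlation at the same lag would be near zero while the mutual information is not.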

Similarly, examination of time-shifted cumulants could be used as an indicator of causality in a nonlinear system. In this case, we can define a discriminating statistic

\begin{matrix} (4) & D^{C} = \sum_{q = 1}^{\infty} \sum_{i_{1}, \dots, i_{q} \in Π_{q}} K_{1 i_{2} \dots i_{q}}^{2}, \end{matrix}

where

\begin{matrix} (5) & \begin{array}{lcl} K_{i} & = & C_{i} = 〈 z_{i} 〉 \\ K_{i j} & = & C_{i j} - C_{i} C_{j} = 〈 z_{i} z_{j} 〉 - 〈 z_{i} 〉 〈 z_{j} 〉 \\ K_{i j k} & = & C_{i j k} - C_{i j} C_{k} - C_{j k} C_{i} - C_{i k} C_{j} + 2 C_{i} C_{j} C_{k} \\ K_{i j k l} & = & C_{i j k l} - C_{i j k} C_{l} - C_{i j l} C_{k} - C_{i l k} C_{j} - C_{l j k} C_{i} \\ & & - C_{i j} C_{k l} - C_{i l} C_{k j} - C_{i k} C_{j l} + 2 (C_{i j} C_{k} C_{l} \\ & & + C_{i k} C_{j} C_{l} + C_{i l} C_{j} C_{k} + C_{j k} C_{i} C_{l} + C_{j l} C_{i} C_{k} \\ & & + C_{k l} C_{i} C_{j}) - 6 C_{i} C_{j} C_{k} C_{l} \end{array} \end{matrix}

are the cumulants, written in terms of the moments

\begin{matrix} (6) & C_{i \dots j} = \int d z P (z) z_{i} \dots z_{j} \equiv 〈 z_{i} \dots z_{j} 〉 \end{matrix}

of the joint probability distribution for variables $z_{1}, \dots, z_{j}$.

With only two variables, a and b, defined above, we can consider the cost function

\begin{matrix} (7) & D_{a, b}^{C} (τ) = D^{C} (a (t), b (t + τ)) . \end{matrix}

The presence of nonlinear dependence has been identified by comparing the cumulant-based cost for a time series with the cumulant-based cost of surrogate time series, which are constructed to have the same linear correlations, as in Johnson and Wing (2005). The significance measures the difference between the discriminating statistic and the mean of the discriminating statistic of the surrogates, in units of the spread of the surrogates, σ.
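For two variables, a truncated version of the cost in Eq. (4) and the significance described above can be sketched as follows. Several choices here are assumptions made for illustration and are not taken from the paper: the sum is cut off at fourth order, only cumulants that mix both variables are kept, and the function names are hypothetical. The mixed cumulants use the zero-mean forms of Eq. (5), which is why both series are centered first.

```python
import numpy as np

def cross_cumulant_cost(z1, z2):
    """Sum of squared mixed cumulants of (z1, z2) through fourth order."""
    z1 = np.asarray(z1, dtype=float) - np.mean(z1)
    z2 = np.asarray(z2, dtype=float) - np.mean(z2)

    def m(p, q):
        # Mixed moment <z1^p z2^q> of the centered variables.
        return float(np.mean(z1**p * z2**q))

    k11 = m(1, 1)                                         # second order
    k21, k12 = m(2, 1), m(1, 2)                           # third order
    k31 = m(3, 1) - 3.0 * m(2, 0) * m(1, 1)               # fourth order
    k22 = m(2, 2) - m(2, 0) * m(0, 2) - 2.0 * m(1, 1) ** 2
    k13 = m(1, 3) - 3.0 * m(0, 2) * m(1, 1)
    return k11**2 + k21**2 + k12**2 + k31**2 + k22**2 + k13**2

def significance(stat, surrogate_stats):
    """Distance of stat from the surrogate mean in units of the surrogate spread."""
    s = np.asarray(surrogate_stats, dtype=float)
    return float((stat - s.mean()) / s.std(ddof=1))

# Synthetic illustration: independent pair versus a purely quadratic dependence.
rng = np.random.default_rng(5)
z1 = rng.standard_normal(50_000)
cost_independent = cross_cumulant_cost(z1, rng.standard_normal(50_000))
cost_nonlinear = cross_cumulant_cost(z1, z1**2)
```

In the quadratic case the second-order term is near zero, so the cost comes almost entirely from the third- and fourth-order cumulants, which is the kind of dependence a linear statistic would miss.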

In Sect. 3, we will show an application of cumulant-based analysis to the disturbance storm time index (D_st). In principle, the cross-correlation, mutual information, and cumulant-based cost should be independent of the selection of measurement points if the system is stationary; therefore, time stationarity can be examined by comparing these discriminating statistics for groups of measurements drawn from different windows of time as in Johnson and Wing (2005) and Wing et al. (2016).

2.2 Transfer entropy

Another method for determining causality is the one-sided transfer entropy (De Michelis et al., 2011; Materassi et al., 2014; Schreiber, 2000; Wing et al., 2016, 2018), which is based upon the conditional mutual information

\begin{matrix} (8) & M_{C} (x, y | z) ≜ \sum_{x \in ℵ_{1}} \sum_{y \in ℵ_{2}} \sum_{z \in ℵ_{3}} p (x, y, z) \log \left( \frac{p (x, y, z) p (z)}{p (x, z) p (y, z)} \right) . \end{matrix}

The conditional mutual information measures the dependence of two variables, x and y, given a conditioner variable, z. If either x or y is dependent on z, the mutual information between x and y is reduced, and this reduction of information provides a method to eliminate coincidental dependence, or conversely to identify causal dependence.

Transfer entropy considers the conditional mutual information between two variables using the past history of one of the variables as the conditioner:

\begin{matrix} (9) & T_{a \to b} (τ) = \sum_{\hat{a} \in ℵ_{1}} \sum_{{\hat{a}}^{(k)} \in ℵ_{1}^{(k)}} \sum_{\hat{b} \in ℵ_{2}} p (\hat{a} (t + τ), {\hat{a}}^{(k)} (t), \hat{b} (t)) \log \left( \frac{p (\hat{a} (t + τ) | {\hat{a}}^{(k)} (t), \hat{b} (t))}{p (\hat{a} (t + τ) | {\hat{a}}^{(k)} (t))} \right), \end{matrix}

where ${\hat{a}}^{(k)} (t) = [\hat{a} (t), \hat{a} (t - Δ), \dots, \hat{a} (t - (k - 1) Δ)]$. The standard definition of transfer entropy takes k=1 (no lag), but keeping a higher embedding dimension could in principle provide a more precise measure (for example, if a has periodicity, a dimension of 2 may provide better prediction of future values of a from its past time series and therefore lower the transfer entropy). Transfer entropy as a discriminating statistic has the following advantages. First, in the absence of information flow from a to b (i.e., a(t+τ) has no additional dependence on b(t) beyond what is known from the past history a^(k)(t)), $p (\hat{a} (t + τ) | {\hat{a}}^{(k)} (t), \hat{b} (t)) = p (\hat{a} (t + τ) | {\hat{a}}^{(k)} (t))$ and the transfer entropy vanishes. Second, the transfer entropy is highly directional, so that in general $T_{a \to b} \neq T_{b \to a}$. This advantage can be clearly seen for dynamical systems in which variables are forward differenced: the transfer entropy is clearly one-sided, while mutual information and correlation functions can even be symmetric (Schreiber, 2000). Third, this measure accounts for static internal correlations, which can be used to determine whether two variables are driven by a common driver or whether the variable b is causally driving the variable a.
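Eq. (9) with k = 1 can be evaluated from binned data by writing the conditional mutual information as a combination of joint entropies. The sketch below is a plug-in estimator under stated assumptions: equal-width bins, 8 bins per variable, natural logarithms, hypothetical function names, and a synthetic pair in which b drives a with a one-sample delay.

```python
import numpy as np

def discretize(x, n_bins):
    """Map samples to integer bin labels 0..n_bins-1 using equal-width bins."""
    x = np.asarray(x, dtype=float)
    edges = np.linspace(x.min(), x.max(), n_bins + 1)
    return np.digitize(x, edges[1:-1])

def entropy(*labels):
    """Plug-in joint Shannon entropy (nats) of one or more label sequences."""
    joint = np.stack(labels, axis=1)
    _, counts = np.unique(joint, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log(p)))

def transfer_entropy(a, b, tau, n_bins=8):
    """Eq. (9) with k = 1: what b(t) adds about a(t + tau) beyond a(t); tau >= 1."""
    a_hat, b_hat = discretize(a, n_bins), discretize(b, n_bins)
    a_fut, a_now, b_now = a_hat[tau:], a_hat[:-tau], b_hat[:-tau]
    # Conditional mutual information expressed through joint entropies.
    return (entropy(a_fut, a_now) + entropy(a_now, b_now)
            - entropy(a_now) - entropy(a_fut, a_now, b_now))

# Synthetic illustration: b drives a with a one-sample delay, not the reverse.
rng = np.random.default_rng(3)
n = 20000
b = rng.standard_normal(n)
noise = 0.1 * rng.standard_normal(n)
a = np.zeros(n)
for t in range(n - 1):
    a[t + 1] = 0.6 * a[t] + b[t] + noise[t]

te_b_drives_a = transfer_entropy(a, b, tau=1)   # b(t) informs a(t + 1)
te_a_drives_b = transfer_entropy(b, a, tau=1)   # a(t) should not inform b(t + 1)
```

The two values differ strongly in this toy case, which illustrates the one-sidedness discussed above. A plug-in estimate is biased upward for finite samples, so the reverse value is small but not exactly zero.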

Both mutual information and transfer entropy require binning of data. As mentioned in Wing et al. (2016), the number of bins (n_b) needs to be chosen properly and there are some guidelines that can be followed. In general, we would like to maximize the amount of information. Having too few bins would lump too many points into the same bin, leading to loss of information. Conversely, having too many bins would leave many bins with zero or only a few points, which also would lead to loss of information. Sturges (1926) proposed that for a normal distribution, optimal $n_{b} = \log_{2} (n) + 1$ and bin width w = range∕n_b, where n is the number of points in the dataset and range is the maximum value minus the minimum value of the points. In practice, there is usually a range of n_b that would work.
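As a quick numerical illustration of the Sturges guideline, the sketch below evaluates n_b and w for two sample sizes. Rounding n_b up to the next integer is an assumption made here, and the function names are hypothetical.

```python
import math

def sturges_bins(n_points):
    """Sturges (1926) bin count, n_b = log2(n) + 1, rounded up to an integer."""
    return int(math.ceil(math.log2(n_points) + 1))

def bin_width(values, n_bins):
    """Bin width w = range / n_b, with range = max - min of the data."""
    return (max(values) - min(values)) / n_bins

# One year of hourly samples (8760 points) and a power-of-two example.
bins_one_year = sturges_bins(8760)
bins_1024 = sturges_bins(1024)
width_example = bin_width([0.0, 10.0], 5)
```

For one year of hourly data this gives about 15 bins, which can serve as a starting point for checking that results are stable over a range of n_b.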

3 Application to space weather: D_st analysis

D_st (disturbance storm time index) is an hourly index that gives a measure of the strength of the symmetric ring current that, in turn, provides a measure of the dynamics of geomagnetic storms (Dessler and Parker, 1959). Because of its global nature, D_st is often used as one of the several indices that represent the state of the magnetosphere. For example, Balasis et al. (2011) used the cumulative square amplitude of the D_st time series as a proxy for energy dissipation rate in the magnetosphere and found that it fits a power law well with log-periodic oscillations, which was interpreted as evidence for discrete-scale invariance in the D_st dynamics.

When plasma sheet ions are injected into the Earth's inner magnetosphere, they drift westward around the Earth, forming the ring current. Studies have shown that the substorm occurrence rate increases with solar wind velocity (high speed streams) (Kissinger et al., 2011; Newell et al., 2016). An increase in the solar wind electric field, VB_z, can increase the dawn–dusk electric field in the magnetotail, which in turn determines the number of plasma sheet particles that move to the inner magnetosphere (Friedel et al., 2001). Studies have shown that the electric field, VB_s (V_sw × southward IMF B_z) or VB_z, has a strong effect on the ring current dynamics (Burton et al., 1975; McPherron and O'Brien, 2001; O'Brien and McPherron, 2000; Weygand and McPherron, 2006).

For the present study, we examine the relationships between solar wind velocity (V_sw) and VB_s with D_st. We use D_st records in the period 1974–2001 obtained from Kyoto University World Data Center for Geomagnetism (http://swdcwww.kugi.kyoto-u.ac.jp/index.html, last access: 18 January 2018). The corresponding solar wind data are obtained from IMP-8, ACE, WIND, ISEE1, and ISEE3 observations. The ACE SWEPAM and MAG data and the WIND MAG data are obtained from CDAWeb (http://cdaweb.gsfc.nasa.gov/, last access: 18 January 2018). The WIND 3DP data are obtained from the 3DP team directly. The ISEE1 and ISEE3 data are obtained from UCLA (these datasets are also available at NASA NSSDC; http://nssdc.gsfc.nasa.gov/space/, last access: 18 January 2018). The IMP8 data come directly from the IMP teams. The solar wind is propagated with the minimum variance technique (Weimer et al., 2003) to GSM (X, Y, Z) = (17, 0, 0) R_E to produce 1 min files, from which hourly averaged solar wind parameters are constructed.

3.1 Cumulant-based analysis

Section 2.1 presents the method of cumulant-based cost. Here, we show an application of cumulant-based cost to detect nonlinear dynamics in D_st. We consider the forward coupling between a solar wind variable such as VB_s and D_st, which characterizes the ring current response to the solar wind driver. We therefore consider the nonlinear cross-correlations of the vector

\begin{matrix} (10) & c (t, τ) = {{VB}_{s} (t), D_{st} (t + τ)} = {z_{1}, z_{2}} . \end{matrix}

The generalization of cost is based on realizations of {z₁,z₂}. In this case, each variable is Gaussianized with unit variance to eliminate static nonlinearities (i.e., higher order self-correlations in VB_s and D_st are eliminated so that the cost measures only cross-dependence between VB_s and D_st). This procedure is explained in the next paragraph.

The distributions of D_st and VB_s are generally non-Gaussian. As such, the raw distributions (e.g., distribution of values of D_st) may have nonzero higher order cumulants (e.g., they can have a skew and kurtosis). This property makes it more difficult to interpret whether the higher order cumulants in the time evolution arise from the overall shape of the distribution of data points or from the time-ordering of the data. To eliminate the inherent nonzero cumulants in the overall distribution of data, we construct a rank-ordered map from the original dataset to a proxy dataset of the same length drawn from a Gaussian distribution (Deco and Schürmann, 2000; Kennel and Isabelle, 1992; Schreiber and Schmitz, 1996). The distribution of the proxy dataset ensures that all cumulants of the distribution beyond second order should in principle vanish. However, the time-ordering of the data can still lead to nonzero cumulants because the joint probability distribution of D_st(t+τ) and D_st(t) may be non-Gaussian even if the distribution of D_st is Gaussian. Moreover, it is simple to construct surrogate data from the Gaussianized data that share the same autocorrelation by using the same power spectrum but randomly shifting the phases of the Fourier coefficients. The surrogate data therefore have the same autocorrelation as the original data. Any deviation from the linear statistic is apparent from comparison with the surrogate data, and we interpret these deviations as evidence of nonlinear dependence because we have falsified the hypothesis that the data can be adequately described by linear statistics. This method has been successfully employed in Johnson and Wing (2005), in which the K_p record was analyzed with mutual information and cumulants.
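The two steps described above, the rank-ordered Gaussian map and the phase-randomized surrogate, can be sketched as follows. This is an illustration only: the function names are hypothetical, the input is a synthetic skewed series rather than D_st or VB_s, and the zero-frequency and Nyquist Fourier coefficients are left unchanged so that the inverse transform stays real.

```python
import numpy as np

def gaussianize(x, rng):
    """Rank-ordered map of x onto sorted draws from a unit Gaussian."""
    x = np.asarray(x, dtype=float)
    proxy = np.sort(rng.standard_normal(len(x)))
    ranks = np.argsort(np.argsort(x))        # rank of each sample, 0 = smallest
    return proxy[ranks]

def phase_randomized_surrogate(x, rng):
    """Surrogate with the same power spectrum as x but random Fourier phases."""
    x = np.asarray(x, dtype=float)
    spectrum = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, len(spectrum))
    randomized = np.abs(spectrum) * np.exp(1j * phases)
    randomized[0] = spectrum[0]              # keep the mean unchanged
    if len(x) % 2 == 0:
        randomized[-1] = spectrum[-1]        # Nyquist term must stay real
    return np.fft.irfft(randomized, n=len(x))

# Synthetic illustration: a skewed, autocorrelated series.
rng = np.random.default_rng(4)
smooth = np.convolve(rng.standard_normal(4096), 0.9 ** np.arange(50), mode="same")
raw = np.exp(0.5 * smooth)

gauss = gaussianize(raw, rng)                       # same time-ordering of ranks
surrogate = phase_randomized_surrogate(gauss, rng)  # same autocorrelation
```

Since the autocorrelation is the Fourier transform of the power spectrum, preserving the spectral amplitudes preserves the linear correlations, while the random phases destroy any nonlinear time-ordering.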

In Fig. 1 we plot the significance obtained from the year 1999 as a function of time delay, τ. Significance extracted from ${{VB}_{s} (t), D_{st} (t + τ)}$ and ${{VB}_{s} (t), {VB}_{s} (t + τ)}$ for 1999 is plotted in panels (a) and (b), respectively. It should be noted that there is a strong linear response at around 3 h time delay. As shown in Fig. 1a, there is a clear nonlinear response with peaks around 3–10, 25, 50, and 90 h, lasting for approximately 1 week. In contrast, in Fig. 1b, the nonlinearity only has one broad peak around 3–12 h in the self-significance for VB_s, suggesting that the nonlinear and linear peaks at τ=3–12 h in Fig. 1a may be associated with VB_s. We will revisit the solar wind causal relationship with D_st using transfer entropy in Sect. 3.2.

The absence of the nonlinear peaks at τ = 25, 50, and 90 h in the self-significance for VB_s (Fig. 1b) suggests that these nonlinearities in ${{VB}_{s} (t), D_{st} (t + τ)}$ are related to internal magnetospheric dynamics. As the D_st index is thought to reflect storm activity, it is reasonable that nonlinear significance would decay on the order of 1 week as storms commonly last around that time. The strong nonlinear responses at τ = 25, 50, and 90 h are likely related to multiple modes of relaxation of the ring current following the commencement of storms. It should also be noted that other nonlinearities detected by even higher order cumulants may also be present; however, the calculation demonstrates the nonlinear nature of the underlying dynamics.

https://www.ann-geophys.net/36/945/2018/angeo-36-945-2018-f01

Figure 1. Significance extracted from (a) ${{VB}_{s} (t), D_{st} (t - τ)}$ and (b) ${{VB}_{s} (t), {VB}_{s} (t - τ)}$ for 1999. It should be noted that there is a strong linear response at around 3 h time delay. There is a clear nonlinear response with a strong peak around 50 h lasting for approximately 1 week. The long-term nonlinear response is absent in the solar wind data, indicating that the long-term nonlinear correlations between VB_s and D_st are the result of internal magnetospheric dynamics.

A common scenario for storm–ring current interaction is the following. A storm compresses the magnetosphere, intensifies the magnetic field in the magnetosphere, and injects energetic particles into the ring current region. The ring current intensifies during the main phase of the storm, which can last ∼ 6 h (Weygand and McPherron, 2006). Once the injection stops, the ring current begins to decay and the storm enters the recovery phase. Conservation of the magnetic moment implies that anisotropies develop in the ring current and plasma sheet. Anisotropy drives the ring current plasma unstable to ion cyclotron waves. The ion cyclotron waves scatter energetic ions into the loss cone so that they are lost from the ring current. Nonlinear interaction between waves and particles keeps the plasma near marginal stability with a steady loss of energetic particles due to wave–particle scattering. Other loss mechanisms include charge exchange, Coulomb scattering, and convection of ions to the front of the magnetopause. The ring current decay can have two stages (Kozyra et al., 2002). In the first stage, the ring current decays rapidly and the loss mechanisms can be attributed to convective outflow, pitch-angle scattering in the ring current, and O⁺ charge exchange (Hamilton et al., 1988; Weygand and McPherron, 2006). The second stage may typically begin about 1 day from the commencement of the storm (see, for example, Fig. 7 of Kozyra et al., 2002). In the second stage, the decay rate is slower and is attributed mainly to H⁺ charge exchange (Hamilton et al., 1988) and can take several days to deplete the ring current to the baseline level (Smith et al., 1976). We can speculate that the multiple nonlinear response lag times that are detected with the cumulant-based approach are likely the relaxation of the ring current due to the complex interplay of multiple loss processes.

https://www.ann-geophys.net/36/945/2018/angeo-36-945-2018-f02

Figure 2. Comparison of mutual information and transfer entropy measures to determine causal driving of the magnetosphere as characterized by D_st. Note that causal driving appears to peak somewhat later (11 h) than indicated by mutual information (2 h), indicating that internal dynamics likely are very important initially. The backward transfer entropy is below the noise level for all values, indicating that D_st in no way influences the upstream solar wind velocity. Such a conclusion could not be inferred from the mutual information measure.

3.2 Transfer entropy

As mentioned in Sect. 2.2, transfer entropy gives a measure of how much information is transferred from one variable to another. We have applied transfer entropy and mutual information to the relationship between the V_sw and D_st for the period 1974–2001. The result is shown in Fig. 2. Note that the mutual information measure suggests strong correlations between prior values of D_st and V_sw. This finding suggests that D_st could be a driver of V_sw, which is counterintuitive. On the other hand, the transfer entropy clearly shows that this information transfer in the backward direction (D_st→V_sw) does not rise above the noise level (the horizontal blue lines indicate the mean and standard deviation of 100 surrogate datasets for which the data were randomly reordered). This result is expected because it is the solar wind that drives the magnetosphere, not the other way around. The transfer of information from V_sw to D_st peaks at τ=8–11 h. The cumulant-based analysis in Sect. 3.1 shows that the response of D_st to VB_s has a similar timescale. This timescale is consistent with the 4 to 15 h transport time for the solar wind to reach the midnight and noon regions of the geosynchronous orbit, respectively, from the dayside magnetopause (Borovsky et al., 1998). The analysis presented here illustrates the power of the transfer entropy for assessing causality.
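The noise level obtained from randomly reordered surrogates can be sketched generically. In the sketch below, the choice to reorder only the second series, the helper name `shuffle_noise_level`, and the use of the absolute linear correlation as a stand-in statistic are all assumptions for illustration; any of the discriminating statistics discussed above could be passed in instead.

```python
import numpy as np

def shuffle_noise_level(stat, a, b, n_surrogates=100, seed=0):
    """Mean and standard deviation of stat(a, b_shuffled) over random reorderings of b."""
    rng = np.random.default_rng(seed)
    values = [stat(a, rng.permutation(b)) for _ in range(n_surrogates)]
    return float(np.mean(values)), float(np.std(values, ddof=1))

def abs_correlation(x, y):
    """Stand-in statistic: magnitude of the linear correlation coefficient."""
    return float(abs(np.corrcoef(x, y)[0, 1]))

# Synthetic illustration: a weak but real dependence of b on a.
rng = np.random.default_rng(6)
a = rng.standard_normal(2000)
b = 0.3 * a + rng.standard_normal(2000)

observed = abs_correlation(a, b)
noise_mean, noise_std = shuffle_noise_level(abs_correlation, a, b)
above_noise = observed > noise_mean + 3.0 * noise_std
```

Reordering destroys any time-dependent relationship between the two series while keeping their individual distributions, so the surrogate mean and spread give the level that the statistic reaches by chance.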

4 Summary

We recently used mutual information, transfer entropy, and conditional mutual information to discover the solar wind drivers of the outer radiation belt electrons (Wing et al., 2016). Because V_sw anticorrelates with solar wind density (n_sw), it is hard to isolate the effects of V_sw on radiation belt electrons, given n_sw and vice versa. However, using conditional mutual information, we were able to determine the information transfer from n_sw or any other solar wind parameters to radiation belt electrons, given V_sw (or any other solar wind parameters). We also showed that the triangle distribution in the radiation belt electron vs. solar wind velocity plot (Reeves et al., 2011) can be understood better when we consider that V_sw and n_sw transfer information to radiation belt electrons with lags of 2 and 0 days (< 24 h), respectively. Also recently, we used transfer entropy to better understand the causal parameters in the solar cycle dynamo and their response lag times (Wing et al., 2018).

As a follow-up to Wing et al. (2016, 2018), the present study demonstrates further how information theoretical tools can be useful for space physics and space weather studies. Cumulant-based analysis can be used to distinguish internal vs. external driving of the system. Both mutual information and transfer entropy give a measure of shared information between two variables (or vectors). However, unlike mutual information, transfer entropy is highly directional. To illustrate, we apply mutual information, transfer entropy, and cumulant-based analysis to investigate the dynamics of the D_st index.

Our analysis with mutual information and transfer entropy indicates that there are strong linear and nonlinear correlations and transfer of information, respectively, in the forward direction between V_sw and D_st (V_sw → D_st). However, mutual information indicates that there is also a strong correlation in the backward direction (D_st → V_sw), which is puzzling and counterintuitive. In contrast, the transfer entropy indicates that there is no information transfer in the backward direction (D_st→V_sw), as expected because it is the solar wind that drives the magnetosphere, not the other way around. The transfer of information from V_sw to D_st peaks at τ=8–11 h.

Using the cumulant-based significance, we have established that the underlying dynamics of D_st is in general nonlinear, exhibiting a quasiperiodicity which is detectable only if nonlinear correlations are taken into account. The strong nonlinear responses of D_st to VB_s at τ=25, 50, and 90 h are likely related to multiple modes of relaxation of the ring current from multiple loss mechanisms following the commencement of storms. It is, of course, possible that these nonlinearities are caused by solar wind drivers other than VB_s. However, the timing of these nonlinearities would put them well in the recovery phase of a storm, and previous studies suggested that the ring current decays in the recovery phase are strongly influenced by VB_s (Burton et al., 1975; McPherron and O'Brien, 2001; O'Brien and McPherron, 2000). The nonlinearities at τ=3–12 h are not caused by internal dynamics but rather by the solar wind driver, which is similar to the timescale for the solar wind transport time from the dayside magnetopause to the inner magnetosphere. This timescale is consistent with the timescale for the information transfer from the solar wind to D_st obtained from transfer entropy analysis.

Although linear models are useful, our results indicate that these models have to be used with caution because the solar wind–magnetosphere system is inherently nonlinear. Hence, nonlinearities generally need to be taken into account in order to describe the system accurately. Local linear models (which include slow evolution of parameters) may be able to handle some nonlinearities, but it is expected that these local linear models would have difficulties if the dynamics suddenly and rapidly change.

Data availability

All the derived data products in this paper are available upon request by email (simon.wing@jhuapl.edu).

Competing interests

The authors declare that they have no conflict of interest.

Acknowledgements

Simon Wing acknowledges support from JHU/APL Janney Fellowship, NSF grant AGS-1058456, and NASA grants (NNX13AE12G, NNX15AJ01G, NNX16AR10G, and NNX16AQ87G). Jay R. Johnson acknowledges support from NASA grants (NNH11AR07I, NNX14AM27G, NNH14AY20I, NNX16AC39G), NSF grants (ATM0902730, AGS-1203299, AGS-1405225), and DOE contract DE-AC02-09CH11466. Enrico Camporeale is partially funded by the NWO Vidi grant no. 639.072.716. We thank James M. Weygand for the solar wind data processing. The raw solar wind data from ACE, Wind, ISEE1, and ISEE3 were obtained from NASA CDAW and NSSDC.
The topical editor, Georgios Balasis, thanks one anonymous referee for help in evaluating this paper.

References

Baker, D. N., Zwickl, R. D., Bame, S. J., Hones, E. W., Tsurutani, B. T., Smith, E. J., and Akasofu, S.-I.: An ISEE 3 high time resolution study of interplanetary parameter correlations with magnetospheric activity, J. Geophys. Res., 88, 6230, https://doi.org/10.1029/ja088ia08p06230, 1983.

Balasis, G., Papadimitriou, C., Daglis, I. A., Anastasiadis, A., Athanasopoulou, L., and Eftaxias, K.: Signatures of discrete scale invariance in Dst time series, Geophys. Res. Lett., 38, L13103, https://doi.org/10.1029/2011GL048019, 2011.

Balikhin, M. A., Boynton, R. J., Walker, S. N., Borovsky, J. E., Billings, S. A., and Wei, H. L.: Using the NARMAX approach to model the evolution of energetic electrons fluxes at geostationary orbit, Geophys. Res. Lett., 38, L18105, https://doi.org/10.1029/2011GL048980, 2011.

Bargatze, L. F., Baker, D. N., Hones, E. W., and McPherron, R. L.: Magnetospheric impulse response for many levels of geomagnetic activity, J. Geophys. Res., 90, 6387–6394, 1985.

Borovsky, J. E., Thomsen, M. F., and Elphic, R. C.: The driving of the plasma sheet by the solar wind, J. Geophys. Res., 103, 17617–17640, https://doi.org/10.1029/97JA02986, 1998.

Burton, R. K., McPherron, R. L., and Russell, C. T.: An Empirical Relationship Between Interplanetary Conditions and Dst, J. Geophys. Res., 80, 4204–4214, 1975.

Clauer, C. R., McPherron, R. L., Searls, C., and Kivelson, M. G.: Solar wind control of auroral zone geomagnetic activity, Geophys. Res. Lett., 8, 915–918, https://doi.org/10.1029/gl008i008p00915, 1981.

Crooker, N. U. and Gringauz, K. I.: On the low correlation between long-term averages of solar wind speed and geomagnetic activity after 1976, J. Geophys. Res., 98, 59–62, https://doi.org/10.1029/92ja01978, 1993.

Deco, G. and Schürmann, B.: Information Dynamics, Springer-Verlag, New York, 2000.

De Michelis, P., Consolini, G., Materassi, M., and Tozzi, R.: An information theory approach to the storm-substorm relationship, J. Geophys. Res.-Space, 116, A08225, https://doi.org/10.1029/2011JA016535, 2011.

Dessler, A. J. and Parker, E. N.: Hydromagnetic theory of geomagnetic storms, J. Geophys. Res., 64, 2239–2252, https://doi.org/10.1029/JZ064i012p02239, 1959.

Friedel, R. H. W., Korth, H., Henderson, M. G., Thomsen, M. F., and Scudder, J. D.: Plasma sheet access to the inner magnetosphere, J. Geophys. Res.-Space, 106, 5845–5858, https://doi.org/10.1029/2000ja003011, 2001.

Gershenfeld, N.: The Nature of Mathematical Modeling, Cambridge University Press, Cambridge, 1998.

Hamilton, D., Gloeckler, G., Ipavich, F., Stüdemann, W., Wilken, B., and Kremser, G.: Ring current development during the great geomagnetic storm of February 1986, J. Geophys. Res.-Space, 93, 14343–14355, 1988.

Johnson, J. R. and Wing, S.: A solar cycle dependence of nonlinearity in magnetospheric activity, J. Geophys. Res., 110, A04211, https://doi.org/10.1029/2004ja010638, 2005.

Johnson, J. R. and Wing, S.: External versus internal triggering of substorms: An information-theoretical approach, Geophys. Res. Lett., 41, 5748–5754, https://doi.org/10.1002/2014gl060928, 2014.

Johnson, J. R. and Wing, S.: The dependence of the strength and thickness of field-aligned currents on solar wind and ionospheric parameters, J. Geophys. Res.-Space, 120, 3987–4008, https://doi.org/10.1002/2014ja020312, 2015.

Kennel, M. B. and Isabelle, S.: Method to Distinguish Possible Chaos from Colored Noise and to Determine Embedding Parameters, Phys. Rev. A, 46, 3111–3118, 1992.

Kissinger, J., McPherron, R. L., Hsu, T.-S., and Angelopoulos, V.: Steady magnetospheric convection and stream interfaces: Relationship over a solar cycle, J. Geophys. Res.-Space, 116, A00I19, https://doi.org/10.1029/2010ja015763, 2011.

Klimas, A. J., Vassiliadis, D., and Baker, D. N.: Dst index prediction using data-derived analogues of the magnetospheric dynamics, J. Geophys. Res., 103, 20435–20448, 1998.

Kozyra, J., Liemohn, M., Clauer, C., Ridley, A., Thomsen, M., Borovsky, J., Roeder, J., Jordanova, V., and Gonzalez, W.: Multistep Dst development and ring current composition changes during the 4–6 June 1991 magnetic storm, J. Geophys. Res.-Space, 107, SMP 33-1-SMP 33-22, https://doi.org/10.1029/2001JA000023, 2002.

Li, W.: Mutual information functions versus correlation functions, J. Stat. Phys., 60, 823, https://doi.org/10.1007/BF01025996, 1990.

Materassi, M., Ciraolo, L., Consolini, G., and Smith, N.: Predictive Space Weather: An information theory approach, Adv. Space Res., 47, 877–885, https://doi.org/10.1016/j.asr.2010.10.026, 2011.

Materassi, M., Consolini, G., Smith, N., and De Marco, R.: Information theory analysis of cascading process in a synthetic model of fluid turbulence, Entropy, 16, 1272–1286, 2014.

McPherron, R. L. and O'Brien, P.: Predicting Geomagnetic Activity: The Dst Index, in: Space Weather, edited by: Song, P., Singer, H. J., and Siscoe, G. L., https://doi.org/10.1029/GM125p0339, 2001.

Newell, P., Liou, K., Gjerloev, J., Sotirelis, T., Wing, S., and Mitchell, E.: Substorm probabilities are best predicted from solar wind speed, J. Atmos. Sol.-Terr. Phy., 146, 28–37, https://doi.org/10.1016/j.jastp.2016.04.019, 2016.

O'Brien, T. P. and McPherron, R. L.: An empirical phase space analysis of ring current dynamics: Solar wind control of injection and decay, J. Geophys. Res., 105, 7707–7720, 2000.

Papitashvili, V. O., Papitashvili, N. E., and King, J. H.: Solar cycle effects in planetary geomagnetic activity: Analysis of 36-year long OMNI dataset, Geophys. Res. Lett., 27, 2797–2800, https://doi.org/10.1029/2000gl000064, 2000.

Prichard, D. and Theiler, J.: Generalized redundancies for time series analysis, Phys. D, 84, 476–493, https://doi.org/10.1016/0167-2789(95)00041-2, 1995.

Reeves, G. D., Morley, S. K., Friedel, R. H. W., Henderson, M. G., Cayton, T. E., Cunningham, G., Blake, J. B., Christensen, R. A., and Thomsen, D.: On the relationship between relativistic electron flux and solar wind velocity: Paulikas and Blake revisited, J. Geophys. Res.-Space, 116, A02213, https://doi.org/10.1029/2010ja015735, 2011.

Schreiber, T.: Measuring Information Transfer, Phys. Rev. Lett., 85, 461–464, https://doi.org/10.1103/PhysRevLett.85.461, 2000.

Schreiber, T. and Schmitz, A.: Improved Surrogate Data for Nonlinearity Tests, Phys. Rev. Lett., 77, 635–639, 1996.

Smith, P. H., Hoffman, R. A., and Fritz, T. A.: Ring current proton decay by charge exchange, J. Geophys. Res., 81, 2701–2708, https://doi.org/10.1029/JA081i016p02701, 1976.

Strangeway, R. J., Ergun, R. E., Su, Y.-J., Carlson, C. W., and Elphic, R. C.: Factors controlling ionospheric outflows as observed at intermediate altitudes, J. Geophys. Res., 110, A03221, https://doi.org/10.1029/2004ja010829, 2005.

Sturges, H. A.: The choice of class interval, J. Am. Stat. Assoc., 21, 65–66, https://doi.org/10.1080/01621459.1926.10502161, 1926.

Tsurutani, B. T., Sugiura, M., Iyemori, T., Goldstein, B. E., Gonzalez, W. D., Akasofu, S. I., and Smith, E. J.: The nonlinear response of AE to the IMF Bs driver: A spectral break at 5 hours, Geophys. Res. Lett., 17, 279–282, 1990.

Valdivia, J. A., Rogan, J., Muñoz, V., Toledo, B. A., and Stepanova, M.: The magnetosphere as a complex system, Adv. Space Res., 51, 1934–1941, https://doi.org/10.1016/j.asr.2012.04.004, 2013.

Vassiliadis, D. V., Sharma, A. S., Eastman, T. E., and Papadopoulos, K.: Low-dimensional chaos in magnetospheric activity from AE time series, Geophys. Res. Lett., 17, 1841–1844, 1990.

Weimer, D. R., Ober, D. M., Maynard, N. C., Collier, M. R., McComas, D. J., Ness, N. F., Smith, C. W., and Watermann, J.: Predicting interplanetary magnetic field (IMF) propagation delay times using the minimum variance technique, J. Geophys. Res., 108, 1026, https://doi.org/10.1029/2002ja009405, 2003.

Weygand, J. M. and McPherron, R. L.: Dependence of ring current asymmetry on storm phase, J. Geophys. Res.-Space, 111, A11221, https://doi.org/10.1029/2006JA011808, 2006.

Wing, S. and Johnson, J. R.: Theory and observations of upward field-aligned currents at the magnetopause boundary layer, Geophys. Res. Lett., 42, 9149–9155, https://doi.org/10.1002/2015gl065464, 2015.

Wing, S., Johnson, J. R., Jen, J., Meng, C.-I., Sibeck, D. G., Bechtold, K., Freeman, J., Costello, K., Balikhin, M., and Takahashi, K.: Kp forecast models, J. Geophys. Res., 110, A04203, https://doi.org/10.1029/2004ja010500, 2005.

Wing, S., Johnson, J. R., Camporeale, E., and Reeves, G. D.: Information theoretical approach to discovering solar wind drivers of the outer radiation belt, J. Geophys. Res.-Space, 121, 9378–9399, https://doi.org/10.1002/2016ja022711, 2016.

Wing, S., Johnson, J. R., and Vourlidas, A.: Information Theoretic Approach to Discovering Causalities in the Solar Cycle, Astrophys. J., 854, 2, https://doi.org/10.3847/1538-4357/aaa8e7, 2018.

