Reconstruction of geomagnetic activity and near-Earth interplanetary conditions over the past 167 yr - Part 4: Near-Earth solar wind speed, IMF, and open solar flux

. In the concluding paper of this tetralogy, we here use the different geomagnetic activity indices to reconstruct the near-Earth interplanetary magnetic ﬁeld (IMF) and solar wind ﬂow speed, as well as the open solar ﬂux (OSF) from 1845 to the present day. The differences in how the various indices vary with near-Earth interplanetary parameters, which are here exploited to separate the effects of the IMF and solar wind speed, are shown to be statistically signiﬁcant at the 93 % level or above. Reconstructions are made using four combinations of different indices, compiled using different data and different algorithms, and the results are almost identical for all parameters. The correction to the aa index required is discussed by comparison with the Ap index from a more extensive network of mid-latitude stations. Data from the Helsinki magnetometer station is used to extend the aa index back to 1845 and the results conﬁrmed by comparison with the nearby St Petersburg observatory. The optimum variations, using all available long-term geomagnetic indices, of the near-Earth IMF and solar wind speed, and of the open solar ﬂux, are presented; all with ± 2 σ uncertainties computed using the Monte Carlo technique outlined in the earlier papers. The open solar ﬂux variation derived is shown to be very similar indeed to that obtained using the method of Lockwood et al. (1999).


Introduction
This is the fourth in a series of papers using historic geomagnetic activity data to reconstruct solar and near-Earth interplanetary behaviour since 1845. The aim of the paper is to bring together all the data and techniques developed in the previous 3 papers to generate the optimum reconstructions of near-Earth IMF, solar wind speed and open solar flux and their uncertainties. In the first paper of the series (Lockwood et al., 2013a, hereafter Paper 1) a new interdiurnal range index IDV(1d) was introduced, designed to be homogeneous in its construction such that its response to solar wind and the interplanetary magnetic field (IMF) variations was at all times as similar to that in the space age as possible. In the second paper (Lockwood et al., 2013b, hereafter Paper 2), the reconstruction of the IMF and its uncertainty from IDV(1d) was studied in detail and a Monte Carlo technique to quantify all uncertainties developed. In the third (Lockwood et al., 2014, hereafter Paper 3), a correction to IDV(1d) data for seven years during solar cycle 11 was discussed, checked against newly analysed independent data from St Petersburg and implemented. A list and brief explanation of the various geomagnetic indices used in the present paper is given in Appendix A of the accompanying Paper 3 and a glossary of the mathematical terms used here is given in Appendix A of the present paper.
The development of reconstructions of coronal and heliospheric magnetic fields from geomagnetic activity data has recently been reviewed by Lockwood (2013). In recent years a considerable degree of consensus has been achieved on Published by Copernicus Publications on behalf of the European Geosciences Union. some parameters, although there were some outstanding differences . As discussed in Paper 2, it has been known for many years that range indices such as aa and Ap depend on both the near-Earth IMF, B, and solar wind speed, V SW , with a dependency of approximately BV 2 SW (Feynman and Crooker, 1978;. This has been explained by the work of Finch et al. (2008), who showed the dependence on V SW arises from the nightside auroral electroject, and by Lockwood (2013), who explained this in terms of the effect of solar wind dynamic pressure on the near-Earth tail of the magnetosphere. This result is also consistent with the conclusions of Martini et al. (2012). On short timescales (seconds to days), geomagnetic activity depends on the southward component of the IMF (in the frame of the geomagnetic field) but, as noted by Stamper et al. (1999), on annual timescales the orientation factor of the IMF averages to an almost constant quantity so that the dependence becomes on B. In fact, the orientation factor is not quite constant for averaging on even annual timescales and this limits the maximum possible correlation with B to 0.95 and with BV 2 SW to 0.97 (Lockwood, 2013). The first attempt to separate the effects of B and V SW was by Cliver et al. (1998), who combined the aa range geomagnetic index with sunspot number, R. Although R does have a relation to B, as discussed in Paper 3, it can only explain 70 % of the variation in annual means of B since in situ space observations began (see Paper 3). Thus the uncertainties of this first separation of B and V SW were large. In their computation of the open solar flux (OSF) variation,  used the aa index and Sargent's 27 day recurrence index (RI), derived from the autocorrelation of aa at a lag of 27 days (Sargent III, 1968). The rationale for this is that the intersection of Earth with fast solar streams rarely lasts for whole 27 day solar rotation periods and so, in addition to raising the annual mean of V SW , these streams raise RI through the formation of co-rotating interaction regions and their subsequent effects on geomagnetic activity. The accuracy of using aa and RI to compute OSF is discussed later in the present paper. Svalgaard et al. (2003) generalised this concept by noting that B and solar wind speed V SW could both be derived from any combination of geomagnetic indices that had different dependencies on near-Earth interplanetary conditions and this has been exploited by Svalgaard and Cliver (2007), Rouillard et al. (2007), Lockwood et al. (2009c) and . An important contribution was made by Cliver (2005, 2010), who developed Bartels' concept of interdiurnal variation (his u index) to generate the IDV index and showed that it varied mainly with B, with very little dependence on V SW . Independently, Lockwood et al. (2006a) had also developed the median index, m, to have a different dependence on solar wind speed to the aa index. However, as pointed out by Lockwood et al. (2009c) it is important to check that the difference between the dependencies are statistically significant, otherwise the resultant B and/or V SW variations would be dominated by noise. Lockwood et al. (2009c) established that the aa and m indices were indeed significantly different in their dependence on the interplanetary parameters. In Sect. 3 of the present paper we use four combinations of long-sequence geomagnetic activity indices and show that all four meet the requirement for statistically different dependence on interplanetary conditions. In Sect. 4 we discuss the extension of indices back to 1845. We then use these combinations to derive annual values of both B and V SW (and associated uncertainties) since 1845 (Sect. 5) and hence derive the variation in the signed OSF (Sect. 6). However, we begin by discussing data calibration in Sect. 2. As discussed in papers 1 and 2, in all geomagnetic reconstructions it is tacitly or explicitly assumed that any index used responded at times before the space age in the same way that it has been observed to do in modern times. Therefore it is vital to construct all indices in the most homogeneous manner possible to make sure that this is the case. This entails more than ensuring that all instrument calibrations are correct, it requires the use of a spatial distribution of stations and an index production algorithm that remain, as far as possible, constant so that the response to interplanetary conditions also remains as constant as possible. Because in the earliest years there are very few stations from which we now have useable data, this generally makes it necessary to use a small number of stations throughout the interval covered by the reconstruction which, naturally, increases vulnerability to instrument calibration errors. In Sect. 2 we investigate an important calibration issue in relation to the widely used and homogeneously constructed aa index.

Correcting the aa index
The aa index was devised and compiled by Mayaud (1971Mayaud ( , 1972Mayaud ( , 1980. It is a "range" index, being based on the range of variation in the horizontal field component H (after subtraction of the background, quiet day variation) during 3 h intervals, giving eight values per day. Data are taken from just two midlatitude stations selected to be close to antipodal, with the Northern Hemisphere station in southern England and the Southern Hemisphere station in Australia. In both hemispheres, three different stations were needed to give a continuous index: in the north they are Greenwich (1868-1925), Abinger (1926-1956), and Hartland (1957 and in the south they are Melbourne (1868-1919), Toolangi (1920-1979), and Canberra (1980. The Southern Hemisphere and Northern Hemisphere composites are termed aa N and aa S , respectively, and aa is defined as the arithmetic mean of the two. Data from the two hemispheres agree very closely, aa N and aa S giving a correlation of 0.94 for the 27 day (solar rotation period) means and 0.98 for the annual means. The most interesting feature of aa is an upward drift in the mean level during the first half of the 19th century (e.g. Feynman and Crooker, 1978) which is accompanied by an Ann. Geophys., 32, 383-399, 2014 www.ann-geophys.net/32/383/2014/ increase in magnetic storm occurrence (Clilverd et al., 1998). Several authors have argued this rise is not real but caused by instrument calibration problems (Svalgaard et al., 2003(Svalgaard et al., , 2004Love, 2011). However, the similarity of aa N and aa S and comparisons with equivalent range indices from longlived stations such as Niemegk and Sodankylä make this unlikely (e.g. Lockwood, 2001Lockwood, , 2003Clilverd et al., 2002Clilverd et al., , 2005Cliver and Ling, 2002;Lockwood et al. 2006a). Furthermore, Lu et al. (2012) have used a time domain filtering decomposition procedure to show that aa, like other geomagnetic indices and sunspot numbers, shows secular changes rather than the discreet jumps expected of station intercalibration problems. However, there may well be one notable exception that is discussed in this section. Svalgaard et al. (2004) pointed out that the aa index, as stored in most data centres at the present time, is probably not well calibrated across the move of the Northern Hemisphere's station from Abinger to Hartland at the start of 1957. These authors deduced this by comparison of aa against the "interhour variability" index, IHV, which was devised and introduced by Svalgaard et al. (2003) and developed by Svalgaard and Cliver (2007) and uses only nightside data to minimise the effect of the diurnal geomagnetic variation. IHV for a given station is defined as the sum of the absolute values of the difference between hourly means for a specified geomagnetic component from one hour to the next over the 7 h interval around local midnight. The variation with the corrected magnetic latitude shows strong peaks in the auroral oval, indicating it responds most to the variability in the nightside westward auroral electrojet. Because the variation with corrected geomagnetic latitude is flat equatorward of 55 • , only stations equatorward of this were employed in the global IHV index developed by Svalgaard and Cliver (2007). Paper 2 demonstrates that there is only a slight difference in the responses to solar wind and IMF variations of aa and IHV, and it is not significant, and hence IHV is a valid check of the aa index. Svalgaard et al. (2004) argued that the 1957 intercalibration between the aa N data from Hartland and Abinger was responsible for an 8.1 nT step in aa, such that all the upward drift in aa during the 20th century was erroneous (even though the aa S data, which are from Toolangi for both before and after 1957, were not consistent with this idea). However, the early version of IHV that Svalgaard and Cliver employed to draw this conclusion came from just two, nearby, Northern Hemisphere stations, Cheltenham and Fredricksburg, which were intercalibrated using the available 0.75 yr of overlapping data in 1956. Using more stations, Mursula et al. (2004) found there was upward drift in IHV values over the 20th century, but it depended on the station studied; nevertheless they inferred that the upward drift in aa was probably too large. As a result, Svalgaard et al. (2003) subsequently revised their estimates of the 1957 error in aa down to 5.2 nT (this would mean that 64% of the drift in aa was erroneous). However, Mursula and Martini (2006) showed that about half of this difference was actually in the IHV estimates not aa and was caused by the use of spot values rather than hourly means in constructing the early IHV data. This was corrected by Svalgaard and Cliver (2007), who revised their estimate of the aa error further downward to 3 nT. Independently, Lockwood et al. (2006a) carried out tests of aa using the range Ap index which has been constructed since 1936 from 11-13 Northern Hemisphere stations, and range k indices from a number of other stations. They also found a step-like change around 1957 but estimated it to be only 2 nT in magnitude. Because 1957 was only 11 yr before the end of the data series available to Mayaud, and because in that time solar cycle 20 was rather unusual, this discontinuity in aa was not as apparent at that time as it is now. Other studies also indicate that aa needs adjusting by about 2 nT at this date (Jarvis, 2004;Martini and Mursula, 2008). Lockwood et al. (2006a) introduced a corrected aa index by introducing a zero-level offset change of about 2 nT in aa on 1 January 1957. We here re-evaluate that correction in the light of more recent data, including that from the recent protracted and low solar minimum between solar cycles 23 and 24. We use the Ap index to make this correction. Mayaud designed the aa index to be, as far as possible, the same as Ap in annual means and Paper 2 shows that there is no significant difference between the response of aa and Ap to interplanetary parameters. The left panel in Fig. 1 compares the time variations of the standard (uncorrected) aa index (in red) and Ap (in blue). The blue curve actually shows Ap , which is Ap linearly regressed onto aa using monthly mean data for before 1 January 1957 (the date marked by a vertical dashed line). The correlation of the 200 monthly means of aa and Ap for 1 January 1932-1 January 1957 is 0.9750 (significance exceeding (1-10 −5 ), meaning that the probability that this correlation is a chance occurrence is less than 10 −5 ; note that all correlation significances quoted in this paper are obtained by comparison to a red-noise AR-1, autoregressivenoise model). Ap is given by Ap = 1.132Ap + 5.427(nT). (1) Note the correlation for after 1 January 1957 is 0.9745 and for the whole interval is 0.966, thus although the correlations for before and after this data are almost identical, that for the whole period is somewhat lower because of a discontinuity in one or both data sequences. Given that Ap is derived from a basket of stations (rather than just the two for aa) and previous studies have questioned the calibration of aa across this date, we attribute the discontinuity to aa. The right-hand panel in Fig. 1 shows a scatter plot of the monthly aa values as a function of Ap , with values for before and after 1 January 1957 in green and mauve, respectively. The plots show that after 1957, the uncorrected aa is consistently greater than Ap indicating that there is indeed a calibration problem in the standard aa data with a discontinuity around 1957. Lockwood et al. (2006a)   similar), the data available to them were consistent with a zero-level offset shift of 2 nT. The exception of 1966 takes on additional significance after the recent minimum between solar cycles 23 and 24 during which aa is actually slightly lower than Ap , indicating a change in instrument gain is a more appropriate correction than a zero level offset change, which can also be seen by comparing the green and mauve points in the right-hand panel in Fig. 1. To quantify this impression, we take means of aa over the ranges 10 < Ap ≤ 25 and 25 < Ap ≤ 40, for both before and after 1 January 1957: the means are 18.575 and 19.763 nT for the lower Ap range and 29.930 and 32.692 nT for the upper range, for the before and after intervals respectively, giving increases of 1.188 and 2.762 nT for after 1 January 1957 compared to before then. Using Welch's t test (for unequal sample sizes and unequal variances) the probability that the difference in the means for the lower Ap range is actually as large as the 2.762 nT found for the upper Ap range (i.e. that that a constant offset applies to both) is found to be 3.3 × 10 −4 . On the other hand, the means of ratio aa / Ap were 1.003 and 1.002 for the upper and lower Ap ranges, respectively, before 1 January 1957 but 1.104 and 1.093 after it, giving changes by factors 1.091 and 1.101, which are the same to within 1 %. Hence the aa calibration error is a sensitivity change rather than a zero level offset change. Figure 2 is a detailed comparison of the effect of the two recalibrations (namely, a change in zero level offset and a change in instrument sensitivity) to generate a corrected aa (termed aa C ). The top panels show the rms (root mean square) deviation, , of aa C from Ap for data after 1 January 1957 using (a) a zero level offset change of magnitude δ and (b) an instrument sensitivity gain change of factor g, both implemented on data for 1 January 1957 and after. The horizontal dashed line shows the value for data from before 1 January 1957 in both panels. The middle panels show the correlation between aa C and Ap for monthly data for the full data series (1932-2012, inclusive), also as a function of δ and g (for panels c and d, respectively). Note that this correlation is taken for the full data sequence (comprising the uncorrected first half and the corrected second half) so that if the correction (be it gain or offset) is not optimum this correlation will be slightly degraded. The horizontal line shows the correlation coefficient for the pre-1957 data only. The bottom panels show the significance S of the difference between the peak correlation and the correlation at general values; i.e. the significance of r(δ) max −r(δ) is shown in Fig. 2e and r(g) max − r(g) in Fig. 2f : this is calculated using the Fisher Z transform using the procedure described by Lockwood (2002). The horizontal dashed lines in panels e and f are the S = 90 % significance level. In all panels, the vertical dashed lines are at the peak correlation and the vertical dot-dash lines mark the uncertainty band at the 90 % Ann. Geophys., 32, 383-399, 2014 www.ann-geophys.net/32/383/2014/ level (corresponding to a 1.65σ uncertainty). For the zero offset correction this is at δ = −2.2 ± 1.8 nT. From Fig. 2e, for δ = 0, S = 98.6 % and hence there is only a 1.2 % chance that a correction to aa is not required. However, the probability that the correction required is 8.1 nT, as originally estimated by Svalgaard et al. (2004), is less than 10 −7 ; that is it is a large as 5.2 nT, as estimated by Svalgaard et al. (2003), is 5×10 −5 ; that it is as large as 3 nT, as estimated by Svalgaard and Cliver (2007), is 0.29; whereas the correction of 2 nT proposed by Lockwood et al. (2006a) has a probability of 0.983. However, close inspection of Fig. 2, shows that the optimum gain correction factor (g = 0.910±0.066), as expected, performs very slightly better than the optimum zero offset correction. At g = 1, S = 99.46 % and hence the probability that a gain correction is not needed is only 0.54 %. In Fig. 2a it can be seen that at optimum δ (the minimum ) is very slightly larger for the post-1957 data than for the pre-1957 data, whereas in Fig. 2b the minimum is the same as the pre-1957 value. Thus, using the optimum gain factor correction results in the fit of Ap to aa after 1957 to be as good as for the pre-1957 data (the same is not quite true for the optimum offset correction, δ). Similarly, when comparing the peak correlation coefficient r for the full data set (Fig. 2c, d), it only completely matches the pre-1957 value for the optimum gain correction.
Here we implement the optimum gain correction by multiplying all standard aa values after 1 January 1957 by 0.91. This produces a corrected version of aa, termed aa C , that corelates very highly (r = 0.985) with that generated by Lockwood et al. (2006a) (and as used, for example, by Lockwood et al., 2009c). Note however, a different philosophy is adopted here with the post-1957 aa data being corrected rather than the pre-1957 data. This has the advantage that early aa data are unchanged so that comparison with earlier studies such as Nevanlinna and Kataja (1993), Nevanlinna and Ketola (1993) and Nevanlinna (2004) are made easier. Annual means of the corrected data series, aa C , are given in the Supplement to this paper.
The correction to aa has been made here using monthly data to keep sample numbers high. However, it should be noted that hereafter we use annual means of all data for all reconstructions. There are a number of reasons for this. The most important is the fact that geomagnetic activity indices respond to the southward component of the IMF in the GSM (geocentric solar magnetospheric) frame of reference. On short timescales the IMF orientation varies rapidly (on timescales down to seconds) and these variations must be averaged out if we are to obtain the IMF magnitude. Stamper et al. (1999) showed that the IMF orientation factor is almost completely averaged to a constant value in annual means. The suppression of this factor as a function of averaging timescale, and the residual uncertainties it causes, has been studied in Paper 2 and by Lockwood (2013). Taking annual means also removes other annual variations that influence the geomagnetic activity response, such as the effect of Earth's dipole tilt on the solar-wind magnetosphere coupling efficiency and on the relevant ionospheric conductivities.

The dependencies of different geomagnetic activity indices on interplanetary parameters
Figure 1 of Paper 2 shows the correlation of various commonly used geomagnetic indices with BV n SW , as a function of the exponent n, where B is the near-Earth IMF and V SW is the speed of the solar wind impinging upon the Earth. In order to reconstruct both B and V SW , it is necessary to use pairs of indices that have significantly different n. Here we use two long-duration interdiurnal variation index data series (IDV and IDV(1d) for which n ≈ 0) and two indices for which n ≈ 2 (aa C and IHV) giving four usable combinations: IDV-aa C ; IDV(1d)-aa C ; IDV-IHV; and IDV(1d)-IHV. The IHV index used is as compiled by Svalgaard and Cliver (2007) which extends over the period 1890-2006. However, when making comparisons of correlations with interplanetary parameters we have found that it is important that all the geomagnetic data cover the same interval and, in particular, all cover the recent solar minimum which sets the lower limit to the range of variation in observed annual means for both B and V SW . As a result, we have updated the Svalgaard and Cliver (2007) IHV index using, as far as possible, the same mix of stations as used for 2006, the final full year of their data set.
In this section we show that the difference of the best-fit n for these four combinations is indeed statistically significant, before using the combinations to derive both B and V SW in Sect. 5. Correlations use annual means of both IMF and geomagnetic data for the period 1966-2012, inclusive. Data gaps in the IMF data are allowed for by piecewise removal of geomagnetic data that are simultaneous (allowing for the satellite-to-Earth propagation lag) with the B and/or V SW data gap before taking the annual means (see discussion by Finch and . The time series of the four geomagnetic indices are shown in Fig. 3. In each panel, the variation of BV n SW , for the exponent n that yields the peak zero-lag correlation, is shown by the orange and black dashed line. The peak correlations, and the exponents n that yield them, are given in Table 1. Figure 4 analyses the differences between these correlations. The top panel shows the correlation coefficients, r, for the four geomagnetic indices as a function of n, using the same line colours as in Fig. 3. The large uncertainties in n in Table 1 arise from the "flat-topped" nature of these correlograms. The coloured dashed lines mark the peak correlations in each case, again using the same colours. The middle panel shows the significance, S, of the difference between the peak correlation and the (lower) correlation at any other n, computed in the same way as the bottom panels in  Table 1. In each panel, the coloured line is the geomagnetic activity index and the orange and black dashed line is the best-fit variation of BV n SW .
The IDV index (black). (Lockwood, 2002). It can be seen the optimum n values for IDV and IDV(1d) are almost identical and those for IHV and aa C are also very similar (1.6 and 1.7, respectively). The bottom panel evaluates the probability that two indices with greatly different optimum n in reality have the same dependence on solar wind speed; by plotting (1-S 1 )(1-S 2 ) as a function of n where S 1 and S 2 are the probabilities that the two indices have a dependence on n that is different from the optimum value for that index (so they are the S values shown in panel 2). The four combinations studied are colour-coded with black denoting IDV(1d)-aa C ; red for IDV-aa C ; orange for IDV-IHV and green for IDV(1d)-IHV. In all four cases the probability that the indices in reality share the same dependence on V SW peaks for n around unity, but the peak values are around 0.05-0.07 if aa C is used and 0.08-0.10 if IHV is used: the lower values for aa C being a consequence of the slightly larger difference in the optimum n values: thus aa C is slightly better for separating the effects of B and V SW (and also extends back further in time). Using aa C with IDV, the confidence level that the n values are really different exceeds 95 % and for aa C with IDV(1d) it exceeds 93 %.

The aa index
The IDV(1d) index extends back to mid-1844, making the first complete year of data 1845. Although the IDV extends back to 1835, as discussed in Paper 1, the early data are based  (2010) for 1845 onwards, we follow Bartels' advice (see Paper 1) and regard it as unsatisfactory before 1872: it is especially unreliable in the declining phases of solar cycles when the inverse dependence on solar wind speed of the diurnal range proxy causes IDV to be too small. Mayaud's aa index data series begins in 1868, but has been extended back to 1845 using the Helsinki data by Nevanlinna and Kataja (1993). In the present paper we test and use this extension. The IHV index of Svalgaard and Cliver (2007) extends back to 1890. Nevanlinna (2004) has generated the range Ak(H ) and Ak(D) indices from both the H and D component data (horizontal and declination) from the Helsinki station . Reliable data are available until 1899, giving 31 yr of overlap with aa C that offers good intercalibration opportunities. As discussed in Paper 3, Ak(H ) from Nurmijärvi has a correlation with aa C of 0.975 (significance 0.91) and the peak correlation with the interplanetary BV n SW is for n of 1.7, the same as for aa C . Hence the Ak indices from the Helsinki station (IAGA code HLS and close to the Nurmijärvi site) are suitable for homogeneously extending the aa C data series such that the extension has the same responses to B and V SW as the modern aa C data, a conclusion also reached by the study by Nevanlinna and Ketola (1993). In Paper 3, Ak(H ) HLS and Ak(D) HLS data were compared to aa C as a means of calibrating the early Helsinki data. Uncorrected, the linear correlation coefficient of Ak(H ) HLS with aa C is r = 0.919 (after the correction, naturally, r = 1) whereas for Ak(D) HLS , r = 0.984, both correlations having a significance better than (1-10 −5 ). Use of the best-fit linear regressions yields extensions to aa C for the period 1845-1868 shown by mauve and blue lines for Ak(H ) HLS and Ak(D) HLS , respectively, in the top panel in The significance, S, of the difference between the peak correlation and that at general n from the Fisher Z test. (Bottom) The probability that correlations for two different indices have the same n, (1-S 1 )(1-S 2 ), as a function of n where S 1 and S 2 are the S values for the two geomagnetic indices: for IDV(1d) and aa C (black); IDV and aa C (red); IDV and IHV (orange); and IDV(1d) and IHV (green). The n and r of peak correlations are given in Table 1. and because it does not require any correction to agree with aa C , we here use the declination-based index Ak(D) HLS to extend aa C , the best fit linear regression being aa C = 1.0805 × Ak(D) HLS + 0.837(nT).
(2) Figure 5 compares the results with linearly regressed values from the nearby St Petersburg observatory (IAGA code SPE), Ak(H ) SPE and Ak(D) SPE , as derived by Nevanlinna and Häkinnen (2010). Agreement is very good with the results using Ak(D) HLS from Helsinki. This extension to aa C using Ak(D) HLS is consistent with the work of Nevanlinna and Kataja (1993) and Nevanlinna and Ketola (1993).

The Recurrence Index, RI
The method used to derive the open solar flux from the aa index by Lockwood et al. (1999) removes the effect of the solar wind speed using Sargent's recurrence index, RI (Sargent III, 1986). Hence it is also useful to also reconstruct RI for the period 1845-1867. The recurrence index has been derived using 1 day means of Ak(H ) HLS by Nevanlinna (2004). Results using Ak(D) HLS  values from St Petersburg data. Note that Sargent's original formulation of RI used 12 h means of geomagnetic activity data whereas we are here using 1 day means for both aa C and Ak(D) HLS because they are what is available for the latter. Tests with aa C show this change has no discernable influence on the resulting RI. The correlation coefficient between the recurrence indices derived from aa C and Ak(H ) HLS (here termed RI and RI HLS , respectively) is 0.902, for the available overlap data (1868-1896) which is significant at the 0.977 level. The two are intercalibrated using the linear regression RI = 1.1927 × RI HLS + 0.0045. ( With these extensions, both the annual aa C and RI data sequences begin in 1845.

Basic equations
The bottom panel in Fig. 4 shows four pairings of geomagnetic indices which have statistically significant different responses to solar wind speed. Hence, using the linear regressions for the peak response exponents, n as given in Table 1 and BV n2 SW = s 2 I 2 + c 2 .
Rearranging Eqs. (4) and (5) gives solar wind speed, as predicted from the geomagnetic activity indices, V 12 , of and an IMF, as predicted from the geomagnetic activity indices, B 12 , of Hence using the individual regression coefficients s 1 , c 1 , s 2 , and c 2 and the known exponents n 1 and n 2 , both B 12 and V 12 can be computed provided n 1 -n 2 does not tend to zero.

Reconstructions from various index pairs
The results for individual pairings are shown in Fig. 6b and d using the same colour coding as in Fig. 4; i.e. black for IDV(1d)-aa C , red for IDV-aa C , orange for IDV-IHV, and green for IDV(1d)-IHV. Results are so similar for the different pairings in many years that all four lines are often not visible. For both B and V SW , annual means from satellite data are shown by the blue dots connected by the thin blue line. Agreement between reconstructions and the in situ space observations is good in all cases, despite the effect of data gaps. (Remember that the regressions used to derive the reconstructions allowed for data gaps, but their effect will be present in the blue dots shown in Fig. 6). The right-hand panels in Fig. 6 show the corresponding distributions of annual (in blue) and hourly (in grey) means from the in situ satellite data set (1966-2012, inclusive; note that the scale has been halved for the annual data for clarity). Figure 6a shows the international sunspot number for the same interval.
What is noteworthy about Fig. 6b and d is how similar the four curves are in each case. This is important as it means that no one feature of the variation can be attributed to an error in the geomagnetic data. In particular, the use of IHV or aa C after 1890 makes no systematic difference to either of the two reconstructions. Given that the construction of these two indices is different in almost event respect (stations used, processing algorithm, index compilation), we can be sure that both cannot be in error in a way that makes both the B and V SW estimates the same. Given the high correlation between IDV(1d) and IDV after 1872, as reported in Paper 1, it is no surprise that these two indices also produce very similar variations over this interval. Before 1872 IDV and IDV(1d) are also quite similar, but there are differences. In particular, IDV gives a lower B and exceptionally large V SW (larger than any annual means during the space age) in the declining phases of cycles 9 and 10. As discussed in Paper 1, at  Fig. 4: IDV(1d) and aa C (black); IDV and aa C (red); IDV and IHV (orange); and IDV(1d) and IHV (green). The blue dots give annual means from satellite observations. this time, IDV employs Bartels' diurnal range proxy u rather than true interdiurnal range data and this does not have the same response to B and V SW as modern IDV data. As a result, we take seriously Bartels' classification of these data as unsatisfactory (see Paper 1). Thus for before 1872 the only reliable reconstruction is from IDV(1d) and aa C (the black line). Both these indices come from the Helsinki station before 1868, but both have been verified using data from the nearby St Petersburg observatory.

Monte Carlo analysis of the optimum reconstructions and their uncertainties
The great similarity of the curves in Fig. 6 provides the possibility that we can combine all four sequences and estimate the net uncertainty. This is done in this section using a variant of the Monte Carlo method used in Paper 2. The fitted data sequences are perturbed by errors (such that after the total of 10 000 such fits have been made, the distributions follow a Gaussian fit to the observed error distribution). For each fit, the Nelder-Mead simplex search method (Nelder and Mead, 1965;Lagarias et al., 1998) is used to find the optimum set of four regression coefficients s 1 , c 1 , s 2 , and c 2 in Eqs. (4) and (5) for a given pair of indices I 1 and I 2 (with the known Ann. Geophys., 32, 383-399, 2014 www.ann-geophys.net/32/383/2014/ best-fit exponents n 1 and n 2 ) such that the rms deviation of both B 12 and V 12 from their respective observed values (B and V SW ) are minimised. Specifically, each fit minimises the quantity e given by e = < B 2 12 − B 2 > 0.5 / < B > The errors are added randomly, as described in Paper 2. The distributions they follow for B and IDV(1d) are as used in Paper 2. The standard deviation for the error in V SW measurements is taken to be 2.5 km s −1 (King and Papitashvili, 2011). Those for IHV and IDV are taken to be 0.25 and 0.2 nT from an analysis of the distribution of values from individual stations during the regression period (by way of comparison that for IDV(1d) is 0.45 nT). Figure 7 demonstrates how the results from the four pairings of indices are combined. Each one of the 10 000 fits for a given pairing yields the four coefficients s 1 , c 1 , s 2 , and c 2 from which, using Eqs. (6) and (7), both the IMF and solar wind can be computed from the observed index values in a given year. Figure 7 shows the distributions for one example year, 1907, with IMF on the left and solar wind speed on the right. The probability distributions for the four pairings are colour-coded using the same coding scheme as in Fig. 6. The (normalised) sum of the four is shown in blue and the median of that total distribution is taken as the best estimate (shown by the vertical black line) and the uncertainty is taken at the 2 σ level and shown by the grey shading (i.e. 95 % of the total of 40 000 fits lie within that band). It can be seen that for 1907, all four pairings give almost identical distributions of the IMF (left panel). Tests show that in this case the width of the distributions is set by the IMF orientation factor uncertainty (the variability in annual means of the southward component of the IMF, in the GSM frame, for a given annual mean IMF field strength), as discussed in Paper 2 and by Lockwood (2013). In other words, the total uncertainty is set by this unknown IMF orientation factor and not by any measurement errors or any data gaps in either the geomagnetic indices or the in situ data. However, agreement is not so close in the case of the solar wind speed distributions (right panel) and the total distribution of all 40 000 values (in blue) is broadened by the differences between the distributions compared to the width of each of the four composite distributions. By taking the sum of the four pairings we use all four with equal weight. Thus our uncertainty band includes both the regression errors and the uncertainty due to differences between the indices. Note that before 1890, IHV is not available and before 1872 we have concerns about IDV because of the analysis presented in Fig. 1 of Paper 3. Hence between 1872 and 1890 the best estimate and uncertainty band is set by just two pairings: aa C -IDV and aa C -IDV(1d). Before 1872, the best estimate comes from the aa C -IDV(1d) pairing alone and the uncertainty band obtained by convolving  (Lockwood et al., 2013b) for IMF B (left) and solar wind speed V SW (right). The fits minimise the normalised rms deviation of the fits from the data on both IMF, B, and solar wind speed, V SW . A total of 10 000 fits for each pairing of geomagnetic activity indices were carried out, with random errors added (drawn from measured distributions). The resulting distributions are shown for aa C and IDV(1d) (black), aa C and IDV (red), IHV and IDV(1d) (orange), and IHV and IDV (green). The blue probability distribution function is the sum of these four for which the grey band shows the 2σ points (between which are 95 % of values) and the black vertical line shows the median. the regression fit uncertainty with geomagnetic index uncertainties of ±10 %, estimated by comparison of the relevant indices with data from the H variometer at St Petersburg and comparing with the declination (D) data from both Helsinki and St Petersburg. Figure 8 gives the time series of the best estimates and uncertainty ranges for each year, for both IMF (top) and solar wind speed (bottom) evaluated as illustrated in Fig. 7 for the example of 1907. Note the uncertainties are at the 2 σ level and allow for both regression uncertainty and the uncertainties in the geomagnetic data. In the case of the IMF estimate, the dominant uncertainty in most years is set by the IMF orientation factor. Without an independent way of estimating this from historic data, the uncertainty in reconstructed IMF cannot be further reduced. Only before 1872 do uncertainties in the geomagnetic data exceed that due to the unknown IMF orientation factor.

Open solar flux
We make use of the Parker spiral theory of the heliospheric magnetic field based on the frozen-in flux theorem (Parker, 1958(Parker, , 1963. This yields the following equation  Fig. 7. All four index pairings (aa C and IDV(1d), aa C and IDV, IHV and IDV(1d) and IHV and IDV) contribute to both the best estimates and the uncertainties after 1890. Before 1890 IHV is not available. In addition, the problems highlighted with IDV before 1872 in Paper 3 mean it is not used in this interval and the reconstruction is based on aa C and IDV(1d) alone and the uncertainties are derived from those estimated for these two geomagnetic indices by comparison with data from nearby stations (principally St Petersburg). modulus of the radial component of the IMF: where η is the IMF garden-hose angle, ω is the sidereal angular velocity of the solar atmosphere in the frame of the fixed stars and d is the heliocentric distance (d = A U for Earth-based measurements, where A U is one astronomical unit). The signed open solar flux (hereafter, OSF) threading the coronal source surface (at d ss = 2.5R , where R is the mean solar radius) is then computed using the definition of excess flux given by Lockwood et al. (2009a, b): where |B r | SS is the modulus of the radial field threading the source surface, |B r | r=R is the modulus of the radial field threading the sphere at r = A U and E is the excess flux formed by phenomena between the coronal source surface and the point of observation such as longitudinal flow structure and Alfvén waves. These phenomena cause "folded flux"  -field lines that only thread the coronal source surface once but thread a surface at greater heliocentric distance d three (or five or even a larger odd number) times, resulting in the flux threading the surface at d being larger than that threading the coronal source surface (Owens et al., 2008). There has been some debate and considerable misunderstanding about the excess flux term, which it is worth clarifying here. In particular, Smith (2011) has argued that the excess flux is an artefact of using the modulus of the radial field, |B r |. However, Lockwood and Owens (2013) point out that the alternative advocated by Smith, namely defining the source sector boundaries in the near-Earth spacecraft data and then averaging over the inferred source sectors, has inherent limitations and will only reproduce the true OSF if the source surface sector boundaries are correctly identified. In particular, the averaging must be done over the sectors at the source boundary and not those seen at the spacecraft location (or, as shown by Lockwood and Owens (2013), the result is exactly the same as using the modulus and an excess flux correction would still be required), and the existence of folded flux means that unknown errors in OSF will be incurred by placing boundaries at the wrong location. Proper identification of folded flux and the source sector boundaries at the spacecraft location requires careful analysis of electron streams (or the heat flux that they carry) and there is no known way to do this using historic or palaeodata. Hence for reconstruction of past solar variations we have no alternative but to use Eq. (10) and the modulus of B r which means allowance for excess flux is essential: Owens et al. (2008) have shown directly that using the modulus means that the total flux threading a heliocentric sphere of radius d increases with d -i.e. there is a heliospherically imposed component of folded flux. Lockwood et al. (2009b, c) devised a method that allows calculation of the excess flux E from near-Earth satellite data such that the OSF derived (from Eq. 10) matches the values obtained from solar magnetograph data using the potential field source surface (PFSS) modelling. This "kinematic correction" can be applied to satellite observations and uses the observed magnetic field tangential to the solar wind flow and the observed temporal gradients in the solar wind speed on short timescales. Lockwood and Owens (2009) showed that this method gives results in excellent agreement with the observed latitudinal constancy in the modulus of the radial heliospheric field (for means over one or more solar rotations to remove Carrington longitude effects) on which Eq. (10) is based (Lockwood et al., 2004). Applying the kinematic correction to allow for excess flux, using Eq. (10), yields the OSF values shown by the blue dots in the bottom panel in Fig. 6. Given the kinematic correction is not possible from historic data, we need a regression in terms of the annual mean parameters that we can retrieve from geomagnetic activity data. A linear regression fit using annual mean data gives a fit to the kinematically corrected open solar flux of the form where F S is in 10 15 weber (Wb), B is in nanotesla (nT), ω is in radians per second (rad s −1 ), V SW is the solar wind speed in metres per second (m s −1 ), A U is in metres, and s OSF and Ann. Geophys., 32, 383-399, 2014 www.ann-geophys.net/32/383/2014/ c OSF are the best-fit linear regression coefficients. Using annual means of the in situ observations, the correlation coefficient between the kinematically corrected OSF F S and the right hand side of Eq. (11) is 0.937, meaning that 88 % of the variation is explained by the approximate form given by Eq. (11). The fit residuals are always less than ±16 %. This level of agreement should be compared to the ±10 % differences between the kinematically corrected in situ data and estimates from solar magnetograms using the PFSS method (see Lockwood et al., 2009b). The coefficients s OSF and c OSF are given in Table 2, along with those obtained using annual mean B and V SW obtained from the four pairings of geomagnetic activity indices. The signed OSF variations obtained from the four combinations using Eq. (11) are shown in Fig. 6f. Again the results are almost independent of the geomagnetic data used except for the pre-1872 data when the differences between IDV and IDV(1d) discussed above become apparent. The distributions for annual and hourly in situ OSF values are given in Fig. 6g. Figure 9 compares the signed OSF variation (for aa C -IDV(1d) in black, for aa C -IDV(1d) in red) to that obtained from aa C and its recurrence index RI using the method of Lockwood et al. (1999) (in orange). Again the results are very similar indeed. The major differences occur in the declining phases of solar cycles 9 and 10 when the method of Lockwood et al. (1999) produces results very similar to those from aa C and IDV(1d) but significantly different from those for aa C and IDV which, as discussed previously, we attribute to the inappropriate use of Bartels' diurnal range proxy u in the early IDV composite. Note that the orange line uses the method of Lockwood et al. (1999) but is different from the variation they generated because two corrections have been implemented. Firstly, Lockwood et al. (1999) used the uncorrected aa index rather than aa C and secondly they did not make the kinematic correction to allow for the excess flux generated between the source surface and the Earth. As shown by Lockwood et al. (2009c) and , these two factors have opposite effects in the data before 1957, making the OSF variation for the period 1868-1957 in the original Lockwood et al. (1999) reconstruction very similar indeed to that shown in Fig. 9, but did result in their post-1957 values being too high. Neither factor was considered by Svalgaard and Cliver (2010) when they re-scaled the Lockwood et al. (1999) OSF reconstruction in terms of the IMF B (using a variation that they assumed to be linear and that they then rescaled), thereby making the Lockwood et al. (1999) reconstruction appear seriously in error. However, they failed to mention that they had re-scaled the reconstruction. A fair and proper comparison of the OSF sequences had been given by Lockwood et al. (2009c) and of the IMF reconstructions was given by . Figure 9 also shows the variation from aa C and the median m index by Lockwood et al. (2009c) (grey-green line). Again agreement is very good except in solar cycle 14 (1901-1912)  when the aa C -m combination yields OSF values that are persistently too low compared to the other reconstructions. We attribute this to the inhomogeneity of the construction of the m index which is based on more than 50 stations in the space age but data from just one site (Potsdam) in cycle 14. Figure 10 studies the relationship between B and the signed OSF (allowing for excess flux) inherent in the reconstructions shown in Fig. 9. Comparison is made with that deduced by Lockwood et al. (2009c) and , given by their best-fit polynomial: where F S is in units of 10 15 Wb. This polynomial fit is valid in the ranges 0 ≤ B ≤ 12 nT and 0 ≤ F S ≤ 0.6×10 15 Wb. To make the inverse conversion, from B to F S , we recommend, for reasons of consistency, evaluating Eq. (12) for closely spaced values of F S and then using spline interpolation to the required value(s) of B. The grey band in Fig. 10 Fig. 10), yielding a value of F S = (0.48 ± 0.29) × 10 14 Wb which is of order one-tenth of the average value for recent solar cycles. This value is consistent with models of open solar flux that allow for the inferred time constants of OSF loss , as shown by  and .

Conclusions
This paper has demonstrated that the differences between the dependencies of some geomagnetic indices on interplanetary parameters are statistically significant and can be exploited to give valid information on the near-Earth interplanetary conditions before the start of the space age. We have exploited four pairings that give significant differences to yield reconstructions of both IMF B and solar wind speed V SW from historic geomagnetic data. The four reconstructions are very similar indeed, giving great confidence that errors have not been introduced by errors in the geomagnetic data. A best composite of all four, with uncertainties at the 2 σ level computed using a total of 40 000 fits by the Monte Carlo technique outlined in Paper 2, has also been presented. We have updated the correction required for the standard aa index in 1957, as the comparison between the behaviour of the two in the recent low solar minimum reveals a gain correction that is slightly more appropriate than a zero level offset change. The effect is very small but the level of agreement between the corrected aa (aa C ) and the Ap index after 1957 is now essentially identical to that before 1957. Data from the Helsinki Observatory have been used to extend aa C  Lockwood et al. (2009c) and the surrounding grey area is the uncertainty band. The band marked SEA MM is the mean value of B for the end of the Maunder minimum deduced from cosmogenic isotopes by Steinhilber et al. (2010) from which the best estimate of the mean for the end of the Maunder minimum F S is (0.48 ± 0.29) × 10 14 Wb.
(and hence the reconstructions of B and V SW ) back to 1845. We have used the correction to the cycle 11 data discussed in Paper 3. The quality of the extensions has been checked by comparisons with historic data from the nearby St Petersburg observatory and we have demonstrated that Helsinki had, as far as we can tell, the same response to interplanetary conditions as in modern times using data from the nearby Nurmijärvi station from the space age.
Using Parker's spiral theory, and making the necessary adjustment for heliospherically imposed excess flux, we have computed the (signed) OSF for all four sets of reconstructions. Again agreement between them is excellent and all Table 3. Maximum and minimum of 11 yr running means of annual signed open solar flux estimates <F S > 11 from the four combinations of geomagnetic activity indices and from in situ satellite observations (top row). The last column gives the percentage change, λ, between the minimum and the maximum. The first row is for in situ spacecraft measurements. four agree very closely with reconstructions using the original method (based on the aa index and its recurrence) by Lockwood et al. (1999). All four reconstructions are in excellent agreement with the relationship between OSF and near-Earth IMF B that was derived by Lockwood et al. (2009c) and so the estimate that the OSF was (0.48±0.29)×10 14 Wb at the end of the Maunder minimum (deduced using cosmogenic isotopes and this relationship) remains valid. Lastly, there has been some debate about the finding by Lockwood et al. (1999) that the OSF doubled during the 20th century because they reported a rise of 130 % between 1900 and the mid-1950s). For example, Svalgaard and Cliver (2005) contrasted the doubling with a 25% rise that they deduced in the IMF B over the same interval. Table 3 gives the minimum and maximum values of the 11 yr running means of signed OSF, [<F S > 11 ] min and [<F S > 11 ] max , from the four reconstructions (as shown in Fig. 6d) and the percentage change λ defined as λ = (13) 100 × ([< F S > 11 ] max − [< F S > 11 ] min )/[< F S > 11 ] min . Table 3 reveals that the four λ estimates range from 105 % to 119 %, confirming that the OSF did indeed double, even if the fractional drift is slightly lower than estimated by Lockwood et al. (1999). It is interesting to note that the fall in these 11 yr means seen during the space age (from a maximum value in 1986 to a minimum in the last available year which is 2007) gives a λ value (by Eq. 8) of 58 %, hence the rise between 1902 and 1955 is roughly twice the fall that has been directly observed between 1986 and the present. Hence the rate of decline since 1986 is comparable to (but somewhat larger than) the rate of rise found for the period 1901-1955. Note also from Fig. 9 that in annual means, the value seen in 2008 is only marginally higher than the lowest value derived by the Lockwood et al. (1999) procedure (which was for 1901). Lockwood et al. (2006b) pointed out that some of this discrepancy between the results of Lockwood et al. (1999) and Svalgaard and Cliver (2005) was caused by the difference between OSF and near-Earth IMF and that the latter was not proportional to the former (as demonstrated by Fig. 10 of the present paper) and hence Svalgaard and Cliver's comparison was inappropriate. However, Lockwood et al. (2006b) also noted that roughly half of Svalgaard and Cliver's underestimation arose from non-robustness of their regression fits, from the way they treated data gaps and from an incorrect summary of their own results -all of which acted to reduce the size of the drift that they deduced. Svalgaard and Cliver (2006) did not accept these arguments but Table 4 (which corresponds to Table 3 but is for the near-Earth IMF B) directly confirms the conclusions of Lockwood et al. (2006b): the rise in 11 yr running means of B (<B> 11 ) in the years 1902-1955 from the four reconstructions shown in Fig. 6b are between 46.9 and 52.7 %. Note that the fall in www.ann-geophys.net/32/383/2014/ Ann. Geophys., 32, 383-399, 2014 directly observed values of B between 1986 and 2007 is already 41 % and looks set to continue Lockwood, 2010;Lockwood et al., , 2012. Hence we find the drift in the near-Earth IMF was twice as large as that deduced by Svalgaard and Cliver (2005).