A general Cluster data and global MHD simulation comparison

Among the many challenges facing the space weather modelling community today, is the need for validation and verification methods of the numerical models available describing the complex nonlinear Sun-Earth system. Magnetohydrodynamic (MHD) models represent the latest numerical models of this environment and have the unique ability to span the enormous distances present in the magnetosphere, from several hundred kilometres to several thousand kilometres above the Earth’s surface. This makes it especially difficult to develop verification and validation methods which posses the same range spans as the models. In this paper we present a first general large-scale comparison between four years (2001–2004) worth of in situ Cluster plasma observations and the corresponding simulated predictions from the coupled Block-Adaptive-Tree-SolarwindRoe-Upwind-Scheme (BATS-R-US) MHD code. The comparison between the in situ measurements and the model predictions reveals that by systematically constraining the MHD model inflow boundary conditions a good correlation between the in situ observations and the modeled data can be found. These results have an implication for modelling studies addressing also smaller scale features of the magnetosphere. The global MHD simulation can therefore be used to place localised satellite and/or ground-based observations into a global context and fill the gaps left by measurements.


Introduction
The Earth's magnetosphere is a highly complex nonlinear system mainly influenced by the interaction of the solar wind with the terrestrial magnetic field. The processes by which energy, mass, and momentum are transferred between these two domains is a primary focus of space physics research today. The dynamic processes controlling this highly disturbed near-to far-Earth plasma domain can reach from several kilometres to several thousand kilometres above the Earth's surface. Modern satellite and ground-based measurements of this environment can only describe the processes and phenomena over limited spatial regions. They can be used as indicators for large-scale processes but in order to describe the physical processes on a global scale they often refer to three-dimensional numerical based models. By exploiting the full three-dimensional predictions of these models it is then possible to establish causal relationships between the localised observations and the global dynamics (Berchem, 2000). Here MHD codes represent the state-of-the-art computational models which are widely applicable and practical to execute in order to simulate the complex processes which are present in the geospace environment.
From the first global MHD simulation by Leboeuf et al. (1978Leboeuf et al. ( , 1981, through to the first real three-dimensional MHD models (Brecht et al., 1982;Ogino, 1986), to the codes in their present states (e.g. Janhunen, 1996;Powell et al., 1999;Raeder, 2003;Lyon et al., 2004) the models have developed and improved. Driven by recent advances in research and technology the present models are now capable of simulating the geospace environment self-consistently from the magnetosphere down to the ionosphere. Due to these unique capabilities a recent focus of global MHD investigations is the study of individual magnetospheric "events", addressing mainly the energy, mass, and momentum transfer between the different domains (e.g. Berchem et al., 2003Berchem et al., , 2008Birn et al., 2008;Daum and Wild, 2006;Daum et al., P. Daum et al.: Cluster/BATS-R-US MHD comparison 2008;Fedder et al., 1995Fedder et al., , 1998Fedder et al., , 2002Gombosi et al., 2000;Hayosh et al., 2006;Palmroth et al., 2006;Pulkkinen et al., 1998;Raeder et al., 1997;Raeder, 2006;Siscoe et al., 2007;Tóth et al., 2007;Zhang et al., 2006). For these types of studies it is essential that the MHD simulations are precisely calibrated and constrained as outlined by Walker and Ashour-Abdalla (1995) and Berchem (2000) before they can be used to fill the gaps left by measurements and link the various localised ground-and/or space-based observations. Validation and verification of the models therefore becomes indispensable since only reliable model predictions can provide these links.
Whereas verification and validation methods in the computer science community are widely standardised (IEEE Standard 1012-1986) and well documented (e.g. Schlesinger, 1979;Adrion et al., 1982;Balci, 1997;Lipaev, 2003;Dasso and Funes, 2006) the same standards and techniques cannot be used for models of natural systems as outlined by Oreskes et al. (1994). This is because natural systems are never closed and because model results are always non-unique. Models can therefore only be verified and validated to a certain degree by demonstrating an agreement between observations and predictions and all confirmation is therefore inherently partial.
In addition to the lack of standardised verification and validation techniques, most modern MHD models are executed in "frameworks" and combine different numerical models in order to achieve the vast spatial coverage they offer. Each of these models can be seen as a module representing one specific domain of the Sun-Earth system which are then coupled by control modules via standardised interfaces to achieve a global representation of the geospace environment. Similar features in the observations of each individual domain within each module of the model provides a necessary but not sufficient condition for the "grand" MHD model (all modules coupled together) "validation". Since the coupled modules themselves represent a complete new model of a natural system, the coupled model itself needs to be validated by demonstrating an agreement between observations and predictions. Based on the techniques for the verification and validation of large model systems described in Balci (1997) and Lipaev (2003) and in regards to the conclusions of the study by Oreskes et al. (1994) such combined models need two steps of "validation" before a general evaluation can be made; (1) each module on its own needs to be validated, (2) the combined and coupled modules need to be validated.
The second step consequently can only be used to evaluate the specific module configuration and has to be repeated every time the inner model configuration is changed. Whereas the first step of validation can be achieved via case and comparison studies utilizing localised ground-and/or spacebased observations, the second relies on a large spatial coverage of the complete geospace environment to perform a reliable comparison between observations and predictions.
Recent case studies (Nozawa et al., 2001;Marsh and Roble, 2002;Mozzoni et al., 2007;Garner et al., 2004) have addressed the first step of validation of the different models describing the different sub-domains of the geospace environment; starting from the ionospheric electrodynamic (IE) models, through to the inner magnetosphere (IM) models, and further to the global magnetosphere (GM) models, but there exist only very limited studies addressing the second step of validation for the global coupled model (IE/IM/GM) runs. The existing studies rely mostly on limited spatial (few Earth radii) and temporal (few hours, several orbit progressions) comparisons between the model predictions and the observations and can therefore only be seen as a limited approach for the second step of validation.
Our approach to this validation problem is presented in this study and based on the technique described in Denton and Taylor (2008) for the analysis of Cluster data sets. We utilize a large statistical sample (up to 825 000 data points; depending on the instrument) of magnetic field and plasma measurements from the Cluster spacecraft formation collected in the years between 2001 and 2004, which are publicly available via the Cluster Active Archive (CAA: http://caa.estec.esa.int/caa/; Perry et al., 2006) and compare them to simulation runs of the Block-Adaptive-Tree-Solarwind-Roe-Upwind-Scheme (BATS-R-US) MHD code, coupled with the Rice Convection Model (RCM) and the Thermosphere Ionosphere Electrodynamic General Circulation Model (TIE-GCM). By using this large Cluster data set, we are able to achieve the vast spatial coverage needed for the second validation step. Thus, we attempt to validate the BATS-R-US/RCM/TIE-GCM coupled model to find constraints and calibration factors to improve the model. By achieving a better accuracy of the model predictions, a better understanding of the physical processes present in the various domains can be achieved with implications of the model's capabilities to enlarge the localised point-to-point measurements taken by ground-and/or space-based instruments.

Instrumentation/data analysis
The quartet of ESA Cluster spacecraft  were launched in July/August 2000 into a highly elliptical orbit (19.6 (apogee)/4.0 (perigee) R E ). The orbital plane is fixed in the inertial frame of the Earth, therefore the apogee processes through 24 h of Local Time (LT) with a 12 month periodicity. Figure 1 shows the projection of the orbit path of Cluster 1 (Rumba) onto the x-z, x-y, and y-z Geocentric Solar Magnetospheric (GSM) reference planes for the time period from 1 January 2001 to 31 December 2004 in 100 min resolution steps for every 10th full orbit. Overlaid in Fig. 1 are the modeled magnetopause shapes (Shue et al., 1997) (Shue et al., 1997) indicated by the solid grey lines and the modeled bowshock shape (Bennett et al., 1997) indicated by the dashed grey lines. The models are parameterised by the solar wind and IMF conditions as shown in the upper right hand corner of the left-hand panel.
In the x-y GSM projection (centre panel) the dayside magnetosphere area which is swept over by the orbit in the first 120 days of each year is highlighted by an underlying grey area. In the x-y and the y-z GSM projections the Cluster orbits falling into a range of ±5 R E of y=0 R E are highlighted in yellow and represent the data range for the performed comparisons. (Bennett et al., 1997) indicated by the dashed grey lines. The models are parameterised by the averaged solar wind and interplanetary magnetic field (IMF) conditions for the time of interest as shown in the upper right hand corner of the x-z GSM projection (left-hand panel). The solar wind parameters were obtained from the Solar Wind Electron, Proton, and Alpha Monitor (SWEPAM; McComas et al., 1998), and the IMF conditions were obtained from the Magnetic Field Experiment (MAG; Smith et al., 1998) onboard the Advanced Composition Explorer (ACE) satellite (Stone et al., 1998). The utilised data were accessed via the ACE Science Center (ASC: http://www.srl.caltech.edu/ACE/ASC/; Garrard et al., 1998). As shown in Fig. 1 the Cluster orbit offers a great spatial coverage of the near-Earth plasma environment stretching from ±19.4 R E in x-and y-GSM direction and from ±12.5 R E in z-GSM direction. The measurements taken during the four years of interest offer an extensive data set for the validation of the coupled MHD model.
For the comparison between the Cluster measurements and the MHD model predictions we use the plasma particle observations from the Cluster Ion Spectrometry (CIS) instrument (Rème et al., 1997(Rème et al., , 2001 and the magnetic field measurements taken by the Cluster fluxgate magnetometer (FGM) instrument (Balogh et al., 1997(Balogh et al., , 2001. The measurements of the ion distribution function in the energy ranges from ∼0-40 keV obtained from the Composition and Distribution Function (CODIF) sensor of the CIS instrument that allows calculations of the moments of the distribution and thus provides averaged values for temperature, density, and velocity. These data combined with the magnetic field components obtained from the FGM instrument allow us a com-parison with the prime MHD model output parameters (see Sect. 3, Fig. 2).
To allow an efficient analysis of this extensive in situ data set, we apply the technique first described in Denton and Taylor (2008). The derived and observed individual parameters (density, pressure, temperature, magnetic field strength, and velocity components) are first interpolated to a time resolution of one minute. Following this temporal interpolation the parameters are binned according to the location of the Cluster spacecraft onto an equidistal cartesian grid with a size of 1×1 R E and averaged values within each bin are calculated. By combining several of these data files it is then possible to produce quasi two-dimensional representations of the geospace environment at any given location. To enhance the data range of these "data-planes", measurements taken in an orbital position within a perpendicular distance of ±5 R E from the required plane are used as complement. Figure 1 highlights this technique in the case for an x-z GSM plane at y=0 R E , here all measurements taken in an orbital position which falls into a range of ±5 R E (highlighted in yellow) are used to make up the data range of the x-z GSM data-plane. Following these criteria averaged data-planes of all before mentioned prime plasma parameters can be created.
It should be noted here, that these data-planes are subject to two limitations: (1) the planes only represent the plasma parameters in a statistically averaged format, they cannot represent short time fluctuations, (2) the derived plasma parameters oblige the instrument limitations (e.g. energy range). To cope with extreme driving conditions and their effect on the measured plasma parameters each interpolated data file is combined with solar wind data measurements from ACE to identify and exclude times where extreme solar wind conditions (high solar wind speeds and high proton densities) were present. The data points therefore can vary according to time and selected orientation of the data-plane. For the here presented study we mainly use x-z (noon-midnight meridian plane) GSM and x-y (equatorial) GSM data-planes with data sets consisting of 80 000 to 825 000 points.
In addition to the physical limitations of the instruments also the fact that Cluster 1 (Rumba) does not operate an active spacecraft potential remediation mechanisms has to be accounted for. Depending upon ambient plasma conditions, this can affect how well the distributions (and hence the derived moments) represent the actual local plasma conditions. The effect is most obvious when the plasma contains a very hot (or very cold) component with energies beyond the range of the instrument. Hence, as noted by Denton and Taylor (2008), the resultant plasma densities shown are likely to underestimate the actual values. It should be noted here, that in the high flux regions (cusps, magnetosheath) the spacecraft potential is quite low (<10 V) and so does not affect the CIS instrument measurements as much as in the plasma sheet and in particular the lobe region. These issues should be borne in mind not only when interpreting the results presented below, but when utilising any thermal plasma measurement. However, a comparison between densities measured on Cluster 1 (Rumba) and densities from other Cluster spacecraft where active spacecraft potential control is operating shows little quantitative difference. Therefore for the statistical purpose of this study the differences are negligible.

MHD model (coupled modules GM/IM/IE)
The corresponding MHD study uses simulation results from the coupled BATS-R-US MHD code (Powell et al., 1999), which was originally developed by the Computational MHD Group at the University of Michigan, now the Center for Space Environment Modeling (CSEM: http://csem.engin. umich.edu). The code used is version 7.73 which is part of the Space Weather Modeling Framework (SWMF: http://csem.engin.umich.edu/swmf/) described by Tóth et al. (2005) and Volberg et al. (2005). The BATS-R-US code solves three-dimensional MHD equations in a finite volume form using numerical methods related to Roe's Approximate Riemann Solver (Roe, 1981). The code employs an adaptive grid composed of rectangular blocks arranged in varying degrees of spatial refinement levels. A detailed description of the model and the numerical/parallel implementation can be found in Gombosi et al. (2003). The BATS-R-US MHD code represents the GM module of the here discussed coupled simulation.
The IM module is described by the RCM which represents the inner and middle magnetosphere with a coupling to the ionosphere (Toffoletto et al., 2003). The RCM represents the particles in terms of multiple fluids. Its equations and numerical methods have been especially designed for an accurate treatment of the inner magnetosphere, including the flow of electric currents along magnetic field lines to and from the conducting ionosphere. The RCM computes these currents and the associated electric fields self-consistently. It assumes perfectly conducting field lines and employs a precomputed time-dependent magnetic field with associated induction electric fields. The Vasyliunas equation (Vasyliunas, 1970) is used to compute the magnetic-field-aligned currents (Birkeland currents), and Ohm's law is used to compute the self-consistent ionospheric potential distribution. A detailed description of the BATS-R-US/RCM coupling can be found in De . The IE module is described by the TIE-GCM code (Ridley and Liemohn, 2002;Ridley et al., 2004). It is a two-dimensional electrostatic potential solver that obtains the field-aligned currents from the global magnetosphere (GM) module included in the SWMF and employs a statistical auroral ionosphere conductance model driven by the solar irradiation index (F10.7) and by the field-aligned current patterns.
The three modules (GM/IM/IE) are then coupled under the SWMF to form a consistent representation of the geospace environment. The coupling is achieved with nearcontinuously two-way flow information between the different modules via standardised interfaces. A schematic of the coupling is presented in Fig. 2. We shall only briefly outline the fundamental coupling aspects and refer to De Zeeuw et al. (2004) and Tóth et al. (2004Tóth et al. ( , 2005) for a detailed description.
The GM and IM modules are self-consistently coupled. magnetosphere. Starting from these calculations closed field line positions are determined and their intersections with the equatorial plane are passed to the IM module along with the overall magnetic field strength, the magnetic flux tube volumes, and the corresponding pressure and density values. IM derives its plasma distribution outer boundary and supplies the time-evolving plasma density and pressure on its spherical grid to adjust the GM values.
The IM and IE modules are only one-way coupled, the IE takes the field-aligned currents from the GM module, the electric potential of the IE calculations is then passed back via the IM to the inner GM boundary.
This GM/IM/IE coupling allows us to retrieve a wide range of output parameters from the different domains of interest. The model outputs (cf. Fig. 2) include the plasma parameters (atomic mass unit density n; kinetic pressure p; velocity v x , v y , v z ), the magnetic field components B x , B y , B z , and the electric current components j x , j y , j z (prime model outputs) as well as ionospheric electrodynamic parameters (electric potential ; Hall and Pedersen conductance H , P ) and the ionospheric electric current density J . We shall not further comment on the electrodynamic parameters since a comparison of these data sets is not in the scope of this paper. It should be noted, that the IE module is only included in the coupled simulation for completeness to represent the most commonly used coupled simulation run especially for coordinated ground-and space-based studies. The IE module is not neglectable since a change in the configuration of the coupled modules would represent a complete new natural system which then again would need validation as outlined by Oreskes et al. (1994) and Lipaev (2003).
The simulation runs used in this study have been performed at the Community Coordinated Modeling Center (CCMC; http://ccmc.gsfc.nasa.gov/) at NASA Goddard Space Flight Center (GSFC) and on the High Performance Cluster (HPC) computing facility at Lancaster University. All model runs were executed with the finest resolution of 0.25 R E (∼1600 km) in the near-tail and dayside magnetopause region (<15 R E radial distance) and a 0.5 R E (∼3200 km) resolution in the far-tail region (starting from ∼45 R E ). The adaptive nonuniform numerical simulation grid was then interpolated on an uniform equidistal cartesian grid with different resolutions 0.5/1.0 R E (using the VisAn MHD toolbox; Daum, 2007) inside a bounding box of ±21 R E . Since the CCMC is running the SWMF version 2.1 and the Lancaster HPC is running version 2.3 with a different CPU layout an initial test of the simulation results was performed by comparing the two runs. This step serves as first verification level to rule out computational and numerical errors which could arise from the different CPU layouts and the different compiler infrastructures used. The runs showed equivalent results so that we are confident in using the simulation run results for the comparison with the averaged bulk plasma parameters obtained from the instruments onboard the Cluster spacecraft.
The global MHD simulation is driven by real upstream solar wind conditions (see Fig. 3) of the y-z GSM plane at x=33 R E , at the other boundaries the code assumes a zero gradient for the plasma variables since these boundaries are far enough from the Earth that they have no significant effect on the dynamics due to the fact that they introduce a negligible effect in the resistive MHD equations describing the domain. Since a real time simulation run over the complete time span (2001)(2002)(2003)(2004) was not operable (each time step in the simulation represents a full three-dimensional bounding box calculation with over 22.8 million data points), we have calculated six day averages from the hourly ACE Level 2 merged IMF, solar wind, and energetic particle data files. Figure 3 presents an overview of these six day averaged data sets from 2001 to 2004. The first panel shows the solar wind density (also the 50% decrease, light grey trace which was used in the simulation runs as constraining factor) followed by the solar wind temperature, the velocity, and magnetic field strength components in a GSM reference frame. The last panel shows the IMF clock angle defined as arctan B y /B z . With these six day averaged solar wind and IMF conditions a quasi time-accurate simulation was performed and every cadence in the simulation therefore represents a six day cadence in real-time. Subsequently with the 243 dependent IMF and solar wind condition also 243 model outputs with an average dipole tilt of 0 • (corresponding to the four years observation time) were computed and used for the comparison.
For the times where the Cluster orbit swept over the dayside magnetosphere region, highlighted in Fig. 3 by the underlying grey areas, additional fixed input boundary model runs with an averaged dipole tilt of −11.68 • (calculated from the daily dipole tilts in the time from January to April of each year) were performed. These runs were then used for an initial comparison of the magnetopause shape and location in the model and the data in order to give a first implication if the presented constraining factor (50% density decrease) is applicable. It has to be noted here, that the model runs with fixed input boundary conditions converge into a steadystate solution through unphysical intermediate solutions but since we only use these runs for a comparison of the directly driven part of the magnetosphere reasonable implications for the constraining factor can be drawn.

Comparison
The model and data comparison contains two steps. The first step develops and evaluates the constraining factor used for the model runs and the second then builds upon this factor to present a first general large-scale comparison of the Cluster observations and the coupled MHD model run predictions.

Initial comparison
As outlined in the studies by Daum and Wild (2006) and Hayosh et al. (2003Hayosh et al. ( , 2006 using the coupled BATS-R-US model (GM/IM/IE) in comparison with high-latitude Cluster and high-/low-latitude INTERBALL-1 observations respectively, the coupled model seems to overestimate the general magnetic field and plasma compression. Therefore the bowshock and magnetopause boundary is pushed further towards the Earth as indicated by the in situ satellite observations. In order to compensate for this and to "calibrate" the model, a constraining factor for the upstream input boundary conditions is necessary. As shown in Fig. 2 the parameter which has a direct linear influence on the coupling between the GM and IM module is the solar wind proton density, this factor then also determines the location and shape of the bowshock and the magnetopause in the simulation. In respect to the results presented by Daum and Wild (2006); Hayosh et al. (2003Hayosh et al. ( , 2006; Koval et al. (2006) we have computed different simulation runs (not shown here) with varying densities (from 100% to 30% of the original value) and have compared the location of the magnetopause in these simulation runs with actual Cluster magnetopause crossings. Therefore we used the simulated data to estimate the times of the Cluster magnetopause crossings exploiting the spacecraft orbit path and the prime model output parameters. The crossing times were identified manually on the basis of several independent criteria of changing fields along the orbit path: (1) the magnetic field variations in strength and direction, (2) sudden increases in the x-GSM velocity component, and (3) locations/regions where the vector field of ∇p−p differs from zero and exhibits a drastic magnitude change. The so deduced times of crossing were then compared with the actual in situ field and plasma indications observed by the instruments. Here the simulation runs performed with density decreases of ∼50%±3% showed the best agreement between the predicted times and the actual times of the magnetopause crossings.
In order to evaluate this constraining factor we have generated x-z GSM data-planes from the Cluster observations for times when the orbit swept over the dayside magnetosphere (cf. Fig. 1 centre panel). Figure 4 first column shows the corresponding Cluster 1 (Rumba) orbit positions for these times at 100 min resolution projected onto the x-z GSM plane with all y-positions falling into the data range of y=±5 R E , the second column represents the resultant colour-coded maps (1×1 R E cell size) of the observed plasma pressure   Shue et al. (1997) and the bowshock shape after Bennett et al. (1997) parameterised by the averaged IMF and solar wind conditions as shown in the upper right hand corner of each of the orbit projections. The averaged dipole tilt for the times of interest is indicated by the −11.68 • tilted day-/nightside representation of the Earth.
The observational results shown in the second column of Fig. 4 agree with similar studies previously published (Escoubet et al., 1997;Lavraud et al., 2004;Denton and Taylor, 2008) and exhibit the classical distinguishable plasma regions. From left to right; the solar wind, then the compressed magnetosheath followed by the inner plasmasphere. Assuming that the Cluster data-planes are representative of the average dayside magnetosphere state for the times of interest, we can now compare these data-planes with corresponding simulation runs. Therefore we have used the averaged solar wind and IMF conditions present at the four time spans (see (1), (2), (3) and (4) in Fig. 3) when Cluster was located in the dayside magnetosphere and have computed the corresponding coupled MHD predictions. The model runs were performed with fixed input boundary condition (shown in the upper right hand corner of the orbit projections) but with a 50% decrease of the solar wind proton density and a fixed dipole tilt of −11.68 • was used. Figure 4 third column shows the colour-coded x-z GSM pressure distributions at y=0 R E for the BATS-R-US/RCM/TIE-GCM coupled model runs. Also here the bowshock shape (Bennett et al., 1997) and the magnetopause boundary (Shue et al., 1997) are overlaid.
Comparing the in situ pressure distributions with the modeled ones, it can be seen that both modelled and measured pressures exhibit three distinct plasma regimes and their associated boundaries after Bennett et al. (1997) and Shue et al. (1997). Here it should be noted, that the bowshock model seems to perform slightly better than the magnetopause model especially in the tailward flank regions. But as shown inŠafránková et al. (2002) these variations are common for second order surface fits as used in the Shue et al. (1997) model and spreads of 2 R E to 3 R E inside or outside the model surface are expected. Therefore the slight variances between the here employed MHD model and the magnetopause representation are expected. Both boundaries can therefore be used as reference for the following comparison and give a first indication of the applicability of the constraint used.
Following on from this initial referential comparison it can be seen that from the solar wind to the compressed magnetosheath region, the model as well as the data show a steep gradient in the pressure with similar values and distributions upstream and downstream of the bowshock. Inside the magnetosheath region the pressure values raise up to p 0.8 nPa.
Here the observations exhibit the highest values in the northern and southern cusp regions (x∼7 R E and z∼±7 R E ) and the model data exhibit the highest values at the subsolar nose region (x∼10 R E and z∼3 R E ) which then extend into the cusps. At a normal distance of 1 R E to 3 R E downstream of the magnetopause boundary the values drop again to levels of around 0.1-0.2 nPa. Here the two tail lobes above and beneath the neutral sheet become apparent. Continuing inwards to the nightside of the Earth at lower L-shell positions, the values raise again to magnetosheath level. This steep gradient in the observations is not present in the model data and suggests that the model underestimates the nightside plasma injection in L-shell regions under 3 R E .
As outlined in De Zeeuw et al. (2004); Danov and Koleva (2007); Garner et al. (2004); Lemon et al. (2004Lemon et al. ( , 2005 this underestimation is common in the coupled IM module and is heavily dependent upon the initialisation time given for the model run, until its sets into a steady state for the ring current distribution and it is dependent on the grid resolution used. The grid resolution change between the GM and IM module can cause diffusion regions which then have an influence on the current distribution and subsequently the plasma pressure distribution especially in the near-Earth regions. In the case presented here it is most likely that the underestimation is the result of the fixed input boundary conditions since the model runs therefore do not include the temporal (120 days) and spatial (±5 R E perpendicular to y=0 R E ) variations of the high-energetic plasma processes (unlike the observations).
Overall it can be said that the coupled MHD simulation reproduces reasonable well the actual statistical parameterisation of the dayside magnetosphere given by the observations and clearly exhibits the three main plasma regions separated by the boundary representations of Bennett et al. (1997) and Shue et al. (1997). Whereas with the 50% decrease these boundaries are reproduced reasonably well the >60% density runs (not shown here) put the location of the bowshock and the magnetosphere further towards the Earth due to their overestimation of the magnetic field and plasma compression. We are therefore confident in using the 50% decrease of the solar wind proton density as a constraining factor to calibrate and adjust the MHD model runs.

General comparison
The second step of comparison serves as an overall validation criteria for the coupled model runs and incorporates the complete Cluster data sets (2001)(2002)(2003)(2004) and uses the constraining factor discussed above for the model runs. In order to compare the Cluster data sets with the model, we have computed 243 simulation steps using the six day averaged solar wind and IMF conditions as shown in Fig. 3 and generated averaged plasma, magnetic field, and velocity noonmidnight meridian and equatorial maps. Figure 5a-f shows the Cluster pressure distributions (with up to 825 000 data points) in the x-z and x-y GSM reference planes and the corresponding MHD model distributions as well as the resultant deviation. Figure 7a-j represents the total magnetic field strength and deviations in the same GSM reference planes of the Cluster data sets, the MHD model predictions and  (Shue et al., 1997) and the bowshock shape (Bennett et al., 1997) parameterised by the IMF and solar wind conditions as previously shown in Fig. 1.
the empirical Tsyganenko (T'96) magnetic field model (Tsyganenko, 1995;Tsyganenko and Stern, 1996) predictions. Overlaid on all resultant maps are the magnetopause boundary (Shue et al., 1997) and the bowshock shape (Bennett et al., 1997) parameterised by the averaged IMF and solar wind conditions obtained from the six day averages as shown in Fig. 3.

Pressure comparison
The x-z GSM pressure maps of the observations and the model (see Fig. 5a-e) exhibit again the three clear distinguishable plasma regions with similar values and distributions. By using the complete Cluster data set, the nightside distribution is also shown and exhibits, in addition to the above described characteristics, a slight increase of pressure in the far tail region at x∼−15 R E with values raising to about 0.3 nPa. This is reproduced by the model but with a much thinner z-GSM spread, this is due to the fact that the model uses a fixed 0 • dipole tilt whereas the observations include the daily and annual variations. Figure 5c also shows that the Cluster data and the model predictions deviate the most on the dayside inner plasmasphere at x∼3 R E , here a difference of upto 0.4 nPa can be observed. Elsewhere the model predictions and the Cluster data are in reasonable agreement with a standard deviation of ∼0.13 nPa.
In the x-y GSM pressure map of the observations (see Fig. 5d) the clearly distinguishable plasma regions mentioned previously blend into each other and only two clear regions are apparent, the solar wind and the quasi combined magnetosheath and magnetosphere. It should be noted that the observations do not include values in the 5 R E <x<10 R E , −10 R E <y<0 R E region (also the inverse is true for the nightside) due to the orbit path and therefore only limited comparisons can be made. The data do not show the three plasma regions and clearly defined boundaries presented in the model predictions, with high plasma pressure values inside the magnetosheath and lower pressure values downstream of the dayside magnetopause. In the inner plasmasphere the model and the observations have a better agreement but the model exhibits smoother gradients, this is also reflected in the deviation plot see Fig. 5f. is consistent with the results presented by e.g. Antonova et al. (1999);De Michelis et al. (1999);Lui (2003). These near-Earth high plasma values encounters are not well reproduced by the model.
Comparing the simulation runs with the results presented in Lemon et al. (2004Lemon et al. ( , 2005 a consistency can be found. The magnetic-field perturbations in the RCM model seems to prevent the injection of a significant ring current and subsequently prevent the occurrence of a high plasma pressure in the near-Earth region. In order to get a fully self-consistent representation of the near-Earth environment, the RCM has to be further coupled with a ring current model. At the current state the SWMF does include a coupling between the RCM and the Fok et al. (2001) ring current and radiation belt model. But, this so called Comprehensive Ring Current Model (CRCM) is strictly one-way coupled and does not interfere or alter the general GM/IM/IE MHD simulation. Therefore it can be used for case studies addressing this specific plasma domain but can not be included in the general validation of the coupled MHD model under investigation here. For completeness and since it does not interfere with the general simulation in the sense of changing the natural system (Oreskes et al., 1994), in addition to the MHD model steps we have also computed the corresponding 243 coupled CRCM steps. Figure 6 shows the proton fluxes in the equatorial plane derived from the CRCM at one instant of time during the simulation runs. It can be seen that with the CRCM model, the near-Earth plasma domain can be described in great detail (e.g. Sazykin et al., 2005;Taktakishvili et al., 2007) and that the high fluxes predicted by the model are consistent with the location of the high pressure values indicated by the obser-vations. Other in situ/model comparisons of this type reveal clues to physical processes responsible for such high pressure values (e.g. Denton et al., 2005a;Lavraud and Jordanova, 2007).
Overall it can be said that the GM/IM/IE coupled MHD model plasma predictions, also without a two-way coupling to the CRCM, represent the plasma pressure observations reasonable well in all major regions for L>4 R E , considering the above mentioned instrument and model limitations.

Magnetic field comparison
Since the Cluster data-planes are spatially confined by the orbit path, we have additionally calculated the corresponding magnetic field strengths in the planes of interest using the T'96 magnetic field model. Due to the fact that no Cluster data are included in the T'96 database, the model can be used as an independent further reference for the comparison between the Cluster data and the MHD model predictions. Here has to be noted that the T'96 model is limited and confined by a parabolic magnetopause shape and does not produce values outside this boundary. Therefore the T'96 model should here only be used for a comparison of the inner plasmasphere values. This limitation is also why MHD models become indispensable for the community especially when focusing on processes which extend over this natural boundary. The T'96 calculations used here were performed with the six day averaged solar wind and IMF conditions as shown in Fig. 3 and a general D st -index of −25 nT. With these input parameters 243 model steps were calculated and averaged. The resultant total magnetic field strength distributions are shown in Fig. 7d, i.
In all six magnetic field strength representations (Fig. 7a,  b, d, f, g, i) four clear defined regions of different magnetic field strength can be observed. From left to right (top to bottom), the low magnetic field strength of ∼10 nT in the solar wind upstream of the bowshock, followed by 20-30 nT in the magnetosheath, followed by the geomagnetic field indicated by 50-75 nT in the vicinity of the magnetopause and magnitudes of over 100 nT closer to the Earth.
Concentrating at first on the resultant x-z GSM planes (Fig. 7a-e), the MHD and the T'96 model predictions show a good agreement up to the magnetopause boundary. Both model predictions reflect the four magnetic field strength regions as seen in the observations. Since the MHD model is not limited by a parabolic magnetopause, the model can also describe the magnetosheath and solar wind domain and reflects to a certain degree the major characteristics of the observations. The T'96 model reflects better than the MHD model the plasma cavity regions above and beneath the neutral sheet starting from x≤0 R E . Compared to the observations, the MHD model still seems to slightly overestimate the magnetic compression since the observations show magnetic field strengths of about 20-30 nT in the magnetosheath which are only to some extent represented in the model predictions  (Shue et al., 1997) and the bowshock shape (Bennett et al., 1997) Fig. 8. BATS-R-US/Cluster 1 (Rumba) vectors of the magnetospheric flows in the x-z and x-y GSM reference planes. The velocity flow vectors are scaled in accordance to the key on the right and colour-coded in accordance to the coupled MHD and the observations, respectively. Overlaid are the magnetopause boundary (Shue et al., 1997) and the bowshock shape (Bennett et al., 1997) parameterised by the IMF and solar wind conditions as previously shown in Fig. 1. (cf. Fig. 7b). Also compared to the T'96 model, the MHD model exhibits a much steeper gradient of the field strength in the plasma cavities. This is especially evident in the x-y GSM representation.
The observations exhibit higher magnetic field strength values throughout the nightside plasmasphere, which are not well reflected by both model predictions (cf. Fig. 7h, j).
Here it should be noted, that the observations do include the daily/annual dipole variations and include fluctuations associated with high plasma measurements. The observational data-planes therefore represent a smoothed picture of reality but with systematic spikes whereas the models do not account for this fluctuations and the daily/annual variations.
Therefore it can be found that on average, both models describe reasonable well the dayside magnetic field strengths but lack accuracy on the nightside. These discrepancies especially have to be taken into account by studies using mapping predictions between the Earth and the nightside magnetosphere near-to far-tail regions. Due to this overestimation of the magnetic compression mapping and tracing calculation along field lines exploiting the three-dimensional model data would lead to the underestimation of the actual x extent of the magnetic field lines. Figure 8 shows the velocity flow vector comparison of the Cluster data and the coupled MHD model predictions in the x-z GSM and x-y GSM reference planes. The observational flow vectors indicated by the red arrows are overlaid on the gridded MHD vectors indicated by the light grey arrows. Overlaid in both panels are the magnetopause shape (Shue et al., 1997) and the bowshock shape (Bennett et al., 1997) parameterised by the averaged IMF and solar wind conditions as shown in Fig. 3.

Velocity comparison
Here in both panels the three clearly defined plasma regions as seen before are apparent, namely the solar wind (with velocities mainly directed perpendicular to the bowshock) followed by velocity vectors indicating plasma flows around the magnetopause inside the magnetosheath and a quiet inner plasmasphere with velocities of mainly under ∼40 km/s. In the solar wind and the magnetosheath region the plasma flow vectors given by the observations are well reproduced by the model, both in direction and magnitude. The boundaries which infer to exist are perfectly matched by the model and well reflected by the Bennett et al. (1997) and Shue et al. (1997) representations. These consistencies indicate, that the constraining factor (50% decrease of the solar wind density) employed is germane.
Upstream the bowshock, the model strictly reflects the given input boundary conditions determined by the singlepoint ACE observations and does not include the variations as seen in the observations (Balogh et al., 2005). Due to this, variations between the model and the observations are expected but marginal.

Plasma density comparison
Finally we shall compare the plasma proton density of the model and the observations, to investigate if the constraining factor has also altered the proton density present in the IM module region. Thus, we attempt to evaluate how much influence the upstream input density has on the general model run. Figure 9a-d show the plasma proton density distribution given by the observations and the model in the x-z GSM and x-y GSM reference planes. Overlaid are the magnetopause boundary (Shue et al., 1997) and the bowshock shape (Bennett et al., 1997) parameterised by the IMF and solar wind conditions as previously shown in Fig. 1. For better comparison the constraint upstream values of the proton density in the model data are not shown and the plots only show values starting in close proximity of the bowshock.
Despite the lowering of the solar wind density in the simulation, the model matches the values in the IM region reasonably well. Whereas the solar wind region in the model exhibits values of around ∼3 cm −3 (blocked out but cf. Fig. 1 left-hand panel) the observations show values of ∼6-7 cm −3 as expected due to the 50% decrease, but starting downstream of the bowshock the agreement improves in both magnitude and distribution. The observations especially show high plasma densities in the magnetosheath region with a vast y-and z-GSM extent of over 20 R E . This is only partially reflected in the model predictions, but considering the constraining factor used there is still a reasonable agreement.
Considering the above comparisons it can be said, that even with the 50% decrease of the input density, the plasma, magnetic field, and velocity flow observations of Cluster are reflected well by the model on a global/medium scale.

Discussion and conclusions
The accurate global-scale modelling of the geospace environment becomes more and more relevant for solar-terrestrial studies, in order to link the various small-scale observations of space-and/or ground-based instruments. As outlined by Berchem (2000) the present global MHD models posses the unique ability to span the enormous distances present in the magnetosphere and can put the point-to-point observations into a global context. In order to give a first general large-scale evaluation of one of these global coupled MHD models, we have performed a statistical study of the bulk plasma properties as measured by the Cluster spacecraft during 2001-2004, and the near-to far-Earth simulation domain of the coupled BATS-R-US/RCM/TIE-GCM global MHD model.
Our study finds very similar statistical plasma pressure, magnetic field strength, flow velocities, and density values in the Cluster measurements and the coupled MHD model runs. This gives an indication for various kinds of event studies using the coupled model. It shows, that the current models included in the SWMF can not just be used to accomplish causal relationships, but that they also have developed into a state where they can describe the highly variable dynamics in the different plasma domains self-consistently. They can now address the linked sets of objectives concerning the nature of various time-and spatially-varying phenomena present in  (Shue et al., 1997) and the bowshock shape (Bennett et al., 1997) parameterised by the IMF and solar wind conditions as previously shown in Fig. 1.

3424
P. Daum et al.: Cluster/BATS-R-US MHD comparison the geospace environment and can be used for space weather applications.
As outlined by Walker and Ashour-Abdalla (1995) the global simulations can only be used to address these linked set of objectives, when they are correctly adjusted and calibrated. In order to do so, the models need to be compared with huge data sets covering the vast spatial extent of the near-to far-Earth plasma environment. The here presented technique for using several years worth of in situ Cluster data (Denton and Taylor, 2008) to produce noon-midnight meridian and equatorial data-planes and to compare them to the coupled simulations, demonstrates a novel general validation step for the models. While the presented data set already contains over 825 000 data points, the observations of the space plasma environment still remains sparse and the focus of future work will be to incorporation further satellite data from missions such as Double Star, THEMIS, and LANL data (Denton et al., 2005b) in order to extend the presented data-planes and to lower the grid resolution to match the model grids.
But in general, it can be said that the presented model runs do reflect the observations reasonably well on a global to medium scale, considering the general MHD code limitations. These limitations contain the lack of a description of reconnection, a self-consistent description of multicomponent and multienergetic plasma systems as well as the inclusion of localised resistivity, nevertheless the GM domain up to the inner magnetopause boundary is well reproduced by the BATS-R-US code and can provide a valuable larger context for the localised point-to-point observations. Discrepancies between the coupled model and the in situ observations were found in the IM domain especially in the lower L-shell regions. These discrepancies between the RCM model and the in situ observations are the result of the inherent limitations of the coupled model. The BATS-R-US code is a single-fluid code that solves the ideal MHD equations, thus when coupled to the RCM the general plasma composition is unknown. The RCM therefore has to make assumptions regarding the plasma composition, since the energy dependent particle drifts are dominating the IM region especially in the lower L-shell regions, already small derivations in the plasma composition can lead to variations between the model and the in situ data . Improving this major limitation and to include more precisely the high-energy processes present in the near-and far-tail regions are still the important remaining questions of ongoing research. The actual physical processes in this domain are only very limited reflected in the actual model describing the domain Danov and Koleva, 2007;Garner et al., 2004;Lemon et al., 2004Lemon et al., , 2005. To include the physics of this highly dynamical and chancing environment into global simulations will be the focus of further studies. The created Cluster data-planes and the presented comparison study can aid these simulations to better represent the highly variable inner magnetosphere especially in close vicinity to the Earth.
But also with the encountered variations between the model and the in situ data in the IM regions, the model still reasonable well describes the large-scale processes in this domain and therefore can also be used for event studies as outlined before for the GM domain, especially with further coupling to different models.
While the presented study has shown that the coupled BATS-R-US/RCM/TIE-GCM MHD code is a powerful tool to model the geospace environment on a large-scale, it has also shown that in order to get a more realistic representation of the highly dynamic plasma domains, small-scale global kinetic simulations are indispensable and will be subject of future studies in the modelling community. Although global kinetic models are still a few years from being widely used in the community, first multi-scale modelling approaches have been developed in order to fill the gap between the smallscale kinetic models and the global simulations. A first approach is described in Kuznetsova et al. (2007) and bridges the gap successfully between the two modelling domains, further studies of this kind will follow until the advances in technology will allow the global kinetic models to substitute the currently dominant global MHD simulations. Also general global validation studies of these combined model runs are still pending.