The data made accessible through this interface are 1-min-averaged field/plasma data sets shifted to the
Earth's bow shock nose (BSN). This "High Resolution OMNI" (HRO) data set involves an interspersal of
BSN-shifted ACE, Wind, IMP 8 and Geotail data. The following material describes the content and building of this HRO
data set and related data sets.

OMNIWeb Data Explorer

One-min and 5-min solar wind data sets
at the Earth's bow shock nose

Joe King and Natalia Papitashvili, GSFC/SPDF and ADNET Systems, Inc.

This note describes the building and contents of several 1-min- and
5-min resolution, solar wind magnetic field and plasma data sets
time-shifted to the Earth's bow shock nose. Data from the ACE, Wind and
IMP 8 spacecraft were processed in 2005-6, while Geotail data were added
later, in 2007. Initially the data were for 1995 to near-current. In 2009,
the IMP 8 shifted data were extended back in time to 11/04/1973,
shortly after launch. Also in 2009, we added GOES fluxes of
protons above 10, 30 and 60 MeV to 5-min OMNI.
These products are primarily intended to support
studies of the effects of solar wind variations on the magnetosphere and
ionosphere. In addition, we address 1998-2000 1-min ACE data sets
shifted using various techniques to the Wind location.

Time shifting is based on the assumption that solar wind magnetic
field values observed by a spacecraft at a given time and place lie
on a planar surface (a "phase front") convecting with the solar wind,
and that the same values will be seen at a different place at the time
that the phase front sweeps over that location. A key element of the
time shifting is use of the phase front normal (PFN) directions, which
are to be determined individually for each input 15-16 sec magnetic
field observation by analysis of it and its near neighbors. We
identify and compare results of two distinct PFN determination
analysis techniques (minimum variance and cross products) and
two separate combinings of these, for a total of four shift techniques.

The family of products introduced herein consists of
(a) 1-min averaged 1998-2000 ACE magnetic field and plasma data
shifted to the Wind location by each of the four shift techniques,
along with 1-min unshifted Wind averages, with which interested
persons can make independent judgements on the relative
effectiveness of the various shift techniques,
(b) 1-min and 5-min averaged ACE (1998-present), Wind (1995-present), IMP 8
(1973-2000) and Geotail (1995-2006) magnetic field and plasma data sets
shifted to the Earth's bow shock nose,
(c) a 1-min spacecraft-interspersed data set at the bow shock nose
that we call the High Resolution OMNI (HRO) data set and
(d) a 5-min averaged version of HRO having GOES proton fluxes appended.
Time tags in records of all these products are target-arrival times
and not observation times.

This note addresses in sequence: (a) the input data sets and their
preparations, (b) the time shifting used, including discussion of the
multiple PFN determination techniques available and including
consideration and handling of "out-of-sequence" arrivals,
(c) the building of 1-min averages from the shifted 15-16 sec IMF
values and shifted 1-2 min plasma values, (d) discussion of the
various data sets created (spacecraft-specific and the spacecraft-
interspersed HRO), including their record formats and meanings
of each word in the records, (e) results of analysis of the 1998-2000
Wind data and ACE data shifted to Wind for predictability of IMF
and plasma variations at one point, given observations elsewhere,
as a function of the two-point separation vector, of the solar wind
state (variation level, fast or slow, etc.), and of the PFN determination
technique. In addition, a series of Appendices address
(f) interspacecraft comparisons of magnetic field and plasma
parameter values for finding systematic differences and parameter
cross-normalization used in interspersing data from three spacecraft,
(g) selection criteria for which data to use in High Resolution OMNI
when data from multiple spacecraft are available for a given interval.

Proton fluxes from GOES (>10 MeV, >30 MeV, >60 MeV) are taken from NGDC:
http://goes.ngdc.noaa.gov/data/avg/ originally, and more recently,
http://satdat.ngdc.noaa.gov/sem/goes/data/new_avg/.
See near end of Section 2 below for further detail.
Minute AE, AL, AU and SYM/D, SYM/H, ASY/D, ASY/H
indexes have been computed at WDC for Geomagnetism at U. Kyoto: http://swdcwww.kugi.kyoto-u.ac.jp/aeasy/
PC(N) is the Polar Cap Index determined from the North polar cap station at Thule, Greenland.
It has been computed at World Data Center for Geomagnetism, Copenhagen
at the National Space Institute (DTU Space), Technical University of Denmark: ftp://ftp.space.dtu.dk/WDC/indices/pcn/.

We have used publicly available ACE, Wind, IMP 8 and Geotail magnetic
field and plasma data in building the 1-min and 5-min data products described
herein.

ACE (Advanced Composition Explorer) was launched August 25, 1997,
and continues to provide magnetic field, plasma and energetic particle
data from a ~180 day L1 orbit having X, Y, and Z (GSE) ranges of
220 to 250 Re, -40 to +40 Re, and -24 to +24 Re. The ACE home page
is at
http://www.srl.caltech.edu/ACE/.

Wind was launched November 1, 1994, as part of NASA's contribution
to the International Solar Terrestrial Program. It continues to obtain
magnetic field, plasma, energetic particle and plasma wave data.
Since mid-2004, it has been in an L1 orbit with excursions in Y(GSE)
between +/- 100 Re. It had multiple earlier phases, including an
interval spanning the last third of 2000 through mid 2002 with Y(GSE)
excursions in excess of 200 Re and an interval in late 2003 and early
2004 in orbit about the Lagrange point on the anti-sunward side of Earth.
The Wind home page is at
http://pwg.gsfc.nasa.gov/wind.shtml.

IMP 8 was launched October 26, 1973, into a low eccentricity Earth
orbit. Apogee and perigee distances have been in the ranges 38-45 Re
and 28-34 Re. On average IMP 8 is out of the solar wind for about
5 days of every 12.5 day orbit. The IMP 8 magnetometer failed June 10,
2000. Data from the MIT plasma instrument and from three energetic
particle detectors were acquired until October, 2006.
The IMP 8 web page is at
http://spdf.gsfc.nasa.gov/imp8/project.html.

Geotail was launched July 24, 1992, into an eccentric orbit with apogee deep in
the geotail. In early 1995, the Geotail orbit was adjusted to
about 10 x 30 Re, and then to 9 x 30 Re in 1997 where it continues today (2008).
In this orbit, Geotail has annual solar wind "seasons" with
apogee local times on or near the Earth's dayside, and it has solar wind intervals
during each ~5 day orbit of the solar wind seasons.

"Level 2" 16-s magnetic field data and 64-s plasma data were pulled
from the ACE Science Center. (Credit goes to Andrew Davis and the
ASC team for a very effective data management and distribution
facility). The field and plasma data there start on September 2, 1997,
and February 5, 1998, respectively. Owing to the critical need for plasma
flow speed data in time shifting magnetic field data to the bow shock nose
or elsewhere, we limit the coverage of ACE data in our new data products
to February 5, 1998, and later.

Wind magnetic field data.

The Wind magnetic field data are standardly produced by the instrument team
at 3-s, 1-min and 1-h resolutions. Because we apply phase front normal
determination algorithms to 15.36-s IMP magnetic field data and to 16-s
ACE data, we form 15-s averages from the available 3-s data to have similarly
resolved Wind magnetic field data as input.

The Wind magnetic field data are standardly available at 3-sec resolution
with no discrimination for orbit phase, in particular, for solar wind vs.
non-solar wind phases. We have filtered at hourly resolution the time-
continuous 3-sec data against the Wind bow shock crossing identifications
made by the Wind magnetometer team and available at
http://wind.nasa.gov/mfi/bow_shock.html to give a solar-wind-only
input data set. We have made our own identifications of the few crossings
that occurred after the October, 2003, end of the Wind team's list.

In October 2011, the Wind/MFI team finished the reprocessing of
all MFI data. Among other things, well-determined Bz offset values
were used. The new MFI data were inserted into High Resolution OMNI
when they became available, replacing the earlier MFI data. The
new data were used to re-determine solar wind phase front normals
used in shifting data.

There are rare spikes in the Wind magnetometer data. We have taken a
simple approach to eliminating most of these by rejecting any 3-sec
record with a magnetic field magnitude or component in excess of 70 nT.
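The 70 nT rejection rule can be illustrated with a minimal Python sketch. The record layout (time, Bx, By, Bz) and the function name are assumptions made for this example; they are not taken from the production code.

```python
import math

SPIKE_LIMIT_NT = 70.0  # threshold quoted in the text


def reject_spikes(records, limit=SPIKE_LIMIT_NT):
    """Keep only records whose components and magnitude do not exceed limit.

    Each record is a hypothetical (time_s, bx, by, bz) tuple, field in nT.
    """
    kept = []
    for t, bx, by, bz in records:
        magnitude = math.sqrt(bx * bx + by * by + bz * bz)
        if max(abs(bx), abs(by), abs(bz), magnitude) <= limit:
            kept.append((t, bx, by, bz))
    return kept


sample = [
    (0.0, 3.0, -4.0, 0.0),    # ordinary solar wind value, kept
    (3.0, 80.0, 1.0, 1.0),    # one component above 70 nT, rejected
    (6.0, 50.0, 50.0, 10.0),  # components below 70 nT but magnitude above, rejected
]
clean = reject_spikes(sample)
```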

Wind/SWE plasma data

Wind/SWE plasma parameter data are available at ~92-s resolution in
three versions corresponding to three approaches to their production from
underlying distribution functions. There are "key parameter" data, non-linear
fits-based data (fits assumed convecting bimaxwellian distributions), and
anisotropic moments-based data. These are discussed at the
MIT Wind/SWE web page cited above. The latter two are further discussed in
Justin Kasper's dissertation whose most salient parts are web-accessible at
ftp://spdf.gsfc.nasa.gov/pub/data/wind/swe/ascii/2-min/thesis.pdf.
Finally, "physics-based" tests of the goodness of the nonlinear fits (NLF)-based
velocities (~0.16% in speed, ~3 deg in direction), densities (~3%) and
temperatures (~8%) are discussed in Kasper et al. (2006).

The NLF data and the anisotropic moments-based data are available to within
several weeks of the date of availability of Wind magnetic field version 4
data. The SWE KP data, on the other hand, are typically available to within
several weeks of the current date. Given this and given the urging of the MIT
plasma team to use the very good and more robust KP data, we have chosen
to use the KP data in our high resolution OMNI data set.

But given that it was the NLF data for which the relatively small uncertainties
cited above were determined, we shall normalize the KP density and temperature
values to equivalent NLF values in the spacecraft-interspersed HRO data set.
This point is further discussed and quantified in Appendices 1 and 2 addressing
comparisons and cross-normalizations of the available multi-spacecraft data.

As for the Wind magnetic field data, the SWE KP data are available with no
discrimination for orbit phase. We have extracted a solar wind-only set of
SWE KP data by again filtering at hourly resolution against the Wind bow
shock crossing identifications cited above.

The SWE KP data are initially computed and loaded to CDAWeb. The SWE
team at MIT improves this product by passing it through a despiking routine
that compares a value with the median of three points (the point being tested
and its immediate predecessor and follower). Some spikes elude detection.
We have run a further despiking routine requiring (to be a non-spike) that
the difference between a parameter value and the mean of the two preceding
and two following values should be less than four times the standard deviation
in that mean or that that difference relative to the mean should be less than
some (parameter-dependent) value. This is further discussed in Appendix 3.
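The second despiking test can be sketched as follows. This is an illustration only: the text does not define "standard deviation in that mean" precisely, so the sketch assumes the population standard deviation of the four neighbors, and `rel_limit` is a hypothetical stand-in for the parameter-dependent relative threshold described in Appendix 3.

```python
import statistics


def find_spikes(values, rel_limit=0.5, n_sigma=4.0):
    """Return indices of points that fail both non-spike tests.

    A point passes if its difference from the mean of the two preceding and
    two following values is below n_sigma standard deviations of those four
    values, or if that difference relative to the mean is below rel_limit.
    The first two and last two points are never tested.
    """
    spikes = []
    for i in range(2, len(values) - 2):
        neighbors = values[i - 2:i] + values[i + 1:i + 3]
        mean = statistics.fmean(neighbors)
        sigma = statistics.pstdev(neighbors)
        diff = abs(values[i] - mean)
        passes_sigma = diff < n_sigma * sigma
        passes_relative = mean != 0 and diff / abs(mean) < rel_limit
        if not (passes_sigma or passes_relative):
            spikes.append(i)
    return spikes


speeds = [400.0, 402.0, 401.0, 900.0, 399.0, 403.0, 400.0]  # km/s, one obvious spike
flagged = find_spikes(speeds)
```

Because the end points are never tested, a visual check of the kind described for the IMP 8 data below remains useful.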

This data set has data from both the solar wind and magnetosheath
phases of the IMP 8 orbit. However, each record has an MIT-assigned
flag indicating whether the data definitely are, or are not, from the solar
wind, or whether they may be from solar wind or magnetosheath. We
have used this flag to eliminate from the products discussed in this
documentation any data not tagged as being definitely in the solar wind.

There are some spikes in the IMP 8 plasma data. To eliminate most of
these, we have applied the spike finder software discussed in Appendix 3
to the data. However, because the software assumes that the first two
and last two data points of every interval not having a data gap in excess
of one hour are good data, we have visually scanned plots of data after
the application of the spike finder software, and have identified and
eliminated a few extra points as being likely bad points.

The IMP 8 plasma flow elevation angle has long been recognized as
having a ~2 deg offset. This is further discussed in Appendix 1. We
have not taken this bias out of the data of the products discussed herein.

Geotail magnetic field and plasma data

First, we created 15-s magnetic field averages from 3-sec values for input
compatibility with the ACE, Wind and IMP IMF data used. Second, we determined the principal
time intervals during which Geotail was beyond the Earth's bow shock, in the solar wind.
This process, which does not distinguish foreshock intervals from non-foreshock solar wind
intervals, is extensively discussed at
ftp://spdf.gsfc.nasa.gov/pub/data/geotail/merged/sw_min_merged/00readme
Our despiking of Geotail magnetic field and plasma data is also described in this readme file.
The despiked, 15-s, solar-wind-only magnetic field data set is accessible from
http://omniweb.gsfc.nasa.gov/ftpbrowser/geotail_mag15s.html

We used CPI plasma data rather than Geotail LEP plasma data as the former seemed to
have cleaner solar wind parameter values and were more immediately accessible to us.
CPI despiked plasma data are also available at http://omniweb.sci.gsfc.nasa.gov/ftpbrowser/geotail_pla_cpi.html

As we were doing this work, the magnetometer PI team was working to
reprocess its data using more definitive Bz offset values. As of this
date (February 5, 2008) we had not received reprocessed data. So we have done our Bz
corrections using the expectation that, when averaged over a year, the
Bz component in geocentric solar ecliptic coordinates should be within
0.1 of zero. It is possible that one day our new Geotail data sets and
multi-spacecraft OMNI data sets will incorporate the not-yet-available
reprocessed data of the PI team.

GOES energetic proton fluxes

Fluxes of protons above 10, 30 and 60 MeV, as measured by NOAA's geosynchronous
GOES spacecraft are included in 5-minute OMNI. Data from the following
spacecraft were used for the indicated years: GOES 7, 1995; GOES 8, 1996-2002;
GOES 10, 2003; GOES 11, 2004-2010; GOES 13, 2011 and later. Data are as
taken from http://satdat.ngdc.noaa.gov/sem/goes/data/new_avg/
except for GOES 13, for which separate fluxes are given at NGDC for eastward-
and westward-looking sensors. For GOES 13, we have averaged these two fluxes for
inclusion in 5-min OMNI. To view separate eastward- and westward-looking fluxes, and
their ratios, see the FTPBrowser interface at
http://omniweb.sci.gsfc.nasa.gov/ftpbrowser/goes13_flux_5m.html
Principal Investigator for the GOES energetic particle instruments is currently
T. Onsager, and key responsible NGDC person is D. Wilkinson.

Extra notes

Data providers may occasionally create replacement versions of their data.
In such cases, we replace the superseded data in OMNI with the newer
data values, and typically make note that this has happened at
http://omniweb.gsfc.nasa.gov/html/ow_news.html.
Such changes are relatively rare and typically involve only small parameter value
changes.

We sometimes refer to "15-s" input magnetic field data throughout
these pages. Readers should appreciate this is a shorthand notation
for 16-s ACE data, 15-s Wind and Geotail data and 15.36-s IMP data.

To best support solar wind - magnetosphere coupling studies,
it is desired to time-shift solar wind magnetic field and plasma
data from their location of observation, which may be an hour
upstream of the magnetosphere and several tens of Re or more
removed from the Earth-sun line, to a point close to the
magnetosphere. We choose this point to be the bow shock nose.
In addition, to assess the goodness of such shifts, we separately
shift ACE data to Wind (by each of several shift techniques)
and compare the shifted ACE data and in situ Wind data.

Given the availability of data on a specific solar wind magnetic
field or plasma parameter P as a function of time t at the
location Ro of an observing spacecraft, i.e., P(t, Ro),
it is desired to infer values of this parameter at
some displaced location Rd, i.e., P(t', Rd). The key underlying
assumption enabling estimation of the time shift,
delta-t = t'-t, between observation of the parameter at Ro
and t and arrival of this value/variation at Rd at t', is that solar
wind variations are organized in series of phase fronts (flat
planes) that convect with the solar wind velocity V. Curvature
of variation surfaces is ignored and propagation of these phase
fronts relative to the solar wind flow is ignored. The unphysical
interpenetration of these phase fronts is discussed later. Thus
the time shift equation is delta-t = n · (Rd – Ro) / (n · V),
where n is the variation phase front normal (PFN) and where
“·” is the normal dot or scalar product of two vectors.
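A minimal Python sketch of the time shift equation follows. The Earth radius used to convert Re to km, the function names, and the example positions (a spacecraft near L1 and a target roughly 14 Re upstream of Earth) are illustrative assumptions, not values taken from the production software.

```python
import math

RE_KM = 6371.2  # assumed Earth radius in km, used to convert Re to km


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def time_shift_seconds(n, r_obs_re, r_target_re, v_kms):
    """delta-t = n . (Rd - Ro) / (n . V); positions in Re, velocity in km/s."""
    separation_km = [(d - o) * RE_KM for d, o in zip(r_target_re, r_obs_re)]
    n_dot_v = dot(n, v_kms)
    if n_dot_v == 0.0:
        raise ValueError("phase front contains the flow direction; shift undefined")
    return dot(n, separation_km) / n_dot_v


# Phase front normal along X, spacecraft on the Earth-sun line.
shift = time_shift_seconds(
    n=(1.0, 0.0, 0.0),
    r_obs_re=(230.0, 0.0, 0.0),
    r_target_re=(14.0, 0.0, 0.0),
    v_kms=(-400.0, 0.0, 0.0),
)

# Normal tilted 45 deg in the X-Y plane, spacecraft 40 Re off the Earth-sun line.
tilt = math.radians(45.0)
shift_tilted = time_shift_seconds(
    n=(math.cos(tilt), math.sin(tilt), 0.0),
    r_obs_re=(230.0, 40.0, 0.0),
    r_target_re=(14.0, 0.0, 0.0),
    v_kms=(-400.0, 0.0, 0.0),
)
```

In this example the on-axis shift is about 57 minutes, and the tilted front observed 40 Re off axis arrives about 11 minutes later than that, which shows how strongly the PFN direction can affect the shift when the observer is far from the Earth-sun line.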

The target Rd to which we shall shift ACE, Wind, IMP 8 and Geotail
data is the bow shock nose. This will best support future
solar wind - magnetosphere coupling studies. We use the
field and plasma parameters determined at a given time,
and the bow shock model of Farris and Russell (1994) with
the magnetopause model of Shue et al (1997), to determine
where the bow shock will be when the phase front reaches it.
See Appendix 4 for a discussion of these models. We include
solar wind flow aberration associated with Earth's ~30 km/s
orbital motion about the sun in bow shock nose location
determination.

It is recognized that this is a very simplified approach, neglecting
finite response times of the magnetosphere to solar wind
variations, that may introduce some error. However,
except for extreme excursions in solar wind parameters, the
bow shock will not move enough to introduce significant
uncertainty in the timing of arrival of solar wind structures
observed upstream. (Uncertainties connected with other
factors such as planarity of features and the interpenetration
of variation phase planes are larger and affect the parameter
profiles and not merely the timing of arriving plasma.)

The bow shock location to which the data are shifted is included
in the output data records, among many other parameters.

In addition to shifting data to the bow shock nose, we shall also
shift ACE data, by each of four techniques, to the location of the
Wind spacecraft so that we can assess the predictability of
solar wind variations as a function of the shift technique, the
observer-target separation geometry, the variation level in the
solar wind, and the nature of the flow (e.g., fast vs. slow).

Minimum variance analysis (MVA) has long been used to
determine normals to discontinuity planes in the solar wind
magnetic field. See for example Sonnerup and Cahill, 1968.
In this approach, a 3x3 variance matrix

Mij = <Bi Bj> - <Bi><Bj> = (1/N) Sum(Bi Bj) - (1/N**2) Sum(Bi) Sum(Bj)

is formed, with averages taken over a set of N points
spanning the discontinuity and with i,j representing any
two spatial directions. The matrix is diagonalized, and
the eigenvector associated with the minimum eigenvalue
gives the minimum variance direction (MVD). The number
of points N to be used in the analysis, and the ratio of
intermediate to minimum eigenvalues to take as a lower
limit below which the MVD is considered not reliably
determined, are part of the “art” of MVA.
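The MVA steps can be sketched in plain Python as below, assuming the standard variance matrix Mij = <Bi Bj> - <Bi><Bj>. The Jacobi rotation solver is one convenient way to diagonalize a symmetric 3x3 matrix and is an assumption of this sketch; the text does not say which eigen-solver was used. The convergence tolerance is an absolute value chosen for fields of order a few nT, and all function names and the synthetic test series are hypothetical.

```python
import math


def variance_matrix(b):
    """M_ij = <Bi Bj> - <Bi><Bj> over a list of (bx, by, bz) samples."""
    n = len(b)
    mean = [sum(v[i] for v in b) / n for i in range(3)]
    return [[sum(v[i] * v[j] for v in b) / n - mean[i] * mean[j]
             for j in range(3)] for i in range(3)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]


def jacobi_eigen(m, sweeps=50, tol=1e-20):
    """Eigenvalues and eigenvectors (columns of v) of a symmetric 3x3 matrix."""
    a = [row[:] for row in m]
    v = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    for _ in range(sweeps):
        if sum(a[p][q] ** 2 for p in range(3) for q in range(p + 1, 3)) < tol:
            break
        for p in range(2):
            for q in range(p + 1, 3):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes the (p, q) off-diagonal element.
                theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                rot = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
                rot[p][p], rot[q][q], rot[p][q], rot[q][p] = c, c, s, -s
                a = matmul(transpose(rot), matmul(a, rot))
                v = matmul(v, rot)
    return [a[i][i] for i in range(3)], v


def minimum_variance_direction(b):
    """Return (MVD unit vector, intermediate-to-minimum eigenvalue ratio)."""
    eigenvalues, vectors = jacobi_eigen(variance_matrix(b))
    order = sorted(range(3), key=lambda i: eigenvalues[i])
    k = order[0]
    mvd = [vectors[i][k] for i in range(3)]
    ratio = eigenvalues[order[1]] / eigenvalues[k] if eigenvalues[k] > 0 else math.inf
    return mvd, ratio


# Synthetic 77-point series whose least-varying direction lies 30 deg from X
# in the X-Y plane.
c30, s30 = math.cos(math.radians(30.0)), math.sin(math.radians(30.0))
series = []
for k in range(77):
    t = 2.0 * math.pi * k / 77
    x, y, z = 5.0 + 0.1 * math.sin(3 * t), 4.0 * math.cos(t), 2.0 * math.sin(t)
    series.append((c30 * x - s30 * y, s30 * x + c30 * y, z))
mvd, ratio = minimum_variance_direction(series)
```

The sign of the returned MVD is arbitrary, so comparisons with a reference direction should use the absolute value of the dot product.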

3a.1. Technique 1, "Modified" MVA

Weimer et al (2003) applied the basic concepts of MVA
to determine an MVD for each point of a continuous time
series of interplanetary magnetic field data. In effect, they
assumed each point lay on a planar phase front whose
normal could be used, along with the solar wind flow velocity,
in the determination of when that value (assumed constant
everywhere on the plane) would be seen elsewhere in space.

After determining surprisingly good correspondence of
time-varying time shifts thus determined with shifts
determined by multi-spacecraft analysis (e.g., Weimer et al,
2002), an error was discovered in the Weimer et al. (2003)
application of MVA. In particular, the 1/N**2 in the
expression above was inadvertently replaced by 1/N.
When the correct expression above was used, agreement
with the multi-spacecraft time shift determinations
deteriorated.

Shortly thereafter, Bargatze et al. (2005) demonstrated that the
MVA equations used in Weimer et al (2003) corresponded
approximately to an MVA constrained by the condition that
the mean magnetic field vector over the analysis interval
should lie in the plane of minimum variance, that is, that
<B>·n (n is the MVD) ~ 0. The Weimer et al (2003) technique
came to be known, at least on a limited basis, as Modified MVA.

Much of our early work in this two-year effort utilized the
Weimer-provided code used in his 2003 analysis. None of the
final products made available from our effort are based on this
technique, although some interim products, no longer available,
were.

3a.2 Technique 2, MVAB-0

A Comment by Haaland et al (2006) pointed out that MVA
exactly constrained by the <B>·n = 0 condition was first
used by Sonnerup and Cahill (1968) and has been discussed
by Sonnerup and Scheible (1998). Such an MVA, called
MVAB-0 by Haaland et al, diagonalizes not the matrix M
(see above), but the matrix P*M*P where the symmetric
matrix P (Pij = delta_ij – ei ej; delta_ij is the Kronecker delta and e
is the unit vector in the direction of the mean magnetic field)
projects each vector B onto the plane perpendicular to e.
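The projection step can be illustrated with a short Python sketch. The example variance matrix and mean field are made-up numbers, and the function names are hypothetical. The sketch only builds P and P*M*P and checks that the mean-field direction e is mapped to zero by the constrained matrix, which is what enforces <B>·n = 0 on the remaining eigenvectors.

```python
import math


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def projection_matrix(mean_b):
    """P_ij = delta_ij - e_i e_j, with e the unit vector along the mean field."""
    norm = math.sqrt(sum(c * c for c in mean_b))
    e = [c / norm for c in mean_b]
    p = [[(1.0 if i == j else 0.0) - e[i] * e[j] for j in range(3)]
         for i in range(3)]
    return p, e


def constrained_matrix(m, mean_b):
    """Return P*M*P and e for a symmetric variance matrix M."""
    p, e = projection_matrix(mean_b)
    return matmul(p, matmul(m, p)), e


# Illustrative symmetric variance matrix (nT^2) and mean field (nT).
m = [[4.0, 1.0, 0.5],
     [1.0, 3.0, 0.2],
     [0.5, 0.2, 2.0]]
pmp, e = constrained_matrix(m, mean_b=(3.0, 4.0, 0.0))

# (P*M*P) e should vanish: e is an eigenvector with eigenvalue zero.
residual = [sum(pmp[i][j] * e[j] for j in range(3)) for i in range(3)]
```

In a full implementation the PFN would then be taken from the eigen-decomposition of P*M*P, excluding the zero eigenvalue associated with e.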

Weimer has developed and provided to us new code that
correctly implements the MVAB-0 approach.

We have used the MVAB-0 code generously provided by
Weimer in mid 2006. It is the only MVA code used in our
final products.

Weimer spent significant effort determining parameters
for the MVAB-0 technique, by seeking parameter
sets whose results gave best agreement with multi-spacecraft
determinations of phase front normals. In particular, he found,
and we have used, for the MVAB-0 technique optimal results
with 77 15-s points in each analysis (~19 min spans for
each MVD determination), eigenvalue ratio greater than or
equal to 5.2 (for a reliable MVD determination), and angle
between MVD and solar wind flow vector less than 73 deg.
(Larger angles lead to excessively long predicted delays.)

To eliminate spurious PFN determinations associated with data
gaps, we added the requirement that the interval between the
first and last point involved in each PFN determination should be
no more than 1.25 times what it would be in the absence of data
gaps.

3a.3 Technique 3, Cross Product (CP)

A totally distinct approach to determining a phase front
normal, that should be perfect for an ideal tangential
discontinuity, is to take a cross product of magnetic
field vectors just prior to, and following, a discontinuity.
Weimer has developed code that determines phase front
normals continuously using the cross product concept and
has generously also provided this to us. In a private
communication to us, Weimer cites the work of
Knetter et al (2004) as the inspiration
for developing this cross product (CP) code.

Weimer also spent significant effort determining parameters
for the CP technique, by seeking parameter
sets whose results gave best agreement with multi-spacecraft
determinations of phase front normals. In particular, he found,
and we have used, for the CP technique optimal results with the
angle between the “before” and “after” vectors greater than
13 deg, that these vectors should be based on 17 points each,
centered on the points 14 points before and after the point for
which the PFN is sought (thus a span of 46 points, or ~12 mins,
for the PFN determination for each point), and that the
component of the mean field vector normal to the phase front
should be less than 0.035 nT. He also used a 73 deg limiting angle
as for the MVAB-0 technique. As for the MVAB-0 technique,
we added the requirement that the interval between the first and
last point involved in each PFN determination should be no more
than 1.25 times what it would be in the absence of data gaps.
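The cross product test sequence can be sketched as follows, using the parameter values quoted above. Several details are assumptions of this sketch rather than statements about Weimer's code: the mean field used in the 0.035 nT test is taken over the whole span, the acute angle between the normal and the flow is used because the sign of the normal is arbitrary, and the data-gap test is omitted.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def mean_vector(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(3)]


def angle_deg(a, b):
    c = dot(a, b) / (norm(a) * norm(b))
    return math.degrees(math.acos(max(-1.0, min(1.0, c))))


def cp_normal(b, k, flow, offset=14, half_width=8,
              min_angle=13.0, max_bn=0.035, max_flow_angle=73.0):
    """Return a unit PFN for sample k of the 15-s series b, or None."""
    lo, hi = k - offset - half_width, k + offset + half_width
    if lo < 0 or hi >= len(b):
        return None
    before = mean_vector(b[lo:lo + 2 * half_width + 1])   # 17 points centered on k-14
    after = mean_vector(b[hi - 2 * half_width:hi + 1])    # 17 points centered on k+14
    if angle_deg(before, after) <= min_angle:
        return None
    n = cross(before, after)
    n = [c / norm(n) for c in n]
    if abs(dot(mean_vector(b[lo:hi + 1]), n)) >= max_bn:
        return None
    cos_flow = min(1.0, abs(dot(n, flow)) / norm(flow))
    if math.degrees(math.acos(cos_flow)) >= max_flow_angle:
        return None
    return n


# Ideal tangential discontinuity: the field turns from +Y to +Z at sample 30.
front = [(0.0, 5.0, 0.0)] * 30 + [(0.0, 0.0, 5.0)] * 30
normal = cp_normal(front, 30, flow=(-400.0, 0.0, 0.0))

# A steady field gives no usable cross product.
quiet = cp_normal([(0.0, 5.0, 0.0)] * 60, 30, flow=(-400.0, 0.0, 0.0))
```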

3a.4 Technique 4, JK/NP Combination of Techniques 2 and 3

Now, having two fundamentally different techniques for PFN
determination, we are able to add combinations of these two.
We devised one, called Technique 4, which is the one we in fact
used for producing the bow shock nose-shifted products discussed
in these notes. The technique consists of first applying the CP
method for a given point and its relevant neighbors. If an acceptable
PFN is determined, this is used for this point. If CP does not produce
an acceptable PFN (e.g., if the included angle between the “before”
and “after” vectors is less than 13 deg), then the MVAB-0 technique
is applied and its resultant PFN, if acceptable, is used. If neither CP
nor MVAB-0 techniques produce an acceptable PFN, that point is
marked for later interpolation, and a PFN is attempted for the next
point in the time series.
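The selection logic of Technique 4 amounts to a simple fallback chain. In the sketch below, the convention that a failed determination is represented by None, and the returned source labels, are assumptions made for illustration.

```python
def technique4(cp_pfn, mvab0_pfn):
    """Pick the PFN for one point: CP first, then MVAB-0, else mark for interpolation.

    cp_pfn and mvab0_pfn are unit normals, or None when that method's
    acceptance tests failed for this point.
    """
    if cp_pfn is not None:
        return cp_pfn, "CP"
    if mvab0_pfn is not None:
        return mvab0_pfn, "MVAB-0"
    return None, "interpolate"


choice_cp = technique4((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
choice_mva = technique4(None, (0.0, 1.0, 0.0))
choice_none = technique4(None, None)
```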

3a.5 Technique 5, DW Combination of Techniques 2 and 3.

Weimer and King (2008) took an alternative approach and required that
both the CP and MVAB-0 techniques should produce the same PFN (to
within some accuracy, arbitrarily set at 5 deg) in order to be
acceptable, otherwise the point was marked for later interpolation.
Weimer has provided the code implementing this technique, which
we call Technique 5.
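The agreement test of Technique 5 can be sketched as below. The text does not say which of the two agreeing normals is retained, so returning the CP normal is an arbitrary choice of this sketch, as is the use of the acute angle to allow for the sign ambiguity of the normals.

```python
import math


def technique5(cp_pfn, mvab0_pfn, max_diff_deg=5.0):
    """Accept a PFN only if the CP and MVAB-0 unit normals agree within max_diff_deg."""
    if cp_pfn is None or mvab0_pfn is None:
        return None
    cos_angle = min(1.0, abs(sum(a * b for a, b in zip(cp_pfn, mvab0_pfn))))
    if math.degrees(math.acos(cos_angle)) > max_diff_deg:
        return None  # marked for later interpolation
    return list(cp_pfn)


def unit_in_xy(angle_deg):
    a = math.radians(angle_deg)
    return (math.cos(a), math.sin(a), 0.0)


agree = technique5(unit_in_xy(0.0), unit_in_xy(3.0))
disagree = technique5(unit_in_xy(0.0), unit_in_xy(10.0))
```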

In all cases (Techniques 2-5), a PFN direction satisfying relevant
tests may or may not be determined. Points for which no acceptable
PFN is determined are marked. Then, in a second pass, for each such point, a PFN is
determined by linear interpolation between the last good and next
good PFN. In our implementations, the span across which such
interpolations are made can be no longer than 3 hours. Data
belonging to such extended gaps are not shifted nor included in
our new data products.
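The second-pass interpolation can be sketched as follows. Interpolating the normals component by component and then renormalizing to unit length are assumptions of this sketch; the text states only that the interpolation is linear between the last good and next good PFN and that the span may not exceed 3 hours.

```python
import math

MAX_GAP_S = 3 * 3600.0  # longest span across which PFNs are interpolated


def interpolate_pfns(times, pfns, max_gap_s=MAX_GAP_S):
    """Fill None entries between bracketing good PFNs; longer gaps stay None."""
    filled = list(pfns)
    good = [i for i, n in enumerate(pfns) if n is not None]
    for left, right in zip(good, good[1:]):
        span = times[right] - times[left]
        if right - left < 2 or span > max_gap_s:
            continue
        for i in range(left + 1, right):
            w = (times[i] - times[left]) / span
            vec = [(1.0 - w) * a + w * b for a, b in zip(pfns[left], pfns[right])]
            length = math.sqrt(sum(c * c for c in vec))
            filled[i] = [c / length for c in vec]
    return filled


short_gap = interpolate_pfns(
    [0.0, 15.0, 30.0, 45.0],
    [(1.0, 0.0, 0.0), None, None, (0.0, 1.0, 0.0)],
)
long_gap = interpolate_pfns(
    [0.0, 7200.0, 14400.0],
    [(1.0, 0.0, 0.0), None, (1.0, 0.0, 0.0)],
)
```

Points that remain None after this pass correspond to the extended gaps whose data are not shifted.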

We hope to modify this in the future, as an IMF that was not
varying over many hours would be highly predictable at the bow
shock nose yet would not lead to acceptable PFN's and hence
would not be "shifted" and included in our products. We have
searched the interval March-December, 1998, for such occurrences,
and find 45 days with multi-hour data gaps in shifted data despite
there being no gaps in the input ACE data. The average gap duration
is 4-6 hours, so the fraction of data lost in our shifted data set is
about (45*6) / (300*24) = ~4%. Fortunately, this is when the IMF is
most quiet and accurate bow shock nose predictions least critical.

Technique 5, the DW combination of 2 and 3, involves
more interpolation of PFNs than the individual MVAB-0 or CP
technique, or than the JK/NP combination thereof, which is one of
the main reasons we did our production work with Technique 4.
In the same search of March-December, 1998, data mentioned in
the preceding paragraph, we found 60 days having intervals of
3 hours or more having Technique 4 data but not Technique 5
data (because no good PFN's were produced over such intervals
by Technique 5.) Again assuming an average 6-hour gap duration,
the fraction of time for which we do not have Technique 5 data
relative to the time for which we have Technique 4 data is
(60 * 6) / (0.96 * 300 * 24) = ~5%.

We introduced above the time shift equation as
delta-t = n · (Rd – Ro) / (n · V). n is the phase plane normal,
determined by analysis of magnetic field data only. V is the
solar wind velocity, including the ~30 km/s in the Ygse direction
associated with the Earth’s orbital motion about the sun. We
initially shift 15-sec magnetic field data using the vector
velocity determined by interpolating velocity values most
immediately preceding and following the time tag of the
observed magnetic field value, as long as the interval of
interpolation is less than one hour. Magnetic field data
points whose most immediately preceding and following
velocity data are separated by more than an hour are not
carried forward into our output data products.
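The velocity interpolation rule can be sketched as below. The function name and data layout are hypothetical, and a bracketing interval of exactly one hour is treated as acceptable, which the text leaves open.

```python
from bisect import bisect_right


def interpolate_velocity(t, plasma_times, velocities, max_gap_s=3600.0):
    """Linearly interpolate the velocity vector to magnetic field time tag t.

    Returns None when t is not bracketed by two plasma records or when the
    bracketing records are more than max_gap_s apart.
    """
    i = bisect_right(plasma_times, t)
    if i == 0 or i == len(plasma_times):
        return None
    t0, t1 = plasma_times[i - 1], plasma_times[i]
    if t1 - t0 > max_gap_s:
        return None
    w = (t - t0) / (t1 - t0)
    return [(1.0 - w) * a + w * b for a, b in zip(velocities[i - 1], velocities[i])]


v_interp = interpolate_velocity(
    16.0, [0.0, 64.0], [(-400.0, 0.0, 0.0), (-420.0, 10.0, 0.0)]
)
v_gap = interpolate_velocity(
    16.0, [0.0, 4000.0], [(-400.0, 0.0, 0.0), (-420.0, 10.0, 0.0)]
)
```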

Shifting means changing the time tags of data records.
There is no changing of observed parameter values in the
process.

After shifting magnetic field data, plasma data are shifted
by using the time shift duration associated with the magnetic
field observation whose pre-shift time tag lies closest to the
plasma record’s time tag, so long as two time tags lie within
2 minutes of each other.
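The nearest-neighbor rule for plasma records can be sketched as follows. The names and layout are illustrative assumptions; a separation of exactly 2 minutes is treated as acceptable.

```python
from bisect import bisect_left


def plasma_shift(plasma_time, mag_times, mag_shifts, max_sep_s=120.0):
    """Return the shift (s) of the magnetic field record closest in pre-shift time.

    mag_times must be sorted; returns None if the closest record is more
    than max_sep_s away from plasma_time.
    """
    if not mag_times:
        return None
    i = bisect_left(mag_times, plasma_time)
    candidates = [j for j in (i - 1, i) if 0 <= j < len(mag_times)]
    nearest = min(candidates, key=lambda j: abs(mag_times[j] - plasma_time))
    if abs(mag_times[nearest] - plasma_time) > max_sep_s:
        return None
    return mag_shifts[nearest]


mag_times = [0.0, 16.0, 32.0, 48.0]             # pre-shift time tags (s)
mag_shifts = [3400.0, 3410.0, 3420.0, 3430.0]   # shifts already found for those records

shift_for_plasma = plasma_shift(30.0, mag_times, mag_shifts)
shifted_tag = 30.0 + shift_for_plasma
too_far = plasma_shift(500.0, mag_times, mag_shifts)
```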

Because the n and the V in the time shift equation vary at
various time scales, it sometimes happens that, if phase
front A is observed before phase front B, B may nevertheless
be predicted to arrive at a remote location (e.g., the BSN
location) before A arrives there. Such out-of-sequence
arrivals may be due to “overtaking” associated with speed
gradients or to “interpenetration” of variously oriented
phase planes (especially given a significant separation of
the locations of the BSN and of the observing spacecraft
in the direction normal to the solar wind flow).

This “interpenetration” is clearly unphysical and is one of
the primary shortcomings in our work. But there is no
physically justified alternative yet. Two different
alternatives have been considered. In Weimer’s earliest
work, he imagined that, for any pair of out-of-sequence
phase fronts, the later-arriving phase front would be
precluded from arriving by the earlier-arriving phase
front and so could be dropped from further consideration.
In more recent work, he imagined that the later-arriving
phase front would displace the earlier-arriving phase front,
so that the earlier-arriving phase front could be dropped
from further consideration.

Our sense is that, while we cannot specify the physical
processes that will occur and prevent interpenetration
and out-of-sequence arrivals, there is no good a priori
reason for favoring earlier-arriving or later-arriving
phase fronts in cases of out-of-sequence arrivals.
As such, our approach is to accept all shifted data as
belonging to the newly assigned time tags that each
record acquires via our simple time shift equation,
and to build 1-min data products with averages over
all points shifting into a given minute. We recognize
this involves an unphysical mixing of plasma elements
from differing domains. But in some sense it emulates
our ignorance of the dynamical processes that happen
in the real solar wind.
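The "accept everything" approach to out-of-sequence arrivals can be illustrated with a simplified binning sketch. It assigns each shifted record to the minute containing its new time tag and takes an unweighted mean; the actual products use the overlap and weighting rules described later and in Appendix 5, which are not reproduced here. Names are hypothetical.

```python
from collections import defaultdict


def one_minute_averages(shifted_times, values):
    """Average all values whose shifted time tag falls in the same minute.

    Returns {minute_start_s: mean}, keyed by the start of each minute.
    """
    bins = defaultdict(list)
    for t, v in zip(shifted_times, values):
        bins[int(t // 60)].append(v)
    return {minute * 60: sum(vs) / len(vs) for minute, vs in sorted(bins.items())}


# Four records observed in order; the second is predicted to arrive last,
# so records from either side of it end up mixed into the same minutes.
shifted = [3600.0, 3675.0, 3630.0, 3660.0]
values = [1.0, 2.0, 3.0, 4.0]
averages = one_minute_averages(shifted, values)
```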

[Note added 01/22/2007. It should be recognized that occasionally
our approach to averaging over all data shifting into a given minute
leads to a series of minutes whose parameter values alternate between
those characteristic of different plasma domains. That is, each minute
average may not simply be an average of values from two domains,
especially for Wind plasma data which starts at 92-s resolution.
For a recent example of this, see 2007/10/25 Wind SWE plasma data
prior to shifting at http://omniweb.sci.gsfc.nasa.gov/ftpbrowser/wind_swe_kp.html,
and see corresponding shifted data at
http://omniweb.gsfc.nasa.gov/form/sc_merge_min1.html. There is a
clean interplanetary shock in the unshifted data at 10:44 UT, while there
is an interval of ~50 minutes duration, spanning 11:47 - 12:36, of shifting
between pre-shock and post-shock parameter values in the minute averages
built from shifted data. Users must exercise care in using spacecraft-specific,
bow-shock-nose shifted data, or High Resolution OMNI data created from
them, in the presence of significant variability in field and plasma parameters
and in derived phase front normal directions.]

This section describes the common format of (a) the
1-min ACE, Wind, IMP 8 and Geotail spacecraft-specific data
sets that have been created at the bow shock nose,
(b) the ACE data sets shifted by various techniques
to the location of Wind and (c) the unshifted Wind data.
It also describes the shared format of the 1-min and 5-min
spacecraft-interspersed OMNI data sets.

The 1-min field and plasma averages are built from 15-s
magnetic field and ~1-min plasma records
whose shifted time tags indicate that any portion of the
data underlying the parameter values (i.e., the higher
resolution field values from which the 15 sec field
averages were determined or the plasma spectra from
which the bulk plasma parameters were determined)
were observed during the relevant minute of interest.
See Appendix 5 for more discussion of the averaging,
including the weighting used.

The 1-min time tags are at the start (not midpoint) of the
data used in the average.

The 5-min OMNI averages are built from the five relevant
1-min averages. The standard deviations in these averages
correspond to the process of building the 5-min averages and
do not retain knowledge of the standard deviations in the 1-min
averages.

Identification of spacecraft and of shift technique, for the
spacecraft-specific data sets, is captured in file names rather
than in data records. To review, we use the following identifiers:

Only shift technique 4 is used in the spacecraft-specific data sets
shifted to the bow shock nose and in the HRO data set created
from them, while each shift technique is used in the ACE data sets
shifted to Wind.

Note that standard deviations for the three vectors are given
as the square roots of the sum of squares of the standard deviations
in the component averages. The component averages are given in
the records but not their individual standard deviations.
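
As a minimal sketch of the relation just described (the function name
and the sample values are hypothetical, not taken from the OMNI
processing code), the vector standard deviation follows from the three
component standard deviations as:

```python
import math

def vector_sigma(sigma_x, sigma_y, sigma_z):
    """Root-sum-square of the three component standard deviations."""
    return math.sqrt(sigma_x**2 + sigma_y**2 + sigma_z**2)

# Hypothetical component standard deviations, in nT.
sigma_b = vector_sigma(1.0, 2.0, 2.0)
```

Because only this combined value is stored, the individual component
standard deviations cannot be recovered from a record.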

Footnote C:
The DBOT (Duration Between Observing Times) words:
For a given record, we take the 1-min average time shift and
estimate, using the solar wind velocity and the location of the
observing spacecraft, the time at which the corresponding
observation would have been made at the spacecraft. Then we take
the difference between this time and the corresponding time of the
preceding 1-min record and define this as DBOT1. This difference
would be one minute in the absence of PFN (phase front normal)
and/or flow velocity variations.
When this difference becomes negative, we have apparent out-of-
sequence arrivals of phase planes. That is, if plane A is observed
before plane B at the spacecraft, plane B is predicted to arrive at the
target before plane A. Searching for negative DBOT enables finding
of such cases.

DBOT2 is like DBOT1 except that the observation time for the
current 1-min record is compared to the latest (most time-advanced)
previous observation time and not to the observation time of the
previous record. Use of DBOT2 helps to find extended intervals
of out-of-sequence arrivals.
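
The two DBOT definitions can be illustrated with a short sketch. It
assumes the estimated observation times at the spacecraft have already
been computed, one per 1-min output record and expressed in seconds;
the function name and the sample times are hypothetical.

```python
def dbot_words(obs_times):
    """Return (DBOT1, DBOT2) lists, in seconds, for successive 1-min records.

    obs_times: estimated observation times at the spacecraft, in
    output-record order. The first record has no predecessor, so its
    entries are None.
    """
    dbot1 = [None]
    dbot2 = [None]
    latest = obs_times[0]  # most time-advanced observation time seen so far
    for prev, cur in zip(obs_times, obs_times[1:]):
        dbot1.append(cur - prev)    # compare with the preceding record
        dbot2.append(cur - latest)  # compare with the latest earlier time
        latest = max(latest, cur)
    return dbot1, dbot2

# Hypothetical sequence: the third record was observed unusually late.
d1, d2 = dbot_words([0, 60, 200, 100, 150, 260])
```

In this made-up sequence DBOT1 is negative only for the fourth record,
while DBOT2 stays negative for the fourth and fifth records, which is
the sense in which DBOT2 helps to delimit extended out-of-sequence
intervals.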

We do not capture out-of-sequence-arrival information at 15-s
resolution but only at 1-min resolution. The standard deviation in
the 1-min averaged time shifts may be used to help find cases of
out-of-sequence 15-s data.

It is an important current research topic to determine under what
conditions single-spacecraft observations of solar wind field and
plasma variations upstream (and possibly off to the side of) the
Earth's magnetosphere can lead to reliable predictions of the solar
wind variations to occur at the Earth's bow shock. Goodness of
predictability may depend on many variables, including the spacecraft-
to-bow shock separation geometry, the level of variation in the solar
wind, the nature of the solar wind (e.g., fast vs. slow flows) and the
technique used to shift data from the observation point to the bow
shock.

It is possible to assess predictability goodness by multiple techniques.
One would be to compare single-spacecraft predictions with the results
of multi-spacecraft analyses, as was done by Weimer et al. (2003), but
done over a statistically significant number of independent time
intervals. Another would be to search out a statistically significant
number of major solar wind field and/or plasma discontinuities or other
variations, and to note agreement level between spacecraft A's
observations and spacecraft B's observations as shifted to A
(A - shifted_B cross correlation functions - ccf's). A third would be to
simply compute A - shifted_B ccf's in a large number of fixed-duration
time intervals, each characterized by A-B separation geometry, mean
physical parameter values in the intervals, parameter variance levels
and the shift technique.

We are taking the last approach of the above paragraph. We have
built a database of ccf's for field and plasma parameters for ~6000
4-hour intervals in 1998-2000, for ACE data shifted to the Wind
spacecraft by each of the four shift techniques discussed in Section 3a
of these notes. While a final and comprehensive assessment of
goodness of predictability as a function of ACE-Wind separation,
solar wind flow state, solar wind variation level and shift technique
lies in the near future, we report herein some preliminary results.
It is intended that the final assessment will be published and will
be reproduced here when completed.

We focus here on predictability of Bz variations, since Bz is the most geoeffective of the
solar wind parameters. Imagine that computed 4-hour ccf's are the dependent
variable in an independent variable space consisting of Wind-ACE separation
geometry (along and across the flow direction), the means and standard
deviations for each physical parameter in the 4-hour intervals, and the shift
technique. For any bin in independent variable space, we find a certain number
of intervals whose ccf's make up a distribution itself having a mean, median,
standard deviation, etc. We examine the medians of these distributions as
indicating dependence of predictability on the independent variables.
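
The binning step can be sketched as follows, assuming each 4-hour
interval has been reduced to a pair (value of one independent variable,
ccf). The function name, the bin edges and the sample numbers are
hypothetical and serve only to illustrate taking a median per bin.

```python
from statistics import median

def median_ccf_by_bin(intervals, bin_edges):
    """Group (value, ccf) pairs by bin of value; return the median ccf per bin.

    bin_edges: ascending edges; bin k covers [edges[k], edges[k+1]).
    Pairs falling outside all bins are ignored.
    """
    groups = {}
    for value, ccf in intervals:
        for lo, hi in zip(bin_edges, bin_edges[1:]):
            if lo <= value < hi:
                groups.setdefault((lo, hi), []).append(ccf)
                break
    return {b: median(c) for b, c in groups.items()}

# Hypothetical (impact parameter in Re, Bz ccf) pairs.
sample = [(5.0, 0.90), (12.0, 0.80), (20.0, 0.85), (40.0, 0.70), (55.0, 0.80)]
medians = median_ccf_by_bin(sample, [0, 15, 30, 60])
```

The median is used here, as in the text, because the ccf distributions
within a bin are not Gaussian.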

With no selection of parameters but exercising each of the 4 shift techniques,
we find four distributions with numbers of 4-hour intervals ranging between
5109 and 5288 and with medians ranging between 0.691 and 0.706. Standard
deviations in the (non-Gaussian) distributions of medians are ~0.31.
Thus, at least in the case of looking over all the data, the various shift
techniques are giving statistically equivalent results. In fact this is also the
case for virtually all the binned analyses we've done.

Except where noted, additional results in this section are for shifts by
"technique4" that we have used in our production work.

To first assess dependence of predictability on the transverse separation of
Wind and ACE, we do a series of runs binned only by ACE-Wind
Impact Parameter (IP).
We find that the median of the Bz ccf distributions increases through the
values 0.35, 0.34, 0.54, 0.63, 0.75, 0.85, 0.87 as the IP decreases through
the bins >150, 120-150, 90-120, 60-90, 30-60, 15-30 and 0-15 Re.
The numbers of 4-hour intervals in these distributions range from
159 (120-150 Re) to 2001 (30-60 Re).
It is interesting that the ccf is nearly the same for the 120-150 Re bin and the >150 Re
bin, and that the ccf is nearly the same for the 0-15 and 15-30 Re bins. The latter may
be due to the occurrence of rotational discontinuities which, because of their
propagation relative to the ambient solar wind, are not well accommodated
by the shift assumptions. If we define a Bz scale length as the distance over
which the Bz ccf falls by 10% (cf. Richardson and Paularena, 2001), then the
scale length is approximately (135-15)/(0.85-0.35)*0.10 = 24 Re.
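
The scale-length arithmetic can be reproduced with a short sketch. It
assumes a linear falloff of the median ccf between bin centers at 15 Re
and 135 Re, and it reads "falls by 10%" as a drop of 0.10 in the ccf,
which is the reading that gives about 24 Re. The function and parameter
names are illustrative only.

```python
def ccf_scale_length(ip_far, ip_near, ccf_near, ccf_far, drop=0.10):
    """Distance (Re) over which the ccf falls by `drop`, for a linear falloff."""
    re_per_unit_ccf = (ip_far - ip_near) / (ccf_near - ccf_far)
    return re_per_unit_ccf * drop

# Bin centers (Re) and median ccf values quoted in the text.
scale_re = ccf_scale_length(135.0, 15.0, 0.85, 0.35)
```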

Interestingly, when we look at medians of Bz ccf distributions involving
MVAB-0-determined and CP-determined PFN's in the IP = 0-15 Re and 15-30 Re
bins, we find 0.87 (0-15 Re) and 0.84 (15-30 Re) for both methods. That both
methods give the same result may run counter to an expectation that the MVAB-0
method may be good for PFN determination for both tangential and rotational
discontinuities, while the CP method should be better for PFN determination for
non-propagating tangential discontinuities having no field component normal to
the discontinuity plane.

To examine the dependence of predictability on the solar wind variability level,
we did a series of runs for various values of the standard deviation in the
4-hour Bz average (sigma-Bz). Upon limiting the ACE-Wind Impact Parameter
to be less than
60 Re, we found median values of the Bz ccf distributions of 0.66, 0.76, 0.82,
0.85, and 0.91 in the sigma-Bz bins 0-1, 1-2, 2-3, 3-4, >4 nT. The numbers of
intervals per distribution range between 277 (sigma-Bz > 4 nT) and 1129
(1 < sigma-Bz < 2 nT). Removing the constraint on the Wind-ACE IP almost
doubled the numbers of 4-hour intervals per sigma-Bz run, but decreased the
median ccf's only by 7% (at largest sigma-Bz) to 13% (at smallest sigma-Bz).
The conclusion here is that the higher the variation level in Bz, the more
predictable are bow shock nose Bz variations, given upstream Bz observations.

To examine possible dependence of predictability on the X distance upstream,
we define bins by X(ACE) - X(Wind). For a series of runs all having Wind-ACE
IP < 60 Re, we find medians in Bz ccf distributions of 0.78, 0.77, 0.81, 0.74
for bins of <50 Re, 50-125 Re, 125-200 Re, >200 Re respectively. The numbers
of intervals in the distributions range between 345 (delta X > 200 Re) and 1248
(125 Re < delta X < 200 Re). The conclusion here is that, while there's a hint of
a downturn in the Bz ccf at delta X > 200 Re, there's no major dependence of
predictability on delta X.

Finally, to assess dependence of predictability on flow speed, we do runs in
flow speed bins <350, 350-450, 450-550 and >550 km/s for IP < 60 Re and for
sigma-Bz > 1 nT. We find medians in the Bz ccf distributions of 0.84, 0.83,
0.79, 0.72 as the speed increases through the four indicated bins. Numbers
of 4-hour intervals in the bins range from 346 (V > 550 km/s) to 1210
(350-450 km/s). Predictability in Bz variations decreases modestly as the
solar wind flow speed increases.

While the key issue for our new products is the extent to which
solar wind variations observed remote from the Earth's bow shock
may be used to infer variations at the bow shock nose, it is also
of interest to review whether there are systematic differences
in parameter values between pairs of input data sets. This is
largely because the spacecraft-interspersed data set (i.e., High
Resolution OMNI - HRO) should not have excessive parameter
changes due to transition between one source spacecraft and
another, and so that the parameter values included in the new
HRO are most likely "true" at least at the observation points.

These interfaces determine the slopes and intercepts in the linear
regressions P1 = a + b*P2, where P represents any of the relevant
physical parameters (or, as special cases, log N and log T).
The interfaces also determine the uncertainties in the slope and
intercept, cross correlation coefficients, and the rms deviations
between the data points on the scatter plots and the best fit lines.
The "1" and the "2" refer to the members of any spacecraft pair.
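
The quantities listed above can be illustrated with a short sketch. It
assumes an ordinary least-squares fit with the usual standard-error
formulas; the fitting method actually used by the interfaces is not
stated here, and the function name and sample values are hypothetical.

```python
import math

def linear_fit(p2, p1):
    """Least-squares fit p1 = a + b*p2.

    Returns (a, b, sigma_a, sigma_b, r, rms): intercept, slope, their
    standard errors, the correlation coefficient, and the rms deviation
    of the points from the best-fit line.
    """
    n = len(p1)
    mx = sum(p2) / n
    my = sum(p1) / n
    sxx = sum((x - mx) ** 2 for x in p2)
    syy = sum((y - my) ** 2 for y in p1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(p2, p1))
    b = sxy / sxx
    a = my - b * mx
    resid_sq = sum((y - (a + b * x)) ** 2 for x, y in zip(p2, p1))
    rms = math.sqrt(resid_sq / n)
    s2 = resid_sq / (n - 2)  # residual variance estimate
    sigma_b = math.sqrt(s2 / sxx)
    sigma_a = math.sqrt(s2 * (1.0 / n + mx**2 / sxx))
    r = sxy / math.sqrt(sxx * syy)
    return a, b, sigma_a, sigma_b, r, rms

# Hypothetical paired values from spacecraft 2 (x) and spacecraft 1 (y).
a, b, sigma_a, sigma_b, r, rms = linear_fit([1.0, 2.0, 3.0, 4.0],
                                            [2.1, 3.9, 6.1, 7.9])
```

For the log N and log T special cases, the same function would be
applied to the logarithms of the paired values, for example via
math.log10.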

Our work uses linear regressions of logs of densities and temperatures
rather than the values of N and T themselves because these parameters
are more log-normally distributed than normally distributed.

Note that the documentation of our hourly resolution OMNI 2 data set
at
http://omniweb.gsfc.nasa.gov/html/ow_data.html
extensively discusses intercomparisons of hourly ACE, Wind and
IMP 8 magnetic field and plasma data. The rationale for the present
discussion is to address the significantly extended time span over which
data are now available for intercomparison.

For Wind/SWE, we would use the Key Parameter (KP) data,
but would normalize them, if any normalizations were appropriate, to
the nonlinear fit (NLF) data for which admirably small uncertainty
estimates had been derived by Kasper et al. (2006).
We have built a series of parameter-specific tables summarizing the
results of the annual and multi-year cross correlations. For plasma
comparisons, we used P(Wind/NLF) = a + b * P(2) where now P(2)
might be ACE or IMP 8 or Wind/KP.

Magnetic field comparisons

If the Wind magnetic field data are right, then IMP field magnitude and components
(absolute values) would need to be increased by 1.5 to 2 percent to match Wind.
Thus there are systematic Wind-IMP magnetic field component differences of
~0.3 to ~0.4 nT at ± 20 nT. Averaged over 1996-2000,
when Bz(Wind) = 0, Bz(IMP) = -0.06 nT, indicating good IMP zero level determination.
There is no clear evidence of any time dependence in the Wind-IMP relations
in magnetic field data.

By contrast, Wind version 4 and ACE magnetic field data agree to within 1 percent
for virtually all components and years, and to within 0.03 nT in Bz
at Bz = 0.

The Geotail magnetic field data available as we were creating these new data sets
were known to have preliminary and incorrect Bz offsets. See the discussions in
Section 2 and in Appendix 2.

Flow speed comparisons

Flow speeds agree to within 1%. That is
|V(Wind/NLF) - V(Z)| / V(Wind/NLF) < 1%,
where Z = Wind/KP, ACE, or IMP. For the case of Z = ACE and IMP, V(Wind/NLF)
exceeds V(Z). V(Wind/KP) is virtually identical to V(Wind/NLF).

Flow direction angle comparisons

Flow azimuth angles between any source pair agree to within 1 degree
over the ± 10 deg range. Flow elevation angle agreement level depends
on the source pair. Wind/NLF and Wind/KP agree to within 1 degree
over the ± 10 degree range. The same is true for Wind/NLF vs. ACE
except that near +10 deg, Wind/NLF exceeds ACE by ~1.5 deg.
The IMP elevation angle exceeds the Wind/NLF elevation angle by
an amount ranging from ~1.2 deg at -10 deg to ~4 degrees at +10 degrees.
An apparent IMP flow elevation angle offset of ~2 deg has been recognized
for many years. The present analysis shows for the first time an elevation
angle dependence in this offset. There are no evident time dependences in
the relations between any source pair for flow speed or direction angles.

Density and temperature comparisons

Wind and ACE proton parameters. Previously we used Wind/SWE parameters
based on anisotropic nonlinear fits to Wind/SWE plasma distributions through
November 2004, and we used cross-normalized Wind/SWE Key Parameter data
thereafter. Now, owing to their greater "robustness," the only Wind/SWE
proton data we use for 1995-current are the cross-normalized SWE KP data.
(SWE KP cross-normalization is to the SWE nonlinear fit data.)
All recent results are given in Appendix 2.

Users may find the full old OMNI documentation package, made before
February 15, 2013, at http://omniweb.sci.gsfc.nasa.gov/html/HROdocum_old.html.
(New upgrades for data cross-normalizations were made after February 15, 2013.)

Using hourly averaged data, the previous section has revealed the
mainly small systematic differences for each magnetic
field parameter between Wind on the one hand and ACE and
IMP 8 on the other hand. It has also revealed systematic
differences for each plasma parameter between the nonlinear
fit-based Wind/SWE data on the one hand and the Wind/SWE
key parameter data, the ACE/SWEPAM data and the
MIT/IMP 8 data on the other hand.

The question is now whether and for which parameters we should
cross normalize the data to be included in the spacecraft-interspersed
high resolution OMNI data set. (Note that we do no such
normalizations for our new spacecraft-specific data sets.) We choose
to minimize cross-normalizations for multiple reasons. First, since
we use 3-hour swaths of same-spacecraft data in 1-min OMNI,
there are at most only 0.55% of minute-to-minute transitions that
would involve a change of source spacecraft. In fact, the actual
fraction of transitions between sources is very much less than this.
Second, we do not expect this data set to be used for long
term solar wind variation studies; the hourly resolution OMNI data
set is more appropriate for this.
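
The 0.55% bound quoted above follows from the segment length alone, as
this small sketch shows (the variable names are illustrative):

```python
SEGMENT_MINUTES = 3 * 60  # one 3-hour swath of same-spacecraft data

# At most one minute-to-minute transition per segment crosses a segment
# boundary, so at most 1 in 180 transitions can change source spacecraft.
max_fraction = 1 / SEGMENT_MINUTES
max_percent = 100 * max_fraction  # about 0.556, quoted in the text as 0.55%
```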

So, as for the present hourly OMNI data set, we shall cross-normalize
only plasma densities and temperatures.

To normalize Wind/KP density and temperature data to Wind/NLF, we use the same equations we used for hourly OMNI:

We have undertaken to eliminate spikes from the Wind and
IMP 8 magnetic field and plasma data sets. Owing to their
relatively clean state, we have judged it unnecessary to
despike the ACE data. Wind magnetic field data were
despiked with the simple approach of eliminating any record
with a field magnitude or component absolute value in excess
of 70 nT. Other data were despiked with the approach
described as follows.

We test a point using its two predecessors and two followers.
We require that the 1st and last of these 5 points be within
15 mins (for B data) or 60 mins (for plasma data). The first
two and last two points in a data segment separated from its
neighbors by intervals of >15 min (B) or >60 min (plasma)
go untested by the algorithms discussed here. (We visually
scanned output data looking for obvious spikes thereby missed,
and deleted these.)

Any record having a declared spike in any of its physical
parameters is rejected. For a parameter value to be declared
a spike, it must satisfy two criteria.

Let P represent the value of the physical parameter being
tested. Define <P> as the mean value of parameter P over the
1st, 2nd, 4th, and 5th points of the current set, and let sigma(P)
be the RMS deviation in this average. The first test for a spike
is to have |P-<P>| > 4 * sigma(P).
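
The first test can be sketched as follows. This is an illustration
under stated assumptions, not the production despiking code: sigma(P)
is taken as the RMS deviation of the four neighboring values about
their own mean, times are in minutes, and only the first criterion
given here is implemented. The function name is hypothetical.

```python
import math

def fails_first_spike_test(times_min, values, max_span_min, n_sigma=4.0):
    """Apply the first spike criterion to the middle of 5 consecutive points.

    Returns None when the 5 points span more than max_span_min (the point
    goes untested), otherwise True if |P - <P>| > n_sigma * sigma(P).
    """
    if len(times_min) != 5 or len(values) != 5:
        raise ValueError("need exactly 5 points")
    if times_min[4] - times_min[0] > max_span_min:
        return None
    vals = list(values)
    neighbors = vals[:2] + vals[3:]  # 1st, 2nd, 4th and 5th points
    mean = sum(neighbors) / 4
    sigma = math.sqrt(sum((v - mean) ** 2 for v in neighbors) / 4)
    return abs(vals[2] - mean) > n_sigma * sigma

# Hypothetical flow speeds (km/s) at 1-min spacing, plasma limit of 60 min.
flag = fails_first_spike_test([0, 1, 2, 3, 4],
                              [400.0, 402.0, 480.0, 398.0, 400.0], 60.0)
```

A point flagged by this test is only a spike candidate, since a
parameter value must satisfy both criteria to be declared a spike.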

For completeness, we note that the Wind/SWE plasma
data came to us already having been run through MIT
despike software that required that the relative difference
between the point being tested and the median of that
point and its immediate predecessor and immediate
successor should be less than 0.1, 0.5 and 1.0 for flow
speed, density and thermal speed, respectively. Some
points accepted by the MIT software were rejected by ours.

We assume the geocentric direction to the bow shock nose is
parallel to the (aberrated) solar wind flow direction:
Rt = - |Rt| * V/|V|. (V and |V| are determined from the
aberration-corrected V values provided in the input plasma
data sets, but with 29.8 km/s, the mean orbital speed of the
Earth about the sun, added to their Vy values.)
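
A minimal sketch of this relation is given below. It takes the
geocentric distance |Rt| to the bow shock nose as an input, since its
determination is not covered in this passage; the function name and
the sample values are hypothetical.

```python
import math

EARTH_ORBITAL_SPEED_KM_S = 29.8  # mean orbital speed quoted in the text

def bow_shock_nose_position(vx, vy, vz, standoff_re):
    """Return Rt = -|Rt| * V/|V| after adding Earth's orbital speed to Vy."""
    vy_ab = vy + EARTH_ORBITAL_SPEED_KM_S
    speed = math.sqrt(vx**2 + vy_ab**2 + vz**2)
    return tuple(-standoff_re * c / speed for c in (vx, vy_ab, vz))

# Hypothetical purely anti-sunward 400 km/s flow, 14 Re standoff distance.
rt = bow_shock_nose_position(-400.0, 0.0, 0.0, 14.0)
```

For this made-up input the nose direction is tilted by a few degrees
toward negative Y relative to the Earth-Sun line, which is the effect
of the orbital-motion term.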

We have input records with (typically shifted) time tags T and
parameter values P. The parameters are either ~15-sec magnetic
field or ~1-min plasma parameters. Magnetic field parameters
are typically averages of yet higher resolution magnetic field
parameters that have been obtained between some first time
Tf and some last time Tl. Plasma parameters are as derived from
some distribution function accumulated between some first time
Tf and some last time Tl. The relation between the input record
time tag T and the first and last times (Tf & Tl) of the data on
which the record's parameter values are based is dataset-specific.
The duration Tl-Tf varies between records for some data sets but
not for others.

We want to create output records tagged at the start of every
minute. The parameter values in the output records should be
based, as much as possible, on observations made during that
minute. This means that, for a given output minute, we want to
do weighted averages over any input values whose underlying
data were obtained, in whole or part, during the output record's
minute of interest. One weighting factor is the extent to which
the parameters of the input record cover the desired output
interval. The other factor is the extent to which the parameters
of the input record are determined by data taken outside the
minute of interest. These weights may be written as follows.

Let Tf* = Tf or Tf* = the first instant of the output record,
whichever is later. Let Tl* = Tl or Tl* = the last instant of the
output record, whichever is earlier. Then Tl* - Tf* = the part
of the duration of the input record which lies within the duration
of the output record. Let S = Tl* - Tf*. The fraction of the input
record which lies within the output record time span is
(Tl* - Tf*)/(Tl - Tf). Let this fraction be F. Note that F = S/(Tl - Tf).
For data sets having the same durations [i.e., (Tl - Tf) values] for
all records, we have F = constant * S. ACE and Wind field records
and plasma records each have a common Tl-Tf, while both IMP 8
field and plasma records have varying Tl-Tf values.
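
The definitions of S and F can be sketched as follows, with all times
in seconds and a 60-s output record. The function name and the sample
records are hypothetical.

```python
def overlap_weights(tf, tl, out_start, out_length=60.0):
    """Return (S, F) for an input record spanning [tf, tl].

    S is the part of the input record lying within the output record;
    F = S / (tl - tf). Both are 0 when there is no overlap.
    """
    tf_star = max(tf, out_start)              # later of Tf and output start
    tl_star = min(tl, out_start + out_length) # earlier of Tl and output end
    s = max(0.0, tl_star - tf_star)
    return s, s / (tl - tf)

# A 15-s field record straddling the start of the output minute at t = 60 s.
s_field, f_field = overlap_weights(50.0, 65.0, 60.0)
# A 92-s plasma record that covers the whole output minute.
s_plasma, f_plasma = overlap_weights(30.0, 122.0, 60.0)
```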

To get parameter values <P> for the output records, find all input
records whose parameters are based on observations taken within
the output minute of interest. Define the weighted averages as
<P> = SUM (Si * Fi * Pi)/SUM (Si * Fi), where i indexes the
relevant input records and where the sums are over all the
relevant input records.

There is interest in defining variance measures of the P values.
These may be attributed to variances within the contributing Pi
values and to the spread of the Pi values about the mean <P>
value. We consider below only the variability in our Pi values
about <P>.

Since we build the mean using weighting, we do so also for the
variance, using the expression

V = [SUM ((Si*Fi) * (Pi-<P>)**2)] / SUM (Si*Fi) = <P**2> - <P>**2
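
The weighted mean and variance can be sketched together as follows,
using weights Si*Fi as in the expressions above. The function name and
the sample values are hypothetical.

```python
def weighted_mean_and_variance(s, f, p):
    """Weighted mean <P> and variance V with weights w_i = S_i * F_i."""
    w = [si * fi for si, fi in zip(s, f)]
    wsum = sum(w)
    mean = sum(wi * pi for wi, pi in zip(w, p)) / wsum
    var = sum(wi * (pi - mean) ** 2 for wi, pi in zip(w, p)) / wsum
    return mean, var

# Two hypothetical input records contributing to one output minute.
mean_p, var_p = weighted_mean_and_variance([60.0, 30.0], [1.0, 0.5], [4.0, 8.0])
```

For these made-up inputs the weights are 60 and 15, giving a mean of
4.8 and a variance of 2.56, which equals the weighted mean of P**2
(25.6) minus the square of the weighted mean (23.04).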

Five-minute averages are computed from the 1-min averages. The
5-min averages tagged with minute = 0 are built from 1-min
averages tagged as being for minutes 0, 1, 2, 3 and 4. Likewise for
5-min averages tagged with minutes 5, 10 ... 55.
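
The tagging rule can be sketched as follows. The sketch assumes an
unweighted mean over whichever 1-min values are present in each 5-min
block and takes the standard deviation as the spread of those 1-min
values about the 5-min mean; the treatment of missing minutes in the
production code is not specified here. The function names and sample
values are hypothetical.

```python
import math

def five_min_tag(minute):
    """Minute tag (0, 5, ..., 55) of the 5-min average containing a 1-min record."""
    return (minute // 5) * 5

def five_min_averages(one_min):
    """one_min: dict {minute tag: value}. Returns {5-min tag: (mean, sigma)}."""
    blocks = {}
    for minute, value in one_min.items():
        blocks.setdefault(five_min_tag(minute), []).append(value)
    out = {}
    for tag, vals in blocks.items():
        mean = sum(vals) / len(vals)
        sigma = math.sqrt(sum((v - mean) ** 2 for v in vals) / len(vals))
        out[tag] = (mean, sigma)
    return out

# Hypothetical 1-min values for minutes 0-6 of an hour.
avgs = five_min_averages({0: 1.0, 1: 2.0, 2: 3.0, 3: 4.0, 4: 5.0, 5: 10.0, 6: 10.0})
```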

(This Appendix was originally written as we were creating HRO from ACE, Wind and IMP data.
The variant used in adding Geotail data to HRO is described near the end of this Appendix.)

There will be many minutes when shifted data are available from
multiple spacecraft. In building High Resolution OMNI (HRO),
we shall follow the hourly OMNI practice of selecting data from
one source when multiple sources are available. However, instead
of following the hourly OMNI practice of selecting the source
for each unit time increment, for our HRO products we shall select
and intersperse 3-hour data segments [both field and plasma data together]
from among our multiple sources.

There are three criteria we shall use, namely, (a) the source-Earth
Impact Parameter (IP, separation transverse to the flow, with allowance
for Earth's orbital motion), (b) the completeness of magnetic field data
coverage in the 3-hour interval, (c) source continuity. This latter means
that if neither (a) nor (b) provides a strong discriminant between sources,
we shall favor using the source used in the previous 3-hour segment.

We make discrimination between spacecraft pairs algorithmically as
follows. Let ScX and ScY represent the two spacecraft being compared.

For 3-hour intervals with some data available from each of three
spacecraft (early 1998 through mid-2000), we have determined
the favored spacecraft for each of the three possible pairings of
spacecraft and then determined by inspection which one spacecraft
was preferable to both of the other two spacecraft.

When we added Geotail data to HRO, we treated the 3-spacecraft-based HRO data set as a single
data set and the Geotail data set as a second data set, and used the 2-spacecraft algorithm described
above for determining whether, for each 3-hour interval, Geotail data should replace the data
previously in HRO. We carefully used Impact Parameter appropriate to the spacecraft used
in HRO for the interval. Further, if the spacecraft used in HRO for a given interval is different
than the spacecraft used in HRO for the preceding interval, we ignore the "continuity factor" by
setting E = 0 in the above algorithm.

Note that upon making extensions to HRO, we frequently have data from
one source spacecraft reaching closer to current data than data from other
source(s). In such cases, most current data will be used in HRO with no "F tests"
relative to other spacecraft. But later, when data from other source(s) become
available, inter-spacecraft tests will be performed and the originally included
data may be replaced by data from the other source(s).