Abstract

The reconstruction and calibration algorithms used to calculate missing transverse momentum (\(E_{\text {T}}^{\text {miss}}\) ) with the ATLAS detector exploit energy deposits in the calorimeter and tracks reconstructed in the inner detector as well as the muon spectrometer. Various strategies are used to suppress effects arising from additional proton–proton interactions, called pileup, concurrent with the hard-scatter processes. Tracking information is used to distinguish contributions from the pileup interactions using their vertex separation along the beam axis. The performance of the \(E_{\text {T}}^{\text {miss}}\) reconstruction algorithms, especially with respect to the amount of pileup, is evaluated using data collected in proton–proton collisions at a centre-of-mass energy of 8 \(\text {TeV}\) during 2012, and results are shown for a data sample corresponding to an integrated luminosity of \(20.3\, \mathrm{fb}^{-1}\). The simulation and modelling of \(E_{\text {T}}^{\text {miss}}\) in events containing a Z boson decaying to two charged leptons (electrons or muons) or a W boson decaying to a charged lepton and a neutrino are compared to data. The acceptance for different event topologies, with and without high transverse momentum neutrinos, is shown for a range of threshold criteria for \(E_{\text {T}}^{\text {miss}}\) , and estimates of the systematic uncertainties in the \(E_{\text {T}}^{\text {miss}}\) measurements are presented.

1 Introduction

The Large Hadron Collider (LHC) provided proton–proton (pp) collisions at a centre-of-mass energy of 8 \(\text {TeV}\) during 2012. Momentum conservation transverse to the beam axis1 implies that the transverse momenta of all particles in the final state should sum to zero. Any imbalance may indicate the presence of undetectable particles such as neutrinos or new, stable particles escaping detection.

The missing transverse momentum (\(\vec {E}_{{\mathrm{T}}}^{\mathrm{miss}}\)) is reconstructed as the negative vector sum of the transverse momenta (\(\vec {p_{\text {T}}}\) ) of all detected particles, and its magnitude is represented by the symbol \(E_{\mathrm {T}}^{\mathrm {miss}}\). The measurement of \(E_{\text {T}}^{\text {miss}}\) strongly depends on the energy scale and resolution of the reconstructed “physics objects”. The physics objects considered in the \(E_{\text {T}}^{\text {miss}}\) calculation are electrons, photons, muons, \(\tau \)-leptons, and jets. Momentum contributions not attributed to any of the physics objects mentioned above are reconstructed as the \(E_{\text {T}}^{\text {miss}}\) “soft term”. Several algorithms for reconstructing the \(E_{\text {T}}^{\text {miss}}\) soft term utilizing a combination of calorimeter signals and tracks in the inner detector are considered.

The \(E_{\text {T}}^{\text {miss}}\) reconstruction algorithms and calibrations developed by ATLAS for 7 \(\text {TeV}\) data from 2010 are summarized in Ref. [1]. The 2011 and 2012 datasets are more affected by contributions from additional pp collisions, referred to as “pileup”, concurrent with the hard-scatter process. Various techniques have been developed to suppress such contributions. This paper describes the pileup dependence, calibration, and resolution of the \(E_{\text {T}}^{\text {miss}}\) reconstructed with different algorithms and pileup-mitigation techniques.

The performance of \(E_{\text {T}}^{\text {miss}}\) reconstruction algorithms, or “\(E_{\text {T}}^{\text {miss}}\) performance”, refers to the use of derived quantities like the mean, width, or tail of the \(E_{\text {T}}^{\text {miss}}\) distribution to study pileup dependence and calibration. The \(E_{\text {T}}^{\text {miss}}\) reconstructed with different algorithms is studied in both data and Monte Carlo (MC) simulation, and the level of agreement between the two is compared using datasets in which events with a leptonically decaying W or Z boson dominate. The W boson sample provides events with intrinsic \(E_{\text {T}}^{\text {miss}}\) from non-interacting particles (e.g. neutrinos). Contributions to the \(E_{\text {T}}^{\text {miss}}\) due to mismeasurement are referred to as fake \(E_{\text {T}}^{\text {miss}}\). Sources of fake \(E_{\text {T}}^{\text {miss}}\) may include \({p}_{\text {T}}\) mismeasurement, miscalibration, and particles passing through uninstrumented regions of the detector. In MC simulations, the \(E_{\text {T}}^{\text {miss}}\) from each algorithm is compared to the true \(E_{\text {T}}^{\text {miss}}\) (\(E_{\mathrm {T}}^{\mathrm {miss,True}}\)), which is defined as the magnitude of the vector sum of \(\vec {p_{\text {T}}}\) of stable2 weakly interacting particles from the hard-scatter collision. Then the selection efficiency after an \(E_{\text {T}}^{\text {miss}}\)-threshold requirement is studied in simulated events with high-\({p}_{\text {T}}\) neutrinos (such as top-quark pair production and vector-boson fusion \(H \rightarrow \tau \tau \)) or possible new weakly interacting particles that escape detection (such as the lightest supersymmetric particles).

This paper is organized as follows. Section 2 gives a brief introduction to the ATLAS detector. Section 3 describes the data and MC simulation used as well as the event selections applied. Section 4 outlines how the \(E_{\text {T}}^{\text {miss}}\) is reconstructed and calibrated while Sect. 5 presents the level of agreement between data and MC simulation in W and Z boson production events. Performance studies of the \(E_{\text {T}}^{\text {miss}}\) algorithms on data and MC simulation are shown for samples with different event topologies in Sect. 6. The choice of jet selection criteria used in the \(E_{\text {T}}^{\text {miss}}\) reconstruction is discussed in Sect. 7. Finally, the systematic uncertainty in the absolute scale and resolution of the \(E_{\text {T}}^{\text {miss}}\) is discussed in Sect. 8. To provide a reference, Table 1 summarizes the different \(E_{\text {T}}^{\text {miss}}\) terms discussed in this paper.

Table 1

Summary of definitions for \(E_{\text {T}}^{\text {miss}}\) terms used in this paper

Term

Brief description

Intrinsic \(E_{\text {T}}^{\text {miss}}\)

Missing transverse momentum arising from the presence of neutrinos or other non-interacting particles in an event. In case of simulated events the true \(E_{\text {T}}^{\text {miss}}\) (\(E_\mathrm{T}^\mathrm{miss,True}\)) corresponds to the \(E_{\text {T}}^{\text {miss}}\) in such events defined as the magnitude of the vector sum of \(\vec {p_{\text {T}}}\) of non-interacting particles computed from the generator information

Fake \(E_{\text {T}}^{\text {miss}}\)

Missing transverse momentum arising from the miscalibration or misidentification of physics objects in the event. It is typically studied in \(Z \rightarrow \mu \mu \) events where the intrinsic \(E_{\text {T}}^{\text {miss}}\) is normally expected to be zero

Hard terms

The component of the \(E_{\text {T}}^{\text {miss}}\) computed from high-\({p}_{\text {T}}\) physics objects, which includes reconstructed electrons, photons, muons, \(\tau \)-leptons, and jets

Soft terms

Typically low-\(p_{\text {T}}\) calorimeter energy deposits or tracks, depending on the soft-term definition, that are not associated to physics objects included in the hard terms

Pileup-suppressed \(E_{\text {T}}^{\text {miss}}\)

All \(E_{\text {T}}^{\text {miss}}\) reconstruction algorithms in Sect. 4.1.2 except the Calorimeter Soft Term, which does not apply pileup suppression

Object-based

This refers to all reconstruction algorithms in Sect. 4.1.2 except the Track \(E_{\text {T}}^{\text {miss}}\), namely the Calorimeter Soft Term, Track Soft Term, Extrapolated Jet Area with Filter, and Soft-Term Vertex-Fraction algorithms. These algorithms make use of reconstructed physics objects such as electrons, photons, muons, \(\tau \)-leptons, and jets in the \(E_{\text {T}}^{\text {miss}}\) reconstruction

2 ATLAS detector

The ATLAS detector [2] is a multi-purpose particle physics apparatus with a forward-backward symmetric cylindrical geometry and nearly 4\(\pi \) coverage in solid angle. For tracking, the inner detector (ID) covers the pseudorapidity range of \(|\eta |\) < 2.5, and consists of a silicon-based pixel detector, a semiconductor tracker (SCT) based on microstrip technology, and, for \(|\eta |\) < 2.0, a transition radiation tracker (TRT). The ID is surrounded by a thin superconducting solenoid providing a 2 T magnetic field, which allows the measurement of the momenta of charged particles. A high-granularity electromagnetic sampling calorimeter based on lead and liquid argon (LAr) technology covers the region of \(|\eta |<3.2\). A hadronic calorimeter based on steel absorbers and plastic-scintillator tiles provides coverage for hadrons, jets, and \(\tau \)-leptons in the range of \(|\eta |\) < 1.7. LAr technology using a copper absorber is also used for the hadronic calorimeters in the end-cap region of 1.5 < \(|\eta |\) < 3.2 and for electromagnetic and hadronic measurements with copper and tungsten absorbing materials in the forward region of 3.1 < \(|\eta |\) < 4.9. The muon spectrometer (MS) surrounds the calorimeters. It consists of three air-core superconducting toroid magnet systems, precision tracking chambers to provide accurate muon tracking out to \(|\eta |\)\(=\) 2.7, and additional detectors for triggering in the region of \(|\eta |\) < 2.4. A precision measurement of the track coordinates is provided by layers of drift tubes at three radial positions within \(|\eta |\) < 2.0. For 2.0 < \(|\eta |\) < 2.7, cathode-strip chambers with high granularity are instead used in the innermost plane. The muon trigger system consists of resistive-plate chambers in the barrel (\(|\eta |\) < 1.05) and thin-gap chambers in the end-cap regions (1.05 < \(|\eta |\) < 2.4).

3 Data samples and event selection

ATLAS recorded pp collisions at a centre-of-mass energy of 8 \(\text {TeV}\) with a bunch crossing interval (bunch spacing) of \(50\,\mathrm{ns}\) in 2012. The resulting integrated luminosity is 20.3 \(\mathrm{fb}^{-1}\) [3]. Multiple inelastic \(pp \) interactions occurred in each bunch crossing, and the mean number of inelastic collisions per bunch crossing (\(\langle \mu \rangle \)) over the full dataset is 21 [4], exceptionally reaching as high as about 70.

Data are analysed only if they satisfy the standard ATLAS data-quality assessment criteria [5]. Jet-cleaning cuts [5] are applied to minimize the impact of instrumental noise and out-of-time energy deposits in the calorimeter from cosmic rays or beam-induced backgrounds. This ensures that the residual sources of \(E_{\mathrm {T}}^{\mathrm {miss}}\) mismeasurement due to those instrumental effects are suppressed.

3.1 Track and vertex selection

The ATLAS detector measures the momenta of charged particles using the ID [6]. Hits from charged particles are recorded and are used to reconstruct tracks; these are used to reconstruct vertices [7, 8].

Each vertex must have at least two tracks with \({p}_{\text {T}} \) > 0.4 \(\text {GeV}\); for the primary hard-scatter vertex (PV), the requirement on the number of tracks is raised to three. The PV in each event is selected as the vertex with the largest value of \(\Sigma \,({p}_{\text {T}})^2\), where the scalar sum is taken over all the tracks matched to the vertex. The following track selection criteria3 [7] are used throughout this paper, including the vertex reconstruction:

\({p}_{\text {T}}\) > 0.4 \(\text {GeV}\),

\(|\eta |\) < 2.5,

a minimum number of hits in the pixel and SCT detectors.

These tracks are then matched to the PV by applying the following selections:

\(|d_0|\) < 1.5 mm,

\(|z_0 \sin (\theta )|\) < 1.5 mm.

The transverse (longitudinal) impact parameter \(d_0\) (\(z_0\)) is the transverse (longitudinal) distance of the track from the PV and is computed at the point of closest approach to the PV in the plane transverse to the beam axis. The requirements on the number of hits ensure that the track has an accurate \({p}_{\text {T}}\) measurement. The \(|\eta |\) requirement keeps only the tracks within the ID acceptance, and the requirement of \({p}_{\text {T}}\) > 0.4 \(\text {GeV}\) ensures that the track reaches the outer layers of the ID. Tracks with low \({p}_{\text {T}}\) have large curvature and are more susceptible to multiple scattering.
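As a minimal sketch, the track-to-PV matching above can be expressed as a simple filter. This is illustrative Python, not ATLAS software; the function and argument names are assumptions, while the cut values are those quoted in the text:

```python
import math

def passes_pv_matching(pt_gev, eta, d0_mm, z0_mm, theta):
    """Track selection and PV matching cuts quoted in the text:
    pT > 0.4 GeV, |eta| < 2.5, |d0| < 1.5 mm, |z0 sin(theta)| < 1.5 mm."""
    return (
        pt_gev > 0.4
        and abs(eta) < 2.5
        and abs(d0_mm) < 1.5
        and abs(z0_mm * math.sin(theta)) < 1.5
    )
```

For example, a 1 GeV track at \(\eta = 1.2\) with \(d_0 = 0.5\) mm and \(z_0 \sin(\theta) = 1.0\) mm passes, while the same track with \(d_0 = 2.0\) mm fails.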

The average spread along the beamline direction for pp collisions in ATLAS during 2012 data taking is around 50 mm, and the typical track \(z_0\) resolution for tracks with \(|\eta |\) < 0.2 and 0.5 < \(p_{\text {T}}\) < 0.6 \(\text {GeV}\) is 0.34 mm. The typical track \(d_0\) resolution is around 0.19 mm for the same \(\eta \) and \(p_{\text {T}}\) ranges, and both the \(z_0\) and \(d_0\) resolutions improve with higher track \({p}_{\text {T}}\).

Pileup effects come from two sources: in-time and out-of-time. In-time pileup is the result of multiple pp interactions in the same LHC bunch crossing. It is possible to distinguish the in-time pileup interactions by using their vertex positions, which are spread along the beam axis. At \(\langle \mu \rangle \)\(=\) 21, the efficiency to reconstruct and select the correct vertex for \(\mathrm{Z} \rightarrow \mu {}\mu \) simulated events is around 93.5% and rises to more than 98% when requiring two generated muons with \({p}_{\text {T}}\) > 10 \(\text {GeV}\) inside the ID acceptance [10]. When vertices are separated along the beam axis by a distance smaller than the position resolution, they can be reconstructed as a single vertex. Each track in the reconstructed vertex is assigned a weight based upon its compatibility with the fitted vertex, which depends on the \(\chi ^2\) of the fit. The fraction of \(\mathrm{Z} \rightarrow \mu {}\mu \) reconstructed vertices with more than 50% of the sum of track weights coming from pileup interactions is around 3% at \(\langle \mu \rangle \)\(=\) 21 [7, 10]. Out-of-time pileup comes from pp collisions in earlier and later bunch crossings, which leave signals in the calorimeters, where the charge collection time can be as long as 450 ns. This is longer than the 50 ns between subsequent collisions and matters because the integration time of the calorimeters is significantly larger than the time between bunch crossings. By contrast, the charge collection time of the silicon tracker is less than 25 ns.

3.2 Event selection for \(\mathrm{Z} \rightarrow \ell{}\ell\)

The “standard candle” for evaluation of the \(E_{\mathrm {T}}^{\mathrm {miss}}\) performance is \(\mathrm{Z} \rightarrow \ell{}\ell\) events (\(\ell =e\) or \(\mu \)). They are produced without neutrinos, apart from a very small number originating from heavy-flavour decays in jets produced in association with the Z boson. The intrinsic \(E_{\mathrm {T}}^{\mathrm {miss}}\) is therefore expected to be close to zero, and the \(E_{\mathrm {T}}^{\mathrm {miss}}\) distributions are used to evaluate the modelling of the effects that give rise to fake \(E_{\text {T}}^{\text {miss}}\) .

Candidate \(\mathrm{Z} \rightarrow \ell{}\ell\) events are required to pass an electron or muon trigger [11, 12]. The lowest \({p}_{\text {T}}\) threshold for the unprescaled single-electron (single-muon) trigger is \(p_{\text {T}}\) > 25 (24) \(\text {GeV}\), and both triggers apply a track-based isolation as well as quality selection criteria for the particle identification. Triggers with higher \({p}_{\text {T}}\) thresholds, without the isolation requirements, are used to improve acceptance at high \({p}_{\text {T}}\) . These triggers require \(p_{\text {T}}\) > 60 (36) \(\text {GeV}\) for electrons (muons). Events are accepted if they pass any of the above trigger criteria. Each event must contain at least one primary vertex with a z displacement from the nominal pp interaction point of less than \(200\,\mathrm{mm}\) and with at least three associated tracks.

The offline selection of \(\mathrm{Z} \rightarrow \mu {}\mu \) events requires the presence of exactly two identified muons [13]. An identified muon is reconstructed in the MS and is matched to a track in the ID. The combined ID\(+\)MS track must have \({p}_{\text {T}}\) > 25 \(\text {GeV}\) and \(|\eta |\) < 2.5. The z displacement of the muon track from the primary vertex is required to be less than 10 mm. An isolation criterion is applied to the muon track, where the scalar sum of the \({p}_{\text {T}}\) of additional tracks within a cone of size \(\Delta R\)\(=\)\(\sqrt{(\Delta \eta )^2+(\Delta \phi )^2}\)\(=\) 0.2 around the muon is required to be less than 10% of the muon \(p_{\text {T}}\) . In addition, the two leptons are required to have opposite charge, and the reconstructed dilepton invariant mass, \(m_{\ell \ell }\), is required to be consistent with the Z boson mass: 66 < \(m_{\ell \ell }\) < 116 \(\text {GeV}\).
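The offline \(\mathrm{Z} \rightarrow \mu {}\mu \) cuts above can be sketched as follows, using a massless approximation for the dilepton invariant mass. This is illustrative Python, not ATLAS software; the dictionary layout is an assumption, units are GeV and radians, and the 10 mm z-displacement cut is omitted for brevity:

```python
import math

def dilepton_mass(pt1, eta1, phi1, pt2, eta2, phi2):
    # Invariant mass of two approximately massless leptons:
    # m^2 = 2 pT1 pT2 (cosh(d_eta) - cos(d_phi)).
    return math.sqrt(2.0 * pt1 * pt2 * (math.cosh(eta1 - eta2) - math.cos(phi1 - phi2)))

def passes_z_mumu(mu1, mu2):
    """Opposite charge, pT > 25 GeV, |eta| < 2.5, track isolation below
    10% of the muon pT, and 66 < m_ll < 116 GeV, as described in the text."""
    if mu1["charge"] * mu2["charge"] >= 0:  # require opposite charge
        return False
    for mu in (mu1, mu2):
        if mu["pt"] <= 25.0 or abs(mu["eta"]) >= 2.5:
            return False
        if mu["iso_sumpt"] >= 0.10 * mu["pt"]:  # cone-0.2 track isolation
            return False
    mll = dilepton_mass(mu1["pt"], mu1["eta"], mu1["phi"],
                        mu2["pt"], mu2["eta"], mu2["phi"])
    return 66.0 < mll < 116.0
```

For two back-to-back 45 \(\text {GeV}\) muons at \(\eta = 0\), the massless approximation gives \(m_{\ell \ell } = 90\) \(\text {GeV}\), inside the window.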

The \(E_{\text {T}}^{\text {miss}}\) modelling and performance results obtained in \(\mathrm{Z} \rightarrow \mu {}\mu \) and \(Z\rightarrow e e\) events are very similar. For the sake of brevity, only the \(\mathrm{Z} \rightarrow \mu {}\mu \) distributions are shown in all sections except for Sect. 6.6.

3.3 Event selection for \(W\rightarrow \ell {}\nu\)

Leptonically decaying W bosons (\(W\rightarrow \ell {}\nu\)) provide an important event topology with intrinsic \(E_{\text {T}}^{\text {miss}}\); the \(E_{\text {T}}^{\text {miss}}\) distribution for such events is presented in Sect. 5.2. Similar to \(\mathrm{Z} \rightarrow \ell{}\ell\) events, a sample dominated by leptonically decaying W bosons is used to study the \(E_{\mathrm {T}}^{\mathrm {miss}}\) scale in Sect. 6.2.2, the resolution of the \(E_{\text {T}}^{\text {miss}}\) direction in Sect. 6.3, and the impact on a reconstructed kinematic observable in Sect. 6.4.

The \(E_{\text {T}}^{\text {miss}}\) distributions for W boson events in Sect. 5.2 use the electron final state. These electrons are selected with \(|\eta |\) < 2.47, are required to meet the “medium” identification criteria [14] and satisfy \({p}_{\text {T}}\) > 25 \(\text {GeV}\). Electron candidates in the region 1.37 < \(|\eta |\) < 1.52 suffer from degraded momentum resolution and particle identification due to the transition from the barrel to the end-cap detector and are therefore discarded in these studies. The electrons are required to be isolated, such that the sum of the energy in the calorimeter within a cone of size \(\Delta R\)\(=\) 0.3 around the electron is less than 14% of the electron \({p}_{\text {T}}\) . The summed \({p}_{\text {T}}\) of other tracks within the same cone is required to be less than 7% of the electron \({p}_{\text {T}}\) . The calorimeter isolation variable [14] is corrected by subtracting estimated contributions from the electron itself, the underlying event [15], and pileup. The electron tracks are then matched to the PV by applying the following selections:

\(|d_0|\) < 5.0 mm,

\(|z_0 \sin (\theta )|\) < 0.5 mm.

The W boson selection is based on the single-lepton triggers and the same lepton selection criteria as those used in the \(\mathrm{Z} \rightarrow \ell{}\ell\) selection. Events are rejected if they contain more than one reconstructed lepton. Selections on the \(E_{\mathrm {T}}^{\mathrm {miss}}\) and transverse mass (\(m_{\mathrm {T}}\)) are applied to reduce the multi-jet background with one jet misidentified as an isolated lepton. The transverse mass is calculated from the lepton and the \(\vec {E}_{{\mathrm{T}}}^{\mathrm{miss}}\),

\(m_{\mathrm {T}} = \sqrt{2\, p_{\mathrm {T}}^{\ell }\, E_{\mathrm {T}}^{\mathrm {miss}} \left( 1-\cos \Delta \phi \right) },\)  (1)

where \(p_{\mathrm T}^{\ell }\) is the transverse momentum of the lepton and \(\Delta \phi \) is the azimuthal angle between the lepton and \(\vec {E}_{{\mathrm{T}}}^{\mathrm{miss}}\) directions. Both the \(m_{\mathrm {T}}\) and \(E_{\text {T}}^{\text {miss}}\) are required to be greater than 50 \(\text {GeV}\). These selections can bias the event topology and its phase space, so they are only used when comparing simulation to data in Sect. 5.2, as they substantially improve the purity of W bosons in data events.
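A one-line helper reproduces the transverse-mass calculation and the two 50 \(\text {GeV}\) requirements. This is an illustrative Python sketch; the function names are assumptions:

```python
import math

def transverse_mass(pt_lep, met, dphi):
    # mT = sqrt(2 pT^lep ETmiss (1 - cos(dphi))), all momenta in GeV.
    return math.sqrt(2.0 * pt_lep * met * (1.0 - math.cos(dphi)))

def passes_w_selection(pt_lep, met, dphi):
    # Both mT and ETmiss must exceed 50 GeV, as in the text.
    return met > 50.0 and transverse_mass(pt_lep, met, dphi) > 50.0
```

For example, a lepton and \(E_{\text {T}}^{\text {miss}}\) of 40 \(\text {GeV}\) each, back to back (\(\Delta \phi = \pi \)), give \(m_{\mathrm {T}} = 80\) \(\text {GeV}\), but the event still fails the \(E_{\text {T}}^{\text {miss}}\) > 50 \(\text {GeV}\) requirement.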

The \(E_{\text {T}}^{\text {miss}}\) modelling and performance results obtained in \(W\rightarrow e\nu \) and \(W\rightarrow \mu \nu \) events are very similar. For the sake of brevity, only one of the two is considered in the following two sections: \(E_{\text {T}}^{\text {miss}}\) distributions in \(W\rightarrow e\nu \) events are presented in Sect. 5.2 and the performance studies use \(W\rightarrow \mu \nu \) events in Sect. 6. When studying the \(E_{\text {T}}^{\text {miss}}\) tails, both final states are considered in Sect. 6.6, because the \(\eta \) coverage and reconstruction performance differ between electrons and muons.

3.4 Monte Carlo simulation samples

Table 2 summarizes the MC simulation samples used in this paper. The \(\mathrm{Z} \rightarrow \ell{}\ell\) and \(W\rightarrow \ell {}\nu\) samples are generated with Alpgen [16] interfaced with Pythia [17] (denoted by Alpgen\(+\)Pythia) to model the parton shower, hadronization, and underlying event, using the PERUGIA2011C set [18] of tunable parameters. One exception is the \(Z \rightarrow \tau \tau \) sample with leptonically decaying \(\tau \)-leptons, which is generated with Alpgen interfaced with Herwig [19], with the underlying event modelled using Jimmy [20] and the AUET2 tunes [21]. Alpgen is a multi-leg generator that provides tree-level calculations for diagrams with up to five additional partons. The matrix-element MC calculations are matched to a model of the parton shower, underlying event, and hadronization. The main backgrounds to \(\mathrm{Z} \rightarrow \ell{}\ell\) and \(W\rightarrow \ell {}\nu\) are events with one or more top quarks (\(t\bar{t}\) and single-top-quark processes) and diboson production (WW, WZ, ZZ). The \(t\bar{t}\) and tW processes are generated with Powheg [22] interfaced with Pythia [17] for hadronization and parton showering, with PERUGIA2011C for the underlying-event modelling. All the diboson processes are generated with Sherpa [23]. Powheg is a leading-order generator with corrections at next-to-leading order in \(\alpha _{\text {S}}\), whereas Sherpa is a multi-leg generator at tree level.

To study event topologies with high jet multiplicities and to investigate the tails of the \(E_{\mathrm {T}}^{\mathrm {miss}}\) distributions, \(t\bar{t}\) events with at least one leptonically decaying W boson are considered in Sect. 6.6. The single top quark (tW) production is considered with at least one leptonically decaying W boson. Both the \(t\bar{t}\) and tW processes contribute to the W and Z boson distributions shown in Sect. 5 as well as Z boson distributions in Sects. 4, 6, and 8 that compare data and simulation. A supersymmetric (SUSY) model comprising pair-produced 500 GeV gluinos each decaying to a \(t\bar{t}\) pair and a neutralino is simulated with Herwig\(++\) [24]. Finally, to study events with forward jets, the vector-boson fusion (VBF) production of \(H \rightarrow \tau \tau \) , generated with Powheg\(+\)Pythia8 [25], is considered. Both \(\tau \)-leptons are forced to decay leptonically in this sample.

Table 2

Generators, cross-section normalizations, PDF sets, and MC tunes used in this analysis

To estimate the systematic uncertainties in the data/MC ratio arising from the modelling of the soft hadronic recoil, \(E_{\text {T}}^{\text {miss}}\) distributions simulated with different MC generator, parton-shower, and underlying-event models are compared. The estimation of systematic uncertainties is performed using a comparison of data and MC simulation, as shown in Sect. 8.2. The following combinations of generators and parton-shower models are considered: Sherpa, Alpgen\(+\)Herwig, Alpgen\(+\)Pythia, and Powheg\(+\)Pythia8. The corresponding underlying-event tunes are listed in Table 2. Parton distribution functions are taken from CT10 [30] for Powheg and Sherpa samples and CTEQ6L1 [38] for Alpgen samples.

Generated events are propagated through a Geant4 simulation [39, 40] of the ATLAS detector. Pileup collisions are generated with Pythia8 for all samples, and are overlaid on top of simulated hard-scatter events before event reconstruction. Each simulation sample is weighted by its corresponding cross-section and normalized to the integrated luminosity of the data.

4 Reconstruction and calibration of the \(E_{\text {T}}^{\text {miss}}\)

Several algorithms have been developed to reconstruct the \(E_{\text {T}}^{\text {miss}}\) in ATLAS. They differ in the information used to reconstruct the \(p_{\text {T}}\) of the particles, using either energy deposits in the calorimeters, tracks reconstructed in the ID, or both. This section describes these various reconstruction algorithms, and the remaining sections discuss the agreement between data and MC simulation as well as performance studies.

4.1 Reconstruction of the \(E_{\text {T}}^{\text {miss}}\)

The \(E_{\text {T}}^{\text {miss}}\) reconstruction uses calibrated physics objects to estimate the amount of missing transverse momentum in the detector. The \(E_{\text {T}}^{\text {miss}}\) is calculated using the components along the x and y axes:

\(E_{x(y)}^{\mathrm {miss}} = E_{x(y)}^{\mathrm {miss},e} + E_{x(y)}^{\mathrm {miss},\gamma } + E_{x(y)}^{\mathrm {miss},\tau } + E_{x(y)}^{\mathrm {miss,jets}} + E_{x(y)}^{\mathrm {miss},\mu } + E_{x(y)}^{\mathrm {miss,soft}},\)  (2)

where each term is calculated as the negative vectorial sum of transverse momenta of energy deposits and/or tracks. To avoid double counting, energy deposits in the calorimeters and tracks are matched to reconstructed physics objects in the following order: electrons (e), photons (\(\gamma \)), the visible parts of hadronically decaying \(\tau \)-leptons (\(\tau _{\mathrm{had}{\text {-}}\mathrm{vis}}\); labelled as \(\tau \)), jets and muons (\(\mu \)). Each type of physics object is represented by a separate term in Eq. (2). The signals not associated with physics objects form the “soft term”, whereas those associated with the physics objects are collectively referred to as the “hard term”.
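As a sketch of how the components are assembled, each object category contributes the negative sum of its transverse momenta. This is illustrative Python under simple assumptions (each object reduced to a (pT, φ) pair in GeV/radians); it is not the ATLAS implementation:

```python
import math

def met_xy(objects_by_category):
    """objects_by_category maps a category label ('e', 'gamma', 'tau',
    'jets', 'mu', 'soft') to a list of (pt, phi) pairs.
    Each component is the negative sum of the object transverse momenta."""
    ex = -sum(pt * math.cos(phi)
              for objs in objects_by_category.values() for pt, phi in objs)
    ey = -sum(pt * math.sin(phi)
              for objs in objects_by_category.values() for pt, phi in objs)
    return ex, ey
```

A single unbalanced 30 \(\text {GeV}\) jet at \(\phi = 0\) yields \((E_{x}^{\mathrm {miss}}, E_{y}^{\mathrm {miss}}) = (-30, 0)\) \(\text {GeV}\).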

The magnitude and azimuthal angle4 (\(\phi ^\mathrm{miss}\)) of \(\vec {E}_{{\mathrm{T}}}^{\mathrm{miss}}\) are calculated as:

\(E_{\mathrm {T}}^{\mathrm {miss}} = \sqrt{(E_{x}^{\mathrm {miss}})^2 + (E_{y}^{\mathrm {miss}})^2}, \quad \phi ^{\mathrm {miss}} = \arctan (E_{y}^{\mathrm {miss}}/E_{x}^{\mathrm {miss}}).\)  (3)

The total transverse energy in the detector, labelled as \(\Sigma E_{\mathrm {T}}\), quantifies the total event activity and is an important observable for understanding the resolution of the \(E_{\text {T}}^{\text {miss}}\), especially with increasing pileup contributions. It is defined as:

\(\Sigma E_{\mathrm {T}} = \sum p_{\mathrm {T}}^{e} + \sum p_{\mathrm {T}}^{\gamma } + \sum p_{\mathrm {T}}^{\tau } + \sum p_{\mathrm {T}}^{\mathrm {jets}} + \sum p_{\mathrm {T}}^{\mu } + \sum p_{\mathrm {T}}^{\mathrm {soft}},\)  (4)

which is the scalar sum of the transverse momenta of reconstructed physics objects and soft-term signals that contribute to the \(E_{\text {T}}^{\text {miss}}\) reconstruction. The physics objects included in \(\sum p_{\mathrm {T}}^{\mathrm {soft}}\) depend on the \(E_{\text {T}}^{\text {miss}}\) definition, so both calorimeter objects and track-based objects may be included in the sum, despite differences in \({p}_{\text {T}}\) resolution.
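Given the x and y components, the magnitude and azimuthal angle of \(\vec {E}_{{\mathrm{T}}}^{\mathrm{miss}}\), as well as \(\Sigma E_{\mathrm {T}}\), follow directly. A minimal Python sketch, with objects reduced to illustrative (pT, φ) pairs:

```python
import math

def met_magnitude_phi(ex_miss, ey_miss):
    # ETmiss = sqrt(Ex^2 + Ey^2); phi_miss measured in the transverse plane.
    return math.hypot(ex_miss, ey_miss), math.atan2(ey_miss, ex_miss)

def sum_et(objects_by_category):
    # Scalar sum of the pT of all physics objects and soft-term
    # signals entering the ETmiss reconstruction.
    return sum(pt for objs in objects_by_category.values() for pt, _phi in objs)
```

Note that `atan2` is used rather than a plain arctangent so that \(\phi ^{\mathrm {miss}}\) lands in the correct quadrant.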

4.1.1 Reconstruction and calibration of the \(E_{\text {T}}^{\text {miss}}\) hard terms

The hard term of the \(E_{\text {T}}^{\text {miss}}\), which is computed from the reconstructed electrons, photons, muons, \(\tau \)-leptons, and jets, is described in more detail in this section.

Electrons are reconstructed from clusters in the electromagnetic (EM) calorimeter which are associated with an ID track [14]. Electron identification is restricted to the range of \(|\eta |\) < 2.47, excluding the transition region between the barrel and end-cap EM calorimeters, 1.37 < \(|\eta |\) < 1.52. They are calibrated at the EM scale5 with the default electron calibration, and those satisfying the “medium” selection criteria [14] with \(p_{\text {T}} >10\)\(\text {GeV}\) are included in the \(E_{\text {T}}^{\text {miss}}\) reconstruction.

The photon reconstruction is also seeded from clusters of energy deposited in the EM calorimeter and is designed to separate electrons from photons. Photons are calibrated at the EM scale and are required to satisfy the “tight” photon selection criteria with \(p_{\text {T}}\) > 10 \(\text {GeV}\) [14].

Muon candidates are identified by matching an ID track with an MS track or segment [13]. MS tracks are used for 2.5 < \(|\eta |\) < 2.7 to extend the \(\eta \) coverage. Muons are required to satisfy \({p}_{\text {T}} \) > 5 \(\text {GeV}\) to be included in the \(E_{\text {T}}^{\text {miss}}\) reconstruction. The contribution of muon energy deposited in the calorimeter is taken into account using either parameterized estimates or direct measurements, to avoid double counting a small fraction of their momenta.

Jets are reconstructed from three-dimensional topological clusters (topoclusters) [41] of energy deposits in the calorimeter using the anti-\(k_t\) algorithm [42] with a distance parameter R\(=\) 0.4. The topological clustering algorithm suppresses noise by forming contiguous clusters of calorimeter cells with significant energy deposits. The local cluster weighting (LCW) [43, 44] calibration is used to account for different calorimeter responses to electrons, photons and hadrons. Each cluster is classified as coming from an EM or hadronic shower, using information from its shape and energy density, and calibrated accordingly. The jets are reconstructed from calibrated topoclusters and then corrected for in-time and out-of-time pileup as well as the position of the PV [4]. Finally, the jet energy scale (JES) corrects for jet-level effects by restoring, on average, the energy of reconstructed jets to that of the MC generator-level jets. The complete procedure is referred to as the LCW+JES scheme [43, 44]. Without changing the average calibration, additional corrections are made based upon the internal properties of the jet (global sequential calibration) to reduce the flavour dependence and energy leakage effects [44]. Only jets with calibrated \({p}_{\text {T}}\) greater than 20 \(\text {GeV}\) are used to calculate the jet term \(E_{{x(y)}}^{\mathrm {miss,jets}}\) in Eq. (2), and the optimization of the 20 \(\text {GeV}\) threshold is discussed in Sect. 7.

To suppress contributions from jets originating from pileup interactions, a requirement on the jet vertex-fraction (JVF) [4] may be applied to selected jet candidates. Tracks matched to jets are extrapolated back to the beamline to ascertain whether they originate from the hard scatter or from a pileup collision. The JVF is then computed as the ratio shown below:

\(\mathrm {JVF} = \dfrac{\sum p_{\mathrm {T}}^{\mathrm {track,jet,PV}}}{\sum p_{\mathrm {T}}^{\mathrm {track,jet}}},\)  (5)

This is the ratio of the scalar sum of transverse momentum of all tracks matched to the jet and the primary vertex to the \({p}_{\text {T}}\) sum of all tracks matched to the jet, where the sum is performed over all tracks with \(p_{\text {T}}\) > 0.5 \(\text {GeV}\) and \(|\eta |\) < 2.5 and the matching is performed using the “ghost-association” procedure [45, 46].

The JVF distribution is peaked toward 1 for hard-scatter jets and toward 0 for pileup jets. No JVF selection requirement is applied to jets that have no associated tracks. Requirements on the JVF are made in the STVF, EJAF, and TST \(E_{\text {T}}^{\text {miss}}\) algorithms as described in Table 3 and Sect. 4.1.3.
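A minimal sketch of the JVF calculation in Python; the track-list inputs are illustrative, and a `None` return stands in for "no associated tracks, no JVF requirement applied":

```python
def jet_vertex_fraction(pv_track_pts, all_track_pts):
    """Ratio of the scalar pT sum of jet tracks matched to the primary
    vertex to the scalar pT sum of all tracks matched to the jet."""
    total = sum(all_track_pts)
    if total == 0:
        return None  # jets with no associated tracks receive no JVF cut
    return sum(pv_track_pts) / total
```

For example, a jet with 15 \(\text {GeV}\) of hard-scatter track \(p_{\text {T}}\) out of 20 \(\text {GeV}\) total gives JVF \(=\) 0.75, close to the hard-scatter peak at 1.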

Hadronically decaying \(\tau \)-leptons are seeded by calorimeter jets with \(|\eta |\) < 2.5 and \(p_{\text {T}}\) > 10 \(\text {GeV}\). As described for jets, the LCW calibration is applied, corrections are made to subtract the energy due to pileup interactions, and the energy of the hadronically decaying \(\tau \) candidates is calibrated at the \(\tau \)-lepton energy scale (TES) [47]. The TES is independent of the JES and is determined using an MC-based procedure. Hadronically decaying \(\tau \)-leptons passing the “medium” requirements [47] and having \(p_{\text {T}}\) > 20 \(\text {GeV}\) after TES corrections are considered for the \(E_{\text {T}}^{\text {miss}}\) reconstruction.

4.1.2 Reconstruction and calibration of the \(E_{\mathrm {T}}^{\mathrm {miss}}\) soft term

The soft term is a necessary but challenging ingredient of the \(E_{\text {T}}^{\text {miss}}\) reconstruction. It comprises all the detector signals not matched to the physics objects defined above and can contain contributions from the hard scatter as well as the underlying event and pileup interactions. Several algorithms designed to reconstruct and calibrate the soft term have been developed, as well as methods to suppress the pileup contributions. A summary of the \(E_{\text {T}}^{\text {miss}}\) and soft-term reconstruction algorithms is given in Table 3.

Table 3

Summary of \(E_{\text {T}}^{\text {miss}}\) and soft-term reconstruction algorithms used in this paper

Term

Brief description

CST \(E_{\text {T}}^{\text {miss}}\)

The Calorimeter Soft Term (CST) \(E_{\text {T}}^{\text {miss}}\) takes its soft term from energy deposits in the calorimeter which are not matched to high-\({p}_{\text {T}}\) physics objects. Although noise suppression is applied to reduce fake signals, no additional pileup suppression techniques are used

TST \(E_{\text {T}}^{\text {miss}}\)

The Track Soft Term (TST) \(E_{\text {T}}^{\text {miss}}\) algorithm uses a soft term that is calculated using tracks within the inner detector that are not associated with high-\({p}_{\text {T}}\) physics objects. The JVF selection requirement is applied to jets

EJAF \(E_{\text {T}}^{\text {miss}}\)

The Extrapolated Jet Area with Filter (EJAF) \(E_{\text {T}}^{\text {miss}}\) algorithm applies pileup subtraction to the CST based on the idea of jet-area corrections. The JVF selection requirement is applied to jets

STVF \(E_{\text {T}}^{\text {miss}}\)

The Soft-Term Vertex-Fraction (STVF) \(E_{\text {T}}^{\text {miss}}\) algorithm suppresses pileup effects in the CST by scaling the soft term by a multiplicative factor calculated from the fraction of scalar-summed track \(p_{\text {T}}\) not associated with high-\({p}_{\text {T}}\) physics objects that can be matched to the primary vertex. The JVF selection requirement is applied to jets

Four soft-term reconstruction algorithms are considered in this paper. The first two are defined below; some motivation is then given for the remaining two prior to their definitions.

Calorimeter Soft Term (CST) This reconstruction algorithm [1] uses information mainly from the calorimeter and is widely used by ATLAS. The algorithm also includes corrections based on tracks but does not attempt to resolve the various pp interactions based on the track \(z_0\) measurement. The soft term is referred to as the CST, whereas the entire \(E_{\text {T}}^{\text {miss}}\) is written as CST \(E_{\text {T}}^{\text {miss}}\) . Corresponding naming schemes are used for the other reconstruction algorithms. The CST is reconstructed using energy deposits in the calorimeter which are not matched to the high-\({p}_{\text {T}}\) physics objects used in the \(E_{\text {T}}^{\text {miss}}\) reconstruction. To avoid fake signals in the calorimeter, noise suppression is important. This is achieved by calculating the soft term using only cells belonging to topoclusters, which are calibrated at the LCW scale [43, 44]. The tracker and calorimeter provide redundant \({p}_{\text {T}}\) measurements for charged particles, so an energy-flow algorithm is used to determine which measurement to use. Tracks with \({p}_{\text {T}}\) > 0.4 \(\text {GeV}\) that are not matched to a high-\({p}_{\text {T}}\) physics object are used instead of the calorimeter \({p}_{\text {T}}\) measurement, if their \({p}_{\text {T}}\) resolution is better than the expected calorimeter \(p_{\text {T}}\) resolution. The calorimeter resolution is estimated as \(0.4\cdot \sqrt{p_{\text {T}}}~\text {GeV}{}\), where \(p_{\text {T}}\) is the transverse momentum of the reconstructed track. Geometrical matching between tracks and topoclusters (or high-\({p}_{\text {T}}\) physics objects) is performed using the \(\Delta R\) significance defined as \(\Delta R / \sigma _{\Delta R}\), where \(\sigma _{\Delta R}\) is the \(\Delta R\) resolution, parameterized as a function of the track \({p}_{\text {T}}\) . 
A track is considered to be associated to a topocluster in the soft term when its minimum \(\Delta R / \sigma _{\Delta R}\) is less than 4. To veto tracks matched to high-\({p}_{\text {T}}\) physics objects, tracks entering the soft term are required to have \(\Delta R / \sigma _{\Delta R}\) > 8. The \(E_{\mathrm {T}}^{\mathrm {miss}}\) calculated using the CST algorithm is documented in previous publications such as Ref. [1] and is the standard algorithm in most ATLAS 8 \(\text {TeV}\) analyses.
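The two decisions in the CST energy-flow step described above can be sketched as follows; the function names and the track-resolution input are hypothetical, and \(p_{\text {T}}\) values are in GeV:

```python
import math

# Sketch of the CST energy-flow decisions: (1) whether a soft track
# replaces the calorimeter measurement, (2) the Delta-R-significance
# matching and veto. Names and inputs are illustrative only.

def prefer_track_measurement(track_pt, track_sigma_pt):
    """Use the track pT instead of the calorimeter measurement when the
    track resolution beats the expected calorimeter resolution of
    0.4 * sqrt(pT) GeV."""
    return track_sigma_pt < 0.4 * math.sqrt(track_pt)

def matched_to_topocluster(min_dr, sigma_dr):
    """A track is associated to a topocluster when its minimum
    Delta-R significance is below 4."""
    return min_dr / sigma_dr < 4.0

def separated_from_hard_objects(dr, sigma_dr):
    """A track enters the soft term only if it is well separated from
    high-pT physics objects (Delta-R significance above 8)."""
    return dr / sigma_dr > 8.0
```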

Track Soft Term (TST) The TST is reconstructed purely from tracks that pass the selections outlined in Sect. 3.1 and are not associated with the high-\({p}_{\text {T}}\) physics objects defined in Sect. 4.1.1. The detector coverage of the TST is the ID tracking volume (\(|\eta |\) < 2.5), and no calorimeter topoclusters inside or beyond this region are included. This algorithm allows excellent vertex matching for the soft term, which almost completely removes the in-time pileup dependence, but misses contributions from soft neutral particles. The track-based reconstruction also entirely removes the out-of-time pileup contributions that affect the CST. To avoid double counting the \({p}_{\text {T}}\) of particles, the tracks matched to the high-\({p}_{\text {T}}\) physics objects need to be removed from the soft term. All of the following classes of tracks are excluded from the soft term:

tracks within a cone of size \(\Delta R\)\(=\) 0.05 around electrons and photons

tracks within a cone of size \(\Delta R\)\(=\) 0.2 around \(\tau _{\mathrm{had}{\text {-}}\mathrm{vis}}\)

ID tracks associated with identified muons

tracks matched to jets using the ghost-association technique described in Sect. 4.1.1

isolated tracks with \({p}_{\text {T}} ~\ge ~120\)\(\text {GeV}\) (\(\ge \)200 \(\text {GeV}\) for \(|\eta |\) < 1.5) having transverse momentum uncertainties larger than 40% or having no associated calorimeter energy deposit with \(p_{\text {T}}\) larger than 65% of the track \({p}_{\text {T}}\) . The \({p}_{\text {T}}\) thresholds are chosen to ensure that muons not in the coverage of the MS are still included in the soft term. This is a cleaning cut to remove mismeasured tracks.
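The overlap-removal rules listed above can be sketched as follows; the track and object records are hypothetical simplifications (the isolated-track cleaning cut is omitted for brevity):

```python
import math

# Illustrative TST overlap removal following the exclusion list above.
# 'track' is a dict with 'id', 'eta', 'phi'; eg_objects and taus are
# (eta, phi) pairs; the id sets mark muon and ghost-associated jet tracks.

def delta_r(eta1, phi1, eta2, phi2):
    """Delta-R separation with phi wrapped into (-pi, pi]."""
    dphi = math.atan2(math.sin(phi1 - phi2), math.cos(phi1 - phi2))
    return math.hypot(eta1 - eta2, dphi)

def keep_in_soft_term(track, eg_objects=(), taus=(), muon_track_ids=(),
                      jet_track_ids=()):
    """Apply cones of 0.05 around electrons/photons and 0.2 around
    hadronic taus, plus direct matches to muon or jet tracks."""
    if track['id'] in muon_track_ids or track['id'] in jet_track_ids:
        return False
    for eta, phi in eg_objects:
        if delta_r(track['eta'], track['phi'], eta, phi) < 0.05:
            return False
    for eta, phi in taus:
        if delta_r(track['eta'], track['phi'], eta, phi) < 0.2:
            return False
    return True
```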

A deterioration of the CST \(E_{\text {T}}^{\text {miss}}\) resolution is observed as the average number of pileup interactions increases [1]. All \(E_{\text {T}}^{\text {miss}}\) terms in Eq. (2) are affected by pileup, but the terms which are most affected are the jet term and CST, because their constituents are spread over larger regions in the calorimeters than those of the \(E_{\text {T}}^{\text {miss}}\) hard terms. Methods to suppress pileup are therefore needed, which can restore the \(E_{\mathrm {T}}^{\mathrm {miss}}\) resolution to values similar to those observed in the absence of pileup.

The TST algorithm is very stable with respect to pileup but does not include neutral particles. Two other pileup-suppressing algorithms were developed, which consider contributions from neutral particles. One uses an \(\eta \)-dependent event-by-event estimator for the transverse momentum density from pileup, using calorimeter information, while the other applies an event-by-event global correction based on the amount of charged-particle \({p}_{\text {T}}\) from the hard-scatter vertex, relative to all other pp collisions. The definitions of these two soft-term algorithms are described in the following:

Extrapolated Jet Area with Filter (EJAF) The jet-area method for the pileup subtraction uses a soft term based on the idea of jet-area corrections [45]. This technique uses direct event-by-event measurements of the energy flow throughout the entire ATLAS detector to estimate the \({p}_{\text {T}}\) density of pileup energy deposits and was developed from the strategy applied to jets as described in Ref. [4]. The topoclusters belonging to the soft term are used for jet finding with the \(k_{t}\) algorithm [48, 49] with distance parameter R\(=\) 0.6 and jet \(p_{\text {T}}\) > 0. The catchment areas [45, 46] for these reconstructed jets are labelled \(A_{\mathrm {jet}}\); this provides a measure of the jet’s susceptibility to contamination from pileup. Jets with \({p}_{\text {T}}\) < 20 \(\text {GeV}\) are referred to as soft-term jets, and the \({p}_{\text {T}}\)-density of each soft-term jet i is then measured by dividing its transverse momentum by its catchment area, \(\rho _{\mathrm {jet},i} = p_{\mathrm {T},i}^{\mathrm {jet}}/A_{\mathrm {jet},i}\).

In a given event, the median \({p}_{\text {T}}\)-density \(\rho _{\mathrm {evt}}^{\mathrm {med}}\) for all \(N_{\mathrm {jets}}\) soft-term \(k_{t}\) jets in the event found within a given range \(-\eta _{\mathrm {max}}< \eta _{\mathrm {jet}}< \eta _{\mathrm {max}}\) can be calculated as the median of the individual jet densities, \(\rho _{\mathrm {evt}}^{\mathrm {med}} = \mathrm {median}\{\rho _{\mathrm {jet},i}\}\), with \(i\) running over those \(N_{\mathrm {jets}}\) jets.

This median \({p}_{\text {T}}\)-density \(\rho _{\mathrm {evt}}^{\mathrm {med}}\) gives a good estimate of the in-time pileup activity in each detector region. If determined with \(\eta _{\mathrm {max}}\)\(=\) 2, it is found to also be an appropriate indicator of out-of-time pileup contributions [45]. A lower value of \(\rho _{\mathrm {evt}}^{\mathrm {med}}\) is obtained when jets with \(|\eta _{\mathrm {jet}}|\) larger than 2 are used, mostly because of the particular geometry of the ATLAS calorimeters and their cluster reconstruction algorithms.6 In order to extrapolate \(\rho _{\mathrm {evt}}^{\mathrm {med}}\) into the forward regions of the detector, the average topocluster \(p_{\text {T}}\) in slices of \(\eta \), \({N}_{\mathrm {PV}}\), and \(\langle \mu \rangle \) is converted to an average \(p_{\text {T}}\) density \(\langle \rho \rangle (\eta ,{N}_{\mathrm {PV}}{}, \mu )\) for the soft term. As for \(\rho _{\mathrm {evt}}^{\mathrm {med}}\), \(\langle \rho \rangle (\eta ,{N}_{\mathrm {PV}}{}, \mu )\) is found to be uniform in the central region of the detector with \(|\eta |\) < \(\eta _\mathrm {plateau}\)\(=\) 1.8. The transverse momentum density profile is then computed as

where the profile is fixed to unity in the central region \(|\eta |\) < \(\eta _\mathrm {plateau}\)\(=\) 1.8, and a pair of Gaussian functions \(G_\mathrm{{core}}(|\eta |-\eta _\mathrm {plateau})\) and \( G_\mathrm{{base}}(\eta )\) is added for the fit in the forward regions of the calorimeter. The value \(G_\mathrm{{core}}(0)~=~1\) ensures that Eq. (9) is continuous at \(\eta ~=~\eta _\mathrm {plateau}\). Two example fits are shown in Fig. 1 for \({N}_{\mathrm {PV}}\)\(=\) 3 and 8 with \(\langle \mu \rangle \)\(=\) 7.5–9.5 interactions per bunch crossing. For both distributions the value is defined to be unity in the central region (\(|\eta |\) < \(\eta _\mathrm {plateau}\)), and the sum of two Gaussian functions provides a good description of the change in the amount of in-time pileup beyond \(\eta _\mathrm {plateau}\). The baseline Gaussian function \(G_\mathrm{{base}}(\eta )\) has a larger width and describes the larger amount of in-time pileup in the forward region, as seen in Fig. 1. Fitting with Eq. (9) provides a parameterized description of the in-time and out-of-time pileup which is valid for the whole 2012 dataset. The soft term for the EJAF \(E_{\text {T}}^{\text {miss}}\) algorithm is calculated as

\(E_{x(y)}^{\mathrm {miss,soft}} = -\sum _{i=1}^{N_{\mathrm{filter}{\text {-}}\mathrm{jet}}} p_{x(y),i}^{\mathrm {jet,corr}}\), which sums the transverse momenta, labelled \(p_{x(y),i}^{\mathrm {jet,corr}}\), of the corrected soft-term jets matched to the primary vertex. The number of these filtered jets, which are selected after the pileup correction based on their JVF and \({p}_{\text {T}}\) , is labelled \(N_{{\mathrm{filter}{\text {-}}\mathrm{jet}}}\). More details of the jet selection and the application of the pileup correction to the jets are given in Appendix A.
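The per-jet density and its event-level median described above can be sketched as follows; the jet records (pt, eta, catchment area) are hypothetical simplifications:

```python
# Sketch of the soft-term kT-jet pT density and its event median,
# following the EJAF description above; jet records are illustrative.

def median_pt_density(jets, eta_max=2.0):
    """rho_evt^med: the median of pT / A_jet over soft-term jets with
    |eta_jet| < eta_max (the text uses eta_max = 2)."""
    densities = sorted(j['pt'] / j['area'] for j in jets
                       if abs(j['eta']) < eta_max)
    n = len(densities)
    if n == 0:
        return 0.0  # convention chosen here for events with no such jets
    mid = n // 2
    if n % 2 == 1:
        return densities[mid]
    return 0.5 * (densities[mid - 1] + densities[mid])
```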

Soft-Term Vertex-Fraction (STVF)

The STVF algorithm utilizes an event-level parameter, computed from ID track information that can be reliably matched to the hard-scatter collision, to suppress pileup effects in the CST. This correction is applied as a multiplicative factor (\(\alpha _{\text {STVF}}\) ) to the CST, event by event, and the resulting STVF-corrected CST is simply referred to as STVF. The \(\alpha _{\text {STVF}}\) is calculated as

\(\alpha _{\text {STVF}} = \sum _{\mathrm {tracks,PV}} {p}_{\text {T}} \big / \sum _{\mathrm {tracks}} {p}_{\text {T}}\), which is the scalar sum of \({p}_{\text {T}}\) of tracks matched to the PV divided by the total scalar sum of track \({p}_{\text {T}}\) in the event, including pileup. The sums are taken over the tracks that do not match high-\(p_{\text {T}}\) physics objects belonging to the hard term. The mean \(\alpha _{\text {STVF}}\) value is shown versus the number of reconstructed vertices (\({N}_{\mathrm {PV}}\)) in Fig. 2. Data and simulation (including Z, diboson, \(t\bar{t}\) , and tW samples) are shown with only statistical uncertainties and agree within 4–7% across the full range of \({N}_{\mathrm {PV}}\) in the 8 \(\text {TeV}\) dataset. The differences mostly arise from the modelling of the amount of underlying-event activity and of \(p_{\mathrm {T}}^{Z}\). The 0-jet and inclusive samples have similar values of \(\alpha _{\text {STVF}}\) , with that for the inclusive sample being around 2% larger.
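The \(\alpha _{\text {STVF}}\) computation can be sketched as follows; the inputs are hypothetical: the scalar \(p_{\text {T}}\) of the soft tracks (those not matched to hard-term objects) and a flag saying whether each track is matched to the primary vertex. The value returned for an event with no such tracks is a convention chosen here, not taken from the paper:

```python
# Sketch of the STVF scale factor; inputs are illustrative only.

def alpha_stvf(track_pts, from_pv):
    """Scalar pT sum of primary-vertex tracks divided by the scalar pT
    sum of all (soft) tracks in the event, including pileup."""
    total = sum(track_pts)
    if total == 0.0:
        return 1.0  # assumed convention for trackless events
    pv_sum = sum(pt for pt, is_pv in zip(track_pts, from_pv) if is_pv)
    return pv_sum / total

# The corrected soft term is then alpha_stvf * CST, applied event by event.
```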

The average transverse momentum density shape \(P^\rho (\eta ,N_{\text {PV}},\langle \mu \rangle )\) for jets in data is compared to the model in Eq. (9) with \(\langle \mu \rangle \)\(=\) 7.5–9.5 and with (a) three reconstructed vertices and (b) eight reconstructed vertices. The increase of jet activity in the forward regions coming from more in-time pileup with \({N}_{\mathrm {PV}}\)\(=\) 8 in (b) can be seen in the flatter shape of the Gaussian fit of the forward activity \(G_{\mathrm {base}}(\eta )\) (blue dashed line)

The mean \(\alpha _{\text {STVF}}\) weight is shown versus the number of reconstructed vertices (\({N}_{\mathrm {PV}}\)) for 0-jet and inclusive events in \(\mathrm{Z} \rightarrow \mu {}\mu \) data. The inset at the bottom of the figure shows the ratio of the data to the MC predictions, with only the statistical uncertainties of the data and MC simulation shown. Each bin includes its lower edge but not its upper edge

4.1.3 Jet \({p}_{\text {T}}\) threshold and JVF selection

The TST, STVF, and EJAF \(E_{\text {T}}^{\text {miss}}\) algorithms complement the pileup reduction in the soft term with additional requirements on the jets entering the \(E_{\text {T}}^{\text {miss}}\) hard term, which are also aimed at reducing pileup dependence. These \(E_{\text {T}}^{\text {miss}}\) reconstruction algorithms apply a requirement of \(\text {JVF}\) > 0.25 to jets with \({p}_{\text {T}} \) < 50 \(\text {GeV}\) and \(|\eta |\) < 2.4 in order to suppress those originating from pileup interactions. The maximum \(|\eta |\) value is lowered to 2.4 to ensure that the core of each jet is within the tracking volume (\(|\eta |\) < 2.5) [4]. Charged particles from jets below the \(p_{\text {T}}\) threshold are considered in the soft terms for the STVF, TST, and EJAF (see Sect. 4.1.2 for details).
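The jet selection described above can be sketched as a small predicate; the function name is hypothetical, and the treatment of boundary values (\(\ge \) vs >) is an assumption for illustration:

```python
# Sketch of the jet selection used by the TST, STVF, and EJAF algorithms:
# jets with pT < 50 GeV and |eta| < 2.4 must satisfy JVF > 0.25; jets
# without associated tracks (jvf is None) have no JVF cut applied.

def jet_enters_hard_term(pt, eta, jvf):
    if pt >= 50.0 or abs(eta) >= 2.4:
        return True  # JVF requirement not applicable
    if jvf is None:
        return True  # no associated tracks: no JVF requirement
    return jvf > 0.25
```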

The same \(\text {JVF}\) requirements are not applied to the CST \(E_{\text {T}}^{\text {miss}}\) because its soft term includes the soft recoil from all interactions, so removing jets not associated with the hard-scatter interaction could create an imbalance. The procedure for choosing the jet \({p}_{\text {T}}\) and \(\text {JVF}\) criteria is summarized in Sect. 7.

Throughout most of this paper the number of jets is computed without a \(\text {JVF}\) requirement so that the \(E_{\text {T}}^{\text {miss}}\) algorithms are compared on the same subset of events. However, the \(\text {JVF}\) > 0.25 requirement is applied in jet counting when 1-jet and \(\ge \) 2-jet samples are studied using the TST \(E_{\text {T}}^{\text {miss}}\) reconstruction, including Figs. 8 and 22. The \(\text {JVF}\) requirement removes pileup jets that would otherwise obscure trends in samples with different jet multiplicities.

4.2 Track \(E_{\text {T}}^{\text {miss}}\)

Extending the philosophy of the TST definition to the full event, the \(E_{\text {T}}^{\text {miss}}\) is reconstructed from tracks alone, reducing the pileup contamination that afflicts the other object-based algorithms. While a purely track-based \(E_{\text {T}}^{\text {miss}}\) , designated Track \(E_{\text {T}}^{\text {miss}}\) , has almost no pileup dependence, it is insensitive to neutral particles, which do not form tracks in the ID. This can degrade the \(E_{\text {T}}^{\text {miss}}\) calibration, especially in event topologies with numerous or highly energetic jets. The \(\eta \) coverage of the Track \(E_{\text {T}}^{\text {miss}}\) is also limited to the ID acceptance of \(|\eta |\) < 2.5, which is substantially smaller than the calorimeter coverage, which extends to \(|\eta |\)\(=\) 4.9.

Track \(E_{\text {T}}^{\text {miss}}\) is calculated by taking the negative vectorial sum of \(\vec {p_{\text {T}}}\) of tracks satisfying the same quality criteria as the TST tracks. As for the TST, tracks with poor momentum resolution or without corresponding calorimeter deposits are removed. Because of Bremsstrahlung within the ID, the electron \({p}_{\text {T}}\) is determined more precisely by the calorimeter than by the ID. Therefore, the Track \(E_{\text {T}}^{\text {miss}}\) algorithm uses the electron \({p}_{\text {T}}\) measurement from the calorimeter and removes the tracks overlapping its shower. Calorimeter deposits from photons are not added because they cannot be reliably associated to particular pp interactions. For muons, the ID track \({p}_{\text {T}}\) is used rather than the combined ID and MS fit. For events without any reconstructed jets, the Track and TST \(E_{\text {T}}^{\text {miss}}\) would have similar values, but differences could still originate from muon track measurements as well as reconstructed photons or calorimeter deposits from \(\tau _{\mathrm{had}{\text {-}}\mathrm{vis}}\), which are only included in the TST.
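The negative vector sum at the heart of the Track \(E_{\text {T}}^{\text {miss}}\) can be sketched as follows; the tracks are hypothetical (px, py) pairs in GeV for already-selected tracks (quality cuts, overlap removal, and the electron treatment are omitted):

```python
import math

# Minimal sketch of a track-based missing-transverse-momentum sum.

def track_met(track_pxpy):
    """Negative vector sum of track transverse momenta; returns
    (mex, mey, met) with met the magnitude."""
    mex = -sum(px for px, _ in track_pxpy)
    mey = -sum(py for _, py in track_pxpy)
    return mex, mey, math.hypot(mex, mey)
```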

The soft term for the Track \(E_{\text {T}}^{\text {miss}}\) is defined to be identical to the TST by excluding tracks associated with the high-\({p}_{\text {T}}\) physics objects used in Eq. (2).

In this section, basic \(E_{\text {T}}^{\text {miss}}\) distributions before and after pileup suppression in \(\mathrm{Z} \rightarrow \ell{}\ell\) and \(W\rightarrow \ell {}\nu\) data events are compared to the distributions from the MC signal plus relevant background samples. All distributions in this section include the dominant systematic uncertainties from the high-\(p_{\text {T}}\) objects, the \(\vec {E}_\mathrm{T}^{\mathrm {\ miss,soft}}\) (described in Sect. 8), and the pileup modelling [7]; these are the largest systematic uncertainties in the \(E_{\mathrm {T}}^{\mathrm {miss}}\) for Z and W samples.

5.1 Modelling of \(\mathrm{Z} \rightarrow \ell{}\ell\) events

The CST, EJAF, TST, STVF, and Track \(E_{\mathrm {T}}^{\mathrm {miss}}\) distributions for \(\mathrm{Z} \rightarrow \mu {}\mu \) data and simulation are shown in Fig. 3. The Z boson signal region, which is defined in Sect. 3.2, has better than 99% signal purity. The MC simulation agrees with data for all \(E_{\mathrm {T}}^{\mathrm {miss}}\) reconstruction algorithms within the assigned systematic uncertainties. The mean and the standard deviation of the \(E_{\text {T}}^{\text {miss}}\) distribution are shown for all of the \(E_{\text {T}}^{\text {miss}}\) algorithms in \(Z \rightarrow \mu {}\mu \) inclusive simulation in Table 4. The CST \(E_{\text {T}}^{\text {miss}}\) has the highest mean \(E_{\text {T}}^{\text {miss}}\) and thus the broadest \(E_{\mathrm {T}}^{\mathrm {miss}}\) distribution. All of the \(E_{\mathrm {T}}^{\mathrm {miss}}\) algorithms with pileup suppression have narrower \(E_{\mathrm {T}}^{\mathrm {miss}}\) distributions, as shown by their smaller mean \(E_{\text {T}}^{\text {miss}}\) values. However, those algorithms also have non-Gaussian tails in the \(E_\mathrm{x}^\mathrm{miss}\) and \(E_\mathrm{y}^\mathrm{miss}\) distributions, which contribute to the region with \(E_{\mathrm {T}}^{\mathrm {miss}}\)\(\gtrsim \)50 \(\text {GeV}\). The Track \(E_{\mathrm {T}}^{\mathrm {miss}}\) has the largest tail because it does not include contributions from neutral particles, and this results in it having the largest standard deviation.

The tails of the \(E_{\mathrm {T}}^{\mathrm {miss}}\) distributions in Fig. 3 for \(Z \rightarrow \mu {}\mu \) data are observed to be compatible with the sum of expected signal and background contributions, namely \(t\bar{t}\) and the summed diboson (VV) processes including WW, WZ, and ZZ, which all have high-\({p}_{\text {T}}\) neutrinos in their final states. Instrumental effects can show up in the tails of the \(E_{\mathrm {T}}^{\mathrm {miss}}\), but such effects are small.

The \(E_{\text {T}}^{\text {miss}}\)\(\phi \) distribution is not shown in this paper but is very uniform, with less than a 4-per-mille difference between positive and negative \(\phi \). The \(\phi \)-asymmetry is thus greatly reduced from that observed in Ref. [1].

The increase in systematic uncertainties in the range 50–120 \(\text {GeV}\) in Fig. 3 comes from the tail of the \(E_{\text {T}}^{\text {miss}}\) distribution for the simulated \(\mathrm{Z} \rightarrow \mu {}\mu \) events. The increased width in the uncertainty band is asymmetric because many systematic uncertainties increase the \(E_{\text {T}}^{\text {miss}}\) tail in \(\mathrm{Z} \rightarrow \mu {}\mu \) events by creating an imbalance in the transverse momentum. The largest of these systematic uncertainties are those associated with the jet energy resolution, the jet energy scale, and pileup. The pileup systematic uncertainties affect mostly the CST and EJAF \(E_{\text {T}}^{\text {miss}}\), while the jet energy scale uncertainty causes the larger systematic uncertainty for the TST and STVF \(E_{\text {T}}^{\text {miss}}\) . The Track \(E_{\text {T}}^{\text {miss}}\) does not have the same increase in systematic uncertainties because it does not make use of reconstructed jets. Above 120 \(\text {GeV}\), most events have a large intrinsic \(E_{\text {T}}^{\text {miss}}\) , and the systematic uncertainties on the \(E_{\text {T}}^{\text {miss}}\) , especially the soft term, are smaller.

Distributions of the \(E_{\mathrm {T}}^{\mathrm {miss}}\) with the (a) CST, (b) EJAF, (c) TST, (d) STVF, and (e) Track \(E_{\text {T}}^{\text {miss}}\) algorithms are shown in data and MC simulation events satisfying the \(\mathrm{Z} \rightarrow \mu {}\mu \) selection. The lower panel of each figure shows the ratio of data to MC simulation, and the bands correspond to the combined systematic and MC statistical uncertainties. The far right bin includes the integral of all events with \(E_{\text {T}}^{\text {miss}}\) above 300 \(\text {GeV}\)

Figure 4 shows the soft-term distributions. The pileup-suppressed \(E_{\text {T}}^{\text {miss}}\) algorithms generally have a smaller mean soft term as well as a sharper peak near zero compared to the CST. Among the \(E_{\text {T}}^{\text {miss}}\) algorithms, the soft term from the EJAF algorithm shows the smallest change relative to the CST. The TST has a sharp peak near zero similar to the STVF but with a longer tail, which mostly comes from individual tracks; these tracks are possibly mismeasured, and further studies are planned. The simulation under-predicts the TST relative to the observed data in the range 60–85 \(\text {GeV}\), and the differences exceed the assigned systematic uncertainties. This region corresponds to the transition from the narrow core to the tail coming from high-\(p_{\text {T}}\) tracks. The differences between data and simulation could be due to mismodelling of the rate of mismeasured tracks, for which no systematic uncertainty is applied. The mismeasured-track cleaning, discussed in Sect. 4.1.2, reduces the TST tail starting at 120 \(\text {GeV}\), and this region is modelled within the assigned uncertainties. The cleaning for tracks below 120 \(\text {GeV}\) entering the TST is not optimal, and future studies aim to improve it.

Distributions of the soft term for the (a) CST, (b) EJAF, (c) TST, and (d) STVF are shown in data and MC simulation events satisfying the \(\mathrm{Z} \rightarrow \mu {}\mu \) selection. The lower panel of each figure shows the ratio of data to MC simulation, and the bands correspond to the combined systematic and MC statistical uncertainties. The far right bin includes the integral of all events with \(E_\mathrm{T}^{\mathrm {miss,soft}}\) above 160 \(\text {GeV}\)

The \(E_{\text {T}}^{\text {miss}}\) resolution is expected to be proportional to \(\sqrt{\Sigma E_{\mathrm {T}}}\) when both quantities are measured with the calorimeter alone [1]. While this proportionality does not hold for tracks, it is nevertheless interesting to understand the modelling of \(\Sigma E_{\mathrm {T}}\) and the dependence of \(E_{\text {T}}^{\text {miss}}\) resolution on it. Figure 5 shows the \(\Sigma E_{\mathrm {T}}\) distribution for \(\mathrm{Z} \rightarrow \mu {}\mu \) data and MC simulation both for the TST and the CST algorithms. The \(\Sigma E_{\mathrm {T}}\) is typically larger for the CST algorithm than for the TST because the former includes energy deposits from pileup as well as neutral particles and forward contributions beyond the ID volume. The reduction of pileup contributions in the soft and jet terms leads to the \(\Sigma E_{\mathrm {T}}\) (TST) having a sharper peak at around 100 \(\text {GeV}\) followed by a large tail, due to high-\({p}_{\text {T}}\) muons and large \(\sum p_{\mathrm {T}}^{\mathrm {jets}}\). The data and simulation agree within the uncertainties for the \(\Sigma E_{\mathrm {T}}\) (CST) and \(\Sigma E_{\mathrm {T}}\) (TST) distributions.

Distributions of (a) \(\Sigma E_{\mathrm {T}}\) (CST) and (b) \(\Sigma E_{\mathrm {T}}\) (TST) are shown in data and MC simulation events satisfying the \(\mathrm{Z} \rightarrow \mu {}\mu \) selection. The lower panel of each figure shows the ratio of data to MC simulation, and the bands correspond to the combined systematic and MC statistical uncertainties. The far right bin includes the integral of all events with \(\Sigma E_{\mathrm {T}}\) above 2000 \(\text {GeV}\)

5.2 Modelling of \(W\rightarrow \ell {}\nu\) events

In this section, the selection requirements for the \(m_{\mathrm {T}}\) and \(E_{\text {T}}^{\text {miss}}\) distributions are defined using the same \(E_{\text {T}}^{\text {miss}}\) algorithm as that labelling the distribution (e.g. selection criteria are applied to the CST \(E_{\text {T}}^{\text {miss}}\) for distributions showing the CST \(E_{\text {T}}^{\text {miss}}\) ). The intrinsic \(E_{\text {T}}^{\text {miss}}\) in \(W\rightarrow \ell {}\nu\) events allows a comparison of the \(E_{\mathrm {T}}^{\mathrm {miss}}\) scale between data and simulation. The level of agreement between data and MC simulation for the \(E_{\text {T}}^{\text {miss}}\) reconstruction algorithms is studied using \(W\rightarrow e{}\nu \) events with the selection defined in Sect. 3.3.

The CST and TST \(E_{\text {T}}^{\text {miss}}\) distributions in \(W\rightarrow e{}\nu \) events are shown in Fig. 6. The \(W\rightarrow \tau {}\nu \) contributions are combined with \(W\rightarrow e{}\nu \) events in the figure. The data and MC simulation agree within the assigned systematic uncertainties for both the CST and TST \(E_{\text {T}}^{\text {miss}}\) algorithms. The other \(E_{\text {T}}^{\text {miss}}\) algorithms show similar levels of agreement between data and MC simulation.

Distributions of the (a) CST and (b) TST \(E_{\text {T}}^{\text {miss}}\) as measured in a data sample of \(W\rightarrow e{}\nu \) events. The lower panel of each figure shows the ratio of data to MC simulation, and the bands correspond to the combined systematic and MC statistical uncertainties. The far right bin includes the integral of all events with \(E_{\text {T}}^{\text {miss}}\) above 300 \(\text {GeV}\)

6 Performance of the \(E_{\text {T}}^{\text {miss}}\) in data and MC simulation

6.1 Resolution of \(E_{\text {T}}^{\text {miss}}\)

The \(E_\mathrm{x}^\mathrm{miss}\) and \(E_\mathrm{y}^\mathrm{miss}\) are expected to be approximately Gaussian distributed for \(\mathrm{Z} \rightarrow \ell{}\ell\) events as discussed in Ref. [1]. However, because of the non-Gaussian tails in these distributions, especially for the pileup-suppressing \(E_{\text {T}}^{\text {miss}}\) algorithms, the root-mean-square (RMS) is used to estimate the resolution. This includes important information about the tails, which would be lost if the result of a Gaussian fit over only the core of the distribution were used instead. The resolution of the \(E_{\text {T}}^{\text {miss}}\) distribution is extracted using the RMS of the combined distribution of \(E_\mathrm{x}^\mathrm{miss}\) and \(E_\mathrm{y}^\mathrm{miss}\), which correlation studies show to be independent. The previous ATLAS \(E_{\text {T}}^{\text {miss}}\) performance paper [1] studied the resolution defined by the width of Gaussian fits in a narrow range of \(\pm 2\,\)RMS around the mean and used a separate study to investigate the tails. Therefore, the results of this paper are not directly comparable to those of the previous study. The resolutions presented in this paper are expected to be larger than the width of a Gaussian fitted in this manner, because the RMS takes the tails into account.
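The resolution estimator described above, the RMS of the combined \(E_\mathrm{x}^\mathrm{miss}\) and \(E_\mathrm{y}^\mathrm{miss}\) distribution, can be sketched as follows (values in GeV; the RMS here is computed about the sample mean, an assumption for illustration):

```python
import math

# Sketch of the resolution estimator: RMS of the combined distribution
# of E_x^miss and E_y^miss values.

def combined_rms(ex_values, ey_values):
    values = list(ex_values) + list(ey_values)
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / n)
```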

In this section, the resolution for the \(E_{\text {T}}^{\text {miss}}\) is presented for \(\mathrm{Z} \rightarrow \mu {}\mu \) events using both data and MC simulation. Unless it is a simulation-only figure (labelled with “Simulation” under the ATLAS label), the MC distribution includes the signal sample (e.g. \(\mathrm{Z} \rightarrow \mu {}\mu \) ) as well as diboson, \(t\bar{t}\) , and tW samples.

6.1.1 Resolution of the \(E_{\text {T}}^{\text {miss}}\) as a function of the number of reconstructed vertices

The stability of the \(E_{\text {T}}^{\text {miss}}\) performance as a function of the amount of pileup is estimated by studying the \(E_{\mathrm {T}}^{\mathrm {miss}}\) resolution as a function of the number of reconstructed vertices (\({N}_{\mathrm {PV}}\)) for \(\mathrm{Z} \rightarrow \mu {}\mu \) events as shown in Fig. 7. Each bin includes its lower edge but not its upper edge; for example, events with \({N}_{\mathrm {PV}}\) in the inclusive range 30–39 are combined because of the small sample size. In addition, very few events were collected below \({N}_{\mathrm {PV}}\) of 2 during 2012 data taking. Events in which there are no reconstructed jets with \(p_{\text {T}}\) > 20 \(\text {GeV}\) are referred to collectively as the 0-jet sample. Distributions are shown here for both the 0-jet and inclusive samples. For both samples, the data and MC simulation agree within 2% up to around \({N}_{\mathrm {PV}}\)\(=\) 15, but the deviation grows to around 5–10% for \({N}_{\mathrm {PV}}\) > 25, which might be attributed to the decreasing sample size. All of the \(E_{\text {T}}^{\text {miss}}\) distributions show a similar level of agreement between data and simulation across the full range of \({N}_{\mathrm {PV}}\).

For the 0-jet sample in Fig. 7a, the STVF, TST, and Track \(E_{\text {T}}^{\text {miss}}\) resolutions all have a small slope with respect to \({N}_{\mathrm {PV}}\), which implies stability of the resolution against pileup. In addition, their resolutions agree within 1 \(\text {GeV}\) throughout the \({N}_{\mathrm {PV}}\) range. In the 0-jet sample, the TST and Track \(E_{\mathrm {T}}^{\mathrm {miss}}\) are both primarily reconstructed from tracks; however, small differences arise mostly from accounting for photons in the TST \(E_{\text {T}}^{\text {miss}}\) reconstruction algorithm. The CST \(E_{\text {T}}^{\text {miss}}\) is directly affected by the pileup as its reconstruction does not apply any pileup suppression techniques. Therefore, the CST \(E_{\text {T}}^{\text {miss}}\) has the largest dependence on \({N}_{\mathrm {PV}}\), with a resolution ranging from 7 \(\text {GeV}\) at \({N}_{\mathrm {PV}}\)\(=\) 2 to around 23 \(\text {GeV}\) at \({N}_{\mathrm {PV}}\)\(=\) 25. The \(E_{\text {T}}^{\text {miss}}\) resolution of the EJAF distribution, while better than that of the CST \(E_{\text {T}}^{\text {miss}}\) , is not as good as that of the other pileup-suppressing algorithms.

For the inclusive sample in Fig. 7b, the Track \(E_{\mathrm {T}}^{\mathrm {miss}}\) is the most stable with respect to pileup with almost no dependence on \({N}_{\mathrm {PV}}\). For \({N}_{\mathrm {PV}}\) > 20, the Track \(E_{\text {T}}^{\text {miss}}\) has the best resolution showing that pileup creates a larger degradation in the resolution of the other \(E_{\text {T}}^{\text {miss}}\) distributions than excluding neutral particles, as the Track \(E_{\text {T}}^{\text {miss}}\) algorithm does. The EJAF \(E_{\text {T}}^{\text {miss}}\) algorithm does not reduce the pileup dependence as much as the TST and STVF \(E_{\text {T}}^{\text {miss}}\) algorithms, and the CST \(E_{\text {T}}^{\text {miss}}\) again has the largest dependence on \({N}_{\mathrm {PV}}\).

Fig. 7 The resolution obtained from the combined distribution of \(E_\mathrm{x}^\mathrm{miss}\) and \(E_\mathrm{y}^\mathrm{miss}\) for the CST, STVF, EJAF, TST, and Track \(E_{\mathrm {T}}^{\mathrm {miss}}\) algorithms as a function of \({N}_{\mathrm {PV}}\) in (a) 0-jet and (b) inclusive \(\mathrm{Z} \rightarrow \mu {}\mu \) events in data. The insets at the bottom of the figures show the ratios of the data to the MC predictions

Figure 7 also shows that the pileup dependence of the TST, CST, EJAF and STVF \(E_{\text {T}}^{\text {miss}}\) is smaller in the 0-jet sample than in the inclusive sample. Hence, the evolution of the \(E_{\text {T}}^{\text {miss}}\) resolution is shown for different numbers of jets in Fig. 8 with the TST \(E_{\text {T}}^{\text {miss}}\) algorithm as a representative example. The jet counting for this figure includes only the jets used by the TST \(E_{\text {T}}^{\text {miss}}\) algorithm, so the \(\text {JVF}\) criterion discussed in Sect. 4.1.3 is applied. Comparing the 0-jet, 1-jet and \(\ge \)2-jet distributions, the resolution is degraded by 4–5 \(\text {GeV}\) with each additional jet, which is much larger than any dependence on \({N}_{\mathrm {PV}}\). The inclusive distribution has a larger slope with respect to \({N}_{\mathrm {PV}}\) than the individual jet categories, which indicates that the behaviour seen in the inclusive sample is driven by an increased number of pileup jets included in the \(E_{\text {T}}^{\text {miss}}\) calculation at larger \({N}_{\mathrm {PV}}\).

Fig. 8 The resolution of the combined distribution of \(E_\mathrm{x}^\mathrm{miss}\) and \(E_\mathrm{y}^\mathrm{miss}\) for the TST \(E_{\mathrm {T}}^{\mathrm {miss}}\) as a function of \({N}_{\mathrm {PV}}\) for the 0-jet, 1-jet, \(\ge \) 2-jet, and inclusive \(\mathrm{Z} \rightarrow \mu {}\mu \) samples. The data (closed markers) and MC simulation (open markers) are overlaid. The jet counting uses the same \(\text {JVF}\) criterion as the TST \(E_{\text {T}}^{\text {miss}}\) reconstruction algorithm

6.1.2 Resolution of the \(E_{\text {T}}^{\text {miss}}\) as a function of \(\Sigma E_{\mathrm {T}}\)

The resolutions of the \(E_{\text {T}}^{\text {miss}}\) reconstruction algorithms are compared as a function of the scalar sum of transverse momentum in the event, as calculated using Eq. (4). In Ref. [1], the CST \(E_{\text {T}}^{\text {miss}}\) resolution was observed to depend linearly on the square root of the \(\Sigma E_{\mathrm {T}}\) computed with the CST \(E_{\text {T}}^{\text {miss}}\) components. However, the \(\Sigma E_{\mathrm {T}}\) used in this subsection is calculated with the TST \(E_{\text {T}}^{\text {miss}}\) algorithm. This allows the resolution to be studied as a function of the momenta of particles from the selected PV, without including the amount of pileup activity in the event. Figure 9 shows the resolution as a function of \(\Sigma E_{\mathrm {T}}\) (TST) for \(Z \rightarrow \mu \mu \) data and MC simulation in the 0-jet and inclusive samples.
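The approximately linear dependence of the resolution on \(\sqrt{\Sigma E_{\mathrm {T}}}\) can be quantified with a one-parameter fit through the origin. The sketch below extracts a stochastic coefficient \(k\) in \(\sigma = k\sqrt{\Sigma E_{\mathrm {T}}}\); the function name and the least-squares form are assumptions for illustration, not the fit used in the paper.

```python
import numpy as np

def fit_sqrt_scaling(sum_et, resolution):
    """Least-squares fit of resolution = k * sqrt(sum_et).

    Models the linear dependence of the resolution on sqrt(Sigma E_T)
    noted for the CST E_T^miss; returns the coefficient k.
    """
    x = np.sqrt(np.asarray(sum_et, dtype=float))
    y = np.asarray(resolution, dtype=float)
    # One-parameter linear fit through the origin: k = (x.y) / (x.x)
    return float(np.dot(x, y) / np.dot(x, x))
```

A fit of this form applied separately to each algorithm's resolution curve would summarise Fig. 9 in a single number per algorithm.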

In the 0-jet sample shown in Fig. 9a, the use of tracking information in the soft term, especially for the STVF, TST, and Track \(E_{\mathrm {T}}^{\mathrm {miss}}\), greatly improves the resolution relative to the CST \(E_{\text {T}}^{\text {miss}}\). The EJAF \(E_{\text {T}}^{\text {miss}}\) has a better resolution than the CST \(E_{\text {T}}^{\text {miss}}\) but does not perform as well as the other reconstruction algorithms. All of the resolution curves increase approximately linearly with \(\Sigma E_{\mathrm {T}}\) (TST); however, the Track \(E_{\text {T}}^{\text {miss}}\) resolution increases sharply starting at \(\Sigma E_{\mathrm {T}}\) (TST) \(=\) 200 \(\text {GeV}\) because neutral contributions such as photons are missed. The resolution predicted by the simulation is about 5% larger than in data for all \(E_{\text {T}}^{\text {miss}}\) algorithms at \(\Sigma E_{\mathrm {T}}\) (TST) \(=\) 50 \(\text {GeV}\), but the agreement improves as \(\Sigma E_{\mathrm {T}}\) (TST) increases, up to around \(\Sigma E_{\mathrm {T}}\) (TST) \(=\) 200 \(\text {GeV}\). Events with jets can end up in the 0-jet event selection, for example, if a jet is misidentified as a hadronically decaying \(\tau \)-lepton. The \(\sum p_{\mathrm {T}}^{\tau }\) increases with \(\Sigma E_{\mathrm {T}}\) (TST), and the rate of jets misreconstructed as hadronically decaying \(\tau \)-leptons is not well modelled by the simulation, which leads to a larger \(E_{\text {T}}^{\text {miss}}\) resolution in the simulation at high \(\Sigma E_{\mathrm {T}}\) (TST) than that observed in the data. The Track \(E_{\text {T}}^{\text {miss}}\) can be more strongly affected by misidentified jets because the neutral particles from such high-\({p}_{\text {T}}\) jets are not included.

For the inclusive sample in Fig. 9b, the pileup-suppressed \(E_{\mathrm {T}}^{\mathrm {miss}}\) distributions have better resolution than the CST \(E_{\text {T}}^{\text {miss}}\) for \(\Sigma E_{\mathrm {T}}\) (TST) < 200 \(\text {GeV}\), but these events are mostly those with no associated jets. For higher \(\Sigma E_{\mathrm {T}}\) (TST), the \(\Sigma E_{\mathrm {T}}^\mathrm{jets}\) term starts to dominate both the resolution and the \(\Sigma E_{\mathrm {T}}\) (TST). Since the vector sum of jet momenta is mostly common to all \(E_{\mathrm {T}}^{\mathrm {miss}}\) algorithms except the Track \(E_{\mathrm {T}}^{\mathrm {miss}}\), those algorithms show similar resolution performance. At larger \(\Sigma E_{\mathrm {T}}\) (TST), the Track \(E_{\mathrm {T}}^{\mathrm {miss}}\) resolution begins to degrade relative to the other algorithms because it does not include the high-\({p}_{\text {T}}\) neutral particles coming from jets. The ratio of data to MC simulation for the Track \(E_{\text {T}}^{\text {miss}}\) distribution is close to one, while for the other algorithms the MC simulation is below the data by about 5% at large \(\Sigma E_{\mathrm {T}}\) (TST). While the Track \(E_{\text {T}}^{\text {miss}}\) appears well modelled by the Alpgen\(+\)Pythia simulation used in this figure, the modelling depends strongly on the parton shower model.

Fig. 9 The resolution of the combined distribution of \(E_\mathrm{x}^\mathrm{miss}\) and \(E_\mathrm{y}^\mathrm{miss}\) for the CST, STVF, EJAF, TST, and Track \(E_{\mathrm {T}}^{\mathrm {miss}}\) as a function of \(\Sigma E_{\mathrm {T}}\) (TST) in \(\mathrm{Z} \rightarrow \mu {}\mu \) events in data for the (a) 0-jet and (b) inclusive samples. The insets at the bottom of the figures show the ratios of the data to the MC predictions

6.2 The \(E_{\text {T}}^{\text {miss}}\) response

The balance of \(\vec {E}_{{\mathrm{T}}}^{\mathrm{miss}}\) against the vector boson \(\vec {p_{\text {T}}}\) in \(W/Z+\)jets events is used to evaluate the \(E_{\text {T}}^{\text {miss}}\) response. A lack of balance is a global indicator of biases in \(E_{\text {T}}^{\text {miss}}\) reconstruction and implies a systematic misestimation of at least one of the \(E_{\text {T}}^{\text {miss}}\) terms, possibly coming from an imperfect selection or calibration of the reconstructed physics objects. The procedure to evaluate the response differs between \(Z\mathrm {+jets}\) events (Sect. 6.2.1) and \(W\mathrm {+jets}\) events (Sect. 6.2.2) because of the high-\(p_{\text {T}}\) neutrino in the leptonic decay of the W boson.

In events with \(\mathrm{Z} \rightarrow \mu {}\mu \) decays, the \(\vec {p_{\text {T}}}\) of the Z boson defines an axis in the transverse plane of the ATLAS detector, and for events with 0-jets, the \(\vec {E}_{{\mathrm{T}}}^{\mathrm{miss}}\) should balance the \(\vec {p_{\text {T}}}\) of the Z boson (\(\vec {p}_{\mathrm {T}}^{Z}\;\)) along this axis. Comparing the response in events with and without jets allows distinction between the jet and soft-term responses. The component of the \(\vec {E}_{{\mathrm{T}}}^{\mathrm{miss}}\) along the \(\vec {p}_{\mathrm {T}}^{Z}\;\) axis is sensitive to biases in detector responses [51]. The unit vector of \(\vec {p}_{\mathrm {T}}^{Z}\;\) is labelled as \(\hat{\mathcal {A}}_Z\) and is defined as \(\hat{\mathcal {A}}_Z = \vec {p}_{\mathrm {T}}^{Z}/p_{\mathrm {T}}^{Z}\).

The recoil is computed as \(\vec {\mathcal {R}} = \vec {E}_{{\mathrm{T}}}^{\mathrm{miss}} + \vec {p}_{\mathrm {T}}^{Z}\); since the \(\vec {E}_{{\mathrm{T}}}^{\mathrm{miss}}\) includes a negative vector sum over the lepton momenta, the addition of \(\vec {p}_{\mathrm {T}}^{Z}\;\) removes their contribution. With an ideal detector and \(E_{\text {T}}^{\text {miss}}\) reconstruction algorithm, \(\mathrm{Z} \rightarrow \ell{}\ell\) events have no \(E_{\text {T}}^{\text {miss}}\), and \(\vec {\mathcal {R}}\) balances with \(\vec {p}_{\mathrm {T}}^{Z}\;\) exactly. For the real detector and \(E_{\text {T}}^{\text {miss}}\) reconstruction algorithm, the degree of balance is measured by projecting the recoil onto \(\hat{\mathcal {A}}_Z\); the relative recoil is defined as the projection \(\vec {\mathcal {R}}{}\cdot \hat{\mathcal {A}}_Z{}\) divided by \(p_{\mathrm {T}}^{Z}\), giving a dimensionless estimate that is unity if the \(E_{\text {T}}^{\text {miss}}\) is ideally reconstructed and calibrated. Figure 10 shows the mean relative recoil versus \(p_{\mathrm {T}}^{Z}\) for \(\mathrm{Z} \rightarrow \mu {}\mu \) events, where the average value is indicated by angle brackets. The data and MC simulation agree within around 10% for all \(E_{\text {T}}^{\text {miss}}\) algorithms across the full \(p_{\mathrm {T}}^{Z}\) range; however, the agreement is a few percent worse for \(p_{\mathrm {T}}^{Z}\) > 50 \(\text {GeV}\) in the 0-jet sample.
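The recoil projection described above translates directly into a per-event calculation. The sketch below (function and variable names are illustrative) computes \(\vec {\mathcal {R}}{}\cdot \hat{\mathcal {A}}_Z{}/p_{\mathrm {T}}^{Z}\) from the \(\vec {E}_{{\mathrm{T}}}^{\mathrm{miss}}\) and \(\vec {p}_{\mathrm {T}}^{Z}\;\) components, assuming the recoil is the sum \(\vec {E}_{{\mathrm{T}}}^{\mathrm{miss}} + \vec {p}_{\mathrm {T}}^{Z}\) so that lepton contributions cancel.

```python
import numpy as np

def relative_recoil(met_x, met_y, pz_x, pz_y):
    """Per-event relative recoil R . A_Z / p_T^Z.

    R = E_T^miss + p_T^Z (adding p_T^Z removes the lepton contribution),
    A_Z = p_T^Z / |p_T^Z| is the unit vector along the Z boson p_T.
    Unity corresponds to a perfectly balanced, ideally calibrated event.
    """
    met_x, met_y, pz_x, pz_y = map(np.asarray, (met_x, met_y, pz_x, pz_y))
    rx, ry = met_x + pz_x, met_y + pz_y      # recoil components
    pt_z = np.hypot(pz_x, pz_y)              # |p_T^Z|
    ax, ay = pz_x / pt_z, pz_y / pt_z        # unit vector A_Z
    return (rx * ax + ry * ay) / pt_z
```

Averaging this quantity in bins of \(p_{\mathrm {T}}^{Z}\) reproduces the curves of Fig. 10; values below one indicate an underestimated soft term.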

The \(\mathrm{Z} \rightarrow \mu {}\mu \) events in the 0-jet sample in Fig. 10a have a relative recoil significantly lower than unity (\(\langle \vec {\mathcal {R}}{}\cdot \hat{\mathcal {A}}_Z{}/p_{\mathrm {T}}^{Z}{}\rangle \) < 1) throughout the \(p_{\mathrm {T}}^{Z}\) range. In the 0-jet sample, the relative recoil estimates how well the soft term balances the \(\vec {p_{\text {T}}}\) of the muons from the Z decay, which are better measured than the soft term. A relative recoil below one indicates that the soft term is underestimated. The CST \(E_{\text {T}}^{\text {miss}}\) has a relative recoil of \(\langle \vec {\mathcal {R}}{}\cdot \hat{\mathcal {A}}_Z{}/p_{\mathrm {T}}^{Z}{}\rangle \)\(\sim \) 0.5 throughout the \(p_{\mathrm {T}}^{Z}\) range, giving it the best recoil performance among the \(E_{\text {T}}^{\text {miss}}\) algorithms. The TST and Track \(E_{\text {T}}^{\text {miss}}\) have slightly larger biases than the CST \(E_{\text {T}}^{\text {miss}}\) because neutral particles are not considered in the soft term. The TST \(E_{\text {T}}^{\text {miss}}\) recoil improves relative to that of the Track \(E_{\text {T}}^{\text {miss}}\) for \(p_{\mathrm {T}}^{Z}\) > 40 \(\text {GeV}\) because of the inclusion of photons in its reconstruction. The relative recoil distribution for the STVF \(E_{\text {T}}^{\text {miss}}\) shows the largest bias for \(p_{\mathrm {T}}^{Z}\) < 60 \(\text {GeV}\). The STVF algorithm scales the recoil down globally by the factor \(\alpha _{\text {STVF}}\) defined in Eq. (11), and this correction further decreases the already underestimated soft term. The \(\alpha _{\text {STVF}}\) increases with \(p_{\mathrm {T}}^{Z}\), from 0.06 at \(p_{\mathrm {T}}^{Z}\)\(=\) 0 \(\text {GeV}\) to around 0.15 at \(p_{\mathrm {T}}^{Z}\)\(=\) 50 \(\text {GeV}\); this results in a rise in the relative recoil, which approaches that of the TST \(E_{\text {T}}^{\text {miss}}\) near \(p_{\mathrm {T}}^{Z}\)\(\sim \) 70 \(\text {GeV}\).

In Fig. 10b, the inclusive \(\mathrm{Z} \rightarrow \mu {}\mu \) events have a significantly underestimated relative recoil for \(p_{\mathrm {T}}^{Z}\) < 40 \(\text {GeV}\). The balance between \(\vec {\mathcal {R}}\) and \(\vec {p}_{\mathrm {T}}^{Z}\;\) improves with \(p_{\mathrm {T}}^{Z}\) because of an increase in the number of events with high-\({p}_{\text {T}}\) calibrated jets recoiling against the Z boson. The presence of jets in the hard term also reduces the sensitivity to the soft term, which is difficult to measure accurately. The difficulty of isolating soft-term effects from the contributions of high-\({p}_{\text {T}}\) physics objects is one reason why the soft term is not corrected. As in the 0-jet sample, the CST \(E_{\text {T}}^{\text {miss}}\) has a significantly under-calibrated relative recoil in the low-\(p_{\mathrm {T}}^{Z}\) region, and all of the other \(E_{\text {T}}^{\text {miss}}\) algorithms have a lower relative recoil than the CST \(E_{\text {T}}^{\text {miss}}\). Of the pileup-suppressing \(E_{\text {T}}^{\text {miss}}\) algorithms, the TST \(E_{\text {T}}^{\text {miss}}\) comes closest to the relative recoil of the CST \(E_{\text {T}}^{\text {miss}}\). The relative recoil of the Track \(E_{\text {T}}^{\text {miss}}\) is significantly lower than unity because the neutral particles recoiling against the Z boson are not included in its reconstruction. Finally, the STVF \(E_{\text {T}}^{\text {miss}}\) shows the lowest relative recoil among the object-based \(E_{\text {T}}^{\text {miss}}\) algorithms, as discussed above for Fig. 10a, even lower than the Track \(E_{\text {T}}^{\text {miss}}\) for \(p_{\mathrm {T}}^{Z}\) < 16 \(\text {GeV}\).

Fig. 10 \(\langle \vec {\mathcal {R}}{}\cdot \hat{\mathcal {A}}_Z{}/p_{\mathrm {T}}^{Z}{}\rangle \) as a function of \(p_{\mathrm {T}}^{Z}\) for the (a) 0-jet and (b) inclusive events in \(\mathrm{Z} \rightarrow \mu {}\mu \) data. The insets at the bottom of the figures show the ratios of the data to the MC predictions

For simulated events with intrinsic \(E_{\text {T}}^{\text {miss}}\), the response is studied by looking at the relative mismeasurement of the reconstructed \(E_{\text {T}}^{\text {miss}}\). This is referred to here as the “linearity”, and is a measure of how consistent the reconstructed \(E_{\text {T}}^{\text {miss}}\) is with the \(E_{\mathrm {T}}^{\mathrm {miss,True}}\). The linearity is defined as the mean value of the ratio \((E_{\mathrm {T}}^\mathrm{miss}-E_{\mathrm {T}}^\mathrm{miss,True})/E_{\mathrm {T}}^\mathrm{miss,True}\), and is expected to be zero if the \(E_{\mathrm {T}}^{\mathrm {miss}}\) is reconstructed at the correct scale.
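The linearity definition above can be sketched in a few lines; the function name is an illustrative assumption.

```python
import numpy as np

def linearity(met_reco, met_true):
    """Mean of (E_T^miss - E_T^miss,True) / E_T^miss,True.

    Zero for a perfectly calibrated reconstruction; positive values
    indicate an overestimated scale, negative an underestimated one.
    """
    met_reco = np.asarray(met_reco, dtype=float)
    met_true = np.asarray(met_true, dtype=float)
    return float(np.mean((met_reco - met_true) / met_true))
```

Evaluating this mean in bins of \(E_{\mathrm {T}}^{\mathrm {miss,True}}\) yields the curves discussed for Fig. 11.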

Since the linearity studies are purely simulation-based, no selection on the \(E_{\mathrm {T}}^{\mathrm {miss}}\) or \(m_{\mathrm {T}}\) is applied, in order to avoid biases. In Fig. 11, the linearity for simulated \(W\rightarrow \mu {}\nu \) events is presented as a function of the \(E_{\mathrm {T}}^{\mathrm {miss,True}}\). Despite the relaxed selection, a positive linearity is evident for \(E_{\mathrm {T}}^{\mathrm {miss,True}}\) < 40 \(\text {GeV}\), due to the finite resolution of the \(E_{\mathrm {T}}^{\mathrm {miss}}\) reconstruction and the fact that the reconstructed \(E_{\mathrm {T}}^{\mathrm {miss}}\) is positive by definition. The CST \(E_{\text {T}}^{\text {miss}}\) has the largest deviation from zero at low \(E_{\mathrm {T}}^{\mathrm {miss,True}}\) because it has the largest \(E_{\mathrm {T}}^{\mathrm {miss}}\) resolution.

For the events in the 0-jet sample in Fig. 11a, all \(E_{\text {T}}^{\text {miss}}\) algorithms have a negative linearity for \(E_{\mathrm {T}}^{\mathrm {miss,True}}\) > 40 \(\text {GeV}\), which diminishes for \(E_{\mathrm {T}}^{\mathrm {miss,True}}\)\(\gtrsim 60\)\(\text {GeV}\). The region of \(E_{\mathrm {T}}^{\mathrm {miss,True}}\) between 40 and 60 \(\text {GeV}\) mostly comprises events lying in the Jacobian peak of the W transverse mass, and these events contain mostly on-shell W bosons. For \(E_{\mathrm {T}}^{\mathrm {miss,True}}\)\(\gtrsim \) 40 \(\text {GeV}\), an on-shell W boson must have non-zero \({p}_{\text {T}}\), which typically comes from its recoil against jets. However, no reconstructed or generator-level jets are found in this 0-jet sample. Therefore, most of the events with 40 < \(E_{\mathrm {T}}^{\mathrm {miss,True}}\) < 60 \(\text {GeV}\) have jets below the 20 \(\text {GeV}\) threshold contributing to the soft term, which is not calibrated. The underestimation of the soft term, described in Sect. 6.2.1, causes the linearity to deviate further from zero in this region. Events with \(E_\mathrm{T}^\mathrm{miss,True}\) > 60 \(\text {GeV}\) mostly contain off-shell W bosons produced with very low \({p}_{\text {T}}\). For these events, the \(\vec {p_{\text {T}}}\) contributions to the \(E_{\text {T}}^{\text {miss}}\) reconstruction come mostly from the well-measured muon \(\vec {p_{\text {T}}}\), and the soft term plays a much smaller role. Hence, the linearity improves as the impact of the soft term decreases with larger \(E_\mathrm{T}^\mathrm{miss,True}\).

For inclusive events in Fig. 11b with \(E_{\mathrm {T}}^{\mathrm {miss,True}}\)\(>40\)\(\text {GeV}\), the deviation of the linearity from zero is smaller than 5% for the CST \(E_{\text {T}}^{\text {miss}}\). The linearity of the TST \(E_{\text {T}}^{\text {miss}}\) deviates from zero by less than 10% in the range 40–60 \(\text {GeV}\) and improves for higher \(E_{\mathrm {T}}^{\mathrm {miss,True}}\) values. The STVF \(E_{\text {T}}^{\text {miss}}\) has the most negative bias in the linearity among the object-based \(E_{\text {T}}^{\text {miss}}\) algorithms for \(E_{\mathrm {T}}^{\mathrm {miss,True}}\) > 40 \(\text {GeV}\). Apart from this, the TST, CST, STVF, and EJAF \(E_{\text {T}}^{\text {miss}}\) algorithms perform similarly across the \(E_{\mathrm {T}}^{\mathrm {miss,True}}\) range. As expected, the linearity of the Track \(E_{\text {T}}^{\text {miss}}\) remains below zero because neutral particles in jets are not accounted for.