Abstract

Artificial intelligence offers the potential to automate challenging data-processing tasks in collider physics. Here, to establish its prospects, we explore to what extent deep learning with convolutional neural networks can discriminate quark and gluon jets better than observables designed by physicists. Our approach builds upon the paradigm that a jet can be treated as an image, with intensity given by the local calorimeter deposits. We supplement this construction by adding color to the images, with red, green and blue intensities given by the transverse momentum in charged particles, transverse momentum in neutral particles, and pixel-level charged particle counts. Overall, the deep networks match or outperform traditional jet variables. We also find that, while various simulations produce different quark and gluon jets, the neural networks are surprisingly insensitive to these differences, similar to traditional observables. This suggests that the networks can extract robust physical information from imperfect simulations.

@article{osti_1360783,
title = {Deep learning in color: towards automated quark/gluon jet discrimination},
author = {Komiske, Patrick T. and Metodiev, Eric M. and Schwartz, Matthew D.},
doi = {10.1007/JHEP01(2017)110},
journal = {Journal of High Energy Physics},
number = {1},
volume = {2017},
place = {United States},
year = {2017},
month = jan
}
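The abstract above describes encoding each jet as a three-channel image. A minimal sketch of that construction, assuming a 33×33 pixel grid over an illustrative ±0.4 window in (η, φ) around the jet axis and a simplified particle record — the paper's exact preprocessing, pixelation, and normalization are not reproduced here:

```python
def jet_image(particles, npix=33, half_width=0.4):
    """Build a 'color' jet image from particles given as
    (eta, phi, pt, is_charged) tuples relative to the jet axis.

    Returns an npix x npix x 3 nested list:
      channel 0 (red)   = charged-particle transverse momentum,
      channel 1 (green) = neutral-particle transverse momentum,
      channel 2 (blue)  = charged-particle count per pixel.
    """
    img = [[[0.0, 0.0, 0.0] for _ in range(npix)] for _ in range(npix)]
    step = 2 * half_width / npix
    for eta, phi, pt, is_charged in particles:
        i = int((eta + half_width) / step)
        j = int((phi + half_width) / step)
        if 0 <= i < npix and 0 <= j < npix:
            if is_charged:
                img[i][j][0] += pt    # red: charged pT
                img[i][j][2] += 1.0   # blue: charged multiplicity
            else:
                img[i][j][1] += pt    # green: neutral pT
    return img

# Toy event: one charged particle at the jet axis, one neutral off-axis.
img = jet_image([(0.0, 0.0, 10.0, True), (0.1, -0.1, 5.0, False)])
```

The three channels give the network access to the charged/neutral energy split and the pixel-level multiplicity, which is the extra information the "color" construction adds over a grayscale calorimeter image.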

A likelihood-based discriminant for the identification of quark- and gluon-initiated jets is built and validated using 4.7 fb⁻¹ of proton–proton collision data at √s = 7 TeV collected with the ATLAS detector at the LHC. Data samples with enriched quark or gluon content are used in the construction and validation of templates of jet properties that are the input to the likelihood-based discriminant. The discriminating power of the jet tagger is established in both data and Monte Carlo samples within a systematic uncertainty of ≈ 10–20%. In data, light-quark jets can be tagged with an efficiency of ≈ 50% while achieving a gluon-jet mis-tag rate of ≈ 25% in a pT range between 40 GeV and 360 GeV for jets in the acceptance of the tracker. The rejection of gluon-jets found in the data is significantly below what is attainable using a Pythia 6 Monte Carlo simulation, where gluon-jet mis-tag rates of 10% can be reached for a 50% selection efficiency of light-quark jets using the same jet properties.
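A template-based likelihood discriminant of the kind described can be sketched by combining per-observable template probabilities into the ratio L_q / (L_q + L_g). The observables, binning, and template numbers below are toy assumptions, not the ATLAS templates:

```python
def likelihood_ratio(values, quark_templates, gluon_templates):
    """values: observed bin index for each observable.
    *_templates: per-observable lists of normalized bin probabilities.
    Returns L_q / (L_q + L_g); larger values are more quark-like."""
    lq = lg = 1.0
    for v, q_t, g_t in zip(values, quark_templates, gluon_templates):
        lq *= q_t[v]   # product of per-observable quark likelihoods
        lg *= g_t[v]   # product of per-observable gluon likelihoods
    return lq / (lq + lg)

# Toy 3-bin templates: quark jets favor low track multiplicity and
# small jet width; gluon jets favor the opposite.
ntrk_q, ntrk_g = [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]
width_q, width_g = [0.6, 0.3, 0.1], [0.1, 0.4, 0.5]

# A jet falling in the lowest bin of both observables is quark-like.
d = likelihood_ratio([0, 0], [ntrk_q, width_q], [ntrk_g, width_g])
```

Multiplying per-observable likelihoods assumes the observables are independent within each class; in practice correlated inputs degrade the combination, which is one reason the real templates are validated on enriched data samples.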

Here, we study the impact of including quark- and gluon-initiated jet discrimination in the search for strongly interacting supersymmetric particles at the LHC. Taking the example of gluino pair production, considerable improvement is observed in the LHC search reach on including the jet substructure observables to the standard kinematic variables within a multivariate analysis. In particular, quark and gluon jet separation has higher impact in the region of intermediate mass-gap between the gluino and the lightest neutralino, as the difference between the signal and the standard model background kinematic distributions is reduced in this region. We also compare the predictions from different Monte Carlo event generators to estimate the uncertainty originating from the modelling of the parton shower and hadronization processes.
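The combination of kinematic and quark/gluon substructure inputs in a multivariate analysis can be illustrated with a toy discriminant; a simple logistic regression on synthetic data stands in here for the analysis's actual MVA, and the feature distributions are invented for illustration:

```python
import math
import random

def train_logistic(data, labels, lr=0.1, epochs=200):
    """Plain stochastic-gradient logistic regression on small toy data."""
    w = [0.0] * len(data[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(data, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))   # predicted signal probability
            g = p - y                        # gradient of the log loss
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

random.seed(0)
# Feature 0: a kinematic variable; feature 1: a quark/gluon-like
# substructure score. Signal is shifted upward in both (toy choice).
signal = [[random.gauss(1.0, 0.5), random.gauss(0.8, 0.3)] for _ in range(200)]
backgr = [[random.gauss(0.5, 0.5), random.gauss(0.3, 0.3)] for _ in range(200)]
w, b = train_logistic(signal + backgr, [1] * 200 + [0] * 200)
```

The point of the toy is structural: when kinematic distributions of signal and background overlap (the intermediate mass-gap region), an extra discriminating input receives a nonzero weight and recovers separation the kinematics alone cannot provide.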

Charged track multiplicity is among the most powerful observables for discriminating quark- from gluon-initiated jets. Despite its utility, it is not infrared and collinear (IRC) safe, so perturbative calculations are limited to studying the energy evolution of multiplicity moments. While IRC-safe observables, like jet mass, are perturbatively calculable, their distributions often exhibit Casimir scaling, such that their quark/gluon discrimination power is limited by the ratio of quark to gluon color factors. In this paper, we introduce new IRC-safe counting observables whose discrimination performance exceeds that of jet mass and approaches that of track multiplicity. The key observation is that track multiplicity is approximately Poisson distributed, with more suppressed tails than the Sudakov peak structure from jet mass. By using an iterated version of the soft drop jet grooming algorithm, we can define a “soft drop multiplicity” which is Poisson distributed at leading-logarithmic accuracy. In addition, we calculate the next-to-leading-logarithmic corrections to this Poisson structure. If we allow the soft drop groomer to proceed to the end of the jet branching history, we can define a collinear-unsafe (but still infrared-safe) counting observable. Exploiting the universality of the collinear limit, we define generalized fragmentation functions to study the perturbative energy evolution of collinear-unsafe multiplicity.
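The iterated soft drop counting described above can be sketched as a walk along the hard branch of the jet's declustering history, incrementing a counter for each splitting that passes the soft drop condition z > z_cut (ΔR/R0)^β. The flat branching list and parameter values below are illustrative stand-ins for a real Cambridge/Aachen clustering tree:

```python
def soft_drop_multiplicity(branchings, z_cut=0.007, beta=-1.0, R0=0.8):
    """Count branchings along the hard branch passing soft drop.

    branchings: list of (z, dR) pairs, where z is the momentum fraction
    of the softer subjet and dR the opening angle, ordered from the
    widest splitting to the narrowest."""
    n = 0
    for z, dR in branchings:
        if z > z_cut * (dR / R0) ** beta:
            n += 1   # this splitting survives the groomer
    return n

# Toy hard-branch history: (momentum fraction z, opening angle dR).
# With beta < 0 the threshold rises at small angles, so the narrow
# splittings at the end fail the condition.
history = [(0.30, 0.40), (0.05, 0.20), (0.002, 0.10), (0.10, 0.05)]
n_sd = soft_drop_multiplicity(history)
```

Because each surviving emission contributes an independent unit to the count, the observable inherits an approximately Poisson distribution, which is the source of its discrimination power relative to Sudakov-peaked observables like jet mass.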

Pathology reports are a primary source of information for cancer registries, which process high volumes of free-text reports annually. Information extraction and coding is a manual, labor-intensive process. In this study, we investigated deep learning with a convolutional neural network (CNN) for extracting ICD-O-3 topography codes from a corpus of breast and lung cancer pathology reports. We performed two experiments, using a CNN and a more conventional term frequency vector approach, to assess the effects of class prevalence and inter-class transfer learning. The experiments were based on a set of 942 pathology reports with human expert annotations as the gold standard. CNN performance was compared against a more conventional term frequency vector space approach. We observed that the deep learning models consistently outperformed the conventional approaches in the class prevalence experiment, resulting in micro- and macro-F score increases of up to 0.132 and 0.226, respectively, when class labels were well populated. Specifically, the best performing CNN achieved a micro-F score of 0.722 over 12 ICD-O-3 topography codes. Transfer learning provided a consistent but modest performance boost for the deep learning methods, but trends were contingent on CNN method and cancer site. These encouraging results demonstrate the potential of deep learning for automated abstraction of pathology reports.
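The micro- and macro-F scores quoted above aggregate per-class counts differently: micro-F pools true positives, false positives, and false negatives across all classes, while macro-F averages the per-class F scores, so rare classes weigh more heavily. A small sketch with illustrative counts (the ICD-O-3 codes shown are examples, not the study's label set):

```python
def f_scores(counts):
    """counts: dict mapping class -> (tp, fp, fn).
    Returns (micro_F, macro_F)."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    micro = 2 * tp / (2 * tp + fp + fn)   # pooled over all classes
    per_class = [
        2 * tpc / (2 * tpc + fpc + fnc) if tpc else 0.0
        for tpc, fpc, fnc in counts.values()
    ]
    macro = sum(per_class) / len(per_class)  # unweighted class average
    return micro, macro

# A well-populated class (C50.9, breast) and a poorly predicted one
# (C34.9, lung): macro-F is dragged down by the weak class.
micro, macro = f_scores({"C50.9": (80, 10, 10), "C34.9": (5, 20, 20)})
```

The gap between the two numbers in this toy mirrors the class-prevalence effect the study probes: micro-F tracks overall accuracy on the dominant class, while macro-F exposes poor performance on sparsely populated codes.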

A unique correlative approach for automated segmentation of large 3D nanotomography datasets obtained using Transmission X-ray Microscopy (TXM) in an Al-Cu alloy has been introduced. Automated segmentation using a Convolutional Neural Network (CNN) architecture based on a deep learning approach was employed. This extremely versatile technique is capable of emulating the manual segmentation process effectively. Coupling this technique with post-scanning SEM imaging ensured precise estimation of 3D morphological parameters from nanotomography. The segmentation process as well as subsequent analysis was expedited by several orders of magnitude. Quantitative comparison between segmentation performed manually and using the CNN architecture established the accuracy of this automated technique. Its ability to robustly process ultra-large volumes of data in relatively small time frames can exponentially accelerate tomographic data analysis, possibly opening up novel avenues for performing 4D characterization experiments with finer time steps.
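A quantitative comparison between manual and CNN segmentation is typically made with a mask-overlap score; a common choice (assumed here, since the abstract does not name the metric) is the Dice coefficient, sketched on toy flattened binary masks:

```python
def dice(mask_a, mask_b):
    """Dice coefficient between two flattened binary masks (0/1 lists):
    2 * |A ∩ B| / (|A| + |B|). Returns 1.0 for two empty masks."""
    inter = sum(a & b for a, b in zip(mask_a, mask_b))
    total = sum(mask_a) + sum(mask_b)
    return 2 * inter / total if total else 1.0

# Toy voxel masks: manual (gold standard) vs. CNN prediction.
manual = [1, 1, 1, 0, 0, 1, 0, 0]
cnn    = [1, 1, 0, 0, 0, 1, 1, 0]
score = dice(manual, cnn)
```

A score near 1 indicates the CNN reproduces the manual labels voxel-for-voxel; computed per phase over a full 3D volume, this is the kind of check that validates replacing manual segmentation with the automated pipeline.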