Big data for the universe

Astronomers at Lomonosov Moscow State University, in cooperation with their French colleagues and with the help of citizen scientists, have released «The Reference Catalog of galaxy SEDs» (RCSED), which contains value-added information about 800,000 galaxies. The catalog is accessible on the web, and its description has been published in the Astrophysical Journal Supplement (impact factor 11.257). Two co-authors of the research paper are undergraduate students at the Faculty of Physics, Lomonosov Moscow State University. While still working on the catalog, the team published several research papers based on its data, including a study in the prestigious interdisciplinary journal Science.

What can one learn using RCSED and why is it unique?

RCSED describes the properties of 800,000 galaxies, derived from an elaborate data analysis. For every galaxy, it presents the stellar composition and the brightness at ultraviolet, optical, and near-infrared wavelengths. From RCSED, one can also access galaxy spectra obtained by the Sloan Digital Sky Survey, measurements of spectral lines, and properties determined from them, such as the chemical composition of the stars and gas in those galaxies. This makes RCSED the first catalog of its kind to contain the results of a detailed, homogeneous analysis for such a large number of objects. Dr. Igor Chilingarian, an astronomer at the Smithsonian Astrophysical Observatory, USA, and a Lead Researcher at Sternberg Astronomical Institute, Lomonosov Moscow State University, says: "For every galaxy we also provide a small cutout image from three sky surveys, which show how the galaxy looks at different wavelengths. This provides us with the data for further investigations." Dr. Ivan Katkov, a Senior Researcher at Sternberg Astronomical Institute, adds: "The analysis of emission line profiles presented in RCSED is substantially more detailed and accurate than the data published in other catalogs."

RCSED is flexible and easy to use. When a user enters an object name or its coordinates in the search field, the web site returns, on a single page, all the information about that object contained in the catalog. One can also use the catalog through Virtual Observatory applications such as TOPCAT. The RCSED web site also provides tutorials, including one that describes the technique Igor Chilingarian and Ivan Zolotukhin used to discover new compact elliptical galaxies, which were later published in the research paper «Isolated compact elliptical galaxies: Stellar systems that ran away».

Another interesting detail about RCSED is that the team actively used the help of citizen scientists to develop the project web site. Among them were high-level experts in software development and web design who have day jobs at the largest Russian IT companies. Dr. Ivan Zolotukhin, a Researcher at Sternberg Astronomical Institute, explains: "Programmers sometimes get burnt out by their routine work, and they would like to do something interesting and pleasant in their spare time, for instance, to help scientists. We are very grateful to them; they have become important members of our team and significantly strengthened our project. It has always been interesting for us to cooperate with IT specialists, and we have many more projects to which they can contribute. So if you use git, program in Python or know HTML/CSS, love stars, have a bit of spare time and are willing to help an international research team, please contact us using the address published on the web page."

Dr. Ivan Katkov adds: "The RCSED catalog became possible thanks to the application of an interdisciplinary Big Data approach, as we had to apply very complex scientific algorithms to a large dataset in a massively parallel way. Eventually, the expertise and resources available at large IT companies would undoubtedly allow researchers to significantly increase the quality and the quantity of research results and to make many important discoveries in astrophysics."

The fact that the RCSED catalog attracted serious interest in the scientific community even during its assembly phase demonstrates its great potential. During the last three years, several external researchers were given access to the catalog on request and, using RCSED data, published over a dozen articles in professional peer-reviewed journals (Astrophysical Journal, Astronomy & Astrophysics, MNRAS). The catalog is the world's largest homogeneous value-added dataset for nearby galaxies, containing information collected with ground-based and space telescopes. The unique research material for extragalactic astrophysics contained in RCSED will certainly help astrophysicists to achieve interesting new scientific results, some of which would probably qualify for publication in the interdisciplinary journals Science and Nature.

RCSED expansion prospects: one million galaxies will be there soon

The current release of the RCSED catalog could have comprised a larger number of galaxies or contained extra information about the objects already included, but for now the scientists have decided to focus on well-characterized datasets, which are described in detail and have known advantages and disadvantages. However, given the project's importance for extragalactic astronomy and observational cosmology, the RCSED team is going to move forward and expand the catalog in the near future.

There are two principal directions for further RCSED development: expanding the galaxy sample and incorporating new data for existing objects. The team is considering including near- and mid-infrared data from the WISE satellite all-sky survey for the entire galaxy sample. However, this requires some additional methodological work in order to homogenize the data for galaxies at different redshifts.

Moreover, it is possible to expand the principal galaxy sample by including spectra from the latest data release of the SDSS-III survey. This would increase the sample from 800,000 to 1.5 million objects.

Incorporating the publicly available spectral data from the Hectospec archive (Igor Chilingarian has played a major role in the Hectospec archive project) will add 300,000 to 400,000 objects at larger distances, whose spectra were collected with the 6.5-meter MMT telescope in Arizona. The current RCSED release comprises mostly nearby galaxies (by cosmological measures), with redshifts smaller than 0.4, because SDSS did not include faint objects. Therefore, the early Universe is not represented in the catalog at all. The Hectospec archive will allow the team to move a little further along the cosmological distance scale, out to a redshift of 0.7. If they add several thousand galaxies from the DEEP2 survey, conducted with the 10-meter Keck telescope in the early 2000s, they could get insights into objects at redshifts up to 1.0, when the Universe was less than half of its present age.
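The statement that a redshift of 1.0 corresponds to a Universe less than half its present age can be checked with a short calculation. The sketch below is only an illustration: it assumes a flat Lambda-CDM cosmology with a matter density of 0.3 and a Hubble constant of 70 km/s/Mpc (neither value comes from the RCSED team), neglects radiation, and uses the standard closed-form expression for cosmic age in such a model. The function names are hypothetical.

```python
import math

# Assumed cosmological parameters (illustrative, not taken from the article):
# flat Lambda-CDM, matter density 0.3, Hubble constant 70 km/s/Mpc.
OMEGA_M = 0.3
H0_KM_S_MPC = 70.0

# Hubble time 1/H0 in Gyr for H0 = 1 km/s/Mpc (unit conversion).
HUBBLE_TIME_GYR_AT_UNIT_H0 = 977.8


def age_gyr(z, omega_m=OMEGA_M, h0=H0_KM_S_MPC):
    """Age of a flat Lambda-CDM universe at redshift z, in Gyr (radiation neglected)."""
    omega_l = 1.0 - omega_m  # flatness: dark-energy density fills the remainder
    x = math.sqrt(omega_l / omega_m) * (1.0 + z) ** -1.5
    hubble_time = HUBBLE_TIME_GYR_AT_UNIT_H0 / h0
    return hubble_time * 2.0 / (3.0 * math.sqrt(omega_l)) * math.asinh(x)


def age_fraction(z, omega_m=OMEGA_M):
    """Age at redshift z as a fraction of the present age (independent of H0)."""
    return age_gyr(z, omega_m) / age_gyr(0.0, omega_m)


if __name__ == "__main__":
    # Redshift limits mentioned in the text: SDSS, Hectospec, DEEP2.
    for z in (0.4, 0.7, 1.0):
        print(f"z = {z}: {age_gyr(z):.1f} Gyr, {age_fraction(z):.0%} of the present age")
```

Under these assumed parameters, the age fraction at a redshift of 1.0 comes out a little above 0.4, consistent with "less than half of its present age"; the exact figure shifts slightly with the adopted matter density.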

Igor Chilingarian concludes: "We shall be able to see the global picture in about ten years from now, when large surveys like DESI have collected 25-30 million galaxy spectra out to intermediate redshifts."

-end-

The RCSED project has been supported by a collaborative grant provided by the Russian Foundation for Basic Research (RFBR) and the French National Center for Scientific Research (Centre National de la Recherche Scientifique, CNRS). At earlier stages, the project was supported by grants from the Russian Science Foundation (RScF) and the President of the Russian Federation, along with French resources available in the framework of the VO-Paris Data Center at the Paris Observatory.
