CMS collaboration releases its first open data from heavy-ion collisions

Jan 12, 2021

For a few weeks each year of operation, instead of colliding protons, the Large Hadron Collider (LHC) collides nuclei of heavy elements (“heavy ions”). These heavy-ion collisions allow researchers to recreate in the laboratory conditions that existed in the very early universe, such as the soup-like state of free quarks and gluons known as the quark–gluon plasma. Now, for the first time, the Compact Muon Solenoid (CMS) collaboration at CERN is making its heavy-ion data publicly available via the CERN Open Data portal.

Over 200 terabytes (TB) of data were released in December, from collisions that occurred in 2010 and 2011, when the LHC collided bunches of lead nuclei. Using these data, CMS had observed several signatures of the quark–gluon plasma, including the imbalance between the momenta of the two jets of particles produced in a pair, the suppression (“quenching”) of particle jets in jet–photon pairs and the “melting” of certain composite particles. In addition to lead–lead collision data (two data sets from 2010 and four from 2011), CMS has also provided eight sets of reference data from proton–proton collisions recorded at the same energy.

The open data are available in the same high-quality format used by the CMS scientists to publish their research papers. The data are accompanied by the software that is needed to analyse them and by analysis examples. Previous releases of CMS open data have been used not only in education but also to perform novel research. CMS is hopeful that communities of professional researchers and amateur enthusiasts as well as educators and students at all levels will put the heavy-ion data to similar use.

“Our aim with releasing CMS data into the public domain via the Creative Commons CC0 waiver is to preserve our data and the knowledge needed to use them, in order to facilitate the widest possible use of our data,” says Kati Lassila-Perini, who has led the CMS open-data project since its inception in 2012. “We hope that those outside CMS will find these data as fascinating and valuable as we do.”

CMS has committed to releasing 100% of the data recorded each year after an embargo period of ten years, with up to 50% of the data being made available in the interim. The embargo affords the researchers who built and operate the CMS detector adequate time to analyse the data they collect. With this release, all of the research data recorded by CMS during LHC operation in 2010 and 2011 are now in the public domain, available for anyone to study.

