Hey Siri: How Much Does This Galaxy Cluster Weigh?

Aug 25, 2022


It's been nearly a century since astronomer Fritz Zwicky first calculated the mass of the Coma Cluster, a dense collection of almost 1,000 galaxies located in the nearby universe. But estimating the mass of something so huge and dense, not to mention 320 million light-years away, has its share of problems — then and now. Zwicky's initial measurements, and the many made since, are plagued by sources of error that bias the mass higher or lower.


Now a team led by Carnegie Mellon University physicists has developed a deep-learning method that accurately estimates the mass of the Coma Cluster and effectively mitigates those sources of error.


"People have made mass estimates of the Coma Cluster for many, many years. But by showing that our machine-learning methods are consistent with these previous mass estimates, we are building trust in these new, very powerful methods that are hot in the field of cosmology right now," said Matthew Ho, a fifth-year graduate student in the Department of Physics' McWilliams Center for Cosmology and a member of Carnegie Mellon's NSF AI Planning Institute for Physics of the Future.


Machine-learning methods are used successfully in a variety of fields to find patterns in complex data, but they have only gained a foothold in cosmology research in the last decade. For some researchers in the field, these methods come with a major concern: Since it is difficult to understand the inner workings of a complex machine-learning model, can they be trusted to do what they are designed to do? Ho and his colleagues set out to address these reservations with their latest research, published in Nature Astronomy.


To calculate the mass of the Coma Cluster, Zwicky and others used a dynamical mass measurement: they studied the motion, or velocity, of objects orbiting in and around the cluster and then used their understanding of gravity to infer the cluster's mass. But this measurement is susceptible to a variety of errors. Galaxy clusters exist as nodes in a huge web of matter distributed throughout the universe, and they are constantly colliding and merging with each other, which distorts the velocity profile of the constituent galaxies. And because astronomers observe the cluster from a great distance, foreground and background objects along the line of sight can masquerade as cluster members, biasing the mass measurement. Recent research has made progress toward quantifying and accounting for these errors, but machine-learning-based methods offer an innovative, data-driven alternative, according to Ho.
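As a rough back-of-the-envelope illustration of a dynamical mass measurement (not the method used in the paper), the virial theorem gives M ≈ σ²R/G, where σ is the velocity dispersion of the member galaxies and R is a characteristic cluster radius. The numbers below are placeholders, not measurements of the Coma Cluster:

```python
# Illustrative virial mass estimate: M ~ sigma^2 * R / G.
# Input values are placeholders, not actual Coma Cluster measurements.
G = 4.301e-9  # gravitational constant in Mpc * (km/s)^2 / Msun

def virial_mass(sigma_km_s, radius_mpc):
    """Order-of-magnitude dynamical mass from a velocity dispersion."""
    return sigma_km_s**2 * radius_mpc / G  # mass in solar masses

# A dispersion of ~1000 km/s within ~2 Mpc gives a few times 10^14
# solar masses -- the right order of magnitude for a rich cluster.
mass = virial_mass(1000.0, 2.0)
print(f"{mass:.2e} solar masses")
```

The simplicity of this estimate is exactly why it is fragile: any interloping galaxy inflates σ, and σ enters the mass squared.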


"Our deep-learning method learns from real data what are useful measurements and what are not," Ho said, adding that their method eliminates errors from interloping galaxies (selection effects) and accounts for various galaxy shapes (physical effects). "The usage of these data-driven methods makes our predictions better and automated."


"One of the major shortcomings with standard machine learning approaches is that they usually yield results without any uncertainties," added Associate Professor of Physics Hy Trac, Ho's adviser. "Our method includes robust Bayesian statistics, which allow us to quantify the uncertainty in our results."


Ho and his colleagues developed their novel method by customizing a well-known machine-learning tool called a convolutional neural network, a type of deep-learning algorithm used in image recognition. The researchers trained their model by feeding it data from cosmological simulations of the universe. The model learned by looking at the observable characteristics of thousands of galaxy clusters whose masses are already known. After in-depth analysis of the model's handling of the simulation data, Ho applied it to a real system — the Coma Cluster — whose true mass is not known. His method calculated a mass estimate that is consistent with most of the mass estimates made since the 1980s. This marks the first time this specific machine-learning methodology has been applied to an observational system.
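As a minimal sketch of the idea — the architecture, input binning, and every name below are illustrative assumptions, not the authors' published model — a small convolutional network can regress log-mass from a binned profile of member-galaxy line-of-sight velocities:

```python
import torch
import torch.nn as nn

class MassRegressor(nn.Module):
    """Toy 1D CNN: velocity histogram in, predicted log10(mass) out."""

    def __init__(self, n_bins=48):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 8, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(8, 16, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),  # pool over velocity bins
        )
        self.head = nn.Linear(16, 1)  # single regression target

    def forward(self, x):                      # x: (batch, 1, n_bins)
        h = self.features(x).squeeze(-1)       # -> (batch, 16)
        return self.head(h)                    # -> (batch, 1)

# Toy forward pass on random stand-ins for velocity histograms.
model = MassRegressor()
batch = torch.randn(4, 1, 48)
print(model(batch).shape)  # torch.Size([4, 1])
```

In a real pipeline, the training pairs would come from simulated clusters with known masses, as described above, and the trained network would then be applied to observed velocity data.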


"To build reliability of machine-learning models, it's important to validate the model's predictions on well-studied systems, like Coma," Ho said. "We are currently undertaking a more rigorous, extensive check of our method. The promising results are a strong step toward applying our method on new, unstudied data."

Models such as these are going to be critical moving forward, especially as large-scale surveys such as the Dark Energy Spectroscopic Instrument, the Vera C. Rubin Observatory and Euclid begin releasing the vast amounts of sky data they are collecting.


"Soon we're going to have a petabyte-scale data flow," Ho explained. "That's huge. It's impossible for humans to parse that by hand. As we work on building models that can be robust estimators of things like mass while mitigating sources of error, another important aspect is that they need to be computationally efficient if we're going to process this huge data flow from these new surveys. And that is exactly what we are trying to address — using machine learning to improve our analyses and make them faster."


This work is supported by NSF AI Institute: Physics of the Future, NSF PHY-2020295, and the McWilliams-PSC Seed Grant Program. The computing resources necessary to complete this analysis were provided by the Pittsburgh Supercomputing Center. The CosmoSim database used in this paper is a service by the Leibniz-Institute for Astrophysics Potsdam (AIP).


The study's authors include: Trac; Michelle Ntampaka, who graduated from CMU with a doctorate in physics in 2017 and is now deputy head of Data Science at the Space Telescope Science Institute; Markus Michael Rau, a McWilliams postdoctoral fellow who is now a postdoctoral fellow at Argonne National Lab; Minghan Chen, who graduated with a bachelor's degree in physics in 2018 and is a Ph.D. student at the University of California, Santa Barbara; Alexa Lansberry, who graduated with a bachelor's degree in physics in 2020; and Faith Ruehle, who graduated with a bachelor's degree in physics in 2021.

