Christopher D. Carothers

Director, Center for Computational Innovations (CCI) & Professor, Computer Science, Rensselaer Polytechnic Institute

Troy, NY

Researches massively parallel computer systems with a focus on modeling and simulation systems

Rensselaer Polytechnic Institute

Spotlight

Jan 16, 2020

Introducing AiMOS, The Most Powerful Supercomputer at a Private University

The most powerful supercomputer to debut on the November 2019 Top500 ranking of supercomputers, also the most powerful supercomputer in New York State, was recently unveiled at the Rensselaer Polytechnic Institute Center for Computational Innovations (CCI). Part of a collaboration between IBM, Empire State Development (ESD), and NY CREATES, the eight-petaflop IBM POWER9-equipped AI supercomputer is configured to help users explore new AI applications and accelerate economic development from New York’s smallest startups to its largest enterprises.

AiMOS is:

The most powerful supercomputer housed at a private university.
The 24th most powerful supercomputer in the world.
The third-most energy-efficient supercomputer in the world.

Named AiMOS (short for Artificial Intelligence Multiprocessing Optimized System) in honor of Rensselaer co-founder Amos Eaton, the machine will serve as a test bed for the New York State-IBM Research AI Hardware Center, which opened on the SUNY Polytechnic Institute (SUNY Poly) campus in Albany earlier this year. The AI Hardware Center aims to advance the development of computing chips and systems that are designed and optimized for AI workloads to push the boundaries of AI performance. AiMOS will provide the modeling, simulation, and computation necessary to support the development of this hardware.

“The established expertise in computation and data analytics at Rensselaer, when combined with AiMOS, will enable many of our research projects to make significant strides that simply were not possible on our previous platform,” said Christopher Carothers, director of the CCI and professor of computer science at Rensselaer. “Our message to the campus and beyond is that, if you are doing work on large-scale data analytics, machine learning, AI, and scientific computing, then it should be running at the CCI.”

Built using the same IBM Power Systems technology as the world’s smartest supercomputers, the US Dept. of Energy’s Summit and Sierra supercomputers, AiMOS uses a heterogeneous system architecture that includes IBM POWER9 CPUs and NVIDIA GPUs. This gives AiMOS a capacity of eight quadrillion calculations per second.

Chris Carothers is the director of the Center for Computational Innovations (CCI) at Rensselaer. He is available to speak with media about AiMOS and what it can enable.

Areas of Expertise

Information and Computer Science

Parallel Discrete-Event Simulation

Computer Science

Massively Parallel Systems

Modeling and Simulation Systems

Massively Parallel Processing

Systems and Network Modeling

Biography

Chris Carothers is a Professor in the Computer Science Department at Rensselaer Polytechnic Institute. His research interests are in massively parallel systems, with a focus on modeling and simulation systems of all sorts. Prof. Carothers is an NSF CAREER award winner and is currently active in the DOE Exascale Co-Design Program associated with designs for next-generation exascale storage systems, as well as the NSF PetaApps Program and the Army Research Center's Mobile Network Modeling Institute.

Education

Georgia Institute of Technology

Ph.D.

Computer Science

1997

Georgia Institute of Technology

M.S.

Computer Science

1996

Georgia Institute of Technology

B.S.

Information and Computer Science

1991

Media Appearances

U.S. Military Sees Future in Neuromorphic Computing

The Next Platform online

2017-06-26

The novel architectures story is still shaping out for 2017 when it comes to machine learning, hyperscale, supercomputing and other areas.

IBM’s supercomputer Watson gets a roommate at RPI

Albany Business Review online

2013-10-03

IBM chose Rensselaer Polytechnic Institute to house one of the most powerful supercomputers in the world, which will allow businesses to analyze massive amounts of data.

Articles

Massively Parallel Modeling and Simulation of Next Generation Hybrid Neuromorphic Supercomputer

Rensselaer Polytechnic Institute

Carothers, Christopher

2018

The major theme of research investigated here is how might neuromorphic computing impact future designs of supercomputer systems. This report provides both a summary and detailed experimental research results for the five core research thrusts (CRTs) covered in this research project.

Leveraging Shared Memory in the ROSS Time Warp Simulator for Complex Network Simulations

2018 Winter Simulation Conference (WSC)

Caitlin J Ross, Christopher D Carothers, Misbah Mubarak, Robert B Ross, Jianping Kelvin Li, Kwan-Liu Ma

2018

Scalability of parallel discrete-event simulation (PDES) systems is key to their use in modeling complex networks at high fidelity. In particular, intranode scalability is important due to the prevalence of many-core systems, but MPI communication between cores on the same node is known to have drawbacks (e.g., software overheads). We have extended the ROSS optimistic PDES framework to create memory pools shared by MPI processes on the same node in order to reduce on-node MPI overhead. We perform experiments to compare the performance of shared memory ROSS with pure MPI ROSS on two different systems. For the experiments, we use several models that exhibit a variety of characteristics to understand the conditions where shared memory can benefit the simulation. In general, higher remote event rates mean that simulations are more likely to benefit from using shared memory, but this may also be due in part to improved rollback behavior.

Efficient Classification of Supercomputer Failures Using Neuromorphic Computing

2018 IEEE Symposium Series on Computational Intelligence (SSCI)

Prasanna Date, Christopher D Carothers, James A Hendler, Malik Magdon-Ismail

2018

Today's petascale supercomputers are comprised of tens of thousands of compute nodes. Failures on these massive machines are a growing problem as the time for a single compute node to fail is shrinking. Ideally, the job scheduler would like the capability to predict node failures ahead of time in order to minimize the impact of node failures on overall job throughput. However, due to the tight power constraints of future systems, the online modeling of real-time error data must be accomplished using as little power as possible. To this end, the IBM TrueNorth Neurosynaptic System is used to create a Spiking Neural Network (SNN) model of supercomputer failure data, and the classification accuracy of this model is compared to other Machine Learning (ML) and Deep Learning (DL) techniques. It is observed that the TrueNorth failure classification model yields a training accuracy of 99.41%, validation accuracy of 98.12%, and testing accuracy of 99.80%, and outperforms other machine learning and deep learning approaches. Moreover, the TrueNorth SNN consumes five orders of magnitude less power than the other ML/DL approaches during the testing phase. Additionally, it is observed that all ML/DL approaches investigated as part of this study are able to produce accurate models of the supercomputer system failure data.
