David Bader

Distinguished Professor, Computer Science | New Jersey Institute of Technology

Newark, NJ, UNITED STATES

Interests lie at the intersection of data science & high-performance computing, with applications in cybersecurity

Media

Videos:

  • Predictive Analysis from Massive Knowledge Graphs on Neo4j – David Bader
  • Interview: David Bader on Real World Challenges for Big Data Analytics
  • 5-Minute Interview: Dave Bader, Professor at Georgia Tech College of Computing

Biography

David A. Bader is a Distinguished Professor in the Department of Computer Science at New Jersey Institute of Technology. Prior to this, he served as founding Professor and Chair of the School of Computational Science and Engineering, College of Computing, at Georgia Institute of Technology.

He is a Fellow of the IEEE, AAAS, and SIAM, and advises the White House, most recently on the National Strategic Computing Initiative (NSCI). Dr. Bader is a leading expert in solving global grand challenges in science, engineering, computing, and data science.

His interests are at the intersection of high-performance computing and real-world applications, including cybersecurity, massive-scale analytics, and computational genomics, and he has co-authored over 230 articles in peer-reviewed journals and conferences. Dr. Bader has served as a lead scientist in several DARPA programs including High Productivity Computing Systems (HPCS) with IBM, Ubiquitous High Performance Computing (UHPC) with NVIDIA, Anomaly Detection at Multiple Scales (ADAMS), Power Efficiency Revolution For Embedded Computing Technologies (PERFECT), Hierarchical Identify Verify Exploit (HIVE), and Software-Defined Hardware (SDH).

He has also served as Director of the Sony-Toshiba-IBM Center of Competence for the Cell Broadband Engine Processor. Bader is a cofounder of the Graph500 List for benchmarking “Big Data” computing platforms. He has been recognized as a “RockStar” of High Performance Computing by InsideHPC and was named to HPCwire’s People to Watch list in 2012 and 2014. In April 2019 he was awarded an NVIDIA AI Lab (NVAIL) award, and in July 2019 he received a Facebook Research AI Hardware/Software Co-Design award.

Areas of Expertise (5)

  • Computational Genomics
  • Applications in Cybersecurity
  • Data Science
  • High-Performance Computing
  • Massive-Scale Analytics

Accomplishments (7)

NVIDIA AI Lab (NVAIL) Award

2019

Invited attendee at the White House’s National Strategic Computing Initiative (NSCI) Anniversary Workshop

2019

Facebook AI System Hardware/Software Co-Design Research Award

2019

Named to HPCwire's "People to Watch" list

2014

The first recipient of the University of Maryland's Department of Electrical and Computer Engineering Distinguished Alumni Award

2012

Named to HPCwire's "People to Watch" list

2012

Selected by Sony, Toshiba, and IBM to direct the first Center of Competence for the Cell Processor

2006

Education (3)

University of Maryland: Ph.D., Electrical and Computer Engineering 1996

Lehigh University: M.S., Electrical Engineering 1991

Lehigh University: B.S., Computer Engineering 1990

Affiliations (3)

  • AAAS Fellow
  • IEEE Fellow
  • SIAM Fellow

Media Appearances (2)

Big Data Career Notes: July 2019 Edition

Datanami  online

2019-07-16

The New Jersey Institute of Technology has announced that it will establish a new Institute for Data Science, directed by Distinguished Professor David Bader. Bader recently joined NJIT’s Ying Wu College of Computing from Georgia Tech, where he was chair of the School of Computational Science and Engineering within the College of Computing. Bader was recognized as one of HPCwire’s People to Watch in 2014.

David Bader to Lead New Institute for Data Science at NJIT

Inside HPC  online

2019-07-10

Professor David Bader will lead the new Institute for Data Science at the New Jersey Institute of Technology. Focused on cutting-edge interdisciplinary research and development in all areas pertinent to digital data, the institute will bring existing research centers in big data, medical informatics and cybersecurity together to conduct both basic and applied research.

Event Appearances (3)

Massive-scale Analytics

13th International Conference on Parallel Processing and Applied Mathematics (PPAM)  Białystok, Poland

2019-09-09

Predictive Analytics from Massive Streaming Data

44th Annual GOMACTech Conference: Artificial Intelligence & Cyber Security: Challenges and Opportunities for the Government  Albuquerque, NM

2019-03-26

Massive-Scale Analytics Applied to Real-World Problems

2018 Platform for Advanced Scientific Computing (PASC) Conference  Basel, Switzerland

2018-07-04

Research Focus (2)

NVIDIA AI Lab (NVAIL) for Scalable Graph Algorithms

2019-08-05

Graph algorithms represent some of the most challenging known problems in computer science for modern processors. These algorithms contain far more memory accesses per unit of computation than traditional scientific computing. Access patterns are not known until execution time and are heavily dependent on the input data set. Graph algorithms vary widely in the amount of spatial and temporal locality that is usable by modern architectures. In today’s rapidly evolving world, graph algorithms are used to make sense of large volumes of data from news reports, distributed sensors, and lab test equipment, among other sources connected to worldwide networks. As data is created and collected, dynamic graph algorithms make it possible to compute highly specialized and complex relationship metrics over the entire web of data in near-real time, reducing the latency between data collection and the capability to take action.

Through this partnership with NVIDIA, we collaborate on the design and implementation of scalable graph algorithms and graph primitives that will bring new capabilities to the broader community of data scientists. Leveraging existing open frameworks, this effort will improve the experience of graph data analysis on GPUs by providing better tools for analyzing graph data, speeding up graph traversal with optimized data structures, and accelerating computations with improved runtime support for dynamic work stealing and load balancing.
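
The "optimized data structures" mentioned above typically means compact formats such as compressed sparse row (CSR) combined with bulk frontier expansion. As a rough illustration only (not the NVAIL code; the helpers csr_from_edges and bfs are invented for this sketch), here is a level-synchronous BFS over a CSR adjacency in NumPy, the traversal pattern that GPU graph frameworks parallelize:

    # Illustrative sketch: level-synchronous BFS over a CSR graph in NumPy.
    import numpy as np

    def csr_from_edges(num_nodes, edges):
        """Build a CSR (offsets, targets) adjacency from a directed edge list."""
        src = np.array([u for u, v in edges])
        dst = np.array([v for u, v in edges])
        order = np.argsort(src, kind="stable")
        src, dst = src[order], dst[order]
        offsets = np.zeros(num_nodes + 1, dtype=np.int64)
        np.add.at(offsets, src + 1, 1)
        return np.cumsum(offsets), dst

    def bfs(offsets, targets, source):
        """Return hop distances from `source`; -1 marks unreachable vertices."""
        n = len(offsets) - 1
        dist = np.full(n, -1, dtype=np.int64)
        dist[source] = 0
        frontier = np.array([source])
        level = 0
        while frontier.size:
            # Gather all neighbors of the current frontier in one sweep.
            neigh = np.concatenate([targets[offsets[u]:offsets[u + 1]] for u in frontier])
            neigh = np.unique(neigh)
            next_frontier = neigh[dist[neigh] == -1]
            level += 1
            dist[next_frontier] = level
            frontier = next_frontier
        return dist

    offsets, targets = csr_from_edges(5, [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4)])
    print(bfs(offsets, targets, 0))  # [0 1 1 2 3]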

Facebook AI Systems Hardware/Software Co-Design research award on Scalable Graph Learning Algorithms

2019-05-10

Deep learning has boosted the machine learning field at large and created significant increases in the performance of tasks including speech recognition, image classification, object detection, and recommendation. It has opened the door to complex tasks, such as self-driving and super-human image recognition. However, the important techniques used in deep learning, e.g., convolutional neural networks, are designed for Euclidean data types and do not directly apply to graphs. This problem is addressed by embedding graphs into a lower-dimensional Euclidean space, generating a regular structure. There is also prior work on applying convolutions directly on graphs and using sampling to choose neighbor elements. Systems that use this technique are called graph convolutional networks (GCNs). GCNs have proven to be successful at graph learning tasks like link prediction and graph classification. Recent work has pushed the scale of GCNs to billions of edges, but significant work remains to extend learned graph systems beyond recommendation systems with specific structure and to support big data models such as streaming graphs.

This project will focus on developing scalable graph learning algorithms and implementations that open the door for learned graph models on massive graphs. We plan to approach this problem in two ways. First, we will develop a scalable, high-performance graph learning system based on existing GCN algorithms, like GraphSage, by improving the workflow on shared-memory NUMA machines, balancing computation between threads, optimizing data movement, and improving memory locality. Second, we will investigate graph learning algorithm-specific decompositions and develop new strategies for graph learning that can inherently scale well while maintaining high accuracy. This includes traditional partitioning, but more generally we consider breaking the problem into smaller pieces that, when solved, yield a solution to the larger problem. We will explore decomposition results from graph theory, for example forbidden graphs and the Embedding Lemma, and determine how to apply such results to the field of graph learning. We will also investigate whether these decompositions could assist in a dynamic graph setting.
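
A rough, hypothetical sketch of the neighbor-sampling and aggregation step that GraphSage-style GCNs perform, written in illustrative NumPy (not this project's implementation; sample_neighbors, sage_layer, and the toy weights are invented here):

    # One mean-aggregator layer: h_v = ReLU(W @ [x_v ; mean(x_sampled_neighbors)]).
    import numpy as np

    rng = np.random.default_rng(0)

    def sample_neighbors(adj, node, k):
        """Sample up to k neighbors of `node` (with replacement when degree < k)."""
        neigh = adj[node]
        if not neigh:
            return [node]              # fall back to a self-loop for isolated nodes
        return list(rng.choice(neigh, size=k, replace=len(neigh) < k))

    def sage_layer(adj, features, weight, k=5):
        out = []
        for v in range(len(adj)):
            neigh_feats = features[sample_neighbors(adj, v, k)].mean(axis=0)
            concat = np.concatenate([features[v], neigh_feats])
            out.append(np.maximum(weight @ concat, 0.0))   # ReLU
        return np.vstack(out)

    # Toy graph: adjacency lists for 4 nodes with 3-dimensional input features.
    adj = [[1, 2], [0, 2], [0, 1, 3], [2]]
    features = rng.normal(size=(4, 3))
    weight = rng.normal(size=(8, 6))    # projects the 6-dim concatenation to 8 dims
    print(sage_layer(adj, features, weight).shape)   # (4, 8)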

Research Grants (6)

Echelon: Extreme-scale Compute Hierarchies with Efficient Locality-Optimized Nodes

DARPA/NVIDIA $25,000,000

2010-06-01

Goal: Develop highly parallel, security-enabled, power-efficient processing systems, supporting ease of programming, with resilient execution through all failure modes and intrusion attacks.

Center for Adaptive Supercomputing Software for Multithreaded Architectures (CASS-MT): Analyzing Massive Social Networks

Department of Defense $24,000,000

2008-08-01

Exascale Streaming Data Analytics for social networks: understanding communities, intentions, population dynamics, pandemic spread, transportation and evacuation.

Proactive Detection of Insider Threats with Graph Analysis at Multiple Scales (PRODIGAL), under Anomaly Detection at Multiple Scales (ADAMS)

DARPA $9,000,000

2011-05-01

This paper reports on insider threat detection research, during which a prototype system (PRODIGAL) was developed and operated as a testbed for exploring a range of detection and analysis methods. The data and test environment, system components, and the core method of unsupervised detection of insider threat leads are presented to document this work and benefit others working in the insider threat domain...

Challenge Applications and Scalable Metrics (CHASM) for Ubiquitous High Performance Computing

DARPA 

2010-06-01

Develop highly parallel, security-enabled, power-efficient processing systems, supporting ease of programming, with resilient execution through all failure modes and intrusion attacks.

SHARP: Software Toolkit for Accelerating Graph Algorithms on Hive Processors

DARPA $6,760,425

2017-04-23

The aim of SHARP is to enable platform-independent implementations of fast, scalable, and approximate static and streaming graph algorithms. SHARP will develop a software toolkit for seamless acceleration of graph analytics (GA) applications, for a first-of-its-kind collection of graph processors...

GRATEFUL: GRaph Analysis Tackling power EFficiency, Uncertainty, and Locality

DARPA $2,929,819

2012-10-19

Think of the perfect embedded computer. Think of a computer so energy-efficient that it can last 75 times longer than today’s systems. Researchers at Georgia Tech are helping the Defense Advanced Research Projects Agency (DARPA) develop such a computer as part of an initiative called Power Efficiency Revolution for Embedded Computing Technologies, or PERFECT.

“The program is looking at how do we come to a new paradigm of computing where running time isn’t necessarily the constraint, but how much power and battery that we have available is really the new constraint,” says David Bader, executive director of high-performance computing at the School of Computational Science and Engineering.

If the project is successful, it could result in computers far smaller and orders of magnitude more efficient than today’s machines. It could also mean that the computer mounted tomorrow on an unmanned aircraft or ground vehicle, or even worn by a soldier, would use less energy than a larger device, while still being as powerful.

Georgia Tech’s part in the DARPA-led PERFECT effort is called GRATEFUL, which stands for Graph Analysis Tackling power-Efficiency, Uncertainty and Locality. Headed by Bader and co-investigator Jason Riedy, GRATEFUL focuses on algorithms that would process vast stores of data and turn it into a graphical representation in the most energy-efficient way possible.

Articles (6)

Tailoring parallel alternating criteria search for domain specific MIPs: Application to maritime inventory routing

Computers & Operations Research

Lluís-Miquel Munguía, Shabbir Ahmed, David A Bader, George L Nemhauser, Yufen Shao, Dimitri J Papageorgiou

2019

Parallel Alternating Criteria Search (PACS) relies on the combination of computer parallelism and Large Neighborhood Searches to attempt to deliver high quality solutions to any generic Mixed-Integer Program (MIP) quickly. While general-purpose primal heuristics are widely used due to their universal application, they are usually outperformed by domain-specific heuristics when optimizing a particular problem class.

High-Performance Phylogenetic Inference

Bioinformatics and Phylogenetics

David A Bader, Kamesh Madduri

2019

Software tools based on the maximum likelihood method and Bayesian methods are widely used for phylogenetic tree inference. This article surveys recent research on parallelization and performance optimization of state-of-the-art tree inference tools. We outline advances in shared-memory multicore parallelization, optimizations for efficient Graphics Processing Unit (GPU) execution, as well as large-scale distributed-memory parallelization.

Numerically approximating centrality for graph ranking guarantees

Journal of Computational Science

Eisha Nathan, Geoffrey Sanders, David A Bader

2018

Many real-world datasets can be represented as graphs. Using iterative solvers to approximate graph centrality measures allows us to obtain a ranking vector on the nodes of the graph, consisting of a number for each vertex in the graph identifying its relative importance. In this work the centrality measures we use are Katz Centrality and PageRank. Given an approximate solution, we use the residual to accurately estimate how much of the ranking matches the ranking given by the exact solution.
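
A minimal sketch of that idea for PageRank (not the paper's code; the pagerank helper and tolerance below are invented for illustration): run power iteration and use the residual norm to judge how settled the approximate ranking is.

    # Approximate PageRank by power iteration, stopping on a small residual.
    import numpy as np

    def pagerank(adj_matrix, alpha=0.85, tol=1e-8, max_iter=200):
        """adj_matrix[i, j] = 1.0 if there is an edge i -> j."""
        n = adj_matrix.shape[0]
        out_deg = adj_matrix.sum(axis=1, keepdims=True)
        out_deg[out_deg == 0] = 1.0                  # crude handling of dangling nodes
        P = adj_matrix / out_deg                     # row-normalized transition matrix
        x = np.full(n, 1.0 / n)
        for _ in range(max_iter):
            x_new = alpha * (P.T @ x) + (1 - alpha) / n
            residual = np.linalg.norm(x_new - x, 1)  # distance from a fixed point
            x = x_new
            if residual < tol:
                break
        return x / x.sum()

    A = np.array([[0, 1, 1, 0],
                  [0, 0, 1, 0],
                  [1, 0, 0, 1],
                  [0, 0, 1, 0]], dtype=float)
    scores = pagerank(A)
    print(np.argsort(-scores))   # vertices ordered from most to least central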

Ranking in dynamic graphs using exponential centrality

International Conference on Complex Networks and their Applications

Eisha Nathan, James Fairbanks, David Bader

2017

Many large datasets from several fields of research such as biology or society can be represented as graphs. Additionally in many real applications, data is constantly being produced, leading to the notion of dynamic graphs. A heavily studied problem is identification of the most important vertices in a graph. This can be done using centrality measures, where a centrality metric computes a numerical value for each vertex in the graph.

Scalable and High Performance Betweenness Centrality on the GPU [Best Student Paper Finalist]

Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis

A. McLaughlin, D. A. Bader

2014-11-01

Graphs that model social networks, numerical simulations, and the structure of the Internet are enormous and cannot be manually inspected. A popular metric used to analyze these networks is betweenness centrality, which has applications in community detection, power grid contingency analysis, and the study of the human brain. However, these analyses come with a high computational cost that prevents the examination of large graphs of interest. Prior GPU implementations suffer from large local data structures and inefficient graph traversals that limit scalability and performance. Here we present several hybrid GPU implementations, providing good performance on graphs of arbitrary structure rather than just scale-free graphs as was done previously. We achieve up to 13x speedup on high-diameter graphs and an average of 2.71x speedup overall over the best existing GPU algorithm. We observe near linear speedup and performance exceeding tens of GTEPS when running betweenness centrality on 192 GPUs.
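
For intuition, a toy serial Brandes-style betweenness centrality computation in plain Python is sketched below (an illustration only, not the paper's GPU code; GPU versions parallelize the per-source traversals and the dependency accumulation):

    # Brandes' algorithm: BFS shortest-path counting plus reverse dependency accumulation.
    from collections import deque

    def betweenness(adj):
        """adj: dict mapping each vertex to a list of neighbors (unweighted graph)."""
        bc = {v: 0.0 for v in adj}
        for s in adj:
            sigma = {v: 0 for v in adj}; sigma[s] = 1     # shortest-path counts
            dist = {v: -1 for v in adj}; dist[s] = 0
            preds = {v: [] for v in adj}
            order, queue = [], deque([s])
            while queue:
                v = queue.popleft()
                order.append(v)
                for w in adj[v]:
                    if dist[w] < 0:
                        dist[w] = dist[v] + 1
                        queue.append(w)
                    if dist[w] == dist[v] + 1:
                        sigma[w] += sigma[v]
                        preds[w].append(v)
            delta = {v: 0.0 for v in adj}
            for w in reversed(order):                     # accumulate dependencies
                for v in preds[w]:
                    delta[v] += (sigma[v] / sigma[w]) * (1 + delta[w])
                if w != s:
                    bc[w] += delta[w]
        return bc

    adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
    print(betweenness(adj))   # vertex 3 lies on the most shortest paths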

STINGER: High performance data structure for streaming graphs [Best Paper Award]

IEEE Conference on High Performance Extreme Computing

D. Ediger, R. McColl, J. Riedy, D. A. Bader

2012-09-01

The current research focus on “big data” problems highlights the scale and complexity of analytics required and the high rate at which data may be changing. In this paper, we present our high performance, scalable and portable software, Spatio-Temporal Interaction Networks and Graphs Extensible Representation (STINGER), that includes a graph data structure that enables these applications. Key attributes of STINGER are fast insertions, deletions, and updates on semantic graphs with skewed degree distributions. We demonstrate a process of algorithmic and architectural optimizations that enable high performance on the Cray XMT family and Intel multicore servers. Our implementation of STINGER on the Cray XMT processes over 3 million updates per second on a scale-free graph with 537 million edges.
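
STINGER's actual design (blocked edge lists with fine-grained synchronization on the Cray XMT and multicore servers) is far more involved, but a toy sketch of the basic interface a streaming graph structure exposes (timestamped edge insertions, deletions, and simple queries) might look like the following; every name here is invented for illustration:

    # Toy streaming-graph interface: insert/delete timestamped edges, query degrees.
    from collections import defaultdict

    class StreamingGraph:
        def __init__(self):
            # vertex -> {neighbor: timestamp of the most recent insertion}
            self.adj = defaultdict(dict)

        def insert_edge(self, u, v, t):
            self.adj[u][v] = t
            self.adj[v][u] = t

        def delete_edge(self, u, v):
            self.adj[u].pop(v, None)
            self.adj[v].pop(u, None)

        def degree(self, u):
            return len(self.adj[u])

    g = StreamingGraph()
    g.insert_edge("a", "b", t=1)
    g.insert_edge("a", "c", t=2)
    g.delete_edge("a", "b")
    print(g.degree("a"))   # 1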
