Cody Buntain - New Jersey Institute of Technology. Newark, NJ, US

Cody Buntain

Assistant Professor | New Jersey Institute of Technology


Research interests lie at the intersection of data science and social science, with key applications in crisis informatics and political engagement.








Module 1. Analysis Environment - INST728E - Winter 2018

Social Media and Crisis Informatics

Intro to Event Detection in Social Media

ICWSM Science Slam 2017 - Reverted Wikipedia Edits, Johannes Kiesel @KieselJohannes




Cody Buntain's research sits at the intersection of data science and the social sciences, developing and adapting advanced computational techniques to address critical political and social issues, with a focus on crisis communication, social movements, and political participation.

He uses data science, and social media data in particular, to study how people engage socially and politically, especially during disasters and times of social unrest.

Areas of Expertise (8)

Machine Learning

Weak Supervision

Social Media

Online Political Engagement

Crisis Informatics

Information/Interaction Quality

Real-Time Summarization

Text Mining

Accomplishments (7)

Best Paper Award, IEEE SmartCloud


SIGIR Student Grant


Best Paper Honorable Mention, #Microposts2016


UMD HCIL Conference Award


Goldhaber Awards

2014 and 2015

International Conference Student Awards

2014 and 2015

Computer Science Department Gannon Award


Education (3)

University of Maryland College Park: Ph.D., Computer Science 2015

University of Alabama in Huntsville: M.S., Computer Science 2010

University of Alabama in Huntsville: B.S., Computer Science, Math 2007

Affiliations (2)

  • Intelligence Postdoctoral Fellowship
  • OSDC-PIRE Fellowship

Media Appearances (1)

Meet the Troll Hunters

NBC New York  tv


I-Team interviews NJIT's Cody Buntain about tracking political trolls online.


Event Appearances (4)

#HandsOffMyADA: A Twitter Response to the ADA Education and Reform Act

ACM Conference on Human Factors in Computing Systems  Glasgow, Scotland

Analyzing a Fake News Authorship Network

2019 iConference  College Park, MD

Learning Information Types in Social Media for Crises: TREC-IS

Twenty-Seventh Text Retrieval Conference  Gaithersburg, MD

#pray4victims: Consistencies In Response To Disaster on Twitter

21st ACM Conference on Computer-Supported Cooperative Work and Social Computing  Jersey City, NJ

Research Grants (4)

Automated Program Analysis in Cybersecurity



The primary goal for Five Directions during the APAC [Automated Program Analysis for Cybersecurity] program was to measure the effectiveness and efficiency of the R&D [research and development] teams in detecting malware in Android applications. In order to achieve this goal, experiments were designed to test the tools being developed by the Research and Development (R&D) teams. The experiments pitted the research tools against malicious Android applications created by the Adversarial Challenge (AC) teams. The results of these experiments were then compared to the performance of a separate Control Team that used existing tools and techniques in order to analyze the same malicious applications. This analysis provided a method of evaluating the performances of each R&D team as well as the overall performance of the APAC program.


Deception Studio: Attacker Characterization and Dynamic Relocation

Department of Defense - Air Force / Pikewerks $745,519

2011 - Phase II Deception Studio (DS) is a learning, behavior-based defense system for ensuring service availability and trust. DS's learning capabilities include attack detection, prediction, and attribution and can react to attacks in real time by shaping an adversary's perception and creating an illusion capable of manipulating his planning processes. Responses are deployed in a targeted fashion, allowing DS to respond in proportion to the attack without inflicting hard penalties on valid users. Such responses can be both deceptive and active, extending the protection boundary of the system and forcing attackers to react to ever-changing conditions. DS can further provide availability of critical services by moving them out-of-band during ongoing attacks, dynamically migrating an attacker into a decoy environment, or degrading his access while maintaining availability for legitimate users. Before employing such responses, DS includes technology to heal critical services from infection and can also bring this healing technology to bear on compromised systems, returning them to the pool of usable systems. Deception Studio represents the state-of-the-art in active, behavior-based attack detection and prevention systems, imbuing systems with the ability to remain operational, available, and trustworthy through even the most targeted attacks.


Imbuing Trust in Untrusted Hardware to Improve Protections

Department of Defense - Air Force / Pikewerks $97,559

2010 The Pikewerks InTrust system is a two-stage system designed to detect malicious implants or alterations in COTS hardware and firmware. It is meant to be used during both the integration/pre-deployment and the deployment stages to first establish trust and then maintain that trust during fielding. The pre-deployment test platform will make use of invasive testing and analysis techniques to ensure no unauthorized information leakage is occurring or embedded malware exists. Since many of these tests are heuristic-based and a number of malicious hardware modifications may have zero footprint until activation, however, it is possible that some alterations or implants will get past the pre-deployment analysis. As such, InTrust's second stage hardware sensors and firmware analysis mechanisms are designed to be embedded into fielded COTS platforms to detect tamper, attempts at modification, and the side effects of a triggered alteration. Further, once a tamper or modification attempt is detected, InTrust employs the Malicious Hardware Shield (MHS) to seal off regions of memory from direct access from unauthorized devices. InTrust can then integrate with existing Pikewerks environmental key generation to prevent unauthorized exposure of CPI/CT.


Deception Studio: Attacker Characterization and Dynamic Relocation

Department of Defense - Air Force / Pikewerks $99,961

2009 - Phase I One of the most significant weaknesses facing modern software protection solutions is the reliance on static policies and rule sets that are established based on “known” attack methods at the time of development. In reality, attacks are not static; they adapt over time, and evolve to defeat protections as they are made public. Pikewerks proposes to address both of these weaknesses by developing a system, referred to as Deception Studio, that characterizes and appropriately reacts to attackers in real time. As has been successfully implemented in traditional warfare, it will strive to shape the attacker’s perception, and create an illusion capable of manipulating their planning process. This concept is based on the combat operations process defined by John Boyd referred to as Observe, Orient, Decide, and Act (OODA). Deception Studio will characterize the attack, and tailor defenses based on what is observed.


Articles (6)

Characterizing Gender Differences in Misogynistic and Antisocial Microblog Posts

Online Harassment

Cody Buntain

2018 This chapter presents an observational study into the genders of authors posting abusive misogynistic insults and hate speech on Twitter. We first characterize the different uses of potentially abusive and misogynistic expletives in Twitter using a novel diversity-based sampling strategy and use Amazon’s Mechanical Turk (MTurk) crowdsourcing platform to construct a labeled dataset of abusive, misogynistic insults.


SMIDGen: An Approach for Scalable, Mixed-Initiative Dataset Generation from Online Social Networks

HCIL Tech Reports

Matthew Louis Mauriello, Cody Buntain, Brenna McNally, Sapna Bagalkotkar, Samuel Kushnir, Jon E Froehlich

2018 Recent qualitative studies have begun using large amounts of Online Social Network (OSN) data to study how users interact with technologies. However, current approaches to dataset generation are manual, time-consuming, and can be difficult to reproduce. To address these issues, we introduce SMIDGen: a hybrid manual+computational approach for enhancing the replicability and scalability of data collection from OSNs to support qualitative research.


Sampling Social Media: Supporting Information Retrieval from Microblog Data Resellers with Text, Network, and Spatial Analysis

In Proceedings of the 51st Hawaii International Conference on System Sciences 2018

Buntain, Cody, McGrath, Erin, and Behlendorf, Brandon

2018 This paper presents a computationally assisted method for scaling researcher expertise to large, online social media datasets in which access is constrained and costly. Developed collaboratively between social and computer science researchers, this method is designed to be flexible, scalable, cost-effective, and to reduce bias in data collection. Online responses to six case studies covering elections and election-related violence in Sub-Saharan African countries are explored using Twitter, a popular online microblogging platform. Results show: 1) automated query expansion can mitigate researcher bias, 2) machine learning models combining textual, social, temporal, and geographic features in social media data perform well in filtering data unrelated to the target event, and 3) these results are achievable while minimizing fee-based queries by bootstrapping with readily-available Twitter samples.


Automatically Identifying Fake News in Popular Twitter Threads


Cody Buntain, Jennifer Golbeck


Information quality in social media is an increasingly important issue, but web-scale data hinders experts' ability to assess and correct much of the inaccurate content, or "fake news," present in these platforms. This paper develops a method for automating fake news detection on Twitter by learning to predict accuracy assessments in two credibility-focused Twitter datasets: CREDBANK, a crowdsourced dataset of accuracy assessments for events in Twitter, and PHEME, a dataset of potential rumors in Twitter and journalistic assessments of their accuracies. We apply this method to Twitter content sourced from BuzzFeed's fake news dataset and show models trained against crowdsourced workers outperform models based on journalists' assessment and models trained on a pooled dataset of both crowdsourced workers and journalists. All three datasets, aligned into a uniform format, are also publicly available. A feature analysis then identifies features that are most predictive for crowdsourced and journalistic accuracy assessments, results of which are consistent with prior work. We close with a discussion contrasting accuracy and credibility and why models of non-experts outperform models of journalists for fake news detection in Twitter.


Powers and Problems of Integrating Social Media Data with Public Health and Safety

Data for Good Exchange

Cody Buntain, Jennifer Golbeck, & Gary LaFree

2015 Social media sites like Twitter provide readily accessible sources of large-volume, high-velocity data streams, now referred to as “Big Data.” While private companies have already made great strides in leveraging these social media sources, many public organizations and government agencies could reap significant benefits from these resources. Care must be exercised in this integration, however, as huge data sets come with their own intrinsic issues.


Trust transfer between contexts

Journal of Trust Management

Cody Buntain, Jennifer Golbeck

2015 This paper explores whether trust, developed in one context, transfers into another, distinct context and, if so, attempts to quantify the influence this prior trust exerts. Specifically, we investigate the effects of artificially stimulated prior trust as it transfers across disparate contexts and whether this prior trust can compensate for negative objective information. To study such incidents, we leveraged Berg’s investment game to stimulate varying degrees of trust between a human and a set of automated agents.
