Areas of Expertise (7)
Human Nonverbal Behavior
Interactive Machine Learning
Social Skills Training
M. Ehsan Hoque directs the Rochester Human-Computer Interaction Lab. He is currently the interim director for the University's Goergen Institute for Data Science.
His research focuses on designing and implementing new algorithms to sense subtle human nonverbal behavior; enabling new behavior sensing and modelling for human-computer interaction; inventing new applications of emotion technology in high-impact social domains such as social skills training and public speaking; and assisting individuals who experience difficulties with social interactions.
Massachusetts Institute of Technology: Ph.D., Media Arts and Sciences (Media Lab) 2013
The University of Memphis: M.Eng., Electrical and Computer Engineering 2007
Penn State University: B.S., Computer Engineering 2004
- ACM Future of Computing Academy (ACM FCA)
- Association for the Advancement of Artificial Intelligence (AAAI)
- Association for Computing Machinery (ACM)
- Institute of Electrical and Electronics Engineers (IEEE)
- American Association for the Advancement of Science (AAAS)
- The Association for the Advancement of Affective Computing (AAAC)
Selected Media Appearances (5)
How to Spot a Liar: Experts Uncover the Signs of Deception-Can you see them?
“A lot of times people tend to look a certain way or show some kind of facial expression when they’re remembering things,” commented Tay Sen, a PhD student working in the lab of Ehsan Hoque, an assistant professor of computer science. “When they are given a computational question, they have another kind of facial expression.” Using a machine learning tool, the researchers found patterns...
Can YOU spot the liar? Researchers develop online game to help AI crack down on racial biases by analyzing over a million faces
Ehsan Hoque, an assistant professor of computer science at the university, would like to dive deeper into the fact that interrogators unknowingly leak information when they are being lied to.
Interrogators demonstrate more polite smiles when they know they are hearing a falsehood. In addition, an examiner is more likely to return a smile by a lying witness than a truth-teller.
Looking at the interrogators' data could reveal useful information and could have implications for how TSA officers are trained.
'In the end, we still want humans to make the final decision,' Hoque says...
Facial software knows if you have something to hide
Surveillance cameras at airports could soon warn officers when a passenger is lying, after researchers analysed millions of frames of footage to map telltale facial expression...
M. Ehsan Hoque develops digital helpers that teach social skills
While virtual helpers that perform practical tasks, such as dealing with customer service issues, are becoming ubiquitous, computer scientist M. Ehsan Hoque is at the forefront of a more emotionally savvy movement...
Meet the Star of TED 2020: A Glass App That Coaches You As You Talk
New York Magazine
The nightmare of public speaking is set to become slightly less vomit-inducing, thanks to an app for smart glasses that provides real-time advice on how to modulate volume and cadence...
Selected Event Appearances (5)
Automated Dyadic Data Recorder (ADDR) Framework and Analysis of Facial Cues in Deceptive Communication
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT), UbiComp 2018
CoCo: Collaboration Coach for Understanding Team Dynamics during Video Conferencing
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT), UbiComp 2018
The What, When, and Why of Facial Expressions: An Objective Analysis of Conversational Skills in Speed-Dating Videos
IEEE International Conference on Automatic Face and Gesture Recognition, FG 2018
Say CHEESE: Common Human Emotional Expression Set Encoder Analysis of Smiles in Honest and Deceptive Communication
IEEE International Conference on Automatic Face and Gesture Recognition, FG 2018
How Emotional Trajectories Affect Audience Perception in Public Speaking
CHI 2018 The Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI)
Selected Articles (5)
R. Shafipour, R. A. Baten, M. K. Hasan, G. Ghoshal, G. Mateos, and M. E. Hoque
Studies in learning communities have consistently found evidence that peer interactions contribute to students’ performance outcomes. A particularly important competence in the modern context is the ability to communicate ideas effectively. One metric of this is speaking, which is an important skill in professional and casual settings. In this study, we explore peer-interaction effects in online networks on speaking skill development. In particular, we present evidence for a gradual buildup of skills in a small-group setting that has not been reported in the literature. Evaluating the development of such skills requires studying objective evidence, for which purpose we introduce a novel dataset of six online communities consisting of 158 participants focusing on improving their speaking skills. They video-record speeches for 5 prompts in 10 days and exchange comments and performance ratings with their peers. We ask (i) whether the participants’ ratings are affected by their interaction patterns with peers, and (ii) whether there is any gradual buildup of speaking skills in the communities towards homogeneity. To analyze the data, we employ tools from the emerging field of Graph Signal Processing (GSP). GSP differs from Social Network Analysis in that the latter is concerned primarily with the connection structures of graphs, while the former studies signals on top of graphs. We study the performance ratings of the participants as graph signals atop underlying interaction topologies. Total variation analysis of the graph signals shows that the participants’ rating differences decrease with time (slope = −0.04, p
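As a rough illustration of the total variation idea mentioned in this abstract, the sketch below uses one common definition from graph signal processing, the Laplacian quadratic form (the weighted sum of squared rating differences across connected peers). The abstract does not state which definition the authors used, and the function name, edge weights, and ratings here are hypothetical values chosen only for illustration, not data from the study.

```python
def total_variation(edges, signal):
    """Laplacian quadratic form: sum of w * (x_i - x_j)**2 over undirected edges.

    edges  : list of (i, j, w) tuples; w is a hypothetical interaction weight
    signal : list of per-participant values (e.g. performance ratings)
    """
    return sum(w * (signal[i] - signal[j]) ** 2 for i, j, w in edges)


# Hypothetical three-person interaction graph (illustrative only).
edges = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 2.0)]

# Hypothetical ratings at an early and a late prompt.
ratings_early = [2.0, 4.0, 5.0]
ratings_late = [4.0, 4.0, 5.0]

tv_early = total_variation(edges, ratings_early)
tv_late = total_variation(edges, ratings_late)

# A smaller value means connected peers hold more similar ratings.
print(tv_early, tv_late)
```

Under this definition, a total variation that falls over successive prompts would indicate that interacting participants' ratings are converging, which is the kind of trend the abstract describes.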
T. Sen, K. Hasan, Z. Teicher, M. E. Hoque
We developed an online framework that can automatically pair two crowd-sourced participants, prompt them to follow a research protocol, and record their audio and video on a remote server. The framework comprises two web applications: an Automatic Quality Gatekeeper for ensuring only high quality crowd-sourced participants are recruited for the study, and a Session Controller which directs participants to play a research protocol, such as an interrogation game. This framework was used to run a research study for analyzing facial expressions during honest and deceptive communication using a novel interrogation protocol. The protocol gathers two sets of nonverbal facial cues in participants: features expressed during questions relating to the interrogation topic and features expressed during control questions. The framework and protocol were used to gather 151 dyadic conversations (1.3 million video frames). Interrogators who were lied to expressed the smile-related lip corner puller cue more often than interrogators who were being told the truth, suggesting that facial cues from interrogators may be useful in evaluating the honesty of witnesses in some contexts. Overall, these results demonstrate that this framework is capable of gathering high quality data which can identify statistically significant results in a communication study.
S. Samrose, R. Zhao, J. White, V. Li, L. Nova, Y. Lu, M. R. Ali, M. E. Hoque
We present and discuss a fully-automated collaboration system, CoCo, that allows multiple participants to video chat and receive feedback through custom video conferencing software. After a conferencing session, a virtual feedback assistant provides insights on the conversation to participants. CoCo automatically pulls audial and visual data during conversations and analyzes the extracted streams for affective features, including smiles, engagement, attention, as well as speech overlap and turn-taking. We validated CoCo with 39 participants split into 10 groups. Participants played two back-to-back teambuilding games, Lost at Sea and Survival on the Moon, with the system providing feedback between the two. With feedback, we found a statistically significant change in balanced participation—that is, everyone spoke for an equal amount of time. There was also statistically significant improvement in participants’ self-evaluations of conversational skills awareness, including how often they let others speak, as well as of teammates’ conversational skills.
M. R. Ali, T. K. Sen, D. Crasta, V-D. Nguyen, R. Rogge, M. E. Hoque
In this paper, we demonstrate the importance of combinations of facial expressions and their timing in explaining a person's conversational skills in a series of brief non-romantic conversations. Video recordings of 365 four-minute conversations before and after a randomized intervention were analyzed in which facial action units (AUs) were examined over different time segments. Male subjects (N=47) were evaluated in their conversation skills using the Conversational Skills Rating Scale (CSRS). A linear regression model was used to compare the importance of AU features from different time segments in predicting CSRS ratings. In the first minute of conversation, CSRS ratings were best predicted by activity levels in action units associated with speaking (Lips part, AU25). In the last minute of conversation, affective indicators associated with expressions of laughter (Jaw Drop, AU26) and warmth (Happy faces) emerged as the most important. These findings suggest that feedback on nonverbal skills must dynamically account for shifting goals of conversation.
T. K. Sen, K. Hasan, M. Tran, Y. Yang, M. E. Hoque
In this paper we introduce the Common Human Emotional Expression Set Encoder (CHEESE) framework for objectively determining which, if any, subsets of the facial action units associated with smiling are well represented by a small finite set of clusters according to an information theoretic metric. Smile-related AUs (6, 7, 10, 12, 14) in over 1.3M frames of facial expressions from 151 pairs of individuals playing a communication game involving deception were analyzed with CHEESE. The combination of AU6 (cheek raiser) and AU12 (lip corner puller) is shown to cluster well into five different types of expression. Liars showed high intensity AU6 and AU12 more often compared to honest speakers. Additionally, interrogators were found to express a higher frequency of low intensity AU6 with high intensity AU12 (i.e. polite smiles) when they were being lied to, suggesting that deception analysis should be done in consideration of both the message sender's and the receiver's facial expressions.