Areas of Expertise (8)
Social Signal Processing
Human-Scale, Occupant-Aware Environments
Richard J. Radke joined the Electrical, Computer, and Systems Engineering department at Rensselaer Polytechnic Institute in 2001, where he is now a Full Professor. He holds B.A. and M.A. degrees in computational and applied mathematics from Rice University and M.A. and Ph.D. degrees in electrical engineering from Princeton University. His current research interests involve computer vision problems related to human-scale, occupant-aware environments, such as person tracking and re-identification with cameras and range sensors. Dr. Radke is affiliated with the NSF Engineering Research Center for Lighting Enabled Services and Applications (LESA), the DHS Center of Excellence on Explosives Detection, Mitigation and Response (ALERT), and Rensselaer’s Experimental Media and Performing Arts Center (EMPAC) and Cognitive and Immersive Systems Laboratory (CISL). He received an NSF CAREER award in March 2003 and was a member of the 2007 DARPA Computer Science Study Group. Dr. Radke is a Senior Member of the IEEE and a Senior Area Editor of IEEE Transactions on Image Processing. His textbook Computer Vision for Visual Effects was published by Cambridge University Press in 2012. His YouTube channel contains many annotated lectures on signal processing, image processing, and computer vision.
Princeton University: Ph.D., Electrical Engineering
Rice University: M.A., Computational and Applied Mathematics
Rice University: B.A., Computational and Applied Mathematics
Media Appearances (3)
New Lab Helps Manufacturers Make the Most of Augmented Reality Technology
Assembly Magazine online
Augmented reality (AR) and virtual reality (VR) are cutting-edge tools that are becoming increasingly important to engineers for applications ranging from product design to assembly line layout.
RPI Kicks Off First Full Semester For Virtual Reality Lab
Rensselaer Polytechnic Institute is celebrating its first semester with a new virtual and augmented reality lab, where students can use modern technology to explore new environments and learning methods.
Google’s Soli Advances Gesture Technology Toward a ‘Universal Remote Control’
In movies, the fastest way to convey a futuristic world is by giving characters the power to control their environment with a simple wave of the hand. Gesture control isn’t ubiquitous yet, but the technology is progressing. In January, the Federal Communications Commission approved Google’s Project Soli, a sensing technology that uses miniature radar to detect touchless gestures.
A Systematic Evaluation and Benchmark for Person Re-Identification: Features, Metrics, and Datasets
IEEE Transactions on Pattern Analysis and Machine Intelligence
Mengran Gou, Ziyan Wu, Angels Rates-Borras, Octavia Camps, Richard J Radke
2019
Person re-identification (re-id) is a critical problem in video analytics applications such as security and surveillance. The public release of several datasets and code for vision algorithms has facilitated rapid progress in this area over the last few years. However, directly comparing re-id algorithms reported in the literature has become difficult since a wide variety of features, experimental protocols, and evaluation metrics are employed. In order to address this need, we present an extensive review and performance evaluation of single- and multi-shot re-id algorithms. The experimental protocol incorporates the most recent advances in both feature extraction and metric learning. To ensure a fair comparison, all of the approaches were implemented using a unified code library that includes 11 feature extraction algorithms and 22 metric learning and ranking techniques. All approaches were evaluated using a new large-scale dataset that closely mimics a real-world problem setting, in addition to 16 other publicly available datasets: VIPeR, GRID, CAVIAR, DukeMTMC4ReID, 3DPeS, PRID, V47, WARD, SAIVT-SoftBio, CUHK01, CUHK02, CUHK03, RAiD, iLIDSVID, HDA+, and Market1501. The evaluation codebase and results will be made publicly available for community use.
Re-Identification with Consistent Attentive Siamese Networks
arXiv:1811.07487
Meng Zheng, Srikrishna Karanam, Ziyan Wu, Richard J. Radke
2019
We propose a new deep architecture for person re-identification (re-id). While re-id has seen much recent progress, spatial localization and view-invariant representation learning for robust cross-view matching remain key, unsolved problems. We address these questions by means of a new attention-driven Siamese learning architecture, called the Consistent Attentive Siamese Network. Our key innovations compared to existing, competing methods include (a) a flexible framework design that produces attention with only identity labels as supervision, (b) explicit mechanisms to enforce attention consistency among images of the same person, and (c) a new Siamese framework that integrates attention and attention consistency, producing principled supervisory signals as well as the first mechanism that can explain the reasoning behind the Siamese framework's predictions. We conduct extensive evaluations on the CUHK03-NP, DukeMTMC-ReID, and Market-1501 datasets and report competitive performance.
A Multimodal-Sensor-Enabled Room for Unobtrusive Group Meeting Analysis
Proceedings of the 20th ACM International Conference on Multimodal Interaction
Indrani Bhattacharya, Michael Foley, Ni Zhang, Tongtao Zhang, Christine Ku, Cameron Mine, Heng Ji, Christoph Riedl, Brooke Foucault Welles, Richard J Radke
2018
Group meetings can suffer from serious problems that undermine performance, including bias, "groupthink", fear of speaking, and unfocused discussion. To better understand these issues, propose interventions, and thus improve team performance, we need to study human dynamics in group meetings. However, this process currently heavily depends on manual coding and video cameras. Manual coding is tedious, inaccurate, and subjective, while active video cameras can affect the natural behavior of meeting participants. Here, we present a smart meeting room that combines microphones and unobtrusive ceiling-mounted Time-of-Flight (ToF) sensors to understand group dynamics in team meetings. We automatically process the multimodal sensor outputs with signal, image, and natural language processing algorithms to estimate participant head pose, visual focus of attention (VFOA), non-verbal speech patterns, and discussion content. We derive metrics from these automatic estimates and correlate them with user-reported rankings of emergent group leaders and major contributors to produce accurate predictors. We validate our algorithms and report results on a new dataset of lunar survival tasks of 36 individuals across 10 groups collected in the multimodal-sensor-enabled smart room.