Deva Ramanan - Carnegie Mellon University. Pittsburgh, PA, US

Deva Ramanan

Professor | Carnegie Mellon University


Deva Ramanan's research interests span computer vision and machine learning, with a focus on visual recognition.


Deva Ramanan is a professor in the Robotics Institute at Carnegie Mellon University and the director of the CMU Argo AI Center for Autonomous Vehicle Research. The Center engages in fundamental research to produce advanced perception and next-generation decision-making algorithms that enable vehicles to perceive and navigate autonomously in diverse real-world urban conditions. His research interests span computer vision and machine learning, with a focus on visual recognition, often motivated by the task of understanding people from visual data. He served as program chair of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2018. He is on the editorial board of the International Journal of Computer Vision and is an associate editor for the IEEE Transactions on Pattern Analysis and Machine Intelligence. He regularly serves as a senior program committee member for CVPR, the International Conference on Computer Vision, and the European Conference on Computer Vision. He also regularly serves on NSF panels for computer vision and machine learning.

Areas of Expertise (10)

Human-Centered Robotics

Human-Robot Collaboration

Machine Learning Embedded in Systems


3-D Vision and Recognition

Computer Vision

Visual Servoing and Visual Tracking

First-Person Vision

Sensing & Perception

Graphics & Creative Tools

Media Appearances (5)

Generative modeling tool renders 2D sketches in 3D

Tech Xplore  online


"As long as you can draw a sketch, you can make your own customized 3D model," said RI doctoral candidate Kangle Deng, who was part of the research team with Zhu, Professor Deva Ramanan and Ph.D. student Gengshan Yang.


Self-driving cars would be nowhere without HD maps

Axios  online


"Even though a traffic light and the moon may resemble each other, a self-driving system should use a combination of contextual cues — including spatial, temporal and prior knowledge — to tell them apart," Deva Ramanan, principal scientist at self-driving tech competitor Argo AI, explains in a blog post.


New Perception Metric Balances Reaction Time, Accuracy

Carnegie Mellon University  online


The new metric, called streaming perception accuracy, was developed by Li, together with Deva Ramanan, associate professor in the Robotics Institute and principal scientist at Argo AI, and Yu-Xiong Wang, assistant professor at the University of Illinois at Urbana-Champaign. They presented it last month at the virtual European Conference on Computer Vision, where it received a best paper honorable mention award.


Carnegie Mellon, Argo AI to Create Self-Driving Vehicle Research Center

Robotics Business Review  online


Deva Ramanan, an associate professor in the Robotics Institute who also serves as machine learning lead at Argo AI, will be the center’s principal investigator. The center’s research will involve faculty members and students from across CMU. The center will give students access to the fleet-scale data sets, vehicles and large-scale infrastructure that are crucial for advancing self-driving technologies and that otherwise would be difficult to obtain.


Beyond deep fakes: Transforming video content into another video's style, automatically

EurekAlert!  online


Bansal will present the method today at ECCV 2018, the European Conference on Computer Vision, in Munich. His co-authors include Deva Ramanan, CMU associate professor of robotics.








CVPR23 E2EAD | Deva Ramanan, Invited Talk

[CVPR'21 WAD] Keynote - Deva Ramanan, Argo/CMU

MCS 2020. Day 1. Deva Ramanan


Industry Expertise (2)

Computer Networking


Accomplishments (1)

IARPA Award for "Walk-Through Rendering From Images of Varying Altitudes" (professional)


Education (2)

University of California at Berkeley: Ph.D., Electrical Engineering and Computer Science

University of Delaware: B.S., Computer Engineering

Affiliations (3)

  • IEEE Computer Vision and Pattern Recognition (CVPR)
  • International Journal of Computer Vision
  • IEEE Transactions on Pattern Analysis and Machine Intelligence

Articles (5)

Edge-based Privacy-Sensitive Live Learning for Discovery of Training Data

Proceedings of the 1st International Workshop on Networked AI Systems

2023 Finding true positives (TPs) to construct a training set for a new class of interest in machine learning (ML) is often a challenge. The novelty of the class suggests that cloud archives are unlikely to be helpful. We observe that most video data collected for surveillance and briefly stored at the edge before being overwritten is currently unused. To efficiently harness this untapped resource, we describe Delphi, a privacy-sensitive interactive labeling system that continuously improves labeling productivity through background learning. Our experimental results confirm the value of Delphi for training set construction from edge-sourced data.


Towards long-tailed 3D detection

Conference on Robot Learning

2023 Contemporary autonomous vehicle (AV) benchmarks have advanced techniques for training 3D detectors, particularly on large-scale lidar data. Surprisingly, although semantic class labels naturally follow a long-tailed distribution, contemporary benchmarks focus on only a few common classes (e.g., pedestrian and car) and neglect many rare classes in the tail (e.g., debris and stroller). However, AVs must still detect rare classes to ensure safe operation. Moreover, semantic classes are often organized within a hierarchy; for example, tail classes such as child and construction-worker are arguably subclasses of pedestrian. However, such hierarchical relationships are often ignored, which may lead to misleading estimates of performance and missed opportunities for algorithmic innovation.


WEDGE: A multi-weather autonomous driving dataset built from generative vision-language models

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition

2023 The open road poses many challenges to autonomous perception, including poor visibility from extreme weather conditions. Models trained on good-weather datasets frequently fail at detection in these out-of-distribution settings. To aid adversarial robustness in perception, we introduce WEDGE (WEather images by DALL-E GEneration): a synthetic dataset generated with a vision-language generative model via prompting. WEDGE consists of 3360 images in 16 extreme weather conditions manually annotated with 16513 bounding boxes, supporting research in the tasks of weather classification and 2D object detection. We have analyzed WEDGE from research standpoints, verifying its effectiveness for extreme-weather autonomous perception. We establish baseline performance for classification and detection with 53.87% test accuracy and 45.41 mAP.


Reconstructing animatable categories from videos

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition

2023 Building animatable 3D models is challenging due to the need for 3D scans, laborious registration, and manual rigging. Recently, differentiable rendering provides a pathway to obtain high-quality 3D models from monocular videos, but these are limited to rigid categories or single instances. We present RAC, a method to build category-level 3D models from monocular videos, disentangling variations over instances and motion over time. Three key ideas are introduced to solve this problem: (1) specializing a category-level skeleton to instances, (2) a method for latent space regularization that encourages shared structure across a category while maintaining instance details, and (3) using 3D background models to disentangle objects from the background. We build 3D models for humans, cats, and dogs given monocular videos.


Distilling Neural Fields for Real-Time Articulated Shape Reconstruction

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition

2023 We present a method for reconstructing articulated 3D models from videos in real time, without test-time optimization or manual 3D supervision at training time. Prior work often relies on pre-built deformable models (e.g., SMAL/SMPL), or slow per-scene optimization through differentiable rendering (e.g., dynamic NeRFs). Such methods fail to support arbitrary object categories, or are unsuitable for real-time applications. To address the challenge of collecting large-scale 3D training data for arbitrary deformable object categories, our key insight is to use off-the-shelf video-based dynamic NeRFs as 3D supervision to train a fast feed-forward network, turning 3D shape and motion prediction into a supervised distillation task. Our temporal-aware network uses articulated bones and blend skinning to represent arbitrary deformations, and is self-supervised on video datasets without requiring 3D shapes or viewpoints as input.
