Deva Ramanan

Professor

  • Pittsburgh, PA, United States

Deva Ramanan's research interests span computer vision and machine learning, with a focus on visual recognition.

Biography

Deva Ramanan is a professor in the Robotics Institute at Carnegie Mellon University and the director of the CMU Argo AI Center for Autonomous Vehicle Research. The center conducts fundamental research on advanced perception and next-generation decision-making algorithms that enable vehicles to perceive and navigate autonomously in diverse real-world urban conditions. His research interests span computer vision and machine learning, with a focus on visual recognition, often motivated by the task of understanding people from visual data. He served as a program chair of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2018. He is on the editorial board of the International Journal of Computer Vision and is an associate editor of the IEEE Transactions on Pattern Analysis and Machine Intelligence. He regularly serves as a senior program committee member for CVPR, the International Conference on Computer Vision, and the European Conference on Computer Vision, and he regularly serves on NSF panels for computer vision and machine learning.

Areas of Expertise

Human-Centered Robotics
Human-Robot Collaboration
Machine Learning Embedded in Systems
Neurorobotics
3-D Vision and Recognition
Computer Vision
Visual Servoing and Visual Tracking
First-Person Vision
Sensing & Perception
Graphics & Creative Tools

Media Appearances

SpreeAI Is Redefining Retail With Virtual AI-Powered Try-Ons Curated by the Top in Tech and Fashion

Associated Press  online

2025-05-06

Deva Ramanan (Robotics Institute) believes SpreeAI is "assembling a team that understands both the deep technical challenges and their product impact." SpreeAI is an app that allows users to virtually try on outfits.

Generative modeling tool renders 2D sketches in 3D

Tech Xplore  online

2023-04-06

"As long as you can draw a sketch, you can make your own customized 3D model," said RI doctoral candidate Kangle Deng, who was part of the research team with Zhu, Professor Deva Ramanan and Ph.D. student Gengshan Yang.

Self-driving cars would be nowhere without HD maps

Axios  online

2021-08-09

"Even though a traffic light and the moon may resemble each other, a self-driving system should use a combination of contextual cues — including spatial, temporal and prior knowledge — to tell them apart," Deva Ramanan, principal scientist at self-driving tech competitor Argo AI explains in a blog post.

Industry Expertise

Computer Networking
Automotive

Accomplishments

IARPA Award for "Walk-Through Rendering From Images of Varying Altitudes"

2023-2027

Education

University of Delaware

B.S.

Computer Engineering

University of California, Berkeley

Ph.D.

Electrical Engineering and Computer Science

Affiliations

  • IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
  • International Journal of Computer Vision
  • IEEE Transactions on Pattern Analysis and Machine Intelligence

Articles

Distilling Neural Fields for Real-Time Articulated Shape Reconstruction

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition

2023

We present a method for reconstructing articulated 3D models from videos in real time, without test-time optimization or manual 3D supervision at training time. Prior work often relies on pre-built deformable models (e.g., SMAL/SMPL) or on slow per-scene optimization through differentiable rendering (e.g., dynamic NeRFs). Such methods fail to support arbitrary object categories or are unsuitable for real-time applications. To address the challenge of collecting large-scale 3D training data for arbitrary deformable object categories, our key insight is to use off-the-shelf video-based dynamic NeRFs as 3D supervision to train a fast feed-forward network, turning 3D shape and motion prediction into a supervised distillation task. Our temporally aware network uses articulated bones and blend skinning to represent arbitrary deformations, and is self-supervised on video datasets without requiring 3D shapes or viewpoints as input.
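The "articulated bones and blend skinning" representation mentioned in the abstract is standard linear blend skinning (LBS). As a rough illustration of that deformation model (not the paper's exact parameterization; the shapes and names below are placeholders), each vertex is moved by a weighted blend of rigid per-bone transforms:

```python
import numpy as np

def blend_skinning(verts, bone_transforms, skin_weights):
    """Deform rest-pose vertices with a set of rigid bone transforms.

    verts:           (V, 3) rest-pose vertex positions
    bone_transforms: (B, 4, 4) rigid transform per bone
    skin_weights:    (V, B) per-vertex blend weights, rows sum to 1
    """
    V = verts.shape[0]
    verts_h = np.concatenate([verts, np.ones((V, 1))], axis=1)         # (V, 4) homogeneous
    # Blend the bone transforms per vertex, then apply the blended transform once.
    blended = np.einsum("vb,bij->vij", skin_weights, bone_transforms)  # (V, 4, 4)
    deformed = np.einsum("vij,vj->vi", blended, verts_h)               # (V, 4)
    return deformed[:, :3]

# Toy check: identity bone transforms leave vertices unchanged.
verts = np.random.rand(3, 3)
T = np.stack([np.eye(4)] * 2)       # 2 bones
w = np.full((3, 2), 0.5)            # equal weights
assert np.allclose(blend_skinning(verts, T, w), verts)
```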

Reconstructing animatable categories from videos

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition

2023

Building animatable 3D models is challenging due to the need for 3D scans, laborious registration, and manual rigging. Recently, differentiable rendering has provided a pathway to obtain high-quality 3D models from monocular videos, but these are limited to rigid categories or single instances. We present RAC, a method to build category-level 3D models from monocular videos, disentangling variations over instances and motion over time. Three key ideas are introduced to solve this problem: (1) specializing a category-level skeleton to instances, (2) a method for latent space regularization that encourages shared structure across a category while maintaining instance details, and (3) using 3D background models to disentangle objects from the background. We build 3D models for humans, cats, and dogs given monocular videos.
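As a rough sketch of idea (2), one simple way to encourage shared structure across a category is to keep per-instance latent codes small, so that common shape and skeleton structure is absorbed by the shared, code-independent parts of the model. The snippet below illustrates that general idea under assumed names and shapes; it is not RAC's exact regularizer:

```python
import torch

def instance_code_penalty(codes: torch.Tensor, weight: float = 1e-3) -> torch.Tensor:
    """codes: (N, D) per-instance latent codes, one per monocular video.
    Penalizing their norm pushes category-wide structure into the shared
    model, while the small codes retain instance-specific detail."""
    return weight * codes.pow(2).mean()

codes = torch.randn(8, 32, requires_grad=True)  # 8 instances, 32-dim codes (illustrative)
loss = instance_code_penalty(codes)
loss.backward()  # would be added to the main reconstruction loss
```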

WEDGE: A multi-weather autonomous driving dataset built from generative vision-language models

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition

2023

The open road poses many challenges to autonomous perception, including poor visibility from extreme weather conditions. Models trained on good-weather datasets frequently fail at detection in these out-of-distribution settings. To aid adversarial robustness in perception, we introduce WEDGE (WEather images by DALL-E GEneration): a synthetic dataset generated with a vision-language generative model via prompting. WEDGE consists of 3,360 images in 16 extreme weather conditions, manually annotated with 16,513 bounding boxes, supporting research in the tasks of weather classification and 2D object detection. We analyze WEDGE from research standpoints, verifying its effectiveness for extreme-weather autonomous perception, and establish baseline performance for classification and detection with 53.87% test accuracy and 45.41 mAP.
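A minimal sketch of the prompting recipe such a dataset implies: enumerate weather conditions and driving scenes, and render one text prompt per pair for a text-to-image model. The lists and template below are illustrative guesses rather than the paper's actual prompts, and the image-generation call is left abstract:

```python
# Illustrative subsets; WEDGE covers 16 weather conditions.
WEATHER = ["heavy snow", "dense fog", "torrential rain", "dust storm"]
SCENES = ["a highway with cars", "an urban intersection with pedestrians"]

def build_prompts():
    """Cross weather conditions with driving scenes to get one prompt each."""
    return [f"A photo of {scene} in {weather}"
            for weather in WEATHER for scene in SCENES]

prompts = build_prompts()
print(len(prompts), "prompts, e.g.:", prompts[0])
# images = [text_to_image_model(p) for p in prompts]  # e.g., DALL-E; call omitted
```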
