Nan Xi, M.D., Ph.D.

Assistant Professor, VCU College of Engineering

Richmond, VA

Research focus: Computer Vision, Medical AI.

Biography

Dr. Nan Xi is a tenure-track Assistant Professor in the Department of Computer Science at Virginia Commonwealth University, where he leads the MAViC (Medical AI and Visual Computing) Lab. His research spans computer vision and medical AI. His work has been published in leading venues in these fields, including CVPR, ICCV, ECCV, ICML, ACM MM, MICCAI, and WWW. In 2024, his work on visual reasoning was recognized by the Sony Research Award Program.

Industry Expertise

Research

Computer Software

Areas of Expertise

Computer Vision

Artificial Intelligence

Biomedical AI

Education

University at Buffalo

Ph.D.

Computer Science and Engineering

2025

Peking University

Doctor of Medicine

2019

Research Focus

Dr. Xi's research focuses on giving AI systems visual reasoning abilities akin to those of humans, including human experts, with the goal of enhancing both the reliability and the usability of these models, especially in medical and health applications.

Courses

CMSC 630 - Computer Vision and Image Processing

This course is an introduction to those areas of Artificial Intelligence that deal with fundamental issues and techniques of computer vision and image processing. The emphasis is on physical, mathematical, and information-processing aspects of computational vision and image processing.

Selected Articles

Latent Visual Diffusion Reasoning with Monte Carlo Tree Search

European Conference on Computer Vision (ECCV), 2026

Xirui Teng, Nan Xi*, Junsong Yuan (* corresponding author)

2026-06-18

Proposes Latent Diffusion Visual Reasoning (LDVR), which integrates keypoint-guided Monte Carlo Tree Search (MCTS) to model and visualize the latent visual reasoning process. LDVR not only produces more accurate skill assessments but also uncovers the critical visual reasoning sequences that contribute to the final evaluation.

Pix2Key: Controllable Open-Vocabulary Retrieval with Semantic Decomposition and Self-Supervised Visual Dictionary Learning

Forty-third International Conference on Machine Learning (ICML), 2026

Guoyizhe Wei, Yang Jiao, Nan Xi, Zhishen Huang, Jingjing Meng, Rama Chellappa, Yan Gao

2026-07-06

Introduces Pix2Key, which represents both queries and candidates as open-vocabulary visual dictionaries, enabling intent-aware constraint matching and diversity-aware reranking in a unified embedding space.

dFLMoE: Decentralized Federated Learning via Mixture of Experts for Medical Data Analysis

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025

Luyuan Xie, Tianyu Luan, Wenyuan Cai, Guochen Yan, Zhaoyu Chen, Nan Xi, Yuejian Fang, Qingni Shen, Zhonghai Wu, Junsong Yuan

2025-06-09

Proposes dFLMoE, a decentralized federated learning framework that transmits each client’s knowledge to other clients and performs local decision-making on each client. This avoids the knowledge damage caused by centralized server aggregation and eliminates the dependence on a central server.

Interaction-centric Spatio-Temporal Context Reasoning for Multi-person Video HOI Recognition

Open Set Video HOI detection from Action-centric Chain-of-Look Prompting

Chain-of-Look Prompting for Verb-centric Surgical Triplet Recognition in Endoscopic Videos