Nan Xi, M.D., Ph.D.

Assistant Professor

Richmond, VA, United States
Department of Computer Science

Research focus: Computer Vision, Medical AI.

Biography

Dr. Nan Xi is a tenure-track Assistant Professor in the Department of Computer Science at Virginia Commonwealth University. His research spans Computer Vision and Medical AI. His work on visual reasoning was recognized by the Sony Research Award Program in 2024.

Education

University at Buffalo

Ph.D.

Computer Science and Engineering

2025

Research Focus

Dr. Xi's research focuses on giving AI systems visual reasoning abilities akin to those of human experts, with the goal of making these models more reliable and more usable, especially in medicine and health applications.

Courses

CMSC 630 - Computer Vision and Image Processing

This course introduces the areas of Artificial Intelligence that deal with the fundamental issues and techniques of computer vision and image processing. The emphasis is on the physical, mathematical, and information-processing aspects of computational vision and image processing.

Selected Articles

Pix2Key: Controllable Open-Vocabulary Retrieval with Semantic Decomposition and Self-Supervised Visual Dictionary Learning

Forty-third International Conference on Machine Learning (ICML)

Guoyizhe Wei, Yang Jiao, Nan Xi, Zhishen Huang, Jingjing Meng, Rama Chellappa, Yan Gao

2026-07-06

Pix2Key represents both queries and candidates as open-vocabulary visual dictionaries, enabling intent-aware constraint matching and diversity-aware reranking in a unified embedding space.

dFLMoE: Decentralized Federated Learning via Mixture of Experts for Medical Data Analysis

Proceedings of Computer Vision and Pattern Recognition (CVPR)

Luyuan Xie, Tianyu Luan, Wenyuan Cai, Guochen Yan, Zhaoyu Chen, Nan Xi, Yuejian Fang, Qingni Shen, Zhonghai Wu, Junsong Yuan

2025-06-09

dFLMoE is a decentralized federated learning framework that transmits each client’s knowledge to the other clients and performs decision-making locally on each client. This avoids the knowledge damage caused by centralized server aggregation and removes the dependence on a central server.

Interaction-centric Spatio-Temporal Context Reasoning for Multi-person Video HOI Recognition

European Conference on Computer Vision (ECCV)

Yisong Wang, Nan Xi*, Jingjing Meng, Junsong Yuan (* Corresponding Author)

2024-07-08

Open Set Video HOI detection from Action-centric Chain-of-Look Prompting

Chain-of-Look Prompting for Verb-centric Surgical Triplet Recognition in Endoscopic Videos