Yiyang (Ian) Wang, Ph.D.

Assistant Professor, Milwaukee School of Engineering

Milwaukee, WI

Dr. Wang’s research interests lie at the intersection of machine learning, data science, and medical informatics.

Education, Licensure and Certification

Ph.D.

Computer and Information Sciences

DePaul University

2023

M.S.

Computer Science

DePaul University

2018

Biography

Yiyang (Ian) Wang is an assistant professor in the Diercks School of Advanced Computing at MSOE, where he has served since 2023. He earned his Ph.D. in computer and information sciences from DePaul University in 2023. His research centers on data science and machine learning, with a particular focus on advancing medical informatics and improving healthcare outcomes through AI.

Areas of Expertise

Machine Learning & Artificial Intelligence

Data Science & Predictive Modeling

Biomedical and Health Informatics

Fairness, Robustness, and Generalization in AI

Selected Publications

Harnessing Generative AI for Lung Nodule Spiculation Characterization.

Journal of Imaging Informatics in Medicine (2025): 1-17

Wang, Yiyang, Charmi Patel, Roselyne Tchoua, Jacob Furst, and Daniela Raicu.

2025-06-26

This work introduces a new approach to improving the diagnosis of lung cancer by teaching artificial intelligence (AI) systems to better recognize a key radiological feature known as spiculation—irregular, spike-like patterns on the edges of lung nodules that are closely linked to tumor aggressiveness. Traditional computer-aided diagnosis (CAD) tools struggle with spiculation because it is subtle, difficult to quantify, and underrepresented in existing datasets. To address this, we developed a framework using variational autoencoders (VAEs) to generate realistic variations of spiculated nodules from existing medical images. By augmenting the dataset with these synthetic but clinically meaningful examples, we significantly improved the model’s ability to detect spiculation—by up to 7.5%—without reducing accuracy on other cases. Beyond boosting performance, this work shows how advanced AI can uncover and reproduce clinically important features, offering both improved diagnostic support and deeper insights into tumor progression.

Outcome risk model development for heterogeneity of treatment effect analyses: a comparison of non-parametric machine learning methods and semi-parametric statistical methods

BMC Medical Research Methodology 24, no. 1 (2024): 158.

Xu, Edward, Joseph Vanghelof, Yiyang Wang, Anisha Patel, Jacob Furst, Daniela Stan Raicu, Johannes Tobias Neumann et al.

2024-07-23

This study explores how different modeling approaches influence the detection of heterogeneity of treatment effect (HTE) in clinical trials. Using data from the ASPREE trial (ASPirin in Reducing Events in the Elderly), we compared three methods for creating outcome risk subgroups: a traditional proportional hazards model, a decision tree, and a random forest. Each approach partitions participants into risk-based subgroups to evaluate whether aspirin’s effects on outcomes such as death, dementia, or disability differ across groups. While both machine learning models identified meaningful risk strata, only the proportional hazards model revealed statistically significant variation in absolute risk reduction across subgroups. Our findings show that the choice of modeling technique can shape HTE analysis results, highlighting the trade-offs between interpretability, robustness, and statistical power. This work underscores the importance of carefully selecting subgrouping methods to ensure reliable, clinically meaningful insights into how treatments affect diverse patient populations.
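As a rough illustration of the subgroup comparison described above, the sketch below computes absolute risk reduction (control event rate minus treated event rate) within risk-based subgroups. It is not the study's code: the function name, the tuple layout, the cutoff, and the toy records are all hypothetical, and the risk scores are assumed to come from an already fitted outcome risk model such as a proportional hazards model, decision tree, or random forest.

```python
from collections import defaultdict


def absolute_risk_reduction_by_subgroup(records, cutoffs):
    """Compute absolute risk reduction (ARR) per risk subgroup.

    records: iterable of (risk_score, treated, event) tuples (hypothetical layout).
    cutoffs: ascending risk-score thresholds that separate the subgroups.
    Returns {subgroup_index: control_event_rate - treated_event_rate}.
    """
    # counts[group][treated_flag] = [number of events, number of participants]
    counts = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for risk, treated, event in records:
        # Subgroup index = how many cutoffs the risk score meets or exceeds.
        group = sum(risk >= c for c in cutoffs)
        arm = counts[group][bool(treated)]
        arm[0] += int(event)
        arm[1] += 1

    arr = {}
    for group, arms in sorted(counts.items()):
        (t_events, t_n), (c_events, c_n) = arms[True], arms[False]
        if t_n and c_n:  # skip subgroups that are missing one trial arm
            arr[group] = c_events / c_n - t_events / t_n
    return arr


# Hypothetical toy data: (predicted risk, received treatment, had event)
toy_records = [
    (0.1, True, 0), (0.1, True, 0), (0.1, False, 0), (0.1, False, 0),
    (0.9, True, 0), (0.9, True, 1), (0.9, False, 1), (0.9, False, 1),
]
print(absolute_risk_reduction_by_subgroup(toy_records, cutoffs=[0.5]))
# → {0: 0.0, 1: 0.5}
```

In this toy case the treatment shows no benefit in the low-risk subgroup and a 0.5 absolute risk reduction in the high-risk subgroup. A real analysis would also need a statistical test of whether the difference across subgroups is significant, which this sketch omits.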

No nodule left behind: evaluating lung nodule malignancy classification with different stratification schemes

Medical Imaging 2023: Computer-Aided Diagnosis

2023

Machine learning models have been widely used in lung cancer computer-aided diagnosis (CAD) studies. However, the heterogeneity in the visual appearance of lung nodules and the lack of consideration of hidden subgroups in the data are significant obstacles to generating accurate CAD outcomes across all nodule instances. Previous lung cancer CAD models are trained with Empirical Risk Minimization (ERM), which yields high overall accuracy but often fails on certain subgroups that arise from lung cancer heterogeneity.
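The gap between overall and subgroup performance can be made concrete with a small sketch. The code below is only an illustration under assumed inputs, not the paper's evaluation code: the function name, the subgroup labels, and the toy predictions are hypothetical. It reports overall accuracy alongside per-subgroup and worst-subgroup accuracy.

```python
from collections import defaultdict


def accuracy_report(labels, predictions, subgroups):
    """Return (overall accuracy, per-subgroup accuracy, worst-subgroup accuracy)."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for y, p, g in zip(labels, predictions, subgroups):
        totals[g] += 1
        hits[g] += int(y == p)
    per_group = {g: hits[g] / totals[g] for g in totals}
    overall = sum(hits.values()) / sum(totals.values())
    return overall, per_group, min(per_group.values())


# Hypothetical toy data: eight nodules in a common subgroup, two in a rare one.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1]
predictions = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
subgroups = ["common"] * 8 + ["rare"] * 2

overall, per_group, worst = accuracy_report(labels, predictions, subgroups)
print(overall)    # → 0.8
print(per_group)  # → {'common': 1.0, 'rare': 0.0}
print(worst)      # → 0.0
```

Here an overall accuracy of 0.8 hides the fact that every nodule in the rare subgroup is misclassified, which is the kind of failure that an evaluation stratified by subgroup is meant to expose.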