Jiang Bian - University of Florida. Gainesville, FL, US

Jiang Bian

Professor/Chief | University of Florida

Gainesville, FL, UNITED STATES

Jiang Bian uses big data to advance medicine and creates methods to analyze diverse, massive data sources.


Jiang Bian focuses on biomedical informatics, an interdisciplinary field whose central theme is the effective use of data, information, and knowledge for scientific inquiry, problem-solving, and decision-making, motivated by efforts to improve human health. Jiang has a diverse yet strong multidisciplinary background in data integration, semantic web, machine learning, natural language processing, social media analysis, network science, data privacy, and software engineering. Across these areas, his expertise serves an overarching theme: data science with heterogeneous data, information, and knowledge resources.

Areas of Expertise (7)

Data Privacy in Healthcare

Cancer Data

Artificial Intelligence

Data-driven Medicine

Biomedical Informatics

Machine Learning

Natural Language Processing

Articles (3)

Early prediction of Alzheimer's disease and related dementias using real-world electronic health records

Alzheimers Dement

Qian Li, et al.


This study aims to explore machine learning (ML) methods for early prediction of Alzheimer's disease (AD) and related dementias (ADRD) using the real-world electronic health records (EHRs). A total of 23,835 ADRD and 1,038,643 control patients were identified from the OneFlorida+ Research Consortium. Two ML methods were used to develop the prediction models. Both knowledge-driven and data-driven approaches were explored. Four computable phenotyping algorithms were tested.

A large language model for electronic health records

NPJ Digital Medicine

Xi Yang, et al.


There is an increasing interest in developing artificial intelligence (AI) systems to process and interpret electronic health records (EHRs). Natural language processing (NLP) powered by pretrained language models is the key technology for medical AI systems utilizing clinical narratives. However, there are few clinical language models, the largest of which trained in the clinical domain is comparatively small at 110 million parameters (compared with billions of parameters in the general domain).

Data-driven identification of post-acute SARS-CoV-2 infection subphenotypes

Nature Medicine

Hao Zhang, et al.


The post-acute sequelae of SARS-CoV-2 infection (PASC) refers to a broad spectrum of symptoms and signs that are persistent, exacerbated or newly incident in the period after acute SARS-CoV-2 infection. Most studies have examined these conditions individually without providing evidence on co-occurring conditions. In this study, we leveraged the electronic health record data of two large cohorts, INSIGHT and OneFlorida+, from the national Patient-Centered Clinical Research Network.
