UF team develops AI tool to make genetic research more comprehensive

Dec 15, 2025


Kiley Graim


University of Florida researchers are addressing a critical gap in medical genetic research — ensuring it better represents and benefits people of all backgrounds.


Their work, led by Kiley Graim, Ph.D., an assistant professor in the Department of Computer & Information Science & Engineering, focuses on improving human health by addressing "ancestral bias" in genetic data, a problem that arises when most research is based on data from a single ancestral group. This bias limits advancements in precision medicine, Graim said, and leaves large portions of the global population underserved when it comes to disease treatment and prevention.


To solve this, the team developed PhyloFrame, a machine-learning tool that accounts for ancestral diversity in genetic data. Supported by funding from the National Institutes of Health, the project aims to improve how diseases are predicted, diagnosed, and treated for everyone, regardless of ancestry. A paper describing the PhyloFrame method and the marked improvements it showed in precision medicine outcomes was published Monday in Nature Communications.



Graim, a member of the UF Health Cancer Center, said her inspiration to focus on ancestral bias in genomic data evolved from a conversation with a doctor who was frustrated by a study's limited relevance to his diverse patient population. This encounter led her to explore how AI could help bridge the gap in genetic research.






“I thought to myself, ‘I can fix that problem,’” said Graim, whose research centers around machine learning and precision medicine and who is trained in population genomics. “If our training data doesn’t match our real-world data, we have ways to deal with that using machine learning. They’re not perfect, but they can do a lot to address the issue.”





Drawing on the population genomics database gnomAD, PhyloFrame integrates massive databases of healthy human genomes with the smaller, disease-specific datasets used to train precision medicine models. The models it creates are better equipped to handle diverse genetic backgrounds. For example, it can predict the differences between subtypes of diseases like breast cancer and suggest the best treatment for each patient, regardless of ancestry.



Processing such massive amounts of data is no small feat. The team uses UF’s HiPerGator, one of the most powerful supercomputers in the country, to analyze genomic information from millions of people. For each person, that means processing 3 billion base pairs of DNA.
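To give a rough sense of that scale, here is a back-of-the-envelope sketch. It rests on assumptions that are not from the study: each base is stored in 2 bits with no compression, quality scores, or metadata, and the cohort size of one million is only an illustrative stand-in for "millions." Real sequencing files are typically far larger.

```python
# Back-of-the-envelope estimate of raw genome storage.
# Assumptions (not from the study): 2 bits per base (A, C, G, T),
# no compression, no quality scores or metadata.

BASE_PAIRS_PER_GENOME = 3_000_000_000  # figure quoted in the article
BITS_PER_BASE = 2                      # assumed minimal encoding


def raw_genome_bytes(base_pairs=BASE_PAIRS_PER_GENOME, bits_per_base=BITS_PER_BASE):
    """Bytes needed to store one genome under the assumed encoding."""
    return base_pairs * bits_per_base // 8


def cohort_terabytes(n_people):
    """Terabytes (10**12 bytes) needed for n_people genomes."""
    return n_people * raw_genome_bytes() / 1e12


if __name__ == "__main__":
    print(raw_genome_bytes())           # bytes per genome
    print(cohort_terabytes(1_000_000))  # hypothetical cohort of one million
```

Under these assumptions, one genome takes about 750 megabytes and a hypothetical cohort of one million people about 750 terabytes, before any of the actual analysis begins.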


“I didn’t think it would work as well as it did,” said Graim, noting that her doctoral student, Leslie Smith, contributed significantly to the study. “What started as a small project using a simple model to demonstrate the impact of incorporating population genomics data has evolved into securing funds to develop more sophisticated models and to refine how populations are defined.”


What sets PhyloFrame apart is its ability to ensure predictions remain accurate across populations by considering genetic differences linked to ancestry. This is crucial because most current models are built using data that does not fully represent the world’s population. Much of the existing data comes from research hospitals and patients who trust the health care system. This means populations in small towns or those who distrust medical systems are often left out, making it harder to develop treatments that work well for everyone.
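One simple way to check whether predictions "remain accurate across populations" is to score a model separately for each ancestry group instead of reporting a single overall number. The sketch below illustrates that idea only; it is not PhyloFrame's method. The function name, group labels, and toy predictions are all hypothetical and made up for illustration.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return {group: fraction of correct predictions} for each group label."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {g: correct[g] / total[g] for g in total}


# Toy, made-up labels purely for illustration (not study data).
y_true = ["subtype_x", "subtype_y", "subtype_x", "subtype_y", "subtype_x", "subtype_y"]
y_pred = ["subtype_x", "subtype_y", "subtype_x", "subtype_x", "subtype_y", "subtype_y"]
groups = ["A", "A", "A", "B", "B", "B"]

scores = accuracy_by_group(y_true, y_pred, groups)
overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

if __name__ == "__main__":
    print(scores)   # per-group accuracy
    print(overall)  # single pooled accuracy
```

In this toy example the pooled accuracy looks moderate, yet every error falls in group B. That is the kind of gap a single overall score can hide.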


Graim also estimated that 97% of sequenced samples come from people of European ancestry. That is due largely to national- and state-level funding and priorities, but also to socioeconomic factors that snowball at different levels: insurance affects whether people get treated, for example, which in turn affects how likely they are to be sequenced.


“Some other countries, notably China and Japan, have recently been trying to close this gap, and so there is more data from these countries than there had been previously but still nothing like the European data,” she said. “Poorer populations are generally excluded entirely.”


Thus, diversity in training data is essential, Graim said.


“We want these models to work for any patient, not just the ones in our studies,” she said. “Having diverse training data makes models better for Europeans, too. Having the population genomics data helps prevent models from overfitting, which means that they’ll work better for everyone, including Europeans.”


Graim believes tools like PhyloFrame will eventually be used in the clinical setting, replacing traditional models to develop treatment plans tailored to individuals based on their genetic makeup. The team’s next steps include refining PhyloFrame and expanding its applications to more diseases.


“My dream is to help advance precision medicine through this kind of machine learning method, so people can get diagnosed early and are treated with what works specifically for them and with the fewest side effects,” she said. “Getting the right treatment to the right person at the right time is what we’re striving for.”


Graim’s project received funding from the UF College of Medicine Office of Research’s AI2 Datathon grant award, which is designed to help researchers and clinicians harness AI tools to improve human health.
