Bridget McInnes, Ph.D.

Associate Professor and Graduate Program Director

Engineering East Hall, Room E4255, Richmond, VA, United States
btmcinnes@vcu.edu

Dr. McInnes' research is in the area of Natural Language Processing (NLP) with a particular interest in semantics.

Biography

Dr. McInnes' research has primarily been in the area of Natural Language Processing (NLP) with a particular interest in semantics, the process of analyzing the meaning of text. Specific areas of interest include:

- Word sense disambiguation
- Biomedical text processing
- Semantic similarity and relatedness
- Information extraction
- Literature-based discovery

Industry Expertise

Education/Learning
Research

Areas of Expertise

Natural Language Processing
Biomedical Text Processing
Information Retrieval
Machine Learning

Education

University of Minnesota

Ph.D.

Computer Science

2009

University of Minnesota

M.S.

Computer Science

2004

University of Minnesota

B.S.

Computer Science

2002

Selected Articles

U-path: An undirected path-based measure of semantic similarity

AMIA Annual Symposium Proceedings Archive

2014

In this paper, we present the results of a method using undirected paths to determine the degree of semantic similarity between two concepts in a dense taxonomy with multiple inheritance. The overall objective of this work was to explore methods that take advantage of dense multi-hierarchical taxonomies that are more graph-like than tree-like by incorporating the proximity of concepts with respect to each other within the entire is-a hierarchy. Our hypothesis is that the proximity of the concepts, regardless of how they are connected, is an indicator of the degree of their similarity. We evaluate our method using the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) and four reference standards that have been manually tagged by human annotators. The overall results of our experiments show that, in SNOMED CT, the location of the concepts with respect to each other does indicate the degree to which they are similar.
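The idea of measuring proximity along undirected paths can be illustrated with a minimal sketch. It is not the U-path measure from the paper: it assumes, purely for illustration, that similarity is the reciprocal of one plus the shortest undirected path length. The taxonomy, the concept names, and the functions `undirected_neighbors`, `shortest_undirected_path`, and `path_similarity` are all hypothetical and are not taken from SNOMED CT.

```python
from collections import deque

# Hypothetical toy is-a taxonomy (child -> parents) with multiple inheritance.
# These concepts are invented for illustration; they are not SNOMED CT content.
IS_A = {
    "viral pneumonia": ["pneumonia", "viral infection"],
    "bacterial pneumonia": ["pneumonia", "bacterial infection"],
    "pneumonia": ["lung disease"],
    "viral infection": ["infection"],
    "bacterial infection": ["infection"],
    "lung disease": ["disease"],
    "infection": ["disease"],
    "disease": [],
}


def undirected_neighbors(is_a):
    """Turn child -> parents links into an undirected adjacency map."""
    graph = {concept: set() for concept in is_a}
    for child, parents in is_a.items():
        for parent in parents:
            graph[child].add(parent)
            graph.setdefault(parent, set()).add(child)
    return graph


def shortest_undirected_path(graph, source, target):
    """Breadth-first search; returns the edge count, or None if unreachable."""
    if source not in graph or target not in graph:
        return None
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def path_similarity(graph, a, b):
    """Assumed scoring: 1 / (1 + shortest undirected path length)."""
    dist = shortest_undirected_path(graph, a, b)
    return None if dist is None else 1.0 / (1 + dist)


graph = undirected_neighbors(IS_A)
print(path_similarity(graph, "viral pneumonia", "bacterial pneumonia"))
print(path_similarity(graph, "viral pneumonia", "disease"))
```

Because edges are traversed in both directions, two sibling concepts that share a parent end up closer to each other than either is to the root, regardless of which is-a branch connects them.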

Determining the Difficulty of Word Sense Disambiguation

Journal of Biomedical Informatics

2014

Automatic processing of biomedical documents is made difficult by the fact that many of the terms they contain are ambiguous. Word Sense Disambiguation (WSD) systems attempt to resolve these ambiguities and identify the correct meaning. However, the published literature on WSD systems for biomedical documents reports considerable differences in performance for different terms. The development of WSD systems is often expensive with respect to acquiring the necessary training data. It would therefore be useful to be able to predict in advance which terms WSD systems are likely to perform well or badly on.

This paper explores various methods for estimating the performance of WSD systems on a wide range of ambiguous biomedical terms (including ambiguous words/phrases and abbreviations). The methods include both supervised and unsupervised approaches. The supervised approaches make use of information from labeled training data while the unsupervised ones rely on the UMLS Metathesaurus. The approaches are evaluated by comparing their predictions about how difficult disambiguation will be for ambiguous terms against the output of two WSD systems. We find the supervised methods are the best predictors of WSD difficulty, but are limited by their dependence on labeled training data. The unsupervised methods all perform well in some situations and can be applied more widely.

Evaluating Measures of Semantic Similarity and Relatedness to Disambiguate Terms in Biomedical Text

Journal of Biomedical Informatics

2013

In this article, we evaluate a knowledge-based word sense disambiguation method that determines the intended concept associated with an ambiguous word in biomedical text using semantic similarity and relatedness measures. These measures quantify the degree of similarity or relatedness between concepts in the Unified Medical Language System (UMLS). The objective of this work is to develop a method that can disambiguate terms in biomedical text by exploiting similarity and relatedness information extracted from biomedical resources and to evaluate the efficacy of these measures for WSD.
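A knowledge-based approach of this general kind can be sketched as follows. This is an illustrative assumption, not the article's method: each candidate concept is scored by summing its similarity to the concepts found in the surrounding context, and the highest-scoring candidate is chosen. The function names, the concepts, and the similarity values in `TOY_SIMILARITY` are invented for the example and do not come from the UMLS.

```python
# Made-up pairwise scores for illustration only; a real system would compute
# these with a similarity or relatedness measure over a knowledge source.
TOY_SIMILARITY = {
    frozenset({"common cold", "cough"}): 0.8,
    frozenset({"common cold", "fever"}): 0.7,
    frozenset({"cold temperature", "cough"}): 0.1,
    frozenset({"cold temperature", "fever"}): 0.2,
}


def toy_similarity(concept_a, concept_b):
    """Look up a symmetric score; unknown pairs default to 0.0."""
    return TOY_SIMILARITY.get(frozenset({concept_a, concept_b}), 0.0)


def disambiguate(candidates, context_concepts, similarity):
    """Return the candidate with the highest summed similarity to the context."""
    scores = {
        candidate: sum(similarity(candidate, ctx) for ctx in context_concepts)
        for candidate in candidates
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical ambiguous term "cold" appearing near "cough" and "fever".
best, scores = disambiguate(
    candidates=["common cold", "cold temperature"],
    context_concepts=["cough", "fever"],
    similarity=toy_similarity,
)
print(best)
```

Passing the similarity function as a parameter keeps the selection step independent of the measure, so different similarity or relatedness measures could be compared on the same ambiguous terms.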
