
Eric Nyberg

Professor | Carnegie Mellon University

Pittsburgh, PA, UNITED STATES

Eric Nyberg builds software applications that can understand and process human language.

Biography

Noted for his contributions to the fields of automatic text translation, information retrieval, and automatic question answering, Eric Nyberg builds software applications that can understand and process human language. For the past decade, he has worked on question-answering technology, often in collaboration with colleagues at IBM. Since 2007, he and his CMU colleagues have participated in the Open Advancement of Question Answering, a collaboration with IBM that led to the development of Watson, a question-answering computer system that defeated human opponents in nationally televised matches of Jeopardy!. He currently directs the Master of Computational Data Science (MCDS) program. He is also co-founder and chief data scientist at Cognistx and serves on the Scientific Advisory Board for Fairhair.ai.

Areas of Expertise (6)

Automatic Text Translation

Processing Human Language

Automated Question Answering

IBM Watson

Information Retrieval

Artificial Intelligence

Media Appearances (3)

Meltwater acquires Algo, an AI-based news and data tracker

TechCrunch  online

2017-08-29

Michelsen is also not the only notable name in Algo’s pedigree: the company’s tech was partly developed by Eric Nyberg, a natural language pioneer and veteran who played a big role in the development of Watson at IBM, and is now the lead of the Language Technology Institute at Carnegie Mellon University. Nyberg is an advisor to Algo.


Super-Smart Retail, Coming Soon To A Device Near You

Forbes  online

2015-08-04

One of the founders is Eric Nyberg, PhD, a professor in the Language Technologies Institute at Carnegie Mellon University. Eric directs the Master's Program in Computational Data Science. He was very involved in CMU's partnership with IBM in the development of Watson™ that ultimately triumphed over human competitors in the Jeopardy! Challenge.


IBM readies Watson for post-Jeopardy life

CNN Money  online

2011-02-14

Watson didn't come cheap. IBM (IBM, Fortune 500) won't disclose how much it has invested in the project, but Eric Nyberg, a Carnegie Mellon computer science professor who has worked on Watson, estimates that the project cost IBM up to $100 million.


Media

Videos:

Watson Takes On Jeopardy! The Jeopardy! Challenge

Industry Expertise (3)

Computer Networking

Computer Hardware

Computer Software

Accomplishments (1)

Allen Newell Award for Research Excellence (professional)


Education (2)

Carnegie Mellon University: Ph.D., Computational Linguistics

Boston University: B.A.

Patents (3)

Integrated authoring and translation system

US6658627

2003 The present invention is a system of integrated, computer-based processes for monolingual information development and multilingual translation. An interactive text editor enforces lexical and grammatical constraints on a natural language subset used by the authors to create their text, which they help disambiguate to ensure translatability. The resulting translatable source language text undergoes machine translation into any one of a set of target languages, without the translated text requiring any postediting.

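The controlled-language idea behind this patent — restricting authors to an approved vocabulary and simple sentence forms so that the resulting text can be machine-translated without post-editing — can be illustrated with a toy checker. This is an illustrative sketch only; the lexicon and length limit here are invented, not taken from the patent:

```python
# Toy controlled-language check: flag out-of-lexicon words and overlong
# sentences, the way an interactive authoring editor might prompt an author.
APPROVED = {"the", "pump", "valve", "is", "open", "closed", "check", "before", "use"}

def check_sentence(sentence, lexicon=APPROVED, max_words=12):
    words = [w.strip(".,!?").lower() for w in sentence.split()]
    issues = [f"unapproved word: {w}" for w in words if w not in lexicon]
    if len(words) > max_words:
        issues.append(f"sentence exceeds {max_words} words")
    return issues

print(check_sentence("Check the valve before use."))   # [] -- conforms
print(check_sentence("Ascertain the valve is open."))  # flags "ascertain"
```

In a real controlled-language workflow the editor would also ask the author to disambiguate flagged constructions interactively, which is the step that makes the downstream translation deterministic.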

Integrated authoring and translation system

US5677835

1997 The present invention is a system of integrated, computer-based processes for monolingual information development and multilingual translation. An interactive text editor enforces lexical and grammatical constraints on a natural language subset used by the authors to create their text, which they help disambiguate to ensure translatability. The resulting translatable source language text undergoes machine translation into any one of a set of target languages, without the translated text requiring any postediting.


Natural language processing system and method for parsing a plurality of input symbol sequences into syntactically or pragmatically correct word messages

US5299125

1994 A Natural Language Processing System utilizes a symbol parsing layer in combination with an intelligent word parsing layer to produce a syntactically or pragmatically correct output sentence or other word message. Initially, a plurality of polysemic symbol sequences are input through a keyboard segmented into a plurality of semantic, syntactic, or pragmatic segments including agent, action and patient segments, for example. One polysemic symbol sequence, including a plurality of polysemic symbols, is input from each of the three segments of the keyboard.


Articles (5)

Difference-Masking: Choosing What to Mask in Continued Pretraining

arXiv preprint

2023 Self-supervised learning (SSL) and the objective of masking-and-predicting in particular have led to promising SSL performance on a variety of downstream tasks. However, while most approaches randomly mask tokens, there is strong intuition from the field of education that deciding what to mask can substantially improve learning outcomes. We introduce Difference-Masking, an approach that automatically chooses what to mask during continued pretraining by considering what makes an unlabelled target domain different from the pretraining domain. Empirically, we find that Difference-Masking outperforms baselines on continued pretraining settings across four diverse language and multimodal video tasks. The cross-task applicability of Difference-Masking supports the effectiveness of our framework for SSL pretraining in language, vision, and other domains.

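The core idea — score tokens by how characteristic they are of the target domain relative to the pretraining domain, then mask the highest-scoring tokens instead of masking at random — can be sketched roughly as follows. The smoothed frequency-ratio scoring below is a simplification for illustration, not the paper's actual scoring function:

```python
from collections import Counter

def difference_scores(target_docs, general_docs, smoothing=1.0):
    """Score each token by how much more frequent it is in the target-domain
    corpus than in the general (pretraining) corpus. Higher = more
    characteristic of the target domain."""
    tgt = Counter(tok for doc in target_docs for tok in doc.split())
    gen = Counter(tok for doc in general_docs for tok in doc.split())
    tgt_total, gen_total = sum(tgt.values()), sum(gen.values())
    return {
        tok: ((tgt[tok] + smoothing) / (tgt_total + smoothing)) /
             ((gen[tok] + smoothing) / (gen_total + smoothing))
        for tok in tgt
    }

def choose_masks(sentence, scores, mask_rate=0.25):
    """Mask the top-scoring tokens instead of a random subset."""
    tokens = sentence.split()
    k = max(1, int(len(tokens) * mask_rate))
    ranked = sorted(range(len(tokens)),
                    key=lambda i: scores.get(tokens[i], 0.0),
                    reverse=True)
    to_mask = set(ranked[:k])
    return [("[MASK]" if i in to_mask else t) for i, t in enumerate(tokens)]

target = ["the biopsy showed a malignant tumor", "tumor margins were clear"]
general = ["the cat sat on the mat", "the weather was clear today"]
scores = difference_scores(target, general)
masked = choose_masks("the tumor was malignant", scores, mask_rate=0.5)
# masked == ["the", "[MASK]", "was", "[MASK]"]
```

Domain-specific words like "tumor" score high and get masked; function words shared with the general corpus are left alone, so the model's prediction effort concentrates on what makes the target domain different.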

Chain-of-Skills: A Configurable Model for Open-domain Question Answering

arXiv preprint

2023 The retrieval model is an indispensable component for real-world knowledge-intensive tasks, e.g., open-domain question answering (ODQA). As separate retrieval skills are annotated for different datasets, recent work focuses on customized methods, limiting the model transferability and scalability. In this work, we propose a modular retriever where individual modules correspond to key skills that can be reused across datasets. Our approach supports flexible skill configurations based on the target domain to boost performance. To mitigate task interference, we design a novel modularization parameterization inspired by sparse Transformer. We demonstrate that our model can benefit from self-supervised pretraining on Wikipedia and fine-tuning using multiple ODQA datasets, both in a multi-task fashion. Our approach outperforms recent self-supervised retrievers in zero-shot evaluations and achieves state-of-the-art fine-tuned retrieval performance on NQ, HotpotQA and OTT-QA.

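The configurable-skills idea — a registry of reusable retrieval modules composed into a pipeline per target dataset — can be caricatured as interchangeable stages over a shared state. This is purely illustrative: the paper's skills are learned Transformer components with a sparse modular parameterization, not string functions:

```python
# Illustrative skill registry: each "skill" transforms a retrieval state.
CORPUS = [
    "pittsburgh is in pennsylvania",
    "watson played jeopardy",
    "jeopardy is a quiz show",
]

SKILLS = {
    "retrieve": lambda s: {**s, "candidates": [d for d in CORPUS if s["query"] in d]},
    "rerank":   lambda s: {**s, "candidates": sorted(s["candidates"], key=len)},
}

def make_retriever(skill_names, skills=SKILLS):
    """Compose a retriever from a per-dataset skill configuration."""
    def run(query):
        state = {"query": query, "candidates": []}
        for name in skill_names:
            state = skills[name](state)
        return state["candidates"]
    return run

# Different datasets get different skill configurations from the same modules.
odqa_retriever = make_retriever(["retrieve", "rerank"])
results = odqa_retriever("jeopardy")
```

The point of the design is that each module is trained once and reused: a new dataset changes only the configuration list, not the modules themselves.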

Using Implicit Feedback to Improve Question Generation

arXiv preprint

2023 Question Generation (QG) is a task of Natural Language Processing (NLP) that aims at automatically generating questions from text. Many applications can benefit from automatically generated questions, but these often need to be curated, either by selecting or editing them. This curation is informative in its own right, but it is typically done post-generation, so the effort is wasted. In addition, most existing systems cannot easily incorporate this feedback. In this work, we present a system, GEN, that learns from such (implicit) feedback. Following a pattern-based approach, it takes as input a small set of sentence/question pairs and creates patterns which are then applied to new unseen sentences. Each generated question, after being corrected by the user, is used as a new seed in the next iteration, so more patterns are created each time. We also take advantage of the corrections made by the user to score the patterns and thereby rank the generated questions. Results show that GEN improves by learning from both levels of implicit feedback when compared to the version with no learning, considering the top 5, 10, and 20 questions. Improvements range upward from 10%, depending on the metric and strategy used.

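The feedback loop the abstract describes — scoring patterns by how heavily users edit the questions they produce, then ranking new questions by their pattern's score — might be sketched like this. It is illustrative only; GEN's actual pattern machinery and scoring are more involved:

```python
import difflib

class PatternScorer:
    """Track how much users edit questions produced by each pattern;
    patterns whose questions need fewer corrections rank higher."""
    def __init__(self):
        self.scores = {}  # pattern id -> (running avg similarity, count)

    def record(self, pattern_id, generated, corrected):
        # Similarity of 1.0 means the user accepted the question as-is.
        sim = difflib.SequenceMatcher(None, generated, corrected).ratio()
        old, n = self.scores.get(pattern_id, (0.0, 0))
        self.scores[pattern_id] = ((old * n + sim) / (n + 1), n + 1)

    def rank(self, candidates):
        """Order (pattern_id, question) pairs by their pattern's score."""
        return sorted(candidates,
                      key=lambda c: self.scores.get(c[0], (0.0, 0))[0],
                      reverse=True)

scorer = PatternScorer()
scorer.record("p1", "What is NLP?", "What is NLP?")  # accepted unchanged
scorer.record("p2", "What NLP is?", "What is NLP?")  # needed editing
ranked = scorer.rank([("p2", "What QG is?"), ("p1", "What is QG?")])
# questions from pattern "p1" are now ranked first
```

The corrected question would also be fed back as a new seed pair, so the pattern set grows each iteration — that is the second level of implicit feedback the abstract mentions.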

Knowledge-driven scene priors for semantic audio-visual embodied navigation

arXiv preprint

2022 Generalisation to unseen contexts remains a challenge for embodied navigation agents. In the context of semantic audio-visual navigation (SAVi) tasks, the notion of generalisation should include both generalising to unseen indoor visual scenes as well as generalising to unheard sounding objects. However, previous SAVi task definitions do not include evaluation conditions on truly novel sounding objects, resorting instead to evaluating agents on unheard sound clips of known objects; meanwhile, previous SAVi methods do not include explicit mechanisms for incorporating domain knowledge about object and region semantics. These weaknesses limit the development and assessment of models' abilities to generalise their learned experience. In this work, we introduce the use of knowledge-driven scene priors in the semantic audio-visual embodied navigation task: we combine semantic information from our novel knowledge graph that encodes object-region relations, spatial knowledge from dual Graph Encoder Networks, and background knowledge from a series of pre-training tasks -- all within a reinforcement learning framework for audio-visual navigation. We also define a new audio-visual navigation sub-task, where agents are evaluated on novel sounding objects, as opposed to unheard clips of known objects. We show improvements over strong baselines in generalisation to unseen regions and novel sounding objects, within the Habitat-Matterport3D simulation environment, under the SoundSpaces task.


Distribution-aware Goal Prediction and Conformant Model-based Planning for Safe Autonomous Driving

arXiv preprint

2022 The feasibility of collecting a large amount of expert demonstrations has inspired growing research interest in learning-to-drive settings, where models learn by imitating the driving behaviour of experts. However, relying exclusively on imitation can limit agents' generalisability to novel scenarios outside the support of the training data. In this paper, we address this challenge by factorising the driving task, based on the intuition that modular architectures are more generalisable and more robust to changes in the environment than monolithic, end-to-end frameworks. Specifically, we draw inspiration from the trajectory forecasting community and reformulate the learning-to-drive task as obstacle-aware perception and grounding, distribution-aware goal prediction, and model-based planning. First, we train the obstacle-aware perception module to extract a salient representation of the visual context. Then, we learn a multi-modal goal distribution by performing conditional density estimation using a normalising flow. Finally, we ground candidate trajectory predictions in road geometry and plan actions based on vehicle dynamics. Under the CARLA simulator, we report state-of-the-art results on the CARNOVEL benchmark.
