Expertise (6)
AI
Human-Robot Interaction
Robotic Manipulation
AI Safety
Reinforcement Learning
Imitation Learning
Biography
Scott Niekum directs the Personal Autonomous Robotics Lab (PeARL), which works to enable personal robots to be deployed in the home and workplace with minimal intervention by robotics experts.
His work draws roughly equally from machine learning and robotics, including topics such as imitation learning, reinforcement learning, safety, manipulation, and human-robot interaction. Specifically, he is interested in addressing the following questions: How can human demonstrations and interactions be used to bootstrap the learning process? How can robots autonomously improve their understanding of the world through embodied interaction? And how can robots learn from heterogeneous, noisy interactions and still provide strong probabilistic guarantees of correctness and safety?
Niekum is also among the scientists who have publicly warned about the risks of rapidly advancing AI and called for limits on its development.
Education (3)
University of Massachusetts Amherst: Ph.D., Computer Science
University of Massachusetts Amherst: M.S., Computer Science
Carnegie Mellon University: B.S., Computer Science
Select Recent Media Coverage (6)
Will artificial intelligence erode our rights?
BBC online
2024-01-26
Scott Niekum discusses the European Union’s AI Act, legislation aiming to maximize the benefits of using AI while protecting our individual rights. "With the emergence of impressive seeming technologies like ChatGPT, I’m really hoping that these tools eventually will blossom into something that can help accelerate scientific discoveries,” he says.
Six Months Ago Elon Musk Called for a Pause on AI. Instead Development Sped Up
Wired online
2023-09-29
Scott Niekum comments in an article on how AI development has continued to accelerate despite an open letter, signed by prominent technology figures, calling for a pause on advanced AI development. “Many AI skeptics want to hear a concrete doom scenario. To me, the fact that it is difficult to imagine detailed, concrete scenarios is kind of the point—it shows how hard it is for even world-class AI experts to predict the future of AI and how it will impact a complex world. I think that should raise some alarms,” he says.
Sept. 11th: Art and science on the Farms
New England Public Media radio
2023-09-15
Scott Niekum, a professor in the UMass Amherst Manning College of Information and Computer Sciences, discusses AI on “The Fabulous 413” radio program and podcast. “A lot about AI right now gives me pause,” Niekum says. “I’m not worried about rogue AIs with crazy, new personalities coming out of nowhere, but what I do worry about is that AI is moving much, much faster than virtually anybody predicted."
Experts issue a dire warning about AI and encourage limits be imposed
NPR online
2023-06-01
Scott Niekum, an associate professor who heads the Safe, Confident, and Aligned Learning + Robotics (SCALAR) lab at the University of Massachusetts Amherst, tells NPR's Leila Fadel on Morning Edition that AI has progressed so quickly that its threats, from near-term harms to minority populations to longer-term catastrophic outcomes, remain difficult to gauge. "We really need to be ready to deal with those problems," Niekum said.
Google is poisoning its reputation with AI researchers
The Verge online
2021-04-13
“Not only does it make me deeply question the commitment to ethics and diversity inside the company,” Scott Niekum, an assistant professor at the University of Texas at Austin who works on robotics and machine learning, told The Verge. “But it worries me that they’ve shown a willingness to suppress science that doesn’t align with their business interests.”
The Departure of 2 Google AI Researchers Spurs More Fallout
WIRED online
2021-03-16
Another invitee to the event, Scott Niekum, director of a robotics lab at the University of Texas at Austin, came to a similar decision. “Google has shown an astounding lack of leadership and commitment to open science, ethics, and diversity in their treatment of the Ethical AI team, specifically Drs. Gebru and Mitchell,” he wrote in his own email to the workshop’s organizers, asking them to pass his decision and comments up to Google’s leadership.
Select Publications (5)
SOPE: Spectrum of off-policy estimators
Advances in Neural Information Processing Systems, 2021. Many sequential decision making problems are high-stakes and require off-policy evaluation (OPE) of a new policy using historical data collected under some other policy. One of the most common OPE techniques that provides unbiased estimates is trajectory-based importance sampling (IS). However, due to the high variance of trajectory IS estimates, importance sampling methods based on state-action visitation distributions (SIS) have recently been adopted. Unfortunately, while SIS often provides lower variance estimates for long horizons, estimating the state-action distribution ratios can be challenging and lead to biased estimates.
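As a rough illustration of the two estimator families contrasted above (notation assumed here, not taken from the paper): with evaluation policy \(\pi_e\), behavior policy \(\pi_b\), horizon H, and n logged trajectories, trajectory-wise IS reweights whole-trajectory returns, while SIS reweights individual rewards by estimated state-action visitation ratios. In a stylized undiscounted, finite-horizon form,

\[ \hat{V}_{\mathrm{IS}} = \frac{1}{n}\sum_{i=1}^{n}\Big(\prod_{t=0}^{H-1}\frac{\pi_e(A^i_t \mid S^i_t)}{\pi_b(A^i_t \mid S^i_t)}\Big)\sum_{t=0}^{H-1} R^i_t, \qquad \hat{V}_{\mathrm{SIS}} = \frac{1}{n}\sum_{i=1}^{n}\sum_{t=0}^{H-1} \hat{w}(S^i_t, A^i_t)\, R^i_t, \]

where \(\hat{w}(s,a) \approx d_{\pi_e}(s,a)/d_{\pi_b}(s,a)\) is an estimated ratio of state-action visitation distributions. The product of per-step ratios is what drives the variance of \(\hat{V}_{\mathrm{IS}}\) up with horizon length; the need to estimate \(\hat{w}\) from data is what introduces bias into \(\hat{V}_{\mathrm{SIS}}\).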
Adversarial intrinsic motivation for reinforcement learning
Advances in Neural Information Processing Systems, 2021. Learning with an objective to minimize the mismatch with a reference distribution has been shown to be useful for generative modeling and imitation learning. In this paper, we investigate whether one such objective, the Wasserstein-1 distance between a policy's state visitation distribution and a target distribution, can be utilized effectively for reinforcement learning (RL) tasks. Specifically, this paper focuses on goal-conditioned reinforcement learning where the idealized (unachievable) target distribution has full measure at the goal.
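One way to see why this objective is natural for goal-reaching (a sketch under assumed notation, not the paper's derivation): if \(\rho_\pi\) denotes the policy's state visitation distribution and the target is a point mass \(\delta_g\) at goal g, then under a ground metric d the optimal-transport cost collapses to an expected distance,

\[ W_1(\rho_\pi, \delta_g) = \mathbb{E}_{s \sim \rho_\pi}\big[\, d(s, g) \,\big], \]

because every unit of visitation mass must be transported to g. Minimizing this quantity therefore favors policies whose visited states are, on average, close to the goal under the chosen metric.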
Universal off-policy evaluation
Advances in Neural Information Processing Systems, 2021. When faced with sequential decision-making problems, it is often useful to be able to predict what would happen if decisions were made using a new policy. Those predictions must often be based on data collected under some previously used decision-making rule. Many previous methods enable such off-policy (or counterfactual) estimation of the expected value of a performance measure called the return. In this paper, we take the first steps towards a 'universal off-policy estimator' (UnO), one that provides off-policy estimates and high-confidence bounds for any parameter of the return distribution.
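The core idea above, estimating parameters of the return distribution beyond its mean from off-policy data, can be illustrated with a far simpler (and far weaker) sketch than UnO itself. The toy code below is an illustrative assumption, not the paper's estimator: it forms self-normalized per-trajectory importance weights, treats them as a weighted empirical distribution over observed returns, and reads off a mean and a quantile; unlike UnO, it provides no high-confidence bounds.

    import numpy as np

    def off_policy_return_stats(trajectories, pi_e, pi_b, quantile=0.5):
        # trajectories: list of trajectories, each a list of (s, a, r) tuples logged under pi_b.
        # pi_e, pi_b: functions (s, a) -> action probability under the evaluation/behavior policy.
        returns, weights = [], []
        for traj in trajectories:
            rho, G = 1.0, 0.0
            for s, a, r in traj:
                rho *= pi_e(s, a) / pi_b(s, a)   # per-trajectory importance weight
                G += r                           # undiscounted return, for simplicity
            returns.append(G)
            weights.append(rho)
        returns, weights = np.array(returns), np.array(weights)
        probs = weights / weights.sum()          # self-normalized weights

        mean = float(np.dot(probs, returns))     # importance-weighted mean return under pi_e
        order = np.argsort(returns)              # weighted empirical CDF over returns
        cdf = np.cumsum(probs[order])
        idx = min(int(np.searchsorted(cdf, quantile)), len(returns) - 1)
        q = float(returns[order][idx])           # estimated quantile of the return distribution
        return mean, q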
Understanding the relationship between interactions and outcomes in human-in-the-loop machine learning
International Joint Conference on Artificial Intelligence, 2021. Human-in-the-loop Machine Learning (HIL-ML) is a widely adopted paradigm for instilling human knowledge in autonomous agents. Many design choices influence the efficiency and effectiveness of such interactive learning processes, particularly the interaction type through which the human teacher may provide feedback. While different interaction types (demonstrations, preferences, etc.) have been proposed and evaluated in the HIL-ML literature, there has been little discussion of how these compare or how they should be selected to best address a particular learning problem.
Importance sampling in reinforcement learning with an estimated behavior policy
Machine Learning, 2021. In reinforcement learning, importance sampling is a widely used method for evaluating an expectation under the distribution of data of one policy when the data has in fact been generated by a different policy. Importance sampling requires computing the likelihood ratio between the action probabilities of a target policy and those of the data-producing behavior policy. In this article, we study importance sampling where the behavior policy action probabilities are replaced by their maximum likelihood estimates under the observed data. We show that this general technique reduces variance due to sampling error in Monte Carlo-style estimators.
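A minimal sketch of this idea for a discrete state-action space (hypothetical helper names, not code from the paper): the behavior probabilities in the denominator of the likelihood ratio are replaced by count-based maximum likelihood estimates fit to the same logged data.

    from collections import Counter

    def estimate_behavior_policy(transitions):
        # MLE of the behavior policy from logged (s, a, r) tuples: pi_hat(a | s) = count(s, a) / count(s).
        sa_counts = Counter((s, a) for s, a, _ in transitions)
        s_counts = Counter(s for s, _, _ in transitions)
        return lambda s, a: sa_counts[(s, a)] / s_counts[s]

    def is_with_estimated_behavior(trajectories, pi_e, pi_b_hat):
        # Ordinary per-trajectory importance sampling, but with the estimated behavior
        # policy pi_b_hat in place of the true behavior probabilities.
        total = 0.0
        for traj in trajectories:
            rho, G = 1.0, 0.0
            for s, a, r in traj:
                rho *= pi_e(s, a) / pi_b_hat(s, a)
                G += r
            total += rho * G
        return total / len(trajectories)

    # Usage sketch: fit pi_b_hat on the logged transitions, then evaluate pi_e with it.
    # transitions = [(s, a, r) for traj in trajectories for (s, a, r) in traj]
    # pi_b_hat = estimate_behavior_policy(transitions)
    # value_estimate = is_with_estimated_behavior(trajectories, pi_e, pi_b_hat)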