Mohit Iyyer - University of Massachusetts Amherst. Amherst, MA, US

Mohit Iyyer

Associate Professor of Computer Science | University of Massachusetts Amherst



Expertise (4)

Artificial Intelligence

ChatGPT

Machine Learning

Natural Language Processing


Mohit Iyyer's main research interest is in designing deep neural networks for traditional natural language processing tasks, such as translation, and new problems that involve understanding creative language. Recently, he has been called on to discuss the impact of artificial intelligence tools like ChatGPT on our lives.



Education (2)

University of Maryland, College Park: Ph.D., Computer Science

Washington University in Saint Louis: B.S., Computer Science

Select Recent Media Coverage (5)

Colleges work on plans to adjust to new world of AI, ChatGPT

Western Mass News  tv


In a television news segment, Mohit Iyyer discusses how schools address the use of ChatGPT in education. He explains that he has been teaching a class about large language models like ChatGPT since 2018. “I give a take-home exam for students to use whatever they want: the Internet, books, this year ChatGPT. It was actually very difficult to write questions that ChatGPT cannot answer and get a decent amount of partial credit on so I had to write 10 or 15 questions, but just five I could put on the exam that were tricky,” says Iyyer.


How does Bing's chatbot work and should we be concerned about it?

GBH  radio


This new technology is trained to predict the next word when given a large amount of text, sometimes leading to strange outputs, Mohit Iyyer, assistant professor of computer science at UMass Amherst, explained on Greater Boston.


How will artificial intelligence affect the workplace?

Mass Appeal  online


Several months ago, the computer program ChatGPT made news because of its ability to create text and images solely generated by a computer after a brief request by a human. Today, fears of artificial intelligence are gripping workers worldwide with the thought that computers will be replacing humans. Dr. Mohit Iyyer, from the UMass Manning College of Information & Computer Sciences, is here to talk about the growth of AI in the workplace.


Months after ChatGPT’s noisy debut, colleges take differing approaches to dealing with AI

The Boston Globe  print


UMass Amherst is seeking the middle ground. The college recently amended its academic conduct policy to say that “unless the instructor expressly allows the usage of one of these AIs, it’s prohibited,” according to Mohit Iyyer, an assistant professor of computer science. “But it’s not possible to enforce this.” “From my colleagues’ perspective, it’s very ad hoc,” Iyyer said. “Some people are just going about their classes as normal. Others are expressly banning it, or allowing it for certain assignments and not for others, or allowing it for everything.”


Getting Answers: ChatGPT’s possible impact on education and industries

Western Mass News  online


ChatGPT is an artificial intelligence language model created by OpenAI. Since it burst onto the scene in November, people have been using the chatbot to complete homework assignments, write code for websites, and even compose poetry. “Write a poem with 12 lines that rhymes every line. It can do that. The poem might not be the best. It also might not be 12 lines long,” said UMass Amherst Assistant Computer Science Professor Mohit Iyyer.


Select Publications (3)

A critical evaluation of evaluations for long-form question answering



Long-form question answering (LFQA) enables answering a wide range of questions, but its flexibility poses enormous challenges for evaluation. We perform the first targeted study of the evaluation of long-form answers, covering both human and automatic evaluation practices. We hire domain experts in seven areas to provide preference judgments over pairs of answers, along with free-form justifications for their choices. We present a careful analysis of experts' evaluation, which focuses on new aspects such as the comprehensiveness of the answer. Next, we examine automatic text generation metrics, finding that no existing metrics are predictive of human preference judgments. However, some metrics correlate with fine-grained aspects of answers (e.g., coherence).


On the Risks of Stealing the Decoding Algorithms of Language Models



A key component of generating text from modern language models (LM) is the selection and tuning of decoding algorithms. These algorithms determine how to generate text from the internal probability distribution generated by the LM. The process of choosing a decoding algorithm and tuning its hyperparameters takes significant time, manual effort, and computation, and it also requires extensive human evaluation. Therefore, the identity and hyperparameters of such decoding algorithms are considered to be extremely valuable to their owners. In this work, we show, for the first time, that an adversary with typical API access to an LM can steal the type and hyperparameters of its decoding algorithms at very low monetary costs.


LongEval: Guidelines for human evaluation of faithfulness in long-form summarization



While human evaluation remains best practice for accurately judging the faithfulness of automatically-generated summaries, few solutions exist to address the increased difficulty and workload when evaluating long-form summaries. Through a survey of 162 papers on long-form summarization, we first shed light on current human evaluation practices surrounding long-form summaries. We find that 73% of these papers do not perform any human evaluation on model-generated summaries, while other works face new difficulties that manifest when dealing with long documents (e.g., low inter-annotator agreement). Motivated by our survey, we present LongEval, a set of guidelines for human evaluation of faithfulness in long-form summaries that addresses the following challenges: [...]
