Expert Insight: The Voice of Alexa: How Speech Characteristics Impact Consumer Decisions
May 28, 2023 | 5 min read
In the 2020 film “Superintelligence,” an all-powerful artificial intelligence attempts to take over the world, and it studies an average person, played by Melissa McCarthy, to decide if humanity is worth saving. The AI is voiced by James Corden—a voice it chooses because it knows it’s one McCarthy’s character will engage with.
Rajiv Garg, associate professor of Information Systems & Operations Management at Emory’s Goizueta Business School, shows the “Superintelligence” trailer before his research presentations to set the tone. Working with Haris Krijestorac, a professor at HEC Paris, and Vijay Mahajan, a professor at The University of Texas at Austin, Garg studies how artificial intelligence voices affect consumer behavior and purchase intent.
Garg’s research began when Amazon launched celebrity voices for its Alexa device in 2019. From Samuel L. Jackson to Shaquille O’Neal, users can now get their news and entertainment while interacting with their favorite superstars.
“I questioned if certain voices could get more engagement or more purchases from consumers,” Garg says.
If Alexa starts talking to you in Samuel L. Jackson’s voice, will you continue the conversation? What could Samuel L. Jackson’s voice sell you that you would buy?
Garg and his team began their research by collecting more than 300 celebrity voice samples, which they analyzed based on their sound characteristics, such as amplitude, frequency, and entropy. They looked at 20 sound characteristics and identified that all the voices could be segmented into six clusters: ostentatious, colloquial, friendly, authoritative, seductive, and suave.
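The article names amplitude, frequency, and entropy among the 20 characteristics but does not say how they were measured or which clustering method produced the six groups. As a rough illustration only, the sketch below computes one plausible version of each descriptor on a synthetic tone. The function names, the zero-crossing frequency estimate, and the 16-bin histogram entropy are all assumptions made for this example, not the researchers' method.

```python
import math
from collections import Counter


def rms_amplitude(samples):
    """Root-mean-square amplitude: a rough measure of loudness."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def zero_crossing_frequency(samples, sample_rate):
    """Estimate the dominant frequency in Hz by counting sign changes."""
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    duration = len(samples) / sample_rate
    # A pure tone crosses zero twice per cycle.
    return crossings / (2 * duration)


def amplitude_entropy(samples, bins=16):
    """Shannon entropy (bits) of the histogram of sample values."""
    lo, hi = min(samples), max(samples)
    width = (hi - lo) / bins or 1.0  # avoid dividing by zero on flat signals
    counts = Counter(min(int((s - lo) / width), bins - 1) for s in samples)
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def synthetic_tone(freq_hz, amplitude, sample_rate=8000, seconds=1.0):
    """Hypothetical stand-in for a voice sample: a plain sine wave."""
    n = int(sample_rate * seconds)
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate)
        for i in range(n)
    ]


tone = synthetic_tone(200, 0.5)
features = (
    rms_amplitude(tone),
    zero_crossing_frequency(tone, 8000),
    amplitude_entropy(tone),
)
print(features)
```

One feature vector like this per voice sample, extended to all 20 characteristics, is the kind of input a clustering algorithm would need in order to group the voices.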
The team then created advertisements for select products using computer-generated voices for each of the six clusters, opting for artificial intelligence-created speech instead of celebrity deepfakes to avoid the legal issues of obtaining permission.
They chose a shoe and an office chair as their products, and created two different advertisements for each product. One ad was simple, describing the shoe as comfortable for all-day wear and the office chair as comfortable for extended periods of sitting. The other ad was hedonic, describing the shoe as crafted with Italian leather and the office chair as equipped with several massage features. They recorded the four advertisements using both a female and a male voice for all six voice clusters.
Study participants listened to each of the four advertisements in one of the 12 voices, which was randomly selected. After the advertisement was played, participants were asked if they wanted more information, and later, if they wanted to buy the product (the price was omitted so as not to add another factor to their decision making).
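The design described above (two products, two ad styles, six voice clusters, two voice genders) can be laid out as a small sketch. The article leaves it unclear whether a participant heard the same voice for all four ads or a new random voice for each one. This example assumes one voice per participant, and the field names and seed are hypothetical.

```python
import itertools
import random

CLUSTERS = [
    "ostentatious", "colloquial", "friendly",
    "authoritative", "seductive", "suave",
]
GENDERS = ["female", "male"]
PRODUCTS = ["shoe", "office chair"]
AD_TYPES = ["utilitarian", "hedonic"]

# 6 clusters x 2 genders = 12 voices; 2 products x 2 ad styles = 4 ads.
VOICES = list(itertools.product(CLUSTERS, GENDERS))
ADS = list(itertools.product(PRODUCTS, AD_TYPES))


def build_session(participant_id, rng):
    """Assign one randomly chosen voice (an assumption) to all four ads."""
    cluster, gender = rng.choice(VOICES)
    return [
        {
            "participant": participant_id,
            "product": product,
            "ad_type": ad_type,
            "voice_cluster": cluster,
            "voice_gender": gender,
        }
        for product, ad_type in ADS
    ]


rng = random.Random(0)  # hypothetical seed, only for reproducibility
session = build_session("P001", rng)
for trial in session:
    print(trial)
```

Each row corresponds to one ad exposure, after which the two outcomes in the study (wanting more information, wanting to buy) would be recorded.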
Influencing Consumer Behavior
For simple, utilitarian products, they found no significant effect of voice on information-seeking behavior. Garg says once participants hear this type of advertisement, they simply decide to purchase or move on.
Participants did, however, engage more in information-seeking behavior for hedonic products when the voice was ostentatious, seductive, or authoritative. The team also found men were more likely than women to engage with ostentatious or seductive voices, and women were more likely to engage with friendly or colloquial voices.
Overall, they found participants did not seek more information when the advertisement was delivered in a male voice.
For information seeking, men and women only engage if the voices are female, which is somewhat intuitive. The industry is doing this—Alexa, Google, and Siri all have a female voice.
In terms of purchase intention, they found ostentatious voices produce higher purchase intent for utilitarian products. Men, especially, were more likely than women to purchase a utilitarian product advertised in an ostentatious voice.
Think about advertising a stapler. It’s a stapler—it staples paper—but you advertise it in a French accent to make it sound interesting.
Conversely, for hedonic products, an ostentatious voice has a negative effect on purchase intent because, Garg says, it can make the product sound gimmicky. Their research shows colloquial voices perform best here because people focus more on the advertisement’s content.
Across the board, they found seductive voices have a negative effect on purchase intent, but more so on utilitarian products compared to hedonic ones. Men were more likely than women to respond positively to seductive and suave voices.
Applying the Results
Voices are another way smart device companies can personalize their customers’ experiences. Garg says these companies should be aware that there may be a certain voice that will garner the best engagement.
Their findings are not limited to business and may apply to other industries, such as the media. Garg says, for example, if publications intend to increase reader curiosity and engagement, they should use a female colloquial voice on “click to listen” features.
Although not yet tested, Garg says he wouldn’t be surprised if their results extend to real-world settings with real human voices as well.
During their research, Garg’s team asked participants if they had heard the advertisement voices before, and about 15 percent of respondents said they had.
“These were voices we’d created for the first time,” Garg says. “If they say they’ve heard the voice before, that means they were thinking of them as human voices. Although we didn’t study it that way, I do believe what we’re seeing will be relevant for actual human beings’ voices and interactions.”
Having researched this for years, Garg says every time he listens to a voice, whether a customer service representative or podcast host, he questions whether or not it is impacting his behavior.
“A lot of times when I’m making a decision, I know that I’m making that decision passively because of the voice,” Garg says. “I’m acting 50 percent based on the rational information in the voice, but the other 50 percent I just want to listen more. There is an inherent desire for a certain voice.”
Garg says his favorite part of the research is those “aha moments,” whether they concern the influence of voice in his own life or in the industry, such as large companies using female voices in their products to draw engagement.
He says he hopes to continue doing this kind of research to help startups and other companies perform better, as AI-powered voices continue to change the way people interact with technology and consume information.
“We’re finding these interesting phenomena that can help create new products that are more effective,” Garg says. “I am trying to increase the economic surplus, in some ways to improve society, and this technology presents numerous opportunities.”