Expert Perspective: Mitigating Bias in AI: Sharing the Burden of Bias When it Counts Most

May 16, 2025

Gareth James




Whether getting directions from Google Maps, personalized job recommendations from LinkedIn, or nudges from a bank for new products based on our data-rich profiles, we have grown accustomed to having artificial intelligence (AI) systems in our lives.


But are AI systems fair? The short answer: not completely. Further complicating the matter is the fact that today’s AI systems are far from transparent.


Think about it: The uncomfortable truth is that generative AI tools like ChatGPT—based on sophisticated architectures such as deep learning or large language models—are fed vast amounts of training data which then interact in unpredictable ways. And while the principles of how these methods operate are well-understood (at least by those who created them), ChatGPT’s decisions are likened to an airplane’s black box: They are not easy to penetrate.


So, how can we determine if “black box AI” is fair? Some dedicated data scientists are working around the clock to tackle this big issue.


One of those data scientists is Gareth James, who is also Dean of Goizueta Business School. In a recent paper titled “A Burden Shared is a Burden Halved: A Fairness-Adjusted Approach to Classification,” Dean James and coauthors Bradley Rava, Wenguang Sun, and Xin Tong propose a new framework to help ensure AI decision-making is as fair as possible in high-stakes settings where certain individuals (for example, members of racial minority groups and other protected groups) may be more exposed to AI bias, even without our realizing it.


In other words, their approach adjusts AI decisions so that the burden of AI errors does not fall disproportionately on any one group.



Gareth James became the John H. Harland Dean of Goizueta Business School in July 2022. Renowned for his visionary leadership, statistical mastery, and commitment to the future of business education, James brings vast and versatile experience to the role. His collaborative nature and data-driven scholarship offer fresh energy and focus aimed at furthering Goizueta’s mission: to prepare principled leaders to have a positive influence on business and society.




Unpacking Bias in High-Stakes Scenarios


Dean James and his coauthors set their sights on high-stakes decisions in their work. What counts as high stakes? Examples include hospitals’ medical diagnoses, banks’ credit-worthiness assessments, and state justice systems’ bail and sentencing decisions. On the one hand, these areas are ripe for AI interventions, with ample data available. On the other hand, biased decision-making here can significantly harm a person’s life.


In the United States justice system, a data-driven decision-support tool known as COMPAS (which stands for Correctional Offender Management Profiling for Alternative Sanctions) is in active use. The idea behind COMPAS is to crunch available data (including age, sex, and criminal history) to help estimate a criminal-court defendant’s likelihood of committing a crime while awaiting trial. Supporters of COMPAS note that statistical predictions are helping courts make better decisions about bail than humans did on their own. At the same time, detractors have argued that COMPAS is better at predicting recidivism for some racial groups than for others. And since we can’t control which group we belong to, that bias needs to be corrected. It’s high time for guardrails.


A Step Toward Fairer AI Decisions

Enter Dean James and colleagues’ algorithm, which they call “fairness-adjusted selective inference” (FASI). It is designed to make the outputs of AI decisions fairer without requiring any knowledge of the AI model’s inner workings. It flags specific decisions that would be better handled by a human being in order to avoid systemic bias. That is to say, if the AI cannot yield an acceptably clear binary (1 or 0) answer, a human review is recommended.
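The idea of sending unclear cases to a human can be sketched in a few lines of Python. This is a simplified illustration, not the authors’ FASI procedure: the function name `triage`, the 0-to-1 score scale, and the cut-offs 0.2 and 0.8 are all assumptions made up for the example.

```python
def triage(score, low=0.2, high=0.8):
    """Turn a risk score in [0, 1] into a decision or a referral.

    Hypothetical cut-offs: scores at or above `high` are classified 1,
    scores at or below `low` are classified 0, and everything in
    between is sent to a human reviewer.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie between 0 and 1")
    if score >= high:
        return 1
    if score <= low:
        return 0
    return "human review"


# Three made-up scores: one clearly low, one unclear, one clearly high.
decisions = [triage(s) for s in (0.05, 0.45, 0.93)]
```

Widening the gap between the two cut-offs sends more cases to people; narrowing it lets the model decide more often.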


To test FASI, the researchers turned to both simulated and real data. For the real data, the COMPAS dataset enabled a look at predicted and actual recidivism rates for two minority groups, as seen in the chart below.



In the figures above, the researchers set an “acceptable level of mistakes” – seen as the dotted line – at 0.25 (25%). They then compared “minority group 1” and “minority group 2” results before and after applying their FASI framework. Especially if you were born into “minority group 2,” which graph seems fairer to you?


Professional ethicists will note there is a slight dip in overall accuracy, as seen in the green “all groups” category. And yet the treatment of the two groups is fairer. That is why the researchers titled their paper “A Burden Shared is a Burden Halved.”
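To make the “acceptable level of mistakes” concrete, here is a rough Python sketch of checking the error rate group by group against the 0.25 level. It is not the algorithm from the paper, which the article does not spell out. The scores, labels, group names, and the helper functions `error_rate_among_selected` and `pick_cutoff` are all invented for illustration.

```python
def error_rate_among_selected(scores, labels, cutoff):
    """Share of cases flagged as high risk (score >= cutoff) whose true label is 0."""
    selected = [y for s, y in zip(scores, labels) if s >= cutoff]
    if not selected:
        return 0.0
    return sum(1 for y in selected if y == 0) / len(selected)


def pick_cutoff(scores, labels, alpha=0.25):
    """Lowest observed score usable as a cut-off while keeping the
    error rate among flagged cases at or below alpha (None if no cut-off works)."""
    best = None
    for cutoff in sorted(set(scores), reverse=True):
        if error_rate_among_selected(scores, labels, cutoff) <= alpha:
            best = cutoff
    return best


# Made-up held-out data: the same scores, but different true outcomes per group.
data = {
    "group_1": ([0.9, 0.8, 0.7, 0.6, 0.4], [1, 1, 1, 0, 0]),
    "group_2": ([0.9, 0.8, 0.7, 0.6, 0.4], [1, 0, 1, 0, 0]),
}

# One shared cut-off of 0.6 gives unequal error rates across the two groups.
shared = {g: error_rate_among_selected(s, y, 0.6) for g, (s, y) in data.items()}

# Choosing a cut-off per group keeps each group's error rate within 0.25.
cutoffs = {g: pick_cutoff(s, y) for g, (s, y) in data.items()}
```

In this toy data the shared cut-off puts group 1 exactly at the 0.25 level but leaves group 2 at 0.5. The per-group cut-offs bring both groups within the target, at the price of automating fewer group 2 decisions; the rest would go to a human reviewer.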


Practical Applications for the Greater Social Good


“To be honest, I was surprised by how well our framework worked without sacrificing much overall accuracy,” Dean James notes. By selecting cases where human beings should review a criminal history – or credit history or medical charts – AI discrimination that would have significant quality-of-life consequences can be reduced.


Reducing protected groups’ burden of bias is also a matter of following the law. For example, in the financial industry, the United States’ Equal Credit Opportunity Act (ECOA) makes it “illegal for a company to use a biased algorithm that results in credit discrimination on the basis of race, color, religion, national origin, sex, marital status, age, or because a person receives public assistance,” as the Federal Trade Commission explains on its website. If AI-powered programs fail to correct for AI bias, the companies using them can run into trouble with the law. In these cases, human reviews are well worth the extra effort for all stakeholders.


The paper grew from Dean James’ ongoing work as a data scientist when time allows. “Many of us data scientists are worried about bias in AI and we’re trying to improve the output,” he notes. And as new versions of ChatGPT continue to roll out, “new guardrails are being added – some better than others.”


“I’m optimistic about AI,” Dean James says. “And one thing that makes me optimistic is the fact that AI will learn and learn – there’s no going back. In education, we think a lot about formal training and lifelong learning. But then that learning journey has to end. With AI, it never ends.”


Gareth James is the John H. Harland Dean of Goizueta Business School. To connect with him, simply click on his icon to arrange an interview.


