Machine learning and artificial intelligence are increasingly part of daily life. These emerging technologies can improve processes and make them more accurate. That said, they can also exacerbate existing inequities by embedding the unconscious biases of human designers and by relying on data generated in inequitable systems.

Machine learning is the process by which information is introduced to a computer to identify patterns using algorithms. The patterns recognized are called models, which describe the information methodically, allowing predictions about the world beyond initial data. In circumstances where the product is artificial intelligence, the computer can continuously learn as it performs tasks based on initial models.

For example, if the objective is to learn about people’s purchasing preferences, providing data on purchase histories to the computer can allow it to identify patterns in similar purchases via grouping algorithms. It can create models of preferences and ultimately recommend similar items to individuals the next time they search for merchandise.
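The purchasing example above can be sketched in a few lines of Python. This is a minimal illustration, not a production recommender: the customer names and items are invented, and simple co-purchase counting stands in for the "grouping algorithms" mentioned in the text.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical purchase histories (invented for illustration):
# customer -> set of items bought.
purchase_histories = {
    "ana": {"tent", "sleeping bag", "camp stove"},
    "ben": {"tent", "sleeping bag", "headlamp"},
    "cy": {"novel", "bookmark"},
    "dee": {"tent", "camp stove"},
}


def build_co_purchase_model(histories):
    """Count how often each pair of items appears in the same history."""
    model = defaultdict(Counter)
    for items in histories.values():
        for a, b in combinations(sorted(items), 2):
            model[a][b] += 1
            model[b][a] += 1
    return model


def recommend(model, item, k=2):
    """Return up to k items most often bought alongside `item`."""
    return [other for other, _ in model.get(item, Counter()).most_common(k)]


model = build_co_purchase_model(purchase_histories)
print(recommend(model, "tent"))
```

Even in this toy version, the "model" is nothing more than a summary of who is in the data. A customer whose purchases are rare or absent from the histories receives weak recommendations or none at all.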

Even in this oversimplification, two sources of algorithmic bias emerge: the original data and the algorithm, both created by humans. So what impacts how fair or biased the technology is?


1. Limitations of the evidence base

The first phase of limiting bias in the technology is understanding the context in which you operate. Algorithms often include assumptions derived from research and literature. There is an inherent assumption that science is definitive, objective and unbiased; however, science is a process. Building a solid evidence base requires multiple studies, conducted by different researchers on various groups representative of the population, that show similar results over time. Unfortunately, diversity, equity and fairness have historically not been prioritized, and only a limited part of the evidence base is truly generalizable to everyone. That is why it is crucial to critically evaluate the existing literature for comprehensiveness, fairness and applicability before applying any assumptions.

For example, pharmaceutical clinical trials have mainly recruited adult white males; however, the results are generalized to the entire population. Algorithms built upon these results often do not accurately represent underrepresented groups and can lead to unintentionally biased technologies.
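One simple way to evaluate a study or data set for applicability is to compare each group's share of the sample with its share of the population. The sketch below assumes hypothetical group labels and invented numbers; they are not taken from any real trial or census.

```python
# Hypothetical numbers for illustration only; not from any real trial.
population_share = {"group_a": 0.60, "group_b": 0.25, "group_c": 0.15}
trial_participants = {"group_a": 850, "group_b": 100, "group_c": 50}


def representation_ratios(sample_counts, reference_shares):
    """Ratio of each group's sample share to its population share.

    A ratio near 1.0 means proportional representation; a ratio well
    below 1.0 means the group is underrepresented in the sample.
    """
    total = sum(sample_counts.values())
    return {
        group: (sample_counts.get(group, 0) / total) / share
        for group, share in reference_shares.items()
    }


ratios = representation_ratios(trial_participants, population_share)
for group, ratio in ratios.items():
    print(f"{group}: {ratio:.2f}")
```

A check like this does not prove that results fail to generalize, but a low ratio is a prompt to look for additional evidence before building assumptions into an algorithm.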


2. Limitations of the data sets and algorithms

In the second phase, the limitations of 1) the data from which the computer will learn and 2) the algorithms being used should be considered. While these may seem like detailed questions meant for data scientists, the impact of these decisions on the result is immense; therefore, business segments need to at least be aware of the methods and be able to communicate the results with their limitations in mind.

For example, credit scores are determined through models that attempt to capture financially risky behaviours and poor habits. However, marginalized groups have historically been offered predatory products that lead to the snowballing of debt, which is then reflected in the data. This is one of many examples in which models amplify the systemic bias in the data and continue to disadvantage one population while advantaging another, especially through a scoring system that is used so frequently and ubiquitously.
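A first step toward communicating such limitations is to compare a model's outcomes across groups. The sketch below assumes a hypothetical list of approve/decline decisions with invented group labels; it illustrates one basic check (approval rate per group and the ratio between the lowest and highest rate), not a complete fairness audit.

```python
from collections import defaultdict

# Hypothetical model decisions: (group label, approved?) pairs.
decisions = [
    ("group_a", True), ("group_a", True), ("group_a", True), ("group_a", False),
    ("group_b", True), ("group_b", False), ("group_b", False), ("group_b", False),
]


def approval_rates(records):
    """Share of approved applications within each group."""
    approved = defaultdict(int)
    total = defaultdict(int)
    for group, ok in records:
        total[group] += 1
        approved[group] += int(ok)
    return {group: approved[group] / total[group] for group in total}


def rate_ratio(rates):
    """Lowest approval rate divided by the highest (1.0 means parity).

    Assumes at least one group has a non-zero approval rate.
    """
    return min(rates.values()) / max(rates.values())


rates = approval_rates(decisions)
print(rates, round(rate_ratio(rates), 2))
```

A gap between groups does not by itself show that the model is at fault, since the gap may originate in the historical data, which is exactly why the result needs to be interpreted with the context in mind.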

Understanding the context and limitations, and tempering the interpretation of the results accordingly, is an important step that should involve the business unit and company leaders.


3. Impacts of new technology on marginalized communities

The negative impact of tech and algorithmic biases may not be intentional; however, the solution to health inequities should be. Therefore, in the third phase, development and business segments must be diverse and multidisciplinary, and must include subject matter experts and community stakeholders to contextualize the results and think critically about possible outcomes. Additionally, it is crucial to assess post-market, real-world evidence after the technologies have been in use for some time, and to be transparent and accountable for those outcomes.

For example, there are race-based algorithms for medical decision-making. Ideally, such algorithms would lessen disparities, although historically this has not been the case. For instance, there are models that estimate the kidneys’ filtration rate to determine whether patients require specialized treatment or qualify for transplants. Previously, these models would predict higher filtration rates for Black patients (i.e., they would appear less ill).

Unfortunately, the resulting policies led to an increased rate of disease progression and delayed referrals for transplantation for Black individuals. One study estimates that if the models were still used, approximately 68,000 Black adults would not be referred to specialist care, and an additional 16,000 would be ineligible for the transplant waiting list (population estimates were calculated based on the Diao et al. publication and the American Community Survey ACSDT5Y2020 dataset). This outcome is particularly undesirable considering that the incidence of End-Stage Kidney Disease is three times higher in Black individuals than in white individuals.
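To make the mechanism concrete, the sketch below shows how a multiplicative race coefficient can move the same patient across an eligibility cutoff. The text does not name the equation involved; as an assumption, the sketch uses the four-variable MDRD Study equation, one widely published formula that multiplied the estimate by 1.212 for Black patients. The patient values and the cutoff of 20 mL/min/1.73 m² are chosen for illustration only.

```python
def egfr_mdrd(serum_creatinine_mg_dl, age_years, female, apply_race_coefficient):
    """Estimated GFR (mL/min/1.73 m^2) using the 4-variable MDRD equation.

    `apply_race_coefficient` toggles the 1.212 multiplier that the
    equation applied to Black patients.
    """
    egfr = 175.0 * serum_creatinine_mg_dl ** -1.154 * age_years ** -0.203
    if female:
        egfr *= 0.742
    if apply_race_coefficient:
        egfr *= 1.212
    return egfr


# Assumed cutoff, for illustration only.
WAITLIST_CUTOFF = 20.0

# Hypothetical patient: 55-year-old male, serum creatinine 3.3 mg/dL.
with_coefficient = egfr_mdrd(3.3, 55, female=False, apply_race_coefficient=True)
without_coefficient = egfr_mdrd(3.3, 55, female=False, apply_race_coefficient=False)

for label, value in [("with", with_coefficient), ("without", without_coefficient)]:
    eligible = value <= WAITLIST_CUTOFF
    print(f"{label} race coefficient: eGFR {value:.1f}, eligible: {eligible}")
```

With identical lab values, the estimate falls below the assumed cutoff only when the coefficient is omitted, which is how a single multiplier can delay referral or eligibility for the patients it is applied to.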

Steps could have been taken earlier to prevent inequitable care from worsening after these clinical practices were implemented, but that is not the overarching lesson. The point is that the algorithm’s application and potential impact were neither considered thoroughly nor contextualized within the epidemiology and inequity of kidney failure.


Beyond algorithmic biases

As data science becomes a prevalent tool, organizations should ensure that these technologies don’t simply automate existing biases. By working through these three key considerations, building multidisciplinary teams and staying accountable after the technology is in use, organizations can help mitigate algorithmic bias, prevent the exacerbation of existing inequities and help create a more equitable environment.

Most importantly, data science is here to stay across all industries. Therefore, to maximize its use for transformative outcomes and avoid beginner missteps, all companies should invest in a better understanding of data science at every level.