By Stéphanie

Bias in Artificial Intelligence


Nowadays, interest in Artificial Intelligence (AI) is expanding thanks to the benefits that AI tools and applications bring to individuals and businesses. Nevertheless, AI raises many concerns, such as its relation to humans and humanity, threats to jobs, and biased judgement. Indeed, biases in AI are anomalies in the output of algorithms and/or in the data used to train them. These variations and aberrations are influenced by the assumptions made during the algorithms’ development[1]. In the following, we will detail the origin of bias in AI, introduce the types of AI bias, and explain how to mitigate AI biases and design algorithms ethically.


Algorithm working principle and origin of bias

An algorithm is a set of rules and instructions designed to obtain a result. There are many types of algorithms, depending on their structure and their objective. Among them, Machine Learning has seen considerable development over the past years. This kind of algorithm allows a machine to learn from real examples, for classification and prediction purposes. Its working principle is based on the following task: given input data X, the algorithm should recognize the category Y associated with the object X with minimal risk of error[2]. Although such algorithms were already described in the seventies, it is easy to understand why they have recently experienced a boom: they need a large amount of accessible input data, which the development of the Web and Big Data has made possible. This boom opens the door to a new set of opportunities but also to many questions: if the reliability of the result is linked to the quality of the input data, what happens when the input data are biased?
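To make this learning task concrete, here is a minimal sketch in Python using scikit-learn; the dataset and the choice of model are purely illustrative stand-ins for the “real examples” mentioned above, not a prescription.

# Minimal sketch of the task: given input data X, predict the category Y
# with minimal error risk. The Iris dataset stands in for "real examples".
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)          # X: input data, y: categories
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000)  # one of many possible algorithms
model.fit(X_train, y_train)                # learn the mapping X -> Y from examples

# The risk of error is estimated on data the model has never seen.
print(f"Test accuracy: {model.score(X_test, y_test):.2f}")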


Types of AI bias

Cognitive bias


Algorithms are designed by human beings, who by essence have their own vision of life and their own partiality. This can affect the inherent operation of the algorithm. One of the most controversial examples is the algorithm designed by Wang and Kosinski[3], which claims to detect the sexual orientation of people based on facial recognition, posing an obvious ethical issue.

Another well-known example comes from word embeddings[4]: in search engines, the word “woman” can be automatically associated with “nurse” or “secretary”, while the word “man” is associated with “chief” or “captain”.
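Such associations can be probed directly in pretrained embeddings. The sketch below uses gensim and a small public GloVe model (downloaded on first use); the exact similarity values depend on the embedding chosen and are shown only to illustrate the idea.

# Probe gendered associations in pretrained word embeddings.
import gensim.downloader as api

model = api.load("glove-wiki-gigaword-50")  # small public GloVe vectors

# Compare how strongly each profession is associated with "woman" vs "man".
for word in ["nurse", "secretary", "captain", "chief"]:
    print(f"{word:10s}"
          f" woman: {model.similarity('woman', word):.3f}"
          f" man: {model.similarity('man', word):.3f}")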


Statistical bias

This bias arises from the fact that if the input data of the algorithm are incorrect, the results will automatically be biased (GIGO principle: Garbage In, Garbage Out). One striking example is the one from Amazon[5], which tried to develop an algorithm to facilitate hiring procedures by automatically grading CVs from 1 to 5. In this case, the input data were the hundreds of millions of CVs of employees collected over the past 10 years. Unfortunately, among these CVs, men were largely overrepresented, especially for executive positions. As a result, the algorithm tended to attribute lower grades to women, whose CVs were unfairly and automatically eliminated.
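The mechanism can be reproduced on synthetic data. The toy simulation below (a hypothetical illustration, not Amazon’s actual system) trains a model on historical decisions that penalized women and shows that the model learns to do the same.

# Toy simulation of GIGO: a model trained on biased historical decisions
# reproduces the bias. All data here are synthetic and hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 10_000
gender = rng.integers(0, 2, n)       # 0 = man, 1 = woman (synthetic)
skill = rng.normal(0, 1, n)          # the feature that should matter

# Historical labels: hiring depended on skill, but women were penalized.
hired = (skill - 1.0 * gender + rng.normal(0, 0.5, n)) > 0

model = LogisticRegression()
model.fit(np.column_stack([skill, gender]), hired)

# Two equally skilled candidates receive different scores because of gender.
for g, name in [(0, "man"), (1, "woman")]:
    p = model.predict_proba([[1.0, g]])[0, 1]
    print(f"Equally skilled {name}: predicted hiring probability {p:.2f}")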


How to mitigate AI bias

Although this is a complex issue, there are some ways to mitigate AI bias.

  • Information complementation consists in completing the input data in order to mitigate the initial bias; this practice is rarely used, however, since it requires modeling that can itself induce other biases.

  • Data rectification consists in understanding why certain classes of persons are poorly represented in the database, modeling the probability that a person with given characteristics is present, and incorporating this probability into the algorithm.

  • Resampling consists in artificially recreating a balanced population, for example by over-sampling under-represented groups (see the sketch after this list).

  • Auxiliary information consists in explaining why a person is present or not in the database.

  • Time drift consists in describing long-term trends or seasonal effects, which can sometimes be overlooked when data collection occurs over a short period of time.
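As a minimal sketch of the resampling idea, the under-represented group can be over-sampled with replacement until the population is balanced; the data and the 90/10 imbalance below are synthetic assumptions for illustration.

# Sketch of resampling: rebalance a population by over-sampling the
# under-represented group. Data and the 90/10 imbalance are synthetic.
import numpy as np
from sklearn.utils import resample

rng = np.random.default_rng(0)
X = rng.normal(0, 1, (1000, 3))
group = np.where(rng.random(1000) < 0.9, "A", "B")  # group B is under-represented

X_b = X[group == "B"]
# Draw group-B rows with replacement until they match group A's count.
X_b_up = resample(X_b, replace=True, n_samples=int((group == "A").sum()),
                  random_state=0)

X_balanced = np.vstack([X[group == "A"], X_b_up])
print("Before:", (group == "A").sum(), "A vs", (group == "B").sum(), "B")
print("After: ", (group == "A").sum(), "A vs", len(X_b_up), "B")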

Design algorithms ethically

It has also been proposed that, rather than applying corrective actions afterwards, the algorithm should be designed in an ethical way in the first place[6]. Many recommendations have been elaborated for this purpose, such as the ones from McKinsey described below.


  1. Examine the data to ensure that they are representative and in sufficient quantity (see the sketch after this list).

  2. Establish a debiasing strategy including technical, operational, and organizational strategies.

  3. Improve human-driven processes by identifying biases in the creative process.

  4. Decide which use cases can be automated and which require human involvement.

  5. Include ethics and social sciences experts in AI projects.

  6. Encourage diversity in AI project teams: the people who first notice bias issues are often users belonging to the affected community.
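As an illustration of the first recommendation, a quick representativeness check can be run before any training; the column name “gender” and the 30% threshold below are arbitrary assumptions for the example.

# Sketch of a representativeness check before training. The column name
# "gender" and the 30% threshold are hypothetical choices for illustration.
import pandas as pd

df = pd.DataFrame({"gender": ["M", "M", "M", "F", "M", "F", "M", "M"]})

shares = df["gender"].value_counts(normalize=True)
print(shares)

# Flag any group that falls below the chosen representativeness threshold.
for group, share in shares.items():
    if share < 0.30:
        print(f"Warning: group {group!r} makes up only {share:.0%} of the data")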

Conclusion

AI has become an indispensable tool in many areas of daily and professional life. Nevertheless, it should not be blindly trusted: it is necessary to keep a critical eye on its results and to design algorithms in an ethical way that takes into account all possible types of bias.


Subscribe to the NETO Innovation webpage for more insights about deep tech and the related ethical aspects.


Follow us on social media and keep up to date with our latest news:


References


