Author: Sanaa Gada
Mentor: Dr. Hong Pan
Lynbrook High School
Abstract
In a world where artificial intelligence is beginning to shape critical life decisions, can we trust that the algorithms guiding these choices are unbiased? This paper investigates the implications of algorithmic bias in hiring processes, emphasizing the dual role of artificial intelligence (AI) as both a transformative tool for recruitment and a potential perpetuator of discrimination. It begins with a review of current hiring practices and then identifies key factors contributing to algorithmic bias, including data quality issues, algorithmic opacity, and the influence of proxy variables. Notable cases where biases have emerged, including Amazon’s recruitment algorithm, which favored male candidates due to biased training data, are carefully examined. The paper outlines various strategies for mitigating algorithmic bias, such as data augmentation, vector space correction, and blind hiring, while acknowledging their limitations. Furthermore, the research extends its analysis beyond hiring, exploring the manifestations of algorithmic bias in facial recognition technology, predictive policing, and healthcare, thus illustrating the broader societal implications. In conclusion, the paper advocates for creating strong frameworks and legislation to promote greater transparency and accountability in the use of algorithms, underscoring society’s moral obligation to ensure technology serves all communities equitably in an increasingly automated world.
Key Terms
Algorithmic bias: systematic and repeatable errors in a computer system that create unfair outcomes
Applicant Tracking System (ATS): a software system that helps organizations manage the hiring and recruiting process
Artificial Intelligence (AI): computer software systems that are capable of performing tasks traditionally associated with human intelligence
Proxy Variable: a variable that serves as a substitute for a variable of interest that cannot be measured directly
Target Variable: the outcome or feature a predictive model is trained to estimate or classify
1. Introduction
Today’s businesses increasingly focus on finding the right employees to maintain their competitive edge. To achieve this, many companies are turning to artificial intelligence (AI) embedded within applicant tracking systems (see Key Terms) to streamline hiring processes, enhance efficiency, and reduce workloads. However, while AI offers significant advantages, it can also unintentionally perpetuate discrimination in hiring. This occurs when biased algorithms or data lead to unfair treatment of certain candidates (see Figure 1). Resources like the Implicit Association Test, available at https://implicit.harvard.edu/implicit/takeatest.html, can be valuable tools to help individuals recognize and explore their biases. Understanding the causes of discrimination in hiring is crucial, and it is a shared responsibility to develop fairer and more inclusive employment practices.
This paper will first explore the current job hiring process, some of the key issues that lead to algorithmic bias in the status quo, and real-world examples of hiring bias. Possible approaches to mitigating these biases will also be evaluated, acknowledging current limitations but highlighting ongoing advancements and future potential. Finally, general applications of algorithmic decision-making across various fields will be explored, demonstrating the breadth of these tools and the significance of addressing biases to create a fairer, more inclusive future for all.
2. Navigating AI-Driven Hiring
2.1 Applicant Tracking Systems
Applicant tracking systems, otherwise known as ATS, have become increasingly common in hiring practices. Companies embed this software in their career sites to help recruiters filter candidates throughout the hiring process and improve applicant sourcing. When candidates submit their resumes to a job posting, an ATS screens them against qualifying questions and criteria that reflect the company’s standards (see Figure 2). AI plays a crucial role in this process, scanning resumes against specific parameters regarding skills, qualifications, and experience. It takes over the tedious human task of shortlisting candidate resumes, but the underlying algorithms may encode societal stereotypes found in the data they were trained on (Frissen et al., 2023).
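To make the screening step concrete, the following is a minimal, hypothetical sketch of keyword-based resume scoring; the field names, required keywords, and scoring rule are invented for illustration, and commercial ATS products use far richer parsing and ranking.

```python
# Minimal, hypothetical sketch of ATS-style keyword screening.
from dataclasses import dataclass

@dataclass
class Resume:
    candidate: str
    text: str

# Hypothetical job requirements the screener checks for.
REQUIRED_KEYWORDS = {"python", "sql", "machine learning"}

def keyword_score(resume: Resume) -> float:
    """Fraction of required keywords found in the resume text."""
    text = resume.text.lower()
    return sum(kw in text for kw in REQUIRED_KEYWORDS) / len(REQUIRED_KEYWORDS)

resumes = [
    Resume("Candidate A", "Built machine learning pipelines in Python and SQL."),
    Resume("Candidate B", "Managed marketing campaigns and social media outreach."),
]

# Rank candidates by keyword coverage; a real ATS combines many more signals.
for r in sorted(resumes, key=keyword_score, reverse=True):
    print(f"{r.candidate}: {keyword_score(r):.2f}")
```

Even a rule this simple shows how the choice of keywords determines who advances, which is precisely where encoded stereotypes can enter.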
2.2 Three Factors Contributing to Algorithmic Bias
Algorithmic bias is shaped by several key factors, beginning with data quality issues. These arise when the training data used to develop algorithms is biased, incomplete, or reflective of historical inequalities. For instance, if data is collected from an organization that has historically hired disproportionately more white employees than Black employees, the algorithm might associate good performance with being white. This does not mean that hiring only Black employees would fix the bias; rather, improving the diversity and balance of the training data is crucial to reducing it. Proxy variables (see Key Terms) can also embed systemic biases within algorithms even when direct indicators of discrimination, such as race and gender, are not explicitly included. For example, zip code can act as a proxy for race because it strongly correlates with neighborhood segregation (Fountain, 2022).
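As a hedged illustration of the proxy-variable effect, the sketch below trains a simple classifier on synthetic data in which the protected attribute is never provided as a feature, while a correlated zip-code variable and historically skewed labels are; the correlation strength, feature names, and model choice are all assumptions made for demonstration.

```python
# Hedged sketch: a model trained WITHOUT the protected attribute can still
# reproduce historical bias through a correlated proxy (here, zip code).
# All data below is synthetic and purely illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000

group = rng.integers(0, 2, size=n)                           # protected attribute (never shown to the model)
zip_code = np.where(rng.random(n) < 0.9, group, 1 - group)   # strongly correlated proxy
skill = rng.normal(size=n)                                    # legitimate signal

# Historical hiring labels that favored group 0 regardless of skill.
hired = (skill + 0.8 * (group == 0) + rng.normal(scale=0.5, size=n)) > 0.5

# The model sees only zip code and skill; the protected attribute is excluded.
X = np.column_stack([zip_code, skill])
model = LogisticRegression().fit(X, hired)

# The proxy lets the historical disparity back in.
pred = model.predict(X)
for g in (0, 1):
    print(f"group {g}: predicted hire rate = {pred[group == g].mean():.2f}")
```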
Another significant factor contributing to algorithmic bias is algorithmic opacity, the difficulty of understanding an algorithm’s inner workings because of its complexity. This lack of transparency makes it hard for human users to interpret an algorithm’s internal processes (Sadek et al., 2024). Many algorithms operate as “black boxes,” meaning their decision-making processes are not easily understandable or interpretable by those affected by their outcomes. If the logic behind decisions is unclear, it becomes challenging to identify or correct biases, ensure fairness, or hold creators accountable for discriminatory outcomes.
Algorithms also rely on correlating variables with a target variable (see Key Terms) to predict outcomes. For example, if a tech company were looking to hire a software developer proficient in a specific programming language such as Python, the target variable could be “proficiency in Python.” The recruitment algorithm would then categorize candidates into groups based on their coding skills, such as “expert in Python,” “basic Python knowledge,” and “no Python experience.” This allows the company to narrow the pool of candidates to those who match the technical expertise required for the job.
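A toy sketch of this bucketing step appears below; the thresholds, field names, and categories are hypothetical and stand in for whatever rule an actual recruitment algorithm learns or is configured with.

```python
# Hypothetical sketch of grouping candidates against a "proficiency in Python"
# target variable (thresholds are invented for illustration).
def python_bucket(years_of_python: float) -> str:
    if years_of_python >= 5:
        return "expert in Python"
    if years_of_python > 0:
        return "basic Python knowledge"
    return "no Python experience"

candidates = {"Candidate A": 7, "Candidate B": 1.5, "Candidate C": 0}
for name, years in candidates.items():
    print(f"{name}: {python_bucket(years)}")

# Narrow the pool to candidates with the required technical expertise.
shortlist = [n for n, y in candidates.items() if python_bucket(y) != "no Python experience"]
print("shortlist:", shortlist)
```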
However, a key problem with target variables is that how the outcome is defined influences the result. For example, suppose non-technical skills (soft skills) are considered important for an organization and are part of the target variable; in that case, women may gain an advantage compared to algorithms that do not consider such skills. Measures of employee performance are also based on subjective assessments by managers. Factors such as the employee-manager relationship, personal biases, or differing expectations within the team environment can skew the data feeding the algorithmic evaluation. As a result, algorithms trained on this data may unintentionally reflect biases present in the workplace, producing unfair outcomes in which high-performing individuals who fit the requirements are overlooked or undervalued.
2.3 Current Hiring Practices
AI has played a growing role in hiring, transforming how companies identify, evaluate, and recruit talent. Early AI-driven hiring systems made the hiring process more efficient but lacked sophisticated methodologies. For example, Resumix, founded in 1988, served as a resume parsing tool. ATS made their debut in the 1990s alongside job posting sites such as CareerBuilder. By the early 2000s, talent assessment tools like eSkill and SkillSurvey used AI to automate pre-employment testing and background and skill checks. The 2010s saw the rise of AI-powered video interviewing software, with platforms like HireVue utilizing machine learning algorithms. Natural language processing (NLP), a subfield of AI that uses machine learning to understand spoken and written human language, is used to analyze speech patterns, word choice, and language structure during video interviews to assess candidates. Sentiment analysis provides insights into candidates’ emotions and engagement, adding another layer to talent evaluation.
However, AI’s integration into hiring has raised concerns about bias. A Microsoft research study in 2019 highlighted significant biases that AI algorithms absorb from the data on which they are trained. Researchers found that language models like Word2Vec, a machine learning technique that uses NLP to obtain vector representations of words, could produce biased associations between specific demographic groups and stereotypical terms. For example, their investigation surfaced analogies such as “man is to woman as computer programmer is to homemaker” (Chiu, n.d.). These biases pose risks in applications like resume screening, where hidden associations could unintentionally favor or disadvantage certain groups.
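The kind of probe the researchers describe can be reproduced, in a hedged way, with off-the-shelf word vectors: the snippet below uses gensim’s pretrained Google News Word2Vec model and solves the analogy by vector arithmetic. The specific words returned depend on the model and its vocabulary, so the exact “homemaker” result reported in the literature is not guaranteed.

```python
# Hedged sketch: probing a pretrained Word2Vec model for biased analogies.
# Requires the gensim package and downloads a large (~1.6 GB) model.
import gensim.downloader as api

vectors = api.load("word2vec-google-news-300")  # pretrained Google News vectors

# Solve "man : programmer :: woman : ?" via vector arithmetic
# (programmer - man + woman), then list the nearest neighbors.
for word, score in vectors.most_similar(
        positive=["woman", "programmer"], negative=["man"], topn=5):
    print(f"{word:20s} {score:.3f}")
```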
In 2018, Amazon’s AI-driven recruiting algorithm was also found to be biased. When AI systems are trained on historical data, they often reflect the existing biases within that data; if a company’s past hiring practices were skewed, those biases can be unintentionally embedded in the AI’s decision-making process. When Amazon attempted to automate its recruitment process, it used an algorithm trained on the previous 10 years of resumes. Because that dataset consisted primarily of male applicants, the algorithm came to favor male language patterns, resulting in discrimination against female candidates (Dastin, 2022). Words such as “executed” and “captured” were commonly found on male engineers’ resumes and were favored by the system. The system also downgraded resumes that featured the word “women’s,” such as in “women’s chess club captain.” This example highlights the potential risks of relying on AI in hiring, as biased training data can perpetuate discrimination and undermine diversity efforts.
While Amazon’s algorithm faced scrutiny for discriminating against women, it also prompted broader questions about how personal data is utilized in hiring. Companies increasingly rely on AI algorithms to sift through resumes, assess candidates, and predict job performance (see Figure 3). This reliance raises crucial considerations regarding the transparency and accountability of data usage in recruitment processes. Globally, regulatory approaches to AI in hiring are evolving. The European Union has taken the lead in establishing AI policy with its AI Act, which aims to ensure that AI hiring systems follow strict privacy rules. It mandates transparency in AI decision-making and imposes requirements for assessing the impact of AI on employment outcomes (Sadek et al., 2024).
In the U.S., the rules for AI are more fragmented and vary from jurisdiction to jurisdiction. While there is general agreement that AI policies are needed, the question becomes who will make the rules. A commonly shared view is that frameworks must ensure that company technologies do not cause harm and that companies are held accountable for their impacts. Furthermore, policies need to advocate for greater transparency, including how AI systems work and the data they use. Actors in the AI space must adopt principles that promote responsible AI use, as articulated in the White House’s Blueprint for an AI Bill of Rights (The Three Challenges of AI Regulation, n.d.).
As AI technology advances, coherent regulatory frameworks will become increasingly important to ensure fairness and accountability in employment practices.
3. Solutions and Limitations of Mitigating Algorithmic Bias in Hiring
As algorithms play a more significant role in hiring, techniques like data augmentation, vector space correction, and blind hiring offer valuable ways to enhance fairness and inclusivity (see Figure 4). While each method brings its own limitations, they represent significant strides toward reducing bias in AI-driven recruitment.
3.1 Data Augmentation
Definition: Data augmentation involves using existing training data and modifying it to create new instances that enhance machine learning model training. This technique helps address the issue of insufficient data by artificially increasing the volume, quality, and diversity of training data (Mumuni & Mumuni, 2022).
Mechanism: Common data augmentation methods for images include rotating, flipping, or cropping. For text data or data used in hiring practices, techniques include synonym replacement, paraphrasing, and perturbing numeric values (see Figure 5; a toy sketch follows this subsection).
Advantages: Augmenting data increases the diversity of the training dataset, allowing models to generalize better to unseen data. Expanding a dataset also helps reduce overfitting, where a model memorizes the training data rather than learning its underlying patterns. Data augmentation allows a model to recognize patterns, increasing its ability to handle real-world variations.
Limitations: However, augmented data may not always reflect real-world scenarios, potentially leading to overfitting if the modifications are not representative. Extreme data augmentation can also introduce noisy, unimportant information that degrades a model’s quality (Walidamamou, 2023).
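The sketch below illustrates the simplest text-augmentation technique mentioned above, synonym replacement, using a hand-picked synonym map; real pipelines typically draw synonyms from lexical databases or paraphrase models, so treat this as a toy example.

```python
# Toy synonym-replacement augmentation for short text (illustrative only).
import random

SYNONYMS = {  # hypothetical, hand-picked synonym map
    "managed": ["led", "coordinated", "oversaw"],
    "built": ["developed", "created", "engineered"],
    "improved": ["enhanced", "optimized", "refined"],
}

def augment(sentence: str, rng: random.Random) -> str:
    """Return a variant with known words swapped for random synonyms."""
    return " ".join(rng.choice(SYNONYMS[w]) if w in SYNONYMS else w
                    for w in sentence.split())

rng = random.Random(42)
original = "built and improved internal tooling that managed deployments"
for _ in range(3):
    print(augment(original, rng))
```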
3.2 Vector Space Correction
Definition: The vector space is a mathematical framework in which data points, such as words and images, are represented as vectors in a multidimensional space. Vector space correction helps mitigate biases by equalizing the distance between the protected attributes (such as race or gender) and the biased concept (Albaroudi et al., 2024). See Figure 4 for a simplified visual demonstration.
Mechanism: The process involves adjusting the positions of vectors to reduce bias. For example, if a model’s vector space associates white people more strongly with desirable skills and qualifications, vector space correction repositions the vectors so that those same skills and qualifications are equally associated with Black people (a toy sketch follows this subsection).
Advantages: This technique helps create a more balanced representation of different groups, which can reduce the impact of biased data on model predictions.
Limitations: Vector space correction can cause semantic drift, where adjustments in the vector space may unintentionally change the meanings and relationships of the data points. This can lead to inaccurate predictions and misinterpretations of ideas, making it harder for algorithms to accurately reflect real-world scenarios. Another limitation of this approach is that biases related to more than one attribute are hard to correct because many factors need to be considered before rearranging the vector space.
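As a toy illustration of one correction step, the sketch below “neutralizes” a term by removing its component along a bias direction estimated from two group vectors, loosely in the spirit of hard-debiasing approaches; the three-dimensional vectors and term names are invented, and real embeddings have hundreds of dimensions and many terms to correct.

```python
# Toy sketch of a vector-space correction ("neutralizing") step.
# All vectors are invented 3-d examples; real embeddings are much larger.
import numpy as np

def normalize(v):
    return v / np.linalg.norm(v)

vec = {
    "group_a":   np.array([ 1.0, 0.2, 0.0]),
    "group_b":   np.array([-1.0, 0.2, 0.0]),
    "qualified": np.array([ 0.6, 0.8, 0.3]),  # starts out leaning toward group_a
}

# Bias direction: the axis separating the two protected-group vectors.
bias_dir = normalize(vec["group_a"] - vec["group_b"])

# Neutralize: subtract the term's projection onto the bias direction.
vec["qualified"] = vec["qualified"] - np.dot(vec["qualified"], bias_dir) * bias_dir

# After correction the term is equidistant from both groups.
for g in ("group_a", "group_b"):
    print(g, round(float(np.linalg.norm(vec[g] - vec["qualified"])), 3))
```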
3.3 Blind Hiring
Definition: Blind hiring is a recruitment strategy that aims to eliminate bias by removing personal information from decision-making systems. Personal information such as names, zip codes, and health records can sometimes be indicators of social class, gender, age, or racial background.
Mechanism: This technique evaluates candidates based solely on their skills and qualifications, reducing the chance that hiring decisions are unconsciously influenced by personal characteristics (a minimal sketch follows this subsection).
Advantages: Blind hiring promotes diversity by allowing candidates from varied backgrounds to compete on an equal footing. Ultimately, this practice can lead to a more inclusive workplace culture.
Limitations: While blind hiring practices aim to eliminate visible identifiers, such as names and genders, they do not fully address the underlying gender, racial, or social biases in the hiring process. Specific keywords can still influence perceptions and decisions because they carry implicit biases that favor one group over another, regardless of the removal of direct identifiers (see Figure 6). For example, keywords signaling stereotypically masculine traits, such as confidence and competitiveness, may be weighed differently from those signaling stereotypically feminine traits, such as warmth, supportiveness, and collaboration (Albaroudi et al., 2024).
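A minimal sketch of the redaction step is shown below; the field names are hypothetical, and as noted above, scrubbing structured fields does nothing about biased signals hiding in free-text resumes.

```python
# Minimal sketch of "blinding" a structured candidate record before review.
# Field names are hypothetical; free-text resumes are much harder to scrub.
REDACTED_FIELDS = {"name", "zip_code", "date_of_birth", "photo_url"}

def blind(candidate: dict) -> dict:
    """Return a copy of the record with identifying fields removed."""
    return {k: v for k, v in candidate.items() if k not in REDACTED_FIELDS}

candidate = {
    "name": "Jane Doe",
    "zip_code": "94087",
    "date_of_birth": "1998-04-02",
    "skills": ["python", "sql"],
    "years_experience": 4,
}
print(blind(candidate))  # only skills and experience remain
```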
4. Addressing Algorithmic Bias in Broader Decision-Making Systems
Beyond job hiring, cases of algorithmic bias appear in various fields, including facial recognition technologies, predictive policing, and healthcare (see Figure 8). The following sections will explore how these biases manifest in these areas and examine their implications.
4.1 Facial Recognition Technology
Facial recognition technologies (FRT) are used to identify faces in static or moving images. The accuracy of an FRT depends on the quality of the image it assesses and the design of the algorithm itself. FRTs are popular in authentication processes, police work, and medical diagnosis. However, many FRTs have been found to exhibit algorithmic bias, leading to disparities in accuracy based on race, gender, and other demographic factors. An FRT first captures the details of an image and determines whether it contains a human face. A person’s face is then broken down into key features, such as the distance between the eyes and the shape of the cheekbones. This information is translated into a faceprint, a unique representation of each individual, which is compared against images in a database to find a possible match (see Figure 7). Both false positives and false negatives are possible: a false positive incorrectly reports an image as a match, whereas a false negative fails to match a face that should match. Challenges such as the quality of available images, lighting, and facial expressions can affect the accuracy of FRTs.
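The matching step can be sketched, under heavy simplification, as comparing a query faceprint against enrolled faceprints with a similarity threshold; the random vectors below stand in for embeddings produced by a trained face-recognition network, and the threshold value is an assumption that trades false positives against false negatives.

```python
# Simplified sketch of faceprint matching with a similarity threshold.
# The embeddings are random placeholders for outputs of a real face model.
import numpy as np

rng = np.random.default_rng(1)

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

database = {f"person_{i}": rng.normal(size=128) for i in range(3)}  # enrolled faceprints
query = database["person_1"] + rng.normal(scale=0.1, size=128)      # noisy new capture

THRESHOLD = 0.8  # higher -> fewer false positives, more false negatives
best_id, best_sim = max(((pid, cosine(query, emb)) for pid, emb in database.items()),
                        key=lambda item: item[1])
print(best_id, round(best_sim, 3), "match" if best_sim >= THRESHOLD else "no match")
```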
Studies show that FRTs are often most accurate for lighter-skinned males but misidentify women and individuals with darker skin tones at significantly higher rates. The Amazon Rekognition system achieved higher accuracy for white and Black men than for white women (93%) and dark-skinned women (68.6%). The U.S. National Institute of Standards and Technology (NIST) also found higher rates of false positives for Asian and African American faces than for Caucasian faces using the FBI’s database of 1.6 million domestic mugshots (Fountain, 2022).
Further diversifying training datasets to include more racial and ethnic groups will help mitigate algorithmic bias and promote fairer outcomes. However, establishing and adhering to rigorous standards is also essential to improve the quality and accountability of this technology.
4.2 Predictive Policing
Predictive policing is a law enforcement technique that uses data and algorithms to predict where and when crimes will occur. The goal is to use this information to prevent crime, but it has emerged as a controversial approach to law enforcement. One of the early implementations, CompStat in New York City during the 1990s, employed visual tools like pin maps to display crime data by frequency and location (Fountain, 2022). However, while CompStat aimed to promote efficiency and accountability, it also contributed to problematic practices such as “stop and frisk,” the practice of stopping individuals for questioning, sometimes without reasonable suspicion. This practice disproportionately targeted racial minorities, and research has shown that it can cause lasting psychological harm.
The practice has raised significant concerns about algorithmic bias and its implications for marginalized communities. Some algorithms may produce biased results that lead to over-policing, with officers repeatedly deployed to the same neighborhoods on the basis of skewed historical data. Some municipal governments have implemented executive orders banning the use of predictive policing software. However, while bans are effective in the short run, they are not a substitute for legislative action. At the root, a larger focus on addressing algorithmic bias is crucial to ensure that predictive policing does not aggravate existing inequalities in the criminal justice system.
4.3 Healthcare Algorithms
In the healthcare industry, algorithms have also been shown to produce racial bias. A recent study found bias in an algorithm that generated individual-level medical risk scores affecting 200 million people. The algorithm identifies patients for “high-risk care management” but relies on healthcare costs as a proxy for illness, leading to biased outcomes: it assumes that those with higher medical costs are sicker. However, Black patients, despite having higher levels of illness than their White counterparts with the same risk score, tend to generate lower healthcare costs due to limited access to care and implicit biases in the care they receive. Additionally, Black patients’ spending tends to concentrate in emergency visits and dialysis rather than in more expensive inpatient surgeries and outpatient specialist care (Fountain, 2022).
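A hypothetical worked example of the proxy problem: two patients with the same illness burden but different access to care generate different costs, so a cost-based cutoff enrolls one and not the other. The numbers and threshold below are invented purely to illustrate the mechanism described in the study.

```python
# Hypothetical numbers illustrating cost-as-proxy bias in risk scoring.
patients = [
    # (id, chronic conditions, annual cost in USD)
    ("patient_A", 4, 12_000),  # good access to care -> higher recorded spending
    ("patient_B", 4,  7_000),  # same illness burden, less access -> lower spending
]

ENROLLMENT_CUTOFF = 10_000  # invented cost threshold standing in for the risk score

for pid, conditions, cost in patients:
    enrolled = cost >= ENROLLMENT_CUTOFF
    print(f"{pid}: conditions={conditions}, cost=${cost:,}, enrolled={enrolled}")
```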
This research emphasized that biases often arise from flawed labels reflecting structural inequalities. Addressing these biases through improved algorithm design and iterative testing can lead to fairer outcomes, opening pathways for more equitable healthcare solutions.
5. Conclusion
Algorithmic bias poses significant challenges across various sectors of the economy, often disproportionately affecting marginalized communities. As hiring practices evolve, companies increasingly rely on AI to make recruitment more efficient, yet this shift has also introduced unintended barriers to diversity, equity, and inclusion (DEI). Organizations aiming to improve their DEI culture must prioritize transparency, accountability, and ethical considerations in the design of their hiring algorithms. While technical solutions are essential, they are often not enough on their own, underscoring the need for strong legislation to tackle algorithmic bias; passing such legislation is a moral responsibility to ensure that technology benefits everyone. The future of mitigating algorithmic bias depends on global collaboration. Through thoughtful laws, we can build a technology environment that serves everyone, creating a fairer society in a world that relies more and more on automation.
6. References
Albaroudi, E., Mansouri, T., & Alameer, A. (2024). A Comprehensive Review of AI Techniques for Addressing Algorithmic Bias in Job Hiring. AI, 5(1), Article 1. https://doi.org/10.3390/ai5010019
Bîgu, D., & Cernea, M.-V. (2019). Algorithmic Bias in Current Hiring Practices: An Ethical Examination.
Chiu, R. (n.d.). Can We Fix AI Hiring Bias? Policy Commons. Retrieved October 24, 2024, from https://policycommons.net/artifacts/1320212/can-we-fix-ai-hiring-bias/1923502/
Dastin, J. (2022). Amazon Scraps Secret AI Recruiting Tool that Showed Bias against Women. In K. Martin (Ed.), Ethics of Data and Analytics (1st ed., pp. 296–299). Auerbach Publications. https://doi.org/10.1007/s00146-022-01574-0
Fountain, J. E. (2022). The moon, the ghetto and artificial intelligence: Reducing systemic racism in computational algorithms. Government Information Quarterly, 39(2), 101645. https://doi.org/10.1016/j.giq.2021.101645
Frissen, R., Adebayo, K. J., & Nanda, R. (2023). A machine learning approach to recognize bias and discrimination in job advertisements. AI & SOCIETY, 38(2), 1025–1038. https://doi.org/10.1007/s00146-022-01574-0
Lomibao, L. (2020, December 13). Factsheet: Facial Recognition Technology (FRT). Stop LAPD Spying Coalition. https://stoplapdspying.org/facial-recognition-factsheet/
Mumuni, A., & Mumuni, F. (2022). Data augmentation: A comprehensive survey of modern approaches. Array, 16, 100258. https://doi.org/10.1016/j.array.2022.100258
Sadek, T., Stanley, K. D., Smith, G., Marcinek, K., Cormarie, P., & Gunashekar, S. (2024). Artificial Intelligence Impacts on Privacy Law. RAND Corporation. https://www.rand.org/pubs/research_reports/RRA3243-2.html
Sen, S. (2023, September 22). Applicant Tracking System: The Ultimate Guide to Smart Hiring. Asanify. https://asanify.com/blog/human-resources/applicant-tracking-system-the-ultimate-guide-to-smart-hiring/
The Ultimate Guide to AI in Recruiting [2024]. (2024, October 4). Joveo. https://www.joveo.com/the-ultimate-guide-to-ai-in-recruiting/
The Three Challenges of AI Regulation. (n.d.). Brookings. Retrieved October 19, 2024, from https://www.brookings.edu/articles/the-three-challenges-of-ai-regulation/