Academic researchers in India and the U.S. are taking several approaches to addressing the fairness of face biometrics across demographic groups, and of the datasets used to train the biometric algorithms.
IIT researchers develop FPR framework
Researchers from the Indian Institute of Technology (IIT) Jodhpur have developed a framework to evaluate datasets on a “fairness, privacy, and regulatory” (FPR) scale, aiming to tackle concerns of bias and ethical lapses in AI systems tailored for the Indian context.
According to a PTI report published by The Week Magazine, Mayank Vatsa, professor at IIT Jodhpur and corresponding author of the study, suggests that when building a facial recognition system specifically for India, it is best to prioritize datasets that reflect the diversity of facial features and skin tones that exist in the region.
The framework, developed with international collaborators and published in Nature Machine Intelligence in August, assigns an FPR score to datasets. It evaluates fairness by assessing demographic representation, privacy by identifying vulnerabilities that may lead to data breaches, and regulatory compliance by checking adherence to legal and ethical standards.
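The report does not detail how the three dimensions are combined into a single FPR score, but a minimal sketch of what such a composite might look like follows. The balance-based fairness sub-score, the placeholder privacy and regulatory inputs, and the equal weighting are all illustrative assumptions, not the published methodology.

```python
from collections import Counter

def fairness_score(labels, groups):
    """Toy fairness sub-score: 1.0 when the listed demographic groups are
    equally represented, approaching 0.0 as the data collapses onto one group.
    Illustrative only; not the metric used in the IIT Jodhpur study."""
    counts = Counter(labels)
    n, k = len(labels), len(groups)
    ideal = n / k
    # Mean absolute deviation from a perfectly balanced dataset, normalized
    # by its maximum (all samples in a single group) so it lands in [0, 1].
    deviation = sum(abs(counts.get(g, 0) - ideal) for g in groups) / (2 * n * (1 - 1 / k))
    return 1.0 - deviation

def fpr_score(fairness, privacy, regulatory, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted combination of the three sub-scores; equal weighting is an
    assumption made for illustration."""
    return sum(w * s for w, s in zip(weights, (fairness, privacy, regulatory)))

# Example: a face dataset heavily skewed toward one demographic group.
labels = ["group_a"] * 800 + ["group_b"] * 150 + ["group_c"] * 50
f = fairness_score(labels, ["group_a", "group_b", "group_c"])
print(round(f, 3))                        # 0.3, reflecting the skew
print(round(fpr_score(f, 0.6, 0.4), 3))   # 0.433 with placeholder privacy/regulatory scores
```

In this toy example, the skewed demographic makeup drags the composite down even when the other two sub-scores are moderate, mirroring the kind of deficiency the audit flagged in most face datasets.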
The researchers audited 60 datasets, including 52 biometric face-based datasets and eight chest X-ray datasets, finding widespread deficiencies. About 90 percent of face datasets scored poorly, with most failing to meet fairness and compliance standards.
SMU and WVU explore synthetic data’s potential
As AI applications in facial recognition expand globally, researchers from Southern Methodist University (SMU) and West Virginia University (WVU) are also spearheading efforts to address long-standing issues of bias, fairness, and security in the technology.
At SMU’s Lyle School of Engineering, Corey Clark, assistant professor of computer science and deputy director of research at the SMU Guildhall, leads a team focused on generating vast synthetic datasets for AI training. Unlike traditional datasets comprising real human images – often sourced through ethically complex agreements or web scraping – synthetic datasets are algorithmically created. These datasets, Clark mentions in a YouTube explainer, can emulate highly realistic human likenesses without depending on identifiable individuals, preserving privacy while enabling large-scale model training.
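Clark’s explainer does not go into implementation detail, but the key property of synthetic data, that demographic balance can be designed in rather than hoped for, can be sketched roughly as follows. The SyntheticFace record, the attribute bins, and the seed-per-identity scheme are illustrative assumptions, not SMU’s actual pipeline.

```python
import random
from dataclasses import dataclass

# Hypothetical stand-in for the output of a generative face model; the SMU
# team's actual tooling is not described here in enough detail to reproduce.
@dataclass
class SyntheticFace:
    identity_seed: int   # stands in for a latent vector fed to a face generator
    skin_tone: str
    age_group: str
    gender: str

def generate_balanced_dataset(n_per_cell, skin_tones, age_groups, genders, seed=0):
    """Enumerate every demographic combination and synthesize the same number
    of novel identities for each, so no group is under-represented."""
    rng = random.Random(seed)
    dataset = []
    for tone in skin_tones:
        for age in age_groups:
            for gender in genders:
                for _ in range(n_per_cell):
                    dataset.append(SyntheticFace(rng.getrandbits(64), tone, age, gender))
    return dataset

faces = generate_balanced_dataset(
    n_per_cell=100,
    skin_tones=["I", "II", "III", "IV", "V", "VI"],  # Fitzpatrick-style bins
    age_groups=["18-30", "31-50", "51+"],
    genders=["female", "male"],
)
print(len(faces))  # 100 * 6 * 3 * 2 = 3600 synthetic identities, evenly balanced
```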
The risks of racial and gender bias in AI algorithms have drawn criticism and highlighted the need for equitable technology.
Nima Karimian, an assistant professor in the Lane Department of Computer Science and Electrical Engineering at West Virginia University’s Benjamin M. Statler College of Engineering and Mineral Resources, highlights that biometric systems face significant security vulnerabilities, particularly from attacks aimed at hardware such as phones and laptops.
At WVU, Karimian is tackling the issue from a different angle, leveraging a $632,000 NSF CAREER award to explore AI’s vulnerabilities to bias and fairness failures. His work underscores the risks inherent in using flawed datasets or algorithms in critical applications.
“To date, there is nonexistent research specifically addressing bias and fairness in anti-spoofing biometrics,” says Karimian, referring to liveness or presentation attack detection (PAD).
“Part of the challenge is the fact that the current state-of-the-art technique for training face recognition software in a way that mitigates bias involves using synthetic data, not images of real people’s faces. But synthetic data generation won’t work if we’re trying to mitigate bias in anti-spoofing systems, because the whole point of anti-spoofing systems is to distinguish between fake samples and genuine data.”
Clark sees synthetic data as potentially pivotal in overcoming barriers to equitable AI, while Karimian is seeking to explain the underlying reasons for demographic bias.