Paper title
Gender Classification and Bias Mitigation in Facial Images
Paper authors
Abstract
Gender classification algorithms have important applications in many domains today, such as demographic research, law enforcement, and human-computer interaction. Recent research has shown that algorithms trained on biased benchmark databases can result in algorithmic bias. However, to date, little research has examined the bias of gender classification algorithms against gender minority subgroups, such as the LGBTQ and non-binary populations, who have distinct characteristics in gender expression. In this paper, we begin by surveying existing benchmark databases for facial recognition and gender classification tasks. We find that current benchmark databases lack representation of gender minority subgroups. We then extend the current binary gender classifier to include a non-binary gender class. To do so, we assemble two new facial image databases: 1) a racially balanced inclusive database with a subset of the LGBTQ population, and 2) an inclusive-gender database that consists of people with non-binary gender. We work to increase classification accuracy and mitigate algorithmic bias in our baseline model trained on the augmented benchmark database. Our ensemble model achieves an overall accuracy of 90.39%, a 38.72% increase over the baseline binary gender classifier trained on Adience. While this is an initial attempt at mitigating bias in gender classification, more work is needed to model gender as a continuum by assembling more inclusive databases.
