Detecting Risk of Biased Output with Balance Measures

Journal of Data and Information Quality, April 2022
Mariachiara Mecati, Antonio Vetrò, Marco Torchiano
April 2022

Data has become a fundamental element of the management and productive infrastructures of our society, fuelling digitization of organizational and decision-making processes at an impressive speed. This transition shows lights and shadows, and the “bias in-bias out” problem is one of the most relevant issues, which encompasses technical, ethical, and social perspectives. We address this field of research by investigating how the balance of protected attributes in training data can be used to assess the risk of algorithmic unfairness. We identify four balance measures and test their ability to detect the risk of discriminatory classification by applying them to the training set. The results of this proof of concept show that the indexes can properly detect unfairness of software output. However we found the choice of the balance measure has a relevant impact on the threshold to consider as risky; further work is necessary to deepen knowledge on this aspect.

Official version of the paper available at: https://doi.org/10.1145/3530787