Identifying Risks in Datasets for Automated Decision-Making

EGOV2020 – IFIP EGOV-CeDEM-EPART 2020, Linköping University (Sweden), August 31 - September 2, 2020, pp. 332-344. egov-2020 (BEST PAPER AWARD)
Mecati, M., Cannavò, F.E., Vetrò, A., Torchiano, M.
PDF: definitivo_paper_egov2020.pdf (295.45 KB)
September 2020


Best Paper Award in the category "The most innovative research contribution or case study" at EGOV-2020.

Our daily life is profoundly affected by the adoption of automated decision-making (ADM) systems, driven by the ongoing tendency of humans to delegate decisions to machines. The widespread use of ADM systems was facilitated by the availability of large-scale data, alongside the deployment of devices and equipment. This trend has resulted in an increasing influence of ADM systems’ output over several aspects of our lives, with possible discriminatory consequences for certain individuals or groups. In this context, we focus on input data by investigating measurable characteristics that can lead to discriminatory automated decisions. In particular, we identified two indexes of heterogeneity and diversity, and tested them on two datasets. A limitation we found is the indexes’ sensitivity to a large number of categories, but on the whole the results show that the indexes reflect imbalances in the input data well. Future work is required to further assess the reliability of these indexes as indicators of discrimination risks in the context of ADM, in order to foster a more conscious and responsible use of ADM systems through an immediate investigation of input data.
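To make the idea of measuring imbalance in input data more concrete, here is a minimal Python sketch. The abstract does not name the two indexes, so the normalized Gini heterogeneity index and the normalized Shannon diversity index used below are assumptions chosen as common examples, not necessarily the paper's measures. The function names and the sample column are hypothetical.

```python
from collections import Counter
from math import log


def category_frequencies(values):
    """Return the relative frequency of each distinct category in a column."""
    counts = Counter(values)
    total = sum(counts.values())
    return [count / total for count in counts.values()]


def gini_heterogeneity(values):
    """Normalized Gini index (assumed measure): 0 = one category only, 1 = balanced."""
    freqs = category_frequencies(values)
    m = len(freqs)
    if m < 2:
        return 0.0  # a single category has no heterogeneity
    return (m / (m - 1)) * (1 - sum(f * f for f in freqs))


def shannon_diversity(values):
    """Normalized Shannon index (assumed measure): 0 = one category only, 1 = balanced."""
    freqs = category_frequencies(values)
    m = len(freqs)
    if m < 2:
        return 0.0  # a single category has no diversity
    return -sum(f * log(f) for f in freqs) / log(m)


# Hypothetical protected-attribute column with a strong imbalance.
column = ["group_a"] * 9 + ["group_b"]
print(gini_heterogeneity(column), shannon_diversity(column))
```

Under these assumptions, a perfectly balanced column scores 1 on both indexes, and lower values indicate stronger imbalance among the observed categories. Both indexes are normalized by the number of observed categories, which is one way a large number of categories can influence the result.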

Reproducibility package