Bounding the fairness and accuracy of classifiers from population statistics

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

    8 Scopus citations

    Abstract

    We consider the study of a classification model whose properties are impossible to estimate using a validation set, either due to the absence of such a set or because access to the classifier, even as a black-box, is impossible. Instead, only aggregate statistics on the rate of positive predictions in each of several sub-populations are available, as well as the true rates of positive labels in each of these sub-populations. We show that these aggregate statistics can be used to lower-bound the discrepancy of a classifier, which is a measure that balances inaccuracy and unfairness. To this end, we define a new measure of unfairness, equal to the fraction of the population on which the classifier behaves differently, compared to its global, ideally fair behavior, as defined by the measure of equalized odds. We propose an efficient and practical procedure for finding the best possible lower bound on the discrepancy of the classifier, given the aggregate statistics, and demonstrate in experiments the empirical tightness of this lower bound, as well as its possible uses on various types of problems, ranging from estimating the quality of voting polls to measuring the effectiveness of patient identification from internet search queries. The code and data are available at https://github.com/sivansabato/bfa.
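
To make the setting concrete, here is a minimal sketch of the most elementary bound that such aggregate statistics allow. It is not the procedure proposed in the paper and does not compute the discrepancy; it covers only the inaccuracy side. Within any sub-population, the error rate of a binary classifier is at least the absolute gap between its positive-prediction rate and the true positive-label rate, because that gap equals |FP - FN| while the error equals FP + FN. The function name `error_lower_bound` and all numbers are hypothetical and chosen for illustration only.

```python
from typing import Sequence


def error_lower_bound(
    weights: Sequence[float],
    pred_pos_rates: Sequence[float],
    true_pos_rates: Sequence[float],
) -> float:
    """Lower bound on the overall error rate of a binary classifier.

    weights[g]        : relative size of sub-population g (any positive scale)
    pred_pos_rates[g] : fraction of sub-population g predicted positive
    true_pos_rates[g] : fraction of sub-population g whose true label is positive

    In each sub-population, error >= |pred_pos_rate - true_pos_rate|,
    so the weighted sum of these gaps bounds the overall error from below.
    """
    if not (len(weights) == len(pred_pos_rates) == len(true_pos_rates)):
        raise ValueError("all inputs must have one entry per sub-population")
    total = float(sum(weights))
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(
        (w / total) * abs(p - q)
        for w, p, q in zip(weights, pred_pos_rates, true_pos_rates)
    )


# Hypothetical aggregate statistics for three sub-populations.
bound = error_lower_bound(
    weights=[0.5, 0.3, 0.2],
    pred_pos_rates=[0.40, 0.10, 0.70],
    true_pos_rates=[0.30, 0.20, 0.50],
)
# 0.5*0.10 + 0.3*0.10 + 0.2*0.20 = 0.12, up to floating-point rounding
```

This bound says nothing about fairness: a classifier whose prediction rates match the label rates in every sub-population gets a bound of zero here, yet it could still behave differently across sub-populations in the equalized-odds sense that the abstract describes.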

    Original language: English
    Title of host publication: 37th International Conference on Machine Learning, ICML 2020
    Editors: Hal Daume, Aarti Singh
    Publisher: International Machine Learning Society (IMLS)
    Pages: 8286-8295
    Number of pages: 10
    ISBN (Electronic): 9781713821120
    State: Published - 1 Jan 2020
    Event: 37th International Conference on Machine Learning, ICML 2020 - Virtual, Online
    Duration: 13 Jul 2020 to 18 Jul 2020

    Publication series

    Name: 37th International Conference on Machine Learning, ICML 2020
    Volume: PartF168147-11

    Conference

    Conference: 37th International Conference on Machine Learning, ICML 2020
    City: Virtual, Online
    Period: 13/07/20 to 18/07/20

    ASJC Scopus subject areas

    • Computational Theory and Mathematics
    • Human-Computer Interaction
    • Software
