Research article · Public Access · DOI: 10.1145/3468264.3468537

Bias in machine learning software: why? how? what to do?

Published: 18 August 2021
Abstract

    Increasingly, software makes autonomous decisions in areas such as criminal sentencing, credit-card approval, and hiring. Some of these decisions are biased and adversely affect certain social groups (e.g., those defined by sex, race, age, or marital status). Many prior works on bias mitigation take the following form: change the data or the learners in multiple ways, then check whether any of those changes improves fairness. A better approach may be to postulate the root causes of bias and then apply a targeted resolution strategy. This paper postulates that the root causes of bias are the prior decisions that determined (a) what data was selected and (b) the labels assigned to those examples. Our Fair-SMOTE algorithm removes biased labels and rebalances internal distributions so that, for each sensitive attribute, examples are equally represented in both the positive and negative classes. In our tests, this method was just as effective at reducing bias as prior approaches. Further, models generated via Fair-SMOTE achieve higher performance (measured in terms of recall and F1) than other state-of-the-art fairness-improvement algorithms. To the best of our knowledge, measured by the number of learners and datasets analyzed, this is one of the largest studies of bias mitigation yet presented in the literature.
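    To make the rebalancing step concrete, the sketch below oversamples a dataset so that every (protected attribute, class label) subgroup is equally represented, using SMOTE-style interpolation within each subgroup. It is a minimal illustration under stated assumptions, not the paper's implementation: the function name `rebalance_subgroups` and the toy columns (age, hours, sex, label) are invented for the example, features are assumed numeric, and Fair-SMOTE's complementary step of removing biased labels is omitted.

```python
import numpy as np
import pandas as pd

def rebalance_subgroups(df, protected, label, seed=0):
    """Oversample until every (protected attribute, class label) subgroup
    has as many rows as the largest one. New rows are SMOTE-style
    interpolations between a subgroup member and its nearest neighbour
    inside the same subgroup, so synthetic points never cross group or
    class boundaries. Assumes all feature columns are numeric.
    Illustrative sketch only, not the published Fair-SMOTE code."""
    rng = np.random.default_rng(seed)
    features = [c for c in df.columns if c not in (protected, label)]
    groups = {key: g for key, g in df.groupby([protected, label])}
    target = max(len(g) for g in groups.values())
    parts = [df]
    for (p_val, l_val), g in groups.items():
        X = g[features].to_numpy(dtype=float)
        synthetic = []
        for _ in range(target - len(g)):
            i = rng.integers(len(X))
            if len(X) > 1:
                # nearest neighbour of X[i], excluding X[i] itself
                dists = np.linalg.norm(X - X[i], axis=1)
                j = int(np.argsort(dists)[1])
            else:
                j = i  # singleton subgroup: fall back to duplication
            lam = rng.random()  # interpolation factor in [0, 1)
            synthetic.append(X[i] + lam * (X[j] - X[i]))
        if synthetic:
            new = pd.DataFrame(synthetic, columns=features)
            new[protected] = p_val
            new[label] = l_val
            parts.append(new)
    return pd.concat(parts, ignore_index=True)

# Toy usage: a frame where the (sex=0, label=1) subgroup is under-represented.
df = pd.DataFrame({
    "age":   [25, 32, 47, 51, 62, 23, 36, 44],
    "hours": [40, 50, 40, 60, 40, 20, 40, 45],
    "sex":   [1, 1, 1, 1, 0, 0, 0, 0],
    "label": [1, 1, 0, 0, 1, 0, 0, 0],
})
balanced = rebalance_subgroups(df, protected="sex", label="label")
print(balanced.groupby(["sex", "label"]).size())  # all four subgroups equal
```

    Running this on the toy frame yields four equal-sized subgroups; a real pipeline would also need to handle categorical features (e.g., by encoding them before interpolation).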






        Published In

        ESEC/FSE 2021: Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
        August 2021
        1690 pages
        ISBN: 9781450385626
        DOI: 10.1145/3468264

        Publisher

        Association for Computing Machinery

        New York, NY, United States



        Badges

        • Distinguished Paper

        Author Tags

        1. Bias Mitigation
        2. Fairness Metrics
        3. Software Fairness


        Funding Sources

        • Laboratory for Analytic Sciences
        • NSF

        Conference

        ESEC/FSE '21

        Acceptance Rates

        Overall acceptance rate: 112 of 543 submissions (21%)




        Article Metrics

        • Downloads (last 12 months): 3,746
        • Downloads (last 6 weeks): 385
        Reflects downloads up to 14 Aug 2024.


        Cited By
        • (2024) Plato's Shadows in the Digital Cave: Controlling Cultural Bias in Generative AI. Electronics 13(8), 1457. DOI: 10.3390/electronics13081457. Online publication date: 11-Apr-2024.
        • (2024) The Use of Facial Recognition in Sociological Research: A Comparison of ClarifAI and Kairos Classifications to Hand-Coded Images. Socius: Sociological Research for a Dynamic World 10. DOI: 10.1177/23780231241259659. Online publication date: 20-Jun-2024.
        • (2024) Predicting Fairness of ML Software Configurations. Proceedings of the 20th International Conference on Predictive Models and Data Analytics in Software Engineering, 56-65. DOI: 10.1145/3663533.3664040. Online publication date: 10-Jul-2024.
        • (2024) MirrorFair: Fixing Fairness Bugs in Machine Learning Software via Counterfactual Predictions. Proceedings of the ACM on Software Engineering 1(FSE), 2121-2143. DOI: 10.1145/3660801. Online publication date: 12-Jul-2024.
        • (2024) Fairness Testing: A Comprehensive Survey and Analysis of Trends. ACM Transactions on Software Engineering and Methodology 33(5), 1-59. DOI: 10.1145/3652155. Online publication date: 4-Jun-2024.
        • (2024) Bias Mitigation for Machine Learning Classifiers: A Comprehensive Survey. ACM Journal on Responsible Computing 1(2), 1-52. DOI: 10.1145/3631326. Online publication date: 20-Jun-2024.
        • (2024) A Post-training Framework for Improving the Performance of Deep Learning Models via Model Transformation. ACM Transactions on Software Engineering and Methodology 33(3), 1-41. DOI: 10.1145/3630011. Online publication date: 15-Mar-2024.
        • (2024) An Empirical Study on Correlations Between Deep Neural Network Fairness and Neuron Coverage Criteria. IEEE Transactions on Software Engineering 50(3), 391-412. DOI: 10.1109/TSE.2023.3349001. Online publication date: Mar-2024.
        • (2024) Requirements Verification Through the Analysis of Source Code by Large Language Models. SoutheastCon 2024, 75-80. DOI: 10.1109/SoutheastCon52093.2024.10500073. Online publication date: 15-Mar-2024.
        • (2024) Ethics: Why Software Engineers Can't Afford to Look Away. IEEE Software 41(1), 142-144. DOI: 10.1109/MS.2023.3319768. Online publication date: Jan-2024.
