This repo provides the code for our paper "On Adversarial Bias and the Robustness of Fair Machine Learning".
We design and implement data poisoning attack algorithms (Algorithm 1 and Algorithm 2) that reduce the overall test accuracy of machine learning models trained with equalized odds as the fairness constraint. We assume an attacker who can control the sampling process and, in the stronger case, also the labeling process for some of the training data. Using these algorithms, we show that in the presence of adversarial bias, the fairness of machine learning models can be in conflict with their robustness.
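For readers unfamiliar with the fairness constraint, equalized odds asks that the true positive rate and the false positive rate be equal across protected groups. The snippet below is a minimal, standard-library sketch of how the violation of that constraint could be measured on binary predictions. The function names `group_rates` and `equalized_odds_gap` and the toy data are illustrative assumptions, not part of this repo or of fairlearn.

```python
def group_rates(y_true, y_pred, groups, group):
    """Return (TPR, FPR) of binary predictions restricted to one group."""
    tp = fp = pos = neg = 0
    for y, p, g in zip(y_true, y_pred, groups):
        if g != group:
            continue
        if y == 1:
            pos += 1
            tp += p
        else:
            neg += 1
            fp += p
    # Guard against groups with no positive or no negative examples.
    tpr = tp / pos if pos else 0.0
    fpr = fp / neg if neg else 0.0
    return tpr, fpr


def equalized_odds_gap(y_true, y_pred, groups):
    """Largest TPR or FPR difference between any two groups (0 means satisfied)."""
    rates = [group_rates(y_true, y_pred, groups, g) for g in sorted(set(groups))]
    tprs = [r[0] for r in rates]
    fprs = [r[1] for r in rates]
    return max(max(tprs) - min(tprs), max(fprs) - min(fprs))


# Toy data (hypothetical): group 0 has TPR 0.5 and FPR 0.0,
# group 1 has TPR 1.0 and FPR 0.5.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
groups = [0, 0, 0, 0, 1, 1, 1, 1]
gap = equalized_odds_gap(y_true, y_pred, groups)
```

A fair learner constrains this gap to stay below a small tolerance on the training data, which is why poisoned training points can change the model it selects.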
Requirements: Python 3.6
Install virtualenv:
pip install virtualenv
Create a virtual environment:
mkdir ~/env/ # this directory contains all virtual environments
virtualenv -p python3 ~/env/(name)
# replace (name) with your naming of the virtual environment
Install packages:
source ~/env/(name)/bin/activate
# activate the environment
pip install -r requirements.txt
Note: For reproducibility, we include the fairlearn repo (Microsoft, v0.3.0).
We evaluate the code on the COMPAS and Adult datasets. We generate 4 datasets in the dataset folder using the preprocessing steps described in the paper. Each dataset includes:
- clean training dataset.
- hard examples from nature.
- attacker dataset, drawn from the same distribution as the clean training dataset but with no overlap.
- test dataset.
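To illustrate how these pieces fit the threat model, the sketch below builds a poisoned training set by appending points from the attacker dataset to the clean training set, optionally flipping their labels for the stronger attacker. It selects points uniformly at random, so it is only a placeholder baseline and is not Algorithm 1 or Algorithm 2 from the paper. The function `poison_training_set`, the `(features, label)` record format, and the definition of the budget as a fraction of the clean set size are all assumptions made for this example.

```python
import random


def poison_training_set(clean, attacker_pool, fraction, flip_labels=False, seed=0):
    """Append attacker-chosen records to the clean training set.

    clean, attacker_pool: lists of (features, label) with binary labels.
    fraction: poisoning budget relative to len(clean) (assumed definition).
    flip_labels: if True, the attacker also controls labeling (stronger case).
    """
    rng = random.Random(seed)
    budget = min(int(fraction * len(clean)), len(attacker_pool))
    # Placeholder selection rule: uniform random sampling without replacement.
    chosen = rng.sample(attacker_pool, budget)
    if flip_labels:
        chosen = [(x, 1 - y) for x, y in chosen]
    # Return a new list so the clean set is left untouched.
    return clean + chosen


# Hypothetical toy records standing in for the files in the dataset folder.
clean = [((i,), i % 2) for i in range(10)]
attacker_pool = [((100 + i,), 0) for i in range(5)]

sampled = poison_training_set(clean, attacker_pool, fraction=0.2)
flipped = poison_training_set(clean, attacker_pool, fraction=0.2, flip_labels=True)
```

The fair model would then be trained on the returned list and evaluated on the test dataset, which the attacker never modifies.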
- The implementations of Algorithm 1 and Algorithm 2 are provided in this repo. Call the corresponding functions and provide the required arguments to run the attacks.
- See the Example notebook for a worked example.
To cite the arXiv version, please use the following BibTeX entry:
title={On Adversarial Bias and the Robustness of Fair Machine Learning},
author={Chang, Hongyan and Nguyen, Ta Duy and Murakonda, Sasi Kumar and Kazemi, Ehsan and Shokri, Reza},
journal={arXiv preprint arXiv:2006.08669},