Towards Methodologies and Tools for Conducting Algorithm Audits

Lead PI

Abstract

This project will develop methodologies and tools for conducting algorithm audits. An algorithm audit uses controlled experiments to examine an algorithmic system, such as an online service or big data information archive, and ascertain (1) how it functions, and (2) whether it may cause harm.

Examples of documented harms by algorithms include discrimination, racism, and unfair trade practices. Although there is rising awareness of the potential for algorithmic systems to cause harm, detecting this harm in practice remains a key challenge. Given that most algorithms of concern are proprietary and non-transparent, there is a clear need for methods to conduct black-box analyses of these systems. Numerous regulators and governments have expressed concerns about algorithms, as well as a desire to increase transparency and accountability in this area.

This research will develop methodologies to audit algorithms in three domains that impact many people: online markets, hiring websites, and financial services. Auditing algorithms in these three domains will require solving fundamental methodological challenges, such as how to analyze systems with large, unknown feature sets, and how to estimate feature values without ground-truth data. To address these broad challenges, the research will draw on insights from prior experience auditing personalization algorithms. Each domain also brings unique challenges that will be addressed individually. For example, novel auditing tools will be constructed that leverage extensive online and offline histories. These new tools will allow examination of systems that were previously inaccessible to researchers, including financial services companies. Methodologies, open-source code, and datasets will be made available to other academic researchers and regulators.

This project includes two integrated educational objectives: (1) to create a new computer science course on big data ethics that teaches how to identify and mitigate harmful side effects of big data technologies, and (2) to produce web-based versions of the auditing tools that are designed to be accessible and informative to the general public. These web-based tools will increase transparency around specific, prominent algorithmic systems and promote general education about the proliferation and impact of algorithmic systems.

Funding

NSF