Harvard professor tackles algorithms, prediction, and fairness in lecture at Northeastern
Author: Juliana George
Date: 11.26.24
Machine learning algorithms are never fully fair. Facial recognition software has more trouble identifying women and people with darker skin than it does white men, and a 2019 study found that a predictive algorithm used by health care professionals to identify patients in need of care showed a racial bias against Black patients.
There’s a reason for these disparities. Algorithms are only as good as their training data, and sometimes a sufficiently representative data set is unavailable. But Cynthia Dwork, Harvard University’s Gordon McKay Professor of Computer Science, doesn’t think that should stop computer scientists from pursuing fairness. In fact, she’s working to create algorithms that aren’t limited by existing data, and in doing so, she’s putting her faith in the possibility of a better world.
On October 21, Dwork shared some of her research on algorithmic fairness and prediction during her lecture “Prediction, Fairness and … Complexity Theory?” at the Fenway Center on Northeastern’s Boston campus.
For a time, Dwork focused on distributed systems, cryptography, and differential privacy, which enables data analysis while preserving the privacy of individual data subjects. Since 2010, she has studied algorithmic fairness in both theoretical and applied settings to advocate for more equity in machine learning. In 2023, she founded the Hire Aspirations Institute, bringing together top minds from Harvard, Northeastern, and other research institutions to investigate the fairness of online hiring processes.
Dwork’s research on algorithmic fairness, she told the Fenway Center crowd, began with defining group and individual notions of fairness. Group fairness can be evaluated by statistical parity — in which groups with different characteristics have an equal chance of reaching a certain outcome — or by within-group calibration. She discovered that group methods are not ideal for assessing fairness because statistical parity doesn’t necessarily lead to equity, and group boundaries are difficult to define.
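To make the group criterion concrete, here is a minimal sketch of a statistical parity check in Python; it is not code from Dwork's talk, and the decisions and group labels are illustrative stand-ins.

```python
import numpy as np

def statistical_parity_gap(decisions: np.ndarray, group: np.ndarray) -> float:
    """Difference in positive-decision rates between two groups.

    decisions: 0/1 array of model decisions.
    group:     0/1 array marking each individual's group.
    A gap near 0 means the groups reach the positive outcome at
    roughly equal rates, i.e. statistical parity holds.
    """
    rate_a = decisions[group == 0].mean()
    rate_b = decisions[group == 1].mean()
    return abs(rate_a - rate_b)

# Illustrative data: six decisions split across two groups of three.
decisions = np.array([1, 0, 1, 0, 0, 1])
group = np.array([0, 0, 0, 1, 1, 1])
print(statistical_parity_gap(decisions, group))  # about 0.33 here
```

As Dwork's work emphasizes, a gap near zero on its own does not guarantee that individuals within each group are treated equitably.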
By contrast, individual fairness would mean that individuals who are similar with respect to the task at hand receive similar outcomes, but choosing the right similarity metric is tricky. For one, the metric would have to come from a fair system, which doesn’t always exist. For another, no one knows the probability of a nonrepeatable event, so while there may be broad group statistics, it’s impossible to accurately predict whether, for example, a certain cancer patient’s tumor will metastasize or whether a particular convict will re-offend.
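Individual fairness is commonly stated as a Lipschitz-style condition: the gap between two individuals' scores should be no larger than their task-specific distance. The sketch below is illustrative only; the `task_distance` function is a placeholder for exactly the hard-to-obtain fair metric described above.

```python
import itertools
import numpy as np

def individual_fairness_violations(scores, features, task_distance):
    """Return pairs (i, j) whose score gap exceeds their task distance.

    Individual fairness asks that |score_i - score_j| <= d(i, j)
    for a metric d that is appropriate to the task.
    """
    violations = []
    for i, j in itertools.combinations(range(len(scores)), 2):
        if abs(scores[i] - scores[j]) > task_distance(features[i], features[j]):
            violations.append((i, j))
    return violations

# Placeholder metric: Euclidean distance on hypothetical task-relevant features.
task_distance = lambda x, y: float(np.linalg.norm(np.asarray(x) - np.asarray(y)))

scores = [0.9, 0.2, 0.85]
features = [[1.0, 0.0], [1.0, 0.1], [0.9, 0.0]]
print(individual_fairness_violations(scores, features, task_distance))  # [(0, 1), (1, 2)]
```

The hard part, as Dwork pointed out, is not the check itself but justifying the metric it relies on.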
To get around this issue, Dwork helped establish the concept of outcome indistinguishability, which judges a predictor not by counting its errors against historical data, but by whether the real world can be told apart from the world the model predicts. This is a stronger standard than an error-rate comparison because it lets researchers grade an algorithm according to the auditor’s degree of access to the model and its code.
With no access, an auditor sees only individuals and outcomes and cannot gauge the model’s accuracy. With sample access, auditors can compare the model’s predictions against real-life outcomes, making a stronger case for or against outcome indistinguishability. With oracle access, meaning the ability to query the model on chosen individuals without seeing its internals, auditors could even test an algorithm directly and potentially build a discrimination case if the model is found to be unfair.
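As a rough sketch of the sample-access case, the Python below builds a “predicted world” by sampling outcomes from a model’s probabilities and asks a simple distinguisher to separate real pairs from simulated ones. This is an illustration, not Dwork’s construction; the synthetic data and predictor are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical audit sample: features, real outcomes, and model probabilities.
X = rng.normal(size=(2000, 5))
true_p = 1 / (1 + np.exp(-X[:, 0]))        # stand-in for the real world
y_real = rng.binomial(1, true_p)           # observed outcomes
p_hat = 1 / (1 + np.exp(-0.8 * X[:, 0]))   # the model's predicted probabilities
y_model = rng.binomial(1, p_hat)           # outcomes in the "predicted world"

# Distinguisher task: given (features, outcome), guess which world it came from.
# Label 1 = real-world pair, label 0 = model-world pair.
pairs = np.vstack([np.column_stack([X, y_real]),
                   np.column_stack([X, y_model])])
labels = np.concatenate([np.ones(len(X)), np.zeros(len(X))])

X_train, X_test, y_train, y_test = train_test_split(pairs, labels, random_state=0)
distinguisher = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Accuracy close to 0.5 means this simple distinguisher cannot reliably
# tell the real world from the model's predicted world.
print(f"distinguisher accuracy: {distinguisher.score(X_test, y_test):.3f}")
```

Stronger forms of access would let the auditor query or inspect the model directly rather than rely on a fixed sample.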
These findings are germane at a time when, as Dwork noted, the public distrusts predictive algorithms that diverge from reality. For example, in 2020, Californians voted against a ballot proposition that would have replaced cash bail with an automated risk assessment system for certain charges. Despite widespread support for bail reform, critics worried that the risk assessment algorithm would rely on existing racial biases in the justice system.
Dwork believes that the key to ensuring the impartiality of similar systems is fairness auditing using outcome indistinguishability, which is more effective when the auditor has higher code access.
“People say, ‘Oh, this is the best I can do with the data I have,’” Dwork said. “I think this is really problematic, that you can look at something and say, ‘Gee, with this data, the only thing I can produce is this sexist scoring function.’ Then don’t produce the scoring function. Say, ‘No, I need different kinds of data.’
“If we imagine that our current world is a certain kind of transformation from a more ideal world,” Dwork added, “then we can build predictors that give us something closer to what would be the chances of success in a better world, even though we have no data from those worlds.”