We are likely to see as many new law review articles about artificial intelligence and law in the next year as the legal academy has produced since the dawn of AI. Those writing about AI for the very first time (*eyes a bunch of copyright scholars suspiciously*) would do well to engage deeply with the work of bias and discrimination scholars who have been writing some of the best, most insightful articles about AI and law for more than a decade. The frameworks and insights they have developed give us a way of thinking about AI law and policy that extends beyond Title VII and Equal Protection. A wonderful place to start is with the best contribution to this scholarship I have seen in years, *Less Discriminatory Algorithms*, by an interdisciplinary team of technical, policy, and legal experts.
This article is an entry in my favorite genre of legal scholarship: “you all are making an important mistake about how the technology works.” In particular, it takes on the received wisdom that there is a vexing tradeoff between fairness and accuracy when training machine learning models. This supposed tradeoff—that to make a biased algorithm fairer, you must sacrifice some of the model’s accuracy—may be true in theory for idealized, never-seen-in-the-wild maximally accurate models. But nobody ever has the time, compute, money, or energy to even approach this ideal, meaning real models out in the world are far from maximally accurate. People instead declare success as soon as their models are accurate enough for their purposes.
The thing about these less-than-maximally-accurate models is that there are many other possible models the model builders could have trained that would have been nearly as accurate while doling out false positives and false negatives differently. Some of these nearly-as-accurate models would likely have better distributional outcomes for protected classes and other vulnerable populations. This is what two of the article’s authors, Emily Black and Solon Barocas, have dubbed “model multiplicity” in the computer science literature. Model multiplicity means that if the people training a model kept experimenting—tweaking a hyperparameter here, selecting or deselecting a feature there, or preprocessing the data a little more—they would probably find many other equally (or nearly as) accurate models that allocate the winners and losers differently.
Importantly, many of these other nearly-as-accurate models might also be more fair, less biased, and less discriminatory than the one actually deployed. Rather than saying that there is a tradeoff between fairness and accuracy, we should instead understand that the real tradeoff is between accepting a given level of bias or discrimination and spending a little more time, money, and (carbon-emitting) compute to find an alternative that improves on fairness without sacrificing accuracy, as the sketch below illustrates.
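To make the phenomenon concrete, here is a minimal sketch in Python using scikit-learn. Everything in it is my own illustrative assumption rather than anything drawn from the article: the synthetic dataset, the hypothetical protected attribute, the small hyperparameter grid, and the choice of a selection-rate gap as the disparity measure. It simply sweeps a grid of gradient-boosting models and then looks at how much disparity varies among the models that sit near the top of the accuracy range.

```python
# A minimal sketch of model multiplicity on synthetic data. All names and
# numbers here are illustrative assumptions, not anything from the article.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 4000
group = rng.integers(0, 2, n)                 # hypothetical protected attribute
X = rng.normal(size=(n, 5)) + group[:, None] * 0.3
y = (X[:, 0] + X[:, 1] + rng.normal(size=n) > 0).astype(int)

X_tr, X_te, y_tr, y_te, g_tr, g_te = train_test_split(
    X, y, group, test_size=0.5, random_state=0)

results = []
# Sweep a small hyperparameter grid: each setting yields a different model.
for depth in (1, 2, 3):
    for lr in (0.05, 0.1, 0.2):
        clf = GradientBoostingClassifier(
            max_depth=depth, learning_rate=lr, random_state=0)
        clf.fit(X_tr, y_tr)
        pred = clf.predict(X_te)
        acc = (pred == y_te).mean()
        # Gap in selection rates between groups (demographic parity difference).
        gap = abs(pred[g_te == 0].mean() - pred[g_te == 1].mean())
        results.append((acc, gap, depth, lr))

# Among models within one point of the best accuracy, disparity still varies:
best_acc = max(r[0] for r in results)
near = [r for r in results if r[0] >= best_acc - 0.01]
for acc, gap, depth, lr in sorted(near, key=lambda r: r[1]):
    print(f"acc={acc:.3f}  gap={gap:.3f}  depth={depth}  lr={lr}")
```

On these assumptions, the printed near-frontier models will typically span a noticeable range of disparity, which is the whole point: the deployed model is one draw from a set of live alternatives, and the least discriminatory of them costs nothing in accuracy worth caring about.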
Happily, Black and Barocas have assembled a brilliant interdisciplinary team of coauthors: noted antidiscrimination scholar Pauline Kim and two amazing tech-meets-law researchers from the incredible nonprofit Upturn, Logan Koepke and Mingwei Hsu. The group successfully connects these engineering insights to legal doctrine. Model multiplicity’s recasting of the fairness-accuracy tradeoff has a direct bearing on the so-called third step of Title VII’s disparate impact analysis: the requirement that plaintiffs demonstrate a less discriminatory alternative (LDA). (I’m omitting the article’s careful discussion of how other civil rights statutes, including ECOA and the FHA, handle disparate impact analysis a bit differently, but to the same general effect.) This means that plaintiffs might be able to win at the third step of the test by citing this article and bringing on an expert who can help find the fairer, just-as-accurate model-not-taken.
Finding the less discriminatory alternative would still be daunting. It would require a plaintiff to have access to the exact training environment the defendant used—including all of the training data, which defendants are sure to resist producing. Even if plaintiffs clear this formidable discovery challenge, they will still need to spend a lot of time and money to find the less discriminatory alternative. All of this might prove insurmountably burdensome for plaintiffs, limiting the change that model multiplicity might bring to civil rights law.
The authors, however, have both technical and legal responses to these evidentiary challenges. If the burden were instead placed on the defendant at step two to show that they looked for less discriminatory alternatives during training, it would avoid the inefficiency, delay, and discovery hassles of burdening the plaintiff after the fact. Kim and her coauthors find judicial precedent for imposing this obligation on defendants at step two, while conceding that this interpretation has not yet been broadly adopted. Importantly, once this approach is understood to be required by civil rights law, it will impose on model builders a duty to search for these alternatives during training. Model multiplicity predicts that they will often find them! The result will be more model builders finding and deploying less discriminatory alternatives: a win-win of fewer people suffering discrimination and less litigation exposure for employers.
There is so much (lots!) to like about this article. It imports a simple but powerful and elegant technical insight—model multiplicity—into legal scholarship. It does so while also advancing a much-needed reform in the way we interpret civil rights law, putting the burden of looking for less discriminatory alternatives on defendants rather than plaintiffs. It is grounded and specific, explaining exactly why this should happen and how the three-step test and the correlative duty to search for less discriminatory alternatives ought to be interpreted; I can imagine courts across the country implementing this article’s prescriptions directly. And it does not shy away from technical and legal detail, taking deep dives into the techniques used to find a less discriminatory alternative, how close to the original model’s accuracy an alternative must be, and how much effort a model builder must spend trying to find one, to list only three examples.
Finally, legal scholars must understand that model multiplicity matters beyond civil rights law. An obvious extension is to products liability law: just as a plaintiff alleging a design defect must often point to a “reasonable alternative design,” model multiplicity gives injured plaintiffs a pathway to the algorithmic analogue, the nearly-as-accurate, less harmful model-not-taken.
We will soon see AI models that do not live up to our legal and societal ideals in many other ways, such as large language models that spout misinformation, deepfake models used to terrorize women, and facial recognition models that destroy privacy. As we fight over what we should do about the bad effects of models, we should understand that they are often the effects of bad models. Model multiplicity means refusing to let model builders rest on the expediency of having done the bare minimum. Keep trying; keep searching; keep looking, and you just may find models that both do well and do good.