Spanking the Money

Annemarie Bridy, Internet Payment Blockades, 67 Fla. L. Rev. __ (forthcoming 2015), available at SSRN.

A popular-culture aphorism useful for teaching and understanding intellectual property law is “follow the money.” Often a law or a court decision makes sense only when its financial implications are put in context. In this interesting, clear, and engagingly written article, Professor Annemarie Bridy of the University of Idaho College of Law looks at how and why monetary transactions can be stopped cold in cyberspace by financial institutions that initially appear to be acting against their own business interests, but are actually submitting to unseen authority of questionable legitimacy. It is a story of commoditized sex, online sales of illegal drugs, and copyrighted rock and roll.

At the outset, Bridy positions her account of Internet payment blockades in the context of scholarship about powerful corporate actors doing the government’s bidding as the result of behind-the-scenes pressure. She credits Ronald Mann and Seth Belzley with important observations about “how concentration and high barriers to entry in the market for payment processing make payment intermediaries a ‘highly visible “choke point” for regulatory intervention.’” (P. 4, citing Ronald Mann and Seth Belzley, The Promise of Intermediary Liability.) She further notes in her introduction that “[p]ublic-private regulatory cooperation of this sort goes by many names in the First Amendment literature, including proxy censorship, soft censorship, and new school speech regulation,” citing relevant works by Seth Kreimer (Seth F. Kreimer, Censorship by Proxy), Derek Bambauer (Derek E. Bambauer, Orwell’s Armchair), and Jack Balkin. (P. 5.)

As is so often true in sordid tales of unsavory and overreaching copyright enforcement in cyberspace, a major anti-hero of the Internet payment blockades story is a pornography company, Perfect 10. Bridy recounts the story of Perfect 10’s aggressive but ultimately unsuccessful litigation efforts to hold payment intermediaries legally responsible for alleged copyright infringement by third-party websites, under the theory that providing payment processing services made them contributorily and vicariously liable. Judicial findings included determinations that providing payment services did not rise to the level of a material contribution to the infringement; that providing payment services did not induce infringement because there was no clear expression or affirmative act of specific intent to foster infringement; and that the right and ability to indirectly affect the degree or ease of infringement by providing payment systems did not amount to the right or ability to control the infringing activity.

Bridy observes that after their judicial defeats, Perfect 10 and other content owners pressed the U.S. government to devise other mechanisms of control over private payment systems. Bills intended to establish new statutory authority were introduced in Congress: the Combating Online Infringement and Counterfeits Act (COICA), the Stop Online Piracy Act (SOPA), and the PROTECT IP Act (PIPA). These proposed laws proved deeply unpopular with Internet companies such as Google, Mozilla, Reddit, and thousands of others, which whipped themselves into a massive and productive frenzy of protest. Many potentially affected companies leveraged their customer bases to publicly express deep and dire disapproval of this menu of proposed legislation, which was forthwith discarded hot-potato style.

But that was hardly the end of the matter. Bridy cogently explains that after abandoning this quest to directly regulate private payment processors, the U.S. government began deploying “soft” pressure and ardent persuasion to turn private payment entities such as MasterCard, Visa, American Express, Discover, and PayPal into controllable financial chokepoints in cyberspace. These chokepoints, which she labels Internet payment blockades, now facilitate the financial freeze-out of political targets like WikiLeaks, and of sites accused of vending potentially counterfeit or infringing goods and services.

Though payment intermediaries are ostensibly acting voluntarily, Bridy details the ways they are essentially forced to police online activities virtually worldwide at the behest of Congresspeople, the Obama administration’s Office of the U.S. Intellectual Property Enforcement Coordinator, and myriad content owners, through an opaque system of ironically denominated “best practices.” Bridy warns that there are a number of reasons these supposed best practices might not even qualify as halfway decent practices: extraterritorial enforcement of U.S. laws is improperly facilitated, extrajudicial remedies are unsuitably enabled, and extraordinary efficiency comes at the expense of due process and transparency. Important disputes are now privately adjudicated, displacing four-letter f-words like free (as in free speech) and fair (as in fair use) with (at least for this Jotwell reviewer) another four-letter f-word that communicates anger and despair at this reprehensible development.

Ultimately, Bridy concludes that new payment systems could develop that are more resistant to government interventions. Bitcoin, she asserts, may be one of them. She spends the last section of the article explaining how Bitcoin works and evaluating its potential as an effective Internet payment blockade runner. I learned a lot from this, as I did from this excellent article in its entirety.

Cite as: Ann Bartow, Spanking the Money, JOTWELL (February 13, 2015) (reviewing Annemarie Bridy, Internet Payment Blockades, 67 Fla. L. Rev. __ (forthcoming 2015), available at SSRN), http://cyber.jotwell.com/spanking-the-money/.
 
 

“They’re Coming to Get You, Barbara.”

Julie Cohen, The Zombie First Amendment, 56 Wm. & Mary L. Rev. __ (forthcoming 2015), available at SSRN.

Julie Cohen’s The Zombie First Amendment does not present itself as a piece of cyberlaw scholarship. It’s a treatment of information governance in the post-industrial information age through the lens of constitutional law, with a broad range of potential applications—from information privacy to campaign finance reform to intellectual property law to network neutrality. In a sense, it’s a meta-cyberlaw paper. It’s not about information technology, but about information as technology.

Any piece by Julie Cohen both demands and rewards a more careful reading than a brief review such as this one can offer. Brevity is today’s currency, however. Begin, then, with the following overview of her argument: Contemporary First Amendment jurisprudence, she argues, is a species of the walking dead, legal doctrine whose form gives the appearance of a plausibly sentient and responsive entity but whose spirit, soul, and intelligence have been displaced by powers that answer to a different, seemingly unstoppable and almost technological logic. Contemporary information practices have eaten the First Amendment’s brain.

The elements of the argument are these. The first part of the paper reviews and extracts a series of governing modern First Amendment principles from recent Supreme Court opinions. From Citizens United v. Federal Election Comm’n, 558 U.S. 310 (2010) (striking down certain campaign finance regulation), and Sorrell v. IMS Health, Inc., 131 S. Ct. 2653 (2011) (striking down state regulation of marketing use of information about physician prescribing behavior), comes the proposition that “information flows that advance the purposes of private property accumulation and consumer surplus extraction may move freely with little fear of encountering regulatory obstacles.” (P. 13.) From Holder v. Humanitarian Law Project, 561 U.S. 1 (2010) (upholding a federal law forbidding material support to terrorist organizations), Eldred v. Ashcroft, 537 U.S. 186 (2003) (upholding Congressional extension of the term of copyright), and Golan v. Holder, 132 S. Ct. 873 (2012) (upholding copyright amendments that effectively restored copyright protection to certain works that had entered the public domain) comes evidence that “some types of content and speaker distinctions will be supported by the full force of law—will be treated, in other words, as principled and nonarbitrary.” (P. 13.) Cohen combines these two points, as follows: “[T]hese opinions establish both a generally deregulatory stance toward proprietary, profit-motivated uses of information and the predicate for installing circuit breakers within the network to intercept other kinds of uses that threaten proprietary interests.” (P. 13.)

The cyberlaw argument comes next, and it arrives in blunt, forceful terms. That deregulatory stance, framed in terms of speech as a good “thing” to be protected or a bad “thing” to be guarded against, evinces an uncritical, almost technological determinism. This is the zombie First Amendment, which, Cohen argues, treats speech presumptively as property for constitutional purposes. In so doing it treats attempts to regulate the property-like “thing-ness” of speech as presumptively invalid—unless the regulation is itself directed to defining or advancing a property or property-like claim.

This claim extends a common cyberlaw theme, namely, the rhetorical equivalence of information and property, once recorded as “code is law,” now elevated beyond rhetoric to constitutional status. In a metaphorically technological sense, law is code, bereft of frameworks defined by humanism and justice. Understandings and definitions of what counts as “speech” for constitutional purposes have been overrun by a legal cousin of the market-oriented neoliberalism that characterizes much of the modern information economy. Speech doctrine under the First Amendment is turning from a bulwark against harmful incursions of legal power and privilege in a just society into their avatar. More than 15 years ago, Cohen cautioned us about the implications of the technologies of the information society for distributive justice, in Lochner in Cyberspace.1 Her baleful predictions regarding the enduring roles of power and privilege have, in The Zombie First Amendment, come to fruition. What was emergent then in digital information technologies is being realized in decisions of the Supreme Court.

The second part of this paper turns from that conceptual framing to doctrinal and practical payoffs, which are presented partly in terms of developments in intellectual property law, particularly the regulation of expressive corporate speech via trademark and copyright doctrine, and partly in terms of information law, particularly the regulation of secret information (especially state secrets) and the regulation of commercial data processors (both longstanding credit reporting companies and also search companies such as Google, social network platforms such as Facebook, and data brokers such as Acxiom).

The technological practices and business models of all of these firms depend on extensive access to fine-grained forms of personal information. Legally protecting both the processes that do the work and the products they produce requires, as Cohen points out, a conceptual framework that identifies these “biopolitical” resources as freely “available to be commodified” (P. 20.) Those resources and practices together form an emerging norm of information access and production that consolidates and is consolidated by the “zombification” of the First Amendment that Cohen describes in the first half of the paper, with corresponding implications for power and privilege. The zombie First Amendment has lost the capability that speech doctrine once had to identify speech-related harms and to recognize true speech-related beneficiaries. Law is code, in the popular (and problematic) sense that consumers and citizens en masse are subject to and helpless before an unthinking technology.

Is Cohen right? The questions she is asking are surely questions that need to be asked; the patterns that she has identified are surely patterns to document and critique. The paper does not make a real effort to frame a path forward (and does not claim to have tried), aside from pointing to underlying resource access and allocation problems. That’s an important start. The zombie metaphor points further. Describing eventual solutions in terms of access and allocation means killing zombies by depriving them of food—that is, more brains. But everyone knows that the only truly effective way to kill a zombie is to destroy its brain. Cutting off the food supply is a half-step.

The brain of the zombie First Amendment appears to be the uncritical “thingification” of speech itself, an emerging pattern, exemplified by the work of Henry Smith,2 that has an uncredited cameo role here. Cohen’s paper calls to mind the work of another Cohen, Felix, who identified almost precisely the same problem, in almost precisely the same terms, 80 years ago.3 Felix Cohen, it turns out, was ahead of his time, or a George Romero of the legal system, if you will: the creative and intellectual source of a widespread and hugely important 21st century phenomenon. (Romero, of course, made the first true zombie movie, Night of the Living Dead, in 1968. This review’s title is a classic quotation from that film.) The physicality of the forms and practices of industrial production during the 20th century masked the true nature and implications of his claim. The rise of cyberspace, and the assumption that “information” and “code” are almost purely intangible and virtual, have pressed Felix Cohen’s dormant “thingification” idea into urgent and important service in law and in commercial and cultural practice. Can that idea be destroyed? I don’t know. But Julie Cohen’s paper is an important part of a much-needed revival of critical examination of the “thingification” pattern, and of how to go about resisting it.



  1. Julie E. Cohen, Lochner in Cyberspace: The New Economic Orthodoxy of “Rights Management,” 97 Mich. L. Rev. 462 (1998).
  2. E.g., Henry E. Smith, Property as the Law of Things, 125 Harv. L. Rev. 1691 (2012).
  3. Felix Cohen, Transcendental Nonsense and the Functional Approach, 35 Colum. L. Rev. 809 (1935).
Cite as: Michael Madison, “They’re Coming to Get You, Barbara.”, JOTWELL (January 12, 2015) (reviewing Julie Cohen, The Zombie First Amendment, 56 Wm. & Mary L. Rev. __ (forthcoming 2015), available at SSRN), http://cyber.jotwell.com/theyre-coming-to-get-you-barbara/.
 
 

Digital Behavioral Advertising – Why Worry?

Ryan Calo, Digital Market Manipulation, 82 Geo. Wash. L. Rev. 995 (2014).

Alongside the explosive growth of the internet, digital marketing is also growing aggressively. According to some projections, it might even surpass TV-based advertising in the coming decades. One of its most prominent and controversial features is commonly referred to as “behavioral advertising”: the tailoring of advertisements to specific users at a specific time, on the basis of previously collected personal information about those users’ online activities.

Behavioral advertising is creating a substantial buzz in the press. It is therefore no surprise that the issue is also generating a vibrant discussion in the legal and policy realm. Addressing it properly is a serious intellectual challenge. Behavioral advertising generates an uneasy feeling (some might find it “creepy”). Yet it is not necessarily simple to figure out why. Consumers have dealt with marketers—some of them quite aggressive—since the dawn of time. Existing mechanisms, which incorporate a delicate mix of market forces, reputation concerns, and, in extreme cases, regulatory action, have produced an acceptable status quo. Recently, this status quo has apparently been breached. The challenge academics and policy makers face is explaining why and how. In his recent article, Ryan Calo tackles this challenge directly, and sets forth important answers. His insights will enhance the policy debates about the regulation of behavioral marketing, and push them in the right direction.

One popular way to view the policy concerns raised by behavioral marketing is through the conceptual lens of “information privacy.” To some extent, this is the easy way out of the policy conundrum created by behavioral advertising, because the practice requires the collection of vast amounts of personal information regarding the traits and preferences of specific users. Prof. Calo notes this concern, but is wise not to rely upon it exclusively. While the privacy-based argument is simple and intuitive, it is also fragile. Marketers commonly argue that individuals’ privacy is not compromised because the personal information collected is not “identified” or “identifiable” and cannot be easily linked back to a specific individual. While this latter claim could certainly be rebutted, let us set it aside for now while observing that supplementing this basic privacy argument with an additional analytical framework surely cannot hurt.

Another key conceptual framework for understanding the problematic aspects of these new forms of marketing, Calo explains, is to recognize that they are manipulative. These marketing measures influence consumers to make bad decisions (at least from the consumers’ perspective), even though they do not, technically, include inaccurate information. Moving an argument based on the harms of manipulation from this colloquial level to a genuinely analytical one requires work—and this is where the article’s central contribution can be found.

To explain why manipulation is relevant here, Calo describes how advances in technology and the social sciences empower digital marketing firms and allow them to exert influence over consumers. They do so by identifying the moments when consumers are most vulnerable and when their consumption-related decisions will (most likely) prove irrational.

Arguing that online digital marketing is manipulative to an excessive degree that calls for regulatory intervention is an uphill battle. On its face (and applying Marshall McLuhan’s controversial paradigms for “understanding media”), this “cool” medium’s ability to substantially impact the human psyche is quite limited. With the possible exception of games, the forms of engagement brought about by new media are no match for TV’s immersive force. TV engages our senses and draws us “in” entirely. It is an ideal platform for suggestive advertising. And advertising indeed flourishes on the TV screen, with the general understanding that it is an acceptable price to pay for the so-called free content provided. How can digital media top this powerful set of effects, and create the unacceptable manipulative outcome noted?

The reason new media indeed generates these risks, according to Calo, is that it is both digitized and mediated. These two elements arguably provide marketers with an edge they did not have before. Technology provides the elasticity to shape and time every interaction with every specific user. Furthermore, it provides for the collection and analysis of personal data to optimize the firm’s interaction with consumers.

Evaluating Calo’s foundational claim calls for future empirical work, of course. Whether digital media indeed enable such extensive and successful manipulation is an open empirical question that social scientists must now explore. Answering it will take time. While it is possible that at first users will be easily influenced by these marketing measures, they might prove resilient to them in the long run, or even develop countermeasures as they adapt to new media.

Yet beyond establishing whether digital marketing can prove influential and effective, the policy argument for regulating behavioral marketing must also explain why the possible manipulative steps might prove problematic. Calo provides several analytical responses to this crucial question. I found the most interesting and promising argument to be one that focuses on “autonomy.” Here, Calo defines autonomy as the absence of vulnerability, or the ability to act in one’s self-interest. With this definition in hand, it is relatively simple to explain what the firms engaged in behavioral advertising are doing wrong. These marketers identify the individual’s vulnerabilities and act upon them. By doing so, they limit consumers’ ability to exercise autonomy, thus leading to the unacceptable outcomes regulators must counter.

This argument, however, is trickier than it might seem. As with many other autonomy-based claims, it calls for properly explaining what it means to act in one’s “self-interest.” In other words, how can we know for sure that the individual accepting the marketer’s proposal is exhaustedly succumbing to the marketer’s pitches, rather than acting in her own self-interest? To complicate things, asking individuals after the fact whether they are happy with their purchases might not prove helpful. To bridge the cognitive dissonance they find themselves in after making the suggested purchase, consumers might successfully convince themselves that the purchase they made following the tailored digital pitch is the best thing they have done (or else, why did they carry it out?).

One way to explain what self-interest actually means in this context (and how it is linked to “autonomy”) is to break the notion in two, by introducing the concepts of first- and second-order preferences. In fact, the individual’s self-interest is built up in several layers. The individual might have a set of long-term life goals, such as health, education, and saving. These are her second-order preferences, which she strives to act upon and align her first-order preferences with—the current preferences according to which she operates daily. Digital advertising allows marketers to influence consumers to make decisions that coincide with their momentary first-order preferences, but not necessarily their longer-term second-order ones. Thus, the autonomy-based argument here can be further sharpened by noting that “self-interest” actually means the ability to exercise second-order preferences. Or, the autonomy-based concern might support the argument that the state must intervene when second-order preferences cannot be easily exercised.

Beyond unlocking the notion of autonomy, a discussion of manipulation calls for evaluating the ever-challenging “the market will fix it” argument. Marketing firms indeed act in their own interests, which are often misaligned with those of the consumer. But that is the nature of business. The reason the market nonetheless usually functions, and does not simply move consumer surplus into the pockets of vendors, is of course competition. Therefore, a convincing argument about the problematic manipulative aspects of behavioral marketing must explain why market forces do not offset the various manipulative ploys firms subject their users to. For instance, consider a firm trying to influence a consumer to abandon her second-order preference for health and buy that extra cookie she really should not eat. Given the nature of the technology, it is at least arguable that at the very same time a firm promoting low-sugar cookies (or perhaps the consumer’s own health insurance plan or employer) will intervene with a message of its own, prompting the consumer to reconsider and revert to her second-order preferences.

The manipulation-based argument therefore must also convincingly show that competition will not change things and that regulatory intervention is therefore needed. This might be so for several reasons. The relevant market might lack competition, or perhaps there are instances in which those who are supposed to provide the countering response (strengthening consumers’ second-order preferences) are too weak and unsophisticated. In other instances, perhaps all the commercial firms share the problematic objective of influencing consumers in the same direction (for instance, toward greater consumption), and therefore no countering voice will arise. The analysis will vary from one context to another and must therefore be further explored. For instance, Oren Bar-Gill explains convincingly why market forces will not mitigate firms’ incentives to influence consumers to act irrationally in the credit and cell-phone markets. It is not clear that this argument could be made with similar success for other retail markets.

In conclusion, readers worried about behavioral marketing should read Ryan Calo’s article to better understand the analytical foundations of their hunch that these practices are problematic. Those who are not worried should probably read the article as well. It might change their minds.

Cite as: Tal Zarsky, Digital Behavioral Advertising – Why Worry?, JOTWELL (December 3, 2014) (reviewing Ryan Calo, Digital Market Manipulation, 82 Geo. Wash. L. Rev. 995 (2014)), http://cyber.jotwell.com/digital-behavioral-advertising-why-worry/.
 
 

Discrimination by Database

Solon Barocas & Andrew D. Selbst, Big Data's Disparate Impact, available at SSRN (2014).

I have previously written about an NYU School of Internet scholars, led by the philosopher Helen Nissenbaum, whose work is “philosophically careful, intellectually critical, rich in detail, and humanely empathetic.” There is also a Princeton School, which orbits around the computer scientist Ed Felten, and which is committed to technical rigor, clear exposition, social impact, and creative problem-solving. These traditions converge in Big Data’s Disparate Impact by Solon Barocas and Andrew Selbst. The article is an attempt to map Title VII employment discrimination doctrine onto data mining, and it is one of the most interesting discussions of algorithmic prediction I have read.

The pairing—anti-discrimination law and data mining—is ideal. Both are centrally concerned with how overall patterns emerge from individual acts; they shift back and forth between the micro and the macro, the stones and the mosaic. Both, moreover, are centrally concerned with making good decisions: each in its own way aspires to replace crude stereotypes with nuanced reason. It would seem, then, that Big Data ought to be an ideal ally in Title VII’s anti-discrimination mission. But Barocas and Selbst give reasons to think that the opposite may be true: that data mining will introduce new forms of bias that Title VII is ill-equipped to remedy.

In any interesting decision problem, there is a gap between the evidence available to a decision-maker and her goals. A recruiter would like to avoid hiring candidates who will stab customers, but the candidates who end up stabbing customers never seem to list that fact on their resumes. Thus, the decision will be mediated through a rule: a prediction about how the observable evidence correlates with a goal. In this context, then, data mining is a discipline of using large datasets, sophisticated statistics, and raw computational power to formulate better, more predictive rules.

The resulting rules are at once intensely automated and intensely human. On the one hand, data mining algorithms can discover surprising rules that humans would not have thought to look for, or complicated rules that humans would not have been able to formulate. In this sense, the algorithmic turn allows the use of rules that really are supported by the data, rather than the biased rules we flawed humans would think to try.

At the same time, as Barocas and Selbst deftly show, data mining requires human craftwork at every step. Humans pick the datasets to use, and they massage that data to make it usable for the learning algorithms (e.g., by imputing ZIP codes for customers who haven’t listed them). Humans do the same thing on the other end, both approximating and constructing the characteristics they wish to select for. To learn who is a good employee, an algorithm needs to train on a dataset in which a human has flagged employees as “good” or “bad,” but that flagging process in a very real sense defines what it means to be a “good” employee. In the gap between evidence and goals, humans specify the set of possible rules the algorithm will choose among, and the algorithm that will choose among them.
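To make that last point concrete, here is a minimal sketch of the labeling-defines-the-rule dynamic, in Python with entirely hypothetical data, feature names, and labels (none of it drawn from Barocas and Selbst’s article):

```python
# A toy "learner": for each observed feature value, record the majority
# label that human reviewers assigned to it. The point is not the
# algorithm's sophistication but that the rule it produces can only ever
# reproduce the labelers' judgments about who counts as "good" or "bad."
from collections import defaultdict

# Hypothetical training set: one feature plus a human-assigned label.
# Here the (imaginary) labeler penalized resume gaps.
training = [
    ({"gap_in_resume": True}, "bad"),
    ({"gap_in_resume": True}, "bad"),
    ({"gap_in_resume": False}, "good"),
    ({"gap_in_resume": False}, "good"),
]

def learn_rule(examples):
    """For each (feature, value) pair, return the majority label it received."""
    counts = defaultdict(lambda: defaultdict(int))
    for features, label in examples:
        for key, value in features.items():
            counts[(key, value)][label] += 1
    return {fv: max(labels, key=labels.get) for fv, labels in counts.items()}

rule = learn_rule(training)
print(rule[("gap_in_resume", True)])   # -> 'bad'
```

Nothing in the learning step is biased on its own; the rule simply generalizes whatever judgments the human labeler baked into the training data.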

Barocas and Selbst circle over this ground three times, each time at a higher level of abstraction: technical, doctrinal, prescriptive. On the first pass, they survey the ways that invidious biases can enter into the automated algorithmic judgments. On the second, they show that Title VII doctrine often fails to catch these biases, even when they would result in serious and unjustified mistreatment. And on the third, they show that it will not be easy to patch Title VII—that the challenges of Big Data go to the heart of the American project of equality.

Injecting algorithms into what was formerly a human decision-making process can undermine accountability by diffusing responsibility. For one thing, the data intensity of data mining makes it easier for bad actors to hide their fingerprints. Take the deeply uncool process of collecting, cleaning, and merging datasets to prepare them for mining. If a data broker redlines a tenant database that is then used as an input to an employment-applicant screening algorithm, the resulting hiring decisions will in a very real sense be racially motivated, but it will be almost impossible for anyone to reconstruct why. Proof problems abound in the land of Big Data, and Big Data’s Disparate Impact is replete with examples. Ring of Gyges, anyone?

It gets worse. Big Data optimists have argued that employers and other decision-makers rely on race as a crude proxy for the characteristics they really care about, so that with better data they will be less racist. Perhaps. But if Bert is a proxy for Ernie, then Ernie can also be a proxy for Bert. In a world where everything predicts everything else, as Paul Ohm has half-jokingly hypothesized, a data-mining algorithm does not need direct access to forbidden criteria like religion or race to make decisions on the basis of them. Indeed, it can find far subtler ones than humans are capable of: perhaps birth year plus car color plus favorite potato chip brand equals national origin. Put another way, data mining can be just as efficient at optimizing discrimination as at avoiding it.
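The proxy point can be illustrated just as simply. The sketch below (again Python, again deliberately contrived, hypothetical data) drops the protected attribute from the inputs and shows a single innocuous-looking feature reconstructing it anyway:

```python
# Hypothetical records pairing an "innocent" feature (zip code) with a
# protected class. The two are correlated, as proxies are in real data.
records = [
    ("10001", "A"), ("10001", "A"), ("10001", "A"), ("10001", "B"),
    ("20002", "B"), ("20002", "B"), ("20002", "B"), ("20002", "A"),
]

def infer_class(zip_code, data):
    """Guess the protected class as the majority class within that zip code."""
    labels = [c for z, c in data if z == zip_code]
    return max(set(labels), key=labels.count)

# Even with the protected class removed from the model's inputs, zip code
# alone recovers it for 6 of 8 records here (75% versus 50% by chance).
hits = sum(infer_class(z, records) == c for z, c in records)
print(f"{hits}/{len(records)} recovered from zip code alone")
```

A rule keyed to the proxy can therefore discriminate about as effectively as one keyed to the forbidden attribute itself, which is exactly why scrubbing protected columns from a dataset settles so little.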

Moreover, on closer inspection, almost every interesting dataset is tainted by the effects of past discrimination. In a classic example, St. George’s Hospital trained an algorithm to replicate its admissions staff’s evaluations of medical-school applicants with 90% to 95% fidelity. Unfortunately, the staff’s past decisions had been racist and sexist, so “the program was not introducing new bias but merely reflecting that already in the system.” That last phrase should be alarming to anyone who has worried about the divide between disparate treatment and disparate impact. “In certain contexts, data miners will never be able to disentangle legitimate and proscribed criteria,” Barocas and Selbst write, because the “legitimate” criteria redundantly encode the “proscribed” ones. But if “the computer did it,” and these patterns seem to emerge from the data as if by magic, Title VII has a hard time explaining who, if anyone, has done something culpably wrong in relying on them.

In other words, as Barocas and Selbst observe, data mining brings civil rights law face to face with the unresolved tension between its nondiscrimination and antisubordination missions. On the one hand, individual acts of invidious discrimination dissolve into the dataset; on the other, the dataset itself is permeated by past discrimination. This would be a familiar enough observation about the limits of strictly race-neutral analysis in a world of self-perpetuating patterns of exclusion, but the algorithmic angle makes it new and urgent. Algorithms are not neutral; they make fraught decisions for complex reasons. In all of this, perhaps, Big Data is surprisingly human.

Cite as: James Grimmelmann, Discrimination by Database, JOTWELL (November 4, 2014) (reviewing Solon Barocas & Andrew D. Selbst, Big Data's Disparate Impact, available at SSRN (2014)), http://cyber.jotwell.com/discrimination-by-database/.
 
 

About Fallacies

Neil M. Richards & William Smart, How Should the Law Think About Robots? (2013), available at SSRN.

The article seems dated for a review here. There are newer works on the subject, such as Ryan Calo’s “Robotics and the Lessons of Cyberlaw” (2014). But the Richards & Smart article sticks in my mind. Maybe because, while both articles are premature (I will come to that immediately), this one makes a—or better, the—fundamental point about law and politics in the face of changing technologies in a very simple and clear way.

“Premature” used to be the comment we would receive from the European Commission when we, in the heyday of European cyber regulation, as members of the Legal Advisory Board, an independent expert group since abolished, would suggest a new initiative outside the Commission’s own agenda. Some readers may have encountered this word when presenting new ideas as legal counsel. I have never taken it as a derogatory term. “Premature” signifies a quality, if not an obligation, of proactive legal comment and advice. In that sense, dealing with robotics and law is premature, and so are, by the way, the “We Robot” conferences (established in 2012) which give context to this article, a conference series in which—disclosure is due—our Editor-in-Chief has been involved prominently.

The fundamental point is slow in coming: Richards & Smart start with a definition of a robot: a “non-biological autonomous agent,” i.e., “a constructed system that displays both physical and mental agency but is not alive in the biological sense.” We are all familiar, as the authors point out, with all sorts of robots. We know them from science fiction readings and the movies. There is already the small round disk that cleans our sitting rooms. There has been the automated assembly of cars by industrial robots. And lately these cars drive themselves around as robots guided by Google. And robots, the authors argue, will become increasingly multipurpose, gain more autonomy, and turn from lab exhibits into everyday devices communicating with each of us at any time. Law? There is a reference to the Nevada state regulation of 2011 for those car robots. But otherwise the article mentions legal implications only in a very general way; there is no discussion; there is not even a listing of possible legal problems.

And yet, it is exactly this lack that makes the article so special and brings us to that central point. The authors make a notable, an important pause. Before going into the legal details, they insist, we should be aware of how law and society deal with technology in general, and they take Cyberlaw as the example of what may happen to robotics and law: essential for technology law is the way in which law perceives technology. It does so by analogy to a metaphor already in use, in order to relate the “new” to something law already knows. The example Richards & Smart present from Cyberlaw is the evolving interpretation of the Fourth Amendment with regard to wiretapping: the metaphor chosen determines the political and legal path the issues will take.

While the importance of metaphor is not new to discussions about law, and about Cyberlaw in particular (see, for example, Julie Cohen’s analysis of cyberspace as space in a 2007 article, 107 Colum. L. Rev. 210), the authors consciously register the moment of the critical turn before it is taken. Heed the warning, they say: “Beware of Metaphors.” They exemplify their premonitions about the way in which politics and law may perceive robots with what they call the “Android Fallacy”: the more robots look and seem to behave like human beings, the more inclined we might be to ascribe free will to them, and the more responsibility will be taken off the shoulders of their designers.

In essence, what this article is asking us—and this may be the real reason why it sticks in my mind—is this: to what fallacies of Cyberlaw have we contributed with our writings, making way for what kinds of legal policies, legislation, and jurisprudence, and with what kinds of consequences, even when we were acting with proactive intent? Shouldn’t we have allowed more time to discuss the implications of our metaphors before surfing with the technological tide?

(Michael Froomkin took no part in the editing of this essay.)

Cite as: Herbert Burkert, About Fallacies, JOTWELL (October 3, 2014) (reviewing Neil M. Richards & William Smart, How Should the Law Think About Robots? (2013), available at SSRN), http://cyber.jotwell.com/about-fallacies/.
 
 

From Google to Tolstoy Bot: Should the First Amendment Protect Speech Generated by Algorithms?

Stuart Minor Benjamin, Algorithms and Speech, 161 U. Pa. L. Rev. 1445 (2013), available at SSRN.

Information, increasingly, is everywhere. Machines gather information, process it, and automatically communicate it, often in terms humans understand. Bots tweet on Twitter; Fitbits communicate a user’s activity record; Project Tango devices render 3D maps; and IBM’s Watson can now argue. With algorithms increasingly writing, drawing, and even debating, a central question for regulators, courts, and scholars is to what extent the First Amendment protects speech generated by algorithms. If algorithmic communication falls within First Amendment coverage, regulators will have a more difficult time governing it. But if it does not, courts will need to explain how the exclusion can sit comfortably with First Amendment theory and current doctrine.

Stuart Minor Benjamin positions the puzzle of algorithmic speech as part of a larger project in understanding First Amendment jurisprudence and its expansion and contraction. In previous work, Benjamin has asked how hard it would be to expand First Amendment coverage; in Algorithms and Speech, he asks how hard it would be to narrow the existing jurisprudence to exclude a practice that would otherwise be covered. Benjamin recognizes the potential regulatory consequences of First Amendment coverage of algorithmic speech. But he surveys Supreme Court caselaw and concludes that there is no principled way to exclude many algorithmic communications from speech protection without excluding much other communication that we deem squarely within the First Amendment’s coverage.

Algorithms and Speech does valuable work in laying out the current state of expansive First Amendment doctrine, and in identifying the Supreme Court’s reluctance to create new exceptions. Benjamin also clarifies that the coverage of algorithmic speech is not just a matter of making analogies to earlier media. Search engine results may be like editorial decisions, as Eugene Volokh and Donald Falk’s 2012 white paper claims, but Benjamin is intent on finding an underlying reason why both are covered that goes beyond structural similarities.

The touchstone of First Amendment coverage, according to Benjamin, is the sending of a substantive message. Benjamin points out that when a sendable and receivable message has actually been sent, the Supreme Court has never found that message to be outside First Amendment coverage (a point that is historically untrue, but correct within current jurisprudence). Excluding algorithmically generated speech would upend existing First Amendment caselaw, requiring either the drawing of arbitrary lines or the exclusion of much of what is currently considered to be speech. Benjamin explains that the arguments against covering algorithmic speech, such as distinguishing it as corporate speech or commercial speech, would also leave core First Amendment institutions such as newspapers unprotected. What is most important to Benjamin is consistency, and the article is admirable in trying to craft rules that apply equally to all.

Benjamin carves out several important limitations. First, an algorithm that does not communicate a substantive message will not be protected. Second, because Benjamin hinges First Amendment protection on the communication of a message (but interestingly, not its receipt), some companies may have to indicate that they are editors. Third, as is the case with newspapers, laws of general applicability such as labor laws, tax laws, and most antitrust laws can apply to algorithmic speakers with no First Amendment ramifications. The government just can’t ban or compel substantive communication.

Benjamin makes a convincing case for the protection of search engine results under current First Amendment doctrine. The recent SDNY decision in Zhang v. Baidu, where a district court judge found First Amendment protection for Baidu’s search results, shows that judges are likely to agree. The article is also painstakingly honest in trying to maintain the cohesiveness of the Court’s First Amendment reasoning. But by positioning the question of algorithmic speech within current jurisprudence and around the model of search engine results, Benjamin limits the scope of the article in several ways. Algorithms and Speech does important work and will likely be a foundation that others will build on, but it leaves several more difficult questions for another day.

Benjamin steers away from reasoning from First Amendment theory. He instead navigates the “guideposts” of Supreme Court jurisprudence, accepting Cass Sunstein’s assertion that the First Amendment in practice is incompletely theorized. This leads to an unstated bias towards the jurisprudential status quo. It’s not clear precisely why preserving the guideposts of current jurisprudence is the right approach. The primary explanation offered is that disturbing the status quo of jurisprudence will threaten other media that more clearly rest at the heart of First Amendment interests.

We may soon be at a stage, however, where upending existing First Amendment caselaw is to some extent inevitable. If algorithmic speech does get full First Amendment protection, how will the intent requirements in many of the categories of unprotected speech get implemented? And if algorithmic speech does not get protection, how will we distinguish that content from human speech without threatening many of the values underlying the First Amendment? Algorithms and Speech takes a fascinating first step, while in its caselaw-driven approach leaving a number of important questions on the near horizon.

The second way in which the article is less daring than it could be stems from the model Benjamin chooses for algorithmic speech. The search engine model—of an algorithm running according to its programmers’ general intent—runs the article into limitations right where the questions get most interesting. Benjamin’s touchstone for First Amendment coverage, based on the Spence test, is the intentional communication of a substantive message by a human being. When the algorithm is no longer a tool for its user, but an artificial intelligence, Benjamin suggests the connection to a human speaker might become sufficiently attenuated that First Amendment coverage might no longer be appropriate. The problem is, as Bruce Boyden has pointed out, that the line between tool and independent message generator is exceedingly difficult to draw.

The flurry of scholarship around algorithmic speech shows the variety of ways a First Amendment problem can be framed: with a focus on the speaker, the message, the medium, or the listener. Benjamin’s focus is in large part on an intentional speaker. The recipient of a message matters to Benjamin to the extent that the recipient can identify the speaker as communicating a message, but the intentional speaker is for Benjamin the core of what turns information into First-Amendment-relevant speech.

Robert Post has taken a different approach to First Amendment coverage, explaining that First Amendment protection extends to a medium of communication because of its status as a social practice (both Tim Wu and Andrew Tutt have applied this to search engines). The question is whether certain kinds of algorithmic speech are sufficiently like protected media or have acquired enough of their own cultural meaning to deserve protection. This is not a question courts usually want to explicitly ponder.

A third way to frame First Amendment interests is to talk about the message itself, and whether its contents reflect high or low value speech. But as Benjamin points out, the Supreme Court has repeatedly rejected this message-oriented approach, at least with respect to speech made by an intentional speaker.

A fourth way to frame First Amendment interests is to look at the broader communications environment, including the listeners, readers, and receivers of communication. And this approach highlights a problem with both Benjamin’s and Boyden’s potential exclusion of algorithmic speech once a human is no longer involved in the selection of a message. Even an autonomous Tolstoy Bot would still be creating works that appear to readers to be identical to speech by humans. At that future phase of algorithmic creativity, censorship of Tolstoy Bot would affect readers the same way as censorship of Tolstoy: they would have access to less information, and would perceive the government’s actions as censorship. Under multiple theories of the First Amendment, this kind of censorship would raise problems—even if there is no human meaningfully involved in the crafting of the substantive message listeners could receive.

The article’s focus on a speaker’s agency as the touchstone of First Amendment protection thus may in practice prove to be both overinclusive—counting actions as speech merely because a speaker claims to have a substantive message—and underinclusive—rejecting speech that clearly looks like speech to a reader. The speaker’s agency approach may be most consistent with current jurisprudence, but the more difficult question for a future article is whether it is the right approach to future technologies, and why.

Cite as: Margot Kaminski, From Google to Tolstoy Bot: Should the First Amendment Protect Speech Generated by Algorithms?, JOTWELL (September 2, 2014) (reviewing Stuart Minor Benjamin, Algorithms and Speech, 161 U. Pa. L. Rev. 1445 (2013), available at SSRN), http://cyber.jotwell.com/from-google-to-tolstoy-bot-should-the-first-amendment-protect-speech-generated-by-algorithms/.
 
 

Don’t Restrict My E-book

Angela Daly, E-Book Monopolies and the Law, 18 Media & Arts L. Rev. 350 (2013), available at SSRN.

It’s still fashionable to point to the “cloud” as the solution to all sorts of problems in technology. But can such a shift disturb the carefully worked out compromises between different interests, which are embedded in legislation on topics such as competition and intellectual property?

Angela Daly, a research fellow at the Swinburne Institute of Technology (Australia) who is also about to complete a doctorate at the European University Institute (Florence, Italy), suggests that these clouds may bring little but rain. In her article, “E-book monopolies and the law”, published in the consistently topical Melbourne-based Media and Arts Law Review, she considers two particular features of e-book platforms and content: digital rights management, and competition.

The significance of these sectors is apparent. Daly sets out the successes of various players (notably Amazon and its Kindle and Kindle Store), even in markets they have not officially entered, like Australia. Quickly, however, the problem emerges: depending on how you define the relevant market, the user can find themselves faced with high ‘switching costs’ and with limited opportunities to take advantage of all that the digital world appears to promise.

The problems of the legislative protection of DRM are well known. While restating the key criticisms (including the replacement of flexible rules with unbalanced stipulations or complex processes and presumptions), Daly also brings to wider attention the particular difficulties encountered in Australia, where the result of trade deals has been the worst of both worlds—greater US-style protection for rightsholders without even the safeguards of the US system. The key argument in this section, though, is about exhaustion (taking in the landmark European decision in the software case of UsedSoft v Oracle and the very different US decision in Capitol Records v ReDigi). Daly argues that the potential for exhaustion to protect competition and consumer interests in the e-book world is being stifled by some decisions, by the whittling away of exemptions for temporary copies, and above all by the move from “goods” to “services” and from “sales” to “licenses”.

One consequence of poorly designed or implemented DRM legislation is the locking in of users to a particular platform. In the second main part of the paper, Daly develops a theme that runs across much of her work—the combination of potential harms to competition (especially for the ‘normal’ user who is not accustomed to complex workarounds) and wider harms to the public interest (e.g., censorship by powerful platform operators). Reviewing the various stages of the US and European Union investigations into the agency models adopted by Apple and publishers, and of a class action against Amazon, Daly makes some particularly telling points about the wider problems of Apple’s approach to revenue sharing, and about the interaction between these models and DRM.

Daly’s proposed solution is an interesting one. She recognises that competition law brings something to the table, but although these cases and investigations may have the result of lowering prices, they do not address the underlying (detrimental) impact of the use of DRM as a tool to protect business models rather than creative works. As such, she calls for greater attention to fair use outside the US and a “fair deal for users” in trade negotiations. When so many questions are seen solely in competition terms, this more subtle approach is particularly welcome.

Since this wide-ranging article was published, the competition matters have rolled on. Apple and major publishers have settled with some of their opponents, and Amazon prevailed in a case brought by bookstores, although Amazon and the publisher Hachette are now caught up in a significant dispute about e-book prices. Disclosures regarding the activities of national security services may have shaken some user confidence in the shift to the cloud—but the reliance on DRM in the e-book sector remains clear. Daly’s deft handling of the range of issues in this sector should be of particular interest to the many committees and bodies investigating ‘copyright reform’ around the world.

Cite as: Daithí Mac Síthigh, Don’t Restrict My E-book, JOTWELL (July 21, 2014) (reviewing Angela Daly, E-Book Monopolies and the Law, 18 Media & Arts L. Rev. 350 (2013), available at SSRN), http://cyber.jotwell.com/dont-restrict-my-e-book/.
 
 

Free for the Taking (or Why Libertarians are Wrong about Markets for Privacy)

Have you heard any of these arguments lately? Consumers willingly pay for the wonderful free services they enjoy using the currency of their personal information. We can’t trust surveys that say that consumers despise commercial tracking practices, because the revealed preferences of consumers demonstrate that they are willing to tolerate tracking in return for free social networking services, email, and mobile apps. If privacy law X were implemented, it would kill the free Internet (or more immodestly, the Internet).

Two recent articles take on all of these arguments and more in the context of the privacy of information collected online by private corporations. The articles are similarly entitled (before their subtitle colons), Free and Free Fall. Both are written by excellent interdisciplinary scholars, Free by Chris Hoofnagle and Jan Whittington and Free Fall by Kathy Strandburg. These articles, individually but even more taken together, present a thorough, forceful, and compelling rebuttal to pervasive libertarian paeans to the supposed well-functioning online market for personal information.

Libertarian arguments hold great sway among policymakers, particularly in the United States. Libertarian think tank types exhort policymakers to respect the unprecedented efficiencies of the well-functioning market for personal information, which has created and supports today’s vibrant Internet. For many years, privacy law scholars (myself included)—the vast majority of whom do not subscribe to these libertarian beliefs—have treated these arguments as mere distractions but have not focused much attention on responding to them. This is no surprise, as the tools of economics seem not by themselves up to the task of describing the problems we have documented. Yet this inattention has taken a toll, as it has strengthened, by want of opposition, the libertarian critique of many proposed privacy regulations, which some policymakers have begun to parrot. While legal scholars have done little to respond, these arguments have been rebutted, capably but only incompletely, by scholars from outside the field, like economist Alessandro Acquisti and engineers and computer scientists Jens Grossklags, Lorrie Cranor, Aleecia McDonald, and Ed Felten. Before Free and Free Fall, however, we lacked a thorough and thoroughly economic rebuttal to the libertarian critique.

The core libertarian argument that drives the rebuttal in Free and Free Fall is that people “pay” for free, online services with their data. No, they don’t, at least not if “payment” is supposed to represent an accurate measure of consumer preference, and definitely not if “payment” means that consumers rationally give up data about themselves in exchange for free services. Both Free and Free Fall carefully marshal arguments for why ordinary economic conceptions of payment and cost and demand and preference do not hold in the “market” for data. The two articles use different economic methodologies—Free relies on the framework of “transaction cost economics” (TCE) and Free Fall speaks in the more traditional language of market failure. But both articles describe in detail the great risks of harm people expose themselves to by allowing companies to collect so much personal information, from identity theft to insecurity to self-censorship to humiliation to unwanted association. Consumers do not “pay” with information, because they do not understand the true costs of allowing their data to be snatched.

But, the libertarians might respond, consumers expose themselves to risk of harm in commercial transactions in other contexts, and in those cases we still consider payment in the face of risk to be an accurate measure of consumer preference. To respond, Free and Free Fall document the many reasons consumers find it impossible to account for the risk of harm from online data collection: the utter lack of transparency into corporate data practices creates insuperable information asymmetries; well-documented network effects give rise to lock-in and other barriers to competitive entry; and bounded rationality prevents consumers from accurately assessing risk. Worst of all, unlike in some consumer transactions, all of these barriers persist even after the commercial transaction takes place, leading Strandburg to compare them to “credence goods” like medical treatments and legal services, which tend to be “natural subjects of regulation.” She concludes “[c]onsumers are doing what amounts to closing their eyes and taking an unknown risk in exchange for a presently salient benefit.”

Of course, many other privacy articles and books have recounted the risks of privacy harm from commercial tracking, but these two articles work a subtle but powerful reframing of how we should account for this harm. Until now, the libertarians and the policymakers they have persuaded have found it easy to discount discussions of privacy harm as separate from and outweighed by the great and unmitigated benefits of economic efficiency and growth found on the other side of the scale. A little identity theft is a small price to pay for free Facebook and Gmail, they have argued. Free and Free Fall explain how these privacy harms themselves work on the “benefits” side of the scale, because they need to be accounted for as economic inefficiencies, which diminish the economic value, measured both individually and societally, of these online services. As Hoofnagle and Whittington’s Free puts it, “[t]he financial consequences of transactions that occur with the press of a button can be of such magnitude and lasting consequence that their implications for parties can easily dwarf those of typical purchases in our economy.”

In other words, the market for personal data is dysfunctional and distorted in ways that cause profound economic inefficiencies in the form of risk of privacy harm, inefficiencies that sensible privacy regulation can help correct. We need new privacy laws not despite what they might do to economic efficiency but because they will allow the market to produce even more economic efficiency.

Both articles also explain how these skewed market forces have been subtly re-architecting the Internet in societally harmful ways. Companies are being pushed to design data-extractive services in pursuit of corporate riches, even if consumers would prefer precisely the opposite. Hoofnagle and Whittington’s Free recounts Google’s history with the HTTP Referer header, in which the company has more than once intentionally rolled back or weakened pro-privacy, pro-security advances so as not to disrupt the expectations and profits of advertisers. Although the authors do not draw this particular connection, it is fair to say that some of the NSA’s worst privacy abuses have resulted directly from corporate decisions like these to place the desires of advertisers ahead of the wishes of users.
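For readers unfamiliar with the mechanism, a concrete illustration may help. The following minimal sketch, which is my own and not drawn from Free, shows what any third-party server can learn from the Referer header: by default, the browser volunteers the full URL of the page the visitor was viewing, query strings and all.

# A minimal sketch (not from the article) of the privacy stakes of the
# HTTP Referer header: a server that receives a request for any embedded
# resource can read the full URL of the page the visitor came from.
from http.server import BaseHTTPRequestHandler, HTTPServer

class RefererLogger(BaseHTTPRequestHandler):
    def do_GET(self):
        # Browsers send this header by default; it can expose search terms
        # and other sensitive fragments of the referring page's URL.
        referer = self.headers.get("Referer", "(none sent)")
        print(f"Visitor arrived from: {referer}")
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok")

if __name__ == "__main__":
    HTTPServer(("localhost", 8000), RefererLogger).serve_forever()

Any advertiser whose tracking pixel is embedded on a page runs, in effect, the receiving side of this sketch, which is why weakening Referer protections matters to the ad business.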

But it disserves these two articles to lump them together without highlighting a little of what each does that the other doesn’t. Hoofnagle and Whittington’s Free builds on the TCE work of Oliver Williamson and others to propose a rigorous and grounded methodology for taking account of all of an online transaction’s efficiencies. Strandburg’s Free Fall focuses closely on the development of the market for advertising, drawing on a rich and detailed history from economists and marketing experts outside the legal academy.

There is so much more to these long articles, but rather than describe more, I’ll simply urge those in the field to read both. It might be overselling things to say that these two articles have demolished the libertarian critique of privacy law. But they do administer a thorough and long overdue drubbing of some core libertarian arguments.

Cite as: Paul Ohm, Free for the Taking (or Why Libertarians are Wrong about Markets for Privacy), JOTWELL (May 26, 2014) (reviewing Katherine J. Strandburg, Free Fall: The Online Market's Consumer Preference Disconnect, NYU School of Law, Public Law Research Paper No. 13-62 (2013) and Chris Jay Hoofnagle & Jan Whittington, Free: Accounting for the Internet's Most Popular Price, 61 UCLA L. Rev. 606 (2014)), http://cyber.jotwell.com/free-for-the-taking-or-why-libertarians-are-wrong-about-markets-for-privacy/.
 
 

Good Fences Make Better Data Brokers

Woodrow Hartzog, Chain Link Confidentiality, 46 Georgia L. Rev. 657 (2012), available at SSRN.

Since at least the early 2000s, privacy scholars have illuminated a fatal flaw at the core of many “notice and consent” privacy protections: firms that obtain data for one use may share or sell it to data brokers, who then sell it on to others, ad infinitum. If one can’t easily prevent or monitor the sale of data, what sense does it make to carefully bargain for limits on its use by the original collector? The Federal Trade Commission and state authorities are now struggling with how to address the runaway data dilemma in the new digital landscape.  As they do so, they should carefully consider the insights of Professor Woody Hartzog. His article, Chain Link Confidentiality, offers a sine qua non for the modernization of fair data practices: certain obligations should follow personal information downstream.

After 2013, it is impossible to ignore the concerns of privacy activists. The Snowden revelations exposed untrammeled data collection by government. Jay Rockefeller’s Senate Commerce Committee portrayed an out-of-control data-gathering industry (whose handiwork can often be appropriated by government). America’s patchwork of weak privacy laws is no match for the threats posed by this runaway data, which is used secretly to rank, rate, and evaluate persons, often to their detriment and often unfairly. Without a society-wide commitment to fair data practices, a dark era of digital discrimination is a real and present danger.

As Hartzog notes, “current privacy laws are too limited, subjective, or vague to effectively police the ‘downstream’ use of information by third parties.” This is a glaring weakness in privacy law, since a given bit of data might be redisclosed dozens or even hundreds of times in new digital data markets. Hartzog’s approach would “use contracts to link recipients of personal information,” including in those contracts “(1) obligations and restrictions on the use of the disclosed information; (2) requirements to bind future recipients to the same obligations and restrictions; and (3) requirements to perpetuate the contractual chain.” Like the viral licensing envisioned by Creative Commons, the “chain link confidentiality” approach is designed to effect a system of data transmission that balances the flexibility of private ordering with the stability of public law.
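To make the mechanism concrete, here is a toy sketch, my own illustration rather than anything in Hartzog’s article, of a record whose terms travel with it: every transfer requires the recipient to accept the same obligations and the duty to impose them on the next recipient in turn, so the terms cannot be quietly dropped at any link in the chain.

# A toy sketch (my illustration, not Hartzog's) of chain-link confidentiality:
# obligations and provenance must accompany the data through every transfer.
from dataclasses import dataclass

@dataclass(frozen=True)
class ChainLinkedRecord:
    data: str
    obligations: tuple   # (1) restrictions on use of the disclosed information
    chain: tuple         # record of every party that has held the data

    def transfer(self, recipient: str, accepts_terms: bool) -> "ChainLinkedRecord":
        # (2) the recipient must be bound to the identical obligations, and
        # (3) must perpetuate the contractual chain for later transfers.
        if not accepts_terms:
            raise PermissionError("recipient must accept the chain-link terms")
        return ChainLinkedRecord(self.data, self.obligations,
                                 self.chain + (recipient,))

record = ChainLinkedRecord("subscriber profile",
                           ("confidentiality", "no secondary marketing use"),
                           ("original collector",))
downstream = record.transfer("data broker", accepts_terms=True)
print(downstream.obligations)  # identical terms survive the hop
print(downstream.chain)        # provenance of each "hop" is preserved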

Hartzog’s article highlights the importance of health privacy law for modeling new relationships of responsibility among data collectors, sellers, and subjects. As he observes, “The HIPAA Privacy Rules provide that, although only covered entities such as healthcare providers are bound to confidentiality, these entities may not disclose information to their business associates without executing a written contract that places the business associate under the same confidentiality requirements as the healthcare providers.” These protections have been strengthened even further by HITECH (and the HIPAA Omnibus Rule of 2013), which impose statutory and regulatory duties on business associates and even their downstream contractors. The health privacy protections essentially “run with the data.”

What property-like restrictions accomplish in the health data sphere, Hartzog wants to accomplish via contracts that would bind the recipients of data to terms like those imposed on the original collector. Not only would this help individuals get a handle on “runaway data”; it would also help promote the validity of research in the big data field by indicating the provenance of data. As Sharona Hoffman asked in her article “Big, Bad Data”: if we can’t tell the provenance of data, how can we adjust for or account for potential flaws or biases in it?

Firms and data brokers alike increasingly try to integrate thousands of sources of information into profiles. The profiles are actionable, whether inside or outside the firm in which they are compiled. Runaway data can lead to cascading disadvantages. Once one piece of software has classified a person as a bad credit risk, a bad worker, or a poor consumer, that attribute may appear with decision-making clout in other systems all over the economy. And runaway data can dilute or distort findings that are increasingly based on promiscuous correlations within unstructured data sets. Chain link confidentiality would impose some baseline of order and attribution on the new data economy.

Runaway data poses a stark choice to data policymakers. Given the number of data breaches extant, it’s only a matter of time before breachers start developing dark online markets for information more sensitive than credit card numbers. Are we going to allow datamongers essentially to act as “fences” for this stolen data? Or are we going to keep tabs on each “hop” of data from collector to broker to user and beyond—a bare predicate for keeping illicit or inaccurate data “fenced in”? Hartzog’s chain links point us decisively toward the latter choice—a far better future for data practices.

Cite as: Frank Pasquale, Good Fences Make Better Data Brokers, JOTWELL (April 25, 2014) (reviewing Woodrow Hartzog, Chain Link Confidentiality, 46 Georgia L. Rev. 657 (2012), available at SSRN), http://cyber.jotwell.com/good-fences-make-better-data-brokers/.
 
 

Empirical Link Rot And The Alarming Spectre Of Disappearing Law

Something Rotten in the State of Legal Citation sounds an important alarm for the entire legal profession, warning us that current modes of citing websites in judicial opinions create a very real risk that opinion-supporting citations by courts as important as the United States Supreme Court will disappear, becoming inaccessible to future scholars. The authors of this important and disquieting article, Raizel Liebler and June Liebert, both have librarianship backgrounds, and they effectively leverage their expertise to explicate four core premises: legal citations are important; web-based legal citations can and do disappear without notice or reason; disappearing legal citations are particularly problematic in judicial opinions; and finally, to this reader’s vast relief, there are solutions to this problem, if only the appropriate entities would care enough to implement them.

Denoting the disappearing citation phenomenon with the vivid appellation “link rot,” Liebler and Liebert explain that the crucial ability to check and verify citations is badly compromised by link rot, and then demonstrate this with frankly shocking empirical evidence. According to their research:

[T]he Supreme Court appears to have a vast problem with link rot, the condition of internet links no longer working. We found that [the] number of websites that are no longer working cited to by Supreme Court opinions is alarmingly high, almost one-third (29%). Our research in Supreme Court cases also found that the rate of disappearance is not affected by the type of online document (pdf, html, etc) or the sources of links (government or non-government) in terms of what links are now dead. We cannot predict what links will rot, even within Supreme Court cases. (P. 278).
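The empirical exercise behind a figure like that 29% is easy to picture even for non-librarians. Here is a minimal sketch, my own rather than Liebler and Liebert’s actual protocol, of how one might test a list of cited URLs and compute a rot rate.

# A minimal sketch (mine, not Liebler and Liebert's methodology) of an
# empirical link-rot check: any unreachable URL or error status counts as rot.
import urllib.error
import urllib.request

def is_rotten(url: str, timeout: float = 10.0) -> bool:
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status >= 400
    except (urllib.error.URLError, ValueError, OSError):
        return True  # dead host, timeout, or malformed URL: the link has rotted

cited_urls = [
    "https://www.supremecourt.gov/",        # presumably still alive
    "http://example.invalid/archived.pdf",  # guaranteed dead (.invalid TLD)
]
rotten = [u for u in cited_urls if is_rotten(u)]
print(f"{len(rotten)} of {len(cited_urls)} cited links have rotted "
      f"({len(rotten) / len(cited_urls):.0%})")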

They warn that without significant changes to current practices, the only surviving record of the web sources cited in judicial opinions will be the citations themselves. When citations lack lengthy parentheticals or detailed explanatory text, it might not even be clear to future readers, critics, or researchers why a document was cited, much less the nature of the support or clarification it offered.

Liebler and Liebert acknowledge that the Internet has improved legal research in many ways, opening up information conduits that had not been easily available before, and that in many respects website citations were an exciting development for the Supreme Court. They note that Justice Souter was the first Justice to cite the Internet, in a 1996 concurrence, and “then in 1998, Justice Ginsburg used the Internet for sources to demonstrate different meanings of the word ‘carry’ in her dissent.” (P. 279). By 2006, all of the Justices then serving had cited at least one website. Internet-based citations continued to blossom, and Liebler and Liebert’s research establishes that between 1996 and 2010, 114 majority opinions of the Supreme Court included links, but that almost one-third of them are no longer working. Link rot at the Supreme Court is extant, widespread, and perfidious. Among several arresting examples they offer is the following:

In Scott v. Harris, a video with a dead link was cited extensively by both the majority and minority opinions, serving as the focal point of a serious disagreement in the case. The majority opinion states, “We are happy to allow the videotape to speak for itself.” Additionally, the majority used the citation to the video to disagree with the dissent, stating that “Justice Stevens suggests that our reaction to the videotape is somehow idiosyncratic, and seems to believe we are misrepresenting its contents.” (P. 282).

Even when information cited by the Court remains available on the Supreme Court website, it is often relocated; the old links are not amended to point to the new location, so they are as good as dead, since researchers will quite reasonably assume that a broken link means the material is gone. Liebler and Liebert’s findings affirm research that has charted extensive link rot in many other contexts, such as law review articles. Even more disturbingly, their research accords with “a study of federal appellate opinions [which] found that in 2002, 84.6% of Internet citations in cases from 1997 were inaccessible; moreover, 34% of citations in cases from 2001 were already inaccessible by 2002.” (P. 290-91).

Liebler and Liebert’s stunning revelations are a simple matter to confirm in the context of any subject area. For example, one of the most important copyright cases the Supreme Court has ever decided was Sony Corp. v. Universal Studios in 1984. The long and not particularly well-written majority opinion set the balance between content owners like Universal Studios and companies like Sony that produced new and innovative technologies (in this case the Betamax videocassette recorder) with respect to secondary copyright infringement. Under Sony, a new technology that was capable of substantial non-infringing uses could not be enjoined from distribution on the grounds that it contributed to copyright infringement. Sony was controversial, and its convoluted drafting gave lawyers and judges the opportunity to read a multitude of meanings into it. As a copyright law geek of long standing, this author has seen the majority opinion in Sony parsed, diced, and sliced by lower courts, and ultimately repackaged as a shadow of its former self by a unanimous Supreme Court in 2005 in MGM Studios v. Grokster. But at least link rot was not a worry. The same cannot be said of Grokster, wherein Justice Breyer’s concurrence has links, some of which have already rotted. In fairness, he notes that “all Internet materials … are available in Clerk of Court’s case file,” but it is not at all clear how easy it might be for a researcher to access that file now, or especially five years from now. According to Liebler and Liebert, the case files are available only to those with sufficient means to go to Washington, DC, and visit the office of the Clerk of the Supreme Court. (P. 300).

Another thing one learns from Something Rotten in the State of Legal Citation is that the Supreme Court often does its own web-based fact-finding. Liebler and Liebert inform readers that Allison Orr Larsen conducted a study of fifteen years of Supreme Court opinions and “found that of the over one hundred ‘most important Supreme Court cases’ from 2000 to 2010, 56% include mentions of facts the Justices did not find in the record and instead found independently.” (P. 278). Liebler and Liebert quote her stunning findings as follows:

[I]t was quite common for Justices to demonstrate the prevalence of a practice through statistics they found themselves. And, at a fairly high rate, these statistics were supported by citations to websites—I found seventy-two such citations in my non-exhaustive search. Importantly, statistics were independently gathered from websites with widely ranging indicia of reliability.1

While it is sort of amusing to picture the Justices surreptitiously googling themselves when they get bored during oral arguments, it’s a little disconcerting to think of them relying even briefly on misinformation-ridden sites like Yahoo Answers. Yahoo has not cornered the market on dumb, because the Internet does not have corners, but Yahoo Answers is rather infamous for exchanges such as:2

Question: Is it wrong to hate a certain race?

Answer: No, because if you are only used to running a 5k, doing a 10k with your jogging group is going to take too long. I hate 10ks myself for this very reason.

Question: Why doesn’t the Earth fall down?

Answer: Because it can fly.

Question: I plan on starting a business selling dognuts, any advice?

Answer: If you want people to eat them, I would call them doughnuts.

Question: Does vodka kill bees and wasps?

Answer: Yes, over time it will destroy their tiny livers, but it is the disruption to the home life that really takes its toll.

One wonders about the quality of the information that the Justices are finding online, and this practice is even more dangerous if link rot means that citations to the Justices’ independent research cannot be assessed or verified. And if Supreme Court Justices are engaging in the dubious practice of doing their own online research about cases before them, one has to assume lower court judges are doing so as well.

Liebler and Liebert conclude their outstanding article by recommending possible solutions to the link rot problem. “Ideally,” they say, “every court should digitally archive all materials cited within an opinion, regardless of the format.” (P. 299). They observe that:

In 2009, the Judicial Conference of the United States created a report titled Internet Materials in Opinions: Citations and Hyperlinking that recommended two primary solutions to the broken Internet link problem: Clerks should download any cited Internet resources and include them with the opinions. The downloaded Internet resources should be included as attachments on a non-fee basis in each court’s Case Management/Electronic Case Files System, such as PACER. (P. 301).

PACER is not without its drawbacks, but there are alternatives as well, including the Internet Archive and other Internet archiving organizations, or permanent URLs. The main takeaway from this valuable article is that something needs to be done about link rot, and the problem needs to be addressed quickly and expansively. Liebler and Liebert have done a great service to the entire legal profession by bringing link rot to our attention and mapping the gigantic contours of the problem so compellingly.
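For the curious, capturing a cited URL is nearly trivial in practice. Here is a minimal sketch, my own and not anything proposed in the article, that asks the Internet Archive’s public Wayback Machine “Save Page Now” endpoint (https://web.archive.org/save/) to snapshot each link cited in an opinion; a production system for court clerks would of course need error handling and the Archive’s cooperation at scale.

# A minimal sketch (mine, not the Judicial Conference's recommendation) of
# archiving cited URLs via the Wayback Machine's "Save Page Now" endpoint.
import urllib.request

def archive(url: str) -> str:
    """Ask the Wayback Machine to capture `url`; return the snapshot URL."""
    req = urllib.request.Request("https://web.archive.org/save/" + url,
                                 headers={"User-Agent": "link-preservation-sketch"})
    with urllib.request.urlopen(req, timeout=60) as resp:
        # The Wayback Machine redirects to the archived copy it just made.
        return resp.url

for cited in ["https://www.supremecourt.gov/opinions/opinions.aspx"]:
    print(cited, "->", archive(cited))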



  1. See Allison Orr Larsen, Confronting Supreme Court Fact Finding, 98 Va. L. Rev. 1255, 1288 (2012) (including discussion of the Justices’ use of websites to conduct research during oral argument and for opinions).
  2. These are representative, edited versions of Yahoo Answers exchanges, screen grabs of which are on file with the author.
Cite as: Ann Bartow, Empirical Link Rot And The Alarming Spectre Of Disappearing Law, JOTWELL (April 11, 2014) (reviewing Raizel Liebler & June Liebert, Something Rotten In The State Of Legal Citation: The Life Span Of A United States Supreme Court Citation Containing An Internet Link (1996-2010), 15 Yale J. L. & Tech. 273 (2013)), http://cyber.jotwell.com/empirical-link-rot-and-the-alarming-spectre-of-disappearing-law/.