For lawyers and clients overwhelmed by the cost and aggravation of conducting electronic discovery, a new, more efficient method is taking hold. This methodology, known as predictive coding or technology-assisted review, was developed by e-discovery companies that claim it can provide a significant shortcut in large document reviews and therefore substantial cost savings. In the last year, the new methodology has been addressed by courts, including those in New York. Some published research on the effectiveness of the technique has shown promise, and a recent article in The Wall Street Journal reported positive results in a case where a court approved the use of predictive coding over the objection of the party that sought the discovery.
In general terms, predictive coding is a way of using technology to extrapolate to a large set of data the results of human relevance decisions on a subset of that data. The process starts when the lawyers most familiar with the issues in a case, or with a set of document requests, review that subset. These reviewers typically produce a "seed set" of documents, each of which is coded for relevance, privilege or other criteria. The seed set includes documents deemed relevant as well as documents deemed irrelevant. The computer then uses those selections to generate relevance rankings for the larger group of documents, and the reviewing lawyers test the rankings to refine the computer analysis. The process is analogous to training a spam filter: the lawyers and the computer interact until they achieve a level of certainty as to what is relevant. Some published studies maintain that the results of this approach are more accurate than an entirely human review of the results of keyword or Boolean searches. With a computer program doing the sorting work of junior lawyers, the savings in large cases can be substantial.
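For readers who want to see the mechanics, the seed-set workflow described above can be sketched in a few lines of Python. This is a minimal illustration only, not any vendor's actual method: it assumes a simple word-frequency (naive Bayes style) scoring scheme, and the function names (train, score, rank) and the sample documents are hypothetical. Commercial tools use far more sophisticated models.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def train(seed_set):
    """Learn a weight per word from (text, is_relevant) pairs coded by reviewers.

    A positive weight means the word appeared proportionally more often in
    documents coded relevant; a negative weight means the opposite.
    Add-one smoothing keeps words seen on only one side from producing
    infinite weights.
    """
    relevant, irrelevant = Counter(), Counter()
    for text, is_relevant in seed_set:
        (relevant if is_relevant else irrelevant).update(tokenize(text))
    vocab = set(relevant) | set(irrelevant)
    rel_total = sum(relevant.values()) + len(vocab)
    irr_total = sum(irrelevant.values()) + len(vocab)
    return {
        word: math.log((relevant[word] + 1) / rel_total)
        - math.log((irrelevant[word] + 1) / irr_total)
        for word in vocab
    }


def score(weights, text):
    """Sum the learned weights of the words in a document (unknown words count 0)."""
    return sum(weights.get(word, 0.0) for word in tokenize(text))


def rank(weights, documents):
    """Order unreviewed documents from most to least likely relevant."""
    return sorted(documents, key=lambda doc: score(weights, doc), reverse=True)


# Hypothetical seed set: lawyers have coded four documents.
seed_set = [
    ("bonus pay disparity complaint", True),
    ("promotion denied pay gap", True),
    ("lunch menu cafeteria", False),
    ("parking garage closed friday", False),
]

# Hypothetical unreviewed documents to be ranked by the computer.
unreviewed = [
    "cafeteria lunch friday",
    "complaint about pay and promotion",
    "garage parking menu",
]

weights = train(seed_set)
for doc in rank(weights, unreviewed):
    print(round(score(weights, doc), 2), doc)
```

In an iterative round, the lawyers would review a sample of the ranked documents, add their corrected coding decisions to the seed set, and call train again, repeating until the rankings stabilize.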
Advocates of technology-assisted review compare the anticipated costs of complying with discovery requests using search terms and human review against the projected costs of using predictive coding, citing studies that predict significant cost savings from using this approach. Those studies also contend that the accuracy of properly constructed technology-assisted reviews exceeds that of purely human reviews, because human reviews apparently miss a startlingly large percentage of relevant documents.
Opponents of the proposed approach question the soundness or applicability of the studies and are fundamentally uncomfortable with the notion that in a technology-assisted review, not every document is actually reviewed by a person. In addition, because the seed set that includes both relevant and irrelevant documents will generally be turned over to the receiving party, the producing party may be reluctant to supply documents that it would not have turned over in a manual review. Finally, concerns about the costs of resolving disputes and engaging in motion practice over the discovery process itself, given the lack of consensus on the appropriateness of technology-assisted review, may be a barrier to implementing the process.
Parties are free to agree to this or other shortcuts to extensive human review of the results of keyword searches. Courts have recently entered the fray on predictive coding in cases where an agreement is not reached. In the last year, at least five courts have addressed the use of technology-assisted review as a means of identifying responsive documents requested in discovery. In large matters, parties who are faced with significant costs for document review would do well to review these decisions.
'Da Silva Moore'
In Da Silva Moore v. Publicis Groupe & MSL Group, 2012 WL 607412 (S.D.N.Y. Feb. 24, 2012), U.S. Magistrate Judge Andrew Peck issued what was at the time the first written decision in which a court explicitly endorsed predictive coding, declaring that "[c]omputer assisted review is an acceptable way to search for relevant ESI in appropriate cases."
Although it was widely reported via a press release from the defendants' computer vendor that Peck "ordered" the parties to use predictive coding, Peck made clear that the parties had agreed to the defendants' use of predictive coding and that he was merely resolving disputes over the scope and implementation of the process. Id. at n.1. Thus, in addressing the parties' disagreements, Peck considered which custodians should be searched, which sources of electronically stored information (ESI) must be searched and the protocol for use of the predictive coding tools. Id. at 7-12.
In examining the predictive coding protocol, Peck discussed to what confidence level the documents would be analyzed, the seed set that would be used and the fact that the plaintiffs would have access to those documents, the coding that would be done by the human reviewers and the number of iterative rounds that would be used to "train" the computer. Id. at 9-12. Importantly, Peck permitted the parties to return if they felt the defendants' protocol did not result in a properly trained computer. Id. at 12.
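To give a sense of what a "confidence level" means in practice, the following Python sketch computes how many randomly sampled documents a reviewer would need to check to estimate the share of relevant documents in a collection. It is an illustration under stated assumptions, not the protocol from the order: it uses the standard sample-size formula for a proportion with a finite population correction, and the default 95 percent confidence level, 2 percent margin of error and the function name sample_size are hypothetical choices. The three-million-document population echoes the volume of ESI at issue in the case.

```python
import math
from statistics import NormalDist


def sample_size(population, confidence=0.95, margin=0.02, proportion=0.5):
    """Documents to sample so the estimated relevance rate falls within
    `margin` of the true rate at the given confidence level.

    proportion=0.5 is the most conservative assumption about how many
    documents are relevant; it yields the largest required sample.
    """
    # z-score for a two-sided interval at the chosen confidence level
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size for an effectively infinite population
    n0 = (z ** 2) * proportion * (1 - proportion) / (margin ** 2)
    # Finite population correction for a collection of known size
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


# Illustrative inputs only: roughly three million documents.
print(sample_size(3_000_000))
print(sample_size(3_000_000, confidence=0.99, margin=0.01))
```

The design point is that the required sample grows with the confidence level and shrinks as the tolerated margin of error widens, while the size of the overall collection has comparatively little effect once it is large.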
The plaintiffs had argued against the predictive coding protocol, alleging that the process violated the Federal Rules of Evidence regarding experts and that the results were not reliable. Id. at 15. Peck overruled these objections, finding that the Federal Rules of Evidence were inapplicable at the discovery stage because there was no testimony being offered for admissibility at trial. Id. Similarly, Peck dismissed the plaintiffs' reliability concerns as premature at best, given the transparency of the process. Id.
Ultimately, Peck ruled that "predictive coding was appropriate considering (1) the parties' agreement, (2) the vast amount of ESI to be reviewed (over three million documents), (3) the superiority of the computer-assisted review to the alternatives (i.e., linear manual review or keyword searches), (4) the need for cost effectiveness and proportionality under Rule 26(b)(2)(C), and (5) the transparent process proposed by [the defendants]."
The plaintiffs filed objections to the Feb. 24, 2012 order with U.S. District Judge Andrew Carter. On April 26, 2012, Carter overruled the plaintiffs' objections, finding that Peck's rulings were "well reasoned and they consider the potential advantages and pitfalls of predictive coding software." Order at 3.