Predictive Coding in E-discovery: Cautionary Tales

October 16, 2012

Over the past several years, proponents of predictive coding (a machine-learning technology that lets computers automatically predict how documents should be coded in litigation with limited human input) have hyped the technology as a cure-all for e-discovery’s ills. In litigation practice, however, the technology has shown some side effects. Two recent cases, both of which involved protracted predictive coding disputes, demonstrate some of the technology’s potential problems and are instructive as to the situations in which predictive coding may be most appropriate.

Da Silva Moore v. Publicis Groupe

Da Silva Moore v. Publicis Groupe, a Title VII gender discrimination class action venued in the Southern District of New York, is the most high-profile case involving predictive coding and a cautionary tale for those considering using predictive coding in discovery. Magistrate Judge Andrew Peck, a leading advocate of predictive coding, has overseen discovery in the matter. In addition to speaking at industry conferences, Magistrate Judge Peck authored an article (Search, Forward) promoting predictive coding technologies in litigation, and he has said the article itself could serve as a “sign of judicial approval” of the technology.

As discovery in Da Silva Moore proceeded, the parties initially stipulated to the use of predictive coding, but negotiations over a mutually agreeable protocol for programming the technology broke down. After Magistrate Judge Peck accepted defendant’s recommended protocol, which plaintiffs contend is flawed, plaintiffs moved for Magistrate Judge Peck’s recusal. In support of their argument, plaintiffs asserted that Magistrate Judge Peck’s public support for predictive coding and his relationships with vendors made him biased. Magistrate Judge Peck denied plaintiffs’ request for recusal, and plaintiffs sought reconsideration. As a result, both Magistrate Judge Peck’s recusal and plaintiffs’ objections to the parameters of the predictive coding protocol to be used are now being considered by Judge Andrew Carter, the presiding district court judge.

Kleen Products v. Packaging Corp. of America

Predictive coding became an issue in a much different way in Kleen Products v. Packaging Corp. of America, an antitrust matter pending in the Northern District of Illinois. There, plaintiffs, arguing that predictive coding would yield more thorough results than keyword searches, requested that defendants redo their prior document productions, and conduct all future productions, using predictive coding technology. This request, however, came after defendants had already reviewed and produced more than one million documents.

In comments on the record, Magistrate Judge Nan Nolan, who heard expert witness testimony on the sufficiency of the initial production, confirmed that a party cannot dictate what technology its opponent uses without showing how the resulting document production is insufficient or inaccurate. She therefore asked the parties to reach a compromise on keyword searches, and, in a recently issued stipulation and order, plaintiffs withdrew their demand that defendants use predictive coding.


Predictive coding has not yet displaced commonly accepted e-discovery methods—like keyword search—nor will it anytime soon.  Though it’s fast becoming an acceptable e-discovery technology, the mere availability of predictive coding is not a basis for challenging appropriate uses of other e-discovery methods.  In addition, predictive coding can present ripe opportunities for discovery disputes in litigation, including fights over parameters used to program the technology.  Thus, selecting an e-discovery technology (or technologies) to be used in litigation remains an important strategic choice with several potential effects that must be considered before moving forward.