
Predictive Tagging: It’s A Process, Not A Panacea

The staggering volume of electronically stored information (ESI) that companies retain today often makes it nearly impossible to manually review all data under tight discovery deadlines. The cost of reviewing every document in a collection of millions can also become prohibitive. To control the costs of managing ESI and to mitigate the risk of discovery sanctions, parties are increasingly turning to technology.

Parties regularly rely upon keyword searches to reduce the universe of potentially responsive ESI. And though this tool – along with manual review – has long been considered the gold standard of discovery, researchers have been steadily chipping away at that reputation. In a 1985 study of a case involving 40,000 documents comprising 350,000 pages, lawyers estimated that their Boolean keyword searches had retrieved 75 percent of the relevant documents; in fact, they had found only 20 percent.[1]

An even more advanced review technology – predictive tagging (also called computer-assisted review, technology-assisted review, and predictive coding) – has recently received considerable attention, particularly since U.S. Magistrate Judge Andrew J. Peck’s decision in Da Silva Moore v. Publicis Groupe, No. 11 Civ. 1279 (ALC) (AJP), 2012 U.S. Dist. LEXIS 23350 (S.D.N.Y. Feb. 24, 2012), the first decision approving the use of computer-assisted review to search for relevant ESI.

Before Da Silva Moore, some lawyers were hesitant to use computer-assisted review because they were wary of being the first to test the waters. Now, Judge Peck’s opinion offers counsel the following reassurance: “computer-assisted review is an available tool and should be seriously considered for use in large-data-volume cases where it may save the producing party (or both parties) significant amounts of legal fees in document review.”

Though it now has a judicial seal of approval, this technology is not a magic bullet. Even Judge Peck noted that it is not a “Staples-Easy-Button solution appropriate for all cases.” Rather, predictive tagging is best used as a complement to other forms of review – including human review – to speed the review process.

How The Predictive Tagging Process Works

In essence, predictive tagging applies statistical analysis across a large data set. The process begins with an experienced lawyer or a small team of lawyers coding a sample set from the pool of documents for responsiveness during the initial assessment process. The technology engine then analyzes those tagging decisions along with the documents’ content characteristics.

In the next phase, called interactive ranking, the algorithm generates a new, smaller sample set of documents, and an experienced lawyer reviews the documents for responsiveness. The algorithm studies this tagging and learns from the reviewer’s decisions; the algorithm then generates another small sample of documents. This refinement process repeats until the algorithm becomes stable for each issue. Stabilization typically requires a review of between 1,000 and 2,000 documents.

After several iterations, the computer’s tagging becomes sufficiently similar to the human tagging to yield reliable results across the remainder of the data set, and the software can rank documents according to their projected relevance for each issue. In this way, the technology groups the most relevant documents for prioritized, contextual review, which can improve review accuracy, shorten review time, and lower overall costs.
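
To make the workflow concrete, the simplified sketch below walks through the initial assessment, the iterative training rounds, and the final ranking step. It is only a rough analogue of commercial predictive tagging engines, whose algorithms are proprietary: it substitutes a generic open-source classifier (scikit-learn’s TfidfVectorizer and LogisticRegression), and the sample size, stopping rule, and function names are illustrative assumptions.

    # Minimal sketch of the iterative predictive-tagging workflow described above.
    # A generic open-source classifier stands in for a vendor's proprietary engine;
    # sample sizes and the stopping rule are illustrative assumptions only.
    import random
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression

    def predictive_tagging(documents, lawyer_review, sample_size=200, tolerance=0.02):
        """documents: list of document texts; lawyer_review: a callable standing in
        for the experienced lawyer, returning 1 (responsive) or 0 (non-responsive)."""
        features = TfidfVectorizer(stop_words="english").fit_transform(documents)

        # Initial assessment: a lawyer codes a small seed sample (which should
        # contain both responsive and non-responsive examples).
        unreviewed = set(range(len(documents)))
        seed = random.sample(sorted(unreviewed), sample_size)
        labels = {i: lawyer_review(documents[i]) for i in seed}
        unreviewed -= set(seed)

        model = LogisticRegression(max_iter=1000)
        previous_scores = None
        while unreviewed:
            reviewed = sorted(labels)
            model.fit(features[reviewed], [labels[i] for i in reviewed])
            scores = model.predict_proba(features)[:, 1]

            # Stabilization check: stop once the rankings stop shifting materially.
            # With batches of a few hundred documents, this typically falls within
            # the 1,000-2,000 reviewed documents noted above.
            if previous_scores is not None and abs(scores - previous_scores).mean() < tolerance:
                break
            previous_scores = scores

            # Interactive ranking: the lawyer codes another small sample, and the
            # algorithm learns from those decisions.
            batch = random.sample(sorted(unreviewed), min(sample_size, len(unreviewed)))
            labels.update({i: lawyer_review(documents[i]) for i in batch})
            unreviewed -= set(batch)

        # Rank the remaining documents by projected relevance for prioritized review.
        return sorted(unreviewed, key=lambda i: scores[i], reverse=True)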

Advantages Of Technology-Assisted Review

Studies have found the “average efficiency and effectiveness” of predictive tagging surpass that of manual review.[2] On average, predictive tagging yields a higher number of relevant documents and greater accuracy than manual review.[3] Another survey showed that predictive tagging technology can save “45 percent of the costs of normal review” – with some reporting a savings of up to 70 percent.[4]

On a practical level, predictive tagging technology is best viewed as a process that supplements human review – not as its replacement. By identifying relevant documents and eliminating data redundancy, parties can significantly reduce the number of documents to review. This increases productivity, alleviates backlogs, and helps lawyers focus on the most relevant and most critical data, leading to a better understanding of the case and thus a better overall legal strategy.

Uses For Predictive Tagging Technology

Some parties use predictive tagging solely to validate a review team’s tagging, to eliminate non-relevant data from manual review, and to prioritize documents for review. Used in these ways, the technology can increase review teams’ efficiency, reduce the risk of missing key data, and serve as a quality control. Parties can also pair predictive tagging with other technologies to get the most out of both – for example, by creating, validating, and challenging keywords, or by batching documents not only by concept but also in conversation-thread order to improve the review’s context and continuity.

Improve the Reliability of Keyword Searches. Keyword searches alone can be problematic. On average, they find about 20 percent of the relevant ESI in a large data set.[5] In a 2009 study involving batch tests of negotiated keyword terms, the Boolean searches had a mean precision of 39 percent but recall of less than 4 percent – meaning the keywords missed more than 96 percent of the relevant documents.[6] If keyword searches omit 80 percent of the relevant documents from a review, parties may miss key documents that could help build a case or a defense, or may inadvertently expose themselves to the risk of discovery sanctions.
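
For readers unfamiliar with these metrics, recall measures the share of all relevant documents a search actually finds, while precision measures the share of retrieved documents that are relevant. The short calculation below shows how the two are computed; the document counts are hypothetical, chosen only to mirror the 2009 figures, and are not drawn from the study itself.

    # Hypothetical counts chosen to mirror the 2009 figures cited above.
    relevant_in_collection = 10_000   # total relevant documents in the data set
    retrieved_by_keywords = 1_000     # documents the Boolean searches returned
    relevant_retrieved = 390          # retrieved documents that are actually relevant

    recall = relevant_retrieved / relevant_in_collection    # 0.039 -> under 4% found
    precision = relevant_retrieved / retrieved_by_keywords  # 0.39  -> 39% of hits relevant
    missed = 1 - recall                                     # over 96% of relevant docs missed

    print(f"recall={recall:.1%}, precision={precision:.1%}, missed={missed:.1%}")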

Moreover, parties that rely on keyword searching as the primary form of filtering and organizing data review only what the keywords return – leaving a universe of data untouched and contributing to the inaccuracy of the review. Even when keyword searching is combined with human review, studies have shown that lawyers are poor at making relevancy determinations in large data sets.[7]

Predictive tagging technology finds significantly more of the relevant documents up front – usually three to four times more than keyword matching – reducing the risk of missing key data. By running the predictive tagging technology against the entire universe of documents, parties can ensure that they are not missing key documents that may have lacked the key terms they searched for.

Create, Validate, and Challenge Keywords. As the predictive tagging algorithm learns from the coded documents, it identifies and tracks the attributes that distinguish relevant documents from non-relevant ones. These attributes include words, groups of words, the distance between words, and the frequency of words. Once the algorithm stabilizes, the software generates a list of keywords that appear predominantly in the responsive set of documents. This list can be cross-referenced with a list of terms generated by separate software that identifies terms appearing exclusively in responsive documents as well as terms appearing exclusively in non-responsive documents. The result is a defensible Boolean string that should yield the most responsive documents in the set: for example, (peanut butter AND jelly AND NOT milk). While humans are limited in their ability to identify keywords and patterns by the constraints of their own knowledge, experience, and imagination, the technology is not. Parties can therefore use a combination of software programs to settle on valid search terms rather than relying on the traditional “guess and check” method.
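
The sketch below illustrates, in deliberately simplified form, how such a keyword list and Boolean string might be derived from coded documents. It compares only single-word document frequencies; actual engines also weigh phrases, word proximity, and term frequency, and the function name and 50 percent threshold are illustrative assumptions.

    # Deliberately simplified sketch of deriving candidate keywords from coded
    # documents; thresholds and naming are assumptions, not a vendor's method.
    from collections import Counter

    def candidate_boolean_string(responsive_docs, non_responsive_docs, top_n=5):
        def doc_freq(docs):
            counts = Counter()
            for doc in docs:
                counts.update(set(doc.lower().split()))
            return counts

        resp, non_resp = doc_freq(responsive_docs), doc_freq(non_responsive_docs)

        # Terms appearing predominantly in responsive documents and never in
        # non-responsive ones.
        include = [t for t, c in resp.most_common()
                   if c / len(responsive_docs) > 0.5 and non_resp[t] == 0][:top_n]
        # Terms appearing exclusively in non-responsive documents.
        exclude = [t for t, c in non_resp.most_common() if resp[t] == 0][:top_n]

        # Assemble a Boolean string, e.g. (peanut AND butter AND jelly) AND NOT (milk)
        query = "(" + " AND ".join(include) + ")"
        if exclude:
            query += " AND NOT (" + " OR ".join(exclude) + ")"
        return query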

Parties can also use predictive tagging technology defensively. If a search term run through the entire data universe turns up non-relevant documents, parties can use the results to challenge the validity of the opposing party’s proposed search terms.
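
As a rough illustration of this defensive use, the hypothetical check below runs a proposed term across the full, already-scored data set and reports how many documents it returns and what share of those the model considers likely non-relevant; the relevance cutoff is an assumption, not a standard.

    # Hypothetical check of a proposed search term: run it across the full,
    # already-scored data set and report what share of its hits the model
    # considers likely non-relevant. The 0.5 cutoff is an assumption.
    def evaluate_proposed_term(term, documents, scores, relevance_cutoff=0.5):
        hits = [i for i, doc in enumerate(documents) if term.lower() in doc.lower()]
        if not hits:
            return 0, 0.0
        likely_irrelevant = sum(1 for i in hits if scores[i] < relevance_cutoff)
        return len(hits), likely_irrelevant / len(hits)

    # A term whose hits are mostly low-scoring documents is a candidate to challenge.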

Triage Documents for Prioritized Review. Despite Judge Peck’s decision, some parties will be slow to adopt this technology and will continue to insist on negotiating keyword search terms. In those cases, parties can still leverage this technology to increase speed, efficiency, and accuracy while reducing overall cost. Once the algorithm has ranked the documents, the negotiated search terms can be run against the data set, and cross-referencing the two sets of results allows parties to triage the documents appropriately.

For example, if the algorithm tagged a series of documents that survived search term filtering as potentially non-responsive, they can be triaged offshore at a lower cost. Likewise, potentially responsive documents that survived filtering can be reviewed by onshore contract attorneys, while documents that scored high for responsiveness or that were tagged as “hot,” “privileged,” or “confidential” can be assigned to outside counsel. This intelligent batching means that the most relevant documents get the most attention while also ensuring that documents least likely to be relevant do not unnecessarily increase the cost of the review.
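
The hypothetical rules below show how such a triage might be expressed once each document carries a predictive score, a flag for whether it hit the negotiated search terms, and any reviewer tags; the thresholds and tier names are illustrative assumptions rather than fixed industry practice.

    # Hypothetical triage rules that cross-reference each document's predictive
    # score with whether it hit the negotiated search terms. Thresholds and tier
    # names are illustrative assumptions.
    def assign_review_tier(score, hit_search_terms, tags=()):
        """score: projected relevance from 0.0 to 1.0; tags: reviewer labels."""
        if {"hot", "privileged", "confidential"} & set(tags) or score >= 0.9:
            return "outside counsel"             # highest-value, highest-risk documents
        if hit_search_terms and score >= 0.5:
            return "onshore contract attorneys"  # likely responsive, survived filtering
        if hit_search_terms:
            return "offshore review"             # survived filtering but scored low
        return "no manual review"                # low score and no search-term hits

    # Example: a document that matched the negotiated terms but scored only 0.2
    print(assign_review_tier(0.2, hit_search_terms=True))  # -> "offshore review"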

Best Practices For Advanced E-Discovery Technology

These algorithms represent a sophisticated technology that can be leveraged in a variety of ways to meet the challenges posed by the exponential growth of the data companies generate. Though advanced, the technology is only as effective as the process behind it, and that process begins with sound training from experienced lawyers.

Although the case is still being litigated, Judge Peck’s opinion in Da Silva Moore is instructive for parties considering the use of this technology. Three guiding principles emerge: communication, cooperation, and transparency. Communication between the reviewing attorneys and their vendors must be strong and clear, and it must be equally effective with opposing counsel and their vendors. When negotiating the use of this technology, parties should make the process as transparent as possible. Finally, parties should cooperate throughout discovery – for example, by sharing with opposing counsel the seed set of documents used to train the computer, as the defendant did in Da Silva Moore.


[1] D.C. Blair & M.E. Maron, An Evaluation of Retrieval Effectiveness for a Full-Text Document-Retrieval System, 28 Communications of the ACM, Mar. 1985, 289-99.

[2] Maura R. Grossman & Gordon V. Cormack, Technology-Assisted Review in E-Discovery Can Be More Effective and More Efficient Than Exhaustive Manual Review, 17 Rich. J.L. & Tech. 11 (2011), available at http://jolt.richmond.edu/v17i3/article11.pdf.

[3] Id.

[4] Anne Kershaw & Joseph Howie, “Crash or Soar? Will the Legal Community Accept ‘Predictive Coding?’” Law Technology News, Oct. 1, 2010, available at http://www.akershaw.com/articles/LTN_CrashOrSoar_2010_Oct.pdf.

[5] Id.

[6] Jason R. Baron, Bruce Hedin, Douglas W. Oard, & Stephen Tomlinson, 2009 TREC Legal Track Overview, available at http://trec-legal.umiacs.umd.edu/.

[7] Id.
