Predictive Coding Isn’t The Only Tool In Your TAR Tool Kit

Editor: Please tell us about your professional background.

Shellhaas: I’ve been in the litigation support space for twenty years now. It started as a college job, and the industry boomed right as I was graduating. I was offered a management position in a growing company in the Washington, DC, area, and over the years I’ve held management positions in nearly every operational or sales role in this legal support space. And that’s what’s really interesting about this space – as opposed to the sales managers who traditionally supported account executives, it turns out that people who come from operations and really understand technology have proven to be the best people to serve the dual role of operational leader and sales support that e-discovery requires. As vice president of TrustPoint’s electronic discovery services, I manage the day-to-day e-discovery and hosted review operations and a team of solutions architects who help account executives deliver the proper solution for their respective clients.

Editor: Predictive coding has become something of a buzzword; what might attorneys be missing in the rush to be part of – or avoid – the trend toward using more technology-assisted review?

Shellhaas: The buzz started at LegalTech® a couple years ago. Today, you can’t go a week, a subscription or a publication without seeing an article on predictive coding, but the underlying technology, basically grouping similar documents for review, was around as early as 2002. At the time, we presented the technology to firms as a more efficient review than a traditional linear review, but reviewers weren’t ready. So, this “new” trend is interesting because it seems to me that lawyers skipped a step in embracing predictive coding. We know, for instance, that simply grouping documents based on similarity before providing them to a reviewer will increase review speed by three times. That is a huge productivity leap, which should have been a great selling point for any lawyer within a law firm to make to a corporate counsel interested in lowering costs. What predictive coding has done is to take this process to the next step by using technology to review a subset of documents for responsiveness or non-responsiveness. You can then take those results and compare them against the rest of the unreviewed documents and classify everything that remains. There are different iterations including quality control (QC) and validation, but ultimately, it’s still using that baseline function of grouping similar documents together. If the buzz about predictive coding is setting you on edge, my suggestion is to try using some of the underlying technologies for a review first. Once you are comfortable with that, you are more likely to be comfortable running a full predictive coding exercise.
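
Editor's note: The workflow described above (code a subset, then compare those results against the unreviewed documents and classify what remains) can be illustrated with a minimal sketch. It is not TrustPoint's method or any vendor's algorithm. It assumes a toy word-overlap (Jaccard) similarity and a nearest-example rule, and the function names and sample documents are hypothetical.

```python
# Illustrative sketch only: label each unreviewed document with the label of
# the most similar already-coded document. Real predictive coding tools use
# far richer models; the names and data here are hypothetical.

def tokens(text):
    """Split a document into a set of lowercase words."""
    return set(text.lower().split())

def jaccard(a, b):
    """Word-overlap similarity between two token sets (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def classify_remaining(seed, unreviewed):
    """seed: list of (text, label) pairs coded by reviewers.
    unreviewed: list of document texts.
    Returns a list of (text, predicted_label, similarity_score)."""
    seed_tokens = [(tokens(text), label) for text, label in seed]
    results = []
    for doc in unreviewed:
        doc_tokens = tokens(doc)
        # Pick the coded document with the highest similarity to this one.
        best_label, best_score = max(
            ((label, jaccard(doc_tokens, st)) for st, label in seed_tokens),
            key=lambda pair: pair[1],
        )
        results.append((doc, best_label, best_score))
    return results

seed = [
    ("pricing agreement with distributor for widget contract", "responsive"),
    ("office holiday party lunch menu", "non-responsive"),
]
unreviewed = [
    "draft pricing agreement for the widget contract",
    "lunch menu for the holiday party",
]
predictions = classify_remaining(seed, unreviewed)
for text, label, score in predictions:
    print(f"{label:15s} {score:.2f}  {text}")
```

In practice the low-scoring predictions would be the natural candidates for the QC and validation passes mentioned above.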

Editor: How does technology-assisted review (TAR) differ from predictive coding?

Shellhaas: Predictive coding is just one of many tools in a TAR tool kit. To perform a predictive coding exercise you will first need to use various analytical tools, including grouping by similarity clusters and near duplicates. You don’t need to start with a predictive coding tool; you might even start by just running search terms. If ultimately, after using those other TAR tools, you decide that you want to do a full predictive coding exercise, you have that option. It’s made for flexible workflows.
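
Editor's note: As a rough illustration of grouping near duplicates before review, the sketch below uses Python's standard difflib module to put documents whose text is almost identical into the same group. The 0.8 threshold, the greedy grouping rule and the sample documents are assumptions chosen for illustration, not a description of any commercial TAR tool.

```python
# Illustrative sketch only: greedy near-duplicate grouping with difflib.
from difflib import SequenceMatcher

def group_near_duplicates(docs, threshold=0.8):
    """Return groups of document indexes. A document joins the first group
    whose representative (first member) is at least `threshold` similar;
    otherwise it starts a new group."""
    groups = []
    for index, text in enumerate(docs):
        for group in groups:
            representative = docs[group[0]]
            if SequenceMatcher(None, representative, text).ratio() >= threshold:
                group.append(index)
                break
        else:
            # No existing group was similar enough.
            groups.append([index])
    return groups

docs = [
    "Please review the attached contract by Friday.",
    "Please review the attached contract by Monday.",
    "Lunch is at noon in the cafeteria.",
]
groups = group_near_duplicates(docs)
print(groups)  # [[0, 1], [2]]
```

A reviewer handed group [0, 1] sees the two contract emails together, which is the kind of batching by similarity described earlier in the interview.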

Editor: Are keyword searches out of favor amongst reviewers who have embraced predictive coding?

Shellhaas: There are people who believe that you shouldn’t run keyword searches prior to a predictive coding or TAR exercise because it might limit the amount of data in your sampling. My goal is always to solve a client’s problem with the best solution, not to pound a square peg into a round hole. The reality is that keyword searches have been accepted by the courts for many years now, and you can use keyword searches alongside technology-assisted review or you can use the technology to prove that your keywords are accurate. You’re never going to get away from the use of keyword searches entirely, so you might as well incorporate them into your overall workflow.
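
Editor's note: One common way to check whether keywords are accurate is to run them against a sample that reviewers have already coded and measure precision (how many hits are responsive) and recall (how many responsive documents are hit). The sketch below assumes that approach. The interview does not specify a method, and the keywords and sample documents are hypothetical.

```python
# Illustrative sketch only: score a keyword list against a coded sample.

def keyword_hit(text, keywords):
    """True if the document contains at least one keyword as a whole word."""
    return bool(set(text.lower().split()) & keywords)

def precision_recall(sample, keywords):
    """sample: list of (text, is_responsive) pairs coded by reviewers.
    Returns (precision, recall) for the keyword search on that sample."""
    true_pos = sum(1 for text, resp in sample if resp and keyword_hit(text, keywords))
    false_pos = sum(1 for text, resp in sample if not resp and keyword_hit(text, keywords))
    false_neg = sum(1 for text, resp in sample if resp and not keyword_hit(text, keywords))
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return precision, recall

sample = [
    ("merger pricing terms attached", True),
    ("pricing update for merger team", True),
    ("weekly cafeteria pricing", False),
    ("acquisition timeline draft", True),
    ("parking garage closed", False),
]
keywords = {"merger", "pricing"}
precision, recall = precision_recall(sample, keywords)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here the missed "acquisition" document lowers recall, which suggests adding a term, and the cafeteria hit lowers precision, which suggests the term "pricing" is too broad on its own.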

Editor: How have predictive coding and technology- or computer-assisted review changed the review landscape?

Shellhaas: Those firms that have embraced the technology and are utilizing it more than others have a distinct advantage. They find the relevant documents faster and are better able to develop their strategies faster. For as long as hosted review has been around, there are still many small firms that are not even working with a service provider to build an online database to review documents, but I think this will change as the upfront costs decrease.

Editor: What percentage of cases is utilizing some form of TAR?

Shellhaas: Most service providers have these tools in their tool kit, and everyone in the industry is at least aware of TAR, but if you polled corporations, law firm IT, and legal support managers, I think that you would find that TAR is being used in less than 20 percent of the cases.

Editor: Really? That seems a small number for all the buzz.

Shellhaas: I don’t think the trend has necessarily caught on yet. And keep in mind that that percentage takes into account really small matters where no one would even think to run technology through the data sets. That said, many lawyers still just want to put their eyes on every single document even though it’s been proven many times over that the machine is better. As newer law students come into the profession and firms use databases more often in general, exploring these technologies and their advantages will be less of a leap.

Editor: What are the results of TAR exercises showing?

Shellhaas: Exercises are showing that TAR is extremely efficient, and the feedback has been positive. Even the Justice Department stated that one of TrustPoint’s Second Request reviews was the most efficient predictive coding / TAR exercise they had done yet. When the government is accepting productions, then you know that it’s going to become more mainstream as we move forward. Not only do review speeds increase by three or more times, but you can more efficiently determine which documents are nonresponsive. Those nonresponsive documents fall away, and reviewers can spend valuable time on richer data sets, which are the most valuable in developing your case strategy.

Editor: How has the legal community responded to the use of TAR?

Shellhaas: The vast majority of the industry realizes that TAR is a powerful component. Those of us supporting the legal community can only state our case and arguments for TAR; ultimately, a decision maker within the firm or the corporation has to decide whether to use TAR, and if they don’t feel comfortable with it or feel the upfront costs are too high then they’ll pass. Generally, though, once they start using these technologies, it’s hard to go back. It’s like the stick shift car – if you go to the dealer shopping for one, the dealer will probably have to order it from the factory because there isn’t much demand for stick anymore. That’s because an automatic is so much more efficient. The same is true of TAR.

Editor: What are some of the risks associated with TAR?

Shellhaas: Like any new thing, people get anxious: What if we miss a bunch of documents because the computer was wrong? What if we disclose a bunch of privileged documents by mistake? But every study and comparative analysis that’s been conducted tells us that more mistakes are made in a general human review than in a review that utilizes these technologies. We’ve already established that TAR is a more efficient way to run a review – reviewers can use some of that time savings to catch mistakes that may have been made along the way.

Editor: What do you see as the next evolution in the use of these technologies for the litigation support industry?

Shellhaas: I think that large corporations, particularly highly litigious corporations, are going to start employing technologies that have analytical functions right within their own corporate domains. In this way, emails collected years ago for a specific litigation and classified as, for instance, intellectual property condition, remain classified within the system inside the corporation itself, so that whenever an intellectual property condition develops, the documents are already classified and behind the company’s firewall. These new electronic discovery review models (EDRM) are going to make reviewing, using and presenting data for litigation even more efficient. It really turns the entire review model on its head. Whereas we used to collect massive amounts of data from the corporation to bring into the litigation process and then try to use the technologies to whittle it down, we’re going to be whittling it down within the company’s domain. Now, how long will it take to get there? At a minimum, I believe it will take five years, but it’s already starting. As a service provider, we’re already offering managed service environments to corporations and law firms, where we provide the infrastructure, the systems, the computers, and the servers, so that we become a partner within the litigation support landscape. The fact is that the analytics technologies will continue to move farther upstream to where the original data is located within the corporate domains, and technology service providers like TrustPoint will be providing consulting and services that will make for a more efficient litigation/risk management philosophy.

Published August 25, 2014.