
Melding Humans And Machines In E-discovery

Editor: Please describe your background.

Barnett: At the beginning of my legal career, I handled a wide variety of cases, including tax, securities, accounting fraud and corporate governance. My parallel interest in computers – I first started programming computers at a young age – led me to begin thinking about how electronic data was becoming essential in our cases, and I subsequently became my firm’s expert in e-discovery and electronic data management.

Several years later I joined Electronic Evidence Discovery (EED) when there were only two or three other companies in the industry. Six to nine months later, the Enron and Arthur Andersen scandals occurred, and the related litigation contributed to explosive growth in e-discovery. My next step was to form the e-discovery division of an Asia-based BPO company called SPi Technologies, where we pioneered the use of advanced data analytics. We used multiple search engines, the Web, and iterative feedback through reviewer tagging, creating early versions of what is now called predictive coding, or supervised machine learning.

In our very first year, SPi ranked in the top 10 in e-discovery revenue. Upon the sale of the company, I joined Sullivan & Cromwell as special counsel in their litigation department, where I created and ran the firm’s Electronic Discovery and Compliance department. I overhauled the way the entire firm handled electronic discovery and regulatory subpoena compliance.

I later moved to Stratify, which became Iron Mountain Digital. This division was acquired by Autonomy. I led Iron Mountain’s global records and information management consulting group, which included everything from consulting on records retention to active matter management in complex e-discovery matters.

The string of acquisitions and divestitures resulted in a business model primarily focused on enterprise software sales. I wanted to continue my focus on expert e-discovery consulting and service delivery, so I joined Stroz Friedberg because of its unparalleled expertise, knowledge and experience in complex digital risk management, including e-discovery, digital forensics, investigations, and cyber and data breach response. We are unmatched in the number of former federal prosecutors, senior federal law enforcement and top-tier law firm litigators on the team. Founder Ed Stroz was an FBI special agent who formed the Computer Crime Squad in New York, and co-founder Eric Friedberg was a federal prosecutor whose many accomplishments include serving as lead computer crimes prosecutor and computer and telecommunications coordinator.

In the early days, most providers focused on large-scale processing and hosting, but Stroz Friedberg took a novel and creative approach. It focused on digital forensics – identifying, preserving and analyzing all types of digital data, and if needed, testifying about the findings. It built its reputation on the highest standards of ethics, objectivity, credibility and technical expertise.

Our overall mission is to be the experts clients turn to and trust in the most difficult, complex or bet-the-company matters.

Digital forensics, data breach and cybercrime response are major parts of our work. E-discovery, however, now plays a very significant role in our core business. We leverage the most sophisticated analytical tools and approaches available to help clients get to the most important data quickly while addressing both the economics and the defensibility of the process.

Editor: Tell us about your vision for your firm’s e-discovery practice.

Barnett: Companies like Google, Facebook and Yahoo! have developed groundbreaking technology to process and analyze enormous amounts of data very rapidly (think of Google searching the entire Internet). This requires creating frameworks beyond the traditional relational database model – up until now, the foundation of e-discovery. My vision is to incorporate the best of the technological advancements and the most sophisticated analytical approaches and to apply them to improving the efficiency and accuracy of the e-discovery processes.

We have a history of developing analytical tools unique in the industry. For example, Privilege Analytics takes advantage of data extraction and entity identification. It ranks a set of documents according to their likelihood of being privileged. Privilege Analytics assesses information about a document that would not be identifiable solely through the use of keywords, concept organization or clustering. What kind of document is it: a letter or a memo? Who is it from? Who received it? Were there extraneous recipients that might vitiate privilege? Does it contain legal advice? The technology involved has existed in other areas of business and academia but has not been previously used in e-discovery. We continually search for ideas outside the traditional e-discovery world.

Another example, Warm Touch, uses a process known as sentiment analysis, which analyzes communications to assess an individual’s risk of anti-social behavior. This is based on approaches used by intelligence professionals to probe the psychology of new leaders when they take power in situations where there is limited direct access to them. By analyzing the person’s communications, signs of instability and the likelihood of erratic behavior can be assessed. While often used after an event has occurred, this approach also offers the possibility to detect signs of instability and agitation that could lead to destructive behavior. We are currently developing a Warm Touch behind-the-firewall appliance.

All of our processes are fully transparent and show the user (or if necessary, the court) exactly how the results were obtained – stemming from our background in forensics and investigations. We don’t use black box technology; everything is transparent and documented.

Editor: There is a lot of talk about the role of machines versus humans in e-discovery. How important is human input?

Barnett: In our processes, human input and analysis are critical. Most technology that attempts to replicate some form of analysis or reasoning by human beings relies on human input for calibration and refinement. In the case of Privilege Analytics, for example, the law firm or company identifies individuals likely to be engaging in privileged communications. The technology then uses a combination of rules-based analysis, entity extraction and text searching to rank the data as to likelihood of privilege – together, these technologies provide inferential reasoning not available in other processes. In simple terms, a letter, from outside counsel to inside counsel, referring to legal advice, with no external recipients, is likely to be privileged. These are the kinds of inferences that Privilege Analytics makes.
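
The kind of inference described above can be illustrated with a minimal sketch. This is not Privilege Analytics itself, whose internals are not described in the interview; the `Document` fields, the point weights, the `LEGAL_TERMS` list and the example addresses are all assumptions chosen for illustration. It shows only the general idea: people named by the legal team, document type, a text search and a check for extraneous recipients are combined into a score used to rank documents.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical phrases standing in for a real text search for legal advice.
LEGAL_TERMS = ("legal advice", "privileged", "attorney-client")


@dataclass
class Document:
    doc_id: str
    doc_type: str          # e.g. "letter", "memo", "email"
    sender: str
    recipients: list[str]
    text: str


def privilege_score(doc: Document, counsel: set[str], internal_domains: set[str]) -> int:
    """Return an illustrative score; the weights are assumptions, not a real model."""
    score = 0
    if doc.sender in counsel:                       # human-supplied list of lawyers
        score += 3
    if any(r in counsel for r in doc.recipients):
        score += 3
    if any(term in doc.text.lower() for term in LEGAL_TERMS):
        score += 3
    if doc.doc_type in ("letter", "memo"):
        score += 1
    # Extraneous recipients: anyone who is neither counsel nor inside the company.
    outsiders = [
        p for p in [doc.sender, *doc.recipients]
        if p not in counsel and p.split("@")[-1] not in internal_domains
    ]
    if outsiders:
        score //= 4                                 # third parties may vitiate privilege
    return score


def rank_by_privilege(docs, counsel, internal_domains):
    """Order documents from most to least likely privileged."""
    return sorted(docs, key=lambda d: privilege_score(d, counsel, internal_domains), reverse=True)


counsel = {"oc@lawfirm.example", "gc@acme.example"}
internal = {"acme.example"}
docs = [
    Document("c", "email", "x@acme.example", ["y@acme.example"], "Lunch on Friday?"),
    Document("b", "letter", "oc@lawfirm.example",
             ["gc@acme.example", "vendor@other.example"], "Our legal advice on the merger."),
    Document("a", "letter", "oc@lawfirm.example", ["gc@acme.example"],
             "Our legal advice on the merger."),
]
ranked_ids = [d.doc_id for d in rank_by_privilege(docs, counsel, internal)]
```

In this toy data set the letter between outside and inside counsel ranks first, the same letter copied to an outside vendor ranks lower, and the routine email ranks last. The human input is the `counsel` list; the rules only apply it.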

Editor: What can humans do that machines can’t?

Barnett: Even the most powerful computers can’t approach how a human analyzes and understands. Computers are amazingly fast and efficient at doing exactly what they’re told to do – no more, no less.

Computers are very bad at making sense of context. Suppose I make a simple statement, “I saw Andrew driving up from Chicago.” If you try to analyze that sentence, it includes a lot of unanswered questions. Was I driving? Was Andrew driving? Were we both driving?

Suppose the next sentence is, “He was in his new Ferrari, and I was on my front porch.” Based on the context or “pragmatics,” as linguists call it, you understand who was driving and who was not. Computers simply can’t do this — at least for now. They compare characters and strings and can be instructed to make certain inferences based on that, but they don’t understand anything in the sense that humans do.

A computer’s ability to make sense of sentiment or intent is even more limited. Suppose I ask you, “Do you know how to get to Central Park from here?” If you say “yes” and walk away, you really haven’t answered my question. All of us know, based on our understanding of the world, that question means, “How do you get to Central Park from here?” A computer can’t make that leap. No technology allows a computer to understand intent, sarcasm, irony, humor or any of the nuances that we take for granted in communication. That is why you need both computers and humans in conjunction doing what they each do best to make sense out of the vast amount of data confronting a company in a litigation or investigation.

Supervised learning or predictive coding technology, like our Auto Suggest tool, allows human beings to provide the input about the status of a document, and the technology learns from that. Suppose there is an exchange of emails. One email says “we need to cook the books for more revenue” (it’s not likely we’ll find an email stated that bluntly). The response states, “yes, let’s go for it.” A computer looking only at that response would not make anything of it, but a human being, able to consider context, would know it’s from a certain person to a certain person, sent on a certain date about a certain event. A human reviewer knowledgeable about the case understands the background of the discussion (as philosopher of language John Searle calls it) and the basic assumptions about how the world works, both of which are unavailable to a computer when it interprets text or makes inferences. Building on the human designations, Auto Suggest can rank documents according to their relatedness to human-designated documents. In this way, a process based on human understanding joined with technology can overcome the fact that computers can’t understand background, context and intent.
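
To make the idea of ranking by relatedness concrete, here is a minimal sketch. It is not how Auto Suggest works, which the interview does not specify; it substitutes a plain bag-of-words cosine similarity, and the function names and sample texts are hypothetical. Reviewers tag some documents as relevant, and the remaining documents are ordered by how closely their wording matches the tagged set.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Count lowercase word tokens; a deliberately crude text representation."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank_by_relatedness(tagged_relevant, unreviewed):
    """Order unreviewed document ids by similarity to the human-tagged documents."""
    profile = Counter()
    for text in tagged_relevant:                 # the human designations
        profile.update(bag_of_words(text))
    scored = [(cosine(profile, bag_of_words(text)), doc_id)
              for doc_id, text in unreviewed.items()]
    return [doc_id for _, doc_id in sorted(scored, key=lambda pair: pair[0], reverse=True)]


tagged = ["we need to cook the books for more revenue"]
unreviewed = {
    "d1": "revenue numbers look low, adjust the books before quarter close",
    "d2": "lunch on friday?",
    "d3": "yes, let's go for it",
}
ranking = rank_by_relatedness(tagged, unreviewed)
```

With these sample texts, the reply “yes, let's go for it” shares only one common word with the tagged email and scores well below `d1`, which is the limitation described above: the text alone carries little of the context a reviewer would bring. A practical system would also need signals such as sender, recipient and thread, which this sketch leaves out.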

Editor: How do you determine the appropriate mix of humans and technology?

Barnett: That is dictated by the client’s specific needs in the matter and the appropriate technology for the problem. Some approaches are very interactive up front, like Privilege Analytics or Auto Suggest, our predictive coding offering. Others have more human involvement after the data has been processed and allow clients to use their knowledge of the case to investigate the data, such as First Glance, our ECA tool. Our technical expertise along with our real world experience litigating, investigating and prosecuting many different types of matters gives us the credibility to advise for or against different approaches. By contrast, providers who rely solely on pushing their specific technology are like someone who only has a hammer and sees every problem as a nail.

Editor: The question of privacy and avoiding the release of personal data is a big challenge. Is this something that Stroz Friedberg addresses?

Barnett: The question of privacy and intermingling of data is a huge and complex issue. With the explosion in the kinds of communication platforms and devices available, the quantity and variety of data have grown enormously. More and more companies have their own Facebook pages or Twitter feeds, and people obviously use these things for personal communication as well. A 2011 report by IDC stated that as much as 85 percent of all digital data is owned or controlled by the corporate enterprise at some time in its lifecycle – which results in intermingling of personal and business data on a vast scale.

We use all of our analytical tools to help clients focus on the data they need for the matter and to deal with private information by segregating it where possible or by allowing them to redact portions that cannot be produced.

We also provide security risk consulting, guiding companies and law firms on how to protect all of their data – including personal information, confidential material and trade secrets – from a data breach, cyber attack or theft by departing employees. This is an important differentiator for Stroz Friedberg. Most providers of e-discovery services focus on processing data, putting it up for review and producing it. We understand the complete context of the data from its creation, to investigating its importance in an event, to explaining the details of its provenance and reliability in court.

Editor: How can companies reduce the cost of over-preservation?

Barnett: Companies that confront either litigation or regulatory or criminal enforcement actions are understandably very concerned about meeting their preservation obligations, and this concern can turn into excessive worry or fear and result in over-preservation to avoid any possible claim of spoliation. Our experience as testifying experts, neutrals and discovery special masters gives our clients confidence in our guidance on a defensible approach to reasonable rather than excessive and unnecessary preservation.
