Data Analytics

Structured Data Illuminates Facts Like Never Before

Jonathan Hurwitz, managing consultant at iDiscovery Solutions, talks about how structured data analysis is revolutionizing the process of discovery – even in cases where the client doesn’t realize how valuable it will be.

CCBJ: How is structured data changing the litigation landscape?

Jonathan Hurwitz: First, it allows us to get to the facts a lot faster and with greater confidence. We used to have to go through this long process of discovery – gathering whatever documents you have, both paper and electronic, figuring out what was and wasn’t relevant, and getting them ready for production. Then the other side reviewed and read them, in a process that could often be very expensive, time consuming and prone to errors; and based on all of that information, we tried to figure out what the facts were and what really happened. But when working with structured data, we can take a step back and pull information out of digital systems that represents objective, unbiased facts. It’s not something that somebody wrote down. It’s something that was recorded by their phone or by a computer system: It’s data that has no vested interest in trying to make itself look one way or another. And in many cases, it’s done automatically. For example, your phone constantly records your location. You could write a text message to somebody and say, “Hey, I’m working late, I’ll be home around 10,” but that doesn’t actually mean you arrived home around 10. If we look at the actual structured data that’s associated with that person’s phone, it might tell a very different story. We might get a GPS entry from the phone that indicates that, no, you weren’t actually working late. You were down at the bar, for instance, or somewhere else, doing something completely different. The data that’s being recorded by these structured systems isn’t going to lie. In this case, context becomes more important than content. That’s a significant change in the way that lawyers can look at litigation.

The second thing that is important to note is that we are now in a situation where it’s not just one side that has all of the data. Now both sides likely have data that’s relevant and producible. Historically, it’s always been big corporations that have had the resources to maintain and the burden to produce the bulk of responsive data for a given dispute. The other side, if the plaintiffs were individuals, usually didn’t have a ton of data. They were not really too worried about all of the things that go along with having to review and produce electronic discovery or electronic data. But now we’ve actually seen cases where an individual’s mobile phone has been considered discoverable. Even if it was a personal phone, we’ve actually seen the court say that, well, you used it while you were working – so if you’re claiming that you should be paid for hours that you weren’t paid for, then for all of the time that you claim you were on the clock, your phone is discoverable during those periods. In this sense, increased data availability and accessibility have leveled the playing field a bit. Both sides have data that can be discovered. Both sides have to worry about things like spoliation and production and the cost of electronic discovery.

The third point that’s interesting is that this data tends to help shorten the life cycle of litigation. We’ve had some cases where we’ve actually gotten structured data and presented some findings within the first few months of litigation and been able to go back to the plaintiffs and say things like, “You may not have a case here, your client may be misleading you as to what happened. Here’s what the structured data tells us, and this is where the deficiencies in your arguments are.” And if you can get that information in front of a plaintiff early enough, they can make a quick decision before they’ve put a lot of time or effort or money into the case. If you can show them that they’re actually going to lose because there is no merit there, and you can do it early, you can prevent long, drawn-out litigation. On the flip side, you might find out early on that the data is actually going to be in the plaintiff’s favor, which would open up a different conversation. In either scenario, effective analysis of structured data can allow you to quickly understand the strengths and weaknesses of your case and make informed strategic decisions early on in the process.

How is all of this different from traditional discovery?

It’s actually similar to traditional discovery in that all of the same legal rules apply. Litigation still counts. We still need to preserve, produce and review data – except now the data we’re looking at isn’t just Microsoft Word documents, Excel spreadsheets or email messages. It’s transactional items in a database, such as badge swipe records, computer access logs and GPS pings; and it gives a much more complete and defensible account of how things happened.

Say, for instance, when you walk into the office building in the morning, you have to swipe your badge, which creates a record that indicates that at 8:46 a.m. a person – let’s call him Jonathan Hurwitz – walked into the building and swiped his badge. Then he went to the third floor and swiped his badge on the reader there. Then he sat down at his computer and logged on to the corporate server. You can build what we’ve been calling a “day in the life” profile about an individual and really say that this is what their day looked like. They showed up at the building at 8:46 a.m. They entered their actual office at 8:53 a.m. Their computer was turned on and logged into the network at 8:57 a.m. So you can take that information and look at it in the light of whatever the merits of the case are. When we do wage and labor disputes, we often see people saying, “Hey, I was working off the clock and wasn’t getting paid.” And we can say, “Well, it looks like you didn’t get to the building until 8:46, and your time sheet has you clocking in and getting paid starting at 8:53. So it took seven minutes for you to go from the front door to the office, where you clocked in. That seems like a reasonable amount of time.”

On the other hand, maybe the data will tell us, “Yes, you entered the building at 8:46, but you didn’t get to the office and clock in until 9:30. Maybe there is a problem there. Maybe that’s something we need to investigate.” So the difference is that we’re not relying on subjective information or anecdotal evidence gleaned from interviews and written accounts when we’re trying to determine when a given event happened; instead, we’re looking at actual traces of activity that were recorded in real time, and they have a very hard factual basis. When you swipe your badge, the system is going to record when that badge got swiped on that specific badge reader. It’s both objective and consistent. You’re probably going to have a hard time getting into the building without swiping your badge – so if we don’t have a badge swipe for you, it’s a good indication that you weren’t actually in that building on that day.
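The badge-swipe comparison described above can be sketched in a few lines of Python. This is only an illustration, not iDiscovery Solutions’ actual method: the event names, the calendar date and the 15-minute review threshold are all assumptions made up for the example.

```python
from datetime import datetime, timedelta

# Hypothetical "day in the life" records for one employee.
# Event names and the date are illustrative assumptions.
events = [
    ("building_badge_swipe", datetime(2024, 1, 8, 8, 46)),
    ("office_badge_swipe", datetime(2024, 1, 8, 8, 53)),
    ("network_logon", datetime(2024, 1, 8, 8, 57)),
]
timesheet_clock_in = datetime(2024, 1, 8, 8, 53)


def entry_to_clock_in_gap(events, clock_in):
    """Return the time between the first building swipe and the clock-in.

    Returns None when no building swipe was recorded at all.
    """
    swipes = [ts for name, ts in events if name == "building_badge_swipe"]
    if not swipes:
        return None
    return clock_in - min(swipes)


def needs_review(gap, threshold=timedelta(minutes=15)):
    """Flag a day when the swipe is missing or the gap exceeds the
    (assumed) threshold for a reasonable walk from door to desk."""
    return gap is None or gap > threshold


gap = entry_to_clock_in_gap(events, timesheet_clock_in)
print(gap, needs_review(gap))
```

With the times from the example, the gap is seven minutes and the day is not flagged. A clock-in at 9:30 against the same 8:46 swipe would be flagged for a closer look, as would a day with no building swipe.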

Just by having a smartphone, you’re generating an ever-expanding digital footprint that is being leveraged in ways you don’t even realize.

How do you see data leveling the playing field?

In the past, electronic discovery has been fairly one-sided. It was the corporation that had a lot of data, and the individual usually didn’t. Especially with large class actions, you’d see corporations having to do a lot of work in terms of document preservation, legal hold, review, production. They’d be worried about spoliation, whereas on the other side, people would be saying, “I don’t have any data. I’m just a person. I don’t have anything. However, I typically arrived at work at 8:30 a.m. and regularly was forced to work evenings and weekends.” But with the new structured data approach, both sides have a duty to preserve electronic data that’s likely relevant. Most individuals have smartphones that are going to track a lot of their activity – social media accounts, email accounts, etc. It’s very common now for individuals to have large amounts of data, even if they’re not aware of it. So we find ourselves in a situation where it’s not just one side that has this duty to preserve information and be worried about spoliation. Both sides are. So now maybe we come at things with a bit more of a sane approach as far as what’s reasonable and what demands we can make, because those demands could very easily be turned around and applied to both parties.

By allowing you to reach conclusions with greater confidence in less time, structured data analytics can actually make a case much less expensive than it would have been using more traditional methods.

What are some important considerations you make when staffing investigations?

You want to make sure that you have people who have the domain knowledge to understand the factual context of a case, but also the technical expertise to understand what data will be the most useful and to leverage it to its fullest potential. You want people who can really dig into the data and figure out what it means – what’s going on. Because one of the things we find with structured data is that it doesn’t always mean what you think it means. You might get an individual’s cell phone, look at the GPS location information on it and immediately say, “On the night of the murder, you were awake and walking around the house – your phone shows the GPS moving.” But actually, what can happen with a phone with GPS is that if you place it in one location and don’t move it, it might look like it’s moving, because the longer it sits there, the more accurate its location information will get. So it looks like the phone is “moving,” but actually it’s just the GPS homing in and getting a better reading. Or when looking at time stamp information across different sources, we may see instances where an employee appears to be working very long hours on certain days. However, upon closer inspection, it turns out that each system has recorded time stamps in a different time zone, and when we normalize these sources, the employee’s working profile becomes much more realistic. Examples like these demonstrate the importance of having analysts who are both familiar with the technology and appropriately skeptical. Having people who are able to see through some of these biases or preconceived ideas about what the data means is crucial for reducing the risk of false positives and improving the quality of your analysis. iDiscovery Solutions has a deep team of individuals with significant experience leveraging structured data in a variety of legal contexts, including investigations, discovery and expert testimony.
We’d love to provide a free consultation on your project to discuss how structured data analytics might help.
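The time stamp example above, where sources recorded in different time zones make a workday look longer than it was, can be illustrated with a short Python sketch. The system names, the fixed UTC-5 offset and the sample times are assumptions for illustration only. In a real matter the recording time zone of each system would have to be confirmed, and daylight saving time accounted for, before any normalization.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the recording offset of each source system is known.
# Fixed offsets keep the sketch simple; they ignore daylight saving time.
SOURCE_OFFSETS = {
    "badge_system": timezone(timedelta(hours=-5)),  # records local time
    "network_log": timezone.utc,                    # records UTC
}


def normalize(source, naive_timestamp, target=timezone.utc):
    """Attach the source system's offset to a raw timestamp and
    convert it to a common target zone."""
    aware = naive_timestamp.replace(tzinfo=SOURCE_OFFSETS[source])
    return aware.astimezone(target)


# Raw values exactly as each (hypothetical) system stored them.
badge_in = datetime(2024, 1, 8, 8, 46)   # local time, UTC-5
log_off = datetime(2024, 1, 8, 22, 30)   # UTC

# Comparing raw values mixes two time zones and overstates the day.
naive_span = log_off - badge_in

# Comparing normalized values gives the actual elapsed time.
normalized_span = (
    normalize("network_log", log_off) - normalize("badge_system", badge_in)
)

print("raw:", naive_span, "normalized:", normalized_span)
```

Here the raw values suggest a day of 13 hours 44 minutes, while the normalized values show 8 hours 44 minutes. The five-hour difference is simply the offset between the two systems.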
