Discovery

Why Big Data Is A Big Deal

Editor: Tell us about your legal practice and your involvement in matters relating to Big Data.

Berezin: I’m a member of Weil’s Complex Commercial and Patent Litigation groups and am based in Weil’s New York office. My practice focuses on litigation related to technology, as well as other complex commercial business disputes. In a number of ways, Big Data matters are increasingly part of my practice and the services that Weil provides to clients.

First, in nearly every litigation that Weil handles, the e-discovery process can present a significant Big Data challenge given the extremely large data sets originating from many different sources. As a result, we know how to identify in all of this data the truly critical documents and evidence, while managing the process efficiently.

Second, we also represent and counsel clients concerning data breaches, issues involving the use of cloud computing, PCI (Payment Card Industry) compliance issues, and data privacy issues. All of these issues can and often do arise in the context of Big Data.

Third, I am increasingly litigating cases that involve Big Data, including patent cases pertaining to technology used to analyze and derive value from Big Data, commercial disputes arising from the failure or inability to offer services that capitalize on Big Data innovations increasingly demanded by customers, and class action cases involving the alleged unlawful manipulation of Big Data to make supposedly improper business decisions.

Editor: Please explain Big Data.

Berezin: Definitions vary, but in my view Big Data describes a situation where information of various types is collected rapidly and stored in huge volumes. With Big Data, data sets are so large and diverse that the old ways of managing and using that data are simply insufficient. In response to the challenge of Big Data, organizations are using newer technology to mine all of this previously disparate and disorganized data to gain valuable insights.

Editor: Please explain some of the areas where such insights have been used.

Berezin: Cities have drawn upon and analyzed data to predict where violent crime is likely to occur and to concentrate law enforcement resources at those locations to make neighborhoods safer. This is an example of how Big Data can be used to improve people’s lives. We also understand that the federal government has used Big Data to prevent acts of terrorism and to help bring terrorists to justice, as we saw in the case of the Boston Marathon bombers.

In healthcare, large data sets have been used to uncover important correlations between adverse health events and various activities. There are numerous published examples of this occurring. Yet another example occurs in the energy sector, where energy companies analyze Big Data derived from the power grid to optimize energy delivery and improve the use of renewable energy sources. In general, the Industrial Internet – which involves leveraging the Internet to collect, analyze and benefit from industrial data (e.g., data collected by sensors on equipment thousands of miles from headquarters where it can be analyzed to diagnose and fix a problem before it becomes a major issue) – is bound to become an increasingly important aspect of the Big Data discussion.

Editor: Are there privacy issues in connection with Big Data?

Berezin: Privacy issues arise typically when a company collects and stores data that contains information that can be used to identify an individual. In these circumstances, companies need to be mindful of the various regulations and laws that govern the collection of that data and its use, both in the U.S. and internationally.

In the EU today, there are comprehensive regulations that govern data and privacy, and that also govern the transfer of data to countries like the U.S. There is a U.S.-EU Safe Harbor that will permit data from European countries to be transferred to the U.S. if the recipient company adheres to what are called “safe harbor privacy principles.” These principles could potentially inhibit the collection of data.

For example, certain requirements in the U.S.-EU Safe Harbor relate to the manner in which data may be collected. In fact, under the U.S.-EU Safe Harbor, there are notice and opt-out provisions that must be offered to individuals whose data is to be collected. As a result, the manner in which data is collected and maintained is important from a regulatory and risk perspective even in situations where a company does not intend to use any data that could reveal a person’s identity.

Nor is this solely a European regulatory issue. The U.S. has a complex patchwork of federal and state laws and regulations relating to personally identifiable information and specific types of data (e.g., health-related data, data relating to minors, financial data). For example, Section 5 of the Federal Trade Commission Act prohibits unfair or deceptive practices in or affecting commerce. The FTC has exercised its enforcement power related to the collection of data that contains personal information on a number of occasions. Other laws govern how certain data must be stored and protected, and provide for notification when a data security breach occurs.

Editor: Since collecting and datafying information may be deemed to be in the public interest, is it important that datafication of information by public bodies and private companies and its public disclosure be encouraged?

Berezin: At the end of the day, we are talking about whether it is better to have vastly richer, non-private information in the public domain. We know that when information is disclosed responsibly, it can be mined and analyzed to help our society improve and advance. Both the public and private sectors seem to understand this reality. For example, the federal government has a website called data.gov designed to increase the ability to find, download, and use all kinds of data sets that are generated and held by the U.S. government.

The federal government has also been encouraging the use of its healthcare data for the common good. For example, there are policies that encourage the disclosure of healthcare data from agencies such as the Centers for Medicare & Medicaid Services (CMS), the Food and Drug Administration (FDA), and the Centers for Disease Control and Prevention (CDC). Likewise, the Affordable Care Act authorizes Health and Human Services to release data to promote transparency in the market for healthcare and health insurance.

The Obama administration also initiated the Big Data Research and Development Initiative and committed $200 million to help the federal government improve its own use of Big Data. At the state level, we are seeing the launch of portals or websites that are called variously “data transparency portals,” “sunshine portals,” or “open data portals.” New York State, for example, has launched an open data portal that brings together data sets from local, state, and federal levels. It includes economic data, data about political donations, data about state expenditures, average daily traffic data, employment and crime statistics, and much more. We’re seeing similar efforts in several other states as well.

Editor: If a city is using surveillance cameras to fight crime, how can Big Data help?

Berezin: If data analyses make it possible for law enforcement in a number of cities to determine where violent crime is likely to occur, then law enforcement can benefit greatly from Big Data as a guide to where an increased police presence is desirable and where surveillance cameras can be placed to make those neighborhoods safer for the people who live there. As an example, surveillance cameras were part of the data analyzed to catch the Boston Marathon bombers.

Editor: I suppose there are some very good examples with respect to the use by companies of Big Data to look for ways in which products or sales can be improved.

Berezin: What we’ve seen are countless examples as diverse as Internet companies, brick and mortar retailers, security companies, airlines, etc., using all of the diverse data that had previously been in discrete silos to discover valuable insights about their customers and to improve the experience they offer to customers in ways that were really unthinkable before the rise of Big Data and the development of technology enabling its value to be effectively mined.

Editor: In legislative and regulatory proceedings, do you see Big Data being used by lawyers and lobbyists to support positions?

Berezin: Yes, just as Big Data can help companies improve the products and services they provide, Big Data can also be used to support a position before regulators or lawmakers to demonstrate, for example, the value of a particular regulation or law under consideration or that appropriate modifications should be made. It is obviously important to ensure that the process is handled scientifically and credibly, but there’s no doubt that Big Data can have a significant beneficial impact for clients in the areas of regulation and lawmaking.

Editor: Do you see a time when legislators and regulators will expect to hear testimony from lobbying groups citing results of Big Data analysis?

Berezin: I do, and I also expect to see legislators and regulators increasingly seeking the results of Big Data analyses from the government’s own stores of data. As we discussed earlier, governments are investing significantly in their ability to collect, manage, understand and analyze Big Data. So, yes, I believe that in the future, legislative and regulatory decisions will be strongly influenced by analyses of Big Data.

Editor: Are there litigation-specific risks associated with Big Data?

Berezin: Regulatory enforcement actions and class action claims typically follow data breaches that expose personal information. Cases have also been brought when company personnel allegedly made improper decisions based on Big Data analyses. An example would be a class action based on the alleged misuse of Big Data to make decisions about consumer credit applications. As I mentioned, Weil already has handled significant cases that involved Big Data.

We have also been involved in situations when litigants sought in discovery underlying data from multiple databases, and in some instances they also sought proprietary applications used to analyze that data. It is certainly possible that a party may go further and seek to require another party to actually process its own data to find correlations that would be useful to support the requesting party’s case. We are very familiar with the arguments on both sides of this issue, and it is clearly one that lawyers need to be prepared to handle.
