The How, Where, Why and What of Cloud-Based E-Discovery

What exactly is the cloud? And how important is it to E-discovery?

These days, many of the services we use, the products we buy, our sources of entertainment, and even our social circles are delivered to us via the cloud. We do our banking and pay our bills in the cloud; we buy everything from clothes to computers to groceries online via cloud-based platforms. Online shopping has become so prevalent that, even on Black Friday, the biggest in-person shopping day of the year, online shoppers outpaced in-store shoppers 88 million to 66.5 million in the U.S. in 2021. So many aspects of our personal lives today are tied to the cloud, even if it isn’t top of mind for most of us. Cloud-based software is becoming ubiquitous in business as well. Many organizations have moved from an on-premise Microsoft Office system to M365 in the cloud. M365 is used by over a million companies worldwide as their main productivity software, with close to 600,000 companies in the U.S. alone opting for it. And G-Suite has an even larger share of users in the U.S.: 59.41% vs. 40.39% for M365.

Many organizations have been using cloud-based platforms for critical business functions for years already, including SAP for Accounting, ADP for Payroll, Workday for HR and Salesforce for Customer Management. The use of collaborative apps such as Slack and Teams has risen dramatically, especially since the pandemic, to support collaboration among increasingly remote teams. It’s no wonder that:

The global cloud computing market is expected to grow from $445.3 billion in 2021 to $947.3 billion by 2026.

That trend has also carried over to the market for e-discovery solutions. According to ComplexDiscovery, off-premise (cloud) e-discovery software spending is estimated at approximately 50% ($2.16B) of worldwide e-discovery software spending in 2021, with that share increasing to approximately 70% by 2026, or an estimated $5.03B. That is roughly 2.3 times the 2021 figure, an increase of nearly 133% in five years!

While the move to the cloud is unmistakable, many people still don’t understand it. This lack of understanding has perpetuated a lot of myths associated with the cloud. People often don’t fully understand the benefits associated with the cloud and why it makes sense for so many business processes today, including where it applies to e-discovery.

Finally, even organizations that are committed to the cloud often don’t know what to look for in a cloud-based e-discovery solution. This whitepaper will address the HOW, WHERE, WHY and WHAT associated with cloud-based e-discovery:

  • HOW the Cloud Works
  • WHERE E-discovery is in the Cloud Today
  • WHY the Cloud Makes Sense for Businesses Today
  • WHAT to Look for in a Cloud E-discovery Solution

Understanding these factors will hopefully help your organization make an informed decision when it comes to selecting the right e-discovery solution for you.

How the Cloud Works

To understand the cloud, you have to start with how the cloud works, how it differs from traditional on-premise computing and how it’s been misunderstood in the marketplace.

Defining the Cloud

Cloud computing is the on-demand availability of computer system resources, especially data storage (cloud storage) and computing power, without direct active management by the user. In other words, the cloud consists of resources available to organizations to perform processes and store data that they don’t have to manage.

Conversely, with on-premises solutions (also referred to as on-premise, and alternatively abbreviated on-prem) organizations have to provide not just the software, but also the computers the software runs on within the premises of the person or organization using the software. Companies have to provide all supporting software (e.g., operating system) and the infrastructure, such as storage, and other resources.

One of the reasons that cloud computing has become so popular is that many organizations can’t fully replicate the environment needed to support the capacity and security requirements they have — at least not without considerable time and expense.

Three Categories of Cloud Services

The best way to look at cloud computing is in terms of the services it provides. The three common categories of cloud services are:

  • Infrastructure as a Service (IaaS): Cloud provider hosts servers, storage and other virtualized computer resources and makes them available to consumers over the internet.
  • Platform as a Service (PaaS): Cloud provider hosts application development platform and services on its own infrastructure and makes them available to consumers over the internet.
  • Software as a Service (SaaS): Cloud provider hosts applications and makes them available to consumers over the internet.

Each layer builds on the previous layer and many cloud providers provide all three levels of service — the infrastructure, platform and software — that their consumers need to run applications.

The Shared Responsibility Model, or, Who’s Responsible for What?

There are a lot of components needed to support an enterprise-level solution today. They include the servers, the storage capacity, networking (for connectivity), virtualization, the operating system (required for any server to run), middleware (enabling the various components of the system to communicate and manage data), runtime (software to support the execution of computer programs), the customer data and the applications themselves.

With an on-premise solution and each of the three cloud service categories, the responsibility for each of these components varies — from the consumer being completely responsible (on-premise) to the cloud provider being completely responsible (SaaS). Here is the shared responsibility model for Amazon Web Services (AWS) as an example:

While AWS is used as the example here, the same concept applies for any cloud provider.
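
As a rough illustration of how responsibility shifts across the models, the split can be sketched in a few lines of Python. The component names come from the list above. The cut-off points for IaaS and PaaS follow the commonly cited layering and are an assumption, not a transcription of any provider's published model. The names `COMPONENTS`, `PROVIDER_MANAGED_COUNT` and `responsibilities` are invented for this example.

```python
# Components from the text, listed from the bottom of the stack to the top.
COMPONENTS = [
    "servers",
    "storage",
    "networking",
    "virtualization",
    "operating system",
    "middleware",
    "runtime",
    "customer data",
    "applications",
]

# Assumption: how many components, counted from the bottom of the stack,
# the cloud provider manages under each model.
PROVIDER_MANAGED_COUNT = {
    "on-premise": 0,  # consumer manages everything
    "IaaS": 4,        # provider manages up through virtualization
    "PaaS": 7,        # provider manages up through runtime
    "SaaS": 9,        # provider manages everything, as the text describes
}


def responsibilities(model):
    """Return a dict mapping each component to 'provider' or 'consumer'."""
    cutoff = PROVIDER_MANAGED_COUNT[model]
    return {
        component: ("provider" if index < cutoff else "consumer")
        for index, component in enumerate(COMPONENTS)
    }


if __name__ == "__main__":
    for model in PROVIDER_MANAGED_COUNT:
        managed_by_consumer = [
            c for c, owner in responsibilities(model).items() if owner == "consumer"
        ]
        print(f"{model}: consumer manages {managed_by_consumer or 'nothing'}")
```

Under this sketch, an IaaS customer still manages the operating system and everything above it, while a SaaS customer manages none of the stack.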

Busting the Myths of the Cloud

Organizations commonly give a handful of reasons for not considering a cloud solution. Unfortunately, many of those reasons are due to misconceptions about cloud solutions in terms of what they provide, how they work, and their impact on an organization. These “myths” keep organizations from making an accurate assessment of what a cloud solution can do for them.

MYTH: The Cloud is Less Secure Than On-Prem

Most organizations can’t even dream of providing the level of security that cloud providers offer. Just consider these levels of physical security offered by a cloud provider like AWS:

  • Data Center Access
    • Access to data centers is restricted to employees and third parties whose access requests specify which layer of the data center they need to reach. Access is time-bound, with an expiration date.
  • Data Center Access Logs
    • Physical access to AWS data centers is logged, monitored, and retained. AWS correlates information gained from logical and physical monitoring systems to enhance security on an as-needed basis.
  • Data Center Access Monitoring
    • AWS monitors their data centers using their global Security Operations Centers, which are responsible for monitoring, triaging, and executing security programs, providing 24/7 global support.
  • CCTV
    • Physical access points to server rooms are recorded by Closed Circuit Television Camera (CCTV). Images are retained according to legal and compliance requirements.
  • Data Center Entry Points
    • Physical access is controlled at building ingress points by professional security staff utilizing surveillance, detection systems, and other electronic means.
  • Intrusion Detection
    • Electronic intrusion detection systems are installed within the data layer to monitor, detect, and automatically alert appropriate personnel of security incidents 24/7.

That’s just physical security. There are also operational and management security mechanisms ranging from firewalls and antivirus to incident response plans and hiring and termination policies. In short, you can rest assured cloud solutions are very secure.

MYTH: The Cloud is Newer Technology

While many people think that cloud computing is new technology, the concept goes back to the 1950s and “time-sharing” of mainframe computers to enable several users to access the mainframe from connected stations that carried no processing power of their own (i.e., “dumb” terminals).

And the term “cloud” was used to refer to platforms for distributed computing as early as 1993, when Apple spin-off General Magic and AT&T used it in describing their (paired) Telescript and PersonaLink technologies. The concept of cloud computing has been around for decades.

MYTH: There is Less Control with Cloud-Based Solutions

While many think that you give up control when putting your data in the cloud, you still control your data and can determine who can access it. A typical customer agreement with a cloud provider includes provisions that prohibit the provider from accessing, using, or sharing customer data without your agreement (except as required to prevent fraud and abuse, or to comply with laws).

In fact, utilizing the technical and organizational security measures and controls offered by a cloud provider makes it easier for an organization to manage their own compliance requirements and comply with data privacy laws such as GDPR and CCPA.

MYTH: The Cloud Eliminates Jobs

With systems moving to the cloud, people think that IT jobs will be eliminated. Instead, the move changes the focus of IT from the physical aspects of setting up the network, hardware and infrastructure to the strategic aspects of managing and securing customer data and the access to that data. The focus of your IT staff is redirected to your organization’s broader mission-critical tasks associated with access to and protection of data, not the “nuts and bolts” of making the environment work in the first place, which makes them more productive.

MYTH: Getting Started in the Cloud is Difficult

This is the easiest myth to debunk. Because of all the components addressed by a cloud provider (per the shared responsibility model above), cloud solutions require a much shorter ramp-up time than most on-premise solutions. While on-premise solutions can take as much as six months to a year to install, troubleshoot, integrate, and launch, cloud solutions can be up and running within just days.

MYTH: The Cloud is Less Reliable Than On-Prem

Cloud solutions are also very reliable — to the extent that when a notable cloud provider experiences downtime, it often makes the news. Reputable cloud providers implement their cloud computing environments with reliability in mind from the start — from selecting a site designed to mitigate environmental risks (such as flooding, extreme weather, and seismic activity) to redundancy (to quickly move mission critical processes to another resource when failures happen) to business continuity and pandemic response.

The Nines of Reliability

One way a cloud provider demonstrates reliability is through uptime (not including planned and announced maintenance periods) and the idea of “nines”. For example, a provider with 99% uptime (two “nines”) may seem reliable, but that equates to 3.65 days of unscheduled downtime per year! Three “nines” (99.9%) equates to 8.77 hours of unscheduled downtime per year and four “nines” equates to 52.6 minutes per year. Reliable cloud providers typically provide an uptime level of at least three to four nines.
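
The arithmetic behind the “nines” is easy to check. The short Python sketch below assumes an average year of 365.25 days (an assumption made for this example; using 365 days shifts the results slightly), and the function name `downtime_hours_per_year` is invented here.

```python
HOURS_PER_YEAR = 365.25 * 24  # assumption: average year, including leap years


def downtime_hours_per_year(uptime_percent):
    """Hours of unscheduled downtime per year allowed at a given uptime percentage."""
    return (100.0 - uptime_percent) / 100.0 * HOURS_PER_YEAR


if __name__ == "__main__":
    for uptime in (99.0, 99.9, 99.99):
        hours = downtime_hours_per_year(uptime)
        print(
            f"{uptime}% uptime: {hours:.2f} hours "
            f"({hours / 24:.2f} days, {hours * 60:.1f} minutes) per year"
        )
```

Each additional nine cuts the allowable downtime by a factor of ten, which is why the gap between two nines and four nines is the difference between days and minutes.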

WHERE Cloud E-discovery is Today

With all that in mind, it’s no wonder that e-discovery is moving toward the cloud as well and the most popular software offerings today are cloud solutions. E-discovery cloud software solutions have evolved over the years – from solutions that supported and automated specific tasks within the EDRM lifecycle to solutions that today streamline and automate end-to-end e-discovery.

Where Cloud-Based E-discovery Began

The earliest cloud-based solutions for e-discovery began to emerge in the late 2000s, and they were designed to support specific tasks within the EDRM life cycle, like the on-premise platforms that were primarily in use up to that point. Cloud-based e-discovery solutions back then primarily supported three specific functions for e-discovery:

  • Legal Hold Notification and Management
    • The earliest legal hold solutions automated the legal hold notification process, including creation of legal hold notices, automated distribution of hold notices to custodians, as well as tracking and reporting the acknowledgement status for each custodian.
  • Processing
    • These solutions were designed to perform traditional e-discovery processing tasks, including unpacking of container files, extracting metadata and text from files, extracting attachments from emails and generating TIFF or PDF images of the files for review.
  • Review and Production
    • These solutions were designed to support workflows associated with searching, review and generation of production sets. The earliest solutions were designed to replicate functionality within popular on-premise e-discovery solutions like Summation and Concordance.

The next iteration of e-discovery cloud solutions began to automate the flow of data through multiple EDRM phases within a single solution. These on-demand, Do-It-Yourself (DIY) solutions enabled users to point to a group of files (either a container file or folder) and upload the files to the cloud environment, where they were automatically processed and ingested into the review platform for search, review and eventual production. They linked portions of the Preservation/Collection phases (preservation by collection) and generally spanned the entirety of the Processing, Analysis, Review and Production phases of the lifecycle.

Where Cloud-Based E-discovery is Today

Today’s users of cloud e-discovery solutions demand a greater level of automation to support the entire e-discovery life cycle.

Identification through Production

Users of cloud e-discovery solutions expect a solution that connects the “left side” of the EDRM (Identification, Preservation and Collection) with the “right side” (Processing, Analysis, Review and Production) to automate the flow of ESI from beginning to end. This includes the extension of legal hold management capabilities and the ability to preserve-in-place, enabling organizations to cull data earlier in the life cycle, saving time and cost downstream.

Integrations and Partnerships

That extension of automation requires integrations and partnerships with current Enterprise solutions that govern and manage information to support the increased emphasis on the left-side EDRM phases. This includes preservation-in-place of ESI in M365, Google Vault/Workspace and on individual custodian computers via solutions like Code42.

The cloud-based e-discovery solution of today must be comprehensive and connected to support the automation capabilities required by e-discovery professionals to keep up with increasing data volumes and discovery projects.

WHY the Cloud Makes Sense for Legal Departments Today

Having discussed HOW the cloud works and WHERE it’s used in e-discovery, let’s turn to WHY so many organizations use the cloud today to support important business functions, including e-discovery.

Benefits of Cloud-Based Computing

Several benefits of cloud-based computing have led to its growing popularity, and they also fit the remote work environment that many of us work in today.

Ease of Use

The typical cloud-based e-discovery solution is designed to be consistent with current user interface (UI) design best practices, which means that the typical UI of a cloud solution is similar to other solutions you’ve probably used. Because deployment of cloud-based solutions is much easier than it is for on-premise solutions (software updates only need to be installed in one place, at the cloud provider), cloud solutions can adapt to changing trends and best practices in UI design much more quickly than on-premise solutions can.

Ease of Getting Started

Because most of the components in a cloud-based solution are already in place, it’s much easier and quicker to get started using a cloud solution than it is an on-premise solution where all of those components are the responsibility of your organization.

Lower Footprint

This also means a lower footprint within the organization, requiring less hardware and software to run than on-premise solutions. That’s true not just for the network and server environment, but also for individual workstations, which typically already have the software needed to run a cloud solution: your web browser.

Fewer Configuration Issues

Cloud solutions also eliminate most configuration issues, as most (if not all) of the functionality is contained within the browser environment. With on-premise solutions, installation on different workstations with different operating systems and different levels of updates applied can be challenging — what works on one workstation may not work on another because of compatibility issues. Diagnosing those compatibility issues can take time.

Greater Storage Capacity and Processing Power

For total storage capacity, system speed, and processing ability, on-premise systems are finite and predetermined, requiring an organization to accurately assess its needs up front. Overestimating can result in unnecessary expenses due to unused capacity, while underestimating can result in additional expenses, as well as downtime when upgrades are needed.

Conversely, cloud-based systems are “elastic,” or scalable on demand, because they draw on the cloud provider’s entire online network. Additional storage and processing power are available when needed but only paid for (by your organization) when used.
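
To make the trade-off concrete, here is a small Python sketch comparing the two approaches. Every number in it is hypothetical, and it assumes the same unit price per terabyte for both models purely to isolate the effect of paying for provisioned versus used capacity. Real pricing will differ. The function names are invented for this example.

```python
def fixed_capacity_cost(monthly_demand_tb, provisioned_tb, cost_per_tb):
    """On-premise style: pay for the provisioned capacity every month,
    whether or not it is used.

    Returns (total_cost, months_where_demand_exceeded_capacity).
    """
    total = provisioned_tb * cost_per_tb * len(monthly_demand_tb)
    shortfall_months = sum(1 for demand in monthly_demand_tb if demand > provisioned_tb)
    return total, shortfall_months


def elastic_cost(monthly_demand_tb, cost_per_tb):
    """Cloud style: pay only for the capacity actually used each month."""
    return sum(demand * cost_per_tb for demand in monthly_demand_tb)


if __name__ == "__main__":
    # Hypothetical demand in TB over six months, with one spike in month three.
    demand = [2, 3, 10, 4, 2, 3]
    fixed_total, shortfalls = fixed_capacity_cost(demand, provisioned_tb=8, cost_per_tb=100)
    print(f"Fixed capacity: {fixed_total} total, {shortfalls} month(s) over capacity")
    print(f"Elastic: {elastic_cost(demand, cost_per_tb=100)} total")
```

In this made-up scenario the fixed system is paid for in full during quiet months and still falls short during the spike, which is the overestimating and underestimating problem described above.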

Always Using the Latest Version

Another benefit of cloud-based solutions is that you’re always using the latest version of the software, with the latest features. That’s because software updates only need to be installed in one place, at the cloud provider, while for on-premise solutions, the software may need to be installed in multiple places (individual desktops). Even if the software is installed on a single server within the organization, your IT department may have to test for, and address, potential compatibility issues before installing the latest release. When the cloud provider installs the latest update, you have access to the newest features and bug fixes immediately.

Today’s Remote Work Landscape

Another reason for the rising popularity of cloud solutions is the migration of many workers to remote work environments, which accelerated during the COVID-19 pandemic but was already occurring well before that. With fewer workers in a traditional office setting, the need to access the platform from anywhere, and do so securely, has become paramount.

Access From Anywhere

Of course, cloud-based solutions are designed to be accessible from anywhere. As long as you have an internet connection, you can access a cloud solution. On-premise solutions can be configured to be accessible through virtual private network (VPN) access, but that requires your organization to set up and maintain the VPN, as well as apply system updates to keep the VPN up and running optimally.

Centralizing Security

Another advantage is that many of the security mechanisms associated with remote work are centralized within a cloud solution, which (as discussed above) is very secure. IBM reports that the rapid shift to remote operations during the pandemic appears to have led to more expensive data breaches. Breaches cost over $1 million more on average when remote work was indicated as a factor in the event than when it was not ($4.96 million vs. $3.89 million). With 81% of U.S.-based IT professionals believing that having remote workers has increased their enterprise’s security challenges and 74% acknowledging that their company’s use of cloud solutions increased as a direct result of the COVID-19 pandemic, centralizing security helps limit not only the costs associated with data breaches, but also the entry points where breaches can happen in the first place.

WHAT to Look for in a Cloud E-discovery Solution

While there are plenty of reasons to use a cloud solution for e-discovery, each cloud e-discovery solution has different capabilities with regard to the software and the security of customer data. How each handles non-traditional e-discovery use cases may differ as well. Here is WHAT to look for in a cloud e-discovery solution.

Key Capabilities

While preferences for individual features may vary across organizations, these key capabilities are important to address discovery needs today.

Support the E-discovery Lifecycle

As noted in our whitepaper “Corporate E-discovery Data Realities for 2021” released last year, today’s e-discovery challenges can include big data, growing cyberattack threats, increased data privacy compliance challenges and a rise in investigations. Those complexities and challenges dictate a solution that supports the entire e-discovery lifecycle — from Legal Hold Notification to Production to automate the flow of ESI from beginning to end. Taking a “piecemeal” approach to discovery by moving data through a group of solutions manually adds time, cost and the potential for mistakes into your e-discovery process.

Automated In-Place Preservation and Collection

With today’s big data challenge, collecting entire custodian data sets for discovery purposes has become unwieldy and costly. The ability to preserve-in-place within office productivity suite solutions (M365 or Google Vault/Workspace), collaboration apps (Slack), and risk management, detection and response solutions (Code42) enables targeted collections, saving time and cost downstream.

High-Speed, Comprehensive Processing

Even with a reduction of data downstream, your cloud e-discovery solution also needs a parallel processing engine, able to run multiple processing “threads” at once, that is built to address today’s big data challenges. Those challenges include an ever-increasing variety of data types: everything from Apple iOS files to social media data from Facebook to data from collaboration applications like Skype and Slack to forensic images from workstations and mobile devices. Your cloud e-discovery solution needs to be able to process hundreds of file types to support those evolving needs.
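
The idea of running several processing threads at once can be illustrated with Python’s standard library. This is a minimal sketch, not how any particular e-discovery engine works: the per-file step `extract_metadata` is a hypothetical stand-in for real processing (text extraction, imaging and so on), and a production engine would more likely distribute work across processes or machines.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def extract_metadata(path):
    """Hypothetical per-file processing step: basic metadata plus a SHA-256 hash."""
    data = Path(path).read_bytes()
    return {
        "name": Path(path).name,
        "extension": Path(path).suffix.lower(),
        "size_bytes": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def process_files(paths, max_workers=4):
    """Run extract_metadata over many files using a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in input order even though files are
        # processed concurrently.
        return list(pool.map(extract_metadata, paths))
```

Calling `process_files(list_of_paths, max_workers=8)` would process up to eight files at a time. The hash is included because a file hash is a common way to identify exact duplicates.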

Streamlined Review

Your cloud e-discovery solution also needs to provide a variety of features and capabilities to streamline review workflows. That includes a “dashboard” of stats that enables you to understand your document collection more quickly and track progress, automatic updating of a series of searches when new data is added to the matter, quick identification of priority custodians with the most responsive documents, and the ability to quickly locate and redact personally identifiable information (PII) within the collection, just to name a few capabilities.
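
As a simplified illustration of locating and redacting PII, the Python sketch below uses two regular expressions. The patterns (a U.S. Social Security number format and a basic email address) are deliberately naive assumptions chosen for this example. A real solution would cover many more PII types and handle far more edge cases.

```python
import re

# Deliberately simple, illustrative patterns (assumptions, not a complete
# definition of PII).
PII_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def redact_pii(text):
    """Replace each match with a labeled placeholder.

    Returns (redacted_text, counts) where counts maps each label to the
    number of matches that were replaced.
    """
    counts = {}
    for label, pattern in PII_PATTERNS.items():
        text, hits = pattern.subn(f"[REDACTED {label}]", text)
        counts[label] = hits
    return text, counts
```

The per-label counts are the kind of figure a review dashboard could surface, for example to flag documents that need a closer look before production.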

Easy to Get Started and Use

Of course, your cloud e-discovery solution should not only take full advantage of the infrastructure that a cloud solution already provides, but it should also facilitate linking to enterprise systems within your organization to quickly support in-place preservation and be easy to learn and easy to use, with no extensive training required. The quicker you’re able to get started and become productive, the quicker the return on your investment.

Security Mechanisms

With 2021 having been a record year for data breaches, the security of your data has never been more important. Your cloud e-discovery solution should provide state-of-the-art security mechanisms to minimize risk and the potential for becoming a data breach statistic.

Application Security

The most secure cloud solutions offer a level of security that includes customizable password complexity, multi-factor authentication, session management, and API access. According to Microsoft, multi-factor authentication can block over 99.9 percent of account compromise attacks!

Encryption In-Transit and At Rest

To provide maximum protection for your data, it should be encrypted both in-transit and at rest, using industry-standard encryption such as TLS 1.2 (for in-transit encryption) and AES-256 (for at-rest encryption). Some solutions only encrypt data in-transit.
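
For the in-transit half, the sketch below shows how a client written in Python could refuse any connection older than TLS 1.2 using the standard `ssl` module. The function name `make_client_context` is invented for this example, and at-rest AES-256 encryption is not shown because it is typically handled by the storage service or a dedicated cryptography library.

```python
import ssl


def make_client_context():
    """Client-side TLS context that refuses anything older than TLS 1.2."""
    # create_default_context() enables certificate verification and
    # hostname checking by default.
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context
```

A context built this way can be passed to standard-library clients such as `urllib.request.urlopen(url, context=make_client_context())`.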

Secure Data Center

The AWS security example above illustrates just how important having a secure data center is, with data center access restrictions, logs and monitoring, CCTV, dedicated security staff and electronic intrusion detection systems. It’s also important to select a provider that chooses data center sites that mitigate environmental risks, such as flooding, extreme weather, and seismic activity.


A secure cloud solution is one that is regularly tested, so it’s also important to evaluate the level of security testing of the cloud solution and environment. Examples of important regular tests include:

  • Monthly vulnerability scanning on servers
  • Static Analysis Security Testing of the source code
  • Third-party gray box penetration test
  • Security, financial, and legal reviews of third-party software, applications, and contractors

Audits and Compliance

A secure cloud solution is also independently audited and verified, and it complies with data privacy requirements for how data is stored and maintained in the cloud. A cloud provider should be able to demonstrate:

  • System and Organization Controls (SOC) 2 Type II attestation (SOC 2®)
  • Compliance with GDPR and CCPA requirements to protect personal data

Support for Additional Use Cases

E-discovery is not just about litigation anymore. A cloud e-discovery solution needs to be able to effectively support non-traditional e-discovery use cases as well as it supports the traditional litigation use case.

Internal Investigations

Internal investigations are conducted in response to allegations of wrongdoing or violations of regulatory compliance within an organization. These investigations are often conducted under accelerated time frames, frequently without the parties involved being aware that they are being investigated, and they may be a precursor to litigation itself. Select a system that supports “silent” holds, which avoid notifying custodians under investigation that their data is on hold, and flexible workflows to support a variety of use cases.

Third-Party and Regulatory Requests

The ability to support flexible workflows is even more important when it comes to supporting information requests from third parties and regulatory agencies. The time frame for many of these requests is even more accelerated than it is with internal investigations, so it’s imperative that your cloud e-discovery solution have robust processing capabilities, powerful and customizable search capabilities and flexible production options to support these high-intensity, short turnaround requests.


Over the last decade or so, the workplace has seen a massive transition away from on-prem systems as critical business functions like Finance and HR embrace cloud systems. Legal is no exception.

E-discovery is a perfect fit for cloud software, which offers the flexibility, security, and sheer ease of use that were unattainable with on-prem installed systems. Understanding the how, where, why and what associated with cloud-based e-discovery will help you select the software that is the best fit for your organization today and beyond.
