Cloud e-discovery – examining the evidence

This week we publish a new long-form report, Cloud e-discovery: litigation comes down to earth – download an executive summary here.

In cloud e-discovery we see two major market shifts: corporations in-sourcing e-discovery to lower costs, while outsourcing IT infrastructure and services around it through hosting.  Still in early adoption, cloud e-discovery is a leap of faith on some level, and carries both risks and benefits.  While most users in our 2010 e-discovery survey were bringing the e-discovery process in-house, only 16% were using cloud to do it, for a variety of reasons including security, data loss, regulatory concerns, and ease of retrieval.

But consider that hosted e-discovery has actually been around for over 20 years. What’s more, while some enterprises are resisting the cloud, their law firms, service providers, and other outsourcers entrusted with their data are not.

Witness this month’s 2010 Am Law tech survey – 80% of law firms are using hosted technology, 60% of those for e-discovery.  In fact, e-discovery tops all hosted software usage, far surpassing HR (21%), spam filter/email (21%), storage (6%) or document management (5%).  And while 79% report a positive experience, 30% said the savings were not what they expected.  Limited customization, diminished data control and security were even greater concerns.

And what of the bigger-picture risks?  Cloud topped the agenda last month at the Masters Conference as well: the growth of public and private cloud data from mobile use and social media, potential regulatory pitfalls, the benefits and risks of hosted e-discovery, and growing cross-border issues.  No blue-sky thinking here, just hard truths on the cloud from those on the front lines.

From e-discovery lawyers and consultants:

  • “[Public] cloud providers can’t meet the needs [of e-discovery] today.”
  • “Your data, your problem.”
  • “Data privacy in the EU is like free speech or freedom of religion in the US. . . they will give up the cloud before they give this up.”

From Microsoft General Counsel, speaking on cloud regulation:

  • “Things will move quickly, and if something bad happens, things will move faster still.”

From an enterprise buyer on procurement:

  • “It will take 19 months to work out e-discovery issues once you start talking about it.”
  • “Every dollar they save on cloud will be three dollars in legal.”
  • “I hate when people say ‘it’s not gonna stop – it’s already there.’ It makes customers think there is no choice but to comply.  But maybe ‘cloud’ will go away?”

And for the last word, a characteristically common-sense admonition from UK expert Chris Dale (speaking on ECA):

So, how to navigate it all?  For a succinct analysis of the cloud e-discovery market, our report is available to 451 CloudScape or Information Management subscribers, or get an executive summary here.  It offers a market overview, benefits and risks of cloud e-discovery, adoption trends and inhibitors, market drivers, current vendor and service-provider offerings, and the future direction of the market, particularly for enterprise customers.

Also note a complementary report, Cloud archiving: a new model for enterprise data retention, by Simon Robinson and Kathleen Reidy.  They estimate the market will generate around $193m in revenues in 2010, growing at a CAGR of 36% to reach $664m by 2014.  This report covers growth drivers, the competitive landscape and the outlook for consolidation, featuring detailed vendor profiles and end-user case studies.
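For anyone who wants to sanity-check those figures, here is a minimal Python sketch of the compound-growth arithmetic.  It assumes four compounding periods between 2010 and 2014 (my assumption, not something the report states), and the function names are purely illustrative.

```python
def project(start_value, annual_rate, years):
    """Compound start_value forward at annual_rate for the given number of years."""
    return start_value * (1 + annual_rate) ** years


def implied_cagr(start_value, end_value, years):
    """Back out the compound annual growth rate implied by two endpoints."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted above, in millions of US dollars.
revenue_2010 = 193.0
revenue_2014 = 664.0
periods = 4  # assumption: 2010 -> 2014 counted as four annual periods

projected_2014 = project(revenue_2010, 0.36, periods)
rate = implied_cagr(revenue_2010, revenue_2014, periods)

print(f"Projected 2014 revenue at 36% CAGR: ${projected_2014:.0f}m")
print(f"CAGR implied by the two endpoints: {rate:.1%}")
```

Under that assumption the two figures agree to within rounding: the rate implied by the endpoints comes out just above 36%.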

E-discovery forensics at CEIC 2010 part 2

Continuing on our dive into forensics for e-discovery, today we cover more on the reasons for using it in practice, as well as highlights from CEIC 2010. . .

Now that we’ve examined the technology involved, one question remains: do you need forensically-defensible collection for e-discovery?  The answer is: not necessarily. Many lawsuits do not require this depth or scope of data collection (such as collecting from RAM), particularly civil cases.  And for general defensibility purposes, courts do not expect perfection.  The goal is a reasonable, good-faith effort to accurately preserve data and metadata with a repeatable, documented process  – one you can testify to in court if necessary.

Why use a forensic approach at all?  To the layman, forensics can sound hard and even scary, as well as potentially expensive and time-consuming – some vendors even refer to it as “the F word.”

Well, at this point you should consult a lawyer or expert – the goal is that you never have to use it.  But here are some good reasons to at least educate yourself about it:

1) Forensics has impressive capabilities, and the technology is cool – a.k.a. “the CSI defense.”  E-discovery is not just paper-based discovery on a computer.  The “paper trail” is now digital, and it’s important to know about this technology’s potential for the legal field, as well as the risks involved.  Like the fact that your deleted files are not really gone.

2) Forensic evidence is critical in trying some cases where the “smoking gun” isn’t just buried in a terabyte of text and document-level metadata – criminal matters, or trade secret or insider trading cases where you might have to dig through ‘track changes’ or reconstruct an IM history from RAM to see who knew what, and when.  E-discovery requires a tool box, and forensics can be an important one of those tools.

3) Targeted collection has its own benefits as an approach to e-discovery collection.  Forensics vendors argue that existing enterprise search tools are only as thorough and current as their latest index.  Likewise, preemptively storing data in a repository like an archive, ECM or Records Management system promises easier retrieval, but is not practical for all organizations and all types or volumes of data.

4) Last but not least: court defensibility (if done reputably by a qualified person with appropriate tools – this is not legal advice in any form).

I will leave it to the experts to flesh out the rest of the forensics story (or take issue with my cribbed-notes version in the comments), but a few show highlights from CEIC:

Exhibitors: As this was a tech show, I’ll lead with the tech.  While CEIC is unquestionably Guidance’s party, there was plenty of co-opetition on the exhibition floor from forensics rivals AccessData and Nuix, e-discovery appliance vendor Clearwell Systems, the now-integrated EMC SourceOne-Kazeon, and growing forensic consultancy D4, which showcased review tool partner kCura’s new Relativity 6 release.  451 subscribers can read about Guidance’s EnCase E-discovery V. 4 here, EMC’s new SourceOne for SharePoint here, a report on kCura here, and look forward to an imminent update on Clearwell 5.5, plus new coverage of AccessData and Nuix.

I recommend checking out the demos if you have the chance.  It’s interesting to see how technology evolves to make different active and dynamic data types accessible, both for collection (SharePoint is a big problem here – EMC, FTI and Nuix all debuted tools for it recently) and for attorney review.  For example, kCura’s latest release has a pivot table feature for attorneys to drill into large amounts of structured data like text messages intelligibly, as you would in Excel.

All-star cast:  CEIC’s 2010 e-discovery track featured some marquee panels on judicial opinions, international privacy regulations, advanced search and retrieval, and case law updates.  Many presenters are also on Guidance’s Advisory Board (which was meeting during the conference), so they actually stuck around after their sessions and gave attendees the chance to monopolize their attention at lunch and happy hour.  UK e-disclosure expert Chris Dale has a good run-down on the judges, which included Hon. Judge Peck, Judge Donald Shelton and Senior Master Steven Whitaker from the UK.  Also present: EDRM founders George Socha and Tom Gelbmann, the oft-cited Craig Ball, Browning Marean of DLA Piper, and of course Melissa Hathaway, former presidential Cyber-security Czar and worthy successor to last year’s keynoter Leonard Nimoy.

Browning gave a plug for Recommind’s Axcelerate and Equivio Relevance’s predictive coding capabilities for review during the search and retrieval panel, which thrilled me as a text analysis and search enthusiast.  451 subscribers can read more on these tools in our past coverage, or the recent long-form e-discovery report.

Users:  There really are no seat-fillers at CEIC; attendees are not just there for a Vegas getaway with continuing education credit.  Everyone I met was a practitioner and formidable techie, many from large companies and government organizations with high-volume litigation or internal investigations.

My conversations with them confirmed for me that e-discovery is still a case of “one size fits nobody.”  When I asked about their go-to forensic brands, some users told me that each vendor’s tool has strengths, and ideally you should have access to and knowledge of several (if you can justify the purchase to accounting).  Some also use multiple “end-to-end” e-discovery platforms to suit their litigation requirements and cross-functional business processes.

One final thought to wrap this up.  The “e-discovery toolbox” analogy I keep beating to death is stolen (or rather, extrapolated) from George Socha’s advice on search methods: As in any project, you need to know your materials and understand what tools are best for the job.  Each has strengths in particular circumstances or scenarios, and with certain data types, locations and volumes.  It depends on your requirements and what results you’re looking for.

E-discovery forensics at CEIC 2010: sorta sexy, sorta scary, not at all niche

This year marked the 10th anniversary of the Computer and Electronic Investigations Conference (CEIC), a show hosted by Guidance Software focusing on digital investigations in forensics, e-discovery and cyber-security.  I’ll be reviewing this event in two posts, because there’s a lot of ground to cover here – check in tomorrow for some show highlights and more on forensics in practice for e-discovery.

As you might guess, both the crowd and the content at CEIC had a heavy technical and practitioner bent, along with a refreshingly low BS-quotient – good attendance and engagement at the in-depth sessions, not a lot of swag-grabbing seat-fillers milling around the exhibition floor.  Forensics has traditionally had strong traction in law enforcement and government, but the new EnCase E-discovery certification exam (EnCEP) and cyber-security track brought in good numbers of private sector attendees from both IT and General Counsel as well.  Overall attendance reportedly grew about 40% this year to 1300.

At this point, some of us without EnCE certification may be wondering, “why is forensics important to e-discovery, and what is it anyway?”

The bottom line in practice is that forensic collection and Guidance’s EnCase format in particular have very strong court defensibility.  From a broader market perspective, Guidance is the only US e-discovery software vendor to go public (in 2006), and has an enviable customer base among the Fortune 500.  All this is to say that while forensics is an expert-grade technology, it is not at all a niche.  In fact, Guidance was #1 in our recent user survey for current usage at 23%, while rival forensics vendor AccessData was cited by 11% of respondents planning to purchase e-discovery software or services in 2010.

And what exactly is forensics?  Here I will steal, or rather paraphrase liberally, from forensic examiner, attorney and expert at-large Craig Ball:

Computer forensics is the expert acquisition, interpretation and presentation of active, encoded and forensic data, along with its juxtaposition against other available information (e.g., credit card transactions, keycard access data, phone records and voicemail, e-mail, documents and instant message communications and texting).

What kind of data are we talking about?  According to Craig: any systems data and metadata generated by a computer’s OS and software (for example: the date you create an MS Outlook contact), as well as log files, hidden system files, and deleted files.  Many tools also handle encrypted files and have additional functions like scanning images to detect pornography – CSI-grade stuff.

The most familiar forensic method of gathering evidence is imaging an entire hard drive, i.e. creating an exact duplicate of every bit, byte and sector, including “empty” space and slack space, with no alteration or additions to the original data.  However, for e-discovery purposes, processing and reviewing that much data from a large number of enterprise machines would be prohibitively expensive and time-consuming.  Not to mention the risk of finding things you’re not looking for (even potentially criminal data like pornography which must be reported by law), and the danger of making incriminating data or deleted files accessible to opposing counsel.  For these reasons (among others), vendors like Guidance offer “targeted collection,” often through desktop agents installed on laptops and PCs which automate searching and collection by specific criteria across the network.
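To make the idea of targeted collection a little more concrete, here is a toy Python sketch.  It is emphatically not how EnCase or any other vendor’s agent works, and it is no substitute for a qualified examiner with proper tools; the function names, the criteria (file extension plus an optional keyword) and the manifest fields are all my own assumptions for illustration.  It simply walks a folder, keeps the files that match, and records a SHA-256 hash and basic metadata for each so the selection can be documented and repeated.

```python
import hashlib
import json
import os
import tempfile
from datetime import datetime, timezone


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def targeted_collect(root, extensions, keyword=None):
    """Return a manifest of files under root that match the (hypothetical) criteria.

    Files are opened read-only and never modified, although merely reading
    a file can update its access time on some filesystems.
    """
    manifest = []
    for folder, dirs, files in os.walk(root):
        dirs.sort()  # fixed traversal order keeps the manifest repeatable
        for name in sorted(files):
            if not name.lower().endswith(tuple(extensions)):
                continue
            path = os.path.join(folder, name)
            if keyword is not None:
                # Naive whole-file byte search; fine for a toy, not for real volumes.
                with open(path, "rb") as handle:
                    if keyword.encode("utf-8") not in handle.read():
                        continue
            info = os.stat(path)
            manifest.append({
                "path": path,
                "size_bytes": info.st_size,
                "modified_utc": datetime.fromtimestamp(
                    info.st_mtime, tz=timezone.utc).isoformat(),
                "sha256": sha256_of(path),
            })
    return manifest


if __name__ == "__main__":
    # Demo against a throwaway folder with made-up contents.
    with tempfile.TemporaryDirectory() as demo_root:
        with open(os.path.join(demo_root, "memo.txt"), "w") as handle:
            handle.write("notes on the merger")
        with open(os.path.join(demo_root, "menu.txt"), "w") as handle:
            handle.write("lunch order")
        hits = targeted_collect(demo_root, (".txt",), keyword="merger")
        print(json.dumps(hits, indent=2))
```

The hash is the point of the exercise: if the same file is hashed again later and the value matches, you have some evidence that what was collected is what is being reviewed.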

Tomorrow’s post will feature CEIC highlights from users, vendors and speakers, plus more on forensics and the e-discovery use case.  In the meantime, for some additional perspective check out #CEIC on Twitter [update: or #CEIC2010], or blog coverage from Craig Ball, Chris Dale and Josh “Bowtie Law” Gilliland of D4.  Many thanks to them and to the others who shared their experiences with me.  Stay tuned.

E-discovery user survey 2010 – a view from the front lines

Some of the best-kept secrets in e-discovery are not the kind revealed in a courtroom.  We all know about legal confidentiality, but the IT side has its own code of silence – call it “analyst-client privilege.”

It’s not that users and customers won’t talk about their vendors and methods – especially if they’re unhappy with those vendors, or have a horror story to share, which many do.  But users rarely go on-the-record with specifics in e-discovery.

So this year we introduce our first annual user survey.  It’s available as part of our just-released E-discovery and E-disclosure report for 2010, or you can access a copy through Applied Discovery here.  It will also be featured in our upcoming BrightTALK webinar on Thursday, May 27th at 12 noon ET, presented by Research Director Nick Patience.  Register here to attend.

And what did we learn?

Users report that corporate litigants still overwhelmingly use existing in-house resources and employees to fulfill discovery requests.  In spite of vendors’ claims that the market demands one throat to choke, customers still purchase tactically depending on their requirements.  About half perform e-discovery on an ad-hoc basis with no repeatable business process or dedicated staff.

What they are buying is even more revealing – our data gives the distribution of usage among 50+ vendors, with purchasing broken down by product or step in the EDRM (Electronic Discovery Reference Model), and whether customers choose software, services, law firms or in-house systems for each function.  Cross-tabbing by industry, company size, volume of litigation and legal budget shows even more granular trends and hot spots in what remains a highly fragmented market.

Beyond a snapshot of current holdings, half our respondents have shopping plans for 2010, showing shifts in vendor traction and product purchasing.  Users have strong predictions of their own for the market as well.  They are clear on pain points in the process and vendor selection criteria.  That said, future purchasing plans show little critical mass on vendor selection – it’s still anybody’s game in e-discovery.

And what about the cloud?  Or information governance?  Is cost still king for everyone?

Join us for a thorough run down of the state of the market in 2010 – a view from the front lines of e-discovery.  Register here to attend.

E-discovery post- “Zubulake Revisited” at IQPC

IQPC’s 3rd E-discovery conference for Financial Services felt like a spa day after LegalTech. You get your CLE credit in a room of fewer than 40 people while being fed gourmet cookies in a comfortable chair with an expensive view of Times Square – unlike LegalTech, where you spend half your time in an elevator of 40 people, and someone has pushed the button for every floor.

There were some noteworthy insights for anyone considering an investment in e-discovery software or services.  We’ve been crunching numbers for our E-discovery User Survey this week, with some interesting results:

  • the overwhelming majority of respondents were still performing every part of e-discovery primarily in-house
  • but about half were planning to make an e-discovery purchase in the next year
  • however a large number of them hadn’t finalized their choice of product or vendor.

So, how to choose?  Well, in the wake of “Zubulake Revisited,” there is now more judicial guidance on the e-discovery process and certainly more at stake.

To get an idea of what the courts are looking for and how companies are adapting, I attended IQPC’s judicial panel on avoiding sanctions, as well as the panel on building a corporate e-discovery response team, featuring e-discovery senior management from Lehman Brothers Holdings, Barclays, MetLife and Bank of New York.

A few takeaways:

  • Judges on the sanctions panel were not sympathetic about high data volumes, saying “Lawyers just have to start dealing with it and make requests and responses appropriate.”  They rejected objections to “burdensome” ESI production requests and criticized litigants for lying about production costs to avoid producing data. One recommended native file production to cut costs rather than requesting images or paper (!)
  • Judges called for earlier preservation with a written legal hold, particularly in the wake of Scheindlin’s “Zubulake Revisited” opinion, which they called a “shot across the bow.” One claimed that some companies spent ten-digit numbers on preservation alone, especially if they’re caught at a late stage and can’t easily go back. That figure sounds like I must have misheard, but I don’t argue with judges.
  • Judges criticized the lack of cross-functional IT and legal expertise at Meet and Confer and in collection of data. They recommended consultant Craig Ball’s 50 questions to prepare for Meet and Confer [pdf], and advised that e-discovery collections be supervised by someone who should anticipate having to testify in court.
  • In the corporate E-discovery panel, Lehman Brothers Holdings (the entity responsible for administering all of Lehman’s litigation) reported standardizing on a single review platform for collaborating with all of its law firms, claiming that they initially had “big fights,” but eventually everyone accepted it – probably no mean feat considering the volume of litigation Lehman faces.
  • Another panelist noted that more corporations are taking control of review as well as collection and the earlier stages of e-discovery.  She advised that law firms should analyze legal issues but corporations handle the facts, doing as much review as possible in-house or outsourcing it at lower rates to cut costs.
  • No one on the corporate panel had any major objection to using SaaS e-discovery or storing legal data in the cloud.

Food for thought.  We are wrapping up our E-discovery User Survey this month and distributing results, which will also be included in our upcoming e-discovery long form report – contact us if you are interested in purchasing.  And many thanks to everyone who has participated in the survey.

LegalTech New York 2010 Wrap-Up

Nick and I spoke with about 30 software & service providers at LegalTech in New York this year; that’s in addition to the 30-some briefings (with some overlaps) last month for March’s upcoming annual e-Discovery report. We have a more comprehensive LegalTech wrap-up of vendor developments and shifts in the market landscape for our clients here, but here are some of the themes that came up at this year’s well-attended show. ALSO: we’re still soliciting end user participation in our e-discovery user survey, which you or your customers can access here, or contact me for more information – all participants will receive a copy of the results.

Defensibility and the Scheindlin Opinion: Judge Scheindlin’s ruling in Pension Committee of the University of Montreal Pension Plan v. Banc of America Securities was a hot topic, particularly the 85-page opinion “Zubulake Revisited: Six Years Later.” (pdf) It defines culpability for defensibility failures in e-discovery, i.e. what constitutes negligence, gross negligence and willfulness. The failures it covers include failure to issue a legal hold, incomplete collections, destruction of email or tapes, failure to preserve metadata and failure to determine accuracy and validity of search terms. It’s great to see some concrete guidelines, with obvious implications for e-discovery software and services. I found a concise wrap up here and from law.com here.  Software vendors are taking note, and already incorporating the ruling into their marketing.

Price sensitivity: Increased competition in the e-discovery market, lower budgets in legal departments, and more flexible law firm pricing due to pre-review data culling, LPO, AFAs, etc. all contributed to greater industry price sensitivity this year, with more customer choices and influence in the market. Software and service vendors expanded pricing options and continued to target the outside legal spend from general counsel in their offerings. E-discovery service providers added per-gigabyte contract review to their software and service processes, and offered either “review management” or extensive collaboration with outside counsel. Software vendors offered more project management and monitoring capabilities for tracking time, cost and productivity. Collaboration workflows designed to lower outside counsel fees came up. Vendors increasingly talked about reuse of reviewed documents as another cost-saving measure for serial litigants. Some vendor messaging was downright conspiratorial in insinuating that corporate legal could use software to throttle the law firm spend; although many e-discovery vendors have large law firm customer bases, the enterprise accounts remain lucrative and coveted.

New software releases: With deal sizes decreasing at some of the highest price points, there has been increased competition both from upstarts aiming to undercut the bigger names and the bigger names making a play for the mid-market. We heard a lot of “flat is the new up” last year, and many vendors seem to be hustling even harder to win customers by expanding to platforms or trying to tick all the RFI boxes with new features of varying validity. Even more than last year, vendors claim to be end to end and predict that customers want one throat to choke, and they’re attached to it. The customer market is more educated and savvy about its own requirements this year, which lowered the marketing artistry quotient quite a bit, but some of the features we saw added for legal hold, data mapping, retention and disposition policy, review, collaboration, and workflow are better than others. As much as customers may be tired of stringing together point tools with additional budget line items for auxiliary integration or conversion costs and extra fees, most of the products we’ve seen are not the silver bullets they promise – customers should continue to educate themselves about their organizational needs, what other companies are doing, and what the market has to offer. And vendors should consider that unhappy customers are usually the loudest ones.

Early case assessment: All the ECA product releases in the last year seem to have made it a de facto step in the e-discovery process; the only argument remaining is how early it should occur – as early as the initial data gathering at identification and collection, or just before review but after processing? The need for processing itself was a matter for debate – CaseCentral gave its direct Symantec Enterprise Vault connector a big push, but processing vendors advocate just as strongly for the benefits of thorough metadata extraction and preservation.

Information Management Reference Model: We missed the EDRM luncheon, but spoke with Reed Irvin of CA and Sandra Song of H5, the co-chairs of the group tasked with expanding the first box of the EDRM diagram into its own full Information Management Reference Model (IMRM). The working draft is a series of concentric circles outlining the information lifecycle from creation to retention, disposition, discovery and storage, including the architecture and business drivers behind these processes. We’ve written previously on the push for information governance in establishing a litigation preparedness and information management strategy, and are glad to see some structure and industry thought leadership put to these initiatives.

M&A: Lots of M&A buzz this year, unfortunately a lot of it speculation about vendors suspected to be looking for a necessarily swift exit.

A great show this year overall, and many thanks to everyone who took the time to speak with us.  We will be wrapping up the e-discovery report in the next month.  Watch this space.