The Data Day, A few days: February 15-21 2014

Informatica eyes $1bn in sales. And more.

And that’s the data day, today.

The Data Day, A few days: April 15-19 2013

‘Information governance’ in the era of big data. MariaDB Foundation takes next steps. And more.

And that’s the data day, today.

Previewing Information Management in 2012

Every New Year affords us the opportunity to dust down our collective crystal balls and predict what we think will be the key trends and technologies dominating our respective coverage areas over the coming 12 months. We at 451 Research just published our 2012 Preview report; at almost 100 pages it’s a monster, but it offers some great insights across twelve technology subsectors, spanning from managed hosting and the future of cloud to the emergence of software-defined networking and solid state storage, and everything in between. The report is available to both 451 Research clients and non-clients (in return for a few details); access the landing page here. There’s a press release of highlights here. Also, mark your diaries for a webinar discussing report highlights on Thursday Feb 9 at noon ET, which will be open for clients and non-clients to attend. Registration details to follow soon…

Here is a selection of key takeaways from the first part of the Information Management preview, which focuses on information governance, ediscovery, search, collaboration and file sharing. (Matt Aslett will be posting highlights of part 2, which focuses more on data management and analytics, shortly.)

  • One of the most obvious common themes that will continue to influence technology spending decisions in the coming year is the impact of continued explosive data and information growth.  This  continues to shape new legal frameworks and technology stacks around information governance and e-discovery, as well as to drive a new breed of applications growing up around what we term the ‘Total Data’ landscape.
  • Data volumes and distributed data drive the need for more automation, and auto-classification capabilities will continue to emerge more successfully in the e-discovery, information governance and data protection veins. Indeed, we expect to see more intersection between these, as we noted in a recent post.
  • The maturing of the cloud model – especially as it relates to file sharing and collaboration, but also from a more structured database perspective – will drive new opportunities and challenges for IT professionals in the coming year.  Looks like 2012 may be the year of ‘Dropbox for the enterprise.’
  • One of the big emerging issues that rose to the fore in 2011, and is bound to get more attention as the New Year proceeds, is around the dearth of IT and business skills in some of these areas, without which the industry at large will struggle to harness and truly exploit the attendant opportunities.
  • The changes in information management in recent years have encouraged (or forced) collaboration between IT departments, as well as between IT and other functions. Although this highlights that many of the issues here are as much about people and processes as they are about technology, the organizations able to leap ahead in 2012 will be those that can most effectively manage the interaction of all three.
  • We also see more movement of underlying information management infrastructures into the applications arena.  This is true with search-based applications, as well as in the Web-experience management vein, which moves beyond pure Web content management.  And while Microsoft SharePoint continues to gain adoption as a base layer of content-management infrastructure, there is also growth in the ISV community that can extend SharePoint into different areas at the application-level.

There is a lot more in the report about proposed changes in the e-discovery arena, advances of the cloud, enterprise search and impact of mobile devices and bring-your-device-to-work on information management.

DLP and e-discovery: two sides of the same governance coin?

We commented recently on Symantec’s acquisition of cloud archiving specialist LiveOffice. The announcement also afforded Big Yellow an opportunity to unveil what it calls “Intelligent Information Governance,” an over-arching theme that provides the context for some of the product-level integrations it has been working on. For example, it just announced improved integration between its Clearwell eDiscovery suite and its on-premise archive software, Enterprise Vault (stay tuned for more on this following LegalTech later this month).

There’s clearly an opportunity to go deeper than product-level ‘integration,’ however. In a blog post, Symantec VP Brian Dye raised an issue that we have been seeing for a while, especially among some of our larger end-user clients. In the post, Brian discusses the fundamental contention that all of us, from individuals to corporations to governments, face around information governance: striking the right balance between control of information and freedom of information.

Software has emerged to help us manage this contention, most typically through data loss prevention (DLP) tools, which control what data does and doesn’t leave the organization, and eDiscovery and records management tools, which control what data is retained and for how long. Brian noted that there is an opportunity to do much more here by linking the two sides of what is in many ways the same coin, for example by sharing the classification schemes used to define and manage critical and confidential information.

This is an idea that we have discussed at length internally, with some of our larger end-user clients, and with a good few security and IM vendors. Notably, many vendors responded by telling us that, though a good idea in principle, in reality organizations are too siloed to get value from such capabilities; DLP is owned and operated by the security team, while eDiscovery is managed by legal, records management and technology teams. While some of the end-users we have discussed this with are certainly siloed to a point, they are also working to address this issue by developing a more collaborative approach, establishing cross-functional teams, and so on.

A cynic would point out that some self-interest might be at play here too from a vendor perspective; why sell one integrated product to a company when you can sell them essentially the same technology twice? But of course, we’re not the remotest bit cynical (!). There is also the reality that at most large vendors, product portfolios have been put together at least in part by acquisitions. Security and e-discovery products may be sold separately because they are, in fact, separate products with little to no integration in terms of products or sales organizations. And vendors may not yet be motivated to do the hard integration work (technically, organizationally) if they are not seeing consistent enough demand from consolidated buying teams at large organizations.

Wendy Nather, Research Director of our security practice, notes that such integration is desirable:

– Users don’t WANT to have meta-thoughts about their data; they just want to get their work done, which is why it’s hard to implement a user-driven classification process for DLP or for governance.  The alternative is a top-down implementation, and that would work even better with only one ‘top’ — that is, the security and legal teams working from the same integrated page.

However, Wendy also notes that such an approach is itself not without complexity:

– Confidential data can be highly contextual in nature (for example, when data samples get small enough to identify individuals, triggering HIPAA or FERPA); you need advanced analytics on top of your DLP to trigger a re-classification when this happens.  Why, you might even call this Data Event Management (DEM).
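Wendy’s example of contextual re-classification can be illustrated with a minimal Python sketch. It assumes that an extract becomes sensitive once any combination of identifying attributes is shared by only a handful of records. The field names, the labels and the threshold of 5 are hypothetical choices made for illustration; they are not taken from HIPAA, FERPA or any DLP product.

```python
from collections import Counter

# Hypothetical threshold: groups smaller than this are treated as potentially
# identifying. The value 5 is an illustrative assumption, not a regulatory rule.
MIN_GROUP_SIZE = 5


def classify_extract(records, quasi_identifiers, min_group_size=MIN_GROUP_SIZE):
    """Return 'confidential' if any combination of quasi-identifier values
    is shared by fewer than min_group_size records, otherwise 'internal'."""
    groups = Counter(
        tuple(record[field] for field in quasi_identifiers) for record in records
    )
    if any(count < min_group_size for count in groups.values()):
        return "confidential"
    return "internal"


# Eight records share the same attributes, so no one stands out.
broad = [{"zip": "10001", "age_band": "30-39"} for _ in range(8)]
# Adding a single record with unique attributes makes that person identifiable.
narrow = broad + [{"zip": "10002", "age_band": "80-89"}]

print(classify_extract(broad, ["zip", "age_band"]))   # internal
print(classify_extract(narrow, ["zip", "age_band"]))  # confidential
```

The point of the sketch is that the label depends on the data set as a whole rather than on any one record, which is why a static, user-applied tag is unlikely to be enough.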

It’s notable that Symantec is now starting to talk up the notion of a unified, or converged approach to data classification. Of course, it is one of the better-positioned vendors to take advantage here, given its acquisitions in both DLP (Vontu in 2007) and eDiscovery (Clearwell in 2011), while LiveOffice adds some intriguing options for doing some of this in the cloud (especially if merged with its hosted security offerings from MessageLabs).

Nonetheless, we look forward to hearing more from Symantec — and others — about progress here through 2012. Indeed, if you are attending LegalTech in New York in a couple of weeks, then our eDiscovery analyst David Horrigan would love to hear your thoughts. Additionally, senior security analyst Steve Coplan will be taking a longer look at the convergence of data management and security in his upcoming report on “The Identities of Data.”

In other words, this is a topic that we’re expending a fair amount of energy on ourselves; watch this space!

Information management preview of 2011

Our clients will have seen our preview of 2011 last week. For those that aren’t (yet!) clients and therefore can’t see the whole 3,500-word report, here’s the introduction, followed by the titles of the sections to give you an idea of what we think will shape the information management market in 2011 and beyond. Of course the IT industry, like most others, doesn’t rigorously follow the wiles of the Gregorian calendar, so some of these things will happen next year while others may not occur till 2012 and beyond. But happen they will, we believe.

We think information governance will play a more prominent role in 2011 and in the years beyond that. Specifically, we think master data management and data governance applications will appear in 2011 to replace the gaggle of spreadsheets, dashboards and scorecards commonly used today. Beyond that, we think information governance will evolve in the coming years, kick-started by end users who are asking for a more coherent way to manage their data, driven in part by their experience with the reactive and often chaotic nature of e-discovery.

In e-discovery itself, we expect to see a twin-track adoption trend. While cloud-based products have proven popular, more enterprises are at the same time buying e-discovery appliances.

‘Big data’ has become a bit of a catchall term to describe the masses of information being generated, but in 2011 we expect to see a shift to what we term a ‘total data’ approach to data management, as well as the analytics applications and tools that enable users to generate the business intelligence from their big data sets. Deeper down, the tools used in this process will include new BI tools to exploit Hadoop, as well as a push in predictive analytics beyond the statisticians and into finance, marketing and sales departments.

SharePoint 2010 may have come out in the year for which it is named, but its use will become truly widespread in 2011 as the first service pack is released and the ISV community around it completes its updates from SharePoint 2007. However, we don’t think cloud-based SharePoint will grow quite as fast as some people may expect. Finally, in the Web content management (WCM) market – so affected by SharePoint, as well as the open source movement – we expect a stratification between the everyday WCM-type scenario and Web experience management (WEM) for those organizations that need to tie WCM, Web analytics, online marketing and commerce features together.

  • Governance family reunion: Information governance, meet governance, risk and compliance; meet data governance….
  • Master data management, data quality, data integration: the road to data governance
  • E-discovery post price war: affordable enough, or still too strategic to risk?
  • Data management – big, bigger, biggest
  • Putting the BI into big data in Hadoop
  • The business of predictive analytics
  • SharePoint 2010 gets real in 2011
  • WCM, WEM and stratification

And with that we’d like to wish all readers of Too Much Information a happy holiday season and a healthy and successful 2011.

Wot no e-Discovery? (The Economist on information management)

There’s a special section in this week’s Economist on information management, entitled Data, Data Everywhere. It’s always good when your area of interest and coverage is on the cover of such an illustrious magazine. However, I read it and downloaded the PDF (which you can do as a subscriber) and searched that, and to my surprise there are two significant words close to my heart that don’t appear anywhere in the report. They are:

  • discovery (as a short hand for e-Discovery, or just on its own)
  • governance (as in information governance)

I know the author, Kenneth Cukier; he’s an excellent technology journalist and thinker with years of experience (we both spent perhaps way too long at the various meetings that hosted the various fights for control of the internet’s domain name system (DNS) in the 90s that led to the creation of ICANN).

Ken’s focus in the report was more on the data deluge created by the internet and how that affects individuals, mainly in the context of being a consumer, exploring issues such as personal privacy, and how companies such as Google and Wal-Mart manipulate and profit from data. There was very little talk about the problems that creating, storing, searching, archiving and deleting information imposes on companies.

And although there is a section on new regulatory constraints, it was again focused mainly on privacy, personal information as a property right, and the integrity of information held about individuals by corporations, with a token nod on the need to preserve digital records, but again looking at it from a consumer’s perspective.

All important topics, for sure. But not the one that a lot of companies are spending a lot of money grappling with now and in the future.

Now I’m not naive, and didn’t expect a multi-page spread on litigation support or an exploration of what early case assessment means in a weekly magazine with such a broad readership as the Economist! But given that e-Discovery and, more recently, information governance are shooting up the list of priorities of many CIOs (the ‘i’ does stand for information, after all) as they realize that without appropriate litigation readiness and information governance in place they could find themselves in a financial and legal sinkhole, I thought it warranted at least a paragraph or two among the 14 pages of text.

Update: Clearwell’s CEO Aaref Hilaly posted something on the same subject at almost the same time as me.

IQPC New York E-discovery Conference 2009

I got the chance to attend several sessions at the New York IQPC e-discovery event this week for some interesting perspectives on bringing e-discovery to the enterprise.

Recommind’s Craig Carpenter hosted a panel on Information Governance featuring Scott McVeigh, Director of RM at Aramark, and Dawson Horn, Senior Litigation Counsel of Tyco, focusing on the benefits of litigation preparedness and getting organizational support from management and stakeholders. This issue came up more than once during the conference: the challenge of obtaining executive approval and participation from IT, legal, HR, compliance, procurement, RM and other stakeholders in planning, designing and deploying comprehensive information systems. McVeigh encouraged users to be vocal about the need for change (over the course of several years if necessary), and to invoke C-level names to achieve organizational buy-in.

Autonomy’s Deborah Baron interviewed Karla Wehbe, Senior Information Resources Manager at Bechtel, for a case study of how the company is promoting document re-use by collaborating with outside counsel on a new methodology for ediscovery review. After parting ways with its prior law firm and losing access to previously reviewed documents, Bechtel established an information-centric approach to the process, facilitating re-use of reviewed documents through additional coding from outside counsel. The company claims that 5-75% of reviewed documents are now reusable.

Benefits include better control of document categorization and retention policy, as well as the ability for the company to “tell a story” with its evidence that can be communicated across cases. Wehbe acknowledged an initial “identity crisis” from outside counsel as the corporation established more control, but claims that they are now advocates of the process, and it has built trust and cooperation between them. An interesting example of the changing nature of the attorney-client relationship in corporate law. I am curious as to what their billing arrangement is.

Ian Campbell of iConect was joined by Kurt Michel of Content Analyst; Timm Miller, VP of litigation for Phillips North America; and Morgan Lewis associate Denise Backhouse for a discussion of collecting ESI internationally, including EU data privacy regulations, the Hague evidence convention, blocking statutes, and the precedent set by the 1987 Supreme Court case Société Nationale Industrielle Aérospatiale v. United States District Court, under which discovery can be required even in defiance of blocking statutes in the jurisdiction where the data resides.

The difference in global collection philosophy is staggering (at least to this provincial American). Backhouse was asked (facetiously we hope) if it wasn’t enough for both parties just to agree “not to tell” about breaking regulations during discovery, and responded that that would violate the fundamental human right to privacy – literally a foreign concept to those of us accustomed to living under the Patriot Act. Not only could a company not access or even put a litigation hold on employee email in many EU countries, according to Backhouse even board meeting notes would be forbidden since they would identify attendees, potentially revealing where they were employed at the time.

The panel concluded that international e-discovery is not a checklist, but a carefully-negotiated balance between compliance and avoiding sanctions. We continue to follow this with interest, particularly the pending updates from the UK Civil Procedure Rules Committee, as Nick reported from the Thomson Reuters E-disclosure Conference in London.

Unfortunately I missed the judges’ panel, but the sessions I did attend were informative and underscored some of the trends we’ve been seeing in the market. Namely: the rise of Information Governance; the shifting of roles between e-discovery vendors, service providers, general counsel and law firms as technology moves in-house; and the increasingly global (and complicated) nature of e-discovery.

We’re now hard at work on our 2010 long-form report on E-discovery and E-disclosure, featuring 25+ vendor profiles and comprehensive coverage of this fast-paced market – publication is slated for late Q1 2010, after Legal Tech. Stay tuned.

Information governance Q&A

Our webinar last week on information governance went well and generated some interesting questions. I didn’t get to answer all the questions on the call, so I’ll take the opportunity to briefly answer some of them here, including some of the more interesting ones I did answer live. Most of these topics were covered in much more detail in our recently published report on information governance, which also spawned the webinar. The full recorded webinar is also available online.

Q: Can you talk to any trends you see in terms of who in an organization is purchasing governance/e-discovery tools?

This is something covered in some detail in the report itself.  In general, there’s some difference in terms of purchasing between “governance” and “e-discovery.”  If the use case being addressed in a particular procurement process is specifically for reactive e-discovery – meaning, the ability to respond to a specific legal discovery request – then the process is likely to have heavy involvement from the legal department if not full ownership by that team with IT involvement.

Governance is generally broader and is likely to involve more underlying pieces of technology (e.g., archiving, records management, indexing tools for distributed data and e-discovery / early case assessment).  There’s certainly no single approach to governance and most organizations are in the earliest of stages in terms of putting in place some kind of broader governance strategy.  Procurement is still likely to be tied to more tactical requirements and the specifics of those requirements will dictate who’s involved (e.g., e-discovery is more likely to be run by legal, as noted above, while an email archiving decision is more likely to be led by IT with legal involvement).  Generally speaking, hashing out broader governance strategies may well involve IT (email management, storage, ECM and search folks), legal, compliance officers, records managers and security personnel, among others.

Q: What are your thoughts about how far right along EDRM the big ECM vendors will move?

So far, ECM vendors are focusing on the far left of the electronic discovery reference model (EDRM).  This has expanded in the last twelve months or so from a far more limited focus solely on the “information management” process step to greater capabilities for data identification, collection, preservation, and some review and analysis.  This is likely to continue, though I’d be surprised to see ECM vendors move beyond this.  Identification, collection and preservation will be key areas in the short term (EMC’s recent Kazeon buy is a good example of how ECM vendors will look to better handle distributed data).  Review and analysis capabilities are likely to remain in the area of early-case assessment, with the expectation that a winnowed-down set of data is still likely to be turned over to external counsel for further review and analysis. That’s likely to be where most ECM vendors stop, though not all; Autonomy, for example, plays specifically in the legal market as well with iManage and Discovery Mining.

Q: Can you explain a bit more what you mean by “litigation readiness”?   What processes does this cover?

I guess this is a phrase I use a lot when talking about information governance and perhaps I didn’t explain it well enough on the webinar.  Litigation readiness is really just one reason organizations are interested in information governance.  Poor information governance makes it difficult to respond efficiently and cost effectively to e-discovery.  There are a number of processes involved in better preparing for litigation, but ideally, organizations need to have some high-level understanding of what data exists, where it is and who has access to it.  That’s a whole lot easier said than done of course, particularly when you need to include data on desktops, laptops, shared file drives and so forth.  The processes generally need to encompass maintaining some kind of index of what resides on all those devices and how that data will be captured and secured if needed.  That needs to be combined of course with more formalized management of data in archives and records management systems, with some consistency in terms of retention and disposition policies (that are standardized and enforced) across sources.  Few organizations have a very good handle on this sort of thing across repositories and unmanaged devices today, but those that are more often involved in litigation are likely to be more litigation-ready.
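As a rough illustration of the inventory idea above, here is a minimal Python sketch that flags the most basic readiness gaps: sources that are not indexed, sources with no retention policy, and content types whose retention periods differ between sources. The `DataSource` fields, the source names and the retention figures are all hypothetical; a real inventory would also need to track access rights and legal holds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataSource:
    name: str                      # hypothetical label, e.g. "mail archive"
    content_type: str              # e.g. "email"
    indexed: bool                  # is there an index of what resides here?
    retention_days: Optional[int]  # None means no retention policy is defined


def readiness_gaps(sources):
    """List unindexed sources, sources without a retention policy, and
    content types whose defined retention periods differ between sources."""
    gaps = []
    for source in sources:
        if not source.indexed:
            gaps.append(f"{source.name}: not indexed")
        if source.retention_days is None:
            gaps.append(f"{source.name}: no retention policy")

    # Collect the distinct retention periods defined for each content type.
    periods_by_type = {}
    for source in sources:
        if source.retention_days is not None:
            periods_by_type.setdefault(source.content_type, set()).add(
                source.retention_days
            )
    for content_type, periods in sorted(periods_by_type.items()):
        if len(periods) > 1:
            gaps.append(f"{content_type}: inconsistent retention {sorted(periods)}")
    return gaps


sources = [
    DataSource("mail archive", "email", indexed=True, retention_days=2555),
    DataSource("laptops", "email", indexed=False, retention_days=None),
    DataSource("file share", "email", indexed=True, retention_days=365),
]
for gap in readiness_gaps(sources):
    print(gap)
```

Even a simple list like this makes the unmanaged devices visible alongside the archives, which is where most organizations have the least insight.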

Q: Is Information Governance of primary interest in the US or are companies in Europe also concerned? I.e. is there an opportunity for vendors beyond the US?

Information governance as it relates primarily to litigation readiness is of primary interest to those in the US and in parts of Europe that have similar discovery or disclosure requirements for electronic information.  In geographies that don’t yet have as strict requirements for electronic discovery, governance may still be an interest but may be for different reasons.  Compliance with specific regulations (e.g., privacy-related legislation) can be a concern, for example, as can IP protection or other types of security.   So there is certainly opportunity for vendors in specific markets, such as archiving, but the drivers might be different.

That’s probably enough for one blog post.  Again, those interested in the full webinar can find it here.

Let’s talk about info governance

This Thursday I’ll host a short webinar to discuss some of the findings from our recently-published report on the emerging Information Governance market.  This report looks at how archiving, records management and e-discovery technologies are coming together to help organizations get a better handle on internal data for litigation readiness and compliance purposes.

The webinar is free and open to anyone, so please feel free to join if you’re interested in this topic.

During the webinar, I’ll outline some of the trends we uncovered while doing our research for this report, look at the vendor landscape and M&A activity in this area, and briefly discuss some of the technologies that we think will be important in this sector moving forward.

Here’s the info and registration link:

The Rise of Information Governance webinar

Thursday, September 24, 2009

12:00 – 1:00 PM EDT

Register here

Recorded versions of our webcasts are available on our site a short while after the events are over.

The rise of information governance

Our lengthy report that shares a title with this blog post hit the wire yesterday (a high-level exec overview is available here for all). I’ve blogged before about our efforts on this. It has been quite a project, with several months of listening, reading and talking with lots of IT managers, attorneys, integrators, consultants and vendors. Oh, and writing: the final doc weighs in at 57 pages…

I noted before that I wasn’t sure “information governance” was a specific or real enough sector to warrant this kind of market analysis.  Aren’t we really just talking about archiving?  Or e-discovery?  Or ECM?  In the end, I found we’re talking about all these things, but what is different is that we’re talking about them all together. How do we ensure consistent retention policy across different stores?  How do we safely pursue more aggressive disposition?  How do we include all that “in-the-wild” content in centrally managed policies?

Is “information governance” really the right tag for this? I don’t know, but I never came across anything better (I did toy with “information retention management” for a while). We might be calling it something else in a couple of years, but the underlying issues are very real.

From the report intro:

What is information governance? There’s no single answer to that question. At a high level, information governance encompasses the policies and technologies meant to dictate and manage what corporate information is retained, where and for how long, and also how it is retained (e.g., protected, replicated and secured). Information governance spans retention, security and lifecycle management issues. For the purposes of this report, we’re focusing specifically on unstructured (or semi-structured, like email) information and governance as it relates primarily to litigation readiness.

In the report, we look at why organizations are investigating more holistic information governance practices:

  • to be better prepared for litigation
  • to ensure compliance
  • to reduce risks and costs of unmanaged or inconsistently managed information

Then we go into the market with analysis of:

  • the rise of email (and broader) archiving for litigation readiness
  • the relationship of the ECM and records management market
  • Autonomy and other vendors advocating “in-place” approaches to governance

There are also sections on adoption issues, market consolidation and areas for technology innovation.  And profiles of 15 vendors (each with a SWOT analysis) active in this market.

Expect lots more on this topic moving forward.