Since the start of this year I’ve been covering data warehousing as part of The 451 Group’s information management practice, adding to my ongoing coverage of databases, data caching, and CEP, and contributing to the CAOS research practice.
I’ve covered data warehousing before, but taking a fresh look at the space in recent months, it’s been fascinating to see the variety of technologies and strategies that vendors are applying to the data warehousing problem. It’s also been interesting to compare the role open source has played in the data warehousing market with its role in the database market.
I’m preparing a major report on the data warehousing sector, for publication in the next couple of months. In preparation for that I’ve published a rough outline of the role open source has played in the sector over on our CAOS Theory blog. Any comments or corrections would be much appreciated.
Our lengthy report that shares a title with this blog post hit the wire yesterday (a high-level exec overview is available here for all). I’ve blogged before about our efforts on this. It has been quite a project, with several months of listening, reading and talking with lots of IT managers, attorneys, integrators, consultants and vendors. Oh, and writing: the final doc weighs in at 57 pages…
I noted before that I wasn’t sure “information governance” was a specific or real enough sector to warrant this kind of market analysis. Aren’t we really just talking about archiving? Or e-discovery? Or ECM? In the end, I found we’re talking about all these things, but what is different is that we’re talking about them all together. How do we ensure consistent retention policy across different stores? How do we safely pursue more aggressive disposition? How do we include all that “in-the-wild” content in centrally managed policies?
Is “information governance” really the right tag for this? I don’t know, but I never came across anything better (I did toy with “information retention management” for a while). We might be calling it something else in a couple of years, but the underlying issues are very real.
From the report intro:
What is information governance? There’s no single answer to that question. At a high level, information governance encompasses the policies and technologies meant to dictate and manage what corporate information is retained, where and for how long, and also how it is retained (e.g., protected, replicated and secured). Information governance spans retention, security and lifecycle management issues. For the purposes of this report, we’re focusing specifically on unstructured (or semi-structured, like email) information and governance as it relates primarily to litigation readiness.
In the report, we look at why organizations are investigating more holistic information governance practices:
- to be better prepared for litigation
- to ensure compliance
- to reduce risks and costs of unmanaged or inconsistently managed information
Then we go into the market with analysis of:
- the rise of email (and broader) archiving for litigation readiness
- the relationship of the ECM and records management market
- Autonomy and other vendors advocating “in-place” approaches to governance
There are also sections on adoption issues, market consolidation and areas for technology innovation, plus profiles of 15 vendors active in this market, each with a SWOT analysis.
Expect lots more on this topic moving forward.