The Data Day, Today: November 21 2012

HP/Autonomy fall-out. Behind 10gen’s strategic funding. And more.

And that’s the Data Day, today.

The Data Day, Two days: November 19/20 2012

HP uncovers Autonomy irregularity. Pentaho ups big data commitment. And more.

And that’s the Data Day, today.

The Data Day, Two days: September 25/26 2012

Total Data analysis. Tokutek gets flash. And more.

And that’s the Data Day, today.

The Data Day, Two days: August 9/10 2012

HP’s Autonomy problem. Excel 2013. And more.

And that’s the Data Day, today.

The Data Day, Today: Apr 2 2012

Basho launches cloud storage play. Opera acquisitions. And more.

An occasional series of data-related news, views and links posts on Too Much Information. You can also follow the series @thedataday.

* Basho Unveils Riak CS, Multi-Tenant Cloud Storage Software for Public and Private Clouds

* InsightsOne Secures $4.3 Million in Series A Round of Funding Led by Norwest Venture Partners

* Opera buys Commendo to create predictive analytics powerhouse

* Opera Solutions Increases Procurement Capabilities with Acquisition of Lexington Analytics

* How federal money will spur a new breed of big data

* Another HP org change: Vertica no longer under the purview of Autonomy boss Mike Lynch?

* New SAS Visual Analytics Helps Organizations Analyze, Visualize Big Data

* Citrusleaf Delivers Real-Time NoSQL Replication

* NuoDB Launches Open Source Initiative on Github

* Actian Teams up With FlyingBinary and Tableau to Unleash Big Data Potential

* DH2i Launches and Unveils DxConsole Next Generation Virtualization Solution to Enable the Agile, Always-On Enterprise

* Acunu Analytics Ready to Preview!

* SAND Technology Announces Second Quarter Results for Fiscal Year 2012

* Idera Announces VMware Database Performance Monitoring Solution

* Idera Announces SQL Compliance Manager 3.6

* WalmartLabs is building big data tools — and will then open source them

* The three waves of opportunities in big data

* 4 Big Data Myths – Part I

* For 451 Research clients

# Drawn to Scale raises funds for Hadoop-based real-time database Impact report

# ParElastic brings elastic parallelism to relational databases Impact report

# DH2i launches with PolyServe-inspired database-virtualization software Impact report

# Tape industry pins future on ‘big data,’ active archiving and LTFS Spotlight report

# Lucid Imagination dreams up new strategy for enterprise search Market development report

# Pentaho identifies ‘big data’ analytics as investment priority, hooks into DataStax Market development report

# GridGain positions in-memory data grid for real-time analytics Market development report

# Having earned its stripes in HPC, Panasas heads for ‘big data’ Market development report

* Google News Search outlier of the day: Top 10 Dog and Cat Medical Conditions of 2011

And that’s the Data Day, today.

The Data Day, Today: Mar 22 2012

Oracle reports Q3. EMC acquires Pivotal Labs. ClearStory launches. And much, much more.

An occasional series of data-related news, views and links posts on Too Much Information. You can also follow the series @thedataday.

* Oracle Reports Q3 GAAP EPS Up 20% to 49 Cents; Q3 Non-GAAP EPS Up 15% to 62 Cents. Database and middleware revenue up 10%.

* EMC Goes Social, Open and Agile With Big Data. EMC acquires Pivotal Labs, plans to release Chorus as an open source project.

* ClearStory Data Launches With Investment From Google Ventures, Andreessen Horowitz and Khosla Ventures

* HP Lead Big Data Exec Chris Lynch Resigns

* Hortonworks Names Ari Zilka Chief Products Officer

* DataStax Enterprise 2.0 Adds Enterprise Search Capabilities to Smart Big Data Platform

* MapR Unveils Most Comprehensive Data Connection Options for Hadoop

* New Web-Based Alpine Illuminator Integrates with EMC Greenplum Chorus, The Social Data Science Platform

* RainStor and IBM InfoSphere BigInsights to Address Growing Big Data Challenges

* IBM Introduces New Predictive Analytics Services and Software to Reduce Fraud, Manage Financial Performance and Deliver Next Best Action

* Datameer Releases Major New Version of Analytics Platform

* Kognitio Announces Formation of “Kognitio Cloud” Business Unit

* HStreaming Announces Free Community Edition of Its Real-Time Analytics Platform for Hadoop

* Talend and MapR Announce Certification of Big Data Integration and Big Data Quality

* Schooner Information Technology Releases Membrain 4.0

* Gazzang Launches Big Data Encryption and Key Management Platform

* Logicworks Solves Big Data Hosting Challenges With New Infrastructure Services for Hadoop

* “Big Data” Among Most Confusing Tech Buzzwords

* For 451 Research clients

# Infochimps launches Chef-based platform for Hadoop deployment Impact Report

# Big-data security, or SIEM buzzword parity? Spotlight report

# DataStax adds enterprise search and elastic reprovisioning to database platform Market Development report

# With a new CEO and IBM as a reseller, Revolution Analytics charts next growth phase Market Development report

# Cray branches out, offering storage and a ‘big data’ appliance Market Development report

# CodeFutures sees a future beyond database sharding Market Development report

# Third time lucky for ScaleOut StateServer 5.0? Market Development report

# Attunity looks to 2012 for turnaround; up to the cloud and ‘big data’ movement Market Development report

# Panorama rides Microsoft’s coattails into in-memory social BI using SQL Server 2012 Market Development report

And that’s the Data Day, today.

The Data Day, Today: Jan 24 2012

Thoughts on Splunk’s IPO and DynamoDB. And more.

An occasional series of data-related news, views and links posts on Too Much Information. You can also follow the series @thedataday.

* Thoughts on the Splunk IPO and S-1. By Dave Kellogg.

* Thoughts on SimpleDB, DynamoDB and Cassandra. By Adrian Cockcroft.

* Recommind’s Revenue Leaps 95% in Record-Setting 2011. Predictable.

* Hewlett-Packard Expands to Cambridge via Vertica’s “Big Data” Center. Moving.

* Announcing SkySQL Enterprise HA for the MariaDB & MySQL databases

* Membase Server is Now Couchbase Server. But not *the* Couchbase Server.

* Cloudera Teams With O’Reilly Media to Merge Hadoop World and Strata Conferences

* Survey results: How businesses are adopting and dealing with data. 100 Strata Online Conference attendees.

* Big data market survey: Hadoop solutions

* LinkedIn released SenseiDB, an open source distributed, realtime, semi-structured database.

* For 451 Research clients

# VMware: not your father’s database company Impact Report

# Sparsity Technologies draws up plans for graph database adoption Impact Report

# Amazon launches DynamoDB, an auto-configuring database as a service Market Development report

# NuoDB targets Q2 release for elastic relational database Market Development report

# ADVIZOR illuminates growth strategy, roadmap in data discovery and analysis Market Development report

# Birst adds own analytic engine for BI, OEM agreement with ParAccel Market Development report

* Google News Search outlier of the day: RentAGrandma.com Recruiting Wonderful Grandmas

And that’s the Data Day, today.

Search by another name: enterprise search starts to mature into ‘application era’

Customers of The 451 Group would have seen my report on the enterprise search market published September 15. If you are a client, you can view it here. I thought it would be useful to provide a condensed version of the report to a wider audience, as I think the market is at an important point in its development and merits a broader discussion.

The enterprise search market is morphing before our eyes into something new. Portions of it are disappearing, and others are moving into adjacent markets, but a core part of it will remain intact. We think a few key factors have caused this. Some are historical: they had their largest effect in the past, though their ongoing effect is still being felt. The contemporary factors are the ones we think are having their largest impact now, and will continue to do so over the next 12-18 months.

Historical factors

  • Over-promising and under-delivery of intranet search between the last two US recessions, roughly between 2002 and 2007, resulting in a lot of failed projects.
  • A lack of market awareness and understanding of the value and risk inherent in unstructured data.
  • The entrance of Google into the market in 2002.
  • The lack of vision by certain closely related players in enterprise content management (ECM) and business intelligence (BI).

Contemporary factors

  • The lack of a clear value proposition for enterprise search.
  • The rise of open source, in particular Apache Lucene/Solr.
  • The emergence of big data, or total data.
  • The social media explosion.
  • The rapid spread of SharePoint.
  • The acquisitive growth of Autonomy Corp.
  • Acquisition of fast-growing players by major software vendors, notably Dassault Systemes, Hewlett-Packard and Microsoft.

The result of all this has been a split into roughly four markets, which we refer to as low-end, midmarket, OEM and high-end search-based applications.

Entry-level search

The low-end, or entry-level, enterprise search market has become, if not commodified, then pretty close to it. It is dominated by Google and open source. Other commercial vendors that once played in it have mostly left the market.

The result is that potential entry-level enterprise search customers are left with a dichotomy of choices: Google’s yellow search appliances that have two-year-term licenses and somewhat limited configurability (but are truly plug-and-play options) on the one hand, and open source on the other. It is a closed versus a very open box, and they have different and equally enthusiastic customer bases. Google is a very popular department-level choice, often purchased by line-of-business knowledge workers frustrated at obsolete and over-engineered search engines. Open source is, of course, popular with those that want to configure their search engine themselves or have a service provider do it and, thus, have a lot of control over how the engine works, as well as the results it delivers. Apache Lucene is also part of many commercial, high-end enterprise search products, including those of IBM.

Midmarket search

Midmarket search is a somewhat vague area, where vendors are succeeding in deals of roughly $75,000-250,000 selling intranet search. This area has thinned out as some vendors have tried to move upmarket into the world of search-based applications, but there are still many vendors making a decent living here. SharePoint has had a major effect on this part of the market, though: if enterprises already have SharePoint (and Microsoft reckons more than 70% have at least bought a license at some point), it can be tough to offer a viable alternative. Where SharePoint isn’t the main focus, there is still a decent business to be had offering effective enterprise search, often in specific verticals, albeit without a huge amount of vertical customization.

OEM

The OEM search business has become a lot more interesting recently, in part because of which vendors have left it, leaving space for others. Microsoft’s acquisition of FAST in early 2008 meant one of the two major vendors at the time had essentially left the market, since its focus moved almost entirely to SharePoint, as we recently documented. The other major OEM vendor at the time was Autonomy, and while it would still consider itself to be so, we think much of its OEM business, in fact, comes from document filters, rather than the OEMing of the IDOL search engine. Autonomy would strongly dispute that, but it might be moot soon anyway: it now looks as if it will end up as part of Hewlett-Packard following the announcement of its acquisition, at a huge valuation, on August 18.

Those exits have left room for the rise of other vendors in the space. Key markets here include archiving, data-loss prevention and e-discovery. Many tools in these areas have old or quite basic search and text analysis functionality embedded in them, and vendors are looking for more powerful alternatives.

Search-based applications

The high end of the enterprise search market has become, in effect, the market for search-based applications (SBA) – that is, applications that are built on top of a search engine, rather than solely a relational database (although they often work alongside a database). These were touted back in the early 2000s by FAST, but it was too early, and FAST was too complex a set of tools to give the notion widespread acceptance. But in the latter part of the last decade and this one, SBAs have emerged as an answer to the problem of generic intranet search engines getting short shrift from users dissatisfied that the search engines don’t deliver what they want, when they want it.

Until recently, SBAs have mainly been a case of the vendors and their implementation partners building one-off custom applications for customers. But they are now moving to the stage where out-of-the-box user interfaces are being supplied for common tasks. In other words, it’s maturing in a similar way to the application software industry 20 years ago, which was built on top of the explosion in the use of relational databases.

We’ve seen examples in manufacturing, banking and customer service, and one of the key characteristics of SBAs is their ability to combine structured and unstructured data in a single interface. That was also the goal of earlier efforts to combine search with business-intelligence tools, which often simply took the form of adding a search engine to a BI tool. That was too simplistic, and the idea didn’t really take off, in part because search vendors hadn’t paid enough attention to structured data.

But SBAs, which put much more focus on the indexing process than earlier efforts, appear to be gaining traction. If we were to get to the situation where search indexes are considered a better way of manipulating disparate data types than relational databases, that would be a major shift (see big data). Another key element of successful SBAs is that they don’t look like traditional search engines, with a large amount of white space and a search bar in the middle of the screen. Rather, they make use of facets and other navigation techniques to guide users through information, or often simply to present the relevant information to them.
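To make the facet idea a little more concrete, here is a minimal sketch in Python. It is not taken from any vendor's product: the sample records, field names and function names are all hypothetical, and a real engine would add relevance ranking, proper tokenization and much else. It simply shows a search index built over free text, with structured fields used as filters and as facet counts to guide the user.

```python
from collections import Counter, defaultdict

# Hypothetical records mixing structured fields with free text.
DOCS = [
    {"id": 1, "region": "EMEA", "product": "pump",
     "text": "Customer reports pump leaking after install"},
    {"id": 2, "region": "EMEA", "product": "valve",
     "text": "Valve replaced, no leaking observed"},
    {"id": 3, "region": "APAC", "product": "pump",
     "text": "Pump motor noisy, leaking oil"},
]


def build_index(docs):
    """Map each lowercased token in the free text to the ids of docs containing it."""
    index = defaultdict(set)
    for doc in docs:
        for token in doc["text"].lower().replace(",", " ").split():
            index[token].add(doc["id"])
    return index


def search(index, docs, query, filters=None):
    """Return docs that contain every query term and match every structured filter."""
    terms = query.lower().split()
    if not terms:
        return []
    ids = set.intersection(*(index.get(term, set()) for term in terms))
    hits = [doc for doc in docs if doc["id"] in ids]
    for field, value in (filters or {}).items():
        hits = [doc for doc in hits if doc[field] == value]
    return hits


def facet_counts(hits, field):
    """Count how many hits fall under each value of a structured field."""
    return Counter(doc[field] for doc in hits)


if __name__ == "__main__":
    index = build_index(DOCS)
    hits = search(index, DOCS, "leaking")
    print(facet_counts(hits, "region"))
    print(search(index, DOCS, "pump leaking", filters={"region": "EMEA"}))
```

The facet counts are what would let an interface offer the user "EMEA" or "pump" as clickable refinements instead of leaving them with a blank search bar.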

As I mentioned, there’s more in the full report, including more about specific vendors, total (or big) data and the impact of social media. If you’d like to know more about it, please get in touch with me.

ILTA 2011 report: Autonomy taking HP to the e-Discovery cleaners?

Not surprisingly, the biggest topic of conversation at the International Legal Technology Association (ILTA) 2011 convention in Nashville is last week’s announcement by Hewlett-Packard (HP) that it was acquiring Autonomy for $11.8bn. The most common reaction–in addition to the rush out the door to buy HP’s now discontinued TouchPad for 99 bucks–was surprise at the healthy purchase price.  Although some ILTA attendees saw how the deal might make sense logistically, virtually no one thought the deal made any sense at all with such a high price tag for Autonomy.

Cloud computing–and law firms’ reluctant move toward it–is another big topic, but another trend that seems to be developing as the e-discovery industry matures is its move away from law firms. Many vendors are reporting that five years ago, their businesses were 70 percent or more in law firms, with the remaining 30 percent or less of the business with corporate clients. Vendors now report that those ratios have flipped, with corporate clients now making up the vast majority of business.

Although the e-discovery market may be shifting away from law firms, at least one vendor hasn’t forgotten them.  Exterro has announced at ILTA the launch of Fusion LawFirm. As the name implies, the new application is a version of Exterro’s Fusion platform designed especially for law firms.

Other vendors meeting with The 451 Group at ILTA to brief us on their product launches and other announcements are:

  • AccessData, which is launching its new early case assessment application, AD ECA
  • kCura and Nexidia, which announced an alliance under which Nexidia’s audio and voice recognition application will be integrated into kCura’s Relativity platform
  • LexisNexis Applied Discovery, which announced at ILTA a new partnership with Equivio to add predictive coding to its platform
  • LexisNexis LAW PreDiscovery, which launched its new early case assessment (ECA) application, Early Data Analyzer
  • Nuix, which announced a new version of its platform last month
  • Orange Legal Technologies, which did an ILTA launch of PurpleBox, its new collection and ECA tool
  • Recommind, which discussed its predictive coding patent, and may have hosted ILTA’s best party at Nashville’s Country Music Hall of Fame
  • Wave Software, which announced a new version of its Trident e-mail processing application.

Quick HP-Autonomy thoughts

Just after the HP call about its Q3 numbers and the deal, here’s my initial (very) quick take as it’s late here in London:

  • This deal is about getting serious about software under Leo Apotheker. It gives HP a real information management story, greatly boosting its presence in the archiving, e-Discovery and enterprise search businesses.
  • However, the company cultures are not complementary: the HP way is a long way from the hyper-aggressive sales and marketing culture at Autonomy. Maintaining Autonomy as a separate entity run by Mike Lynch proves this and calls into question how much real synergy can be had from such a structure. I cannot see that being sustained.
  • This instantly makes HP a bigger e-Discovery player than IBM or any of the major IT firms.
  • Product overlap exists in document and records management, but the deal gets HP into the web content management and website optimization markets.
  • Autonomy has resisted deals over the years while its market capitalization ballooned on the back of its own acquisition binge. It couldn’t have waited much longer, as it would have grown too big to be swallowed by even the largest predator.
  • At least Autonomy customers will now have a services organization to call on after they’ve bought the software. Customer support and after-sales service have not been strengths of Autonomy.
  • This leaves the FTSE 100 with just one software firm of note.