Entries from March 2012 ↓

Webinar: top 5 NoSQL gotchas

I’m taking part in a GigaOM Pro Webinar panel on April 4 entitled “Top 5 gotchas that prevent NoSQL from meeting business goals”.

Sponsored by DataStax, the panel also includes Billy Bosworth, DataStax CEO; Jo Maitland, Research Director, Cloud, GigaOM Pro; and is moderated by Paul Miller, Cloud Curator, GigaOM Pro.

Among the topics up for debate:

  • Will IT organizations get bogged down in NoSQL infrastructure battles instead of focusing on big data apps that satisfy the needs of the business?
  • What are the advantages and potential challenges to migrating towards a single data store?
  • Does simplifying the infrastructure mean that you now have to deal with poor performance due to conflicting workloads?
  • What lessons can we learn from history as we bring on NoSQL systems?
  • What unspoken business requirements can we anticipate to prevent being caught off guard?

The event takes place on Wednesday, April 4, 2012 at 10 AM PST. Register here

Update on the relative popularity of NoSQL database skills

Back in December we ran a series of posts looking at the geographic distribution of NoSQL skills, according to the results of searching LinkedIn member profiles, culminating in a look at the relative overall popularity of the major NoSQL databases.

This week I took another look at LinkedIn to update the results for a forthcoming report, which gives us the opportunity to see how the results have changed over the past quarter:

While this provides us with an interesting opportunity to track LinkedIn profile mentions over time there isn’t a huge amount we can learn from this first update – other than that MongoDB seems to be increasing its dominance.

The only significant change that isn’t immediately obvious from looking at the chart is that Apache HBase has overtaken Apache CouchDB by a tiny margin to claim third place overall.

As we noted last time, however, Apache HBase is more reliant on the US than other NoSQL databases for its LinkedIn mentions: it is the second most prevalent NoSQL database mentioned in the USA but fourth in the rest of the world.

Two other points to take into consideration:

– The results for Apache Cassandra are probably disproportionately low since we have to search for the full phrase in order to avoid including people called Cassandra.

– Previously we only searched for Membase. This time we added together the search results for both Membase and Couchbase. This may mean the result for Couch/Membase is disproportionately high since some members probably listed both.

This is not meant to be a comprehensive analysis, however, but rather a snapshot of one particular data source.

The Data Day, Today: Mar 22 2012

Oracle reports Q3. EMC acquires Pivotal Labs. ClearStory launches. And much, much more.

An occasional series of data-related news, views and links posts on Too Much Information. You can also follow the series @thedataday.

* Oracle Reports Q3 GAAP EPS Up 20% to 49 Cents; Q3 Non-GAAP EPS Up 15% to 62 Cents Database and middleware revenue up 10%.

* EMC Goes Social, Open and Agile With Big Data EMC acquires Pivotal Labs, plans to release Chorus as an open source project

* ClearStory Data Launches With Investment From Google Ventures, Andreessen Horowitz and Khosla Ventures

* HP Lead Big Data Exec Chris Lynch Resigns

* Hortonworks Names Ari Zilka Chief Products Officer

* DataStax Enterprise 2.0 Adds Enterprise Search Capabilities to Smart Big Data Platform

* MapR Unveils Most Comprehensive Data Connection Options for Hadoop

* New Web-Based Alpine Illuminator Integrates with EMC Greenplum Chorus, The Social Data Science Platform

* RainStor and IBM InfoSphere BigInsights to Address Growing Big Data Challenges

* IBM Introduces New Predictive Analytics Services and Software to Reduce Fraud, Manage Financial Performance and Deliver Next Best Action

* Datameer Releases Major New Version of Analytics Platform

* Kognitio Announces Formation of “Kognitio Cloud” Business Unit

* HStreaming Announces Free Community Edition of Its Real-Time Analytics Platform for Hadoop

* Talend and MapR Announce Certification of Big Data Integration and Big Data Quality

* Schooner Information Technology Releases Membrain 4.0

* Gazzang Launches Big Data Encryption and Key Management Platform

* Logicworks Solves Big Data Hosting Challenges With New Infrastructure Services for Hadoop

* “Big Data” Among Most Confusing Tech Buzzwords

* For 451 Research clients

# Infochimps launches Chef-based platform for Hadoop deployment Impact Report

# Big-data security, or SIEM buzzword parity? Spotlight report

# DataStax adds enterprise search and elastic reprovisioning to database platform Market Development report

# With a new CEO and IBM as a reseller, Revolution Analytics charts next growth phase Market Development report

# Cray branches out, offering storage and a ‘big data’ appliance Market Development report

# CodeFutures sees a future beyond database sharding Market Development report

# Third time lucky for ScaleOut StateServer 5.0? Market Development report

# Attunity looks to 2012 for turnaround; up to the cloud and ‘big data’ movement Market Development report

# Panorama rides Microsoft’s coattails into in-memory social BI using SQL Server 2012 Market Development report

And that’s the Data Day, today.

Upcoming data events and travel plans

I’m gearing up for a busy few weeks of international travel with presentations in Europe and on both the east and west coasts of the US.

It all starts on March 28 when I’ll be heading to London for Cassandra Europe 2012 where I’m looking forward to attending a packed schedule of Apache Cassandra case studies. Later in the day I’ll be essentially improvising a presentation combining our view of the state of the NoSQL market with an overview of highlights from the case studies stream for those who have attended the workshop stream.

The following week is HCTS EU, 451 Research’s own event in London, which takes place on April 2-3 and is Europe’s go-to convergence event for CIOs, cloud decision makers, vendors and investors. On April 3 I’ll be presenting our ‘Big Data’ Survival Guide – explaining the importance of ‘big data’ – what it is, what it isn’t and why you should care, as well as 451 Research’s associated concept of Total Data, designed to enable the realisation of valuable business intelligence from ‘big data’.

After a quick trip to California for an analyst event I’ll be heading for Zurich for a couple of events where I’ll be explaining our perspective on the development and adoption of NoSQL and NewSQL databases, including some insights from our forthcoming long format report on the competitive dynamic between MySQL, NoSQL and NewSQL. Specifically, I’ll be presenting at the ESE Conference on April 25, followed by the NoSQL Road Show on April 26.

Then I’m off to Washington DC to attend MarkLogic World, where I’ll be appearing on a panel with other analysts on May 2 to discuss the impact and implications of ‘big data’.

At some point during all this traveling I’ll be completing the forthcoming long format report on the competitive dynamic between MySQL, NoSQL and NewSQL, hopefully before I’m back in California for OSBC, where I’m scheduled to present our findings on May 21.

Look out also for details of a couple of webinars currently being scheduled between now and the end of May.

And then I’m going on holiday.

The Data Day, Today: Mar 13 2012

Drawn to Scale raises funding. Cloudera launches HBaseCon. And more.

An occasional series of data-related news, views and links posts on Too Much Information. You can also follow the series @thedataday.

* Drawn to Scale Announces Funding for Real-Time Big Data

* Cloudera Announces HBaseCon 2012, the Industry’s First Apache HBase Community Conference

* Gazzang Launches Big Data Encryption and Key Management Platform

* Jaspersoft Closes Record Fiscal Year

* Schooner Information Technology Releases Membrain 4.0

* How Project Mercury is eBay’s Big Data play Up on the roof of eBay’s big data center.

* SAND Announces Universal Query

* Oracle has a cloud computing secret The potential impact of metered pricing.

* Why should I consider memcached plugin? …for MySQL.

* For 451 Research clients

# Microsoft launches SQL Server 2012, with an eye on ‘big data’ Impact report

# Global IDs hones governance, MDM focus; looks to the cloud and appliances for growth Impact report

# Clarabridge ups the ante in ‘voice of the customer’ with v5.0 as the CEM space heats up Impact report

# ScaleBase launches elastic load balancing for MySQL databases Market Development report

# Dassault’s Exalead searches for a ‘big data’ role Market Development report

And that’s the Data Day, today.

What’s in a name? Analyzing ‘Dropbox for the enterprise’

We’ve been spending a good deal of time lately talking to vendors looking to deliver ‘Dropbox-for-the-enterprise’ alternatives. By this, providers generally mean that they enable users to sync and share their files across desktops and devices, but in a way that is palatable to corporate IT departments. I’d say we really started to see this activity in earnest about a year ago, when Box started getting serious about the enterprise market and I began to get a lot of briefing requests from the likes of Accellion, Egnyte and others about their enterprise file sharing and sync offerings. Things really started heating up later in 2011, as we saw VMware announce its Dropbox-for-the-enterprise in August, Citrix acquire ShareFile in October and open source play ownCloud set sail in December. We also recently initiated coverage on another startup, Germany-based TeamDrive.

These are only a few of the movements in this emerging market. Things will only become more active in 2012. Perhaps one of the more notable features is the broad background of players entering this space – we see vendors from the virtualization, security, storage, content management and mobility sectors all vying for attention. This is likely to cause an awful lot of noise and confusion.

Compounding the matter is that everyone in this market seems to be struggling with what exactly to call it. “Enterprise-grade Dropbox” neatly encapsulates it, but it’s not really a viable way to refer to a market segment. We put out a report on ‘cloud file sharing’ late in 2011, but that is a broader focus and doesn’t capture what is important and different about this segment in particular. Dropbox is obviously a cloud service and many of the players that want to offer Dropbox-like services are as well. But while the cloud certainly *can* be an enabling technology, it doesn’t have to be. Indeed, a number of players, such as Accellion, Egnyte, GroupLogic, ownCloud, Oxygen Cloud and, presumably, VMware when it gets to it, are offering private-cloud or on-premises approaches for file sharing and sync.

So we’ve settled on Mobile File Sharing and Sync Platforms as the way that we are going to refer to this segment, at least for now.   The mobility part of this, as opposed to cloud, is what is really new and disruptive.  That is what drives the need for sync and native apps for specific device types.  We also think it is important to identify these emerging products, including Dropbox itself, as ‘platforms’ since we suspect there will be ample opportunity moving forward for customization and plug-ins to these tools.  We are already seeing some of these in the areas of security, content management and collaboration for Dropbox specifically.

Calling a set of Dropbox-like capabilities a platform is interesting, though we can also flip the conversation on its head and wonder whether sync is really a feature, as others are doing.  The answer may well be that it is both.  In the enterprise, it certainly makes sense as a feature of content management, collaboration and even storage offerings, since business content is generally part of broader business processes and often needs to be retained for compliance reasons.   IT also wants to get the most out of existing investments. We are already seeing sync as a feature from the likes of OpenText and Huddle, and this is arguably Box’s approach as well.  We also have partnerships between the likes of Oxygen Cloud and EMC, to layer a sync service on top of storage infrastructure.

We take a more extensive look at the market for Mobile File Sharing and Sync Platforms in a recent report (login required) for 451 clients.  This report looks at user and IT requirements and provides more detail on the enterprise players we’ve begun to track. How this market plays out exactly over time remains to be seen, but we think it has the potential to be extremely disruptive. For that reason it’s a space we’ll continue to watch closely, and from multiple vantage points.

The Data Day, Today: Mar 8 2012

Microsoft launches SQL Server 2012. MapR integrates with Informatica. And more.

An occasional series of data-related news, views and links posts on Too Much Information. You can also follow the series @thedataday.

* Microsoft Releases SQL Server 2012 to Help Customers Manage “Any Data, Any Size, Anywhere”

* SQL Server 2012 Released to Manufacturing

* SAS Access to Hadoop Links Leading Analytics, Big Data

* MapR And Informatica Announce Joint Support To Deliver High Performance Big Data Integration And Analysis

* Teradata Expands Integrated Analytics Portfolio

* New Teradata Platform Reshapes Business Intelligence Industry

* Microsoft’s Trinity: A graph database with web-scale potential

* KXEN Announces Availability of InfiniteInsight Version 6, a Predictive Analytics Solution with Unprecedented Agility, Productivity, and Ease of Use

* Software AG Announces its Strategy for the In-memory Management of Big Data

* Attunity and Hortonworks Announce Partnership to Simplify Big Data Integration with Apache Hadoop

* Schooner Information Technology and Ispirer Systems Partner to Deliver SQLWays for SchoonerSQL

* Big Data & Search-Based Applications

* Namenode HA Reaches a Major Milestone

* How Twitter is doing its part to democratize big data

* Dropping Prices Again– EC2, RDS, EMR and ElastiCache

* For 451 Research clients

# SAS outlines Hadoop strategy, previews Hadoop-based in-memory analytics Market Development report

# Pervasive rides the elephant into ‘big data’ predictive analytics Market Development report

# IBM makes desktop discovery and analysis play, shares business analytics priorities Market Development report

# Clustrix launches SDK to tap developer interest in new databases Market Development report

# Continuent and SkySQL team up for clustered MySQL support Analyst note

# MapR gets a boost from Cisco and Informatica Analyst note

And that’s the Data Day, today.

Cisco and Informatica deals provide a boost for MapR

We recently speculated that EMC Greenplum’s focus on the integration of its Greenplum HD Hadoop distribution with its Data Computing Appliance (DCA) and Isilon storage technology would mean an increasingly niche role for Greenplum MR, the Hadoop distribution based on MapR’s M5.

Two recent announcements indicate that niche might continue to be a lucrative one for MapR, however. First, Cisco released details of a reference architecture for deploying Greenplum MR on Cisco’s UCS servers. Then Informatica announced a partnership with MapR to jointly support its Data Integration Platform running on MapR’s distribution for Hadoop.

The Informatica relationship also covers bi-directional data integration with Informatica PowerCenter and Informatica PowerExchange, snapshot replication using Informatica FastClone, and data streaming into MapR’s distribution via NFS using Informatica Ultra Messaging. In addition, the free Informatica HParser Community Edition will be available for download as part of the MapR distribution.

While the partnership with Informatica is a direct one for MapR, the Cisco reference architecture announcement illustrates that the benefit MapR gains from its relationship with EMC Greenplum includes exploiting the company’s leverage with potential partners.

The Data Day, Today: Mar 2 2012

Hortonworks partners with Talend. Teradata and Greenplum updates. And more.

An occasional series of data-related news, views and links posts on Too Much Information. You can also follow the series @thedataday.

* Talend Empowers Apache Hadoop Community with Talend Open Studio for Big Data

* Hortonworks Announces Strategic Partnership With Talend to Bring World’s Most Popular Open Source Data Integration Platform to Apache Community Talend Open Studio for Big Data will be bundled as part of Hortonworks Data Platform.

* Teradata Transforms Global Database Technology

* New EMC Greenplum Database Enhancements Boost Big Data Analytics

* Cisco’s servers now tuned for Hadoop

* Amplidata Closes $8M Funding Round with Big Bang Ventures, Endeavour Vision, Intel Capital and Swisscom

* Got Big Data? Jaspersoft CEO Brian Gentile outlines three approaches to connecting to ‘big data’ for business intelligence reporting and analysis.

* Cray’s YarcData Division Launches New Big Data Graph Appliance

* Introducing Spring Hadoop Developing applications for Hadoop technologies based on Spring technologies.

* MarkLogic and Hortonworks Partner to Enhance Real-Time Big Data Applications with Apache Hadoop

* Continuent and SkySQL Join Forces to Better Serve the Global MySQL Community

* Data Entrepreneurship

* For 451 Research clients

# Anaplan bags $11.4m in VC, looks beyond budgeting and planning to business operations Impact Report

# XtremeData seeks to differentiate analytic database for extreme data workloads Impact Report

# Calpont adds parallel loading to columnar database for online analytics Market Development Report

# MarkLogic formalizes Hadoop support with Hortonworks partnership Analyst note

And that’s the Data Day, today.