7 Hadoop questions. Q4: alternative file systems

Which is your preferred Hadoop file system? The obvious answer is likely to be the Hadoop Distributed File System itself, although in recent years we’ve seen an increasing number of vendors pitching their own file system technologies as potential alternatives to HDFS. That’s why the use of alternative file systems is one of the primary questions being asked in the 451 Research 2013 Hadoop survey.


The limitations of HDFS are well publicised, and it is no surprise that many vendors see an opportunity to pitch their existing file system technologies as alternatives to HDFS.

There is now a large number of HDFS alternatives to choose from, including: Cleversafe Dispersed Storage Network, DataStax CassandraFS, EMC Isilon OneFS, IBM GPFS, Inktank Ceph, MapR NFS, Quantcast QFS, Red Hat Storage (GlusterFS), and Symantec Veritas CFS.

Our research indicates that adoption of alternatives to HDFS is limited at this stage, and early efforts, such as Appistry’s CloudIQ Storage Hadoop Edition, have come and gone.

However, as adoption of Hadoop spreads to more mainstream enterprises, we increasingly see interest in some of these HDFS alternatives, particularly in relation to attempts to reduce duplication of effort with regard to file system management and maintenance.


The early responses to our Hadoop survey are therefore interesting: MapR NFS has scored highest in terms of adoption so far, but there is interest across the board (especially in Red Hat Storage, CassandraFS, GPFS, OneFS and Ceph). By and large, though, it’s true to say that most respondents have not considered, tested or adopted an alternative file system to date.

To give your view on this and other questions related to the adoption of Hadoop, please take our 451 Research 2013 Hadoop survey.

The Data Day, The week that was: October 22-26 2012

Cloudera launches Impala. Actuate snags Quiterian. Microsoft previews HDInsight.

And the rest:
– Microsoft previewed its Windows Azure HDInsight Service and Microsoft HDInsight Server for Windows.

– SAP launched a new “big data” bundle and go-to-market strategy.

– Informatica introduced Informatica PowerCenter Big Data Edition and reported its third quarter results.

– Also announcing financial results last week were QlikTech and Pervasive.

– Teradata updated its Unity suite with the addition of Unity Loader, and introduced its Unified Data Environment and the Unified Data Architecture.

– Splunk confirmed the release of Splunk Hadoop Connect and the Splunk App for HadoopOps.

– 10gen added five vice presidents to its management team.

– Rackspace partnered with Hortonworks to create OpenStack and Hadoop-based offerings for public and private cloud.

– Talend added support for Cassandra, HBase and MongoDB, and introduced big data profiling for Apache Hadoop to its integration platform.

– MarkLogic announced support for HDFS and expanded its relationship with Hortonworks.

– Kognitio adopted a free licensing model.

– Calpont launched InfiniDB 3.5.

– MetaMarkets announced that it is open sourcing its Druid streaming, real-time data store.

– YarcData updated its uRiKA Big Data appliance for graph analytics.

– Alpine Data Labs announced a global OEM partnership with QlikTech.

– Actian and Attunity announced Attunity Replicate for Actian Vectorwise.

And that’s the Data Day, today.