Entries from April 2010 ↓

Gear6: deep-sixed or changing gears?

Considering that Gear6 recently took up the cause of refuting the death of memcached, it is somewhat ironic that it is now Gear6 that is rumoured to be close to death.

Not just rumoured, in fact. eWeek reports that the company is in the process of shutting down its operations, while rival Schooner Information Tech has declared that Gear6 recently filed for a business liquidation process called Assignment for the Benefit of Creditors.

The erstwhile caching appliance vendor, which changed its focus to the Memcached open source distributed memory object-caching system just over a year ago, has not issued a response – despite requests for clarification from us and others.

The lack of response is ominous, and we are sure the company built up some baggage thanks to its previous incarnation as a cache appliance provider, but there are a few reasons why now would seem to me to be an inopportune time to cut and run:

  • The current Memcached strategy has only been going a year.
  • The company recently claimed to have added 100 new customers since the launch of a version of its Web Cache software for cloud platforms in December 2009 (not all of which are paying, no doubt, but…).
  • The profile of Memcached has never been higher.
  • Gear6 also recently inked a partnership with Hewlett-Packard that apparently sees Gear6 added to HP’s Extreme Scale-Out portfolio and HP’s field staff and channel partners trained to deploy and support Web Cache on HP’s BL, SL and DL servers.
  • The company raised $4m of an expected $12.7m financing round in August 2009, according to SEC filings.

    It is perfectly possible, of course, that the rest of that planned $12.7m was harder to come by than executives – or investors – had anticipated, but my first thought when reading about the Assignment for the Benefit of Creditors was that of a change of ownership for the company’s assets rather than total disappearance.

    We shall see.

    New e-Discovery/e-Disclosure report out now

    I’m very happy to say that our new report on the e-Discovery/e-Disclosure market – E-Discovery and E-Disclosure: Bringing it all back home – is now available to clients and non-clients alike.

    The report contains:

    • User survey – a survey of 140+ end users about their current e-Discovery products, their purchasing plans over the next 12 months, the state of their budgets now and in the future, their pain points and how they execute their e-Discovery strategy – or even if they have one.
    • Detailed profiles of 32 software and service providers from the US and Europe.
    • Analysis of the current issues and drivers in the market and how we think they may evolve in the future, including issues such as litigation preparedness, early case assessment, in-sourcing of e-Discovery (hence the sub-title), cloud computing and regulatory and legal challenges in the US and Europe.
    • The market landscape including a detailed breakdown of how vendors map to the EDRM and a look at the markets that e-Discovery impacts upon, including archiving and information governance.
    • M&A analysis – forward-looking analysis as well as an examination of past valuations.

    The report was written by Katey Wood (@KWood451) and myself (@nickpatience). Any questions regarding the report can be addressed to either of us and we can also let you know how you can buy the report whether you’re a 451 client or not.

    User perspectives on NoSQL

    NoSQL EU in London this week was a great event, with interesting perspectives from both vendors (Basho, Neo Technology, 10gen, Riptano) and users (The Guardian, the BBC, Amazon, Twitter). In particular I was interested in learning from the latter about how and why they ended up using alternatives to the traditional relational database model.

    Some of the reasons for using NoSQL have been well-documented: Amazon CTO Werner Vogels talked about how the traditional database offerings were unable to meet the scalability Amazon.com requires. Filling a functionality void also explains why Facebook created Cassandra, Google created BigTable, and Twitter created FlockDB (etc etc). As Werner said, “We couldn’t bet the company on other companies building the answer for us.”

    As Werner also explained, however, the motivation for creating Dynamo was also about enabling choice and ensuring that Amazon was not trying to force the relational database to do something it was not designed to do. “Choosing the right tool for the job” was a recurring theme at NoSQL EU.

    Given the NoSQL name it is easy to assume that this means that the relational database is by default “the wrong tool”. However, the most important element in that statement is arguably not “tool” but “job”, and The Guardian discussed how it was using non-relational data tools to create new applications that complement its ongoing investment in the Oracle database.

    For example, the Guardian’s application to manage the progress of crowdsourcing the investigation of MPs’ expenses is based on Redis, while the Zeitgeist trending news application runs on Google’s AppEngine, as did its live poll during the recent leaders’ election debate. Datablog, meanwhile, relies on Google Spreadsheets to serve up usable and downloadable data – we’ll ignore for a moment whether Google Spreadsheets is a NoSQL database 😉

    Long-term The Guardian is looking towards the adoption of a schema-free database to sit alongside its Oracle database and is investigating CouchDB. The overarching theme, as Matthew Wall and Simon Willison explained, is that the relational database is now just a component in the overall data management story, alongside data caching, data stores, search engines etc.

    On the subject of choosing the right tool for the job, Basho’s engineering manager Brian Fink pointed out that using NoSQL technology alongside relational SQL database technology may actually improve the performance of the SQL database since storing data in a relational database that does not need SQL features slows down access to data that does need SQL features.

    Another perspective on this came from Werner Vogels who noted that unlike database administrators/ systems architects, users don’t care about where data resides or what model it uses – as long as they get the service they require. Werner explained that the Amazon.com homepage is a combination of 200-300 different services, with multiple data systems. Users do not think about data sources in isolation, they care about the amalgamated service.

    This was also a theme that cropped up in the presentation by Enda Farrell, software architect at the BBC, who noted that the BBC’s homepage is a PHP application integrated with multiple data sources at multiple data centers, and also Twitter’s analytics lead Kevin Weil, who described Twitter’s use of Hadoop, Pig, HBase, Cassandra and FlockDB.

    While the company is using HBase for low-latency analytic applications such as people search and moving to Cassandra from MySQL for its online applications, it uses its recently open-sourced FlockDB graph database to serve up data on followers and correlate the intersection of followers to (for example) ensure that Tweets between two people are only sent to the followers of both. (As something of an aside, Twitter is using Hadoop to store the 7TB of data it generates a day from Tweets, and Pig for non-real-time analytics.)
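To make the follower-intersection idea concrete, here is a minimal sketch in plain Python. It is not FlockDB's API or Twitter's implementation; the `followers` mapping, the user names and the `shared_followers` helper are all hypothetical, and an in-memory dictionary of sets stands in for the graph store.

```python
# Hypothetical illustration only: a dict of sets stands in for a graph store.
# Each key is a user; each value is the set of accounts following that user.
followers = {
    "alice": {"carol", "dave", "erin"},
    "bob": {"dave", "erin", "frank"},
}


def shared_followers(graph, sender, recipient):
    """Return the accounts that follow both the sender and the recipient."""
    # Unknown users are treated as having no followers.
    return graph.get(sender, set()) & graph.get(recipient, set())


# A Tweet from alice to bob would only be delivered to these accounts.
print(sorted(shared_followers(followers, "alice", "bob")))
```

The point is simply that the query is a set intersection over two adjacency lists. At Twitter's scale those lists are far too large for this naive approach, which is presumably why a dedicated store is used for the job.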

    Kevin noted that the company is also working with Digg to build real-time analytics for Cassandra and will be releasing the results as open source, and also discussed how Twitter has made use of open source technologies created by others such as Facebook (both Cassandra and the Scribe log data aggregation server).

    One of the issues that has arisen from the fact that organizations such as Amazon and Facebook have had to create their own data management technologies is the proliferation of NoSQL databases and a certain amount of wheel re-invention.

    Werner explained that SmugMug creator Don Macaskill ended up being a MySQL expert not because he necessarily wanted to be, but because he had to be to keep his applications running.

    “He doesn’t want to have to become an expert in Cassandra,” noted Werner. “What he wants is to have someone run it for him and take care of that.” Presumably Riptano, the new Cassandra vendor formed by Jonathan Ellis – project chair for the Cassandra database – will take care of that, but in the meantime Werner raised another long-term alternative.

    “We shouldn’t all be doing this,” he said, adding that Dynamo is not as popular within Amazon Web Services as it once was because it is a product that requires configuration and management, rather than a service, and Amazon employees “have better things to do.”

    Which raises the question – don’t Twitter, Facebook, the BBC, the Guardian et al have better things to do than developing and maintaining database architecture? In a perfect world, yes. But in a perfect world they’d all have strongly consistent, scalable distributed database systems/services that are suited to their various applications.

    Interestingly, describing S3 as “a better key/value store than Dynamo”, Werner noted that SimpleDB and S3 are “a good start to provide that service”.

    Looking forward to NoSQL EU

    I was asked a few weeks ago whether I thought NoSQL was largely a US (and specifically West Coast) phenomenon. While it might seem that way for some of those in the bubble that is the Bay Area (and to be fair that’s where I was at the time), the answer is a definite “no”.

    As if to prove it, NoSQL EU is being held in London next week with a great program of presentations from NoSQL vendors, projects and users.

    April 20 features presentations on The Guardian’s use of NoSQL, as well as an overview from Alex Popescu of MyNoSQL, followed by presentations from Basho, 10gen, Rackspace and Neo Technology.

    April 21 sees Amazon CTO Werner Vogels describing the birth of Dynamo, as well as presentations on the use of NoSQL databases from the BBC, Twitter, and Comcast. That is followed by presentations on Redis, Tokyo Cabinet (et al) and “the fate of the relational database”. Oh, and a panel debate moderated by some bloke called James Governor 😉

    Then on the 22nd there’s a day of workshops involving MongoDB, Redis, Riak and Neo4J.

    It’s shaping up to be a great event and I’m really looking forward to it. If you’re going to be there and want to say hi (between sessions!) let me know.