The Data Day, Two days: December 12/13 2012

Total Data Analytics. Couchbase Server 2.0. And more

And that’s the Data Day, today.

New 451 Research report: Total Data Analytics

451 Research’s Information Management practice has published its latest long-format report: Total Data Analytics. Written by Krishna Roy, Analyst, BI and Analytics, along with me, it examines the impact of ‘big data’ on business intelligence and analytics.

The growing emphasis on ‘big data’ has focused unprecedented attention on the potential of enterprises to gain competitive advantage from their data, helping to drive adoption of BI/analytics beyond the retail, financial services, insurance and telecom sectors.

In 2011 we introduced the concept of ‘Total Data’ to reflect the path from the volume, velocity and variety of big data to the all-important endgame of deriving maximum value from that data. Analytics plays a key role in deriving meaningful insight – and therefore, real-world business benefits – from Total Data.

In short, big data and Total Data are changing the face of the analytics market. Advanced analytics technologies are no longer the preserve of MBAs and ‘stats geeks,’ as line-of-business managers and others increasingly require this type of analysis to do their jobs.

Total Data Analytics outlines the key drivers in the analytics sector today and in the coming years, highlighting the technologies and vendors poised to shape a future of increased reliance on offerings that deliver on the promise of analyzing structured, semi-structured and unstructured data.

The report also takes a look at M&A activity in the analytics sector in 2012, as well as the history of investment funding involving Hadoop, NoSQL and Hadoop-based analytics specialists, and contains a list of 40 vendors we believe have the greatest potential to shape the market in the coming years.

The report is available now to 451 Research clients, here. Non-clients can get more information and download an executive summary from the same link.

The Data Day, Two days: November 6/7 2012

Microsoft launches Hekaton, PolyBase. Appcelerator acquires Nodeable. And more

And that’s the Data Day, today.

The Role of NoSQL and Graphs in the Total Data Landscape

I’ll be flying over to San Francisco at the weekend to attend and present at GraphConnect, which takes place at the Hyatt Regency on November 5 and 6.

Specifically, I’ll be giving a presentation on the role of NoSQL and graphs in the total data landscape, subtitled Big Data, Total Data, NoSQL, Graph, at 11.40am on November 6.

Here’s the overview: The database market is changing rapidly, with new approaches emerging that provide an alternative to the relational data model. This presentation examines the drivers behind the rise of NoSQL data stores and, in particular, graph databases, focusing on their use cases and adoption trends, and exploring where graph databases fit in the world of NoSQL, NewSQL, and big data.

I’ll also be moderating a panel at 5.05pm on November 6 composed of enterprise companies that use graph databases in production. The panel includes 3-4 technical leads from Accenture, Cisco and Telenor Norway, who will discuss what it takes to put large-scale graph databases into production.

GraphConnect looks like a great event for anyone with experience with, or just interest in, graph databases. Keynotes will be provided by Emil Eifrém, CEO, Neo Technology, and James Fowler, co-author of Connected: The Surprising Power of Our Social Networks and How They Shape Our Lives.

The full agenda can be found here, and it’s not too late to register, here.

The Data Day, Two days: October 15/16 2012

NGDATA searches for consumer intelligence. Sparsity looks for social analytics partners.

And that’s the Data Day, today.

The Data Day, Two days: September 25/26 2012

Total Data analysis. Tokutek gets flash. And more.

And that’s the Data Day, today.

The Data Day, Today: September 10 2012

Total data reporting. Business value from Hadoop. And more.

And that’s the Data Day, today.

What big data can learn from total football, and vice versa: part two

With transfer deadline day in full swing, it seems like as good a day as any to complete our look at the relationship between football (soccer) and big data (part one here).

Today is the last chance for a few months for football clubs to outsmart their rivals by buying the players that they hope will give them a competitive advantage for the rest of the season. How will data play a part?

Whereas the 2002 Oakland Athletics provided a clear example of how statistics can be used to gain competitive advantage in baseball player recruitment, evidence of similar success in football is harder to find. As indicated in part one, a prime example is Bolton Wanderers, which arguably punched above its weight for years and was one of the first Premier League teams to use statistics to influence strategy and player recruitment.

As Simon Kuper details, one of the key examples of the latter is the club’s 2004 signing of the late Gary Speed, who at 34 would have been judged by many as being too old to compete for much longer at the highest level.

Kuper reports how Bolton was able to compare Speed’s physical data with that of younger – and more expensive – players in similar positions, and determine that his performance was unlikely to deteriorate as much as would be assumed. Speed played for Bolton for another four years.

While there are other examples of successful purchases being influenced by data, those more sceptical about the potential for data to influence the beautiful game can also point to some high-profile failures.

If a moneyball approach was going to be successful within English football, it had the perfect chance to prove itself at Liverpool in recent years. Since October 2010 the club has been owned by Fenway Sports Group and John W Henry, who once tried to hire Billy Beane as the general manager of the Boston Red Sox and in November 2010 hired the closest thing European football has to Billy Beane – Damien Comolli – as Liverpool’s Director of Football Strategy.

Statistical relevance
Quite how much Liverpool’s subsequent transfer policy was influenced by statistics is known only to Liverpool insiders, but Comolli was certainly cited as being responsible for the signings of Luis Suárez, Andy Carroll, Jordan Henderson, Charlie Adam, Stewart Downing, and José Enrique – for an estimated total of £110m – with the £35m spent on Carroll making him the most expensive British footballer of all time.

Either way, statistics have been used to judge the wisdom of those purchases, with the scoring record of striker Andy Carroll (6 goals in 44 Premier League games) and winger Stewart Downing’s record of goals and assists (0 and 0 in 37 Premier League games) coming in for particular scrutiny.

Carroll yesterday joined West Ham United on loan, while Downing looks likely to have to adopt a more defensive role to stay at the club. Comolli left Liverpool in April 2012 by mutual consent.

While Liverpool’s transfer dealings are hardly a ringing endorsement for the applicability of statistical analysis to football, it would be wrong to judge their compatibility on the basis of transfers alone.

Network theory
We have also seen growing evidence of interest in applying statistical analysis to football tactics, with a number of academic research reports having been published in recent months. These include Quantifying the Performance of Individual Players in a Team Activity, which originated at the Amaral Lab for Complex Systems and Systems Biology at Northwestern University and provides the basis for Chimu Solutions’ FootballRating.com, and A network theory analysis of football strategies by researchers at University College London and Queen Mary University of London.


Source: A network theory analysis of football strategies

Both of these use network-based analysis to understand and represent the value of players within a team’s overall strategy. As the researchers behind ‘A network theory analysis of football strategies’ explain:

“The resulting network or graph provides a direct visual inspection of a team’s strategy, from which we can identify play pattern, determine hot-spots on the play and localize potential weaknesses. Using different centrality measures, we can also determine the relative importance of each player in the game, the ‘popularity’ of a player, and the effect of removing players from the game.”
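To make that concrete, here is a minimal sketch of the passing-network idea in Python using the networkx library. The players, pass counts and choice of centrality measures are illustrative assumptions of mine, not data or methods taken from either paper.

```python
# Minimal passing-network sketch (illustrative data, not from the cited papers).
# Each directed edge A -> B is weighted by how many times A passed to B.
import networkx as nx

passes = [
    ("Keeper", "LeftBack", 12), ("Keeper", "RightBack", 11),
    ("LeftBack", "Midfielder", 18), ("RightBack", "Midfielder", 15),
    ("Midfielder", "Winger", 22), ("Midfielder", "Striker", 7),
    ("Winger", "Striker", 9),
]

G = nx.DiGraph()
for passer, receiver, count in passes:
    G.add_edge(passer, receiver, weight=count)

# Betweenness centrality: how often a player lies on passing paths between
# teammates -- one proxy for their importance to the team's build-up play.
importance = nx.betweenness_centrality(G)

# PageRank: a 'popularity' measure -- players who receive passes from
# well-connected teammates score highly.
popularity = nx.pagerank(G, weight="weight")

for player in G.nodes:
    print(f"{player:<12} betweenness={importance[player]:.3f} "
          f"pagerank={popularity[player]:.3f}")

# The effect of removing a player (substitution, red card) can be gauged
# by deleting the node and recomputing the measures.
G.remove_node("Midfielder")
print(nx.betweenness_centrality(G))
```

Nothing here goes beyond what the quoted passage describes; it simply shows how little code is needed to start asking those questions once pass data is in graph form.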

Looking at this from the perspective of someone with an interest in analytics, it is fascinating to see football analyzed and represented in this way. Looking at it from the perspective of a football fan, I can’t help wondering whether this is just a matter of science being used to explain something that footballers and football fans instinctively understand.

Another research paper, Science of Winning Soccer: Emergent pattern-forming dynamics in association football, certainly falls into the category of over-explaining the obvious. Based on quantitative analysis of a frame-by-frame viewing of a soccer match, the researchers concluded that “local player numerical dominance is the key to defensive stability and offensive opportunity.”

In other words, the attacking team is more likely to score if it has more players in the opposition’s penalty area than there are defenders (having “numbers in the box”), while the defending team is less likely to concede if it has more defenders than there are attackers (or has “parked the bus”).

What’s more: “The winning goal of the Manchester City match occurred when a Queen Park Ranger [sic] fell down”. “She fell over!”

Which isn’t to say that there is nothing football can learn from big data – just that there are clearly areas in which statistical analysis has more value to contribute than others.

But we’ll conclude by looking at what data management can learn from football – particularly total football – the soccer tactic that emerged in the early 1970s and inspired our concept of Total Data.

In our report of the same name we briefly explained the key aspects of Total Football…

Total Football was a different strategic approach to the game that emerged in the late 1960s, most famously at Ajax of Amsterdam, and focused not on the position of the player but on his ability to make use of the space between positions. Players were encouraged to move into space rather than sticking to pre-defined notions of their positional role, even exchanging positions with a teammate.

While this exchange of positions came to symbolize Total Football, the maintenance of formation was important in balancing the skills and talents of individual team members with the overall team system. This was not a total abandonment of positional responsibility – the main advantage lay in enabling a fluid approach that could respond to changing requirements as the game progressed.

This fluidity relied on having players with the skill and ability to play in multiple positions, but also high levels of fitness in order to cover more of the pitch than the players whose role was determined by their position. It is no coincidence that Total Football emerged at the same time as an increased understanding of the role that sports science and diet had to play in improving athletic performance.

… and outlined four key areas in which we believe data management as a discipline can learn from Total Football in terms of delivering value from big data:

  • Abandonment of restrictive (self-imposed) rules about individual roles and responsibility

Accepting specialist data management technologies where appropriate, rather than forcing existing technologies to adapt to new requirements. Examples include the adoption of non-relational databases to store and process non-relational data formats, and the adoption of MapReduce to complement existing SQL skills and tools (a minimal sketch of the MapReduce model follows this list).

  • Promotion of individuality within the overall context of the system

This greater willingness to adopt specialist technologies where appropriate to the individual application and workload does not require the abandonment of existing investments in SQL database and data-warehousing technologies, but rather an understanding of the benefits of individual data storage and processing technologies and how they can be used in a complementary manner – or in concert – to achieve the desired result.

  • Enabling, and relying on, fluidity and flexibility to respond to changing requirements

The adoption of alternative platforms for ad hoc, iterative data analysis enables users to have more options to respond to new analytic requirements and to experiment with analytic processing projects without impacting the performance of the data warehouse.

  • Exploitation of improved performance levels

The role of more efficient hardware, processor and storage technologies is often overlooked, but it is this improved efficiency that means users are now in a position to store and process more data, more efficiently than ever.
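As a footnote to the first point above, here is a rough, local-only sketch of the MapReduce programming model referred to there: counting page views per URL from semi-structured log lines. The log format, field positions and function names are hypothetical; on a real cluster the same map and reduce functions would run over data in HDFS (for example via Hadoop Streaming) rather than an in-memory list.

```python
# A local-only illustration of the MapReduce model: counting page views per URL
# from semi-structured log lines. Log format and field positions are hypothetical.
from itertools import groupby
from operator import itemgetter

sample_logs = [
    "2012-08-31 10:01:02 GET /products/widget 200",
    "2012-08-31 10:01:05 GET /products/widget 200",
    "2012-08-31 10:01:09 GET /checkout 500",
]

def map_phase(line):
    """Emit (key, value) pairs -- here, (url, 1) for each request line."""
    fields = line.split()
    yield (fields[3], 1)

def reduce_phase(key, values):
    """Aggregate all values seen for a key -- here, sum the counts per URL."""
    return (key, sum(values))

# Map step: apply the mapper to every input record.
mapped = [pair for line in sample_logs for pair in map_phase(line)]

# Shuffle/sort step: group intermediate pairs by key (the framework does this on a cluster).
mapped.sort(key=itemgetter(0))

# Reduce step: aggregate each group.
for url, group in groupby(mapped, key=itemgetter(0)):
    print(reduce_phase(url, (count for _, count in group)))
```

The point is not the word-count-style example itself, but that this style of processing works directly against raw, semi-structured files, complementing rather than replacing the SQL tools already in place.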

Hadoop’s potential to revolutionise the IT industry

Platfora’s CEO Ben Werther recently wrote a great post explaining the benefits of Apache Hadoop and its potential to play a major role in a modern-day equivalent of the industrial revolution.

Ben highlights one of the important aspects of our Total Data concept: that generating value from data is not just about the volume, variety and velocity of ‘big data’, but also about the way in which the user wants to interact with that data.

“What has changed – the heart of the ‘big data’ shift – is only peripherally about the volume of data. Companies are realizing that there is surprising value locked up in their data, but in unanticipated ways that will only emerge down the road.”

He also rightly points out that while Hadoop provides what is fast becoming the platform of choice for storing all of this data, from an industrial revolution perspective we are still reliant on the equivalent of expert blacksmiths to make sense of all the data.

“Since every company of any scale is going to need to leverage big data, as an industry we either need to train up hundreds of thousands of expert blacksmiths (aka data scientists) or find a way into the industrialized world (aka better tools and technology that dramatically lower the bar to harnessing big data).”

This is a point that Cloudera CEO Mike Olson has been making in recent months. As he stated during his presentation at last month’s OSBC: “we need to see a new class of applications that exploit the benefits and architecture of Hadoop.”

There has been a tremendous amount of effort in the past 12-18 months to integrate Hadoop into the existing data management landscape, via the development of uni- and bi-directional connectors and translators that enable the co-existence of Hadoop with existing relational and non-relational databases and SQL analytics and reporting tools.

This is extremely valuable – especially for enterprises with a heavy investment in SQL tools and skills. As Larry Feinsmith, Managing Director, Office of the CIO, JPMorgan Chase, pointed out at last year’s Hadoop World: “it is vitally important that new big data tools integrate with existing products and tools”.

This is why ‘dependency’ (on existing tools/skills) is an integral element of the Total Data concept alongside totality, exploration and frequency.

However, this integration of Hadoop into the established data management market really only gets the industry so far, and in doing so maintains the SQL-centric view of the world that has dominated for decades.

As Ben suggests, the true start of the ‘industrial revolution’ will begin with the delivery of tools that are specifically designed to take advantage of Hadoop and other technologies and that bring the benefits of big data to the masses.

We are just beginning to see the delivery of these tools and to think beyond the SQL-centric perspective, with analytics approaches specifically designed to take advantage of MapReduce and/or the Hadoop Distributed File System. Even this, though, signals only the end of the beginning of the revolution.

‘Big data’ describes the realization of greater business intelligence by storing, processing and analyzing data that was previously ignored due to the limitations of traditional data management technologies.

The true impact of ‘big data’ will only be realized once people and companies begin to change their behaviour – using the greater business intelligence gained from tools specifically designed to exploit the benefits and architecture of Hadoop and other emerging data processing technologies to alter business processes and practices.

Forthcoming Webinar: Real World Success from Big Data

The initial focus of ‘big data’ has been about its increasing volume, velocity and variety — the “three Vs” — with little mention of real world application. Now is the time to get down to business.

On Wednesday, May 30, at 9am PT I’ll be taking part in a webinar with Splunk to discuss real world successes with ‘big data’.

451 Research believes that in order to deliver value from ‘big data’, businesses need to look beyond the nature of the data and re-assess the technologies, processes and policies they use to engage with that data.

I will outline 451 Research’s ‘total data’ concept for delivering business value from ‘big data’, providing examples of how companies are seeking agile new data management technologies, business strategies and analytical approaches to turn the “three Vs” of data into actionable operational intelligence.

I’ll be joined by Sanjay Mehta, Vice President of Product Marketing at Splunk, which was founded specifically to focus on the opportunity of effectively getting value from massive and ever-changing amounts of machine-generated data, one of the fastest-growing and most complex segments of ‘big data’.

Sanjay will share big data achievements from three Splunk customers, Groupon, Intuit and CenturyLink. Using Splunk, these companies are turning massive volumes of unstructured and semi-structured machine data into powerful insights.

Register here.