Text Analytics Summit 2009

With the 5th annual Text Analytics Summit now in the bag, here are my thoughts on the event.

My Sunday night talk on choosing among vendor options was, I think at least, well received. There were probably only about 30 people in the room, but all bar about 5 of them were end users, which is good. The slides are available to anyone who drops me a note, and for those who were there on Sunday, I will get them to you very soon.

That end-user theme carried on into the main conference, where the proportion of end users was without a doubt higher than last year. Overall attendance was down slightly, and when I saw the list on Monday morning I was concerned. But more than a third of the attendees were users, which was much better than last year, when there was often a feeling of vendors pitching to other vendors, which doesn’t help anybody.

A fair few of the end users present were at a very early stage of their assessment, too. Many were merely aware that text analytics can do something for them, but hadn’t engaged properly with any of the vendors. I will be following up with those and the other users I met during the conference as we look to help them evaluate their vendor options.

The end-user panel, moderated well by our own Katey Wood, was interesting as ever. Jon Lehto of Monster.com had some rich insight, and Bryan Jeppsen of JetBlue, now two years into its use of Attensity, explained how the airline had changed its customer surveys from 1 open-ended question in 40 (alongside 39 structured questions) to mostly open-ended ones, as it now has the power to analyze that text and get insight it would never have received had it had to work out in advance what sort of answers it wanted. Both AOL and JetBlue were able to bypass their IT departments and go with the SaaS versions of their vendors’ products.

The analyst panel, if I’m being honest, was probably a bit flat from the audience’s perspective as we were agreeing too much. I tried to disagree at one point but then didn’t quite clarify what I meant, so I did it in an earlier post. We had a question from someone in the audience from Whirlpool about ROI, which we all struggled with a bit. ROI on text analytics apps is tricky because:

  • quite often you’re doing something completely new that you’ve never been able to think of doing before, such as automatically parsing customers’ comments on blogs
  • many text analytics apps are quite small and thus don’t often require such an ROI measure
  • they’re often part of some sort of competitive or customer intelligence effort that’s much larger and thus the text analytics element itself isn’t subject to ROI.

But clearly, for a company that has made an investment in text analytics the size of Whirlpool’s, it’s a valid question, and it made us all ponder ROI a bit more deeply.

Things I thought I’d hear more about but didn’t: cloud and eDiscovery. There were SaaS-based representatives there in the shape of Clarabridge and Attensity for sure, and Clarabridge in particular has some great reference customers willing to speak on its behalf, notably AOL and Intuit. But in terms of true cloud-based text analytics, it’s still too early, and may even be so next year.

I was more surprised not to hear much about eDiscovery. What little I did hear (apart from listening to the sound of my own voice, of course) was from Ernst & Young about its proactive fraud detection work, some of which has been parlayed from previous successful eDiscovery work with clients, which is exactly what we thought would be happening (always good to hear end-user validation of predictions made in research).

Things I thought I’d hear about and did: sentiment analysis. Last year it was the undercurrent of the conference. This year it came very much to the surface. There wasn’t too much difference between a lot of the offerings, and some of the presentations (but by no means all) were a bit too down in the weeds. But there are tons of interesting implementations out there now, although there is a fair amount of work still to be done.

Anyway, overall it was well worth it, and I recommend next year’s conference to anyone interested in how to leverage text for insight into customers, competitors, risk exposure or all sorts of other business and organizational issues.



#1 Dave Schubmehl on 06.04.09 at 10:34 am

I really wasn’t surprised to see a lack of discussion about e-discovery, as text analytics for e-discovery is always part of the search process. I do think that the embedding of text analytics into search in general is important, as evidenced by Daniel Tunkelang’s and the EMC presentations.

I think it’s going to be interesting to see how search and text analytics will combine over the next few years.

#2 Daniel Tunkelang on 06.04.09 at 10:56 am

I enjoyed the summit and was pleasantly surprised by the overall intellectual level. Perhaps my expectations of industry conferences are too low!

My only complaint was that there seemed to be little discussion of (and perhaps little interest in) the details of how different text analytics approaches work. If anything, the message from some of the vendors is that no one wants to know (though the spin on this is that everyone wants the complexity masked).

But I agree with Dave that, for many applications (and certainly the ones I care most about), text analytics is tightly wound up with search, especially in areas like eDiscovery. It behooves both communities to play well with one another.

#3 Sid Banerjee on 06.04.09 at 5:30 pm

There may be a latent (i.e. untapped) market for text analytics and search, but in my humble, biased opinion, the story again this year is that the real action and growth in text analytics is at the intersection of text, business intelligence, statistics, and reporting. Text may start in documents, knowledge management systems, CRM and survey platforms – but real business value is being created (and was highlighted repeatedly in the show) when that text is ported into dimensional database, analytical, and business intelligence contexts and when it’s served up to business analysts, customer operations, and market researchers.

My 2 cents…

Sid Banerjee

#4 Eric Martin on 06.08.09 at 10:00 am

Well, I think the holy grail of eDiscovery is all about finding discernible and reproducible patterns, which really enable proactive fraud detection, as E&Y mentioned. So search is definitely one early step they have to go through, but going to predictive analytics requires combining structured and unstructured data together for better fraud modelling. That’s what they use SPSS for…

#5 Catherine van Zuylen on 06.11.09 at 2:14 pm

We have definitely seen an uptick in customers using Attensity for fraud and legal analytics, but IMHO the reason that these topics don’t tend to get addressed at these conferences is that few (if any) end-customers are willing to talk to outsiders much about their initiatives in this area.

I think you hit the nail on the head when you mentioned the tussle between “explaining in detail how everything works” and “making your solution sound too hard to use”.

Many times, a vendor will “show” exceedingly well in a demo, but when you drill down, or when you try to use the technology in practice, it might fall down.

On the other hand, start throwing around sentence diagramming and the “how text analytics works” and business clients sometimes complain that your solution looks “too hard to use.”

I will say, after nearly 10 years in the business, it is exciting to finally see the promise of text analytics being realized in applications that are in use.