Entries from June 2008

Smart Desktop

I had the pleasure of meeting this morning with Jon Herlocker and John Forbes from Smart Desktop. Smart Desktop is part of Pi Corporation, the mysterious ‘personal information management’ company EMC acquired in March of this year. Pi is run by Paul Maritz, previously of Microsoft’s platform team and a member of the Microsoft Executive Committee.

Pi, which is run as an ‘EMC Company’ (much like VMware), has not yet released a product and has been in stealth mode for most of its existence. Maritz now oversees Pi and is president and general manager of EMC’s new Cloud Infrastructure and Services Division. Smart Desktop, which has a product currently in closed beta, is led by Herlocker (CTO) and Forbes (President).

Seems like a lot of executive management without much product yet, doesn’t it? It certainly speaks to a grander vision, though what this vision includes from a larger EMC perspective is still under wraps.

But I did get a demo this morning from Smart Desktop and, grander visions aside, it’s pretty cool.

Smart Desktop is a desktop tool that aims to improve an individual’s access to information stored on, or accessed from, the desktop (e.g., Web pages) by grouping it into ‘projects’. It takes into account whatever organizational structure an individual has on the desktop in Outlook or Windows Explorer, as well as the ‘activity stream’ created as the user works. It then recommends which project a given piece of content should be assigned to, based on topics, email metadata, context and so forth.

The end result is that it’s possible to view all content from multiple apps (Office documents, email, etc.) related to a project at the same time, in one place. It’s tied nicely into Outlook, so users can create a new project when an email comes in, and all subsequent emails and documents deemed related to that project will be tagged by Smart Desktop. Smart Desktop recommends content to the user based on current activity and can also be used to view content activity via a timeline – so you could look at all content (documents, emails, Web pages) you accessed during a particular meeting, for example.
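Smart Desktop hasn’t published how its recommendations actually work, but to make the project-grouping idea concrete, here is a minimal sketch of the kind of scoring such a tool might do. Everything here (the token-overlap scoring, the project names, the sample data) is an illustrative assumption on my part, not Smart Desktop’s method.

```python
# A minimal, hypothetical sketch of assigning incoming content to a
# 'project' via metadata-token overlap. Smart Desktop's real algorithms
# are not public; the names and scoring here are invented for illustration.
from collections import Counter

def tokens(text):
    """Crude tokenizer; a real system would also use topics, threads, senders."""
    return [w for w in text.lower().split() if len(w) > 2]

def score(item_text, project_texts):
    """Count token overlaps between an item and a project's tagged content."""
    item = Counter(tokens(item_text))
    proj = Counter(t for text in project_texts for t in tokens(text))
    return sum(min(item[t], proj[t]) for t in item)

projects = {
    "Q3 Budget": ["budget spreadsheet from finance", "forecast review notes"],
    "Website Redesign": ["homepage mockups", "redesign kickoff agenda"],
}

email_subject = "Re: budget forecast numbers attached"
best = max(projects, key=lambda name: score(email_subject, projects[name]))
print(best)  # -> "Q3 Budget": the email is recommended for that project
```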

There’s clearly a larger potential opportunity for this technology, which ultimately tracks users’ desktop activity and information consumption. It could be used to discover expertise or to gauge the usefulness of individual applications, though there are, of course, privacy concerns that will need to be managed differently in various geographies.

I could hypothesize about how this all fits into EMC’s cloud computing vision, but I won’t go too far down that road at this point. It’s not hard to imagine, though, how this technology, currently slated for deployment in a desktop environment (a server-based product is also planned), could translate to other types of delivery models.

Text Analytics 2008 Redux

You’ve had Nick’s take, now here’s mine, with a little overlap – great minds think alike, right? 😉 We weren’t expecting the 40 attendees who turned up for the pre-conference workshops during prime Sunday TV viewing time. Seth Grimes laid out “Text Analytics for Dummies,” while Nick gave a market overview. But the attendance (and the long Q&A sessions) were good indicators of user enthusiasm and of the desire for real, practicable advice about the field.

Some of the other memorable moments:

  • Best of the vendor panel: Seth Grimes’s challenge to say something nice about a fellow vendor’s offerings, and the vendors’ uniform response to an audience question about incorporating UIMA: that it wasn’t necessary or in demand.
  • The Facebook presentation on trend-tracking through users’ “Wall” posts was brought back for an encore by popular demand. The crowd in my session was a little confrontational about the amount of analysis being done on the available information (never enough!), but as far as quick and dirty zeitgeist goes, it was unbeatable, and a lot of fun.
  • The Clarabridge 1-hour deployment was good sport, with at least one customer’s testimony that once the system is learned, it can actually be configured with speed approaching that of CTO Justin Langseth. You have to hand it to Clarabridge: they make it look easy.

Some thoughts on the users’ takes:

  • In presentations and in private chats, the frequently recurring themes among vendors were eDiscovery and social media – some of the drivers for the market. The user questions I heard were mostly about sentiment analysis, deployment time and ROI – specifically, how to judge all of the offerings. Is sentiment analysis accurate enough? What is the expected deployment time? What is the ROI?
  • Precision and recall went back and forth again, but the hard truth is that the edge depends on the application. For patent or PubMed searches or eDiscovery, you need recall; for other applications, precision is paramount. Some users I spoke with mistook this for a lack of accuracy, but it’s more a sliding scale of usefulness (see the sketch after this list).
  • Accuracy was a recurring issue, both because text analytics is an emerging technology and because text is, of course, messy and imprecise. Partly it’s a matter of maturation. But the “fast / cheap / good – pick any two” truism about software development is equally true here. Even with built-in taxonomies and dictionaries or domain-specific knowledge, any text analytics software needs configuration to increase accuracy for its application and user, and that takes time.
  • “Win fast and win often” – great words from Tony Bodoh of Gaylord Hotels on the user panel. Because of the financial investment, the fact that text analytics software can automate away some employees’ work, the time it takes to configure and general resistance to change, it is important to gain both executive and user buy-in early in the process. Chris Jones of Intuit echoed the sentiment, adding that it’s not advisable to go after your largest (and most time-consuming) problem first – come up with a number of smaller successes to prove the concept to users and higher-ups. Incidentally, both companies are Clarabridge customers.
  • Jones also noted that one of his “lessons learned” was to avoid over-configuring or tinkering too much with the analytics. After a prudent amount of configuration, he advised, treat it more or less like a black box: don’t worry about what’s going on under the hood; let it do its job and leave the rest to the professionals.
  • Some more wisdom from the user panel: you can’t go into a text analytics deployment expecting quantifiable ROI. “You don’t know what you don’t know” – which is what the tool is there to solve. In many cases, the real potential isn’t obvious until you can see how it works with your business. At that point it’s possible to come up with applications that not even its creators could have thought up.
  • Lastly (and this is not a new sentiment, but it meant more coming from school Superintendent Chris Bowman, who looked like he had my parents on speed-dial): the text analytics field is emerging, and will become integrated with larger applications. This will eventually render a conference like this obsolete, but it also means a great chance to get a leg up as an early adopter.
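On the precision/recall point above, here is a toy calculation showing why the edge depends on the application. The document IDs and relevance judgments are invented for the example; the precision and recall formulas themselves are standard.

```python
# Toy illustration of the precision/recall tradeoff: the same relevance
# judgments, scored against a wide-net system and a narrow one.

def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved document IDs."""
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall

relevant = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}           # 10 truly relevant docs

# An eDiscovery-style system: casts a wide net, catches nearly everything.
wide_net = set(range(1, 10)) | set(range(100, 140))   # 9 relevant + 40 noise
# A precision-oriented system: returns few results, almost all on target.
narrow = {1, 2, 3, 4}                                  # 4 relevant, 0 noise

for name, result in [("high-recall", wide_net), ("high-precision", narrow)]:
    p, r = precision_recall(result, relevant)
    print(f"{name:15s} precision={p:.2f} recall={r:.2f}")
# high-recall     precision=0.18 recall=0.90
# high-precision  precision=1.00 recall=0.40
```

Neither system is “inaccurate”; each is tuned for a different definition of useful, which is the sliding scale the users were reacting to.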

Looking forward to next year!

Thoughts on the Text Analytics Summit

Overall, the conference was very interesting and, from our perspective, well worth attending. Good attendance – 190 versus 140 last year – and a better mix of users and vendors, though it’s not clear how many of those users came because of vendor sponsorships. Nevertheless, it made for better discussions than vendors simply talking among themselves.

  • There wasn’t a lot to set one vendor apart from another; many of the vendor presentations were quite similar. This was summed up by a presentation from Ernst & Young at the end of day two: though the presenter was ostensibly there as an Autonomy customer, the presentation actually showed tools from Megaputer and Seagate’s MetaLincs as well as Autonomy, and he said “any of those guys next door [in the vendor exhibit space] could do this.” Quite.
  • SaaS will be one area where vendors can differentiate themselves from the pack, at least in the short term – Clarabridge and Attensity are leading the way there.
  • The thing most people wanted to know about was sentiment analysis. There are a lot of vendors out there, as we’ve noted before, and a fair amount of confusion on the part of prospective users as to what it is, why it’s useful and what it might do for them. There will definitely be vendor consolidation in this space over the next year.
  • The pre-conference workshop where I presented my overview of the vendor landscape was much more fun than I’d anticipated, and I think my overview went well, though I wasn’t clear in what way it was a workshop – it was really just a couple of presentations from Seth Grimes and myself with questions afterwards, although the great follow-up we did in the bar probably counts towards it! We had about 40 people there, and over the next two days I met a few who said they didn’t show up because it was Father’s Day (word to the wise when organizing conferences ;)). For anyone waiting for my slides: I will send them to you, just drop me a line.
  • Everyone wanted to know what Facebook was doing and, although it was interesting, I found it a little underwhelming, even if it was quite fun. Still, they clearly have some very smart people there. And what a corpus on which to experiment: the writing on everyone’s walls. I’m sure they’ll come up with more interesting applications of text analysis over time; indeed, the presenter acknowledged that things like term disambiguation and sentiment analysis are on the roadmap, which is where things will get interesting.
  • The EU government intelligence market may prove lucrative for some of these vendors; we intend to investigate that further ourselves.
  • As with any relatively obscure technology area, those implementing text analysis need to get some quick wins under their belt rather than go for the hardest problem first – Gaylord Hotels and Intuit (both Clarabridge customers; Clarabridge was the main sponsor) emphasized that, as did others.
  • There was very little talk of semantic technologies, despite my best efforts to drum some up. I think that will change, as text analysis and semantic tech are much more closely related than the players involved seem to want to admit.
  • There was perhaps too much content – a lot of presentations, which fed the problem I mentioned earlier of many vendors sounding very similar to one another.
  • There was not enough to be heard from the large vendors that have built or bought their way into this market – notably SAP’s Business Objects (there was one person there from the Inxight team, but not presenting). SAS Institute had a lot of people on the attendee list, but most of them didn’t show for some reason, and although IBM had one presentation, I would have liked to have seen more. Microsoft’s presence was Matthew Hurst, who is clearly thinking pretty far ahead in terms of social media analysis and got a lot of people’s attention, including mine.

I’ll definitely be back next year.

IBM denies plan to open source DB2

ZDNet and its sister sites ran an interesting story yesterday indicating that IBM might be preparing to release its DB2 database under an open source license. If true, it would be a fascinating turn of events that would have a significant impact on the database industry. Unfortunately, it’s not. For more on the speculation and IBM’s denial, see this post over at our CAOS Theory blog.

Enterprise 2.0: the good and the bad

After four and a half days, twenty meetings, one heat wave and lots of hot tea (too much A/C), the second Enterprise 2.0 show is over. It’s a lot to cram into a summary-style blog post, but here goes:

What was interesting (mostly chronological and certainly not comprehensive):

  • Microsoft vs. IBM demo-duel on Monday and the buzz that carried through the week about it (people were still asking me today what I thought). General consensus? IBM knocked it out of the park but it probably doesn’t matter too much in the grand scheme of things.
  • IBM’s indication that it will include full RSS feed aggregation technology in the next version of Lotus Connections — not the 2.0 version that is just now shipping, but the one likely to ship at this time next year. Discussions on the show floor last night with some IBM folks led me to believe there is still some uncertainty as to what this actually means, but Jeff Schick, IBM’s vice president of social computing software, told me in a one-on-one meeting yesterday that IBM will go full-bore into feed aggregation in the next release.
  • Demo of NewsGator Social Sites. I’ve seen this before but it was interesting to see it on Monday afternoon, just hours after the Microsoft folks gave what can only be described as a weak SharePoint demo. Why didn’t they show Social Sites, since they included other partner technologies?
  • Discussion with Rob Curry of the Microsoft SharePoint team. He noted that for the next version of SharePoint (expected late in 2009 as part of Office 14), Microsoft has doubled the development teams on ECM and social software. I told him I thought feed aggregation and wikis are the most obvious areas in need of major advancement in SharePoint; he would say only that I would be ‘pleased’ with the next release.
  • Meeting with Tom Jenkins, Chief Strategy Officer at Open Text. Open Text had a big presence at the conference this year, an indication of the degree to which it has re-entered the collaboration market after several years of near exclusive focus on archiving, records management and compliance. What this means for the company’s SharePoint integration strategy remains to be seen.
  • Jabs traded by Sam Lawrence of Jive Software and Lawrence Liu, SharePoint Technical Product Manager at Microsoft on a panel yesterday about social computing platforms. The content itself wasn’t all that interesting but at least Sam added some humor and Lawrence is an eminently good sport.
  • Catch-up meeting with Atlassian and a discussion of how Confluence, JIRA and Atlassian’s other developer tools tie into a single sales strategy aimed at technical teams. This was followed in the general ballroom by a session given by Ned Lerner from Sony Computer Entertainment, which showed, among other things, how core Confluence and JIRA are to its game development processes.
  • Socialtext SocialCalc — this is interesting though I haven’t yet had a chance to view the demo.
  • Open source panel this morning.

What wasn’t:

  • Too much discussion of cultural change, barriers to adoption and best practices. These are all useful and much-needed topics, don’t get me wrong. But most of the sessions I joined on Tuesday and Wednesday were variations on these themes. I didn’t go to all of them to be sure, but I went to more than a few and seemed to be hearing much of the same content over and over. As Vishy put it: “If anybody says viral one more time I’m gonna sneeze.”
  • I was hoping for more discussion on integration strategies, platforms vs. point tools, profiles / identity management, standards, deployment in customer-facing environments and so forth. A layer or two deeper I guess than most of the sessions went. Maybe next year we’ll all be more able to have those conversations.
  • And speaking of next year, there were too many demos and vendor pitches this year that were extremely similar. How many will return next year? Or the year after? For that matter, for how many years will there be an “enterprise 2.0” conference before this stuff just becomes everyday?
  • Most of the more technical sessions were held today, Thursday, the final half-day of the conference, after many folks were gone.
  • Like last year, most of the sessions were way too crowded, with every seat filled. That’s a good thing for the vendors and the conference organizers, but not too comfortable or enjoyable for those in attendance.

That makes a longer list of things that were worthwhile than of those that weren’t, making it, I would say, a well-spent week. And there were lots of great hallway chats and opportunities to catch up. To anyone I was supposed to meet at some point and did not: please leave a comment or contact me directly.

Open source at Enterprise 2.0

I attended a star-studded open source panel this morning, with Bob Bickel of Ringside Networks, Jeff Whatcott of Acquia and John Newton of Alfresco. The panel and audience members discussed adoption of open source specifically for social applications.

There was a bit of discussion on market readiness for open source in this sector. A comment came from the audience that Alfresco, the most established of the three vendors, started with an “easy target” – that is, replacing document management systems that were largely understood and seen as commodities. The same audience member noted that applying commercial open source to emerging social applications may be more difficult, as these are viewed as more strategically important for IT and management.

Ringside is really only just now getting started, so it isn’t too far down the road in selling to enterprises, but Bickel came from JBoss and recounted some of his experiences there overcoming adoption hurdles at the application platform layer. Acquia is also a new company, but it is attached to the popular Drupal project and hopes to help legitimize Drupal for the enterprise.

Other questions from the audience focused mostly on the complexity of deploying some open source tools (lack of documentation etc.) and licensing issues.

The issue of how little open source was represented at this conference, something I had also noticed, also came up. John Newton said he went from booth to booth on the show floor asking “are you open source?” and got few “yes” answers. Alfresco and Acquia were on the show floor, along with a big Sun/MySQL booth, but of the 52 vendors in the demo pavilion, that was about it for vendors with primarily open source business models (a few, like Socialtext and Jive Software, dabble in open source, but it’s not their primary model).

It’s interesting that at a conference that was all about communities and user-generated content, the vendors represented didn’t have more of a focus on community-generated software. The emphasis in conference sessions and certainly among the vendors on the show floor was much more around software that is easy-to-procure and easy-to-deploy for business users…in other words, lots of SaaS.

Why? I met with John Newton after the panel and he said he thought it was just a function of the vendors present, not a real reflection of the amount of social software currently deployed as open source. I think that’s true, as most organizations definitely have WordPress, MediaWiki and Roller deployments, but none of those tools were represented at the conference. (Aaron Fulkerson from MindTouch, the commercial open source wiki vendor, was there, but MindTouch didn’t have a booth.)

Jeff Whatcott also noted off-panel that he thinks the SaaS and open source models will advance in parallel in this market, but that there will eventually be a “come to Jesus” moment when organizations realize the benefits of community development and the need for the flexibility to develop, integrate and customize this stuff. I agree that the two models will continue in parallel for a while, or perhaps more than a while, as there are likely to be roles for both SaaS and open source in the social software (or collaboration) market for the foreseeable future.

Update: I neglected to mention originally that John Eckman from Optaros did a wonderful job moderating this panel. My oversight.

Microsoft vs. IBM

The first tutorial this morning at the Enterprise 2.0 show here in Boston was Social Computing Platforms: IBM and Microsoft. It was a duel of demos, not as open or back-and-forth a discussion as I’d hoped. But the general consensus during the event and in the hallways afterwards was that Microsoft was shown up by IBM…thoroughly.

The Lotus demo was first. Lotus Connections is just coming out in version 2.0 and has a fairly complete set of capabilities for social networking, bookmarking, tagging, communities and blogging. The UI is clean and modern and the presenter, Suzanne Minnassian, did a great job sticking with her user scenario and showing how Connections can be used.

Then there was SharePoint. Microsoft SharePoint is, of course, lots of things – it’s a basic ECM product, it’s a portal and it has some nascent social computing features. But this demo was meant to focus only on those features, and they’re really not competition for Lotus Connections at this point. Just how nascent these features are was clearly evident this morning, in a demo that also relied on partner technologies and open source code. It was too technical and showed how difficult SharePoint can be to configure.

To be fair, comparing SharePoint and Connections is not really comparing apples to apples. SharePoint didn’t reach its current level of market penetration because of its social software features. Microsoft positions SharePoint as a platform and says partner technologies are better suited to customizing it for specific verticals. There’s some truth to this, but the story will no doubt change as SharePoint gets more social in future releases.

I met with Rob Curry, a product manager for SharePoint, this afternoon. He wouldn’t comment on specifics in the SharePoint road map, but we can be pretty sure that the next version, expected as part of Office 14 late in 2009, will go much further down the social software path. In the meantime, SharePoint is still a juggernaut. Can IBM make some hay with its social software lead and slow that momentum?

Google’s enterprise search: in the cloud & in a box

Google has changed the name and scope of the Website search it offers to site owners that want a little more than simply knowing their site is being indexed by Google, but don’t want to go as far as buying one of its blue or yellow search appliances. 451 clients can read what we thought of it here.

Google has three levels of Website search to offer organizations: a completely free option with no control over which parts of your site are indexed and when, known as Custom Search Edition/AdSense for Search (CSE/AFS); the newly rebranded Google Site Search; and the Google search appliances, which it sells in Mini and Search Appliance form factors and which can be used for both external-facing Website search and intranet search.

Google stopped issuing customer numbers for its appliances in October 2007; at that point it had sold to about 10,000 organizations. I suspect that number is around 11,500 now, though I don’t have any great methodology to back that up – I’m just extrapolating from previously issued growth figures. That’s an extraordinary number of organizations with a Google box.

To give some perspective, Autonomy has ~17,000 customers now, but the vast majority came from Verity. When Autonomy bought Verity in November 2005, Verity had about 15,000 customers (and Autonomy had about 1,000), and Verity in turn got about 8,000 of those customers via its acquisition of Cardiff Software in February 2004. So from a combined base of roughly 16,000 at the time of the acquisition, Autonomy has added only about 1,000 customers in 2.5 years, though of course it has done a lot of up-selling to its base and doesn’t play in the low-cost search business anymore (mainly because of Google).

The actual number of Google appliances sold is higher, of course, as many organizations have multiple appliances. I’ll never forget, 18 months or so ago, standing in front of the top ~25 technologists of a top-three Wall Street investment bank and seeing about six of them put up their hands when asked who had a Google appliance. Most of those appliances weren’t known to their bosses, or to each other.

Such Google appliance proliferation is commonplace in large organizations. The things are so cheap and so relatively easy to install that they are often bought under the radar of IT. The problem comes when times get tough (as they certainly are in investment banking IT) and the organization wants to wring more out of the assets it has, even if it didn’t know until relatively recently that it had those assets.

That’s why we strongly expect Google to come out with some sort of management layer this year to handle this sort of unintended (by the customer, that is) proliferation. Watch this space.

Enterprise 2.0 conference

Several of us will be attending the Enterprise 2.0 show here in Boston next week. I’ll be there all week, along with my colleagues Anne Nielsen, who works with me on the social software market, and Vishy Venugopalan, who covers development tools, mash-ups and rich Internet apps for The 451 Group.

We have quite a few meetings set up, and the list of who we’re meeting with shows how varied the vendors in this market currently are. We’re meeting with (listed in order of meeting, not significance): Microsoft, NewsGator, Open Text, Igloo, Spigit, Atlassian, Socialtext, Day Software, Adenin, IBM, Alcatel-Lucent, Jive Software, and Alfresco.

As you can see, that’s quite a list. It includes start-ups I’ve not spoken with before, social software players, WCM and ECM vendors I know fairly well, and the big guys. I’m not quite sure what Alcatel-Lucent is doing there, but I am intrigued to find out. Oracle is running a couple of sessions at the show, though, interestingly, it is the Oracle applications folks rather than the WebCenter team (Oracle AR notes this is because of a scheduling conflict with an Oracle sales training).

I’ve also made sure to preserve time this year to attend some sessions, something I neglected to properly account for at this show last year. I’m particularly looking forward to the three-hour tutorial on Monday morning featuring IBM and Microsoft. It’s hosted by Mike Gotta, a favorite of mine, so I’m sure it will be good.

This was a good show last year, though it had a definite ‘irrational exuberance’ sort of feel to it, and it will be interesting to see whether that has died down a bit this year. Last year the show floor was way too small and many of the sessions were overflowing, showing that the organizers had underestimated interest. It’s at the Westin Waterfront again, a nice new hotel here in Boston but not the biggest of venues (particularly as it sits next to the cavernous Boston Convention Center).

In any event, I’m sure it will be a week filled with interesting people and discussions, and that we’ll come back with lots of fodder for future research. Our schedules are tight, but anyone wanting to meet up, please feel free to leave a comment or contact me directly. Hope to see you there.

On the maturity of complex event processing

How mature is complex event processing technology? That is the question that has been doing the rounds among the great and the good of the CEP industry in recent weeks following an article in Wall Street & Technology in which Ivy Schmerken noted that attendees at the Accelerating Wall Street conference indicated it was a myth that CEP technology is mature.

StreamBase’s Mark Palmer responded with the argument that an increasing number of publicly announced deployments indicated the opposite. Then Tim Bass agreed with Ivy before mentioning the Gartner Hype Cycle, and it all went off. So who is right? Without wishing to be accused of fence-sitting, I would suggest that they all are, thanks to a couple of quirks of CEP.

For example, as the product of academic research at Caltech, Cambridge, Brown, Stanford and UC Berkeley (amongst others), it could be argued that the concepts underpinning CEP products are mature. Additionally, given the adoption of CEP by relatively conservative businesses in financial services, it could also be argued that the technology itself is mature.

However, CEP to date is a niche technology that has been targeted specifically at those customers, so holding them up as an example of its maturity is somewhat self-fulfilling. There are a number of potential markets for the wider adoption of CEP, but they are so far untapped. As Opher Etzion put it: “In the CEP area we certainly have mature applications, we also have some maturity in the products of the first generation, but we are somewhat far from the maturity of the entire area.”

I think Mark’s comment at the end of his post, that “we shouldn’t feel compelled to thwart that growth with a claim that the products are not ‘mature’ when they actually are in a lot of ways,” is quite revealing. The fact that such a level of debate about CEP’s maturity is taking place, and that Mark is concerned the debate might stifle growth, is itself indicative of an immature market segment, in my opinion.