Laurent Lachal
German Rapid-I pushes open source data mining forward
Data mining (increasingly referred to as 'predictive analytics') has yet to get out of its ghetto. It enables enterprise customers to anticipate and respond to changes in market conditions, risks, customer behaviour etc. Yet it remains restricted to a relatively elite niche of users in a small number of organisations because of its complex implementation cycles, skill requirements and general lack of user-friendliness. Two developments may change the situation. The first is the closer ties between business intelligence (BI) and data mining platforms through partnerships like the ones SPSS signed with Business Objects in December 2007 and Cognos in March 2008. The second is via open source data mining companies such as Rapid-I, a pragmatic young German start-up. Rapid-I markets RapidMiner, an open source data mining solution known as YALE (Yet Another Learning Environment) until mid-2007, when it was rebranded for legal reasons. Its proprietary counterparts (e.g. SAS, SPSS) have a stronger focus on (classical) statistics and well-established standard data mining processes (including industry sector-specific solutions and tools), while RapidMiner is more generic, with a stronger focus on flexibility, extendability, artificial intelligence methods and meta-learning, as well as on the data mining process as a whole, including arbitrary process nesting and automated optimisation. Rapid-I does not position itself directly against proprietary tools, asserting instead that RapidMiner not only competes with but also complements them in that it has a wider area of applicability (because it is more flexible by design and because its open source status enables users to customise it to fit their specific requirements).
Apart from pricing, Rapid-I claims to make a difference via usability. That may be the case, but data mining techniques are still beyond the understanding of the average business analyst or knowledge worker. Hence the full value of data mining in the organisation is not so easily realised, even via open source software. The company also argues that open source data mining solutions are much faster at adopting leading-edge technologies (to which many proprietary vendors will retort that they wait for the market to be ready before making use of such technologies).
Rapid-I plans to expand its offering in two directions. First, it intends to deliver the product with a broader set of features. Second, it plans to build analytical applications on top of RapidMiner. We believe this second objective to be the more promising of the two in that it addresses the two major obstacles to data mining adoption: limited usability and lack of focus on business needs. It will be hard, though, for Rapid-I to attract partners that will not only sell Rapid-I's analytical applications but also embed RapidMiner in their own applications. The company already has a handful of OEM partners, though. Rapid-I claims to be profitable and has so far adopted a conservative approach to financing: the co-founders bankroll the company themselves instead of relying on venture capital (VC) funding. Ovum would prefer to see the company tap into VC money (and connections). A more aggressive VC-fuelled expansion strategy would help the company cement its position and build up its much-needed partner network.
For more information please refer to the Ovum report entitled 'Rapid-I: German open source data mining'.

