Apr. 21, 2009 SDF Apache Mahout

On April 21, 2009 at SAP in Palo Alto, SDForum’s Business Intelligence SIG hosted “BI Over Petabytes: Meet Apache Mahout” by Jeff Eastman. Suzanne Hoffman of Star Analytics talked about what she learned at the Gartner conference. Performance management is making a comeback as people try to make better use of the information they may already have. The leaders in BI are IBM Cognos, Microsoft and Oracle. One visionary is TIBCO.

Eastman described machine learning as a subfield of artificial intelligence concerned with algorithms that let computers improve their performance by learning from past experience. It is used in search clustering, knowledge management, mapping social networks, transforming taxonomies, analyzing markets, filtering unwanted e-mail and detecting fraud.

The Apache Mahout project is dedicated to producing open source machine learning tools on the Apache Hadoop distributed computing platform, which orchestrates thousands of computers to analyze huge volumes of data in reasonable time. Mahout currently offers highly scalable programs for classifying (is this spam?), clustering (are these similar?), recommending (if you like X you might also like Y) and other tasks that can improve their performance by learning from past experience. Coupled with cost-effective cloud computing infrastructure such as Amazon’s EC2/S3, this makes it practical for even small companies to distill business intelligence from Internet-sized datasets. The world needs scalable implementations of machine learning under an open license, and that is what Mahout aims to provide.
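
To make the "if you like X you might also like Y" idea concrete, here is a minimal single-machine sketch in Python of one simple way to recommend: counting which items show up together in users' histories. This is not Mahout's API (Mahout is written in Java and runs on Hadoop) and it was not shown in the talk. The function names and the toy data are hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_counts(histories):
    """Count how often each pair of items appears in the same user's history."""
    counts = defaultdict(lambda: defaultdict(int))
    for items in histories:
        # Deduplicate so an item repeated by one user is counted once.
        for a, b in combinations(sorted(set(items)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def recommend(item, counts, top_n=3):
    """Return up to top_n items most often seen together with `item`."""
    neighbours = counts.get(item, {})
    # Highest count first; ties are broken alphabetically for stable output.
    ranked = sorted(neighbours.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_n]]


# Hypothetical toy data: each list is one user's liked items.
histories = [
    ["X", "Y", "Z"],
    ["X", "Y"],
    ["X", "W"],
    ["Y", "Z"],
]

counts = cooccurrence_counts(histories)
print(recommend("X", counts))  # prints ['Y', 'W', 'Z']
```

On four users this is trivial. The counting step is the part that would have to be spread across many machines once the histories number in the millions.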

Copyright 2009 DJ Cline. All rights reserved.