Tag Archives: SDForum Business Intelligence SIG

Sept. 21, 2010 SDF Analytics: SQL or NoSQL

On September 21, 2010 in Palo Alto at SAP, the SDForum Business Intelligence SIG hosted Richard Taylor of SenSage, who presented “Analytics: SQL or NoSQL.” Taylor’s research in parallel and distributed computing, from his early days at Cambridge through his work for DEC, Data-Cache, RedBrick Systems, Informix, and IBM, is well known to experts in the business intelligence community. That is why the room was packed when he chose to talk about the NoSQL movement, the new challenge to relational databases.

SQL started as SEQUEL in 1974. Adopted by Oracle, it became the standard language for relational databases, which use schemas, multi-version concurrency control, isolation levels and analytics extensions to deal with the complexity of structured data. The relational model created a world of normalized data in rows and columns, with tables selected, projected or joined using primary and foreign keys. It handled transaction processing very well, but complicated cases became repetitive and scaling was difficult.
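To make select, project and join concrete, here is a minimal sketch using Python's built-in sqlite3 module. The tables, column names and data are invented for illustration and did not come from Taylor's talk.

```python
import sqlite3

# In-memory database; the schema and rows below are hypothetical examples.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        amount      REAL NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
""")

rows = conn.execute("""
    SELECT c.name, o.amount          -- project: keep only two columns
    FROM customers AS c
    JOIN orders AS o                 -- join: primary key to foreign key
      ON o.customer_id = c.customer_id
    WHERE o.amount > 20              -- select: keep only matching rows
    ORDER BY o.order_id
""").fetchall()

print(rows)
```

The data is normalized: a customer's name is stored once and reached from each order through the foreign key.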

By 2000, the rise of unstructured data on the web had created new levels of complexity and the need for a new approach. The term NoSQL was coined by Eric Evans in June of 2009, and the movement is seen in the development of Google’s Bigtable, Amazon’s Dynamo and Facebook’s Cassandra. All of these use a tuple store: one table in which each tuple consists of a structured key with a column timestamp and an unstructured value. The two functions are map and reduce. Map takes a tuple as input and outputs a list of tuples. Reduce takes a key and a list of values as input, then outputs a list or a tuple. You specify the clusters, input and tuple stores, and the framework does the rest. There is no need to normalize large amounts of semi-structured data and the approach is cheaper to implement, but it still requires some programming ability, and there is no guidance from a schema or model for historical data.
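The map and reduce functions described above can be sketched in a few lines of Python. This is a toy, single-process stand-in for a real framework, not the API of any of the systems named. The log tuples, function names and the failed-login question are all assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical tuple store: (structured key, timestamp, unstructured value).
LOG_TUPLES = [
    ("host-a", 1285000000, "login ok user=ann"),
    ("host-b", 1285000005, "login failed user=bob"),
    ("host-a", 1285000010, "login failed user=ann"),
    ("host-b", 1285000020, "login failed user=bob"),
]


def map_failed_logins(record):
    """Map: take one tuple, return a list of (key, value) tuples."""
    host, _timestamp, text = record
    if "login failed" in text:
        return [(host, 1)]
    return []


def reduce_count(key, values):
    """Reduce: take a key and its list of values, return one tuple."""
    return (key, sum(values))


def run_map_reduce(records, map_fn, reduce_fn):
    """Toy framework: apply map, group values by key, then apply reduce."""
    grouped = defaultdict(list)
    for record in records:
        for key, value in map_fn(record):
            grouped[key].append(value)
    return [reduce_fn(key, grouped[key]) for key in sorted(grouped)]


FAILED_BY_HOST = run_map_reduce(LOG_TUPLES, map_failed_logins, reduce_count)
print(FAILED_BY_HOST)
```

Note that the programmer has to parse the unstructured value by hand inside the map function. That is the programming ability the approach still demands, since no schema is there to do it.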

Taylor gave examples of how SQL and NoSQL would handle the same problems. Each had its advantages and disadvantages. I urge you to read Taylor’s work and listen to him speak on this subject.

Frankly, I would still want an experienced database developer with a strong background in SQL to deal with NoSQL, because only they would be able to sense when something was wrong. Big data is no place for amateurs.

Note: A delegation from Peru was in the audience. Picture below.

Copyright 2010 DJ Cline All rights reserved.

Oct. 20, 2009 SDF Vertica

[Photos: SDF logo 2009; Monica Cavazos; Chuck Nelson; Tam Nguyen; Omer Trajman; Brian Wilcove; Roman Zhovtulya]

On October 20, 2009 in Palo Alto at SAP, the SDForum Business Intelligence SIG hosted Omer Trajman of Vertica. His topic: The Evolution of BI from Back Office to Business Critical Analytics. Trajman is an expert on cloud-based databases who launched Vertica’s cloud database on Amazon EC2 using MapReduce integration with the Apache Hadoop project. Text from DJCline.com

Trajman started off with a short history of databases and the idea of business intelligence. Gone are the days of gathering data into reports for managers who then decide what step to take next. This slow back-office process has moved not only to the front office but to the website in front of the customer. Now you can analyze in real time and react immediately using integrated real-time data warehousing. Cloud servers, complex event processing engines, analytic databases and batch-processing map/reduce systems offer near-infinite capacity to solve problems previously deemed too complicated.

A phone company can rebalance its network on the fly. A cable company can assess who is watching and direct a targeted commercial to individual viewers. A bank should be able to determine which mortgages are likely to default. The wide adoption of business intelligence at the operational level finally answers the question: What is the point of gathering all this information if we cannot act on it in a timely fashion?

There was some talk about the NoSQL movement. The idea is to have people build databases without using SQL. People are always tempted to do things on the cheap, like doing their own plumbing or electrical wiring. Personally, I think this is likely to cause problems down the road. As an example, I am sure you could build a website without knowing HTML, but you could do more if you understood the underlying code. Building databases without trained professionals is not a good way to build business intelligence. In the drive to do things faster, we must remember to do things better as well.

[Photos: Makowski, Delany, Trajman; Nguyen, Wilcove, Nelson; 10-20-09 crowd; 10-20-09 slides 1-3]

Copyright 2009 DJ Cline All rights reserved.