Tag Archives: Hive

Aug. 7, 2013 Hive Hadoop MapReduce And SQL


On Wednesday, August 7, 2013, in Sunnyvale at NetApp, The Hive held an event discussing Big Data, Hadoop, Hive, MapReduce, Pig and SQL. Raghu Ramakrishnan of Microsoft moderated panelists Justin Erickson of Cloudera, Alan Gates of Hortonworks, Sausheel Kaushik of Pivotal, Priyank Patel of Teradata Aster, and Tomer Shiran of MapR. Grabbing large batches of data with MapReduce is fine, but businesses still want SQL for interactive and real-time queries. The result will be a hybrid of new and old strategies. The best approach is to hire an experienced SQL developer with a strong ETL background and let them learn the new tools. That way you will get the information you need when you need it.


Copyright 2013 DJ Cline All rights reserved.

 

Aug. 18, 2009 SDF Business Intelligence in the Cloud


On August 18, 2009, in Palo Alto at SAP, SDForum presented “Cutting Edge Business Intelligence in the Cloud” with Lenin Gali of ShareThis. ShareThis has a widget that allows people to share what they find on the web with others on their social network. It doesn’t matter if it is Facebook, Twitter, MySpace, or LinkedIn. Their clients include Fox Media, UsMagazine, Wired, ESPN, and movies.com. They built their IT on Amazon EC2, Cascading, Hadoop, Hive and MicroStrategy. They use Aster Data for their data warehouse. Text from DJCline.com

If you come from a traditional database IT background, I guarantee that you have never seen an operation like this. Cascading is the processing API for Hadoop clusters. There are pipes, flows, branches and groups. You get event notification, can write scripts, and can control processing at the tuple level. Hive is the data warehouse built on top of Hadoop. It supports simple SQL-style queries through HQL. You can build custom map/reduce jobs for complex analytics. You can still make ad hoc queries against large data sets. The Aster Data DW in the cloud runs on scalable commodity hardware with a Massively Parallel Processing (MPP) architecture. It uses SQL, Map/Reduce, JDBC, and ODBC, and is compatible with Extract, Transform, and Load (ETL) tools. The Aster Data architecture uses PostgreSQL and has a beehive hierarchy. Queens control the cluster and hold metadata while workers process and store the data. If the queen fails, it is replaced immediately.
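For readers coming from SQL, here is a rough sketch of what a simple HQL-style aggregate looks like when it is spelled out as map and reduce steps. It is plain Python, not the Hive, Hadoop, or Cascading API. The event data, the table name in the comment, and the function names are all made up for illustration and are not from the talk.

```python
from collections import defaultdict

# Hypothetical share events as (service, url) pairs. Invented sample data.
EVENTS = [
    ("twitter", "example.com/a"),
    ("facebook", "example.com/a"),
    ("twitter", "example.com/b"),
    ("linkedin", "example.com/c"),
    ("twitter", "example.com/a"),
]

# Roughly the query being imitated (table name "shares" is hypothetical):
#   SELECT service, COUNT(*) FROM shares GROUP BY service;


def map_phase(events):
    """Emit one (service, 1) pair per event."""
    for service, _url in events:
        yield service, 1


def shuffle(pairs):
    """Group emitted values by key, as a framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the grouped values for each key."""
    return {key: sum(values) for key, values in groups.items()}


def shares_by_service(events):
    """Run the three steps in order and return counts per service."""
    return reduce_phase(shuffle(map_phase(events)))


if __name__ == "__main__":
    print(shares_by_service(EVENTS))
```

The point of the sketch is the shape of the work: the one-line GROUP BY hides a map step, a grouping step, and a reduce step, which is why a SQL layer on top of Hadoop is attractive for ad hoc questions.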

They think that all of this is easier to use and lowers their costs. They keep their headcount down and their revenue up. It works for them. The question is whether it will work elsewhere.


Copyright 2009 DJ Cline All rights reserved.