Kylin Web Interface: Key Concepts

Big Data Era:

1. More and more data is becoming available on Hadoop
2. Limitations of existing Business Intelligence (BI) tools
　　Limited support for Hadoop
　　Data size growing exponentially
　　High latency for interactive queries
　　Scale-up architecture
3. Challenges in adopting Hadoop as an interactive analysis system
　　The majority of analyst groups are SQL-savvy
　　No mature SQL interface on Hadoop
　　OLAP capability in the Hadoop ecosystem is not ready yet

Business Needs for Big Data Analysis

1. Sub-second query latency on billions of rows
2. ANSI SQL for both analysts and engineers
3. Full OLAP capability to offer advanced functionality
4. Seamless integration with BI tools
5. Support for high cardinality and high dimensionality
6. High concurrency – thousands of end users
7. Distributed, scale-out architecture for large data volumes

Kylin is designed to accelerate the performance of more than 80% of analytics queries on Hadoop.

Technical Challenges:

1. Huge data volume
　　Table scans
2. Big table joins
　　Data shuffling
3. Analysis at different granularities
　　Runtime aggregation is expensive
4. MapReduce jobs
　　Batch processing

OLAP Cube – Balance between Space and Time
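
The space/time trade-off can be illustrated with a minimal sketch. It is not Kylin code: the fact rows, the dimension names, and the helpers `build_cube` and `query` are hypothetical, chosen only for illustration. The idea is that every combination of dimensions (a cuboid) is aggregated ahead of time, so a query becomes a lookup instead of a table scan, at the cost of storing 2^n cuboids for n dimensions.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows: (year, country, category, sales)
ROWS = [
    (2014, "US", "Toys", 10),
    (2014, "US", "Books", 5),
    (2014, "CN", "Toys", 7),
    (2015, "US", "Toys", 3),
    (2015, "CN", "Books", 8),
]
DIMENSIONS = ("year", "country", "category")


def build_cube(rows, dimensions):
    """Pre-aggregate SUM(sales) for every subset of dimensions (every cuboid)."""
    cube = {}
    indexes = range(len(dimensions))
    for size in range(len(dimensions) + 1):
        for combo in combinations(indexes, size):
            cuboid = defaultdict(int)
            for row in rows:
                key = tuple(row[i] for i in combo)
                cuboid[key] += row[-1]  # the last column is the measure
            cube[tuple(dimensions[i] for i in combo)] = dict(cuboid)
    return cube


def query(cube, group_by, key):
    """Answer a query with a dictionary lookup instead of a table scan.

    group_by must list dimensions in the same order as DIMENSIONS.
    """
    return cube[tuple(group_by)][tuple(key)]


cube = build_cube(ROWS, DIMENSIONS)
print(len(cube))                                          # 8 cuboids (2^3)
print(query(cube, ["country"], ["US"]))                   # 18
print(query(cube, ["year", "category"], [2014, "Toys"]))  # 17
print(query(cube, [], []))                                # 33 (grand total)
```

Build time and storage grow with the number of cuboids, while each query stays a constant-time lookup; that is the balance the heading refers to.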

How Does Kylin Utilize Hadoop Components

1. Hive
　　Input source
　　Pre-joins the star schema during cube building
2. MapReduce
　　Pre-aggregates metrics during cube building
3. HDFS
　　Stores intermediate files during cube building.
4. HBase
　　Stores the data cube.
　　Serves queries on the data cube.
　　Coprocessors are used for query processing.
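
The division of labor above can be sketched in plain Python. This is an illustrative assumption, not Kylin's implementation: the tables, the function names `pre_join`, `map_phase`, and `reduce_phase`, and the dict standing in for HBase are all hypothetical, and the HDFS intermediate files and HBase coprocessors are left out.

```python
from collections import defaultdict

# Hypothetical star schema: one fact table plus one lookup (dimension) table.
FACT_SALES = [
    {"product_id": 1, "price": 10.0},
    {"product_id": 2, "price": 4.0},
    {"product_id": 1, "price": 6.0},
    {"product_id": 3, "price": 5.0},
]
DIM_PRODUCT = {
    1: {"category": "Toys"},
    2: {"category": "Books"},
    3: {"category": "Books"},
}


def pre_join(fact_rows, dim_product):
    """Hive's role: flatten the star schema into one wide table."""
    return [{**row, **dim_product[row["product_id"]]} for row in fact_rows]


def map_phase(flat_rows):
    """MapReduce map step: emit (dimension value, measure) pairs."""
    for row in flat_rows:
        yield row["category"], row["price"]


def reduce_phase(pairs):
    """MapReduce reduce step: sum the measure per dimension value."""
    totals = defaultdict(float)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


# A plain dict stands in for the HBase table that stores the cube.
cube_store = reduce_phase(map_phase(pre_join(FACT_SALES, DIM_PRODUCT)))

# Serving a query is then a key lookup on the pre-aggregated result.
print(cube_store["Toys"])   # 16.0
print(cube_store["Books"])  # 9.0
```

The join and the aggregation both happen once, at build time, so the query path never touches the raw fact table.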

Cube Designer