Kylin Web Interface: Introduction to Key Concepts
Big Data Era:
1. More and more data is becoming available on Hadoop
2. Limitations of existing Business Intelligence (BI) tools
   Limited support for Hadoop
   Data size growing exponentially
   High latency for interactive queries
   Scale-up architecture
3. Challenges in adopting Hadoop as an interactive analysis system
   The majority of analyst groups are SQL-savvy
   No mature SQL interface on Hadoop
   OLAP capability in the Hadoop ecosystem is not ready yet
Business Needs for Big Data Analysis
1. Sub-second query latency on billions of rows
2. ANSI SQL for both analysts and engineers
3. Full OLAP capability to offer advanced functionality
4. Seamless integration with BI tools
5. Support for high cardinality and high dimensionality
6. High concurrency: thousands of end users
7. Distributed, scale-out architecture for large data volumes
Kylin is designed to accelerate the performance of 80+% of analytics queries on Hadoop
Technical Challenges:
1. Huge data volume
   Table scans
2. Big table joins
   Data shuffling
3. Analysis at different granularities
   Runtime aggregation is expensive
4. MapReduce jobs
   Batch processing
OLAP Cube – Balance between Space and Time
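
The space/time trade-off can be illustrated with a small sketch. It is not Kylin's implementation: the fact rows, the dimension names, and the helpers `build_cube` and `query` are all hypothetical, and everything lives in memory. The idea is to pay storage up front by pre-aggregating every combination of dimensions (every cuboid), so that a query becomes a lookup instead of a table scan.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows: (year, country, category, sales)
ROWS = [
    (2013, "US", "Toys", 10.0),
    (2013, "US", "Books", 5.0),
    (2013, "CN", "Toys", 7.0),
    (2014, "US", "Toys", 3.0),
    (2014, "CN", "Books", 8.0),
]
DIMENSIONS = ("year", "country", "category")


def build_cube(rows):
    """Pre-aggregate SUM(sales) for every subset of dimensions (every cuboid)."""
    cube = {}
    indexes = range(len(DIMENSIONS))
    for size in range(len(DIMENSIONS) + 1):
        for dims in combinations(indexes, size):
            cuboid = defaultdict(float)
            for row in rows:
                key = tuple(row[i] for i in dims)
                cuboid[key] += row[-1]
            cube[tuple(DIMENSIONS[i] for i in dims)] = dict(cuboid)
    return cube


def query(cube, **filters):
    """Answer SUM(sales) for equality filters with a single lookup, no scan."""
    dims = tuple(d for d in DIMENSIONS if d in filters)
    key = tuple(filters[d] for d in dims)
    return cube[dims].get(key, 0.0)


cube = build_cube(ROWS)
print(len(cube))                             # 2^3 = 8 cuboids
print(query(cube, year=2013, country="US"))  # 15.0
print(query(cube))                           # 33.0 (grand total)
```

With n dimensions the full cube holds 2^n cuboids, which is why the number of dimensions and their cardinality drive the storage cost.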
How Does Kylin Utilize Hadoop Components
1. Hive
   Input source
   Pre-joins the star schema during cube building
2. MapReduce
   Pre-aggregates metrics during cube building
3. HDFS
   Stores intermediate files during cube building
4. HBase
   Stores the data cube
   Serves queries on the data cube
   Coprocessors are used for query processing
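
The division of labor above can be sketched as a toy pipeline. This is only an illustration under stated assumptions: plain Python lists and dicts stand in for Hive tables, the MapReduce aggregation step, and the HBase store, and the table contents and the helpers `pre_join` and `pre_aggregate` are hypothetical. None of it uses Kylin's or Hadoop's actual APIs.

```python
# Hypothetical star schema: one fact table and one dimension (lookup) table.
FACT_SALES = [
    {"seller_id": 1, "price": 20.0},
    {"seller_id": 2, "price": 35.0},
    {"seller_id": 1, "price": 5.0},
]
DIM_SELLER = {
    1: {"country": "US"},
    2: {"country": "CN"},
}


def pre_join(fact_rows, dim_table):
    """Flatten the star schema: attach dimension attributes to each fact row."""
    flat = []
    for row in fact_rows:
        merged = dict(row)
        merged.update(dim_table[row["seller_id"]])
        flat.append(merged)
    return flat


def pre_aggregate(flat_rows, dimension, measure):
    """Group by one dimension and sum one measure into a key-value mapping."""
    store = {}
    for row in flat_rows:
        key = row[dimension]
        store[key] = store.get(key, 0.0) + row[measure]
    return store


flat_table = pre_join(FACT_SALES, DIM_SELLER)               # stands in for the Hive step
kv_store = pre_aggregate(flat_table, "country", "price")    # stands in for the MapReduce step
print(kv_store["US"])                                       # lookup stands in for the HBase read
```

Because the join and the aggregation both happen at build time, the query-time work in this sketch reduces to a single key lookup.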
Cube Designer
Job Management
Query and Visualization
Tableau Integration