Review Notes | Lecture 4: HBase (Big Data Processing Technologies)
HBase Overview
• HBase is an Apache open source project whose goal is to provide storage for the Hadoop distributed computing environment.
• HBase is a distributed, **column-oriented** data store built on top of HDFS.
• HBase is the Hadoop application to use when you require real-time read/write random access to very large datasets.
• Data is logically organized into tables, rows and columns
HBase: Part of Hadoop’s Ecosystem
HBase vs. HDFS
• Both are distributed systems that can scale to hundreds or thousands of nodes
• HDFS is good for batch processing (scans over big files)
• Not good for record lookup
• Not good for incremental addition of small batches
• Not good for updates
• HBase is designed to address these points efficiently
• Fast record lookup
• Support for record-level insertion
• Support for updates (not in place; an HBase update is done by writing a new version of the value)
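The "updates are done by creating new versions" point can be sketched with a toy in-memory store (illustrative only; the `put`/`get` names echo the HBase shell but this is not the real API):

```python
# Toy sketch of HBase-style updates: an update never overwrites a value
# in place; it appends a new (timestamp, value) version for the cell.
store = {}  # rowkey -> list of (timestamp, value) versions

def put(rowkey, value, ts):
    store.setdefault(rowkey, []).append((ts, value))

def get(rowkey):
    # A read returns the newest version by default.
    return max(store[rowkey])[1]  # max by timestamp

put("row1", "v1", ts=100)
put("row1", "v2", ts=200)   # "update" = a second version; v1 is kept
print(get("row1"))           # -> v2 (newest timestamp wins)
print(len(store["row1"]))    # -> 2 (both versions still present)
```

This is why HBase can support updates on top of an append-only file system like HDFS: nothing is rewritten, old versions are only discarded later during compaction.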
Row-oriented Vs. Column-oriented
• Row-oriented databases store table records in a sequence of rows, whereas column-oriented databases store table records in a sequence of columns, i.e. the entries in a column are stored in contiguous locations on disk.
| Row-oriented data store | Column-oriented data store |
| --- | --- |
| Data is stored and retrieved one row at a time, so unnecessary data may be read if only part of a row is needed. | Data is stored and retrieved column by column, so only the relevant data is read when needed. |
| Typical compression mechanisms give less efficient results than those obtained from column-oriented data stores. | Allows high compression ratios, because a column tends to contain few distinct or unique values. |
| Best suited to online transaction processing (OLTP) systems. | Best suited to online analytical processing (OLAP). |
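The difference in on-disk layout, and why column stores compress well, can be shown with a small sketch (purely illustrative, not how any particular engine serializes data):

```python
# The same three records laid out row-wise vs column-wise.
rows = [("alice", "NY", 30), ("bob", "NY", 31), ("carol", "NY", 29)]

# Row-oriented: one full record after another on disk.
row_layout = [v for record in rows for v in record]

# Column-oriented: all values of one column stored contiguously.
col_layout = [record[i] for i in range(3) for record in rows]

print(row_layout)  # ['alice', 'NY', 30, 'bob', 'NY', 31, 'carol', 'NY', 29]
print(col_layout)  # ['alice', 'bob', 'carol', 'NY', 'NY', 'NY', 30, 31, 29]
# The run of identical 'NY' values in col_layout is why column stores
# compress well (e.g. with run-length encoding), and why a query that
# touches one column reads no bytes from the others.
```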
HBase Data Model
• Table: Data is stored in a table format in HBase. But here tables are in column-oriented format.
• Row Key: Row keys are used to search records, which makes searches fast.
• Column Family: Various columns are combined in a column family. These column families are stored together, which makes the searching process faster because data belonging to the same column family can be accessed together in a single seek.
• Column Qualifier: Each column’s name is known as its column qualifier.
• Timestamp: A timestamp is a combination of date and time. Whenever data is stored, it is stored with its timestamp. This makes it easy to search for a particular version of the data.
• Cell: Data is stored in cells. The data is dumped into cells, which are specifically identified by <rowkey, column family, column qualifier, timestamp>.
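The four-part cell address can be modeled directly as a dictionary key (a toy sketch; names and values are made up):

```python
# Sketch: every cell is addressed by (rowkey, column family, qualifier, timestamp).
cells = {}

def put_cell(rowkey, cf, qualifier, ts, value):
    cells[(rowkey, cf, qualifier, ts)] = value

put_cell("user1", "info", "name", 100, "Alice")
put_cell("user1", "info", "name", 200, "Alicia")   # a second version of the same column
put_cell("user1", "stats", "visits", 100, "7")     # a column in a different family

# All versions of one column, oldest to newest:
versions = sorted(
    (ts, v) for (r, cf, q, ts), v in cells.items()
    if (r, cf, q) == ("user1", "info", "name")
)
print(versions[-1][1])   # -> Alicia (the newest version)
```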
HBase Logical View
• An HBase schema consists of multiple tables • Each table consists of a set of column families • Columns are not part of the schema • HBase has dynamic columns • Column names are encoded inside the cells • Different cells can therefore have different columns
HBase Physical Model
• Each column family is stored in a separate file (called HTables) • Key & Version numbers are replicated with each column family • Empty cells are not stored • HBase maintains a multi-level index on values:
<key, column family, column name, timestamp>
• Logical View vs. Physical View
• HBase is a sparse, distributed, persistent, multidimensional, sorted map
**Sparse** – A given row can have any number of columns in each column family, or none at all.
**Distributed** – Built upon HDFS, so data are spread across several machines.
**Persistent** – HBase stores data on disk with a log, hence it sticks around (the data survives after the process that created it has ended).
**Sorted** – All the data in HBase are stored in sorted order, so users can seek to it faster.
**Multi-dimensional** – Data stored in HBase is addressable in several dimensions: rows, columns, versions, etc.
HBase Regions
• Each column family is partitioned horizontally into regions
• A subset of a table’s rows, like horizontal range partitioning
• A region contains all rows in the table between the region’s start key and end key • Partitioning is done automatically
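Finding the region responsible for a row key is a lookup over sorted region start keys (a toy sketch; region names and boundaries are made up):

```python
import bisect

# Each region covers the half-open key range [start_key, next start_key).
region_starts = ["", "g", "p"]                       # three regions
region_names = ["region-1", "region-2", "region-3"]

def region_for(rowkey):
    # Binary search: the last region whose start key is <= rowkey.
    idx = bisect.bisect_right(region_starts, rowkey) - 1
    return region_names[idx]

print(region_for("apple"))   # -> region-1 (range "" .. "g")
print(region_for("melon"))   # -> region-2 (range "g" .. "p")
print(region_for("zebra"))   # -> region-3 (range "p" .. end)
```

When a region grows too large, HBase splits it automatically, which in this model just means inserting a new start key into the sorted list.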
HBase Architecture
• Major components
• HBaseMaster: one master
• HRegionServer: many region servers
• ZooKeeper: the coordinator
• HBase client
• Master
• Responsible for coordinating the slaves
• Assigns regions to registered RegionServers, detects and recovers from failures, performs load balancing, and handles schema changes (creating new tables and column families)
• Admin functions
• RegionServer (many slaves)
• Manages data regions
• Serves data for reads and writes
HBase Master
• Coordinating the region servers
• Assigning regions on startup, re-assigning regions for recovery or load balancing
• Monitoring all RegionServer instances in the cluster (listens for notifications from zookeeper)
• Admin functions
• Interface for creating, deleting, updating tables
ZooKeeper
• HBase uses ZooKeeper as a distributed coordination service to maintain server state. Zookeeper maintains which servers are alive and available, and provides server failure notification.
• The active HMaster listens for region servers, and will recover region servers on failure.
• The inactive HMaster listens for active HMaster failure; if the active HMaster fails, the inactive HMaster becomes active.
HBase Meta Table
• There is a special HBase Catalog table called the META table, which holds the location of the regions in the cluster.
• This META table is an HBase table that keeps a list of all regions in the system.
• ZooKeeper stores the location of the META table.
• The .META. table structure is as follows: • Key: region start key, region id • Values: RegionServer
HBase Read & Write
• This is what happens the first time a client reads or writes to HBase:
1. The client gets the Region server that hosts the META table from ZooKeeper.
2. The client will query the .META. server to get the region server corresponding to the row key it wants to access. The client caches this information along with the META table location.
3. It will get the Row from the corresponding Region Server.
• For future reads, the client uses the cache to retrieve the META location and previously read row keys.
• It does not need to query the META table, unless there is a miss because a region has moved; then it will re-query and update the cache.
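The caching behavior above can be sketched as follows (toy code; `locate` and the dictionaries are invented stand-ins for the ZooKeeper/META machinery):

```python
# Stand-in for the META table: rowkey -> hosting RegionServer.
meta = {"row1": "server-A"}
cache = {}                        # client-side location cache
stats = {"meta_queries": 0}

def locate(rowkey):
    if rowkey in cache:
        return cache[rowkey]      # served from the cache, no META round trip
    stats["meta_queries"] += 1    # would really query the META region server
    cache[rowkey] = meta[rowkey]
    return cache[rowkey]

locate("row1")
locate("row1")                    # second lookup is a cache hit
print(stats["meta_queries"])      # -> 1

meta["row1"] = "server-B"         # the region moved to another server
cache.pop("row1")                 # on a miss, the client drops the stale entry...
print(locate("row1"))             # -> server-B (...and re-queries META)
```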
RegionServer
A RegionServer runs on an HDFS data node and has the following components:
• WAL: Write Ahead Log is a file on the distributed file system. The WAL is used to store new data that hasn't yet been persisted to permanent storage; it is used for recovery in the case of failure.
• BlockCache: is the read cache. It stores frequently read data in memory.
Least Recently Used (LRU) data is evicted when full.
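The LRU eviction policy of the BlockCache can be sketched with an `OrderedDict` (a generic LRU sketch, not HBase's actual BlockCache implementation):

```python
from collections import OrderedDict

class LRUCache:
    """Toy read cache: evicts the least recently used block when full."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()   # insertion order doubles as recency order

    def get(self, key):
        if key not in self.blocks:
            return None
        self.blocks.move_to_end(key)  # mark as most recently used
        return self.blocks[key]

    def put(self, key, value):
        self.blocks[key] = value
        self.blocks.move_to_end(key)
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)   # evict the LRU block

cache = LRUCache(capacity=2)
cache.put("blk1", "a")
cache.put("blk2", "b")
cache.get("blk1")            # touch blk1, so blk2 becomes least recently used
cache.put("blk3", "c")       # over capacity: blk2 is evicted
print(list(cache.blocks))    # -> ['blk1', 'blk3']
```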
A RegionServer runs on an HDFS data node and has the following components:
• MemStore: is the write cache. It stores new data which has not yet been written to disk. It is sorted before writing to disk. There is one MemStore per column family per region.
• HFiles store the rows as sorted KeyValues on disk.
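The MemStore-to-HFile relationship can be sketched in a few lines (illustrative only; a real MemStore is a concurrent sorted structure, not a plain dict):

```python
# Toy MemStore: buffers writes in memory, flushes one sorted run at a time.
memstore = {}   # (rowkey, qualifier) -> value, the in-memory write cache
hfiles = []     # each flush produces one immutable sorted "HFile"

def write(rowkey, qualifier, value):
    memstore[(rowkey, qualifier)] = value

def flush():
    # The buffered KeyValues are sorted before hitting "disk".
    hfiles.append(sorted(memstore.items()))
    memstore.clear()

write("r2", "name", "bob")
write("r1", "name", "alice")   # arrives out of key order
flush()
print(hfiles[0])   # sorted by (rowkey, qualifier): r1 before r2
```

Because every flushed file is internally sorted, later reads and compactions can merge files with a single sequential pass.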
HBase Write Steps
• When the client issues a Put request, the first step is to write the data to the write-ahead log, the WAL:
• Edits are appended to the end of the WAL file that is stored on disk.
• The WAL is used to recover not-yet-persisted data in case a server crashes.
• Once the data is written to the WAL, it is placed in the MemStore. Then, the put request acknowledgement returns to the client.
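The WAL-first ordering, and why it makes crash recovery possible, can be sketched as (toy code; the real WAL is an append-only file on HDFS, not a Python list):

```python
wal = []        # stand-in for the append-only Write Ahead Log
memstore = {}   # stand-in for the in-memory write cache

def put(rowkey, value):
    wal.append((rowkey, value))   # 1. durable append to the WAL first
    memstore[rowkey] = value      # 2. then update the MemStore
    return "ack"                  # 3. only now acknowledge the client

put("r1", "v1")
put("r2", "v2")

# Simulate a RegionServer crash: the in-memory MemStore is lost...
memstore.clear()

# ...but recovery replays the WAL to rebuild any unflushed data.
for rowkey, value in wal:
    memstore[rowkey] = value
print(memstore)   # -> {'r1': 'v1', 'r2': 'v2'}
```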
HBase MemStore
• The MemStore stores updates in memory as sorted KeyValues, the same as it would be stored in an HFile. There is one MemStore per column family. The updates are sorted per column family.
HBase Region Flush
• When the MemStore accumulates enough data, the entire sorted set is written to a new HFile in HDFS.
• HBase uses multiple HFiles per column family.
• Note that this is one reason why there is a limit to the number of column families in HBase. There is one MemStore per CF; when one is full, they all flush.
HBase Read Merge
• The KeyValue cells corresponding to one row can be in multiple places:
• row cells already persisted are in HFiles
• recently updated cells are in the MemStore
• recently read cells are in the BlockCache
• As discussed earlier, there may be many HFiles per MemStore, which means for a read, multiple files may have to be examined, which can affect the performance. This is called read amplification.
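The read merge, and the read amplification it causes, can be sketched as follows (toy data; in reality bloom filters and time ranges let HBase skip some files):

```python
# A single row's versions scattered across the MemStore and two HFiles.
memstore = {("r1", 300): "newest"}
hfiles = [
    {("r1", 100): "oldest"},
    {("r1", 200): "middle"},
]

def read(rowkey):
    # Read amplification: every source may hold a version, so all of
    # them are consulted; the highest timestamp wins.
    candidates = []
    for source in [memstore] + hfiles:
        for (r, ts), v in source.items():
            if r == rowkey:
                candidates.append((ts, v))
    return max(candidates)[1]

print(read("r1"))   # -> newest
```

The more HFiles pile up, the more sources each read must examine, which is exactly what compaction (next section) counteracts.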
HBase Compaction
• HBase Minor Compaction: HBase will automatically pick some smaller HFiles and rewrite them into fewer, bigger HFiles. Minor compaction reduces the number of storage files by rewriting smaller files into fewer but larger ones, performing a merge sort.
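Because each HFile is already sorted, the merge sort in a minor compaction is a single sequential pass, which `heapq.merge` demonstrates directly (toy data):

```python
import heapq

# Two small sorted HFiles...
hfile_a = [("r1", "x"), ("r4", "y")]
hfile_b = [("r2", "p"), ("r3", "q")]

# ...merge-sorted into one larger sorted file in a single pass.
merged = list(heapq.merge(hfile_a, hfile_b))
print(merged)   # -> [('r1', 'x'), ('r2', 'p'), ('r3', 'q'), ('r4', 'y')]
```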
• HBase Major Compaction: Major compaction merges and rewrites all the HFiles in a region to one HFile per column family, and in the process, drops deleted or expired cells. This improves read performance; however, since major compaction rewrites all of the files, lots of disk I/O and network traffic might occur during the process. Major compactions are usually scheduled for weekends or evenings.
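The "drops deleted or expired cells" part of a major compaction can be sketched as a filter over the merged cells. This is a simplified model: the `DELETED` tombstone marker and the single `TTL` are invented for illustration, and real HBase tombstones only mask cells older than the marker.

```python
DELETED = object()   # hypothetical tombstone marker for this sketch
TTL = 1000           # cells older than this are considered expired
now = 5000

hfile_a = [("r1", 4900, "keep"), ("r2", 4950, DELETED)]   # r2 was deleted
hfile_b = [("r2", 4800, "old"), ("r3", 100, "expired")]   # r3 is past its TTL

# Major compaction: merge ALL files for the column family, then drop
# tombstones, cells masked by tombstones, and expired cells.
all_cells = sorted(hfile_a + hfile_b)
tombstoned = {r for r, ts, v in all_cells if v is DELETED}
compacted = [(r, ts, v) for r, ts, v in all_cells
             if v is not DELETED and r not in tombstoned and now - ts <= TTL]
print(compacted)   # -> [('r1', 4900, 'keep')]
```

Deletes and TTLs only reclaim space at this point, which is why major compactions are both expensive and necessary.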