Chapter 3. Scalability
If your cloud is successful, eventually you must add resources to meet the increasing demand. OpenStack is designed to be horizontally scalable. Rather than switching to larger servers, you procure more servers. Ideally, you scale out and load balance among functionally-identical services.
The Starting Point
Determining the scalability of your cloud and how to improve it is an exercise with many variables to balance. No one solution meets everyone's scalability aims. However, it is helpful to track a number of metrics.
The starting point for most is the core count of your cloud. By applying some ratios, you can gather information about the number of virtual machines (VMs) you expect to run ((overcommit fraction × cores) / virtual cores per instance) and how much storage is required (flavor disk size × number of instances). You can use these ratios to determine how much additional infrastructure you need to support your cloud.
The default OpenStack flavors are:
Name | Virtual cores | Memory | Disk | Ephemeral |
m1.tiny | 1 | 512 MB | 0 GB | 0 GB |
m1.small | 1 | 2 GB | 10 GB | 20 GB |
m1.medium | 2 | 4 GB | 10 GB | 40 GB |
m1.large | 4 | 8 GB | 10 GB | 80 GB |
m1.xlarge | 8 | 16 GB | 10 GB | 160 GB |
Assume that the following set-up supports (200 / 2) × 16 = 1600 VM instances and requires 80 TB of storage for /var/lib/nova/instances (see the sketch after this list):
· 200 physical cores
· Most instances are size m1.medium (2 virtual cores, 50 GB of storage)
· Default CPU over-commit ratio (cpu_allocation_ratio in nova.conf) of 16:1
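The arithmetic above is easy to script. The following is a minimal sketch in Python of the two ratios described earlier, using the figures from this example (200 cores, m1.medium instances, 16:1 overcommit); the numbers are illustrative, not recommendations.

```python
# Capacity sketch: estimate VM count and instance storage from core count.
# Values mirror the example above (200 physical cores, m1.medium instances,
# 16:1 CPU overcommit); adjust them to match your own deployment.

def max_instances(physical_cores, overcommit_ratio, vcpus_per_instance):
    """(overcommit fraction x cores) / virtual cores per instance."""
    return int(physical_cores * overcommit_ratio / vcpus_per_instance)

def storage_required_gb(flavor_disk_gb, instance_count):
    """flavor disk size x number of instances."""
    return flavor_disk_gb * instance_count

if __name__ == "__main__":
    cores = 200        # physical cores in the deployment
    overcommit = 16    # cpu_allocation_ratio in nova.conf
    vcpus = 2          # m1.medium has 2 virtual cores
    disk_gb = 50       # m1.medium: 10 GB root disk + 40 GB ephemeral

    instances = max_instances(cores, overcommit, vcpus)
    storage_tb = storage_required_gb(disk_gb, instances) / 1000.0

    print(f"Instances: {instances}")       # 1600
    print(f"Storage:   {storage_tb} TB")   # 80.0 TB for /var/lib/nova/instances
```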
However, you need more than the core count alone to estimate the load that the API services, database servers, and queue servers are likely to encounter. You must also consider the usage patterns of your cloud.
As a specific example, compare a cloud that supports a managed web hosting platform with one running integration tests for a development project that creates one VM per code commit. In the former, the heavy work of creating a VM happens only every few months, whereas the latter puts constant heavy load on the cloud controller. You must consider your average VM lifetime, as a larger number generally means less load on the cloud controller.
Aside from the creation and termination of VMs, you must consider the impact of users accessing the service — particularly on nova-api and its associated database. Listing instances garners a great deal of information and, given the frequency with which users run this operation, a cloud with a large number of users can increase the load significantly. This can even occur without their knowledge — leaving the OpenStack Dashboard instances tab open in the browser refreshes the list of VMs every 30 seconds.
After you consider these factors, you can determine how many cloud controller cores you require. A typical 8 core, 8 GB of RAM server is sufficient for up to a rack of compute nodes — given the above caveats.
You must also consider key hardware specifications for the performance of user VMs. You must consider both budget and performance needs. Examples include: Storage performance (spindles/core), memory availability (RAM/core), network bandwidth (Gbps/core), and overall CPU performance (CPU/core).
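To compare candidate hardware on these per-core ratios, a short sketch such as the following can help; the server specifications shown are invented for illustration and are not recommendations.

```python
# Compare candidate compute hardware by per-core ratios.
# The server specifications below are illustrative placeholders only.

def per_core_ratios(cores, spindles, ram_gb, network_gbps):
    return {
        "spindles/core": spindles / cores,
        "RAM GB/core": ram_gb / cores,
        "Gbps/core": network_gbps / cores,
    }

candidates = {
    "server-a": per_core_ratios(cores=16, spindles=4, ram_gb=64, network_gbps=10),
    "server-b": per_core_ratios(cores=32, spindles=6, ram_gb=128, network_gbps=20),
}

for name, ratios in candidates.items():
    print(name, {metric: round(value, 2) for metric, value in ratios.items()})
```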
For which metrics to track to determine how to scale your cloud, see Chapter 14, Logging and Monitoring.
Adding Controller Nodes
You can facilitate the horizontal expansion of your cloud by adding nodes. Adding compute nodes is straightforward — they are easily picked up by the existing installation. However, you must consider some important points when you design your cluster to be highly available.
Recall that a cloud controller node runs several different services. You can install the services that communicate only using the message queue internally (nova-scheduler and nova-console) on a new server for expansion. However, other integral parts require more care.
You should load balance user-facing services such as Dashboard, nova-api, or the Object Storage proxy. Use any standard HTTP load balancing method (DNS round robin, hardware load balancer, or software such as Pound or HAProxy). One caveat with Dashboard is the VNC proxy, which uses the WebSocket protocol, something that an L7 load balancer might struggle with. See also Horizon session storage (http://docs.openstack.org/developer/horizon/topics/deployment.html#session-storage).
You can configure some services, such as nova-api and glance-api, to use multiple processes by changing a flag in their configuration file, allowing them to share work between multiple cores on the one machine.
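As an illustration of the kind of flag involved, the sketch below uses Python's configparser to write a worker count into local copies of the two files; osapi_compute_workers (nova-api) and workers (glance-api) are the option names the author believes these services use, so verify them against the configuration reference for your release.

```python
# Illustrative only: write worker-count flags for nova-api and glance-api.
# The option names (osapi_compute_workers, workers) reflect common usage but
# should be checked against the configuration reference for your release.
import configparser

def set_workers(path, section, option, count):
    cfg = configparser.ConfigParser()
    cfg.read(path)  # keeps existing options if the file already exists
    if not cfg.has_section(section) and section != "DEFAULT":
        cfg.add_section(section)
    cfg.set(section, option, str(count))
    with open(path, "w") as handle:
        cfg.write(handle)

# Written to local copies here; on a real node these files live under /etc/.
set_workers("nova.conf", "DEFAULT", "osapi_compute_workers", 8)
set_workers("glance-api.conf", "DEFAULT", "workers", 8)
```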
Several options are available for MySQL load balancing, and RabbitMQ has in-built clustering support. Information on how to configure these and many of the other services can be found in the Operations Section.
Segregating Your Cloud
Use one of the following OpenStack methods to segregate your cloud: cells, regions, availability zones, and host aggregates. Each method provides different functionality, as described in the following table:
 | Cells | Regions | Availability Zones | Host Aggregates |
Use when you need | A single API endpoint for compute, or you require a second level of scheduling. | Discrete regions with separate API endpoints and no coordination between regions. | Logical separation within your nova deployment for physical isolation or redundancy. | To schedule a group of hosts with common features. |
Example | A cloud with multiple sites where you can schedule VMs "anywhere" or on a particular site. | A cloud with multiple sites, where you schedule VMs to a particular site and you want a shared infrastructure. | A single site cloud with equipment fed by separate power supplies. | Scheduling to hosts with trusted hardware support. |
Overhead | A new service, nova-cells. Each cell has a full nova installation except nova-api. | A different API endpoint for every region. Each region has a full nova installation. | Configuration changes to nova.conf | Configuration changes to nova.conf |
Shared services | Keystone, nova-api | Keystone | Keystone, all nova services | Keystone, all nova services |
This array of options can be best divided into two — those which result in running separate nova deployments (cells and regions), and those which merely divide a single deployment (availability zones and host aggregates).
Cells and Regions
OpenStack Compute cells are designed to allow running the cloud in a distributed fashion without having to use more complicated technologies, or being invasive to existing nova installations. Hosts in a cloud are partitioned into groups called cells. Cells are configured in a tree. The top-level cell ("API cell") has a host that runs the nova-api service, but no nova-compute services. Each child cell runs all of the other typical nova-* services found in a regular installation, except for the nova-api service. Each cell has its own message queue and database service, and also runs nova-cells, which manages the communication between the API cell and child cells.
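To make the topology concrete, here is a small, purely illustrative Python model of a two-level cell tree based on the description above; child cells would also run any other nova-* services present in your installation.

```python
# Illustrative model only (not OpenStack code): which services run in which
# cell. The top-level API cell runs nova-api but no nova-compute; child cells
# run the other nova-* services; every cell has its own message queue and
# database and runs nova-cells.

API_CELL = {"nova-api", "nova-cells", "message queue", "database"}
CHILD_CELL = {"nova-cells", "nova-scheduler", "nova-compute", "nova-console",
              "message queue", "database"}

class Cell:
    def __init__(self, name, services, children=()):
        self.name = name
        self.services = services
        self.children = list(children)

    def describe(self, indent=0):
        print(" " * indent + f"{self.name}: {', '.join(sorted(self.services))}")
        for child in self.children:
            child.describe(indent + 2)

cloud = Cell("api-cell", API_CELL, children=[
    Cell("cell-a", CHILD_CELL),
    Cell("cell-b", CHILD_CELL),
])
cloud.describe()
```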
This allows for a single API server being used to control access to multiple cloud installations. Introducing a second level of scheduling (the cell selection), in addition to the regular nova-scheduler selection of hosts, provides greater flexibility to control where virtual machines are run.
Contrast this with regions. Regions have a separate API endpoint per installation, allowing for a more discrete separation. Users wishing to run instances across sites have to explicitly select a region. However, the additional complexity of running a new service is not required.
The OpenStack Dashboard (Horizon) currently only uses a single region, so one dashboard service should be run per region. Regions are a robust way to share some infrastructure between OpenStack Compute installations, while allowing for a high degree of failure tolerance.
Availability Zones and Host Aggregates
Both availability zones and host aggregates partition a single nova deployment. While they seem similar to configure, host aggregates and availability zones differ in their intended use. The former allows the partition of OpenStack Compute deployments into logical groups for load balancing and instance distribution; the latter are used to provide some form of physical isolation and redundancy from other availability zones (such as by using a separate power supply or network equipment). Host aggregates can be regarded as a mechanism to further partition an availability zone, i.e. into multiple groups of hosts that share common resources like storage and network, or have a special property such as trusted computing hardware.
A common use of host aggregates is to provide information for use with the nova-scheduler. For example, limiting specific flavors or images to a subset of hosts.
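The sketch below mimics the idea in plain Python: a flavor declares a required property, and only hosts in an aggregate advertising that property remain candidates. It is an illustration of the concept, not the actual nova-scheduler filter code, and the aggregate, flavor, and host names are invented.

```python
# Illustrative only: restrict a flavor to hosts belonging to an aggregate
# that advertises a matching property (for example, fast disks or trusted
# hardware). All names below are made up for the example.

aggregates = {
    "ssd-hosts": {"metadata": {"fast_storage": "true"},
                  "hosts": ["compute-01", "compute-02"]},
    "general":   {"metadata": {},
                  "hosts": ["compute-03", "compute-04", "compute-05"]},
}

flavors = {
    "m1.medium":  {"required": {}},
    "io1.medium": {"required": {"fast_storage": "true"}},  # hypothetical flavor
}

def candidate_hosts(flavor_name):
    required = flavors[flavor_name]["required"]
    hosts = set()
    for aggregate in aggregates.values():
        if all(aggregate["metadata"].get(key) == value
               for key, value in required.items()):
            hosts.update(aggregate["hosts"])
    return sorted(hosts)

print(candidate_hosts("io1.medium"))  # only hosts in the ssd-hosts aggregate
print(candidate_hosts("m1.medium"))   # no requirements, so every host qualifies
```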
Availability zones allow you to arrange sets of either OpenStack Compute or OpenStack Block Storage hosts into logical groups. You define the availability zone that a given Compute or Block Storage host is in locally on each server. Availability zones are commonly used to identify a set of servers that have some common attribute. For instance, if some of the racks in your data center are on a separate power source, you may put servers in those racks in their own availability zone. Availability zones can also be helpful for separating out different classes of hardware. This is especially helpful with OpenStack Block Storage, where you may have storage servers with different types of hard drives. When provisioning resources, users can specify what availability zone they would like their instance or volume to come from. This allows cloud consumers to ensure that their application resources are spread across multiple disparate machines to achieve high availability in the event of hardware failure.
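As a simple illustration of spreading resources for availability, the sketch below round-robins a set of application instances across the availability zones an operator has defined; the zone names are placeholders.

```python
# Illustrative only: spread an application's instances across availability
# zones so that a failure in one zone (for example, a power feed) does not
# take down every instance. Zone names here are placeholders.
from itertools import cycle

availability_zones = ["az-power-a", "az-power-b", "az-power-c"]

def plan_placement(instance_names, zones):
    """Assign each instance a zone in round-robin order."""
    return dict(zip(instance_names, cycle(zones)))

placement = plan_placement([f"web-{i}" for i in range(6)], availability_zones)
for instance, zone in placement.items():
    print(f"{instance} -> request scheduling into {zone}")
```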
Scalable Hardware
While several resources already exist to help with deploying and installing OpenStack, it's very important to make sure you have your deployment planned out ahead of time. This guide assumes that at least a rack has been set aside for the OpenStack cloud, but it also offers suggestions for when and what to scale.
Hardware Procurement
“The Cloud” has been described as a volatile environment where servers can be created and terminated at will. While this may be true, it does not mean that your servers must be volatile. Ensuring your cloud’s hardware is stable and configured correctly means your cloud environment remains up and running. Basically, put effort into creating a stable hardware environment so you can host a cloud that users may treat as unstable and volatile.
OpenStack can be deployed on any hardware supported by an OpenStack-compatible Linux distribution, such as Ubuntu 12.04 as used in this book's reference architecture.
Hardware does not have to be consistent, but should at least have the same type of CPU to support instance migration.
The typical hardware recommended for use with OpenStack is "commodity". That is, very standard "value-for-money" offerings that most hardware vendors stock. It should be straightforward to divide your procurement into building blocks such as "compute," "object storage," and "cloud controller," and request as many of these as desired. Alternatively, should you be unable to spend more, existing servers are quite likely to be able to support OpenStack, provided they meet your performance requirements and virtualization technology.
Capacity Planning
OpenStack is designed to increase in size in a straightforward manner. Taking into account the considerations in the Scalability chapter — particularly on the sizing of the cloud controller, it should be possible to procure additional compute or object storage nodes as needed. New nodes do not need to be the same specification, or even vendor, as existing nodes.
For compute nodes, nova-scheduler will take care of differences in sizing to do with core count and RAM amounts; however, you should consider that the user experience changes with differing CPU speeds. When adding object storage nodes, a weight should be specified that reflects the capability of the node.
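To illustrate why the weight matters, the sketch below shows how a capacity-proportional weight translates into the rough share of data a node should receive; the node names and weights are examples, and the real placement is handled by the Object Storage ring rather than code like this.

```python
# Illustrative only: a node's weight relative to the total decides roughly
# what fraction of the object data it should hold. Names and weights are
# examples, not recommendations.

nodes = {
    "storage-01": 100.0,  # existing node
    "storage-02": 100.0,  # existing node
    "storage-03": 150.0,  # new, higher-capacity node gets a larger weight
}

total = sum(nodes.values())
for name, weight in nodes.items():
    share = weight / total
    print(f"{name}: weight {weight} -> about {share:.0%} of the data")
```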
Monitoring the resource usage and user growth will enable you to know when to procure. The Monitoring chapter details some useful metrics.
Burn-in Testing
Server hardware's chance of failure is high at the start and the end of its life. As a result, much effort in dealing with hardware failures while in production can be avoided by appropriate burn-in testing to attempt to trigger the early-stage failures. The general principle is to stress the hardware to its limits. Examples of burn-in tests include running a CPU or disk benchmark for several days.
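Dedicated benchmark tools are normally used for burn-in, but as a minimal sketch of the CPU side, the snippet below keeps every core busy for a configurable period; treat it as illustrative rather than a replacement for a proper benchmark suite.

```python
# Minimal CPU burn-in sketch: keep every core busy for a fixed duration.
# Real burn-in testing would use dedicated CPU and disk benchmarks run for days.
import multiprocessing
import time

def burn(seconds):
    """Spin on floating-point work until the deadline passes."""
    deadline = time.time() + seconds
    x = 0.0001
    while time.time() < deadline:
        x = (x * 3.0001) % 1.7  # arbitrary arithmetic to keep the CPU busy
    return x

if __name__ == "__main__":
    duration = 60  # seconds; extend to hours or days for a real burn-in
    workers = multiprocessing.cpu_count()
    with multiprocessing.Pool(workers) as pool:
        pool.map(burn, [duration] * workers)
    print(f"Completed {duration}s CPU burn on {workers} cores")
```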