20170226-云计算设计模式翻译-自动伸缩指南(逐字翻译)

--自己逐字翻译的。

Autoscaling Guidance

自动伸缩指南

Constantly monitoring performance and scaling a system to adapt to fluctuating workloads to meet capacity targets and optimize operational cost can be a labor-intensive process. It may not be feasible to perform these tasks manually. This is where autoscaling is useful.

进行连续的性能监控和系统伸缩以适应不断变化的工作负载来满足流量目标、优化运营成本,这是一个劳动密集型的过程。手动地执行这些任务几乎是不可行的,这是自动伸缩发挥作用的地方。

What is Autoscaling?

什么是自动伸缩?

Autoscaling is the process of dynamically allocating the resources required by an application to match performance requirements and satisfy service level agreements (SLAs). As the volume of work grows, an application may require additional resources to enable it to perform its tasks in a timely manner.

自动伸缩是动态地分配应用程序所需资源以匹配性能需求同时满足服务等级协议(SLAs,Service Level Agreements)的过程。随着工作量的增长,应用程序可能需要额外的资源以及时使其执行任务。

Autoscaling is often an automated process that can help to ease management overhead by reducing the need for an operator to continually monitor the performance of a system and make decisions about adding or removing resources.

自动伸缩通常是一种自动化的过程,可以通过减少操作员持续监视系统性能、帮其决定添加或删除资源的方式来帮助缓解管理开销。

Autoscaling should also be an elastic process; more resources can be provisioned as the load increases on the system, but as demand slackens resources can be de-allocated to minimize costs while still maintaining adequate performance and meeting SLAs.

自动伸缩也应该是一个弹性的过程,随着系统上的负载的增加可以提供更多的资源,但是当需要减少时,资源可以被释放以达到成本最小化,同时仍然保持足够的性能且满足SLAs。

Note:

Autoscaling applies to all of the resources used by an application, not just the compute resources. For example, if your system uses message queues to send and receive information, it could create additional queues as it scales.

注意:

自动伸缩适应于应用程序的所有资源,而不仅仅是计算资源。例如,如果系统使用消息队列来发送、接收信息,则它可以在扩展时创建额外的队列。

Types of Scaling

伸缩类型

Scaling typically takes one of two forms—vertical and horizontal scaling:

伸缩通常有两种形式--垂直伸缩和水平伸缩:

  • Vertical Scaling (often referred to as scaling up) requires that you redeploy the solution using different hardware. In a cloud environment the hardware platform is typically a virtualized environment, and vertical scaling involves provisioning more powerful resources for this environment and moving the system onto these new resources. Vertical scaling is often a disruptive process that requires making the system temporarily unavailable while it is being redeployed. It may be possible to keep the original system running while the new hardware is provisioned and brought online, but there will likely be some interruption while the processing transitions from the old environment to the new one. It is uncommon to use autoscaling to implement a vertical scaling strategy.

垂直伸缩通常称为垂直扩展)需要你使用不同的硬件重新部署解决方案。在云环境中,硬件平台通常是虚拟化环境,垂直伸缩涉需要为该环境提供更强大的资源并将系统移动到这些新资源上。垂直扩展通常是一个破坏性的过程,在系统重新部署时需要使系统暂停使用。在配置新硬件和上线期间可以继续保持原始系统运行,但是在从旧环境转换到新环境过程中可能会有一些中断。通过实施垂直伸缩策略来使用自动伸缩并不常用。

  • Horizontal Scaling (often referred to as scaling out) requires deploying the system on additional resources. The system can continue running without interruption while these resources are provisioned. When the provisioning process is complete, copies of the elements that comprise the system can be deployed on these additional resources and made available. If demand drops, the additional resources can be reclaimed after the elements using them have been shut down cleanly. Many cloud-based systems, including Microsoft Azure, support this form of autoscaling.

水平伸缩(通常称为横向扩展)需要在额外的资源上部署系统。在配置这些资源时,系统可以继续运行而不中断。当配置过程完成时,构成系统的元素副本可以部署在这些附加的资源上且可用。如果需求下降,在使用这些资源的元素完全关闭之后,可以回收这些额外的资源。许多基于云的系统(包括Microsoft Azure)支持这种自动伸缩形式。

Implementing an Autoscaling Strategy

实施自动伸缩策略

Implementing an autoscaling strategy typically involves the following processes and components:

实施自动伸缩策略通常涉及以下内容:

  • Instrumentation at the application level to capture key performance and scaling factors such as response times, queue lengths, CPU utilization, and memory usage.
  • 应用程序级别的工具,用以捕捉关键性能和伸缩因素,例如反应时间、队列长度、CPU使用率、内存使用情况。
  • Monitoring components that can observe these performance and scaling facto
  • 监控组件,可以观察这些性能和伸缩因素。
  • Decision-making logic that can evaluate the monitored scaling factors against predefined system thresholds and make decisions regarding whether to scale or not. Time plays a critical factor in these evaluations. The decision making logic should avoid making scaling decisions too frequently as this can cause the system to oscillate. It may be possible to semi-automate the scaling decision with the final decision left to an operator.
  • 决策逻辑,可以根据预定义的系统阈值来评估所监视的缩放因子,决定是否缩放。时间在这些评估中起着关键作用。决策逻辑应该避免太频繁地使用伸缩决策,因为这可能导致系统不稳定。可以依据操作者留下的最终决定,半自动实施伸缩决策。
  • Execution components that are responsible for carrying out tasks associated with scaling the system. These components typically use tools and scripts to perform the following tasks:
  • 执行组件,负责执行与伸缩系统相关的任务。这些组件通常使用工具、脚本来执行以下任务:
    • Provision or de-provision resources.
    • 配置释放资源
    • Reconfigure the system.
    • 重新配置系统
  • Testing and validation of the autoscaling strategy to ensure that it functions as expected.
  • 测试和验证自动伸缩策略,确保它能按预期执行。

 

Traditionally, many autoscaling solutions for the cloud depended on writing and configuring scripts that gathered the appropriate performance data, analyzed this data, and then added or removed resources as appropriate. It is now becoming increasingly common for cloud-based systems to provide built-in tooling to help reduce the time and effort required to implement autoscaling.

传统上,云的许多自动伸缩解决方案依赖于编写和配置脚本,这些脚本收集适当的性能数据,分析该数据,然后根据需要添加或删除资源。现在基于云的系统提供内置工具帮助减少实现自动伸缩所需的时间和精力变得越来越普遍。

However, it is important to implement an autoscaling strategy based on the specific requirements of the application rather than being driven by the features provided by any specific toolset. Scripting is still an essential skill, and a good autoscaling solution combines the features provided by the selected toolset with customizations in the form of scripts.

然而,基于应用的特定需求实现自动伸缩策略是重要的,而不是由任何特定工具集提供的专门功能完成。脚本仍然是一项基本技能,良好的自动伸缩解决方案是将基于工具集提供专门功能(的方式)与自定义脚本方式相结合。

Note:

If you are using Azure, you can access the Azure Management API through Windows PowerShell to script many tasks associated with starting and stopping instances and provisioning services.

注意:

如果你正在使用Windows Azure,你可以通过Windows PowerShell访问Windows Azure 管理接口,将开启关闭实例、配置服务相关的任务脚本化。

Considerations for Implementing Autoscaling

实施自动伸缩注意事项

Autoscaling is not an instant solution. Simply adding resources to a system or running more instances of a process does not guarantee that the performance of the system will improve. Consider the following points when designing an autoscaling strategy:

自动伸缩不是一个即时的解决方案。简单地给系统增加资源或运行更多的进程实例不能保证系统性能的提升。当设计自动伸缩策略时请考虑以下几点:

  • The system must be designed to be horizontally scalable. Avoid making assumptions about instance affinity; do not design solutions that require that the code is always running in a specific instance of a process. When scaling a cloud service or web site horizontally, do not assume that a series of requests from the same source will always be routed to the same instance. For the same reason, design services to be stateless to avoid requiring that a series of requests from an application are always routed to the same instance of a service. When designing a service that reads messages from a queue and processes them, do not make any assumptions about which instance of the service handles a specific message because autoscaling could start additional instances of a service as the queue length grows. The Competing Consumers pattern describes how to handle this scenario.
  • 系统必须设计为可水平伸缩的。不要假设实例之间关联紧密; 不要设计总是要在特定进程实例中运行代码的解决方案。在水平扩展云服务或Web站点时,不要假设同一来源的一系列请求始终被路由到同一实例。出于同样的原因,将服务设计为无状态的,以避免来自于应用程序的一系列请求总是被路由到同一服务实例。在设计从队列读取消息再进行处理的服务时,不要假设哪个服务实例来处理特定的消息,因为随着队列地增长,自动伸缩可能会启动服务的其他实例。在竞争消费者模式中介绍了如何处理这种情况。
  • If the solution implements a long-running task, design this task to support both scaling out and scaling in. Without due care, such a task could prevent an instance of a process from being shutdown cleanly when the system scales in, or it could lose data if the process is forcibly terminated. Ideally, refactor a long-running task and break up the processing that it performs into smaller, discrete chunks. The Pipes and Filters pattern provides an example of how you can achieve this. Alternatively, you can implement a checkpoint mechanism that records state information about the task at regular intervals, and save this state in durable storage that can be accessed by any instance of the process running the task. In this way, if the process is shutdown, the work that it was performing can be resumed from the last checkpoint by using another instance.
  • 如果解决方案要实现一个长时间运行的任务,设计该任务时使其支持向外伸缩和向内伸缩。如果不小心,在系统向内伸缩时,这个任务在系统扩展时可能会防止进程实例完全关闭,如果该进程被强制关闭可能会丢失数据。理想情况是,重构长时间运行的任务,拆解该处理过程,拆成更小的、离散的模块。管道过滤器模式提供了怎样实现这种操作的示例。或者,可以通过实现检查点机制,以定期的时间间隔记录任务的状态信息,并将此状态保存在可由运行任务的进程实例访问的持久存储介质中。这样,如果进程被关闭,正在执行的工作可以通过使用另一个实例从最后一个检查点恢复。
  • If the solution comprises multiple items, such as web roles, worker roles, and other resources, it might be necessary to scale all of these items as a unit. It is important to understand the relationships between the items that comprise a solution, and identify groupings that should be scaled together (as a scale unit) to achieve a given performance metric. For example, if you know that to handle 10,000 more active users you need to add two more instances of a given web role, three more instances of a particular worker role, and add an additional Service Bus queue, then this is your scalability unit. Obtaining this knowledge takes time and requires careful analysis of telemetry data.
  • 如果解决方案包含多个项目(例如Web角色,工作者角色和其他资源),可能需要将所有的这些项目作为一个整体进行伸缩。重要的是理解这些构成解决方案的项目之间的关系,识别应该一起伸缩的分组(作为伸缩单元),以达到要求的性能指标。例如,如果要处理10,000个活跃用户,需要增加两个Web角色的实例,三个特定工作者角色的实例和一个附加的服务总线队列,那么这三者是一个伸缩单元。拥有这种知识需要时间、需要仔细分析观测数据。
  • To prevent a system from attempting to scale out excessively (and to prevent the costs associated with running many thousands of instances), consider limiting the degree of autoscaling. Consider gracefully degrading the functionality that the system provides if the required resources are currently overloaded. Keep in mind that autoscaling might not be the most appropriate mechanism to handle a sudden burst in workload. It takes time to provision and start new instances of a service or add resources to a system, and the peak may have passed by the time these additional resources have been made available. In this scenario, it may be better to throttle the service. For more information, see the Throttling pattern.
  • 为了防止系统尝试过度扩展(防止运行成千上万个实例相关的成本),可以考虑限制自动伸缩的程度。如果所需资源当前过载,可以考虑巧妙地减少系统提供的功能。请记住,对于处理突然变化的工作负载,自动伸缩可能不是最合适的机制。配置启动服务的新实例、向系统添加资源都需要时间,在这些附加资源可用之后峰值可能已经过去。在这种情况下,可能更好的方案是限制服务。有关更多信息,请参阅限制模式
  • The system should be configured to monitor the autoscaling process, and log the details of each autoscaling event (what triggered it, what resources were added or removed, and when). This information can be analyzed to help measure the effectiveness of the autoscaling strategy, and tune it if necessary. If the system hits the upper limit defined for autoscaling, it might also alert an operator. The operator could examine the system and may be able to manually start additional resources if the situation warrants them. Note that, under these circumstances, the operator may also be responsible for manually removing these resources after the workload eases.
  • 应该为系统配置自动伸缩监视功能,记录每个自动伸缩事件的详细信息(触发它的原因,添加或删除的资源,以及伸缩发生的时间)。可以通过分析该信息来度量自动伸缩策略的有效性,并在必要时调整它。如果系统达到了为自动伸缩定义的上限,也应该考虑警告操作者。操作员可以检查系统,如果情势需要,可以手动启动附加资源。注意,在这些情况下,工作负载减轻之后操作者还可以负责手动移除这些资源。

Autoscaling in a Azure Solution

Windows Azure解决方案中的自动伸缩

Azure provides several options for configuring autoscaling for your solutions:

Windows Azure 为用户的解决方案配置自动伸缩(策略)提供了几个选项:

  • Azure Autoscaling. This feature supports the most common scaling scenarios, and you can configure a solution by using the Azure Management Portal.
  • Windows Azure 自动伸缩。此功能支持最常用的伸缩解决方案,用户可以在Windows Azure网站管理界面配置解决方案。
  • Microsoft Enterprise Library Autoscaling Application Block. This utility enables you to scale a solution based on custom rules and performance data. This approach is more flexible, but more complex, and requires you to write code to capture performance data that is specific to your solutions.
  • 微软企业库自动伸缩应用程序块。此实用程序工具包能够使用户根据自定义规则和性能数据扩展解决方案。这种方法更灵活,但更复杂,需要用户编写代码来收集用户解决方案中特定的性能数据。
  • Azure Monitoring Services Management Library. This library provides access to Azure Monitoring Services operations, including a unified API for retrieving, and configuring metrics, alerts, and autoscale rules for Azure services.
  • Windows Azure监控服务管理类库。此类库提供了对Windows Azure监控服务操作的访问,包括用于为Windows Azure服务提供检索、配置参数,自动报警,自动扩展规则的统一的API。

The following sections summarize these approaches.

以下部分总结了这些方法。

Using Azure Autoscaling

使用Windows Azure自动伸缩

Azure Autoscaling enables you to configure scale out and scale in options for a solution. Using this feature you can automatically add and remove instances of Azure Cloud Services web and worker roles, Azure Websites applications, and Azure Virtual Machines. There are two approaches for configuring autoscaling in Azure:

Windows Azure 自动伸缩能够使用户通过选项为解决方案配置向外扩展、向内扩展。使用此功能,用户可以自动添加和删除Windows Azure云服务Web实例、工作者角色实例、Windows Azure网站应用程序和Azure虚拟机。在Windows Azure中配置自动伸缩有两种方式:

  • Configure autoscaling based on metrics such as average CPU utilization over the last hour, or the backlog of items in a message queue that the solution is processing. You configure the parameters used by Azure Autoscaling, monitor the performance of your system, and then adjust the way in which the system scales if necessary. However, keep in mind that autoscaling is not an instantaneous process—it takes time to react to a metric such as average CPU utilization exceeding (or dropping below) a specified threshold. Avoid setting finely balanced thresholds that could attempt to start and stop instances very frequently; Azure enforces this rule by permitting only one scaling action to occur in a five minute period. You can increase this period if you find that the system is still overreacting.
  • 根据最近一小时的平均CPU利用率、解决方案中正在处理的消息队列里积压的项目这些指标配置自动伸缩。用户配置Windows Azure 自动伸缩使用的参数,监视系统的性能,然后根据需要再调整系统缩放的方式。但是,请记住,自动伸缩不是一个瞬时过程--需要时间对诸如平均CPU利用率超过(或低于)指定阈值做出反应。避免设置精细平衡的阈值,那样系统可能会非常频繁地尝试启动和停止实例;Windows Azure强制执行五分钟内只允许发生一个伸缩操作的规则。如果用户发现系统仍然反应过度,可以延长此时间段。
  • Configure time-based autoscaling to ensure that additional instances are available to coincide with an expected peak in usage, and scale in once the peak time has passed. This strategy enables you to ensure that you have sufficient instances already running without waiting for the system to react to the load.
  • 配置基于时间的自动伸缩,以确保额外的实例在使用中都是可用以配合预期的峰值,并在峰值时间过去后伸缩回(峰值之前的状态)。此策略使用户能够确保有足够的实例已经在运行,而不必等待系统对负载做出反应。

You should also consider scaling other resources linked to a compute instances as part of the same scalability unit. For example, you could resize SQL databases or add storage accounts as the system scales. However, at the time of writing, you must either perform these operations manually or use the Microsoft Enterprise Library Autoscaling Application Block.

用户还应该考虑将与计算实例相关的其他资源扩展为同一伸缩性单元的一部分。例如,用户可以在系统缩放时调整SQL数据库、添加存储帐户。但是,在编写本文时,用户必须手动执行这些操作或者使用Microsoft企业库自动伸缩应用程序块(完成这些操作)。

Note:

For more information about configuring autoscaling by using the Microsoft Azure Management Portal, see How to Scale an Application on MSDN.

注意:

有关使用Windows Azure网站管理界面配置自动伸缩的更多信息,请参阅MSDN上如何扩展应用程序。

Implementing Custom Autoscaling by Using the Microsoft Enterprise Library Autoscaling Application Block

通过使用微软企业库自动伸缩应用程序块实现自定义自动伸缩

The Microsoft Enterprise Library Autoscaling Application Block provides a highly customizable approach to scalability, enabling you to make scaling decisions based on performance counters or other custom metrics.

微软企业库自动伸缩应用程序块为伸缩性提供了一种高度定制的实现途径,使用户能够根据性能计数器或其他自定义指标做出扩展决策。

You specify rules that determine how the Autoscaling Application Block reacts to the metrics. These rules can be complex, and may reference combinations of metrics. For example, you could specify that the Autoscaling Application Block should start an additional instance of a worker role if the length of a message queue is growing at a certain speed and the role has less than 10% of available memory.

用户可以指定用于确定伸缩应用程序块如何对指标数据(队列长度、内存使用率等信息)做出反应的规则。这些规则可以是复杂的,并且可以引用综合指标。例如,如果消息队列的长度以一定速度增长,并且工作者角色(实例)可用内存小于10%,则可以指定自动伸缩应用程序块启动额外的工作者角色实例。

As with Azure Autoscaling, the Autoscaling Application Block also supports time-based scaling, and you can restrict the degree of autoscaling that can occur to help prevent excessive costs.

与Windows Azure Autoscaling一样,Autoscaling应用程序块也支持基于时间的伸缩,用户可以限制可能发生的自动伸缩程度,以帮助防止过高的成本。

Note:

The Autoscaling Application Block page on MSDN provides detailed information on configuring autoscaling, defining rules, and gathering performance data.

注意:

MSDN上自动伸缩应用程序块页面提供了有关配置自动伸缩、定义规则和收集性能数据的详细信息。

Implementing Custom Autoscaling by Using Azure Monitoring Service Library

使用Windows Azure监控服务类库实现自定义自动伸缩

The Azure Monitoring Service Library, which is in preview at the time of writing, can be used to monitor and automatically scale Azure deployments. In addition to defining autoscaling rules, this library provides options for monitoring and alerting. You can download the library from the NuGet gallery.

Windows Azure监控服务类库(在编写本文时处于预览阶段)可用于监控和自动伸缩Windows Azure部署。除了定义自动扩展规则之外,此类库还提供用于监控和警报的功能。可以从NuGet库下载库。

Related Patterns and Guidance

相关模式和指南

The following patterns and guidance may also be relevant to your scenario when implementing autoscaling:

在实施自动伸缩时,以下模式和指南也可能跟你的方案相关:

  • Throttling Pattern. This pattern describes how an application can continue to function and meet service level agreements when an increase in demand places an extreme load on resources. Throttling can be used with autoscaling to prevent a system from being overwhelmed while the system scales out.
  • 限流模式。此模式描述了当需求的增加对资源造成极大负荷时,应用程序如何继续运行并满足服务级别协议。限制模式可以与自动伸缩配合使用,以防止系统在扩展时崩溃。
  • Competing Consumers Pattern. This pattern describes how to implement a pool of service instances that can handle messages from any application instance. Autoscaling can be used to start and stop service instances to match the anticipated workload. This approach enables a system to process multiple messages concurrently to optimize throughput, improve scalability and availability, and balance the workload.
  • 竞争消费者。此模式描述了如何实现一个可以处理来自任何应用程序实例消息的服务实例池。自动伸缩可用于启动和停止服务实例以匹配预期的工作负载。此方法使系统能够同时处理多个消息以优化吞吐量,提高可扩展性和可用性,平衡工作负载。
  • Instrumentation and Telemetry Guidance. Instrumentation and telemetry are vital for gathering the information that can drive the autoscaling process.
  • 仪器和遥测指南。仪器和遥测对于收集可以驱动自动伸缩过程的信息至关重要。

More Information

更多信息

 

posted @ 2017-10-23 22:37  几维  阅读(360)  评论(0编辑  收藏  举报