Cloud Native Infrastructure, Chapter 4 (Part 1)
Designing Infrastructure Applications
In the previous chapter we learned about representing infrastructure and about the various approaches to, and concerns with, the deployment tools around it. In this chapter we look at what it takes to design applications that deploy and manage infrastructure. We heed the concerns of the previous chapter and focus on opening up the world of infrastructure as software, sometimes called infrastructure as an application.
In a cloud native environment, traditional infrastructure operators need to be infrastructure software engineers. Infrastructure as software is still an emerging practice, and it differs from other operational roles of the past. We urgently need to begin exploring patterns and setting standards.
A fundamental difference between infrastructure as code and infrastructure as software is that the software runs continually and creates or mutates infrastructure based on the reconciler pattern, which we will explain later in this chapter. Furthermore, the new paradigm behind infrastructure as software is that the software now has a more traditional relationship with the data store and exposes an API for defining desired state. For instance, the software might mutate the representation of infrastructure in the data store as needed, and it could very well manage the data store itself. Desired state changes to reconcile are sent to the software through the API instead of through a static code repository.
The first step in the direction of infrastructure as software is for infrastructure operators to realize that they are software engineers. We welcome you all warmly to the field! Previous tools (e.g., configuration management) had similar goals of changing the infrastructure operator's job function, but the operators often learned only how to write a limited DSL with a narrow scope of application (i.e., a single-node abstraction).
As an infrastructure engineer, you are tasked not only with mastering the underlying principles of designing, managing, and operating infrastructure, but also with taking your expertise and encapsulating it in the form of a rock-solid application. These applications represent the infrastructure that we will be managing and mutating.
Engineering software to manage infrastructure is not an easy undertaking. We have all the major problems and concerns of a traditional application, and we are developing in an awkward space. It is awkward in the sense that infrastructure engineering is the almost ridiculous task of building software to deploy infrastructure so that you can then run the same software on top of the newly created infrastructure.
To begin, we need to understand the nuances of engineering software in this new space. We will look at patterns proven in the cloud native community to understand the importance of writing clean and logical code in our applications. But first, where does infrastructure come from?
The Bootstrapping Problem
On Sunday, March 22, 1987, Richard M. Stallman sent an email to the GCC mailing list to report successfully compiling the C compiler with itself:
This compiler compiles itself correctly on the 68020 and did so recently on the vax. It recently compiled Emacs correctly on the 68020, and has also compiled tex-in-C and Kyoto Common Lisp. However, it probably still has numerous bugs that I hope you will find for me. I will be away for a month, so bugs reported now will not be handled until then.
- Richard M. Stallman
This was a critical turning point in the history of software, as we were engineering software to bootstrap itself. Stallman had literally created a compiler that could compile itself. Even accepting this statement as truth can be philosophically difficult.
Today we are solving the same problem with infrastructure. Engineers must come up with solutions to the almost impossible problem of a system bootstrapping itself and coming to life at runtime.
One approach is to manually provision the first bit of infrastructure in the cloud, along with the infrastructure applications. While this approach does work, it usually comes with the caveat that the operator should destroy the initial bootstrap infrastructure after more appropriate infrastructure has been deployed. This approach is tedious, difficult to repeat, and prone to human error.
A more elegant and cloud native approach to solving this problem is to make the (usually correct) assumption that whoever is attempting to bootstrap infrastructure software has a local machine that we can use to our advantage. The existing machine (your computer) serves as the first deployment tool and creates infrastructure in a cloud automatically. After the infrastructure is in place, your local deployment tool can deploy itself to the newly created infrastructure and run there continually. Good deployment tools will allow you to easily clean up the local bootstrap when you are done.
After the initial infrastructure bootstrap problem is solved, we can use the infrastructure applications to bootstrap new infrastructure. The local computer is now taken out of the equation, and we are running entirely cloud native at this point.
The API
In earlier chapters we discussed the various methods for representing infrastructure. In this chapter we will explore the concept of having an API for infrastructure.
When the API is implemented in software, it will more than likely be done via a data structure. So, depending on the programming language you are using, it is safe to think of the API as a class, dictionary, array, object, or struct.
The API will be an arbitrary definition of data values: maybe a handful of strings, a few integers, and a boolean. The API will be encoded to and decoded from some sort of encoding standard such as JSON or YAML, or it might even be stored in a database.
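As a minimal sketch of this idea, the block below models an API as a Python dataclass and round-trips it through JSON. The type name `Infrastructure` and its fields are hypothetical, chosen only to mirror "a handful of strings, a few integers, and a boolean"; they are not taken from any real tool.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Infrastructure:
    """Hypothetical API type: a few strings, an integer, and a boolean."""
    api_version: str
    name: str
    location: str
    server_count: int
    public: bool


def encode(infra: Infrastructure) -> str:
    """Encode the in-memory API as a JSON document."""
    return json.dumps(asdict(infra), sort_keys=True)


def decode(text: str) -> Infrastructure:
    """Decode a JSON document back into the in-memory API."""
    return Infrastructure(**json.loads(text))


desired = Infrastructure(
    api_version="v1",
    name="infra1",
    location="San Francisco 2",
    server_count=3,
    public=False,
)
round_tripped = decode(encode(desired))
```

Because the API is plain data, the same structure could just as easily be written to YAML or to a database row; JSON is used here only because it ships with the Python standard library.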
Having a versionable API for a program is a common practice for most software engineers. It allows the program to move, change, and grow over time. Engineers can commit to supporting older API versions and offer backward-compatibility guarantees. When engineering infrastructure as software, using an API is preferred for these reasons.
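One way such a guarantee might be honored is to upgrade older documents step by step when they are loaded. The sketch below assumes a hypothetical `apiVersion` field and an invented change between `v1` and `v2` (renaming `region` to `location`); neither comes from the original text.

```python
def upgrade_v1_to_v2(doc: dict) -> dict:
    """Hypothetical migration: v2 renames 'region' to 'location'."""
    upgraded = {key: value for key, value in doc.items() if key != "region"}
    upgraded["location"] = doc["region"]
    upgraded["apiVersion"] = "v2"
    return upgraded


# Maps an old version to the function that upgrades it by one step.
UPGRADES = {"v1": upgrade_v1_to_v2}
CURRENT_VERSION = "v2"


def load(doc: dict) -> dict:
    """Return the document upgraded to the current API version."""
    doc = dict(doc)  # leave the caller's copy untouched
    while doc.get("apiVersion") != CURRENT_VERSION:
        version = doc.get("apiVersion")
        if version not in UPGRADES:
            raise ValueError(f"unsupported apiVersion: {version!r}")
        doc = UPGRADES[version](doc)
    return doc


old_document = {"apiVersion": "v1", "name": "infra1", "region": "San Francisco 2"}
current_document = load(old_document)
```

Chaining single-step upgrades keeps each migration small, so supporting one more old version means adding one more function to the table.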
Finding an API as the interface for infrastructure is one of the many clues that a user is working with infrastructure as software. Traditionally, infrastructure as code is a direct representation of the infrastructure a user will be managing, whereas an API might be an abstraction on top of the exact underlying resources being managed.
Ultimately, an API is just a data structure that represents infrastructure.
The State of the World
Within the context of an infrastructure as software tool, the world is the infrastructure that we will be managing. Thus, the state of the world is just an audited representation of the world as it exists to our program.
The state of the world will ultimately make its way back to an in-memory representation of the infrastructure. These in-memory representations should map to the original API used to declare infrastructure. The audited API, or state of the world, will typically need to be saved.
A storage medium (sometimes referred to as a state store) can be used to store the freshly audited API. The medium can be any traditional storage system, such as a local filesystem, cloud object storage, or a database. If the data is stored in a filesystem-like store, the tool will most likely encode the data in a logical way so that it can easily be encoded and decoded at runtime. Common encodings for this include JSON, YAML, and TOML.
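A filesystem-backed state store can be sketched in a few lines. The class name `FileStateStore`, its methods, and the choice of JSON are assumptions made for illustration; a real tool would pick its own layout and encoding.

```python
import json
import os
import tempfile
from pathlib import Path


class FileStateStore:
    """Hypothetical state store that keeps one JSON file per document."""

    def __init__(self, directory):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)

    def save(self, name, state):
        """Write the state to a temporary file, then swap it into place."""
        target = self.directory / f"{name}.json"
        fd, tmp_path = tempfile.mkstemp(dir=self.directory, suffix=".tmp")
        with os.fdopen(fd, "w") as handle:
            json.dump(state, handle, indent=2, sort_keys=True)
        os.replace(tmp_path, target)

    def load(self, name):
        """Return the stored state, or an empty dict if nothing was saved."""
        path = self.directory / f"{name}.json"
        if not path.exists():
            return {}
        return json.loads(path.read_text())


with tempfile.TemporaryDirectory() as root:
    store = FileStateStore(Path(root) / "state")
    store.save("infrastructure", {"name": "infra1", "location": "San Francisco 2"})
    loaded = store.load("infrastructure")
    missing = store.load("does-not-exist")
```

Writing to a temporary file and renaming it means a reader should never observe a half-written document, which matters once the program, rather than a human, owns the file.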
As you begin to engineer your program, you might catch yourself wanting to store privileged information with the rest of the data you are storing. This may or may not be best practice, depending on your security requirements and where you plan on storing the data.
It is important to remember that storing secrets can be a vulnerability. When you are designing software to control the most fundamental part of the stack, security is critical. So it is usually worth the extra effort to ensure that secrets are safe.
Aside from storing meta information about the program and the cloud provider credentials, an engineer will also need to store information about the infrastructure. It is important to remember that the infrastructure will be represented in some way, ideally one that is easy for the program to decode. It is also important to remember that making changes to a system does not happen instantly, but rather over time.
Having these pieces of data stored and easily accessible is a large part of designing the infrastructure management application. The infrastructure definition alone is quite possibly the most intellectually valuable part of the system. Let's take a look at a basic example to see how this data and the program will work together.
A filesystem state store example
Imagine a data store that is simply a directory called state. Within the directory, there are three files:
- meta_information.yaml
- secrets.yaml
- infrastructure.yaml
This simple data store can accurately encapsulate the information that needs to be preserved in order to effectively manage infrastructure.
The secrets.yaml and infrastructure.yaml files store the representation of the infrastructure, and the meta_information.yaml file (Example 4-1) stores other important information, such as when the infrastructure was last provisioned, who provisioned it, and logging information.
lastExecution:
  exitCode: 0
  timestamp: 2017-08-01 15:32:11 +00:00
  user: kris
logFile: /var/log/infra.log
Example 4-1. state/meta_information.yaml
The second file, secrets.yaml, holds private information that is used to authenticate in arbitrary ways throughout the execution of the program (Example 4-2). (Again, storing secrets in this way might be unsafe. We are using secrets.yaml merely as an example.)
apiAccessToken: a8233fc28d09a9c27b2e2f
apiSecret: 8a2976744f239eaa9287f83b23309023d
privateKeyPath: ~/.ssh/id_rsa
Example 4-2. state/secrets.yaml
The third file, infrastructure.yaml, contains an encoded representation of the API, including the API version used (Example 4-3). Here we can find the infrastructure representation, such as network and DNS information, firewall rules, and virtual machine definitions.
location: "San Francisco 2"
name: infra1
dns:
  fqdn: infra.example.com # fully qualified domain name (host name plus domain name)
network:
  cidr: 10.0.0.0/12 # Classless Inter-Domain Routing (CIDR) block
serverPools:
  - bootstrapScript: /opt/infra/bootstrap.sh # bootstrap script
    diskSize: large
    workload: medium
    memory: medium
    subnetHostsCount: 256
    firewalls: # firewall rules
      - rules:
          - ingressFromPort: 22
            ingressProtocol: tcp
            ingressSource: 0.0.0.0/0
            ingressToPort: 22
    image: ubuntu-16-04-x64 # machine image
Example 4-3. state/infrastructure.yaml
The infrastructure.yaml file at first might appear to be nothing more than an example of infrastructure as code. But if you look closely, you will see that many of the directives defined are abstractions on top of the concrete infrastructure. For instance, the subnetHostsCount directive is an integer value that defines the intended number of hosts for a subnet. The program will manage sectioning off the larger classless interdomain routing (CIDR) block defined in network on the operator's behalf. The operator does not declare a subnet, just how many hosts they would like. The software reasons about the rest for the operator.
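To make the abstraction concrete, here is a rough sketch of how a program might turn a requested host count into a subnet size. The function name `subnet_prefix` is hypothetical, and the sketch assumes that subnetHostsCount counts every address in the subnet (so 256 maps to a /24) and ignores reserved network and broadcast addresses.

```python
import ipaddress


def subnet_prefix(network_cidr: str, host_count: int) -> int:
    """Return the longest prefix whose subnet holds host_count addresses.

    Assumption: host_count counts all addresses in the subnet.
    """
    if host_count < 1:
        raise ValueError("host_count must be at least 1")
    network = ipaddress.ip_network(network_cidr)
    # Number of host bits needed to address host_count entries.
    host_bits = (host_count - 1).bit_length()
    prefix = network.max_prefixlen - host_bits
    if prefix < network.prefixlen:
        raise ValueError(f"{network_cidr} is too small for {host_count} hosts")
    return prefix


prefix_for_256 = subnet_prefix("10.0.0.0/12", 256)
prefix_for_300 = subnet_prefix("10.0.0.0/12", 300)
```

With these assumptions, 256 hosts fit in a /24, while 300 hosts need the next size up, a /23. The operator only ever states the number.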
As the program runs, it might update the API and write the new representation out to the data store (which in this case is simply a file). To continue with our subnetHostsCount example, let's say that the program did pick out a subnet CIDR for us. The new data structure might look something like Example 4-4.
location: "San Francisco 2"
name: infra1
dns:
  fqdn: infra.example.com
network:
  cidr: 10.0.0.0/12
serverPools:
  - bootstrapScript: /opt/infra/bootstrap.sh
    diskSize: large
    workload: medium
    memory: medium
    subnetHostsCount: 256
    assignedSubnetCIDR: 10.0.100.0/24 # the one entry added relative to Example 4-3
    firewalls:
      - rules:
          - ingressFromPort: 22
            ingressProtocol: tcp
            ingressSource: 0.0.0.0/0
            ingressToPort: 22
    image: ubuntu-16-04-x64
Example 4-4. state/infrastructure.yaml
Notice that the program wrote the assignedSubnetCIDR directive, not the operator. Also remember that a program updating the API is a sign that the user is interacting with infrastructure as software.
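The write-back step might look roughly like the following sketch. The function `assign_subnets` and its "first free subnet" policy are assumptions made for illustration; the value 10.0.100.0/24 in Example 4-4 evidently comes from some other allocation policy that the text does not specify.

```python
import copy
import ipaddress


def assign_subnets(api: dict) -> dict:
    """Return a copy of the API with assignedSubnetCIDR filled in per pool.

    Assumed policy: give each pool the first subnet of the right size
    that does not overlap a subnet already assigned.
    """
    updated = copy.deepcopy(api)
    parent = ipaddress.ip_network(updated["network"]["cidr"])
    taken = [
        ipaddress.ip_network(pool["assignedSubnetCIDR"])
        for pool in updated["serverPools"]
        if "assignedSubnetCIDR" in pool
    ]
    for pool in updated["serverPools"]:
        if "assignedSubnetCIDR" in pool:
            continue  # the program already decided this one earlier
        host_bits = (pool["subnetHostsCount"] - 1).bit_length()
        prefix = parent.max_prefixlen - host_bits
        for candidate in parent.subnets(new_prefix=prefix):
            if not any(candidate.overlaps(used) for used in taken):
                pool["assignedSubnetCIDR"] = str(candidate)
                taken.append(candidate)
                break
        else:
            raise ValueError("no free subnet left in " + str(parent))
    return updated


declared = {
    "network": {"cidr": "10.0.0.0/12"},
    "serverPools": [{"subnetHostsCount": 256}, {"subnetHostsCount": 256}],
}
audited = assign_subnets(declared)
```

The operator's declaration is left untouched and the program returns an enriched copy, which is the document it would then persist to the state store.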
Remember that this is just an example; it does not necessarily advocate using an abstraction for calculating a subnet CIDR. Different use cases may require different abstractions and implementations in the application. One of the beautiful and powerful things about building infrastructure applications is that users can engineer the software in any way they find necessary to solve their set of problems.
The data store (the infrastructure.yaml file) can now be thought of as a traditional data store in the software engineering realm. That is, the program can have full write control over the file.
This introduces risk, but it is also a great win for the engineer, as we will discover. The infrastructure representation does not have to be stored in files on a filesystem. Instead, it can be stored in any data store, such as a traditional database or a key/value storage system.
To understand the complexities of how software will handle this new representation of infrastructure, we have to understand the two states in the system: the expected state in the form of the API, which is found in the infrastructure.yaml file, and the actual state that can be observed (or audited) in reality, which is the state of the world.
In this example, the software has not done anything or taken any action yet, and we are at the beginning of the management timeline. Thus, the actual state of the world would be nothing, while the expected state of the world would be whatever is encapsulated in the infrastructure.yaml file.
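The gap between the two states can be illustrated with a small comparison function. This is only a sketch: `diff_states` and the flat "resource name to definition" layout are hypothetical simplifications, not the structure any particular tool uses.

```python
def diff_states(expected: dict, actual: dict) -> dict:
    """Compare expected and actual state, keyed by resource name."""
    return {
        # Declared in the API but not observed in the world.
        "create": sorted(name for name in expected if name not in actual),
        # Present in both, but the definitions disagree.
        "update": sorted(
            name for name in expected
            if name in actual and expected[name] != actual[name]
        ),
        # Observed in the world but no longer declared.
        "delete": sorted(name for name in actual if name not in expected),
    }


expected_state = {
    "dns": {"fqdn": "infra.example.com"},
    "network": {"cidr": "10.0.0.0/12"},
}
actual_state = {}  # nothing has been provisioned yet

first_plan = diff_states(expected_state, actual_state)
```

At the start of the timeline the actual state is empty, so every declared resource lands in the create list. Once the world matches the declaration, all three lists come back empty.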