NodeJS-Web-开发第五版-全-

NodeJS Web 开发第五版(全)

原文:zh.annas-archive.org/md5/E4F616CD5ADA487AF57868CB589CA6CA

译者:飞龙

协议:CC BY-NC-SA 4.0

前言

Node.js 是一个服务器端的 JavaScript 平台,允许开发人员在网页浏览器之外使用 JavaScript 构建快速可扩展的应用程序。它在软件开发世界中扮演着越来越重要的角色,最初作为服务器应用程序的平台,但现在在命令行开发工具甚至 GUI 应用程序中得到广泛应用,这要归功于 Electron 等工具包。Node.js 已经将 JavaScript 从浏览器中解放出来。

它运行在谷歌 Chrome 浏览器核心的超快 JavaScript 引擎 V8 之上。Node.js 运行时遵循一个巧妙的事件驱动模型,尽管使用单线程模型,但在并发处理能力方面被广泛使用。

Node.js 的主要重点是高性能、高可扩展性的 Web 应用程序,但它也在其他领域得到了应用。例如,基于 Node.js 的 Electron 包装了 Chrome 引擎,让 Node.js 开发人员可以创建桌面 GUI 应用程序,并成为许多热门应用程序的基础,包括 Atom 和 Visual Studio Code 编辑器、GitKraken、Postman、Etcher 和桌面版 Slack 客户端。Node.js 在物联网设备上很受欢迎。它的架构特别适合微服务开发,并经常帮助构建全栈应用程序的服务器端。

在单线程系统上提供高吞吐量的关键是 Node.js 的异步执行模型。这与依赖线程进行并发编程的平台非常不同,因为这些系统通常具有很高的开销和复杂性。相比之下,Node.js 使用一个简单的事件分发模型,最初依赖回调函数,但今天依赖 JavaScript Promise 对象和 async 函数。

由于 Node.js 建立在 Chrome 的 V8 引擎之上,该平台能够快速采用 JavaScript 语言的最新进展。Node.js 核心团队与 V8 团队密切合作,让它能够快速采用 V8 中实现的新 JavaScript 语言特性。Node.js 14.x 是当前版本,本书是针对该版本编写的。

这本书适合谁

服务器端工程师可能会发现 JavaScript 是一种优秀的替代编程语言。由于语言的进步,JavaScript 早就不再是一种只适用于在浏览器中为按钮添加动画效果的简单玩具语言。我们现在可以使用这种语言构建大型系统,而 Node.js 具有许多内置功能,比如一流的模块系统,可以帮助开发更大的项目。

有经验的浏览器端 JavaScript 开发人员可能会发现通过本书扩展视野,包括使用服务器端开发。

本书内容

《第一章》《关于 Node.js》介绍了 Node.js 平台。它涵盖了 Node.js 的用途、技术架构选择、历史、服务器端 JavaScript 的历史、JavaScript 应该从浏览器中解放出来以及 JavaScript 领域的重要最新进展。

《第二章》《设置 Node.js》介绍了如何设置 Node.js 开发环境。这包括在 Windows、macOS 和 Linux 上安装 Node.js。还介绍了一些重要的工具,包括 npm 和 yarn 包管理系统,以及用于将现代 JavaScript 转译为可在旧 JavaScript 实现上运行的形式的 Babel。

《第三章》《探索 Node.js 模块》深入探讨了模块作为 Node.js 应用程序中的模块化单元。我们将深入了解和开发 Node.js 模块,并使用npm来维护依赖关系。我们将了解新的模块格式 ES6 模块,以及如何在 Node.js 中使用它,因为它现在得到了原生支持。

第四章,“HTTP 服务器和客户端”,开始探索 Node.js 的 Web 开发。我们将在 Node.js 中开发几个小型 Web 服务器和客户端应用程序。我们将使用斐波那契算法来探索重型、长时间运行计算对 Node.js 应用程序的影响。我们还将学习几种缓解策略,并获得我们开发 REST 服务的第一次经验。

第五章,“你的第一个 Express 应用程序”,开始了本书的主要旅程,即开发一个用于创建和编辑笔记的应用程序。在本章中,我们运行了一个基本的笔记应用程序,并开始使用 Express 框架。

第六章,“实现移动优先范式”,使用 Bootstrap V4 框架在笔记应用程序中实现响应式 Web 设计。这包括集成流行的图标集以及自定义 Bootstrap 所需的步骤。

第七章,“数据存储和检索”,探索了几种数据库引擎和一种可以轻松切换数据库的方法。目标是将数据稳健地持久化到磁盘。

第八章,“使用微服务对用户进行身份验证”,为笔记应用程序添加了用户身份验证。我们将学习使用 PassportJS 处理登录和注销。身份验证既支持本地存储的用户凭据,也支持使用 Twitter 的 OAuth。

第九章,“使用 Socket.IO 进行动态客户端/服务器交互”,让我们看看如何让用户实时交流。我们将使用 Socket.IO 这个流行的框架来支持内容的动态更新和简单的评论系统。所有内容都是由用户在伪实时中动态更新的,这给了我们学习实时动态更新的机会。

第十章,“将 Node.js 应用部署到 Linux 服务器”,是我们开始部署旅程的地方。在本章中,我们将使用传统的方法在 Ubuntu 上使用 Systemd 部署后台服务。

第十一章,“使用 Docker 部署 Node.js 微服务”,让我们开始探索使用 Docker 进行基于云的部署,将笔记应用程序视为一组微服务的集群。

第十二章,“使用 Terraform 在 AWS EC2 上部署 Docker Swarm”,让我们看看如何构建一个使用 AWS EC2 系统的云托管系统。我们将使用流行的工具 Terraform 来创建和管理 EC2 集群,并学习如何几乎完全自动化使用 Terraform 功能部署 Docker Swarm 集群。

第十三章,“单元测试和功能测试”,让我们探索三种测试模式:单元测试、REST 测试和功能测试。我们将使用流行的测试框架 Mocha 和 Chai 来驱动这三种模式的测试用例。对于功能测试,我们将使用 Puppeteer,这是一个在 Chrome 实例中自动化测试执行的流行框架。

第十四章,“Node.js 应用程序中的安全性”,是我们集成安全技术和工具以减轻安全入侵的地方。我们将首先在 AWS EC2 部署中使用 Let's Encrypt 实现 HTTPS。然后,我们将讨论 Node.js 中的几种工具来实现安全设置,并讨论 Docker 和 AWS 环境的最佳安全实践。

为了充分利用本书

基本要求是安装 Node.js 并拥有面向程序员的文本编辑器。编辑器不必太复杂;即使是 vi/vim 也可以。我们将向您展示如何安装所需的一切,而且这些都是开源的,因此没有任何准入障碍。

最重要的工具是您的大脑,我们指的不是耳屎。

书中涵盖的软件/硬件如下,操作系统要求为 Windows、macOS 或 Linux:

  • Node.js 及相关框架,如 Express、Sequelize 和 Socket.IO

  • npm/yarn 软件包管理工具

  • Python 和 C/C++ 编译器

  • MySQL、SQLite3 和 MongoDB 数据库

  • Docker

  • Multipass

  • Terraform

  • Mocha 和 Chai

每个涉及的软件都很容易获得。对于 Windows 和 macOS 上的 C/C++ 编译器,您需要分别获取 Visual Studio(Windows)或 Xcode(macOS),但两者都可以免费获得。

如果您已经有一些 JavaScript 编程经验,这将会很有帮助。如果您已经有其他编程语言的经验,学习它会相当容易。

下载示例代码文件

尽管我们希望书中和存储库中的代码片段是相同的,但在某些地方可能会有细微差异。存储库中可能包含书中未显示的注释、调试语句或替代实现(已注释掉)。

您可以从www.packt.com的帐户中下载本书的示例代码文件。如果您在其他地方购买了本书,您可以访问www.packtpub.com/support并注册,文件将直接发送到您的邮箱。

您可以按照以下步骤下载代码文件:

  1. 在 www.packt.com 登录或注册。

  2. 选择支持选项卡。

  3. 单击代码下载。

  4. 在搜索框中输入书名,然后按照屏幕上的说明操作。

下载文件后,请确保使用以下最新版本的软件解压或提取文件夹:

  • Windows 上的 WinRAR/7-Zip

  • Mac 上的 Zipeg/iZip/UnRarX

  • Linux 上的 7-Zip/PeaZip

该书的代码包也托管在 GitHub 上,网址为github.com/PacktPublishing/Node.js-Web-Development-Fifth-Edition。如果代码有更新,将在现有的 GitHub 存储库上进行更新。

我们还有来自丰富书籍和视频目录的其他代码包,可在github.com/PacktPublishing/上找到。快去看看吧!

使用的约定

本书中使用了许多文本约定。

CodeInText:表示文本中的代码字、数据库表名、文件夹名、文件名、文件扩展名、路径名、虚拟 URL、用户输入和 Twitter 用户名。例如:"首先更改package.json,使其具有以下scripts部分。"

代码块设置如下:


When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

任何命令行输入或输出都以以下形式编写:


**粗体**:表示新术语、重要单词或屏幕上看到的单词。例如,菜单或对话框中的单词会在文本中以这种方式出现。例如:"单击提交按钮。"

警告或重要说明看起来像这样。

提示和技巧看起来像这样。


第一部分:Node.js 简介

这是对 Node.js 领域的高层概述。读者将已经迈出了使用 Node.js 的第一步。

本节包括以下章节:

+   第一章,*关于 Node.js*

+   第二章,*设置 Node.js*

+   第三章,*探索 Node.js 模块*

+   第四章,*HTTP 服务器和客户端*


关于 Node.js

JavaScript 是每个前端 Web 开发人员都得心应手的语言,这使它成为一种非常流行的编程语言,以至于被刻板地认为只是用于 Web 页面中的客户端代码。有可能,在拿起这本书的时候,你已经听说过 Node.js,这是一个用于在 Web 浏览器之外编写 JavaScript 代码的编程平台。Node.js 现在已有大约十年的历史,正在成为一个成熟的编程平台,在大大小小的项目中被广泛使用。

本书将为您介绍 Node.js。通过本书,您将学习使用 Node.js 开发服务器端 Web 应用程序的完整生命周期,从概念到部署和安全性。在撰写本书时,我们假设以下内容:

+   你已经知道如何编写软件。

+   你熟悉 JavaScript。

+   你对其他语言中开发 Web 应用程序有所了解。

当我们评估一个新的编程工具时,我们是因为它是流行的新工具而抓住它吗?也许我们中的一些人会这样做,但成熟的方法是将一个工具与另一个工具进行比较。这就是本章的内容,介绍使用 Node.js 的技术基础。在着手编写代码之前,我们必须考虑 Node.js 是什么,以及它如何适应软件开发工具的整体市场。然后我们将立即着手开发工作应用程序,并认识到通常学习的最佳方式是通过在工作代码中进行搜索。

我们将在本章中涵盖以下主题:

+   Node.js 简介

+   Node.js 可以做什么

+   为什么你应该使用 Node.js

+   Node.js 的架构

+   使用 Node.js 的性能、利用率和可扩展性

+   Node.js、微服务架构和测试

+   使用 Node.js 实现十二要素应用程序模型

# Node.js 概述

Node.js 是一个令人兴奋的新平台,用于开发 Web 应用程序、应用服务器、任何类型的网络服务器或客户端以及通用编程。它旨在通过服务器端 JavaScript、异步 I/O 和异步编程的巧妙组合,在网络应用程序中实现极端可扩展性。

尽管只有十年的历史,Node.js 迅速崭露头角,现在正发挥着重要作用。无论是大公司还是小公司,都在大规模和小规模项目中使用它。例如,PayPal 已经将许多服务从 Java 转换为 Node.js。

Node.js 的架构与其他应用平台通常选择的方式有所不同。在其他应用平台中,线程被广泛使用来扩展应用程序以填充 CPU,而 Node.js 则避免使用线程,因为线程具有固有的复杂性。据称,采用单线程事件驱动架构,内存占用低,吞吐量高,负载下的延迟配置文件更好,并且编程模型更简单。Node.js 平台正处于快速增长阶段,许多人认为它是传统的使用 Java、PHP、Python 或 Ruby on Rails 的 Web 应用程序架构的一个引人注目的替代方案。

在其核心,它是一个独立的 JavaScript 引擎,具有适用于通用编程的扩展,并且专注于应用服务器开发。尽管我们正在将 Node.js 与应用服务器平台进行比较,但它并不是一个应用服务器。相反,Node.js 是一个类似于 Python、Go 或 Java SE 的编程运行时。虽然有一些用 Node.js 编写的 Web 应用程序框架和应用服务器,但它只是一个执行 JavaScript 程序的系统。

关键的架构选择是 Node.js 是事件驱动的,而不是多线程的。Node.js 架构基于将阻塞操作分派到单线程事件循环,结果以调用事件处理程序的事件返回给调用者。在大多数情况下,事件被转换为由`async`函数处理的 promise。由于 Node.js 基于 Chrome 的 V8 JavaScript 引擎,Chrome 中实现的性能和功能改进很快就会流入 Node.js 平台。

Node.js 核心模块足够通用,可以实现任何基于 TCP 或 UDP 协议的服务器,无论是 DNS、HTTP、互联网中继聊天(IRC)还是 FTP。虽然它支持互联网服务器或客户端的开发,但它最大的用例是常规网站开发,取代像 Apache/PHP 或 Rails 堆栈这样的技术,或者作为现有网站的补充,例如,使用 Node.js 的 Socket.IO 库可以轻松地为现有网站添加实时聊天或监控功能。凭借轻量级、高性能的特性,Node.js 经常被用作"胶水"服务。

特别有趣的组合是在现代云基础设施上部署小型服务,使用诸如 Docker 和 Kubernetes 之类的工具,或者像 AWS Lambda 这样的函数即服务平台。将大型应用程序划分为易于部署的微服务时,Node.js 在规模上表现良好。

掌握了 Node.js 的高级理解后,让我们深入一点。

# Node.js 的能力

Node.js 是一个在 Web 浏览器之外编写 JavaScript 应用程序的平台。这不是我们在 Web 浏览器中熟悉的 JavaScript 环境!虽然 Node.js 执行与我们在浏览器中使用的相同的 JavaScript 语言,但它没有一些与浏览器相关的功能。例如,Node.js 中没有内置 HTML DOM。

除了其本身执行 JavaScript 的能力外,内置模块提供了以下类型的功能:

+   命令行工具(以 shell 脚本风格)

+   交互式终端风格的程序,即 REPL

+   优秀的进程控制功能来监督子进程

+   处理二进制数据的缓冲对象

+   TCP 或 UDP 套接字与全面的事件驱动回调

+   DNS 查找

+   建立在 TCP 库之上的 HTTP、HTTPS 和 HTTP/2 客户端与服务器

+   文件系统访问

+   通过断言(assert)提供的内置基本单元测试支持

Node.js 的网络层是低级的,但同时使用起来很简单,例如,HTTP 模块允许您用几行代码编写 HTTP 服务器(或客户端)。这很强大,但也让作为程序员的您非常贴近协议层,需要自己实现应该在请求响应中返回的那些 HTTP 头部。

典型的 Web 应用程序开发人员不需要在 HTTP 或其他协议的低级别上工作;相反,我们倾向于使用更高级别的接口更加高效,例如,PHP 程序员假设 Apache/Nginx 等已经提供了 HTTP,并且他们不必实现堆栈的 HTTP 服务器部分。相比之下,Node.js 程序员确实实现了一个 HTTP 服务器,他们的应用代码附加到其中。

为了简化情况,Node.js 社区有几个 Web 应用程序框架,比如 Express,提供了典型程序员所需的更高级别的接口。您可以快速配置一个具有内置功能的 HTTP 服务器,比如会话、cookie、提供静态文件和日志记录,让开发人员专注于业务逻辑。其他框架提供 OAuth 2 支持或专注于 REST API 等等。

使用 Node.js 的社区在这个基础上构建了各种令人惊叹的东西。

## 人们如何使用 Node.js?

Node.js 不仅限于 Web 服务应用程序开发;Node.js 周围的社区已经将其引向了许多其他方向:

+   构建工具:Node.js 已经成为开发命令行工具的热门选择,这些工具用于软件开发或与服务基础设施通信。Grunt、Gulp 和 Webpack 被广泛用于前端开发人员构建网站资产。Babel 被广泛用于将现代 ES-2016 代码转译为在旧版浏览器上运行。流行的 CSS 优化器和处理器,如 PostCSS,都是用 Node.js 编写的。静态网站生成系统,如 Metalsmith、Punch 和 AkashaCMS,在命令行上运行,并生成您上传到 Web 服务器的网站内容。

+   Web UI 测试:Puppeteer 让您控制一个无头 Chrome 浏览器实例。借助它,您可以通过控制现代、功能齐全的 Web 浏览器来开发 Node.js 脚本。一些典型的用例是 Web 抓取和 Web 应用程序测试。

+   桌面应用程序:Electron 和 node-webkit(NW.js)都是用于开发 Windows、macOS 和 Linux 桌面应用程序的框架。这些框架内嵌了大部分 Chrome,并由 Node.js 库包装,使用 Web UI 技术开发桌面应用程序。应用程序使用现代的 HTML5、CSS3 和 JavaScript 编写,并可以利用领先的 Web 框架,如 Bootstrap、React、VueJS 和 AngularJS。许多流行的应用程序都是使用 Electron 构建的,包括 Slack 桌面客户端应用程序、Atom、Microsoft Visual Studio Code 编程编辑器、Postman REST 客户端、GitKraken GIT 客户端和 Etcher 等。

+   移动应用程序:Node.js for Mobile Systems 项目允许您使用 Node.js 开发 iOS 和 Android 的智能手机或平板电脑应用程序。苹果的 App Store 规定不允许将具有 JIT 功能的 JavaScript 引擎纳入其中,这意味着普通的 Node.js 不能在 iOS 应用程序中使用。对于 iOS 应用程序开发,该项目使用 Node.js-on-ChakraCore 来规避 App Store 规定。对于 Android 应用程序开发,该项目使用常规的 Node.js 在 Android 上运行。在撰写本文时,该项目处于早期开发阶段,但看起来很有前景。

+   物联网(IoT):Node.js 是物联网项目中非常流行的语言,Node.js 可以在大多数基于 ARM 的单板计算机上运行。最明显的例子是 NodeRED 项目。它提供了一个图形化的编程环境,让您通过连接块来绘制程序。它具有面向硬件的输入和输出机制,例如与树莓派或 Beaglebone 单板计算机上的通用 I/O(GPIO)引脚进行交互。

您可能已经在使用 Node.js 应用程序而没有意识到!JavaScript 在 Web 浏览器之外也有用武之地,这不仅仅是因为 Node.js。

## 服务器端 JavaScript

别再挠头了!当然,您正在这样做,挠头并自言自语地说:“浏览器语言在服务器上做什么?”事实上,JavaScript 在浏览器之外有着悠久而鲜为人知的历史。JavaScript 是一种编程语言,就像任何其他语言一样,更好的问题是“为什么 JavaScript 应该被困在 Web 浏览器内部?”

回到网络时代的黎明,编写 Web 应用程序的工具处于萌芽阶段。一些开发人员尝试使用 Perl 或 TCL 编写 CGI 脚本,PHP 和 Java 语言刚刚被开发出来。即便那时,JavaScript 也在服务器端使用。早期的 Web 应用程序服务器之一是网景的 LiveWire 服务器,它使用了 JavaScript。微软的 ASP 的一些版本使用了 JScript,他们的 JavaScript 版本。一个更近期的服务器端 JavaScript 项目是 Java 领域的 RingoJS 应用程序框架。Java 6 和 Java 7 都附带了 Rhino JavaScript 引擎。在 Java 8 中,Rhino 被新的 Nashorn JavaScript 引擎所取代。

换句话说,JavaScript 在浏览器之外并不是一件新事物,尽管它并不常见。

您已经了解到 Node.js 是一个用于在 Web 浏览器之外编写 JavaScript 应用程序的平台。Node.js 社区使用这个平台进行各种类型的应用程序开发,远远超出了最初为该平台构思的范围。这证明了 Node.js 的受欢迎程度,但我们仍然必须考虑使用它的技术原因。

# 为什么要使用 Node.js?

在众多可用的 Web 应用程序开发平台中,为什么应该选择 Node.js?有很多选择,那么 Node.js 有什么特点使其脱颖而出呢?我们将在接下来的部分中找到答案。

## 流行度

Node.js 迅速成为一种受欢迎的开发平台,并被许多大大小小的参与者所采用。其中之一是 PayPal,他们正在用 Node.js 替换其现有的基于 Java 的系统。其他大型 Node.js 采用者包括沃尔玛的在线电子商务平台、LinkedIn 和 eBay。

有关 PayPal 关于此的博客文章,请访问[`www.paypal-engineering.com/2013/11/22/node-js-at-paypal/`](https://www.paypal-engineering.com/2013/11/22/node-js-at-paypal/)。

根据 NodeSource 的说法,Node.js 的使用量正在迅速增长(有关更多信息,请访问[`nodesource.com/node-by-numbers`](https://nodesource.com/node-by-numbers))。这种增长的证据包括下载 Node.js 版本的带宽增加,与 Node.js 相关的 GitHub 项目的活动增加等。

对 JavaScript 本身的兴趣仍然非常强烈,但在搜索量(Google Insights)和作为编程技能的使用方面(Dice Skills Center)已经停滞多年。Node.js 的兴趣一直在迅速增长,但正在显示出停滞的迹象。

有关更多信息,请参阅[`itnext.io/choosing-typescript-vs-javascript-technology-popularity-ea978afd6b5f`](https://itnext.io/choosing-typescript-vs-javascript-technology-popularity-ea978afd6b5f)或[`bit.ly/2q5cu0w`](http://bit.ly/2q5cu0w)。

最好不要只是跟随潮流,因为有不同的潮流,每一个都声称他们的软件平台有很酷的功能。Node.js 确实有一些很酷的功能,但更重要的是它的技术价值。

## JavaScript 无处不在

在服务器和客户端上使用相同的编程语言一直是网络上的一个长期梦想。这个梦想可以追溯到早期的 Java 时代,当时 Java 小程序在浏览器中被视为用于 Java 编写的服务器应用程序的前端,而 JavaScript 最初被设想为这些小程序的轻量级脚本语言。然而,Java 从未实现其作为客户端编程语言的炒作,甚至“Java 小程序”这个词组也正在逐渐消失,成为被放弃的客户端应用程序模型的模糊记忆。最终,我们选择了 JavaScript 作为浏览器中的主要客户端语言,而不是 Java。通常情况下,前端 JavaScript 开发人员使用的是与服务器端团队不同的语言,后者可能是 PHP、Java、Ruby 或 Python。

随着时间的推移,在浏览器中的 JavaScript 引擎变得非常强大,让我们能够编写越来越复杂的浏览器端应用程序。有了 Node.js,我们终于能够使用相同的编程语言在客户端和服务器上实现应用程序,因为 JavaScript 在网络的两端,即浏览器和服务器上。

前端和后端使用相同的编程语言具有几个潜在的好处:

+   同一编程人员可以在网络两端工作。

+   代码可以更轻松地在服务器和客户端之间迁移。

+   服务器和客户端之间的常见数据格式(JSON)。

+   服务器和客户端存在常见的软件工具。

+   服务器和客户端的常见测试或质量报告工具。

+   在编写 Web 应用程序时,视图模板可以在两端使用。

JavaScript 语言非常受欢迎,因为它在 Web 浏览器中非常普遍。它与其他语言相比具有许多现代、先进的语言概念。由于其受欢迎程度,有许多经验丰富的 JavaScript 程序员。

## 利用谷歌对 V8 的投资

为了使 Chrome 成为一款受欢迎且出色的 Web 浏览器,谷歌投资于使 V8 成为一个超快的 JavaScript 引擎。因此,谷歌有巨大的动力继续改进 V8。V8 是 Chrome 的 JavaScript 引擎,也可以独立执行。

Node.js 建立在 V8 JavaScript 引擎之上,使其能够利用 V8 的所有工作。因此,Node.js 能够在 V8 实现新的 JavaScript 语言特性时迅速采用,并因此获得性能优势。

## 更精简、异步、事件驱动的模型

Node.js 架构建立在单个执行线程上,具有巧妙的事件驱动、异步编程模型和快速的 JavaScript 引擎,据称比基于线程的架构具有更少的开销。其他使用线程进行并发的系统往往具有内存开销和复杂性,而 Node.js 没有。我们稍后会更深入地讨论这一点。

## 微服务架构

软件开发中的一个新感觉是微服务的概念。微服务专注于将大型 Web 应用程序拆分为小型、紧密专注的服务,可以由小团队轻松开发。虽然它们并不是一个全新的想法,它们更像是对旧的客户端-服务器计算模型的重新构架,但是微服务模式与敏捷项目管理技术很匹配,并且为我们提供了更精细的应用部署。

Node.js 是实现微服务的优秀平台。我们稍后会详细介绍。

## Node.js 在一次重大分裂和敌对分支之后变得更加强大

在 2014 年和 2015 年,Node.js 社区因政策、方向和控制而发生了重大分裂。**io.js**项目是一个敌对的分支,由一群人驱动,他们希望合并几个功能并改变决策过程中的人员。最终的结果是合并了 Node.js 和 io.js 存储库,成立了独立的 Node.js 基金会来运作,并且社区共同努力朝着共同的方向前进。

弥合这一分歧的一个具体结果是快速采用新的 ECMAScript 语言特性。V8 引擎迅速实现这些新特性来推进 Web 开发的状态。Node.js 团队也尽快采用 V8 中实现的这些特性,这意味着 Promise 和`async`函数很快就会成为 Node.js 程序员的现实。

总之,Node.js 社区不仅在 io.js 分支和后来的 ayo.js 分支中幸存下来,而且社区和它培育的平台因此变得更加强大。

在本节中,您已经了解了使用 Node.js 的几个原因。它不仅是一个受欢迎的平台,有一个强大的社区支持,而且还有一些严肃的技术原因可以使用它。它的架构具有一些关键的技术优势,让我们更深入地了解一下这些优势。

# Node.js 事件驱动架构

据说 Node.js 的出色性能是因为其异步事件驱动架构和使用 V8 JavaScript 引擎。这使其能够同时处理多个任务,例如在多个 Web 浏览器的请求之间进行协调。Node.js 的原始创始人 Ryan Dahl 遵循了这些关键点:

+   单线程、事件驱动的编程模型比依赖线程处理多个并发任务的应用服务器更简单,复杂性更低,开销更小。

+   通过将阻塞函数调用转换为异步代码执行,可以配置系统以在满足阻塞请求时发出事件。

+   您可以利用来自 Chrome 浏览器的 V8 JavaScript 引擎,并且所有工作都用于改进 V8;所有性能增强都进入 V8,因此也有益于 Node.js。

在大多数应用服务器中,并发或处理多个并发请求的能力是通过多线程架构实现的。在这样的系统中,对数据的任何请求或任何其他阻塞函数调用都会导致当前执行线程暂停并等待结果。处理并发请求需要有多个执行线程。当一个线程被暂停时,另一个线程可以执行。这会导致应用服务器启动和停止线程来处理请求。每个暂停的线程(通常在输入/输出操作完成时等待)都会消耗完整的内存调用堆栈,增加开销。线程会给应用服务器增加复杂性和服务器开销。

为了帮助我们理解为什么会这样,Node.js 的创始人 Ryan Dahl 在 2010 年 5 月的 Cinco de NodeJS 演示中提供了以下示例。([`www.youtube.com/watch?v=M-sc73Y-zQA`](https://www.youtube.com/watch?v=M-sc73Y-zQA)) Dahl 问我们当我们执行这样的代码行时会发生什么:
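
下面是一行示意性的伪代码(这里的 `query` 是一个假想的同步数据库查询函数,仅作说明):

```js
// 阻塞式调用的示意:在查询返回(或出错)之前,执行线程什么也做不了
const result = query('SELECT * from db.table');
// 随后才能处理 result
```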

Of course, the program pauses at this point while the database layer sends the query to the database and waits for the result or the error. This is an example of a blocking function call. Depending on the query, this pause can be quite long (well, a few milliseconds, which is ages in computer time). This pause is bad because the execution thread can do nothing while it waits for the result to arrive. If your software is running on a single-threaded platform, the entire server would be blocked and unresponsive. If instead your application is running on a thread-based server platform, a thread-context switch is required to satisfy any other requests that arrive. The greater the number of outstanding connections to the server, the greater the number of thread-context switches. Context switching is not free because more threads require more memory per thread state and more time for the CPU to spend on thread management overheads.

The key inspiration guiding the original development of Node.js was the simplicity of a single-threaded system. A single execution thread means that the server doesn't have the complexity of multithreaded systems. This choice meant that Node.js required an event-driven model for handling concurrent tasks. Instead of the code waiting for results from a blocking request, such as retrieving data from a database, an event is instead dispatched to an event handler.

Using threads to implement concurrency often comes with admonitions such as these: it is expensive and error-prone, the synchronization primitives of Java are error-prone, or designing concurrent software can be complex and error-prone. The complexity comes from access to shared variables and various strategies to avoid deadlock and competition between threads. The synchronization primitives of Java are an example of such a strategy, and obviously many programmers find them difficult to use. There's a tendency to create frameworks such as java.util.concurrent to tame the complexity of threaded concurrency, but some argue that papering over complexity only makes things more complex.

A typical Java programmer might object at this point. Perhaps their application code is written against a framework such as Spring, or maybe they're directly using Java EE. In either case, their application code does not use concurrency features or deal with threads, and therefore where is the complexity that we just described? Just because that complexity is hidden within Spring and Java EE does not mean that there is no complexity and overhead.

Okay, we get it: while multithreaded systems can do amazing things, there is inherent complexity. What does Node.js offer?

The Node.js answer to complexity

Node.js asks us to think differently about concurrency. Callbacks fired asynchronously from an event loop are a much simpler concurrency model—simpler to understand, simpler to implement, simpler to reason about, and simpler to debug and maintain.

Node.js has a single execution thread with no waiting on I/O or context switching. Instead, there is an event loop that dispatches events to handler functions as things happen. A request that would have blocked the execution thread instead executes asynchronously, with the results or errors triggering an event. Any operation that would block or otherwise take time to complete must use the asynchronous model.

The original Node.js paradigm delivered the dispatched event to an anonymous function. Now that JavaScript has async functions, the Node.js paradigm is shifting to deliver results and errors via a promise that is handled by the await keyword. When an asynchronous function is called, control quickly passes to the event loop rather than causing Node.js to block. The event loop continues handling the variety of events while recording where to send each result or error.

By using an asynchronous event-driven I/O, Node.js removes most of this overhead while introducing very little of its own.

One of the points Ryan Dahl made in the Cinco de Node presentation is a hierarchy of execution time for different requests. Objects in memory are more quickly accessed (in the order of nanoseconds) than objects on disk or objects retrieved over the network (milliseconds or seconds). The longer access time for external objects is measured in zillions of clock cycles, which can be an eternity when your customer is sitting at their web browser ready to move on if it takes longer than two seconds to load the page.

Therefore, concurrent request handling means using a strategy to handle the requests that take longer to satisfy. If the goal is to avoid the complexity of a multithreaded system, then the system must use asynchronous operations as Node.js does.

What do these asynchronous function calls look like?

Asynchronous requests in Node.js

In Node.js, the query that we looked at previously will read as follows:
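
A minimal sketch, assuming the same hypothetical `query` function but now following Node.js's error-first callback convention:

```js
// Sketch only: `query` is a hypothetical database call
query('SELECT * from db.table', function (err, result) {
  if (err) {
    // handle the error here
  } else {
    // operate on the result here
  }
});
```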


程序员提供一个在结果(或错误)可用时被调用的函数(因此称为*回调函数*)。`query`函数仍然需要相同的时间。它不会阻塞执行线程,而是返回到事件循环,然后可以处理其他请求。Node.js 最终会触发一个事件,导致调用此回调函数并返回结果或错误指示。

在客户端 JavaScript 中使用类似的范例,我们经常编写事件处理程序函数。

JavaScript 语言的进步为我们提供了新的选择。与 ES2015 promises 一起使用时,等效的代码如下:
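
一个最小的示意写法如下(假设 `query` 返回一个 Promise,仅作说明):

```js
query('SELECT * from db.table')
  .then(result => {
    // 处理结果
  })
  .catch(err => {
    // 处理错误
  });
```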

This is a little better, especially in instances of deeply nested event handling.

The big advance came with the ES-2017 async function:
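
A sketch of the same query written with async/await (again assuming the hypothetical `query` returns a promise):

```js
async function handleRequest() {
  try {
    const result = await query('SELECT * from db.table');
    // operate on the result on the very next line
  } catch (err) {
    // handle the error here
  }
}
```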


除了`async`和`await`关键字之外,这看起来像我们在其他语言中编写的代码,并且更容易阅读。由于`await`的作用,它仍然是异步代码执行。

这三个代码片段都执行了我们之前编写的相同查询。`query`不再是阻塞函数调用,而是异步的,不会阻塞执行线程。

使用回调函数和 promise 的异步编码,Node.js 也存在自己的复杂性问题。我们经常在一个异步函数之后调用另一个异步函数。使用回调函数意味着深度嵌套的回调函数,而使用 promise 则意味着长长的`.then`处理程序函数链。除了编码的复杂性,我们还有错误和结果出现在不自然的位置。异步执行的回调函数被调用时,不会落在下一行代码上。执行顺序不是像同步编程语言中一行接一行的,而是由回调函数执行的顺序决定的。

`async`函数的方法解决了这种编码复杂性。编码风格更自然,因为结果和错误出现在自然的位置,即下一行代码。`await`关键字集成了异步结果处理,而不会阻塞执行线程。`async/await`功能的背后有很多东西,我们将在本书中广泛涵盖这个模型。

但是 Node.js 的异步架构实际上改善了性能吗?

## 性能和利用率

Node.js 引起了一些兴奋是因为它的吞吐量(每秒请求量)。对比类似应用的基准测试,比如 Apache,显示出 Node.js 有巨大的性能提升。

一个流传的基准是以下简单的 HTTP 服务器(从[`nodejs.org/en/`](https://nodejs.org/en/)借来的),它直接从内存中返回一个`Hello World`消息:
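
下面给出一个与 nodejs.org 首页示例等价的最小版本(端口号 8124 仅作示例):

```js
const http = require('http');

http.createServer((req, res) => {
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.end('Hello, World!\n');
}).listen(8124, '127.0.0.1');

console.log('Server running at http://127.0.0.1:8124/');
```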

This is one of the simpler web servers that you can build with Node.js. The http object encapsulates the HTTP protocol, and its http.createServer method creates a whole web server, listening on the port specified in the listen method. Every request (whether a GET or POST on any URL) on that web server calls the provided function. It is very simple and lightweight. In this case, regardless of the URL, it returns a simple text/plain that is the Hello World response.

Ryan Dahl showed a simple benchmark in a video titled Ryan Dahl: Introduction to Node.js (on the YUI Library channel on YouTube, www.youtube.com/watch?v=M-sc73Y-zQA). It used a similar HTTP server to this, but that returned a one-megabyte binary buffer; Node.js gave 822 req/sec, while Nginx gave 708 req/sec, for a 15% improvement over Nginx. He also noted that Nginx peaked at four megabytes of memory, while Node.js peaked at 64 megabytes.

The key observation was that Node.js, running an interpreted, JIT-compiled, high-level language, was about as fast as Nginx, built of highly optimized C code, while running similar tasks. That presentation was in May 2010, and Node.js has improved hugely since then, as shown in Chris Bailey's talk that we referenced earlier.

Yahoo! search engineer Fabian Frank published a performance case study of a real-world search query suggestion widget implemented with Apache/PHP and two variants of Node.js stacks (www.slideshare.net/FabianFrankDe/nodejs-performance-case-study). The application is a pop-up panel showing search suggestions as the user types in phrases using a JSON-based HTTP query. The Node.js version could handle eight times the number of requests per second with the same request latency. Fabian Frank said both Node.js stacks scaled linearly until CPU usage hit 100%.

LinkedIn did a massive overhaul of their mobile app using Node.js for the server-side to replace an old Ruby on Rails app. The switch lets them move from 30 servers down to 3, and allowed them to merge the frontend and backend team because everything was written in JavaScript. Before choosing Node.js, they'd evaluated Rails with Event Machine, Python with Twisted, and Node.js, chose Node.js for the reasons that we just discussed. For a look at what LinkedIn did, see arstechnica.com/information-technology/2012/10/a-behind-the-scenes-look-at-linkedins-mobile-engineering/.

Most existing Node.js performance tips tend to have been written for older V8 versions that used the CrankShaft optimizer. The V8 team has completely dumped CrankShaft, and it has a new optimizer called TurboFan—for example, under CrankShaft, it was slower to use try/catch, let/const, generator functions, and so on. Therefore, common wisdom said to not use those features, which is depressing because we want to use the new JavaScript features because of how much it has improved the JavaScript language. Peter Marshall, an engineer on the V8 team at Google, gave a talk at Node.js Interactive 2017 claiming that, using TurboFan, you should just write natural JavaScript. With TurboFan, the goal is for across-the-board performance improvements in V8. To view the presentation, see the video titled High Performance JS in V8 at www.youtube.com/watch?v=YqOhBezMx1o.

A truism about JavaScript is that it's no good for heavy computation work because of the nature of JavaScript. We'll go over some ideas that are related to this in the next section. A talk by Mikola Lysenko at Node.js Interactive 2016 went over some issues with numerical computing in JavaScript, and some possible solutions. Common numerical computing involves large numerical arrays processed by numerical algorithms that you might have learned in calculus or linear algebra classes. What JavaScript lacks is multidimensional arrays and access to certain CPU instructions. The solution that he presented is a library to implement multidimensional arrays in JavaScript, along with another library full of numerical computing algorithms. To view the presentation, see the video titled Numerical Computing in JavaScript by Mikola Lysenko at www.youtube.com/watch?v=1ORaKEzlnys

At the Node.js Interactive conference in 2017, IBM's Chris Bailey made a case for Node.js being an excellent choice for highly scalable microservices. Key performance characteristics are I/O performance (measured in transactions per second), startup time (because that limits how quickly your service can scale up to meet demand), and memory footprint (because that determines how many application instances can be deployed per server). Node.js excels on all those measures; with every subsequent release, it either improves on each measure or remains fairly steady. Bailey presented figures comparing Node.js to a similar benchmark written in Spring Boot showing Node.js to perform much better. To view his talk, see the video titled Node.js Performance and Highly Scalable Micro-Services - Chris Bailey, IBM at www.youtube.com/watch?v=Fbhhc4jtGW4.

The bottom line is that Node.js excels at event-driven I/O throughput. Whether a Node.js program can excel at computational programs depends on your ingenuity in working around some limitations in the JavaScript language.

A big problem with computational programming is that it prevents the event loop from executing. As we will see in the next section, that can make Node.js look like a poor candidate for anything.

Is Node.js a cancerous scalability disaster?

In October 2011, a blog post (since pulled from the blog where it was published) titled Node.js is a cancer called Node.js a scalability disaster. The example shown for proof was a CPU-bound implementation of the Fibonacci sequence algorithm. While the argument was flawed—since nobody implements Fibonacci that way—it made the valid point that Node.js application developers have to consider the following: where do you put the heavy computational tasks?

A key to maintaining high throughput of Node.js applications is by ensuring that events are handled quickly. Because it uses a single execution thread, if that thread is bogged down with a big calculation, Node.js cannot handle events, and event throughput will suffer.

The Fibonacci sequence, serving as a stand-in for heavy computational tasks, quickly becomes computationally expensive to calculate for a naïve implementation such as this:
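
A deliberately naïve recursive version, shown here only as a sketch of the problem:

```js
// Naïve recursive Fibonacci: running time grows exponentially with n
const fibonacci = n => {
  if (n === 1 || n === 2) return 1;
  return fibonacci(n - 1) + fibonacci(n - 2);
};
```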


这是一个特别简单、也特别低效的计算斐波那契数的方法。是的,有很多更快的计算斐波那契数的方法。我们在这里展示它,只是作为事件处理程序运行缓慢时 Node.js 会发生什么的一般性例子,而不是要讨论计算数学函数的最佳方法。考虑以下服务器:
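
下面是一个示意性的实现(在前面的简单服务器基础上扩展,URL 参数的解析方式仅作示例):

```js
const http = require('http');
const url = require('url');

http.createServer((req, res) => {
  const urlP = url.parse(req.url, true);
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  if (urlP.query['n']) {
    // 同步计算:在计算完成之前,事件循环无法处理其他请求
    const fibo = fibonacci(Math.floor(urlP.query['n']));
    res.end(`Fibonacci ${urlP.query['n']}=${fibo}`);
  } else {
    res.end('USAGE: http://127.0.0.1:8124?n=## where ## is the Fibonacci number desired');
  }
}).listen(8124, '127.0.0.1');
```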

This is an extension of the simple web server shown earlier. It looks in the request URL for an argument, n, for which to calculate the Fibonacci number. When it's calculated, the result is returned to the caller.

For sufficiently large values of n (for example, 40), the server becomes completely unresponsive because the event loop is not running. Instead, this function has blocked event processing because the event loop cannot dispatch events while the function is grinding through the calculation.

In other words, the Fibonacci function is a stand-in for any blocking operation.

Does this mean that Node.js is a flawed platform? No, it just means that the programmer must take care to identify code with long-running computations and develop solutions. These include rewriting the algorithm to work with the event loop, rewriting the algorithm for efficiency, integrating a native code library, or foisting computationally expensive calculations to a backend server.

A simple rewrite dispatches the computations through the event loop, letting the server continue to handle requests on the event loop. Using callbacks and closures (anonymous functions), we're able to maintain asynchronous I/O and concurrency promises, as shown in the following code:
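
A sketch of that rewrite, still using the same inefficient algorithm but yielding to the event loop between steps via process.nextTick:

```js
// Same naïve algorithm, but each recursive step is dispatched through
// the event loop, so other events can be handled in between
const fibonacciAsync = (n, done) => {
  if (n === 1 || n === 2) {
    done(1);
  } else {
    process.nextTick(() => {
      fibonacciAsync(n - 1, val1 => {
        process.nextTick(() => {
          fibonacciAsync(n - 2, val2 => {
            done(val1 + val2);
          });
        });
      });
    });
  }
};
```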


这是一个同样愚蠢的计算斐波那契数的方法,但是通过使用`process.nextTick`,事件循环有机会执行。

因为这是一个需要回调函数的异步函数,它需要对服务器进行小的重构:
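
改动大致如下(沿用上面的 `fibonacciAsync`,仅作示意):

```js
if (urlP.query['n']) {
  fibonacciAsync(Math.floor(urlP.query['n']), fibo => {
    res.end(`Fibonacci ${urlP.query['n']}=${fibo}`);
  });
} else {
  res.end('USAGE: http://127.0.0.1:8124?n=##');
}
```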

We've added a callback function to receive the result. In this case, the server is able to handle multiple Fibonacci number requests. But there is still a performance issue because of the inefficient algorithm.

Later in this book, we'll explore this example a little more deeply to explore alternative approaches.

In the meantime, we can discuss why it's important to use efficient software stacks.

Server utilization, overhead costs, and environmental impact

The striving for optimal efficiency (handling more requests per second) is not just about the geeky satisfaction that comes from optimization. There are real business and environmental benefits. Handling more requests per second, as Node.js servers can do, means the difference between buying lots of servers and buying only a few servers. Node.js potentially lets your organization do more with less.

Roughly speaking, the more servers you buy, the greater the monetary cost and the greater the environmental cost. There's a whole field of expertise around reducing costs and the environmental impact of running web-server facilities to which that rough guideline doesn't do justice. The goal is fairly obvious—fewer servers, lower costs, and a lower environmental impact by using more efficient software.

Intel's paper, Increasing Data Center Efficiency with Server Power Measurements (www.intel.com/content/dam/doc/white-paper/intel-it-data-center-efficiency-server-power-paper.pdf), gives an objective framework for understanding efficiency and data center costs. There are many factors, such as buildings, cooling systems, and computer system designs. Efficient building design, efficient cooling systems, and efficient computer systems (data center efficiency, data center density, and storage density) can lower costs and environmental impact. But you can destroy these gains by deploying an inefficient software stack, compelling you to buy more servers than you would if you had an efficient software stack. Alternatively, you can amplify gains from data center efficiency with an efficient software stack that lets you decrease the number of servers required.

This talk about efficient software stacks isn't just for altruistic environmental purposes. This is one of those cases where being green can help your business bottom line.

In this section, we have learned a lot about how Node.js architecture differs from other programming platforms. The choice to eschew threads to implement concurrency simplifies away the complexity and overhead that comes from using threads. This seems to have fulfilled the promise of being more efficient. Efficiency has a number of benefits to many aspects of a business.

Embracing advances in the JavaScript language

The last couple of years have been an exciting time for JavaScript programmers. The TC-39 committee that oversees the ECMAScript standard has added many new features, some of which are syntactic sugar, but several of which have propelled us into a whole new era of JavaScript programming. By itself, the async/await feature promises us a way out of what's called callback hell, the situation that we find ourselves in when nesting callbacks within callbacks. It's such an important feature that it should necessitate a broad rethinking of the prevailing callback-oriented paradigm in Node.js and the rest of the JavaScript ecosystem.

A few pages ago, you saw this:
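
That is, the callback-oriented query sketched earlier (with the hypothetical `query` function):

```js
query('SELECT * from db.table', function (err, result) {
  if (err) {
    // errors must be handled here, inside the callback
  } else {
    // the result arrives here, not on the next line of code
  }
});
```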


这是 Ryan Dahl 的一个重要洞察,也是推动 Node.js 流行的原因。某些操作需要很长时间才能运行,比如数据库查询,不应该和快速从内存中检索数据的操作一样对待。由于 JavaScript 语言的特性,Node.js 必须以一种不自然的方式表达这种异步编码结构。结果不会出现在下一行代码,而是出现在这个回调函数中。此外,错误必须以一种不自然的方式处理,出现在那个回调函数中。

在 Node.js 中的约定是回调函数的第一个参数是一个错误指示器,随后的参数是结果。这是一个有用的约定,你会在整个 Node.js 领域找到它;然而,它使得处理结果和错误变得复杂,因为两者都出现在一个不方便的位置——那个回调函数。错误和结果自然地应该出现在随后的代码行上。

随着每一层回调函数嵌套,我们陷入了回调地狱。第七层回调嵌套比第六层回调嵌套更复杂。为什么?至少有一点是因为随着回调的嵌套更深,错误处理的特殊考虑变得更加复杂。

但正如我们之前所看到的,这是在 Node.js 中编写异步代码的新首选方式:
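
也就是前面示意过的 async/await 写法(仍以假想的 `query` 函数为例):

```js
try {
  const result = await query('SELECT * from db.table');
  // 下一行代码就可以使用 result
} catch (err) {
  // 错误在这里处理
}
```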

相反,ES2017 的async函数使我们回到了这种非常自然的编程意图表达。结果和错误会在正确的位置上,同时保持了使 Node.js 变得伟大的出色的事件驱动的异步编程模型。我们将在本书的后面看到这是如何工作的。

TC-39 委员会为 JavaScript 添加了许多新功能,比如以下的:

  • 改进的类声明语法,使对象继承和 getter/setter 函数非常自然。

  • 一个在浏览器和 Node.js 中标准化的新模块格式。

  • 字符串的新方法,比如模板字符串表示法。

  • 集合和数组的新方法,例如map/reduce/filter的操作。

  • 使用const关键字来定义不能被改变的变量,使用let关键字来定义变量的作用域仅限于它们声明的块,而不是被提升到函数的前面。

  • 新的循环结构和与这些新循环配合使用的迭代协议。

  • 一种新类型的函数,箭头函数,它更轻量,意味着更少的内存和执行时间影响。

  • Promise对象表示将来承诺交付的结果。单独使用,承诺可以缓解回调地狱问题,并且它们构成了async函数的一部分基础。

  • 生成器函数是一种有趣的方式,用于表示一组值的异步迭代。更重要的是,它们构成了异步函数的基础的另一半。

你可能会看到新的 JavaScript 被描述为 ES6 或 ES2017。描述正在使用的 JavaScript 版本的首选名称是什么?

ES1 到 ES5 标志着 JavaScript 发展的各个阶段。ES5 于 2009 年发布,并在现代浏览器中得到广泛实现。从 ES6 开始,TC-39 委员会决定改变命名约定,因为他们打算每年添加新的语言特性。因此,语言版本现在包括年份,例如,ES2015 于 2015 年发布,ES2016 于 2016 年发布,ES2017 于 2017 年发布。

部署 ES2015/2016/2017/2018 JavaScript 代码

问题在于,通常 JavaScript 开发人员无法使用最新的功能。前端 JavaScript 开发人员受到部署的网络浏览器和大量旧浏览器的限制,这些浏览器在长时间未更新操作系统的计算机上使用。幸运的是,Internet Explorer 6 版本几乎已经完全退出使用,但仍然有大量旧浏览器安装在老旧计算机上,仍然为其所有者提供有效的角色。旧浏览器意味着旧的 JavaScript 实现,如果我们希望我们的代码能够运行,我们需要它与旧浏览器兼容。

Babel 和其他代码重写工具的一个用途是处理这个问题。许多产品必须能够被使用旧浏览器的人使用。开发人员仍然可以使用最新的 JavaScript 或 TypeScript 功能编写他们的代码,然后使用 Babel 重写他们的代码,以便在旧浏览器上运行。这样,前端 JavaScript 程序员可以采用(部分)新功能,但需要更复杂的构建工具链,并且代码重写过程可能引入错误的风险。

Node.js 世界没有这个问题。Node.js 迅速采用了 ES2015/2016/2017 功能,就像它们在 V8 引擎中实现一样。从 Node.js 8 开始,我们可以自由地使用async函数作为一种原生功能。新的模块格式首次在 Node.js 版本 10 中得到支持。

换句话说,虽然前端 JavaScript 程序员可以主张他们必须等待几年才能采用 ES2015/2016/2017 功能,但 Node.js 程序员无需等待。我们可以简单地使用新功能,而无需任何代码重写工具,除非我们的管理人员坚持支持早于这些功能采用的旧 Node.js 版本。在这种情况下,建议您使用 Babel。

JavaScript 世界的一些进步是在 TC-39 社区之外发生的。

TypeScript 和 Node.js

TypeScript 语言是 JavaScript 环境的一个有趣的分支。因为 JavaScript 越来越能够用于复杂的应用程序,编译器帮助捕捉编程错误变得越来越有用。其他语言的企业程序员,如 Java,习惯于强类型检查作为防止某些类别的错误的一种方式。

强类型检查在某种程度上与 JavaScript 程序员相悖,但它确实很有用。TypeScript 项目旨在从 Java 和 C#等语言中引入足够的严谨性,同时保留 JavaScript 的松散性。结果是编译时类型检查,而不会像其他语言中的程序员那样承载沉重的负担。

虽然我们在本书中不会使用 TypeScript,但它的工具链在 Node.js 应用程序中非常容易采用。

在本节中,我们了解到随着 JavaScript 语言的变化,Node.js 平台也跟上了这些变化。

使用 Node.js 开发微服务或最大服务

新的功能,如云部署系统和 Docker,使得实现一种新的服务架构成为可能。Docker 使得可以在可重复部署到云托管系统中的数百万个容器中定义服务器进程配置。它最适合小型、单一用途的服务实例,可以连接在一起组成一个完整的系统。Docker 并不是唯一可以帮助简化云部署的工具;然而,它的特性非常适合现代应用部署需求。

一些人将微服务概念作为描述这种系统的一种方式。根据microservices.io网站,微服务由一组狭义、独立可部署的服务组成。他们将其与单片应用部署模式进行对比,单片应用将系统的每个方面集成到一个捆绑包中(例如 Java EE 应用服务器的单个 WAR 文件)。微服务模型为开发人员提供了非常需要的灵活性。

微服务的一些优势如下:

  • 每个微服务可以由一个小团队管理。

  • 每个团队可以按照自己的时间表工作,只要保持服务 API 的兼容性。

  • 微服务可以独立部署,如果需要的话,比如为了更容易进行测试。

  • 更容易切换技术栈选择。

Node.js 在这方面的定位如何?它的设计与微服务模型非常契合:

  • Node.js 鼓励小型、紧密专注、单一用途的模块。

  • 这些模块由出色的 npm 包管理系统组成应用程序。

  • 发布模块非常简单,无论是通过 NPM 仓库还是 Git URL。

  • 虽然 Express 等应用框架可以用于大型服务,但它非常适用于小型轻量级服务,并支持简单易用的部署。

简而言之,使用 Node.js 以精益和敏捷的方式非常容易,可以根据您的架构偏好构建大型或小型服务。

总结

在本章中,您学到了很多东西。特别是,您看到了 JavaScript 在 Web 浏览器之外的生活,以及 Node.js 是一个具有许多有趣特性的优秀编程平台。虽然它是一个相对年轻的项目,但 Node.js 已经变得非常流行,不仅广泛用于 Web 应用程序,还用于命令行开发工具等。由于 Node.js 平台基于 Chrome 的 V8 JavaScript 引擎,该项目已经能够跟上 JavaScript 语言的快速改进。

Node.js 架构由事件循环触发回调函数管理的异步函数组成,而不是使用线程和阻塞 I/O。这种架构声称具有性能优势,似乎提供了许多好处,包括能够在更少的硬件上完成更多的工作。但我们也了解到低效的算法可能会抵消任何性能优势。

本书的重点是开发和部署 Node.js 应用程序的现实考虑。我们将尽可能涵盖开发、完善、测试和部署 Node.js 应用程序的许多方面。

既然我们已经对 Node.js 有了介绍,我们准备好开始使用它了。在第二章 设置 Node.js中,我们将介绍如何在 Mac、Linux 或 Windows 上设置 Node.js 开发环境,甚至编写一些代码。让我们开始吧。

设置 Node.js

在开始使用 Node.js 之前,您必须设置好开发环境。虽然设置非常简单,但有许多考虑因素,包括是否使用包管理系统安装 Node.js,满足安装本地代码 Node.js 包的要求,以及决定使用什么编辑器最好与 Node.js 一起使用。在接下来的章节中,我们将使用这个环境进行开发和非生产部署。

在本章中,我们将涵盖以下主题:

  • 如何在 Linux、macOS 或 Windows 上从源代码和预打包的二进制文件安装 Node.js

  • 如何安装 node 包管理器(npm)和其他一些流行的工具

  • Node.js 模块系统

  • Node.js 和 ECMAScript 委员会的 JavaScript 语言改进

系统要求

Node.js 可以在类似 POSIX 的操作系统、各种 UNIX 衍生系统(例如 Solaris)和 UNIX 兼容的操作系统(如 Linux、macOS 等),以及 Microsoft Windows 上运行。它可以在各种大小的计算机上运行,包括像树莓派这样的微型 ARM 设备,树莓派是一个用于 DIY 软件/硬件项目的微型嵌入式计算机。

Node.js 现在可以通过包管理系统获得,无需从源代码编译和安装。

由于许多 Node.js 包是用 C 或 C++编写的,您必须有 C 编译器(如 GCC)、Python 2.7(或更高版本)和node-gyp包。由于 Python 2 将在 2019 年底停止维护,Node.js 社区正在为 Python 3 兼容性重写其工具。如果您计划在网络编码中使用加密,还需要 OpenSSL 加密库。现代 UNIX 衍生系统几乎肯定会自带这些内容,Node.js 的配置脚本(在从源代码安装时使用)将检测它们的存在。如果您需要安装它,Python 可以在python.org获取,OpenSSL 可以在openssl.org获取。

现在我们已经了解了运行 Node.js 的要求,让我们学习如何安装它。

使用包管理器安装 Node.js

安装 Node.js 的首选方法是使用包管理器中提供的版本,比如apt-get或 MacPorts。包管理器通过输入简单的命令,如apt-get update,来帮助您在计算机上维护软件的当前版本,并确保更新依赖包,从而让您的生活更加轻松。让我们首先来看一下如何从包管理系统进行安装。

有关从包管理器安装的官方说明,请访问nodejs.org/en/download/package-manager/.

在 macOS 上使用 MacPorts 安装 Node.js

MacPorts 项目(www.macports.org/)多年来一直在为 macOS 打包大量开源软件包,他们已经打包了 Node.js。它默认管理的命令安装在/opt/local/bin上。安装 MacPorts 后,安装 Node.js 非常简单,可以在 MacPorts 安装命令的目录中找到 Node.js 二进制文件:
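
大致的安装与验证过程如下(port 包名和版本号会随时间变化,此处仅作示例):

```shell
$ sudo port install nodejs14
...
$ which node
/opt/local/bin/node
$ node --version
v14.2.0
```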


If you have followed the directions for setting up MacPorts, the MacPorts directory is already in your PATH environment variable. Running the `node`, `npm`, or `npx` commands is then simple. This proves Node.js has been installed and the installed version matched what you asked for.

MacPorts isn't the only tool for managing open source software packages on macOS.

## Installing Node.js on macOS with Homebrew

Homebrew is another open source software package manager for macOS, which some say is the perfect replacement for MacPorts. It is available through their home page at [`brew.sh/`](http://brew.sh/). After installing Homebrew using the instructions on their website and ensuring that it is correctly set up, use the following code:
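
A typical first step (shown here only as an example) is to refresh Homebrew's package data:

```shell
$ brew update
...
```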

然后,像这样安装:
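
安装 Node.js 本身只需要一条命令(node 是 Homebrew 中对应的包名):

```shell
$ brew install node
...
```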


Like MacPorts, Homebrew installs commands on a public directory, which defaults to `/usr/local/bin`. If you have followed the Homebrew instructions to add that directory to your `PATH` variable, run the Node.js command as follows:
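
For example (the version number shown is just an example):

```shell
$ node --version
v14.2.0
```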

这证明 Node.js 已经安装,并且安装的版本与您要求的版本相匹配。

当然,macOS 只是我们可能使用的众多操作系统之一。

从包管理系统在 Linux、*BSD 或 Windows 上安装 Node.js

Node.js 现在可以通过大多数包管理系统获得。Node.js 网站上的说明目前列出了 Node.js 的打包版本,适用于一长串 Linux 发行版,以及 FreeBSD、OpenBSD、NetBSD、macOS,甚至 Windows。访问nodejs.org/en/download/package-manager/获取更多信息。

例如,在 Debian 和其他基于 Debian 的 Linux 发行版(如 Ubuntu)上,使用以下命令:
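
大致的命令如下(以 NodeSource 提供的 14.x 安装脚本为例,具体 URL 请以官方说明为准):

```shell
$ curl -fsSL https://deb.nodesource.com/setup_14.x | sudo -E bash -
$ sudo apt-get install -y nodejs
```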


This adds the NodeSource APT repository to the system, updates the package data, and prepares the system so that you can install Node.js packages. It also instructs us on how to install Node.js and the required compiler and developer tools.

To download other Node.js versions (this example shows version 14.x), modify the URL to suit you:
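
For instance, to get the 12.x line instead (sketch):

```shell
$ curl -fsSL https://deb.nodesource.com/setup_12.x | sudo -E bash -
```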

命令将安装在/usr/bin中,我们可以测试下载的版本是否符合我们的要求。

由于一种名为 Windows 子系统 Linux(WSL)的新工具,Windows 正开始成为 Unix/Linux 极客也可以工作的地方。

在 WSL 中安装 Node.js

WSL允许您在 Windows 上安装 Ubuntu、openSUSE 或 SUSE Linux Enterprise。所有这三个都可以通过内置到 Windows 10 中的商店获得。您可能需要更新 Windows 设备才能进行安装。为了获得最佳体验,请安装 WSL2,这是 WSL 的一次重大改进,提供了 Windows 和 Linux 之间更好的集成。

安装完成后,Linux 特定的说明将在 Linux 子系统中安装 Node.js。

要安装 WSL,请参阅msdn.microsoft.com/en-us/commandline/wsl/install-win10

要了解并安装 WSL2,请参阅docs.microsoft.com/en-us/windows/wsl/wsl2-index

在 Windows 上,该过程可能需要提升的权限。

在 Windows 上打开具有管理员特权的 PowerShell

在 Windows 上安装工具时,您将运行一些命令需要在具有提升权限的 PowerShell 窗口中执行。我们提到这一点是因为在启用 WSL 的过程中,需要在 PowerShell 窗口中运行一个命令。

该过程很简单:

  1. 在“开始”菜单中,在应用程序的搜索框中输入PowerShell。生成的菜单将列出 PowerShell。

  2. 右键单击 PowerShell 条目。

  3. 弹出的上下文菜单将有一个名为“以管理员身份运行”的条目。点击它。

生成的命令窗口将具有管理员特权,并且标题栏将显示管理员:Windows PowerShell。

在某些情况下,您将无法使用软件包管理系统中的 Node.js。

从 nodejs.org 安装 Node.js 发行版

nodejs.org/en/网站提供了 Windows、macOS、Linux 和 Solaris 的内置二进制文件。我们只需转到该网站,单击安装按钮,然后运行安装程序。对于具有软件包管理器的系统,例如我们刚刚讨论的系统,最好使用软件包管理系统。这是因为您会发现更容易保持最新版本。但是,由于以下原因,这并不适用于所有人:

  • 有些人更喜欢安装二进制文件,而不是使用软件包管理器。

  • 他们选择的系统没有软件包管理系统。

  • 他们的软件包管理系统中的 Node.js 实现已经过时。

只需转到 Node.js 网站,您将看到以下屏幕截图中的内容。该页面会尽力确定您的操作系统并提供适当的下载。如果您需要其他内容,请单击标题中的 DOWNLOADS 链接以获取所有可能的下载:

对于 macOS,安装程序是一个PKG文件,提供了典型的安装过程。对于 Windows,安装程序只需按照典型的安装向导过程进行。

安装程序完成后,您将拥有命令行工具,例如nodenpm,您可以使用它们来运行 Node.js 程序。在 Windows 上,您将获得一个预配置为与 Node.js 良好配合工作的 Windows 命令外壳版本。

正如您刚刚了解的,我们大多数人将完全满意于安装预构建的软件包。但是,有时我们必须从源代码安装 Node.js。

在类似 POSIX 的系统上从源代码安装

安装预打包的 Node.js 发行版是首选的安装方法。但是,在一些情况下,从源代码安装 Node.js 是可取的:

  • 它可以让您根据需要优化编译器设置。

  • 它可以让您交叉编译,比如为嵌入式 ARM 系统。

  • 您可能需要保留多个 Node.js 版本进行测试。

  • 您可能正在处理 Node.js 本身。

现在您已经有了一个高层次的视图,让我们通过构建脚本来动手实践。一般的过程遵循您可能已经在其他开源软件包上执行过的 configure、make、make install 例程。如果没有,也不用担心,我们会指导您完成这个过程。

官方安装说明在源分发的README.md中,位于github.com/nodejs/node/blob/master/README.md

安装先决条件

有三个先决条件:C 编译器、Python 和 OpenSSL 库。Node.js 编译过程会检查它们的存在,如果 C 编译器或 Python 不存在,将会失败。这些命令将检查它们的存在:
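
例如,可以用类似下面的命令确认它们是否存在(输出因系统而异):

```shell
$ cc --version
$ python --version
```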


Go to [`github.com/nodejs/node/blob/master/BUILDING.md`](https://github.com/nodejs/node/blob/master/BUILDING.md) for details on the requirements.

The specific method for installing these depends on your OS.

The Node.js build tools are in the process of being updated to support Python 3.x. Python 2.x is in an end-of-life process, slated for the end of 2019, so it is therefore recommended that you update to Python 3.x.

Before we can compile the Node.js source, we must have the correct tools installed and on macOS, there are a couple of special considerations.

## Installing developer tools on macOS

Developer tools (such as GCC) are an optional installation on macOS. Fortunately, they're easy to acquire.

You start with Xcode, which is available for free through the Macintosh app store. Simply search for `Xcode` and click on the Get button. Once you have Xcode installed, open a Terminal window and type the following:
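
That command (the standard way to pull in the command-line tools) is:

```shell
$ xcode-select --install
```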

这将安装 Xcode 命令行工具:

有关更多信息,请访问osxdaily.com/2014/02/12/install-command-line-tools-mac-os-x/

现在我们已经安装了所需的工具,我们可以继续编译 Node.js 源代码。

为所有类 POSIX 系统从源代码安装

从源代码编译 Node.js 遵循以下熟悉的过程:

  1. 从 nodejs.org/download 下载源代码。

  2. 使用./configure配置源代码进行构建。

  3. 运行make,然后运行make install

源代码包可以通过浏览器下载,或者按照以下步骤进行替换您喜欢的版本:
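
大致步骤如下(版本号 14.0.0 仅为示例,请替换为你需要的版本):

```shell
$ mkdir src && cd src
$ wget https://nodejs.org/dist/v14.0.0/node-v14.0.0.tar.gz
$ tar xvfz node-v14.0.0.tar.gz
$ cd node-v14.0.0
```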


Now, we configure the source so that it can be built. This is just like with many other open source packages and there is a long list of options to customize the build:
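
To see the full list of options (sketch):

```shell
$ ./configure --help
```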

要将其安装到您的 home 目录中,请以这种方式运行:
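
例如(安装前缀仅作示例):

```shell
$ ./configure --prefix=$HOME/node/14.0.0
```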


If you're going to install multiple Node.js versions side by side, it's useful to put the version number in the path like this. That way, each version will sit in a separate directory. It will then be a simple matter of switching between Node.js versions by changing the `PATH` variable appropriately:
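
For example (adjust the directory to match your prefix):

```shell
$ export PATH=${HOME}/node/14.0.0/bin:${PATH}
```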

安装多个 Node.js 版本的更简单方法是使用nvm脚本,稍后将进行描述。

如果你想在系统范围的目录中安装 Node.js,只需省略--prefix选项,它将默认安装在/usr/local中。

过一会儿,它会停止,并且很可能已经成功地配置了源树,以便在您选择的目录中进行安装。如果这不成功,打印出的错误消息将描述需要修复的内容。一旦配置脚本满意,您就可以继续下一步。

配置脚本满意后,您可以编译软件:


If you are installing on a system-wide directory, perform the last step this way instead:
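
That is (sketch):

```shell
$ make
$ sudo make install
```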

安装完成后,您应该确保将安装目录添加到您的PATH变量中,如下所示:
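
以 bash 为例(具体路径取决于你的安装前缀,仅作示例):

```shell
$ export PATH=${HOME}/node/14.0.0/bin:${PATH}
```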


Alternatively, for `csh` users, use this syntax to make an exported environment variable:
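
For example (again, adjust the directory to your install prefix):

```shell
$ setenv PATH ${HOME}/node/14.0.0/bin:${PATH}
```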

安装完成后,它会创建一个目录结构,如下所示:


Now that we've learned how to install Node.js from the source on UNIX-like systems, we get to do the same on Windows.

## Installing from the source on Windows

The `BUILDING.md` document referenced previously has instructions. You can use the build tools from Visual Studio or the full Visual Studio 2017 or 2019 product: 

*   Visual Studio 2019: [`www.visualstudio.com/downloads/`](https://www.visualstudio.com/downloads/)
*   The build tools: [`visualstudio.microsoft.com/downloads/#build-tools-for-visual-studio-2019`](https://visualstudio.microsoft.com/downloads/#build-tools-for-visual-studio-2019)

Three additional tools are required:

*   Git for Windows: [`git-scm.com/download/win`](http://git-scm.com/download/win)  
*   Python: [`www.python.org/`](https://www.python.org/)
*   OpenSSL: [`www.openssl.org/source/`](https://www.openssl.org/source/) and [`wiki.openssl.org/index.php/Binaries`](https://wiki.openssl.org/index.php/Binaries)
*   The **Netwide Assembler** (**NASM**) for OpenSSL: [`www.nasm.us/`](https://www.nasm.us/)

Then, run the included `.\vcbuild` script to perform the build. 

We've learned how to install one Node.js instance, so let's now take it to the next level by installing multiple instances.

# Installing multiple Node.js instances with nvm

Normally, you wouldn't install multiple versions of Node.js—doing so adds complexity to your system. But if you are hacking on Node.js itself or testing your software against different Node.js releases, you may want to have multiple Node.js installations. The method to do so is a simple variation on what we've already discussed.

Earlier, while discussing building Node.js from the source, we noted that you can install multiple Node.js instances in separate directories. It's only necessary to build from the source if you need a customized Node.js build but most folks would be satisfied with pre-built Node.js binaries. They, too, can be installed on separate directories.

Switching between Node.js versions is simply a matter of changing the `PATH` variable (on POSIX systems), as in the following code, using the directory where you installed Node.js:
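
For example (the directory shown is just a placeholder for wherever you installed that version):

```shell
$ export PATH=/usr/local/node/VERSION-NUMBER/bin:${PATH}
```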

在一段时间后,维护这个变得有点乏味。对于每个发布,您都必须在 Node.js 安装中设置 Node.js、npm 和任何第三方模块。此外,显示更改PATH的命令并不是最佳的。富有创造力的程序员已经创建了几个版本管理器,以简化管理多个 Node.js/npm 版本,并提供智能更改PATH的命令:

这些版本管理器都可以同时维护多个 Node.js 版本,并让你轻松地在版本之间切换。安装说明可以在它们各自的网站上找到。

例如,使用nvm,您可以运行这样的命令:
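
下面是一段示意性的会话(版本号、输出与路径都只是示例):

```shell
$ nvm ls
...
       v12.16.3
       v14.2.0
$ nvm use 12
Now using node v12.16.3 (npm v6.14.4)
$ node --version
v12.16.3
$ nvm install 14
...
Now using node v14.2.0 (npm v6.14.5)
$ node --version
v14.2.0
$ nvm which 14
/Users/you/.nvm/versions/node/v14.2.0/bin/node
$ /opt/local/bin/node --version
v14.2.0
```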


In this example, we first listed the available versions. Then, we demonstrated how to switch between Node.js versions, verifying the version changed each time. We also installed and used a new version using `nvm`. Finally, we showed the directory where nvm installs Node.js packages versus Node.js versions that are installed using MacPorts or Homebrew.

This demonstrates that you can have Node.js installed system-wide, keep multiple private Node.js versions managed by `nvm`, and switch between them as needed. When new Node.js versions are released, they are simple to install with `nvm`, even if the official package manager for your OS hasn't yet updated its packages.

## Installing nvm on Windows

Unfortunately, `nvm` doesn't support Windows. Fortunately, a couple of Windows-specific clones of the `nvm` concept exist:

*   Node.js version management utility for Windows: [`github.com/coreybutler/nvm-windows`](https://github.com/coreybutler/nvm-windows)
*   Natural Node.js and npm version manager for Windows: [`github.com/marcelklehr/nodist`](https://github.com/marcelklehr/nodist)

Another route is to use WSL. Because in WSL you're interacting with a Linux command line, you can use `nvm` itself. But let's stay focused on what you can do in Windows.

Many of the examples in this book were tested using the `nvm-windows` application. There are slight behavior differences but it acts largely the same as `nvm` for Linux and macOS. The biggest change is the version number specifier in the `nvm use` and `nvm install` commands.

With `nvm` for Linux and macOS, you can type a simple version number, such as `nvm use 8`, and it will automatically substitute the latest release of the named Node.js version. With `nvm-windows`, the same command acts as if you typed `nvm use 8.0.0`. In other words, with `nvm-windows`, you must use the exact version number. Fortunately, the list of supported versions is easily available using the `nvm list available` command.

Using a tool such as `nvm` simplifies the process of testing a Node.js application against multiple Node.js versions.

Now that we can install Node.js, we need to make sure we are installing any Node.js module that we want to use. This requires having build tools installed on our computer.

# Requirements for installing native code modules

While we won't discuss native code module development in this book, we do need to make sure that they can be built. Some modules in the npm repository are native code and they must be compiled with a C or C++ compiler to build the corresponding `.node` files (the `.node` extension is used for binary native code modules).

The module will often describe itself as a wrapper for some other library. For example, the `libxslt` and `libxmljs` modules are wrappers around the C/C++ libraries of the same name. The module includes the C/C++ source code and when installed, a script is automatically run to do the compilation with `node-gyp`.

The `node-gyp` tool is a cross-platform command-line tool written in Node.js for compiling native add-on modules for Node.js. We've mentioned native code modules several times and it is this tool that compiles them for use with Node.js.

You can easily see this in action by running these commands:
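
For example (sqlite3 is used here purely as an example of a package containing native code; on some platforms it may download a prebuilt binary instead of compiling):

```shell
$ mkdir temp && cd temp
$ npm install sqlite3
```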

这是在临时目录中完成的,所以之后可以删除它。如果您的系统没有安装编译本地代码模块的工具,您将看到错误消息。否则,您将看到node-gyp的执行输出,然后是许多明显与编译 C/C++文件相关的文本行。

node-gyp工具具有与从源代码编译 Node.js 相似的先决条件,即 C/C++编译器、Python 环境和其他构建工具,如 Git。对于 Unix、macOS 和 Linux 系统,这些都很容易获得。对于 Windows,您应该安装前面"Installing from the source on Windows"一节中列出的 Visual Studio 构建工具、Git 和 Python。

通常,您不需要担心安装node-gyp。这是因为它作为 npm 的一部分在后台安装。这样做是为了让 npm 可以自动构建本地代码模块。

它的 GitHub 存储库包含文档;转到github.com/nodejs/node-gyp

阅读node-gyp存储库中的文档将让您更清楚地了解之前讨论的编译先决条件和开发本地代码模块。

这是一个非显式依赖的示例。最好明确声明软件包依赖的所有内容。在 Node.js 中,依赖关系在package.json中声明,以便包管理器(npmyarn)可以下载和设置所有内容。但是这些编译器工具是由操作系统包管理系统设置的,这是npmyarn无法控制的。因此,我们无法明确声明这些依赖关系。

我们刚刚了解到 Node.js 不仅支持用 JavaScript 编写的模块,还支持其他编程语言。我们还学会了如何支持这些模块的安装。接下来,我们将了解 Node.js 版本号。

选择要使用的 Node.js 版本和版本策略

在上一节中,我们提到了许多不同的 Node.js 版本号,您可能会对要使用哪个版本感到困惑。本书针对的是 Node.js 版本 14.x,并且预计我们将涵盖的所有内容都与 Node.js 10.x 和任何后续版本兼容。

从 Node.js 4.x 开始,Node.js 团队采用了双轨道方法。偶数版本(4.x、6.x、8.x 等)被称为长期支持(LTS)版本,而奇数版本(5.x、7.x、9.x 等)是当前新功能开发的地方。虽然开发分支保持稳定,但 LTS 版本被定位为用于生产使用,并将在几年内接收更新。

在撰写本文时,Node.js 12.x 是当前的 LTS 版本;Node.js 14.x 已发布,最终将成为 LTS 版本。

每个新的 Node.js 发布的主要影响,除了通常的性能改进和错误修复之外,还包括引入最新的 V8 JavaScript 引擎发布。反过来,这意味着引入更多的 ES2015/2016/2017 功能,因为 V8 团队正在实现它们。在 Node.js 8.x 中,async/await函数到达,在 Node.js 10.x 中,支持标准的 ES6 模块格式到达。在 Node.js 14.x 中,该模块格式将得到完全支持。

一个实际的考虑是新的 Node.js 发布是否会破坏您的代码。新的语言功能总是在 V8 赶上 ECMAScript 的过程中添加,Node.js 团队有时会对 Node.js API 进行重大更改。如果您在一个 Node.js 版本上进行了测试,它是否会在较早的版本上工作?Node.js 的更改是否会破坏我们的一些假设?

npm 的作用是确保我们的软件包在正确的 Node.js 版本上执行。这意味着我们可以在package.json文件中指定软件包的兼容 Node.js 版本(我们将在第三章,探索 Node.js 模块中探讨)。

我们可以在package.json中添加条目如下:
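
例如(版本范围的写法仅作示例,表示需要 8.x 或更高版本):

```json
"engines": {
  "node": ">=8.0.0"
}
```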


This means exactly what it implies—that the given package is compatible with Node.js version 8.x or later.

Of course, your development environment(s) could have several Node.js versions installed. You'll need the version your software is declared to support, plus any later versions you wish to evaluate.

We have just learned how the Node.js community manages releases and version numbers. Our next step is to discuss which editor to use.

# Choosing editors and debuggers for Node.js

Since Node.js code is JavaScript, any JavaScript-aware editor will be useful. Unlike some other languages that are so complex that an IDE with code completion is a necessity, a simple programming editor is perfectly sufficient for Node.js development.

Two editors are worth shouting out because they are written in Node.js: Atom and Microsoft Visual Studio Code. 

Atom ([`atom.io/`](https://atom.io/)) describes itself as a hackable editor for the 21st century. It is extendable by writing Node.js modules using the Atom API and the configuration files are easily editable. In other words, it's hackable in the same way plenty of other editors have been—going back to Emacs, meaning you write a software module to add capabilities to the editor. The Electron framework was invented in order to build Atom and it is a super-easy way of building desktop applications using Node.js.

Microsoft Visual Studio Code ([`code.visualstudio.com/`](https://code.visualstudio.com/)) is a hackable editor (well, the home page says extensible and customizable, which means the same thing) that is also open source and implemented in Electron. However, it's not a hollow me-too editor, copying Atom while adding nothing of its own. Instead, Visual Studio Code is a solid programmer's editor in its own right, bringing interesting functionality to the table.

As for debuggers, there are several interesting choices. Starting with Node.js 6.3, the `inspector` protocol has made it possible to use the Google Chrome debugger. Visual Studio Code has a built-in debugger that also uses the `inspector` protocol.

For a full list of debugging options and tools, see [`nodejs.org/en/docs/guides/debugging-getting-started/`](https://nodejs.org/en/docs/guides/debugging-getting-started/).

Another task related to the editor is adding extensions to help with the editing experience. Most programmer-oriented editors allow you to extend the behavior and assist with writing the code. A trivial example is syntax coloring for JavaScript, CSS, HTML, and so on. Code completion extensions are where the editor helps you write the code. Some extensions scan code for common errors; often these extensions use the word *lint*. Some extensions help to run unit test frameworks. Since there are so many editors available, we cannot provide specific suggestions.  

For some, the choice of programming editor is a serious matter defended with fervor, so we carefully recommend that you use whatever editor you prefer, as long as it helps you edit JavaScript code. Next, we will learn about the Node.js commands and a little about running Node.js scripts.

# Running and testing commands

Now that you've installed Node.js, we want to do two things—verify that the installation was successful and familiarize ourselves with the Node.js command-line tools and running simple scripts with Node.js. We'll also touch again on `async` functions and look at a simple example HTTP server. We'll finish off with the `npm` and `npx` command-line tools.

## Using Node.js's command-line tools

The basic installation of Node.js includes two commands: `node` and `npm`. We've already seen the `node` command in action. It's used either for running command-line scripts or server processes. The other, `npm`, is a package manager for Node.js.

The easiest way to verify that your Node.js installation works is also the best way to get help with Node.js. Type the following command:
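
That command is simply:

```shell
$ node --help
```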

输出很多,但不要过于仔细研究。关键是node --help提供了很多有用的信息。

请注意,Node.js 和 V8 都有选项(在上一个命令行中未显示)。请记住 Node.js 是建立在 V8 之上的;它有自己的选项宇宙,主要关注字节码编译、垃圾回收和堆算法的细节。输入node --v8-options以查看这些选项的完整列表。

在命令行上,您可以指定选项、单个脚本文件和该脚本的参数列表。我们将在下一节使用 Node.js 运行简单脚本中进一步讨论脚本参数。

在没有参数的情况下运行 Node.js 会将您放在一个交互式 JavaScript shell 中:
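
例如(提示符后的输入与输出仅作示例):

```shell
$ node
> console.log('Hello, World!');
Hello, World!
undefined
>
```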


Any code you can write in a Node.js script can be written here. The command interpreter gives a good terminal-oriented user experience and is useful for interactively playing with your code. You do play with your code, don't you? Good!

## Running a simple script with Node.js

Now, let's look at how to run scripts with Node.js. It's quite simple; let's start by referring to the help message shown previously. The command-line pattern is just a script filename and some script arguments, which should be familiar to anyone who has written scripts in other languages.

Creating and editing Node.js scripts can be done with any text editor that deals with plain text files, such as VI/VIM, Emacs, Notepad++, Atom, Visual Studio Code, Jedit, BB Edit, TextMate, or Komodo. It's helpful if it's a programmer-oriented editor, if only for the syntax coloring.

For this and other examples in this book, it doesn't truly matter where you put the files. However, for the sake of neatness, you can start by making a directory named `node-web-dev` in the `home` directory of your computer and inside that, creating one directory per chapter (for example, `chap02` and `chap03`).

First, create a text file named `ls.js` with the following content:
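
A minimal version consistent with the description that follows (using fs.promises inside an async function) might look like this:

```js
const fs = require('fs').promises;

async function listFiles() {
  try {
    // Read the current directory and print each entry
    const files = await fs.readdir('.');
    for (const file of files) {
      console.log(file);
    }
  } catch (err) {
    console.error(err);
  }
}

listFiles();
```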

接下来,通过输入以下命令来运行它:
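
例如(输出的文件名取决于目录内容,仅作示例):

```shell
$ node ls.js
app.js
ls.js
ls2.js
```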


This is a pale and cheap imitation of the Unix `ls` command (as if you couldn't figure that out from the name!). The `readdir` function is a close analog to the Unix `readdir` system call used to list the files in a directory. On Unix/Linux systems, we can run the following command to learn more:
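
That is (section 3 of the manual covers the C library):

```shell
$ man 3 readdir
```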

当然,man命令让你阅读手册页,第3节涵盖了 C 库。

在函数体内,我们读取目录并打印其内容。使用require('fs').promises给我们提供了一个返回 Promise 的fs模块(文件系统函数)的版本;因此,在异步函数中它可以很好地工作。同样,ES2015 的for..of循环构造让我们能够以一种适合在async函数中工作的方式循环遍历数组中的条目。

默认情况下,fs模块函数使用最初为 Node.js 创建的回调范式。因此,大多数 Node.js 模块使用回调范式。在async函数中,如果函数返回 Promise,那么更方便使用await关键字。util模块提供了一个函数,util.promisify,它为旧式的面向回调的函数生成一个包装函数,因此它返回一个 Promise。

这个脚本是硬编码为列出当前目录中的文件。真正的ls命令需要一个目录名,所以让我们稍微修改一下脚本。

命令行参数会落入一个名为process.argv的全局数组中。因此,我们可以修改ls.js,将其复制为ls2.js(如下所示)来看看这个数组是如何工作的:
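
一个与下文描述一致的最小版本大致如下(仅作示意):

```js
const fs = require('fs').promises;

async function listFiles() {
  try {
    let dir = '.';
    if (process.argv[2]) dir = process.argv[2];   // 允许通过命令行指定目录
    const files = await fs.readdir(dir);
    for (const file of files) {
      console.log(file);
    }
  } catch (err) {
    console.error(err);
  }
}

listFiles();
```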


You can run it as follows:
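
For example (the directory and its output are just examples):

```shell
$ node ls2.js /usr
bin
lib
local
share
```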

我们只是检查了命令行参数是否存在,if (process.argv[2])。如果存在,我们会覆盖dir变量的值,dir = process.argv[2],然后将其用作readdir的参数:
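
如果传入一个不存在的目录,输出大致如下(具体错误格式随 Node.js 版本而异,仅作示例):

```shell
$ node ls2.js /nonexistent
Error: ENOENT: no such file or directory, scandir '/nonexistent'
```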


If you give it a non-existent directory pathname, an error will be thrown and printed using the `catch` clause. 

### Writing inline async arrow functions

There is a different way to write these examples that some feel is more concise. These examples were written as a regular function—with the `function` keyword—but with the `async` keyword in front. One of the features that came with ES2015 is the arrow function, which lets us streamline the code a little bit.

Combined with the `async` keyword, an async arrow function looks like this:
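
A sketch (the function and parameter names here are just placeholders):

```js
const listFiles = async (dir) => {
  // the body behaves exactly like any other async function
};
```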

你可以在任何地方使用这个;例如,该函数可以被分配给一个变量,或者它可以作为回调传递给另一个函数。当与async关键字一起使用时,箭头函数的主体具有所有async函数的行为。

为了这些示例的目的,可以将异步箭头函数包装为立即执行:


The final parenthesis causes the inline function to immediately be invoked.

Then, because `async` functions return a Promise, it is necessary to add a `.catch` block to catch errors. With all that, the example looks as follows:
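
A sketch of the directory-listing example rewritten in this style:

```js
const fs = require('fs').promises;

(async () => {
  let dir = '.';
  if (process.argv[2]) dir = process.argv[2];
  const files = await fs.readdir(dir);
  for (const file of files) {
    console.log(file);
  }
})().catch(err => { console.error(err); });
```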

也许这种风格或者之前的风格更可取。然而,你会发现这两种风格都在使用中,了解这两种风格的工作方式是必要的。

在脚本的顶层调用异步函数时,有必要捕获任何错误并报告它们。未能捕获和报告错误可能导致难以解决的神秘问题。在这个示例的原始版本中,错误是通过try/catch块明确捕获的。在这个版本中,我们使用.catch块捕获错误。

在我们拥有异步函数之前,我们有 Promise 对象,而在那之前,我们有回调范式。所有三种范式在 Node.js 中仍在使用,这意味着你需要理解每一种。

转换为异步函数和 Promise 范式

在上一节中,我们讨论了util.promisify及其将面向回调的函数转换为返回 Promise 的能力。后者与异步函数很好地配合,因此,最好让函数返回一个 Promise。

更准确地说,util.promisify应该给出一个使用错误优先回调范式的函数。这些函数的最后一个参数是一个回调函数,其第一个参数被解释为错误指示器,因此有了错误优先回调这个短语。util.promisify返回的是另一个返回 Promise 的函数。

Promise 的作用与错误优先回调相同。如果指示了错误,则 Promise 解析为拒绝状态,而如果指示了成功,则 Promise 解析为成功状态。正如我们在这些示例中看到的那样,Promise 在async函数中处理得非常好。

Node.js 生态系统拥有大量使用错误优先回调的函数。社区已经开始了一个转换过程,其中函数将返回一个 Promise,并可能还会接受一个错误优先回调以实现 API 兼容性。

Node.js 10 中的一个新功能就是这样的转换的一个例子。在fs模块中有一个名为fs.promises的子模块,具有相同的 API,但产生 Promise 对象。我们使用该 API 编写了前面的示例。

另一个选择是第三方模块fs-extra。该模块具有超出标准fs模块的扩展 API。一方面,如果没有提供回调函数,它的函数会返回一个 Promise,否则会调用回调函数。此外,它还包括几个有用的函数。

在本书的其余部分,我们经常使用fs-extra,因为它具有额外的功能。有关该模块的文档,请访问www.npmjs.com/package/fs-extra

util模块还有另一个函数util.callbackify,它的功能与其名称暗示的一样——它将返回 Promise 的函数转换为使用回调函数的函数。

现在我们已经看到如何运行一个简单的脚本,让我们来看一个简单的 HTTP 服务器。

使用 Node.js 启动服务器

你将运行许多服务器进程的脚本;我们稍后将运行许多这样的脚本。由于我们仍在尝试验证安装并让你熟悉使用 Node.js,我们想要运行一个简单的 HTTP 服务器。让我们借用 Node.js 首页上的简单服务器脚本(nodejs.org)。

创建一个名为app.js的文件,其中包含以下内容:
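
与 nodejs.org 首页示例一致的一个最小版本如下(端口 8124 与下文的访问地址对应):

```js
const http = require('http');

const hostname = '127.0.0.1';
const port = 8124;

const server = http.createServer((req, res) => {
  res.statusCode = 200;
  res.setHeader('Content-Type', 'text/plain');
  res.end('Hello, World!\n');
});

server.listen(port, hostname, () => {
  console.log(`Server running at http://${hostname}:${port}/`);
});
```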


Run it as follows:
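
That is:

```shell
$ node app.js
Server running at http://127.0.0.1:8124/
```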

这是你可以用 Node.js 构建的最简单的网络服务器。如果你对它的工作原理感兴趣,请翻到第四章,HTTP 服务器和客户端,第五章,你的第一个 Express 应用程序,和第六章,实现移动优先范式。但现在,只需在浏览器中键入http://127.0.0.1:8124,就可以看到 Hello, World!的消息:

一个值得思考的问题是为什么这个脚本在ls.js退出时没有退出。在两种情况下,脚本的执行都到达了文件的末尾;Node.js 进程在app.js中没有退出,而在ls.js中退出了。

这是因为存在活动事件监听器。Node.js 始终启动一个事件循环,在app.js中,listen函数创建了一个实现 HTTP 协议的事件listener。这个listener事件会一直保持app.js运行,直到你做一些事情,比如在终端窗口中按下Ctrl + C。在ls.js中,没有任何内容来创建一个长时间运行的listener事件,所以当ls.js到达脚本的末尾时,node进程将退出。

要使用 Node.js 执行更复杂的任务,我们必须使用第三方模块,而 npm 存储库正是寻找这些模块的地方。

使用 npm,Node.js 包管理器

Node.js 作为一个具有一些有趣的异步 I/O 库的 JavaScript 解释器,本身就是一个相当基本的系统。使 Node.js 有趣的事情之一是不断增长的用于 Node.js 的第三方模块生态系统。

在这个生态系统的中心是 npm 模块存储库。虽然 Node.js 模块可以作为源代码下载并手动组装以供 Node.js 程序使用,但这样做很麻烦,而且很难实现可重复的构建过程。npm 为我们提供了一个更简单的方法;npm 是 Node.js 的事实标准包管理器,它极大地简化了下载和使用这些模块。我们将在下一章详细讨论 npm。

你们中的敏锐者可能已经注意到,npm 已经通过之前讨论的所有安装方法安装了。过去,npm 是单独安装的,但今天它与 Node.js 捆绑在一起。

现在我们已经安装了npm,让我们快速试一下。hexy程序是一个用于打印文件的十六进制转储的实用程序。这是一个非常 70 年代的事情,但它仍然非常有用。它现在正好符合我们的目的,因为它可以让我们快速安装和尝试:


Adding the `-g` flag makes the module available globally, irrespective of the present working directory of your command shell. A global install is most useful when the module provides a command-line interface. When a package provides a command-line script, `npm` sets that up. For a global install, the command is installed correctly for use by all users of the computer.

Depending on how Node.js is installed for you, it may need to be run with `sudo`:

安装完成后,您可以以以下方式运行新安装的程序:


The `hexy` command was installed as a global command, making it easy to run.

Again, we'll be doing a deep dive into npm in the next chapter. The `hexy` utility is both a Node.js library and a script for printing out these old-style hex dumps.

In the open source world, a perceived need often leads to creating an open source project. The folks who launched the Yarn project saw needs that weren't being addressed by npm and created an alternative package manager tool. They claim a number of advantages over npm, primarily in the area of performance. To learn more about Yarn, go to [`yarnpkg.com/`](https://yarnpkg.com/).

For every example in this book that uses npm, there is a close equivalent command that uses Yarn.

For npm-packaged command-line tools, there is another, simpler way to use the tool.

## Using npx to execute Node.js packaged binaries

Some packages in the npm repository are command-line tools, such as the `hexy` program we looked at earlier. Having to first install such a program before using it is a small hurdle. The sharp-eyed among you will have noticed that `npx` is installed alongside the `node` and `npm` commands when installing Node.js. This tool is meant to simplify running command-line tools from the npm repository by removing the need to first install the package.

The previous example could have been run this way:

在底层,`npx`使用 npm 将包下载到一个缓存目录中,除非该包已经安装在当前项目目录中。由于包随后就保存在缓存目录里,因此只需下载一次。

这个工具有很多有趣的选项;要了解更多,请访问www.npmjs.com/package/npx

在本节中,我们已经学到了有关 Node.js 提供的命令行工具,以及运行简单脚本和 HTTP 服务器的知识。接下来,我们将学习 JavaScript 语言的进步如何影响 Node.js 平台。

用 ECMAScript 2015、2016、2017 和以后推进 Node.js

2015 年,ECMAScript 委员会发布了 JavaScript 语言的一个期待已久的重大更新。更新为 JavaScript 带来了许多新功能,如 Promises、箭头函数和类对象。这个语言更新为改进奠定了基础,因为它应该大大提高我们编写清晰、易懂的 JavaScript 代码的能力。

浏览器制造商正在添加这些亟需的功能,这也意味着 V8 引擎在不断加入这些功能。而从 4.x 版本开始,这些功能也陆续进入了 Node.js。

要了解 Node.js 中 ES2015/2016/2017 等的当前状态,请访问nodejs.org/en/docs/es6/

默认情况下,Node.js 只启用 V8 认为已经稳定的 ES2015、2016 和 2017 功能。其他功能可以通过命令行选项启用;接近完成但尚未稳定的功能可以通过`--es_staging`选项启用。网站文档提供了更多信息。

Node green 网站(node.green/)有一张表格列出了 Node.js 版本中许多功能的状态。

ES2019 语言规范发布在www.ecma-international.org/publications/standards/Ecma-262.htm

TC-39 委员会在 GitHub 上进行工作,网址为github.com/tc39

ES2015(以及之后)的功能对 JavaScript 语言有很大的改进。其中一个功能,Promise类,应该意味着 Node.js 编程中常见习语的根本性重新思考。在 ES2017 中,一对新关键字,asyncawait,简化了在 Node.js 中编写异步代码,这应该鼓励 Node.js 社区进一步重新思考平台的常见习语。

JavaScript 有很多新功能,但让我们快速浏览其中两个我们将大量使用的功能。

第一个是称为箭头函数的轻量级函数语法:
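A couple of illustrative snippets:

```js
// A regular anonymous function passed as a callback:
setTimeout(function () { console.log('done'); }, 100);

// The same callback written as an arrow function:
setTimeout(() => { console.log('done'); }, 100);

// Single-expression arrow functions can omit the braces and the return:
const square = x => x * x;
```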


This is more than the syntactic sugar of replacing the `function` keyword with the fat arrow. Arrow functions are lighter weight as well as being easier to read. The lighter weight comes at the cost of changing the value of `this` inside the arrow function. In regular functions, `this` has a unique value inside the function. In an arrow function, `this` has the same value as the scope containing the arrow function. This means that, when using an arrow function, we don't have to jump through hoops to bring `this` into the callback function because `this` is the same at both levels of the code.

The next feature is the `Promise` class, which is used for deferred and asynchronous computations. Deferred code execution to implement asynchronous behavior is a key paradigm for Node.js and it requires two idiomatic conventions:

*   The last argument to an asynchronous function is a callback function, which is called when an asynchronous execution is to be performed.
*   The first argument to the callback function is an error indicator.

While convenient, these conventions have resulted in multilayer code pyramids that can be difficult to understand and maintain:
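A sketch of the shape such code takes; `doThis`, `doThat`, `doSomethingElse`, and the `arg` values are made-up placeholders standing in for any callback-style asynchronous operations:

```js
doThis(arg1, (err, result1) => {
  if (err) throw err;
  doThat(arg2, (err, result2) => {
    if (err) throw err;
    doSomethingElse(arg3, (err, result3) => {
      if (err) throw err;
      // only here are all three results finally available
    });
  });
});
```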

您不需要逐行理解这段代码;它只是展示了我们使用回调时,实践中代码会呈现的样子。根据特定任务所需的步骤数量,代码金字塔可能会变得非常深。Promise 可以让我们解开这种代码金字塔,并提高可靠性,因为错误处理更直接,所有错误都可以被轻松捕获。

Promise类的创建如下:
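Using the same made-up `doThis` operation, a Promise is created roughly like so:

```js
function doThis(arg1) {
  return new Promise((resolve, reject) => {
    // perform the asynchronous operation, then call either
    // resolve(result) on success, or reject(err) on failure
  });
}
```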


Rather than passing in a callback function, the caller receives a `Promise` object. When properly utilized, the preceding pyramid can be coded as follows:
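A sketch, again using the made-up operation names:

```js
doThis(arg1)
  .then(result1 => doThat(arg2))
  .then(result2 => doSomethingElse(arg3))
  .then(result3 => {
    // the final result is available here
  })
  .catch(err => { console.error(err); });
```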

这样写之所以可行,是因为只要`then`函数返回一个`Promise`对象,`Promise`类就支持链式调用。

async/await功能实现了Promise类的承诺,简化了异步编码。这个功能在async函数中变得活跃:
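The general shape is a function declared with the `async` keyword (the name here is a placeholder):

```js
async function mumble() {
  // await may be used here, and the return value is wrapped in a Promise
}
```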


An `async` arrow function is as follows: 
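Again with a placeholder name:

```js
const mumble = async () => {
  // the body behaves exactly like the body of an async function
};
```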

为了看到async函数范式给我们带来了多大的改进,让我们将之前的示例重新编码如下:
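A sketch with the same made-up operations and placeholder arguments:

```js
async function doAll() {
  const result1 = await doThis(arg1);
  const { field1, field2 } = await doThat(arg2);
  const result3 = await doSomethingElse(arg3);
  // all of the results are available here, and any error is thrown
}

doAll().catch(err => { console.error(err); });
```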


Again, we don't need to understand the code but just look at its shape. Isn't this a breath of fresh air compared to the nested structure we started with?

The `await` keyword is used with a Promise. It automatically waits for the Promise to resolve. If the Promise resolves successfully, then the value is returned and if it resolves with an error, then that error is thrown. Both handling results and throwing errors are handled in the usual manner.

This example also shows another ES2015 feature: destructuring. The fields of an object can be extracted using the following code:
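For instance (the object and field names here are purely illustrative):

```js
const { first, second } = { first: 1, second: 10, third: 100 };
// first === 1 and second === 10; the third field is simply ignored
```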

这演示了一个具有三个字段的对象,但只提取了两个字段。

为了继续探索 JavaScript 的进步,让我们来看看 Babel。

使用 Babel 使用实验性 JavaScript 功能

Babel 转译器是使用尖端 JavaScript 功能或尝试新 JavaScript 功能的主要工具。由于您可能从未见过转译器这个词,它的意思是将源代码从一种语言重写为另一种语言。它类似于编译器,Babel 将计算机源代码转换为另一种形式,但是 Babel 生成的是 JavaScript,而不是直接可执行代码。也就是说,它将 JavaScript 代码转换为 JavaScript 代码,这可能看起来没有用,直到您意识到 Babel 的输出可以针对旧的 JavaScript 版本。

更简单地说,Babel 可以配置为将具有 ES2015、ES2016、ES2017(等等)功能的代码重写为符合 ES5 版本 JavaScript 的代码。由于 ES5 JavaScript 与几乎所有旧计算机上的网络浏览器兼容,开发人员可以使用现代 JavaScript 编写其前端代码,然后使用 Babel 将其转换为在旧浏览器上执行。

要了解更多关于 Babel 的信息,请访问babeljs.io

Node Green 网站明确显示,Node.js 支持几乎所有的 ES2015、2016 和 2017 功能。因此,实际上,我们不再需要为 Node.js 项目使用 Babel。不过,如果你需要支持旧版本的 Node.js,仍然可以借助 Babel 来实现。

对于网络浏览器来说,从一组 ECMAScript 功能发布,到我们可以在浏览器端代码中可靠地使用这些功能,两者之间的时间延迟要长得多。这并不是因为浏览器制造商采用新功能的速度慢,谷歌、Mozilla 和微软的团队都在积极采用最新功能;不幸的是,苹果的 Safari 团队在采用新功能方面似乎较慢。然而,比这更慢的是新浏览器在用户现有计算机中的普及。

因此,现代 JavaScript 程序员需要熟悉 Babel。

我们还没有准备好展示这些功能的示例代码,但我们可以继续记录 Babel 工具的设置。有关设置文档的更多信息,请访问babeljs.io/docs/setup/并单击 CLI 按钮。

为了简要介绍 Babel,我们将使用它来转译我们之前看到的脚本,以在 Node.js 6.x 上运行。在这些脚本中,我们使用了异步函数,这是 Node.js 6.x 不支持的功能。

在包含ls.jsls2.js的目录中,输入以下命令:


This installs the Babel software, along with a couple of transformation plugins. Babel has a plugin system so that you can enable the transformations required by your project. Our primary goal in this example is converting the `async` functions shown earlier into Generator functions. Generators are a new sort of function introduced with ES2015 that form the foundation for the implementation of `async` functions.

Because Node.js 6.x does not have either the `fs.promises` function or `util.promisify`, we need to make some substitutions to create a file named `ls2-old-school.js`:

我们有之前看过的相同示例,但有一些更改。fs_readdir函数创建一个 Promise 对象,然后调用fs.readdir,确保根据我们得到的结果要么reject要么resolvePromise。这基本上是util.promisify函数所做的。

因为fs_readdir返回一个 Promise,所以await关键字可以做正确的事情,并等待请求成功或失败。这段代码应该在支持async函数的 Node.js 版本上运行。但我们感兴趣的是,也是我们添加fs_readdir函数的原因是它在旧的 Node.js 版本上是如何工作的。

fs_readdir中使用的模式是在async函数上下文中使用基于回调的函数所需的。

接下来,创建一个名为.babelrc的文件,其中包含以下内容:


This file instructs Babel to use the named transformation plugins that we installed earlier. As the name implies, it will transform the `async` functions to `generator` functions.

Because we installed `babel-cli`, a `babel` command is installed, such that we can type the following:

要转译您的代码,请运行以下命令:


This command transpiles the named file, producing a new file. The new file is as follows:

这段代码并不是为了让人类阅读而生成的;其用意是,你只编辑原始源文件,再将其转译成面向目标 JavaScript 引擎的代码。要注意的主要一点是,转译后的代码使用生成器函数(`function*`表示生成器函数)代替`async`函数,并使用`yield`关键字代替`await`关键字。生成器函数究竟是什么,以及`yield`关键字的确切作用,在这里并不重要;只需要知道`yield`大致相当于`await`,而`_asyncToGenerator`函数实现了与 async 函数类似的功能。除此之外,转译后的代码相当清晰,看起来与原始代码相似。

转译后的脚本运行如下:


换句话说,它在旧的 Node.js 版本上运行与`async`版本相同。使用类似的过程,您可以转译使用现代 ES2015(等等)构造编写的代码,以便在旧的 Web 浏览器中运行。

在本节中,我们了解了 JavaScript 语言的进展,特别是 async 函数,然后学习了如何使用 Babel 在旧的 Node.js 版本或旧的 Web 浏览器上使用这些功能。

# 摘要

在本章中,您学习了如何安装 Node.js、如何使用它的命令行工具,以及如何运行 Node.js 服务器。我们也匆匆略过了很多细节,这些细节将在本书后面详细介绍,所以请耐心等待。

具体来说,我们涵盖了下载和编译 Node.js 源代码,安装 Node.js(无论是在家目录中用于开发还是在系统目录中用于部署),以及安装 npm,这是与 Node.js 一起使用的事实上的标准包管理器。我们还看到了如何运行 Node.js 脚本或 Node.js 服务器。然后我们看了 ES2015、2016 和 2017 的新功能。最后,我们看了如何使用 Babel 在您的代码中实现这些功能。

现在我们已经看到如何设置开发环境,我们准备开始使用 Node.js 实现应用程序。第一步是学习 Node.js 应用程序和模块的基本构建模块,即更仔细地查看 Node.js 模块,它们是如何使用的,以及如何使用 npm 来管理应用程序的依赖关系。我们将在下一章中涵盖所有这些内容。


探索 Node.js 模块

模块和包是将应用程序拆分为较小部分的基本构建模块。模块封装了一些功能,主要是 JavaScript 函数,同时隐藏实现细节并为模块公开 API。模块可以由第三方分发并安装供我们的模块使用。已安装的模块称为包。

npm 包存储库是一个庞大的模块库,供所有 Node.js 开发人员使用。在该库中有数十万个包,可以加速您的应用程序开发。

由于模块和包是应用程序的构建模块,了解它们的工作原理对于您在 Node.js 中取得成功至关重要。在本章结束时,您将对 CommonJS 和 ES6 模块有扎实的基础,了解如何在应用程序中构建模块,如何管理第三方包的依赖关系,以及如何发布自己的包。

在本章中,我们将涵盖以下主题:

+   所有类型的 Node.js 模块的定义以及如何构建简单和复杂的模块

+   使用 CommonJS 和 ES2015/ES6 模块以及何时使用每种模块

+   了解 Node.js 如何找到模块和已安装的包,以便更好地构建您的应用程序

+   使用 npm 包管理系统(以及 Yarn)来管理应用程序的依赖关系,发布包,并记录项目的管理脚本

所以,让我们开始吧。

# 定义 Node.js 模块

模块是构建 Node.js 应用程序的基本构建模块。Node.js 模块封装了函数,将细节隐藏在一个受保护的容器内,并公开明确定义的 API。

当 Node.js 创建时,当然还不存在 ES6 模块系统。因此,Ryan Dahl 基于 CommonJS 标准创建了 Node.js 模块系统。到目前为止,我们看到的示例都是按照该格式编写的模块。随着 ES2015/ES2016,为所有 JavaScript 实现创建了一个新的模块格式。这种新的模块格式被前端工程师用于其浏览器 JavaScript 代码,也被 Node.js 工程师和其他 JavaScript 实现使用。

由于 ES6 模块现在是标准模块格式,Node.js **技术指导委员会**(**TSC**)承诺支持 ES6 模块与 CommonJS 格式的一流支持。从 Node.js 14.x 开始,Node.js TSC 兑现了这一承诺。

在 Node.js 平台上应用程序中使用的每个源文件都是一个*模块*。在接下来的几节中,我们将检查不同类型的模块,从 CommonJS 模块格式开始。

在本书中,我们将传统的 Node.js 模块标识为 CommonJS 模块,新的模块格式标识为 ES6 模块。

要开始探索 Node.js 模块,当然要从头开始。

## 检查传统的 Node.js 模块格式

我们已经在上一章中看到了 CommonJS 模块的实际应用。现在是时候看看它们是什么以及它们是如何工作的了。

在第二章中的`ls.js`示例中,*设置 Node.js*,我们编写了以下代码来引入`fs`模块,从而可以访问其函数:

The require function is given a module identifier, and it searches for the module named by that identifier. If found, it loads the module definition into the Node.js runtime, making its functions available. In this case, the fs object contains the code (and data) exported by the fs module. The fs module is part of the Node.js core and provides filesystem functions.

By declaring fs as const, we have a little bit of assurance against making coding mistakes. We could mistakenly assign a value to fs, and then the program would fail, but as a const we know the reference to the fs module will not be changed.

The file, ls.js, is itself a module because every source file we use on Node.js is a module. In this case, it does not export anything but is instead a script that consumes other modules.

What does it mean to say the fs object contains the code exported by the fs module? In a CommonJS module, there is an object, module, provided by Node.js, with which the module's author describes the module. Within this object is a field, module.exports, containing the functions and data exported by the module. The return value of the require function is that module.exports object. That object is the interface provided by the module to other modules. Anything added to the module.exports object is available to other pieces of code, and everything else is hidden. As a convenience, the module.exports object is also available as exports.

The module object contains several fields that you might find useful. Refer to the online Node.js documentation for details.

Because exports is an alias of module.exports, the following two lines of code are equivalent:
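For example (the function name is illustrative):

```js
exports.funcName = function (arg, arg2) { /* ... */ };
module.exports.funcName = function (arg, arg2) { /* ... */ };
```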


您可以选择使用`module.exports`还是`exports`。但是,绝对不要做以下类似的事情:
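A sketch of the anti-pattern:

```js
// WRONG – reassigning exports breaks the alias to module.exports
exports = function (arg) { /* ... */ };
```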

Any assignment to exports will break the alias, and it will no longer be equivalent to module.exports. Assignments to exports.something are okay, but assigning to exports will cause failure. If your intent is to assign a single object or function to be returned by require, do this instead:
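A sketch:

```js
module.exports = function (arg) { /* ... */ };
```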


有些模块导出单个函数,因为这是模块作者设想提供所需功能的方式。

当我们说`ls.js`没有导出任何内容时,我们的意思是`ls.js`没有将任何内容分配给`module.exports`。

为了给我们一个简单的例子,让我们创建一个简单的模块,名为`simple.js`:
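A minimal version of `simple.js`:

```js
let count = 0;

exports.next = function () { return ++count; };
```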

We have one variable, count, which is not attached to the exports object, and a function, next, which is attached. Because count is not attached to exports, it is private to the module.

Any module can have private implementation details that are not exported and are therefore not available to any other code.

Now, let's use the module we just wrote:
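One way to try it is in the Node.js REPL (the session below is illustrative):

```
$ node
> const s = require('./simple');
undefined
> s.next()
1
> s.next()
2
> s.count
undefined
```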


模块中的`exports`对象是由`require('./simple')`返回的对象。因此,每次调用`s.next`都会调用`simple.js`中的`next`函数。每次返回(并递增)局部变量`count`的值。试图访问私有字段`count`会显示它在模块外部不可用。

这就是 Node.js 解决基于浏览器的 JavaScript 的全局对象问题的方式。看起来像全局变量的变量只对包含该变量的模块是全局的。这些变量对任何其他代码都不可见。

Node.js 包格式源自 CommonJS 模块系统([`commonjs.org`](http://commonjs.org))。在开发时,CommonJS 团队的目标是填补 JavaScript 生态系统中的空白。当时,没有标准的模块系统,使得打包 JavaScript 应用程序变得更加棘手。`require`函数、`exports`对象和 Node.js 模块的其他方面直接来自 CommonJS `Modules/1.0`规范。

`module`对象是由 Node.js 注入的全局模块对象。它还注入了另外两个变量:`__dirname`和`__filename`。这些对于帮助模块中的代码知道其在文件系统中的位置非常有用。主要用于使用相对于模块位置的路径加载其他文件。

例如,可以将像 CSS 或图像文件这样的资源存储在相对于模块的目录中。然后应用框架可以通过 HTTP 服务器提供这些文件。在 Express 中,我们可以使用以下代码片段来实现:

This says that HTTP requests on the /assets/vendor/jquery URL are to be handled by the static handler in Express, from the contents of a directory relative to the directory containing the module. Don't worry about the details because we'll discuss this more carefully in a later chapter. Just notice that __dirname is useful to calculate a filename relative to the location of the module source code.

To see it in action, create a file named dirname.js containing the following:
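For example:

```js
console.log(`dirname: ${__dirname}`);
console.log(`filename: ${__filename}`);
```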


这让我们看到我们收到的值:

Simple enough, but as we'll see later these values are not directly available in ES6 modules.

Now that we've got a taste for CommonJS modules, let's take a look at ES2015 modules.

## Examining the ES6/ES2015 module format

ES6 modules are a new module format designed for all JavaScript environments. While Node.js has always had a good module system, browser-side JavaScript has not. That meant the browser-side community had to use non-standardized solutions. The CommonJS module format was one of those non-standard solutions, which was borrowed for use in Node.js. Therefore, ES6 modules are a big improvement for the entire JavaScript world, by getting everyone on the same page with a common module format and mechanisms.

An issue we have to deal with is the file extension to use for ES6 modules. Node.js needs to know whether to parse using the CommonJS or ES6 module syntax. To distinguish between them, Node.js uses the file extension .mjs to denote ES6 modules, and .js to denote CommonJS modules. However, that's not the entire story since Node.js can be configured to recognize the .js files as ES6 modules. We'll give the exact particulars later in this chapter.

The ES6 and CommonJS modules are conceptually similar. Both support exporting data and functions from a module, and both support hiding implementation inside a module. But they are very different in many practical ways.

Let's start with defining an ES6 module. Create a file named simple2.mjs in the same directory as the simple.js example that we looked at earlier:
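A sketch of such a module:

```js
let count = 0;

export function next() { return ++count; }

function squared() { return Math.pow(count, 2); }
export { squared };

export function hello() { return "Hello, world!"; }

export const meaning = 42;
export let nocount = -1;

export default function () { return count; }
```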


这与`simple.js`类似,但添加了一些内容以演示更多功能。与以前一样,`count`是一个未导出的私有变量,`next`是一个导出的函数,用于递增`count`。

`export`关键字声明了从 ES6 模块中导出的内容。在这种情况下,我们有几个导出的函数和两个导出的变量。`export`关键字可以放在任何顶层声明的前面,比如变量、函数或类声明:

The effect of this is similar to the following:


两者的目的本质上是相同的:使函数或其他对象可供模块外部的代码使用。但是,我们不是显式地创建一个对象`module.exports`,而是简单地声明要导出的内容。例如`export function next()`这样的语句是一个命名导出,意味着导出的函数(就像这里)或对象有一个名称,模块外部的代码使用该名称来访问对象。正如我们在这里看到的,命名导出可以是函数或对象,也可以是类定义。

模块的*默认导出*是使用`export default`定义的,每个模块只能导出一次。默认导出是模块外部代码在使用模块对象本身时访问的内容,而不是使用模块中的导出之一。

你也可以先声明一些东西,比如`squared`函数,然后再导出它。

现在让我们看看如何使用 ES2015 模块。创建一个名为`simpledemo.mjs`的文件,内容如下:
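A sketch:

```js
import * as simple2 from './simple2.mjs';

console.log(simple2.hello());
console.log(`${simple2.next()} ${simple2.squared()}`);
console.log(`${simple2.next()} ${simple2.squared()}`);
console.log(`default: ${simple2.default()}`);
console.log(`meaning: ${simple2.meaning}`);
```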

The import statement does what it says: it imports objects exported from a module. Because it uses the import * as foo syntax, it imports everything from the module, attaching everything to an object, in this case named simple2. This version of the import statement is most similar to a traditional Node.js require statement because it creates an object with fields containing the objects exported from the module.

This is how the code executes:


过去,ES6 模块格式隐藏在`--experimental-modules`选项标志后面,但从 Node.js 13.2 开始,不再需要该标志。访问`default`导出是通过访问名为`default`的字段来实现的。访问导出的值,比如`meaning`字段,是不需要括号的,因为它是一个值而不是一个函数。

现在来看一种从模块中导入对象的不同方法,创建另一个文件,名为`simpledemo2.mjs`,内容如下:
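A sketch:

```js
import { default as simple, next, squared, hello } from './simple2.mjs';

console.log(hello());
console.log(`${next()} ${squared()}`);
console.log(`${next()} ${squared()}`);
console.log(`default: ${simple()}`);
```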

In this case, the import is treated similarly to an ES2015 destructuring assignment. With this style of import, we specify exactly what is to be imported, rather than importing everything. Furthermore, instead of attaching the imported things to a common object, and therefore executing simple2.next(), the imported things are executed using their simple name, as in next().

The import for default as simple is the way to declare an alias of an imported thing. In this case, it is necessary so that the default export has a name other than default.

Node.js modules can be used from the ES2015 .mjs code. Create a file named ls.mjs containing the following:
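A sketch:

```js
import { promises as fs } from 'fs';

async function listFiles() {
  const files = await fs.readdir('.');
  for (const fn of files) {
    console.log(fn);
  }
}

listFiles().catch(err => { console.error(err); });
```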


这是第二章中`ls.js`示例的重新实现,*设置 Node.js*。在这两种情况下,我们都使用了`fs`包的`promises`子模块。要使用`import`语句,我们访问`fs`模块中的`promises`导出,并使用`as`子句将`fs.promises`重命名为`fs`。这样我们就可以使用异步函数而不是处理回调。

否则,我们有一个`async`函数`listFiles`,它执行文件系统操作以从目录中读取文件名。因为`listFiles`是`async`,它返回一个 Promise,我们必须使用`.catch`子句捕获任何错误。

执行脚本会得到以下结果:

The last thing to note about ES2015 module code is that the import and export statements must be top-level code. Try putting an export inside a simple block like this:


这个无辜的代码导致了一个错误:

While there are a few more details about the ES2015 modules, these are their most important attributes.

Remember that the objects injected into CommonJS modules are not available to ES6 modules. The __dirname and __filename objects are the most important, since there are many cases where we compute a filename relative to the currently executing module. Let us explore how to handle that issue.

## Injected objects in ES6 modules

Just as for CommonJS modules, certain objects are injected into ES6 modules. Furthermore, ES6 modules do not receive the __dirname, and __filename objects or other objects that are injected into CommonJS modules.

The import.meta meta-property is the only value injected into ES6 modules. In Node.js it contains a single field, url. This is the URL from which the currently executing module was loaded.

Using import.meta.url, we can compute __dirname and __filename.

### Computing the missing __dirname variable in ES6 modules

If we make a duplicate of dirname.js as dirname.mjs, so it will be interpreted as an ES6 module, we get the following:


由于`__dirname`和`__filename`不是 JavaScript 规范的一部分,它们在 ES6 模块中不可用。输入`import.meta.url`对象,我们可以计算`__dirname`和`__filename`。要看它的运行情况,创建一个包含以下内容的`dirname-fixed.mjs`文件:
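A sketch of `dirname-fixed.mjs`:

```js
import { fileURLToPath } from 'url';
import { dirname } from 'path';

const __filename = fileURLToPath(import.meta.url);
const __dirname = dirname(__filename);

console.log(`import.meta.url: ${import.meta.url}`);
console.log(`dirname: ${__dirname}`);
console.log(`filename: ${__filename}`);
```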

We are importing a couple of useful functions from the url and path core packages. While we could take the import.meta.url object and do our own computations, these functions already exist. The computation is to extract the pathname portion of the module URL, to compute __filename, and then use dirname to compute __dirname.


我们看到模块的`file://` URL,以及使用内置核心函数计算的`__dirname`和`__filename`的值。

我们已经讨论了 CommonJS 和 ES6 模块格式,现在是时候讨论在应用程序中同时使用它们了。

## 同时使用 CommonJS 和 ES6 模块

Node.js 支持 JavaScript 代码的两种模块格式:最初为 Node.js 开发的 CommonJS 格式,以及新的 ES6 模块格式。这两种格式在概念上是相似的,但在实际上有许多不同之处。因此,我们将面临在同一个应用程序中同时使用两种格式的情况,并需要知道如何进行操作。

首先是文件扩展名的问题,以及识别要使用哪种模块格式。以下情况下使用 ES6 模块格式:

+   文件名以`.mjs`结尾的文件。

+   如果`package.json`有一个名为`type`且值为`module`的字段,则以`.js`结尾的文件。

+   如果`node`二进制文件使用`--input-type=module`标志执行,则通过`--eval`或`--print`参数传递的任何代码,或者通过 STDIN(标准输入)传入的代码,都将被解释为 ES6 模块代码。

这是相当直截了当的。ES6 模块在以`.mjs`扩展名命名的文件中,除非你在`package.json`中声明包默认使用 ES6 模块,这样以`.js`扩展名命名的文件也会被解释为 ES6 模块。

以下情况下使用 CommonJS 模块格式:

+   文件名以`.cjs`结尾的文件。

+   如果`package.json`不包含`type`字段,或者包含一个值为`commonjs`的`type`字段,则文件名将以`.js`结尾。

+   如果`node`二进制文件在执行时没有指定`--input-type`标志,或者指定了`--input-type=commonjs`标志,则通过`--eval`或`--print`参数传递的任何代码,或者通过 STDIN(标准输入)传入的代码,都将被解释为 CommonJS 模块代码。

再次,这是直截了当的,Node.js 默认使用 CommonJS 模块来处理`.js`文件。如果包明确声明为默认使用 CommonJS 模块,则 Node.js 将把`.js`文件解释为 CommonJS。

Node.js 团队强烈建议包作者在`package.json`中包含一个`type`字段,即使类型是`commonjs`。

考虑一个具有这个声明的`package.json`:

This, of course, informs Node.js that the package defaults to ES6 modules. Therefore, this command interprets the module as an ES6 module:


这个命令将执行相同的操作,即使没有`package.json`条目:

If instead the type field had the value commonjs, or the --input-type flag was specified as commonjs, or if both of those were completely missing, then my-module.js would be interpreted as a CommonJS module.

These rules also apply to the import statement, the import() function, and the require() function. We will cover those commands in more depth in a later section. In the meantime, let's learn how the import() function partly resolves the inability to use ES6 modules in a CommonJS module.

### Using ES6 modules from CommonJS using import()

The import statement in ES6 modules is a statement, and not a function like require(). This means that import can only be given a static string, and you cannot compute the module identifier to import. Another limitation is that import only works in ES6 modules, and therefore a CommonJS module cannot load an ES6 module. Or, can it?

Since the import() function is available in both CommonJS and ES6 modules, that means we should be able to use it to import ES6 modules in a CommonJS module.

To see how this works, create a file named simple-dynamic-import.js containing the following:
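A sketch of that CommonJS script:

```js
async function simpleFn() {
  const simple2 = await import('./simple2.mjs');
  console.log(simple2.hello());
  console.log(`${simple2.next()} ${simple2.squared()}`);
  console.log(`meaning: ${simple2.meaning}`);
}

simpleFn().catch(err => { console.error(err); });
```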


这是一个使用我们之前创建的 ES6 模块的 CommonJS 模块。它只是调用了一些函数,除了它在我们之前说过的只有 ES6 模块中才能使用`import`之外,没有什么激动人心的地方。让我们看看这个模块的运行情况:

This is a CommonJS module successfully executing code contained in an ES6 module simply by using import().

Notice that import() was called not in the global scope of the module, but inside an async function. As we saw earlier, the ES6 module keyword statements like export and import must be called in the global scope. However, import() is an asynchronous function, limiting our ability to use it in the global scope.

The import statement is itself an asynchronous process, and by extension the import() function is asynchronous, while the Node.js require() function is synchronous.

In this case, we executed import() inside an async function using the await keyword. Therefore, even if import() were used in the global scope, it would be tricky getting a global-scope variable to hold the reference to that module. To see, why let's rewrite that example as simple-dynamic-import-fail.js:


这是相同的代码,但在全局范围内运行。在全局范围内,我们不能使用`await`关键字,所以我们应该期望`simple2`将包含一个挂起的 Promise。运行脚本会导致失败:

We see that simple2 does indeed contain a pending Promise, meaning that import() has not yet finished. Since simple2 does not contain a reference to the module, attempts to call the exported function fail.

The best we could do in the global scope is to attach the .then and .catch handlers to the import() function call. That would wait until the Promise transitions to either a success or failure state, but the loaded module would be inside the callback function. We'll see this example later in the chapter.

Let's now see how modules hide implementation details.

## Hiding implementation details with encapsulation in CommonJS and ES6 modules

We've already seen a couple of examples of how modules hide implementation details with the simple.js example and the programs we examined in Chapter 2, Setting up Node.js. Let's take a closer look.

Node.js modules provide a simple encapsulation mechanism to hide implementation details while exposing an API. To review, in CommonJS modules the exposed API is assigned to the module.exports object, while in ES6 modules the exposed API is declared with the export keyword. Everything else inside a module is not available to code outside the module.

In practice, CommonJS modules are treated as if they were written as follows:
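That is, Node.js wraps each CommonJS module in a function roughly like this before executing it:

```js
(function (exports, require, module, __filename, __dirname) {
  // the contents of the module's source file appear here
});
```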


因此,模块内的一切都包含在一个匿名的私有命名空间上下文中。这就解决了全局对象问题:模块中看起来全局的一切实际上都包含在一个私有上下文中。这也解释了注入的变量实际上是如何注入到模块中的。它们是创建模块的函数的参数。

另一个优势是代码安全性。因为模块中的私有代码被隐藏在私有命名空间中,所以模块外部的代码或数据无法访问私有代码。

让我们来看一个封装的实际演示。创建一个名为`module1.js`的文件,其中包含以下内容:
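A sketch of `module1.js`:

```js
const A = 'value A';
const B = 'value B';

exports.values = function () {
  return { A: A, B: B };
};
```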

Then, create a file named module2.js, containing the following:
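And a sketch of `module2.js`, which defines its own `A` and `B` and then inspects what it receives from `module1.js`:

```js
const util = require('util');
const A = 'a different value A';
const B = 'a different value B';
const m1 = require('./module1');

console.log(`A=${A} B=${B} values=${util.inspect(m1.values())}`);
console.log(`${m1.A} ${m1.B}`);   // undefined undefined

const vals = m1.values();
vals.B = 'something completely different';
console.log(util.inspect(vals));
console.log(util.inspect(m1.values()));  // unchanged inside module1.js
```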


使用这两个模块,我们可以看到每个模块都是其自己受保护的泡泡。

然后按照以下方式运行它:

This artificial example demonstrates encapsulation of the values in module1.js from those in module2.js. The A and B values in module1.js don't overwrite A and B in module2.js because they're encapsulated within module1.js. The values function in module1.js does allow code in module2.js access to the values; however, module2.js cannot directly access those values. We can modify the object module2.js received from module1.js. But doing so does not change the values within module1.js.

In Node.js modules can also be data, not just code.

## Using JSON modules

Node.js supports using require('./path/to/file-name.json') to import a JSON file in a CommonJS module. It is equivalent to the following code:
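In other words, it behaves roughly like this:

```js
const fs = require('fs');
const data = JSON.parse(
  fs.readFileSync('./path/to/file-name.json', 'utf8'));
```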


也就是说,JSON 文件是同步读取的,文本被解析为 JSON。生成的对象作为模块导出的对象可用。创建一个名为`data.json`的文件,其中包含以下内容:

Now create a file named showdata.js containing the following:


它将执行如下:

The console.log function outputs information to the Terminal. When it receives an object, it prints out the object content like this. And this demonstrates that require correctly read the JSON file since the resulting object matched the JSON.

In an ES6 module, this is done with the import statement and requires a special flag. Create a file named showdata-es6.mjs containing the following:


到目前为止,这相当于该脚本的 CommonJS 版本,但使用`import`而不是`require`。

Currently using import to load a JSON file is an experimental feature. Enabling the feature requires these command-line arguments, causing this warning to be printed. We also see that instead of data being an anonymous object, it is an object with the type Module.

Now let's look at how to use ES6 modules on some older Node.js releases.

## Supporting ES6 modules on older Node.js versions

Initially, ES6 module support was an experimental feature in Node.js 8.5 and became a fully supported feature in Node.js 14. With the right tools, we can use it on earlier Node.js implementations.

For an example of using Babel to transpile ES6 code for older Node.js versions, see blog.revillweb.com/using-es2015-es6-modules-with-babel-6-3ffc0870095b.

The better method of using ES6 modules on Node.js 6.x is the esm package. Simply do the following:


有两种方法可以使用这个模块:

+   在 CommonJS 模块中,调用`require('esm')`。

+   在命令行中使用`--require esm`,如下所示。

在这两种情况下,效果是一样的,即加载`esm`模块。这个模块只需要加载一次,我们不必调用它的任何方法。相反,`esm`将 ES6 模块支持改装到 Node.js 运行时中,并且与 6.x 版本及更高版本兼容。

因此,我们可以使用这个模块来改装 ES6 模块支持;它不改装其他功能,比如`async`函数。成功执行`ls.mjs`示例需要对`async`函数和箭头函数的支持。由于 Node.js 6.x 不支持任何一个,`ls.mjs`示例将能够正确加载,但仍将失败,因为它使用了其他不受支持的功能。

It is, of course, possible to use Babel in such cases to convert the full set of ES2015+ features to run on older Node.js releases.

For more information about esm, see: 
medium.com/web-on-the-edge/es-modules-in-node-today-32cff914e4b. The article describes an older release of the esm module, at the time named @std/esm.

The current documentation for the esm package is available at: www.npmjs.com/package/esm.

In this section, we've learned about how to define a Node.js module and various ways to use both CommonJS and ES6 modules. But we've left out some very important things: what is the module identifier and all the ways to locate and use modules. In the next section, we cover these topics.

# Finding and loading modules using require and import

In the course of learning about modules for Node.js, we've used the require and import features without going into detail about how modules are found and all the options available. The algorithm for finding Node.js modules is very flexible. It supports finding modules that are siblings of the currently executing module, or have been installed local to the current project, or have been installed globally.

For both require and import, the command takes a module identifier. The algorithm Node.js uses is in charge of resolving the module identifier into a file containing the module, so that Node.js can load the module.

The official documentation for this is in the Node.js documentation, at nodejs.org/api/modules.html. The official documentation for ES6 modules also discusses how the algorithm differs, atnodejs.org/api/esm.html.

Understanding the module resolution algorithm is one key to success with Node.js. This algorithm determines how best to structure the code in a Node.js application. While debugging problems with loading the correct version of a given package, we need to know how Node.js finds packages.

First, we must consider several types of modules, starting with the simple file modules we've already used.

## Understanding File modules

The CommonJS and ES6 modules we've just looked at are what the Node.js documentation describes as a file module. Such modules are contained within a single file, whose filename ends with .js, .cjs, .mjs, .json, or .node. The latter are compiled from C or C++ source code, or even other languages such as Rust, while the former are, of course, written in JavaScript or JSON.

The module identifier of a file module must start with ./ or ../. This signals Node.js that the module identifier refers to a local file. As should already be clear, this module identifier refers to a pathname relative to the currently executing module.

It is also possible to use an absolute pathname as the module identifier. In a CommonJS module, such an identifier might be /path/to/some/directory/my-module.js. In an ES6 module, since the module identifier is actually a URL, then we must use a file:// URL like file:///path/to/some/directory/my-module.mjs. There are not many cases where we would use an absolute module identifier, but the capability does exist.

One difference between CommonJS and ES6 modules is the ability to use extensionless module identifiers. The CommonJS module loader allows us to do this, which you should save as extensionless.js:


这使用了一个无扩展名的模块标识符来加载我们已经讨论过的模块`simple.js`:

And we can run it with the node command using an extension-less module identifier.

But if we specify an extension-less identifier for an ES6 module:


我们收到了错误消息,清楚地表明 Node.js 无法解析文件名。同样,在 ES6 模块中,给`import`语句的文件名必须带有文件扩展名。

接下来,让我们讨论 ES6 模块标识符的另一个副作用。

### ES6 的 import 语句采用 URL

ES6 `import`语句中的模块标识符是一个 URL。有几个重要的考虑因素。

由于 Node.js 只支持`file://`URL,我们不允许从 Web 服务器检索模块。这涉及明显的安全问题,如果模块可以从`http://`URL 加载,企业安全团队将会感到焦虑。

引用具有绝对路径名的文件必须使用`file:///path/to/file.ext`语法,如前面所述。这与`require`不同,我们将使用`/path/to/file.ext`。

由于`?`和`#`在 URL 中具有特殊意义,它们对`import`语句也具有特殊意义,如下例所示:

This loads the module named module-name.mjs with a query string containing query=1. By default, this is ignored by the Node.js module loader, but there is an experimental loader hook feature by which you can do something with the module identifier URL.

The next type of module to consider is those baked into Node.js, the core modules.

## Understanding the Node.js core modules

Some modules are pre-compiled into the Node.js binary. These are the core Node.js modules documented on the Node.js website at nodejs.org/api/index.html.

They start out as source code within the Node.js build tree. The build process compiles them into the binary so that the modules are always available.

We've already seen how the core modules are used. In a CommonJS module, we might use the following:
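For example:

```js
const http = require('http');
const fs = require('fs');
```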


在 ES6 模块中的等效代码如下:
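That is:

```js
import http from 'http';
import fs from 'fs';
```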

In both cases, we're loading the http and fs core modules that would then be used by other code in the module.

Moving on, we will next talk about more complex module structures.

## Using a directory as a module

We commonly organize stuff into a directory structure. The stuff here is a technical term referring to internal file modules, data files, template files, documentation, tests, assets, and more. Node.js allows us to create an entry-point module into such a directory structure.

For example, with a module identifier like ./some-library that refers to a directory, then there must be a file named index.js, index.cjs, index.mjs, or index.node in the directory. In such a case, the module loader loads the appropriate index module even though the module identifier did not reference a full pathname. The pathname is computed by appending the file it finds in the directory.

One common use for this is that the index module provides an API for a library stored in the directory and that other modules in the directory contain what's meant to be private implementation details.

This may be a little confusing because the word module is being overloaded with two meanings. In some cases, a module is a file, and in other cases, a module is a directory containing one or more file modules.

While overloading the word module this way might be a little confusing, it's going to get even more so as we consider the packages we install from other sources.

## Comparing installed packages and modules

Every programming platform supports the distribution of libraries or packages that are meant to be used in a wide array of applications. For example, where the Perl community has CPAN, the Node.js community has the npm registry. A Node.js installed package is the same as we just described as a folder as a module, in that the package format is simply a directory containing a package.json file along with the code and other files comprising the package.

There is the same risk of confusion caused by overloading the word module since an installed package is typically the same as the directories as modules concept just described. Therefore, it's useful to refer to an installed package with the word package.

The package.json file describes the package. A minimal set of fields are defined by Node.js, specifically as follows:


`name`字段给出了包的名称。如果存在`main`字段,它将命名要在加载包时使用的 JavaScript 文件,而不是`index.js`。像 npm 和 Yarn 这样的包管理应用程序支持`package.json`中的更多字段,它们用来管理依赖关系、版本和其他一切。

如果没有`package.json`,那么 Node.js 将寻找`index.js`或`index.node`。在这种情况下,`require('some-library')`将加载`/path/to/some-library/index.js`中的文件模块。

安装的包保存在一个名为`node_modules`的目录中。当 JavaScript 源代码有`require('some-library')`或`import 'some-library'`时,Node.js 会在一个或多个`node_modules`目录中搜索以找到命名的包。

请注意,在这种情况下,模块标识符只是包名。这与我们之前学习的文件和目录模块标识符不同,因为这两者都是路径名。在这种情况下,模块标识符有点抽象,这是因为 Node.js 有一个算法来在嵌套的`node_modules`目录中找到包。

要理解这是如何工作的,我们需要深入了解算法。

## 在文件系统中找到安装的包

Node.js 包系统如此灵活的关键之一是用于搜索包的算法。

对于给定的`require`、`import()`或`import`语句,Node.js 会在包含该语句的目录中向上搜索文件系统。它正在寻找一个名为`node_modules`的目录,其中包含满足模块标识符的模块。

例如,对于名为`/home/david/projects/notes/foo.js`的源文件和请求模块标识符`bar.js`的`require`或`import`语句,Node.js 尝试以下选项:

![](https://gitee.com/OpenDocCN/freelearn-node-zh/raw/master/docs/node-webdev-5e/img/b20040f4-8a85-4f12-b445-b49a673a904a.png)

正如刚才所说,搜索从`foo.js`所在的文件系统级别开始。Node.js 会查找名为`bar.js`的文件模块,或者包含模块的名为`bar.js`的目录,如*使用目录作为模块*中所述。Node.js 将在`foo.js`旁边的`node_modules`目录以及该文件上方的每个目录中检查这个包。但是,它不会进入任何目录,比如`express`或`express/node_modules`。遍历只会向文件系统上方移动,而不会向下移动。

虽然一些第三方包的名称以`.js`结尾,但绝大多数不是。因此,我们通常会使用`require('bar')`。通常,第三方安装的包是作为一个包含`package.json`文件和一些 JavaScript 文件的目录交付的。因此,在典型情况下,包模块标识符将是`bar`,Node.js 将在一个`node_modules`目录中找到一个名为`bar`的目录,并从该目录访问包。

在文件系统中向上搜索的这种行为意味着 Node.js 支持包的嵌套安装。一个 Node.js 包可能依赖于其他模块,这些模块将有自己的`node_modules`目录;也就是说,`bar`包可能依赖于`fred`包。包管理应用程序可能会将`fred`安装为`/home/david/projects/notes/node_modules/bar/node_modules/fred`:

![](https://gitee.com/OpenDocCN/freelearn-node-zh/raw/master/docs/node-webdev-5e/img/0d84888b-cc4d-4f3c-881e-dd3dc6c71008.png)

在这种情况下,当`bar`包中的 JavaScript 文件使用`require('fred')`时,它的模块搜索从`/home/david/projects/notes/node_modules/bar/node_modules`开始,在那里它会找到`fred`包。但是,如果包管理器检测到`notes`中使用的其他包也使用`fred`包,包管理器将把它安装为`/home/david/projects/notes/node_modules/fred`。

因为搜索算法会在文件系统中向上查找,它会在任一位置找到`fred`。

最后要注意的是,这种`node_modules`目录的嵌套可以任意深。虽然包管理应用程序尝试在一个平面层次结构中安装包,但可能需要将它们深度嵌套。

这样做的一个原因是为了能够使用两个或更多版本的同一个包。

### 处理同一安装包的多个版本

Node.js 包标识符解析算法允许我们安装两个或更多版本的同一个包。回到假设的*notes*项目,注意`fred`包不仅为`bar`包安装,也为`express`包安装。

查看算法,我们知道`bar`软件包和`express`软件包中的`require('fred')`将分别满足于本地安装的相应`fred`软件包。

通常,软件包管理应用程序将检测`fred`软件包的两个实例并仅安装一个。但是,假设`bar`软件包需要`fred`版本 1.2,而`express`软件包需要`fred`版本 2.1。

在这种情况下,软件包管理应用程序将检测不兼容性,并安装两个版本的`fred`软件包,如下所示:

+   在`/home/david/projects/notes/node_modules/bar/node_modules`中,它将安装`fred`版本 1.2。

+   在`/home/david/projects/notes/node_modules/express/node_modules`中,它将安装`fred`版本 2.1。

当`express`软件包执行`require('fred')`或`import 'fred'`时,它将满足于`/home/david/projects/notes/node_modules/express/node_modules/fred`中的软件包。同样,`bar`软件包将满足于`/home/david/projects/notes/node_modules/bar/node_modules/fred`中的软件包。在这两种情况下,`bar`和`express`软件包都有`fred`软件包的正确版本可用。它们都不知道已安装另一个版本的`fred`。

`node_modules`目录用于应用程序所需的软件包。Node.js 还支持在全局位置安装软件包,以便它们可以被多个应用程序使用。

## 搜索全局安装的软件包

我们已经看到,使用 npm 可以执行*全局安装*软件包。例如,如果全局安装了`hexy`或`babel`等命令行工具,那么很方便。在这种情况下,软件包将安装在项目目录之外的另一个文件夹中。Node.js 有两种策略来查找全局安装的软件包。

与`PATH`变量类似,`NODE_PATH`环境变量可用于列出额外的目录,以便在其中搜索软件包。在类 Unix 操作系统上,`NODE_PATH`是一个由冒号分隔的目录列表,在 Windows 上是用分号分隔的。在这两种情况下,它类似于`PATH`变量的解释,这意味着`NODE_PATH`有一个目录名称列表,用于查找已安装的模块。

不建议使用`NODE_PATH`方法,因为如果人们不知道必须设置这个变量,可能会发生令人惊讶的行为。如果需要特定目录中的特定模块以正确运行,并且未设置该变量,应用程序可能会失败。最佳做法是明确声明所有依赖关系,对于 Node.js 来说,这意味着在`package.json`文件中列出所有依赖项,以便`npm`或`yarn`可以管理依赖项。

在刚刚描述的模块解析算法之前,已经实现了这个变量。由于该算法,`NODE_PATH`基本上是不必要的。

有三个额外的位置可以存放模块:

+   `$HOME/.node_modules`

+   `$HOME/.node_libraries`

+   `$PREFIX/lib/node`

在这种情况下,`$HOME`是您期望的(用户的主目录),而`$PREFIX`是安装 Node.js 的目录。

有人建议不要使用全局软件包。理由是希望实现可重复性和可部署性。如果您已经测试了一个应用程序,并且所有代码都方便地位于一个目录树中,您可以将该目录树复制到其他机器上进行部署。但是,如果应用程序依赖于系统其他位置神奇安装的某些其他文件,该怎么办?您会记得部署这些文件吗?应用程序的作者可能会编写文档,说明在运行*npm install*之前*安装这个*,然后*安装那个*,以及*安装其他东西*,但是应用程序的用户是否会正确地遵循所有这些步骤?

最好的安装说明是简单地运行*npm install*或*yarn install*。为了使其工作,所有依赖项必须在`package.json`中列出。

在继续之前,让我们回顾一下不同类型的模块标识符。

## 审查模块标识符和路径名

这是分布在几个部分的许多细节。因此,当使用`require`、`import()`或`import`语句时,快速回顾一下模块标识符是如何解释的是很有用的:

+   **相对模块标识符**:这些以 `./` 或 `../` 开头,绝对标识符以 `/` 开头。模块名称与 POSIX 文件系统语义相同。结果路径名是相对于正在执行的文件的位置进行解释的。也就是说,以 `./` 开头的模块标识符在当前目录中查找,而以 `../` 开头的模块标识符在父目录中查找。

+   **绝对模块标识符**:这些以 `/` (或 `file://` 用于 ES6 模块)开头,当然,会在文件系统的根目录中查找。这不是推荐的做法。

+   **顶级模块标识符**:这些不以这些字符串开头,只是模块名称。这些必须存储在`node_modules`目录中,Node.js 运行时有一个非常灵活的算法来定位正确的`node_modules`目录。

+   **核心模块**:这些与*顶级模块标识符*相同,即没有前缀,但核心模块已经预先嵌入到 Node.js 二进制文件中。

在所有情况下,除了核心模块,模块标识符都会解析为包含实际模块的文件,并由 Node.js 加载。因此,Node.js 所做的是计算模块标识符和实际文件名之间的映射关系。

不需要使用包管理器应用程序。Node.js 模块解析算法不依赖于包管理器,如 npm 或 Yarn,来设置`node_modules`目录。这些目录并没有什么神奇之处,可以使用其他方法构建包含已安装包的`node_modules`目录。但最简单的机制是使用包管理器应用程序。

一些包提供了我们可以称之为主包的子包,让我们看看如何使用它们。

## 使用深度导入模块标识符

除了像 `require('bar')` 这样的简单模块标识符外,Node.js 还允许我们直接访问包中包含的模块。使用不同的模块标识符,以模块名称开头,添加所谓的*深度导入*路径。举个具体的例子,让我们看一下 `mime` 模块([`www.npmjs.com/package/mime`](https://www.npmjs.com/package/mime)),它处理将文件名映射到相应的 MIME 类型。

在正常情况下,你使用 `require('mime')` 来使用该包。然而,该包的作者开发了一个精简版本,省略了许多特定供应商的 MIME 类型。对于该版本,你使用 `require('mime/lite')`。当然,在 ES6 模块中,你会相应地使用 `import 'mime'` 和 `import 'mime/lite'`。

`mime/lite`是深度导入模块标识符的一个例子。

使用这样的模块标识符,Node.js 首先定位包含主要包的`node_modules`目录。在这种情况下,就是 `mime` 包。默认情况下,深度导入模块只是相对于包目录的路径名,例如,`/path/to/node_modules/mime/lite`。根据我们已经检查过的规则,它将被满足为一个名为 `lite.js` 的文件,或者一个名为 `lite` 的目录,其中包含一个名为 `index.js` 或 `index.mjs` 的文件。

但是可以覆盖默认行为,使深度导入标识符指向模块中的不同文件。

### 覆盖深度导入模块标识符

使用该包的代码所使用的深度导入模块标识符不必是包源内部使用的路径名。我们可以在 `package.json` 中放置声明,描述每个深度导入标识符的实际路径名。例如,具有内部模块命名为 `./src/cjs-module.js` 和 `./src/es6-module.mjs` 的包可以在 `package.json` 中使用此声明进行重新映射:

With this, code using such a package can load the inner module using require('module-name/cjsmodule') or import 'module-name/es6module'. Notice that the filenames do not have to match what's exported.

In a package.json file using this exports feature, a request for an inner module not listed in exports will fail. Supposing the package has a ./src/hidden-module.js file, calling require('module-name/src/hidden-module.js') will fail.

All these modules and packages are meant to be used in the context of a Node.js project. Let's take a brief look at a typical project.

## Studying an example project directory structure

A typical Node.js project is a directory containing a package.json file declaring the characteristics of the package, especially its dependencies. That, of course, describes a directory module, meaning that each module is its own project. At the end of the day, we create applications, for example, an Express application, and these applications depend on one or more (possibly thousands of) packages that are to be installed:

This is an Express application (we'll start using Express in Chapter 5, Your First Express Application) containing a few modules installed in the node_modules directory. A typical Express application uses app.js as the main module for the application, and has code and asset files distributed in the public, routes, and views directories. Of course, the project dependencies are installed in the node_modules directory.

But let's focus on the content of the node_modules directory versus the actual project files. In this screenshot, we've selected the express package. Notice it has a package.json file and there is an index.js file. Between those two files, Node.js will recognize the express directory as a module, and calling require('express') or import 'express' will be satisfied by this directory.

The express directory has its own node_modules directory, in which are installed two packages. The question is, why are those packages installed in express/node_modules rather than as a sibling of the express package?

Earlier we discussed what happens if two modules (modules A and B) list a dependency on different versions of the same module (C). In such a case, the package manager application will install two versions of C, one as A/node_modules/C and the other as B/node_modules/C. The two copies of C are thus located such that the module search algorithm will cause module A and module B to have the correct version of module C.

That's the situation we see with express/node_modules/cookie. To verify this, we can use an npm command to query for all references to the module:


这表示 `cookie-parser` 模块依赖于 `cookie` 的 0.1.3 版本,而 Express 依赖于 0.1.5 版本。

现在我们可以认识到模块是什么,以及它们如何在文件系统中找到,让我们讨论何时可以使用每种方法来加载模块。

## 使用 require、import 和 import() 加载模块

显然,CommonJS 模块中使用 `require`,ES6 模块中使用 `import`,但有一些细节需要讨论。我们已经讨论了 CommonJS 和 ES6 模块之间的格式和文件名差异,所以让我们在这里专注于加载模块。

`require` 函数仅在 CommonJS 模块中可用,用于加载 CommonJS 模块。该模块是同步加载的,也就是说当 `require` 函数返回时,模块已完全加载。

默认情况下,CommonJS 模块无法加载 ES6 模块。但正如我们在 `simple-dynamic-import.js` 示例中看到的,CommonJS 模块可以使用 `import()` 加载 ES6 模块。由于 `import()` 函数是一个异步操作,它返回一个 Promise,因此我们不能将结果模块用作顶级对象。但我们可以在函数内部使用它:

And at the top-level of a Node.js script, the best we can do is the following:


这与 `simple-dynamic-import.js` 示例相同,但我们明确处理了 `import()` 返回的 Promise,而不是使用异步函数。虽然我们可以将 `simple2` 赋给全局变量,但使用该变量的其他代码必须适应赋值可能尚未完成的可能性。

`import()` 提供的模块对象包含在 ES6 模块中使用 `export` 语句导出的字段和函数。正如我们在这里看到的,默认导出具有 `default` 名称。

换句话说,在 CommonJS 模块中使用 ES6 模块是可能的,只要我们等待模块完成加载后再使用它。

`import` 语句用于加载 ES6 模块,仅在 ES6 模块内部有效。您传递给 `import` 语句的模块说明符被解释为 URL。

ES6 模块可以有多个命名导出。在我们之前使用的 `simple2.mjs` 中,这些是函数 `next`、`squared` 和 `hello`,以及值 `meaning` 和 `nocount`。ES6 模块可以有单个默认导出,就像我们在 `simple2.mjs` 中看到的那样。

通过 `simpledemo2.mjs`,我们看到可以只从模块中导入所需的内容:

In this case, we use the exports as just the name, without referring to the module: simple(), hello(), and next().

It is possible to import just the default export:


在这种情况下,我们可以调用函数为 `simple()`。我们还可以使用所谓的命名空间导入;这类似于我们导入 CommonJS 模块的方式:

In this case, each property exported from the module is a property of the named object in the import statement.

An ES6 module can also use import to load a CommonJS module. Loading the simple.js module we used earlier is accomplished as follows:


这类似于 ES6 模块所示的 *默认导出* 方法,我们可以将 CommonJS 模块内的 `module.exports` 对象视为默认导出。实际上,`import` 可以重写为以下形式:

This demonstrates that the CommonJS module.exports object is surfaced as default when imported.

We've learned a lot about using modules in Node.js. This included the different types of modules, and how to find them in the file system. Our next step is to learn about package management applications and the npm package repository.

# Using npm – the Node.js package management system

As described in Chapter 2, *Setting up Node.js*, npm is a package management and distribution system for Node.js. It has become the de facto standard for distributing modules (packages) for use with Node.js. Conceptually, it's similar to tools such as apt-get (Debian), rpm/yum (Red Hat/Fedora), MacPorts/Homebrew (macOS), CPAN (Perl), or PEAR (PHP). Its purpose is to publish and distribute Node.js packages over the internet using a simple command-line interface. In recent years, it has also become widely used for distributing front-end libraries like jQuery and Bootstrap that are not Node.js modules. With npm, you can quickly find packages to serve specific purposes, download them, install them, and manage packages you've already installed.

The npm application extends on the package format for Node.js, which in turn is largely based on the CommonJS package specification. It uses the same package.json file that's supported natively by Node.js, but with additional fields for additional functionality.

## The npm package format

An npm package is a directory structure with a package.json file describing the package. This is exactly what was referred to earlier as a directory module, except that npm recognizes many more package.json tags than Node.js does. The starting point for npm's package.json file is the CommonJS Packages/1.0 specification. The documentation for the npm package.json implementation is accessed using the following command:


一个基本的 `package.json` 文件如下:
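For illustration, a basic `package.json` (the values here are placeholders) might contain:

```json
{
  "name": "example-package",
  "version": "1.0.0",
  "description": "An example package",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "license": "ISC"
}
```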

Npm recognizes many more fields than this, and we'll go over some of them in the coming sections. The file is in JSON format, which, as a JavaScript programmer, you should be familiar with.

There is a lot to cover concerning the npm package.json format, and we'll do so over the following sections.

## Accessing npm helpful documentation

The main npm command has a long list of subcommands for specific package management operations. These cover every aspect of the life cycle of publishing packages (as a package author), and downloading, using, or removing packages (as an npm consumer).

You can view the list of these commands just by typing npm (with no arguments). If you see one you want to learn more about, view the help information:


帮助文本将显示在您的屏幕上。

npm 网站上也提供了帮助信息:[`docs.npmjs.com/cli-documentation/`](https://docs.npmjs.com/cli-documentation/)。

在查找和安装 Node.js 包之前,我们必须初始化项目目录。

## 使用 npm init 初始化 Node.js 包或项目

npm 工具使得初始化 Node.js 项目目录变得容易。这样的目录包含至少一个 `package.json` 文件和一个或多个 Node.js JavaScript 文件。

因此,所有 Node.js 项目目录都是模块,根据我们之前学到的定义。然而,在许多情况下,Node.js 项目并不打算导出任何功能,而是一个应用程序。这样的项目可能需要其他 Node.js 包,并且这些包将在`package.json`文件中声明,以便使用 npm 轻松安装。Node.js 项目的另一个常见用例是一个旨在供其他 Node.js 包或应用程序使用的功能包。这些包也包括一个`package.json`文件和一个或多个 Node.js JavaScript 文件,但在这种情况下,它们是导出函数的 Node.js 模块,可以使用`require`、`import()`或`import`加载。

这意味着初始化 Node.js 项目目录的关键是创建`package.json`文件。

`package.json`文件可以手动创建 - 毕竟它只是一个 JSON 文件 - npm 工具提供了一个方便的方法:

In a blank directory, run npm init, answer the questions, and as quick as that you have the starting point for a Node.js project.

This is, of course, a starting point, and as you write the code for your project it will often be necessary to use other packages.

## Finding npm packages

By default, npm packages are retrieved over the internet from the public package registry maintained on npmjs.com. If you know the module name, it can be installed simply by typing the following:


但是如果您不知道模块名称怎么办?如何发现有趣的模块?网站[`npmjs.com`](http://npmjs.com)发布了一个可搜索的模块注册表索引。npm 包还具有命令行搜索功能,可以查询相同的索引:

![](https://gitee.com/OpenDocCN/freelearn-node-zh/raw/master/docs/node-webdev-5e/img/de2ec86a-0e7d-4a98-ad55-6ee68f3e0d15.png)

当然,在找到一个模块后,它会被安装如下:

The npm repository uses a few package.json fields to aid in finding packages.

### The package.json fields that help finding packages

For a package to be easily found in the npm repository requires a good package name, package description, and keywords. The npm search function scans those package attributes and presents them in search results.

The relevant package.json fields are as follows:


`npm view`命令向我们显示了给定包的`package.json`文件中的信息,并且使用`--json`标志,我们可以看到原始的 JSON 数据。

`name`标签当然是包名,它在 URL 和命令名称中使用,因此选择一个对两者都安全的名称。如果您希望在公共`npm`存储库中发布一个包,最好通过在[`npmjs.com`](https://npmjs.com)上搜索或使用`npm search`命令来检查特定名称是否已被使用。

`description`标签是一个简短的描述,旨在作为包的简要描述。

在 npm 搜索结果中显示的是名称和描述标签。

`keywords`标签是我们列出包的属性的地方。npm 网站包含列出使用特定关键字的所有包的页面。当搜索包时,这些关键字索引非常有用,因为它们将相关的包列在一个地方,因此在发布包时,着陆在正确的关键字页面上是很有用的。

另一个来源是`README.md`文件的内容。这个文件应该被添加到包中,以提供基本的包文档。这个文件会显示在`npmjs.com`上的包页面上,因此它能否说服潜在用户真正使用这个包就显得很重要。正如文件名所示,这是一个 Markdown 文件。

一旦找到要使用的包,您必须安装它才能使用该包。

## 安装 npm 包

`npm install`命令使得在找到梦寐以求的包后安装变得容易,如下所示:

The named module is installed in node_modules in the current directory. During the installation process, the package is set up. This includes installing any packages it depends on and running the preinstall and postinstall scripts. Of course, installing the dependent packages also involves the same installation process of installing dependencies and executing pre-install and post-install scripts.

Some packages in the npm repository have a package scope prepended to the package name. The package name in such cases is presented as @scope-name/package-name, or, for example, @akashacms/plugins-footnotes. In such a package, the name field in package.json contains the full package name with its @scope.

We'll discuss dependencies and scripts later. In the meantime, we notice that a version number was printed in the output, so let's discuss package version numbers.

## Installing a package by version number

Version number matching in npm is powerful and flexible. With it, we can target a specific release of a given package or any version number range. By default, npm installs the latest version of the named package, as we did in the previous section. Whether you take the default or specify a version number, npm will determine what to install.

The package version is declared in the package.json file, so let's look at the relevant fields:


`version`字段显然声明了当前包的版本。`dist-tags`字段列出了包维护者可以使用的符号标签,以帮助用户选择正确的版本。这个字段由`npm dist-tag`命令维护。

`npm install`命令支持这些变体:

The last two are what they sound like. You can specify express@4.16.2 to target a precise version, or express@">4.1.0 < 5.0" to target a range of Express V4 versions. We might use that specific expression because Express 5.0 might include breaking changes.

The version match specifiers include the following choices:

  • Exact version match: 1.2.3
  • At least version N: >1.2.3
  • Up to version N: <1.2.3
  • Between two releases: >=1.2.3 <1.3.0

The @tag attribute is a symbolic name such as @latest, @stable, or @canary. The package owner assigns these symbolic names to specific version numbers and can reassign them as desired. The exception is @latest, which is updated whenever a new release of the package is published.

For more documentation, run these commands: npm help json and npm help npm-dist-tag.

In selecting the correct package to use, sometimes we want to use packages that are not in the npm repository.

## Installing packages from outside the npm repository

As awesome as the npm repository is, we don't want to push everything we do through their service. This is especially true for internal development teams who cannot publish their code for all the world to see. Fortunately, Node.js packages can be installed from other locations. Details about this are in npm help package.json in the dependencies section. Some examples are as follows:

  • URL: You can specify any URL that downloads a tarball, that is, a .tar.gz file. For example, GitHub or GitLab repositories can easily export a tarball URL. Simply go to the Releases tab to find them.
  • Git URL: Similarly, any Git repository can be accessed with the right URL, for example:

+   **GitHub 快捷方式**:对于 GitHub 存储库,您可以只列出存储库标识符,例如`expressjs/express`。可以使用`expressjs/express#tag-name`引用标签或提交。

+   **GitLab、BitBucket 和 GitHub URL 快捷方式**:除了 GitHub 快捷方式外,npm 还支持特定 Git 服务的特殊 URL 格式,如`github:user/repo`、`bitbucket:user/repo`和`gitlab:user/repo`。

+   **本地文件系统**:您可以使用 URL 从本地目录安装,格式为:`file:../../path/to/dir`。

有时,我们需要安装一个包,以供多个项目使用,而不需要每个项目都安装该包。

## 全局包安装

在某些情况下,您可能希望全局安装一个模块,以便可以从任何目录中使用它。例如,Grunt 或 Babel 构建工具非常有用,您可能会发现如果这些工具全局安装会很有用。只需添加`-g`选项:

If you get an error, and you're on a Unix-like system (Linux/Mac), you may need to run this with sudo:


当然,这种变体会以提升的权限运行`npm install`。

npm 网站提供了更多信息的指南,网址为[`docs.npmjs.com/resolving-eacces-permissions-errors-when-installing-packages-globally`](https://docs.npmjs.com/resolving-eacces-permissions-errors-when-installing-packages-globally)。

如果本地软件包安装到`node_modules`中,全局软件包安装会在哪里?在类 Unix 系统上,它会安装到`PREFIX/lib/node_modules`中,在 Windows 上,它会安装到`PREFIX/node_modules`中。在这种情况下,`PREFIX`表示安装 Node.js 的目录。您可以按以下方式检查目录的位置:

The algorithm used by Node.js for the require function automatically searches the directory for packages if the package is not found elsewhere.

ES6 modules do not support global packages.

Many believe it is not a good idea to install packages globally, which we will look at next.

## Avoiding global module installation

Some in the Node.js community now frown on installing packages globally. One rationale is that a software project is more reliable if all its dependencies are explicitly declared. If a build tool such as Grunt is required but is not explicitly declared in package.json, the users of the application would have to receive instructions to install Grunt, and they would have to follow those instructions.

Users being users, they might skip over the instructions, fail to install the dependency, and then complain the application doesn't work. Surely, most of us have done that once or twice.

It's recommended to avoid this potential problem by installing everything locally via one mechanism—the npm install command.

There are two strategies we use to avoid using globally installed Node.js packages. For the packages that install commands, we can configure the PATH variable, or use npx to run the command. In some cases, a package is used only during development and can be declared as such in package.json.

## Maintaining package dependencies with npm

The npm install command by itself, with no package name specified, installs the packages listed in the dependencies section of package.json. Likewise, the npm update command compares the installed packages against the dependencies and against what's available in the npm repository and updates any package that is out of date in regards to the repository.

These two commands make it easy and convenient to set up a project, and to keep it up to date as dependencies are updated. The package author simply lists all the dependencies, and npm installs or updates the dependencies required for using the package. What happens is npm looks in package.json for the dependencies or devDependencies fields, and it works out what to do from there.

You can manage the dependencies manually by editing package.json. Or you can use npm to assist you with editing the dependencies. You can add a new dependency like so:


使用`--save`标志,npm 将在`package.json`中添加一个`dependencies`标签:

With the added dependency, when your application is installed, npm will now install the package along with any other dependencies listed in package.json file.

The devDependencies lists modules used during development and testing. The field is initialized the same as the preceding one, but with the --save-dev flag. The devDependencies can be used to avoid some cases where one might instead perform a global package install.

By default, when npm install is run, modules listed in both dependencies and devDependencies are installed. Of course, the purpose of having two dependency lists is to control when each set of dependencies is installed.


这将安装“生产”版本,这意味着只安装`dependencies`中列出的模块,而不安装`devDependencies`中的任何模块。例如,如果我们在开发中使用像 Babel 这样的构建工具,该工具就不应该在生产环境中安装。

虽然我们可以在`package.json`中手动维护依赖关系,但 npm 可以为我们处理这些。

### 自动更新 package.json 的依赖关系

使用 npm@5(也称为 npm 版本 5),一个变化是不再需要向`npm install`命令添加`--save`。相反,`npm`默认会像您使用了`--save`命令一样操作,并会自动将依赖项添加到`package.json`中。这旨在简化使用`npm`,可以说`npm`现在更方便了。与此同时,`npm`自动修改`package.json`对您来说可能会非常令人惊讶和不便。可以使用`--no-save`标志来禁用此行为,或者可以使用以下方法永久禁用:

The npm config command supports a long list of settable options for tuning the behavior of npm. See npm help config for the documentation and npm help 7 config for the list of options.

Now let's talk about the one big use for package dependencies: to fix or avoid bugs.

### Fixing bugs by updating package dependencies

Bugs exist in every piece of software. An update to the Node.js platform may break an existing package, as might an upgrade to packages used by the application. Your application may trigger a bug in a package it uses. In these and other cases, fixing the problem might be as simple as updating a package dependency to a later (or earlier) version.

First, identify whether the problem exists in the package or in your code. After determining it's a problem in another package, investigate whether the package maintainers have already fixed the bug. Is the package hosted on GitHub or another service with a public issue queue? Look for an open issue on this problem. That investigation will tell you whether to update the package dependency to a later version. Sometimes, it will tell you to revert to an earlier version; for example, if the package maintainer introduced a bug that doesn't exist in an earlier version.

Sometimes, you will find that the package maintainers are unprepared to issue a new release. In such a case, you can fork their repository and create a patched version of the package, and your own package.json can then reference that patched version with a GitHub URL.

One approach to fixing this problem is pinning the package version number to one that's known to work. You might know that version 6.1.2 was the last release against which your application functioned and that starting with version 6.2.0 your application breaks. Hence, in package.json:
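
A sketch of what that looks like, using a hypothetical package name:

```json
"dependencies": {
    "some-package": "6.1.2"
}
```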


这将冻结您对特定版本号的依赖。然后,您可以自由地花时间更新您的代码以适应模块的后续版本。一旦您的代码更新了,或者上游项目更新了,就相应地更改依赖关系。

在`package.json`中列出依赖项时,很容易变懒,但这会导致麻烦。

## 明确指定软件包依赖版本号

正如我们在本章中已经说过多次的那样,明确声明您的依赖关系是一件好事。我们已经提到过这一点,但值得重申并看看 npm 如何简化这一点。

第一步是确保您的应用程序代码已经检入源代码存储库。您可能已经知道这一点,并且甚至打算确保所有内容都已检入。对于 Node.js,每个模块应该有自己的存储库,而不是将每一个代码片段都放在一个存储库中。

然后,每个模块可以按照自己的时间表进行进展。一个模块的故障很容易通过在`package.json`中更改版本依赖来撤消。

下一步是明确声明每个模块的所有依赖关系。目标是简化和自动化设置每个模块的过程。理想情况下,在 Node.js 平台上,模块设置就像运行`npm install`一样简单。

任何额外所需的步骤都可能被遗忘或执行不正确。自动设置过程消除了几种潜在的错误。

通过`package.json`的`dependencies`和`devDependencies`部分,我们不仅可以明确声明依赖关系,还可以指定版本号。

懒惰地声明依赖关系的方法是在版本字段中放入`*`。这将使用 npm 存储库中的最新版本。这似乎有效,直到有一天,该软件包的维护者引入了一个 bug。你会输入`npm update`,突然间你的代码就无法工作了。你会跳转到软件包的 GitHub 网站,查看问题队列,可能会看到其他人已经报告了你所看到的问题。其中一些人会说他们已经固定在之前的版本上,直到这个 bug 被修复。这意味着他们的`package.json`文件不依赖于最新版本的`*`,而是依赖于在 bug 产生之前的特定版本号。

不要做懒惰的事情,做明智的事情。

明确声明依赖关系的另一个方面是不隐式依赖全局软件包。之前,我们说过 Node.js 社区中有些人警告不要在全局目录中安装模块。这可能看起来像在应用程序之间共享代码的一种简便方法。只需全局安装,你就不必在每个应用程序中安装代码。

但是,这会让部署变得更加困难吗?新的团队成员会被指示安装这里和那里的所有特殊文件来使应用程序运行吗?你会记得在所有目标机器上安装那个全局模块吗?

对于 Node.js 来说,这意味着列出`package.json`中的所有模块依赖项,然后安装指令就是简单的`npm install`,然后可能是编辑配置文件。

尽管 npm 存储库中的大多数软件包都是带有 API 的库,但有些是我们可以从命令行运行的工具。

## 安装命令的软件包

有些软件包安装命令行程序。安装这些软件包的一个副作用是,你可以在 shell 提示符下输入新的命令,或者在 shell 脚本中使用。一个例子是我们在第二章中简要使用过的`hexy`程序,*设置 Node.js*。另一个例子是广泛使用的 Grunt 或 Babel 构建工具。

明确声明所有依赖关系在`package.json`中的建议适用于命令行工具以及任何其他软件包。因此,这些软件包通常会被本地安装。这需要特别注意正确设置`PATH`环境变量。正如你可能已经知道的那样,`PATH`变量在类 Unix 系统和 Windows 上都用于列出命令行 shell 搜索命令的目录。

命令可以安装到两个地方之一:

+   **全局安装**:它安装到一个目录,比如`/usr/local`,或者 Node.js 安装的`bin`目录。`npm bin -g`命令告诉你这个目录的绝对路径名。在这种情况下,你不太可能需要修改 PATH 环境变量。

+   **本地安装**:安装到正在安装模块的`package`中的`node_modules/.bin`,`npm bin`命令告诉你该目录的绝对路径名。因为该目录不方便运行命令,所以改变 PATH 变量是有用的。

要运行命令,只需在 shell 提示符下输入命令名称。如果命令安装的目录恰好在 PATH 变量中,这样就能正确运行。让我们看看如何配置 PATH 变量以处理本地安装的命令。

### 配置 PATH 变量以处理本地安装的命令

假设我们已经安装了`hexy`命令,如下所示:

As a local install, this creates a command as node_modules/.bin/hexy. We can attempt to use it as follows:


但这会出错,因为命令不在`PATH`中列出的目录中。解决方法是使用完整路径名或相对路径名:

But obviously typing the full or partial pathname is not a user-friendly way to execute the command. We want to use the commands installed by modules, and we want a simple process for doing so. This means, we must add an appropriate value in the PATH variable, but what is it?

For global package installations, the executable lands in a directory that is probably already in your PATH variable, like /usr/bin or /usr/local/bin. Local package installations require special handling. The full path for the node_modules/.bin directory varies for each project, and obviously it won't work to add the full path for every node_modules/.bin directory to your PATH.

Adding ./node_modules/.bin to the PATH variable (or, on Windows, .\node_modules\.bin) works great. Any time your shell is in the root of a Node.js project, it will automatically find locally installed commands from Node.js packages.

How we do this depends on the command shell you use and your operating system.

On a Unix-like system, the command shells are bash and csh. Your PATH variable would be set up in one of these ways:
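
Roughly speaking, the setting looks like this for each shell:

```shell
# bash
$ export PATH=./node_modules/.bin:$PATH

# csh
$ setenv PATH ./node_modules/.bin:$PATH
```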


下一步是将命令添加到你的登录脚本中,这样变量就会一直设置。在`bash`上,添加相应的行到`~/.bashrc`,在`csh`上,添加到`~/.cshrc`。

一旦完成了这一步,命令行工具就能正确执行。

### 在 Windows 上配置 PATH 变量

在 Windows 上,这个任务是通过系统范围的设置面板来处理的:

![](https://gitee.com/OpenDocCN/freelearn-node-zh/raw/master/docs/node-webdev-5e/img/4e178c1e-2d36-4402-8e8e-713142886014.png)

在 Windows 设置屏幕中搜索`PATH`,可以找到`系统属性`面板的这个窗格。点击`环境变量`按钮,然后选择`Path`变量,最后点击`编辑`按钮。在这个屏幕上,点击`新建`按钮添加一个条目到这个变量中,并输入`.\node_modules\.bin`如图所示。你必须重新启动任何打开的命令行窗口。一旦你这样做了,效果就会如前所示。

尽管修改 PATH 变量很容易,但我们不希望在所有情况下都这样做。

### 避免修改 PATH 变量

如果你不想始终将这些变量添加到你的`PATH`中怎么办?`npm-path`模块可能会引起你的兴趣。这是一个小程序,可以计算出适合你的 shell 和操作系统的正确`PATH`变量。查看[`www.npmjs.com/package/npm-path`](https://www.npmjs.com/package/npm-path)上的包。

另一个选择是使用`npx`命令来执行这些命令。这个工具会自动安装在`npm`命令旁边。这个命令要么执行来自本地安装包的命令,要么在全局缓存中静默安装命令:

Using npx is this easy.

Of course, once you've installed some packages, they'll go out of date and need to be updated.

Updating packages you've installed when they're outdated

The coder codes, updating their package, leaving you in the dust unless you keep up.

To find out whether your installed packages are out of date, use the following command:


报告显示了当前的 npm 包、当前安装的版本,以及`npm`仓库中的当前版本。更新过时的包非常简单:

Specifying a package name updates just the named package. Otherwise, it updates every package that would be printed by npm outdated.

npm handles more than package management; it also has a decent built-in task automation system.

Automating tasks with scripts in package.json

The npm command handles not just installing packages, it can also be used to automate running tasks related to the project. In package.json, we can add a field, scripts, containing one or more command strings. Originally scripts were meant to handle tasks related to installing an application, such as compiling native code, but they can be used for much more. For example, you might have a deployment task using rsync to copy files to a server. In package.json, you can add this:


重要的是,我们可以添加任何我们喜欢的脚本,`scripts`条目记录了要运行的命令:

Once it has been recorded in scripts, running the command is this easy.

There is a long list of "lifecycle events" for which npm has defined script names. These include the following:

  • install, for when the package is installed
  • uninstall, for when it is uninstalled
  • test, for running a test suite
  • start and stop, for controlling a server defined by the package

Package authors are free to define any other script they like.

For the full list of predefined script names, see the documentation: docs.npmjs.com/misc/scripts

Npm also defines a pattern for scripts that run before or after another script, namely to prepend pre or post to the script name. Therefore the pretest script runs before the test script, and the posttest script runs afterward.

A practical example is to run a test script in a prepublish script to ensure the package is tested before publishing it to the npm repository:
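
A minimal sketch of such a scripts section, assuming Mocha is the test runner:

```json
"scripts": {
    "test": "mocha",
    "prepublish": "npm test"
}
```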


有了这个组合,如果包的作者输入`npm publish`,`prepublish`脚本就会先运行`test`脚本,进而使用`mocha`运行测试套件。

自动化所有管理任务是一个众所周知的最佳实践,即使只是为了你永远不会忘记如何运行这些任务。为每个这样的任务创建`scripts`条目不仅可以防止你忘记如何做事,还可以为他人记录管理任务。

接下来,让我们谈谈如何确保执行包的 Node.js 平台支持所需的功能。

## 声明 Node.js 版本兼容性

重要的是,你的 Node.js 软件必须在正确的 Node.js 版本上运行。主要原因是你的包运行时需要的 Node.js 平台功能必须可用。因此,包的作者必须知道哪些 Node.js 版本与包兼容,然后在`package.json`中描述这种兼容性。

这个依赖在`package.json`中使用`engines`标签声明:
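
大致的写法如下(版本范围只是示意,应换成你的包实际支持的 Node.js 版本):

```json
"engines": {
    "node": ">=8.0.0 <11.0.0"
}
```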

版本字符串的写法类似于我们在`dependencies`和`devDependencies`中可以使用的版本字符串。在这种情况下,我们定义了该包与 Node.js 8.x、9.x 和 10.x 兼容。

现在我们知道如何构建一个包,让我们谈谈发布包。

发布 npm 包

npm 仓库中的所有这些包都来自像你一样有更好的做事方式的人。发布包非常容易入门。

关于发布包的在线文档可以在docs.npmjs.com/getting-started/publishing-npm-packages找到。

还要考虑这个:xkcd.com/927/

首先使用npm adduser命令在 npm 仓库中注册。你也可以在网站上注册。接下来,使用npm login命令登录。

最后,在包的根目录中使用`npm publish`命令。然后,退后一步,以免被涌入的粉丝踩到,或者可能不会。仓库中有上百万个包,每天都有数百个包被添加。要使你的包脱颖而出,你需要一些营销技巧,这是本书范围之外的另一个话题。

建议你的第一个包是一个作用域包,例如@my-user-name/my-great-package

在本节中,我们学到了很多关于使用 npm 来管理和发布包。但是 npm 并不是管理 Node.js 包的唯一选择。

Yarn 包管理系统

尽管 npm 非常强大,但它并不是 Node.js 的唯一包管理系统。因为 Node.js 核心团队并没有规定一个包管理系统,Node.js 社区可以自由地开发他们认为最好的任何系统。我们绝大多数人使用 npm 是对其价值和有用性的证明。但是,还有一个重要的竞争对手。

Yarn(参见yarnpkg.com/en/)是 Facebook、Google 和其他几家公司的工程师合作开发的。他们宣称 Yarn 是超快、超安全(通过使用所有内容的校验和)和超可靠(通过使用`yarn.lock`文件记录精确的依赖关系)。

Yarn 不是运行自己的包存储库,而是在npmjs.com的 npm 包存储库上运行。这意味着 Node.js 社区并没有被 Yarn 分叉,而是通过一个改进的包管理工具得到了增强。

npm 团队在 npm@5(也称为 npm 版本 5)中对 Yarn 做出了回应,通过提高性能和引入package-lock.json文件来提高可靠性。npm 团队在 npm@6 中实施了额外的改进。

Yarn 已经变得非常流行,常被推荐用来替代 npm。它们执行非常相似的功能,性能与 npm@5 并没有太大的不同。命令行选项的表述方式也有所不同。我们讨论过的 npm 的一切功能 Yarn 也都支持,尽管命令语法略有不同。Yarn 给 Node.js 社区带来的一个重要好处是,Yarn 和 npm 之间的竞争似乎正在促使 Node.js 包管理的更快进步。

为了让你开始,这些是最重要的命令:

  • yarn add:将一个包添加到当前包中使用

  • yarn init:初始化一个包的开发

  • yarn install:安装package.json文件中定义的所有依赖项

  • yarn publish:将包发布到包管理器

  • yarn remove:从当前包中移除一个未使用的包

运行yarn本身就会执行yarn install的行为。Yarn 还有其他几个命令,yarn help会列出它们所有。

# 总结

在本章中,你学到了很多关于 Node.js 的模块和包。具体来说,我们涵盖了为 Node.js 实现模块和包,我们可以使用的不同模块结构,CommonJS 和 ES6 模块之间的区别,管理已安装的模块和包,Node.js 如何定位模块,不同类型的模块和包,如何以及为什么声明对特定包版本的依赖关系,如何找到第三方包,以及我们如何使用 npm 或 Yarn 来管理我们使用的包并发布我们自己的包。

现在你已经学习了关于模块和包,我们准备使用它们来构建应用程序,在下一章中我们将看到。

HTTP 服务器和客户端

现在你已经了解了 Node.js 模块,是时候将这些知识应用到构建一个简单的 Node.js web 应用程序中了。本书的目标是学习使用 Node.js 进行 web 应用程序开发。在这个过程中的下一步是对`HTTPServer`和`HTTPClient`对象有一个基本的了解。为了做到这一点,我们将创建一个简单的应用程序,使我们能够探索 Node.js 中一个流行的应用程序框架——Express。在后面的章节中,我们将在应用程序上做更复杂的工作,但在我们能够行走之前,我们必须学会爬行。

本章的目标是开始了解如何在 Node.js 平台上创建应用程序。我们将创建一些小型的应用程序,这意味着我们将编写代码并讨论它的作用。除了学习一些具体的技术之外,我们还希望熟悉初始化工作目录、创建应用程序的 Node.js 代码、安装应用程序所需的依赖项以及运行/测试应用程序的过程。

Node.js 运行时包括诸如`EventEmitter`、`HTTPServer`和`HTTPClient`等对象,它们为我们构建应用程序提供了基础。即使我们很少直接使用这些对象,了解它们的工作原理也是有用的,在本章中,我们将涵盖使用这些特定对象的一些练习。

我们将首先直接使用HTTPServer对象构建一个简单的应用程序。然后,我们将使用 Express 来创建一个计算斐波那契数的应用程序。因为这可能是计算密集型的,我们将利用这一点来探讨为什么在 Node.js 中不阻塞事件队列是重要的,以及对这样做的应用程序会发生什么。这将给我们一个借口来开发一个简单的后台 REST 服务器,一个用于在服务器上发出请求的 HTTP 客户端,以及一个多层 web 应用程序的实现。

在今天的世界中,微服务应用架构实现了后台 REST 服务器,这就是我们在本章中要做的事情。

在本章中,我们将涵盖以下主题:

  • 使用EventEmitter模式发送和接收事件

  • 通过构建一个简单的应用程序来理解 HTTP 服务器应用程序

  • Web 应用程序框架

  • 使用 Express 框架构建一个简单的应用程序

  • 在 Express 应用程序中处理计算密集型计算和 Node.js 事件循环。

  • 发出 HTTP 客户端请求

  • 使用 Express 创建一个简单的 REST 服务

通过学习这些主题,你将了解设计基于 HTTP 的 web 服务的几个方面。目标是让你了解如何创建或消费一个 HTTP 服务,并对 Express 框架有一个介绍。在本章结束时,你将对这两个工具有一个基本的了解。

这是很多内容,但这将为本书的其余部分奠定一个良好的基础。

# 第六章:使用 EventEmitter 发送和接收事件

EventEmitter是 Node.js 的核心习语之一。如果 Node.js 的核心思想是事件驱动的架构,那么从对象中发出事件是该架构的主要机制之一。EventEmitter是一个在其生命周期的不同阶段提供通知(事件)的对象。例如,一个HTTPServer对象会发出与服务器对象的启动/关闭以及处理来自 HTTP 客户端的 HTTP 请求的每个阶段相关的事件。

许多核心的 Node.js 模块都是EventEmitter对象,而EventEmitter对象是实现异步编程的一个很好的基础。EventEmitter对象在 Node.js 中非常常见,以至于你可能会忽略它们的存在。然而,因为它们随处可见,我们需要了解它们是什么,以及在必要时如何使用它们。

在本章中,我们将使用`HTTPServer`和`HTTPClient`对象。两者都是`EventEmitter`类的子类,并依赖于它来发送 HTTP 协议每个步骤的事件。在本节中,我们将首先学习使用 JavaScript 类,然后创建一个`EventEmitter`子类,以便我们可以学习`EventEmitter`。

## JavaScript 类和类继承

在开始EventEmitter类之前,我们需要看一下 ES2015 的另一个特性:类。JavaScript 一直有对象和类层次结构的概念,但没有其他语言那样正式。ES2015 类对象建立在现有的基于原型的继承模型之上,但其语法看起来很像其他语言中的类定义。

例如,考虑以下类,我们将在本书的后面使用:
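
下面是一个大致的写法(字段名仅作示意,后面章节的实现可能略有不同):

```js
class Note {
    constructor(key, title, body) {
        this._key = key;
        this._title = title;
        this._body = body;
    }
    get key() { return this._key; }
    get title() { return this._title; }
    set title(newTitle) { this._title = newTitle; }
    get body() { return this._body; }
    set body(newBody) { this._body = newBody; }
}
```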


This should look familiar to anyone who's implemented a class definition in other languages. The class has a name—`Note`. There is also a constructor method and attributes for each instance of the class.

Once you've defined the class, you can export the class definition to other modules:

使用`get`和`set`关键字标记的函数是 getter 和 setter,用法如下:
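
大致的用法如下(示例数据是随意取的):

```js
const aNote = new Note('note1', 'The Rain in Spain', 'Falls mainly on the plain');
const key = aNote.key;                       // 调用 getter
aNote.title = 'The Rain in Spain, Redux';    // 调用 setter
console.log(aNote.title);
```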


New instances of a class are created with `new`. You access a getter or setter function as if it is a simple field on the object. Behind the scenes, the getter/setter function is invoked.

The preceding implementation is not the best because the `_title` and `_body` fields are publicly visible and there is no data-hiding or encapsulation. There is a technique to better hide the field data, which we'll go over in Chapter 5, *Your First Express Application*.

You can test whether a given object is of a certain class by using the `instanceof` operator:

最后,您可以使用extends运算符声明一个子类,类似于其他语言中的操作:
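
下面是一个示意(与书中的具体代码可能略有差别):

```js
class LoveNote extends Note {
    constructor(key, title, body, heart) {
        super(key, title, body);
        this._heart = heart;
    }
    get heart() { return this._heart; }
    set heart(newHeart) { this._heart = newHeart; }
}
```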


In other words, the `LoveNote` class has all the fields of `Note`, plus a new field named `heart`.

This was a brief introduction to JavaScript classes. By the end of this book, you'll have had lots of practice with this feature. The `EventEmitter` class gives us a practical use for classes and class inheritance.

## The EventEmitter class

The `EventEmitter` object is defined in the `events` module of Node.js. Using the `EventEmitter` class directly means performing `require('events')`. In most cases, we don't do this. Instead, our typical use of `EventEmitter` objects is via an existing object that uses `EventEmitter` internally. However, there are some cases where needs dictate implementing an `EventEmitter` subclass.

Create a file named `pulser.mjs`, containing the following code:
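
A minimal sketch along the lines of the book's example (the exact log output may differ):

```js
import EventEmitter from 'events';

export class Pulser extends EventEmitter {
    start() {
        setInterval(() => {
            // the arrow function keeps `this` bound to the Pulser instance
            console.log(`${new Date().toISOString()} >>>> pulse`);
            this.emit('pulse');
            console.log(`${new Date().toISOString()} <<<< pulse`);
        }, 1000);
    }
}
```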

这是一个定义了名为Pulser的类的 ES6 模块。该类继承自EventEmitter并提供了一些自己的方法。

另一件要检查的事情是回调函数中的this.emit如何引用Pulser对象实例。这个实现依赖于 ES2015 箭头函数。在箭头函数之前,我们的回调使用了一个常规的function,而this不会引用Pulser对象实例。相反,this会引用与setInterval函数相关的其他对象。箭头函数的一个特性是,箭头函数内部的this与周围上下文中的this具有相同的值。这意味着,在这种情况下,this确实引用Pulser对象实例。

在我们必须使用function而不是箭头函数时,我们必须将this分配给另一个变量,如下所示:


What's different is the assignment of `this` to `self`. The value of `this` inside the function is different—it is related to the `setInterval` function—but the value of `self` remains the same in every enclosed scope. You'll see this trick used widely, so remember this in case you come across this pattern in code that you're maintaining.

If you want to use a simple `EventEmitter` object but with your own class name, the body of the extended class can be empty:

Pulser类的目的是每秒向任何监听器发送一个定时事件。start方法使用setInterval来启动重复的回调执行,计划每秒调用emitpulse事件发送给任何监听器。

现在,让我们看看如何使用Pulser对象。创建一个名为pulsed.mjs的新文件,其中包含以下代码:
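
内容大致如下:

```js
import { Pulser } from './pulser.mjs';

const pulser = new Pulser();
pulser.on('pulse', () => {
    console.log(`${new Date().toISOString()} pulse received`);
});
pulser.start();
```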


Here, we create a `Pulser` object and consume its `pulse` events. Calling `pulser.on('pulse')` sets up an event listener for the `pulse` events to invoke the callback function. It then calls the `start` method to get the process going.

When it is run, you should see the following output:

对于每个接收到的pulse事件,都会打印一个pulse received消息。

这为您提供了一些关于EventEmitter类的实际知识。现在让我们看一下它的操作理论。

## EventEmitter 理论

使用EventEmitter类,您的代码会发出其他代码可以接收的事件。这是一种连接程序中两个分离部分的方式,有点像量子纠缠的方式,两个电子可以在任何距离上相互通信。看起来很简单。

事件名称可以是任何对您有意义的内容,您可以定义尽可能多的事件名称。事件名称是通过使用事件名称调用.emit来定义的。无需进行任何正式操作,也不需要注册事件名称。只需调用.emit就足以定义事件名称。

按照惯例,error事件名称表示错误。

一个对象使用.emit函数发送事件。事件被发送到任何已注册接收对象事件的监听器。程序通过调用该对象的.on方法注册接收事件,给出事件名称和事件处理程序函数。

所有事件没有一个中央分发点。相反,每个EventEmitter对象实例管理其自己的监听器集,并将其事件分发给这些监听器。

通常,需要在事件中发送数据。要这样做,只需将数据作为参数添加到.emit调用中,如下所示:
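
形式大致如下(事件名和参数只是示意):

```js
this.emit('eventName', data1, data2);
```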


When the program receives the event, the data appears as arguments to the callback function. Your program listens to this event, as follows:
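
A sketch of the receiving side (the emitter variable stands for whatever EventEmitter you are listening to):

```js
emitter.on('eventName', (...theArgs) => {
    // theArgs is an array of whatever arguments were passed to .emit
    console.log(theArgs);
});
```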

事件接收器和事件发送器之间没有握手。也就是说,事件发送器只是继续它的业务,不会收到任何关于接收到的事件、采取的任何行动或发生的任何错误的通知。

在这个例子中,我们使用了 ES2015 的另一个特性——rest运算符——在这里以...theArgs的形式使用。rest运算符将任意数量的剩余函数参数捕获到一个数组中。由于EventEmitter可以传递任意数量的参数,而rest运算符可以自动接收任意数量的参数,它们是天作之合,或者至少是在 TC-39 委员会中。

我们现在已经学会了如何使用 JavaScript 类以及如何使用EventEmitter类。接下来要做的是检查HTTPServer对象如何使用EventEmitter

# 理解 HTTP 服务器应用程序

HTTPServer对象是所有 Node.js Web 应用程序的基础。这个对象本身非常接近 HTTP 协议,使用它需要对这个协议有所了解。幸运的是,在大多数情况下,您可以使用应用程序框架,比如 Express,来隐藏 HTTP 协议的细节。作为应用程序开发者,我们希望专注于业务逻辑。

我们已经在第二章中看到了一个简单的 HTTP 服务器应用程序,设置 Node.js。因为HTTPServer是一个EventEmitter对象,所以可以以另一种方式编写示例,以明确这一事实,通过分别添加事件监听器:
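
大致如下(与第二章的例子等价,只是把监听器单独挂接到`request`事件上):

```js
import http from 'http';

const server = http.createServer();
server.on('request', (req, res) => {
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end('Hello, World!\n');
});
server.listen(8124, '127.0.0.1');
```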


Here, we created an HTTP `server` object, then attached a listener to the `request` event, and then told the server to listen to connections from `localhost` (`127.0.0.1`) on port `8124`. The `listen` function causes the server to start listening and arranges to dispatch an event for every request arriving from a web browser.

The `request` event is fired any time an HTTP request arrives on the server. It takes a function that receives the `request` and `response` objects. The `request` object has data from the web browser, while the `response` object is used to gather data to be sent in the response. 

Now, let's look at a server application that performs different actions based on the URL.

Create a new file named `server.mjs`, containing the following code:
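
A condensed sketch of that file (the HTML has been trimmed down to the parts the following discussion refers to):

```js
import http from 'http';
import * as os from 'os';

const listenOn = 'http://localhost:8124';

const server = http.createServer();
server.on('request', (req, res) => {
    const requrl = new URL(req.url, listenOn);
    if (requrl.pathname === '/') homePage(req, res);
    else if (requrl.pathname === '/osinfo') osInfo(req, res);
    else {
        res.writeHead(404, { 'Content-Type': 'text/plain' });
        res.end('Page not found');
    }
});
server.listen(new URL(listenOn).port);

function homePage(req, res) {
    res.writeHead(200, { 'Content-Type': 'text/html' });
    res.end(`<html><body><h1>Hello, world!</h1>
        <p><a href='/osinfo'>OS Info</a></p></body></html>`);
}

function osInfo(req, res) {
    res.writeHead(200, { 'Content-Type': 'text/html' });
    res.end(`<html><body><h1>Operating System Info</h1>
        <table>
        <tr><th>OS Type</th><td>${os.type()}</td></tr>
        <tr><th>Platform</th><td>${os.platform()}</td></tr>
        <tr><th>Total memory</th><td>${os.totalmem()}</td></tr>
        <tr><th>Free memory</th><td>${os.freemem()}</td></tr>
        </table></body></html>`);
}
```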

`request`事件是由`HTTPServer`每次从 Web 浏览器接收到请求时发出的。在这种情况下,我们希望根据请求 URL 的不同而有不同的响应,请求 URL 以`req.url`的形式到达。这个值是一个包含来自 HTTP 请求的 URL 的字符串。由于 URL 有许多属性,我们需要解析 URL,以便正确匹配两个路径之一的路径名:`/` 或 `/osinfo`。

使用 URL 类解析 URL 需要一个基本 URL,我们在listenOn变量中提供了这个 URL。请注意,我们在其他地方多次重用了这个变量,使用一个字符串来配置应用程序的多个部分。

根据路径,要么调用homePage函数,要么调用osInfo函数。

这被称为请求路由,我们在其中查看传入请求的属性,比如请求路径,并将请求路由到处理程序函数。

在处理程序函数中,reqres参数对应于requestresponse对象。req包含有关传入请求的数据,我们使用res发送响应。writeHead函数设置返回状态(200表示成功,而404表示页面未找到),end函数发送响应。

如果请求的 URL 没有被识别,服务器将使用404结果代码发送回一个错误页面。结果代码通知浏览器有关请求状态,其中200代码表示一切正常,404代码表示请求的页面不存在。当然,还有许多其他 HTTP 响应代码,每个代码都有自己的含义。

这两个对象都附加了许多其他函数,但这已经足够让我们开始了。

要运行它,请输入以下命令:


Then, if we paste the URL into a web browser, we see something like this:

![](https://gitee.com/OpenDocCN/freelearn-node-zh/raw/master/docs/node-webdev-5e/img/1a594a70-7504-4f7b-81ce-e8bd939911e2.png)

This application is meant to be similar to PHP's `sysinfo` function. Node.js's `os` module is consulted to provide information about the computer. This example can easily be extended to gather other pieces of data.

A central part of any web application is the method of routing requests to request handlers. The `request` object has several pieces of data attached to it, two of which are useful for routing requests: the `request.url` and `request.method` fields.

In `server.mjs`, we consult the `request.url` data to determine which page to show after parsing using the URL object. Our needs are modest in this server, and a simple comparison of the `pathname` field is enough. Larger applications will use pattern matching to use part of the request URL to select the request handler function and other parts to extract request data out of the URL. We'll see this in action when we look at Express later in the *Getting started with Express* section.

Some web applications care about the HTTP verb that is used (`GET`, `DELETE`, `POST`, and so on) and so we must consult the `request.method` field of the `request` object. For example, `POST` is frequently used for any `FORM` submissions.

That gives us a taste of developing servers with Node.js. Along the way, we breezed past one big ES2015 feature—template strings. The template strings feature simplifies substituting values into strings. Let's see how that works.

## ES2015 multiline and template strings

The previous example showed two of the new features introduced with ES2015: multiline and template strings. These features are meant to simplify our lives when creating text strings.

The existing JavaScript string representations use single quotes and double quotes. Template strings are delimited with the backtick character, which is also known as the **grave accent**:

在 ES2015 之前,实现多行字符串的一种方法是使用以下结构:


This is an array of strings that uses the `join` function to smash them together into one string. Yes, this is the code used in the same example in previous versions of this book. This is what we can do with ES2015:

这更加简洁和直接。开头引号在第一行,结束引号在最后一行,中间的所有内容都是我们的字符串的一部分。

模板字符串功能的真正目的是支持将值直接替换到字符串中。许多其他编程语言支持这种能力,现在 JavaScript 也支持了。

在 ES2015 之前,程序员会这样编写他们的代码:


Similar to the previous snippet, this relied on the `replace` function to insert values into the string. Again, this is extracted from the same example that was used in previous versions of this book. With template strings, this can be written as follows:

在模板字符串中,${..}括号中的部分被解释为表达式。这可以是一个简单的数学表达式、一个变量引用,或者在这种情况下,一个函数调用。

使用模板字符串插入数据存在安全风险。您是否验证了数据的安全性?它会成为安全攻击的基础吗?一如既往,来自不受信任来源(如用户输入)的数据,必须针对其要插入的目标上下文进行正确编码。在这个例子中,也许应该使用一个函数先把这些数据编码为 HTML。不过在这种情况下,数据只是简单的字符串和数字,并且来自已知安全的数据源——内置的`os`模块,因此我们知道这个应用程序是安全的。

出于这个原因和许多其他原因,通常更安全使用外部模板引擎。诸如 Express 之类的应用程序可以轻松实现这一点。

现在我们有一个简单的基于 HTTP 的 Web 应用程序。为了更多地了解 HTTP 事件,让我们为监听所有 HTTP 事件的模块添加一个。

## HTTP Sniffer - 监听 HTTP 对话

HTTPServer对象发出的事件可以用于除了传递 Web 应用程序的直接任务之外的其他目的。以下代码演示了一个有用的模块,它监听所有HTTPServer事件。这可能是一个有用的调试工具,还演示了HTTPServer对象的操作方式。

Node.js 的HTTPServer对象是一个EventEmitter对象,而 HTTP Sniffer 只是监听每个服务器事件,打印出与每个事件相关的信息。

创建一个名为httpsniffer.mjs的文件,其中包含以下代码:
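
一个简化版本大致如下(真实的示例会监听更多事件,这里只保留几个有代表性的):

```js
const timestamp = () => new Date().toISOString();

export function sniffOn(server) {
    server.on('connection', () => {
        console.log(`${timestamp()} connection`);
    });
    server.on('request', (req, res) => {
        console.log(`${timestamp()} request ${req.method} ${req.url}`);
    });
    server.on('close', () => {
        console.log(`${timestamp()} close`);
    });
    server.on('clientError', (err, socket) => {
        console.log(`${timestamp()} clientError ${err.message}`);
        socket.destroy();
    });
}
```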


The key here is the `sniffOn` function. When given an `HTTPServer` object, it attaches listener functions to each `HTTPServer` event to print relevant data. This gives us a fairly detailed trace of the HTTP traffic on an application.

In order to use it, make two simple modifications to `server.mjs`. To the top, add the following `import` statement:

然后,按照以下方式更改服务器设置:


Here, we're importing the `sniffOn` function and then using it to attach listener methods to the `server` object.

With this in place, run the server as we did earlier. You can visit `http://localhost:8124/` in your browser and see the following console output:

现在您有一个用于窥探HTTPServer事件的工具。这种简单的技术打印出事件数据的详细日志。这种模式可以用于任何EventEmitter对象。您可以使用这种技术来检查程序中EventEmitter对象的实际行为。

在我们继续使用 Express 之前,我们需要讨论为什么要使用应用程序框架。

# Web 应用程序框架

HTTPServer对象与 HTTP 协议非常接近。虽然这在某种程度上很强大,就像驾驶手动挡汽车可以让您对驾驶体验进行低级控制一样,但典型的 Web 应用程序编程最好在更高的级别上完成。有人使用汇编语言来编写 Web 应用程序吗?最好将 HTTP 细节抽象出来,集中精力放在应用程序上。

Node.js 开发者社区已经开发了相当多的应用程序框架,以帮助抽象 HTTP 协议细节的不同方面。在这些框架中,Express 是最受欢迎的,而 Koa(koajs.com/)应该被考虑,因为它完全集成了对异步函数的支持。

Express.js 维基上列出了建立在 Express.js 之上或与其一起使用的框架和工具。这包括模板引擎、中间件模块等。Express.js 维基位于github.com/expressjs/express/wiki

使用 Web 框架的一个原因是它们通常具有在 Web 应用程序开发中使用了 20 多年的最佳实践的经过充分测试的实现。通常的最佳实践包括以下内容:

  • 提供一个用于错误 URL 的页面(404页面)

  • 筛选 URL 和表单以防注入脚本攻击

  • 支持使用 cookie 来维护会话

  • 记录请求以进行使用跟踪和调试

  • 认证

  • 处理静态文件,如图像、CSS、JavaScript 或 HTML

  • 提供缓存控制头以供缓存代理使用

  • 限制页面大小或执行时间等事项

Web 框架帮助您将时间投入到任务中,而不会迷失在实现 HTTP 协议的细节中。抽象化细节是程序员提高效率的一种历史悠久的方式。当使用提供预打包函数来处理细节的库或框架时,这一点尤其正确。

考虑到这一点,让我们转向使用 Express 实现的一个简单应用程序。

# 开始使用 Express

Express 可能是最受欢迎的 Node.js Web 应用程序框架。Express 被描述为类似于 Sinatra,这是一个流行的 Ruby 应用程序框架。它也被认为不是一种武断的框架,这意味着框架作者不会对应用程序的结构施加自己的意见。这意味着 Express 对代码的结构并不严格;您只需按照您认为最好的方式编写即可。

您可以访问 Express 的主页expressjs.com/

截至撰写本书时,Express 4.17 是当前版本,Express 5 正在进行 alpha 测试。根据 Express.js 网站,Express 4 和 Express 5 之间几乎没有什么区别。

让我们首先安装express-generator。虽然我们可以直接开始编写一些代码,但express-generator提供了一个空白的起始应用程序,我们将使用它并进行修改。

使用以下命令安装express-generator
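
命令大致如下(目录名沿用了我们稍后要用到的 fibonacci):

```shell
$ mkdir fibonacci && cd fibonacci
$ npm install express-generator@4.x
```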


This is different from the suggested installation method on the Express website, which says to use the `-g` tag for a global installation. We're also using an explicit version number to ensure compatibility. As of the time of writing, `express-generator@5.x` does not exist, but it should exist sometime in the future. The instructions here are written for Express 4.x, and by explicitly naming the version, we're ensuring that we're all on the same page.

Earlier, we discussed how many people now recommend against installing modules globally. Maybe they would consider `express-generator` as an exception to that rule, or maybe not. In any case, we're not following the recommendation on the Express website, and toward the end of this section, we'll have to uninstall `express-generator`.

The result of this is that an `express` command is installed in the `./node_modules/.bin` directory:

运行express命令,如下所示:


We probably don't want to type `./node_modules/.bin/express` every time we run the `express-generator` application, or, for that matter, any of the other applications that provide command-line utilities. Refer back to the discussion we had in Chapter 3, *Exploring Node.js Modules*, about adding this directory to the `PATH` variable. Alternatively, the `npx` command, also described in Chapter 3, *Exploring Node.js Modules*, is useful for this.

For example, try using the following instead of installing `express-generator`:

这样执行完全相同,无需安装express-generator,并且(我们马上会看到)在使用命令结束时记得卸载它。

现在,您已经在fibonacci目录中安装了express-generator,使用它来设置空白框架应用程序:


This creates a bunch of files for us, which we'll walk through in a minute. We asked it to initialize the use of the Handlebars template engine and to initialize a `git` repository. 

The `node_modules` directory still has the `express-generator` module, which is no longer useful. We can just leave it there and ignore it, or we can add it to `devDependencies` of the `package.json` file that it generated. Most likely, we will want to uninstall it:

这将卸载express-generator工具。接下来要做的是按照我们被告知的方式运行空白应用程序。npm start命令依赖于提供的package.json文件的一个部分:


It's cool that the Express team showed us how to run the server by initializing the `scripts` section in `package.json`. The `start` script is one of the scripts that correspond to the `npm` sub-commands. The instructions we were given, therefore, say to run `npm start`.

The steps are as follows:

1.  Install the dependencies with `npm install`.
2.  Start the application by using `npm start`.
3.  Optionally, modify `package.json` to always run with debugging.

To install the dependencies and run the application, type the following commands:

以这种方式设置DEBUG变量会打开调试输出,其中包括有关监听端口3000的消息。否则,我们不会得到这些信息。这种语法是在 Bash shell 中使用环境变量运行命令的方式。如果在运行npm start时出错,请参考下一节。

我们可以修改提供的npm start脚本,始终使用启用调试的应用程序。将scripts部分更改为以下内容:


Since the output says it is listening on port `3000`, we direct our browser to
`http://localhost:3000/` and see the following output:

![](https://gitee.com/OpenDocCN/freelearn-node-zh/raw/master/docs/node-webdev-5e/img/2a2b1a0f-a6e2-43da-9945-8a49e19b8dbf.png)

Cool, we have some running code. Before we start changing the code, we need to discuss how to set environment variables in Windows.

## Setting environment variables in the Windows cmd.exe command line

If you're using Windows, the previous example may have failed, displaying an error that says `DEBUG` is not a known command. The problem is that the Windows shell, the `cmd.exe` program, does not support the Bash command-line structure.

Adding `VARIABLE=value` to the beginning of a command line is specific to some shells, such as Bash, on Linux and macOS. It sets that environment variable only for the command line that is being executed and is a very convenient way to temporarily override environment variables for a specific command.

Clearly, a solution is required if you want to be able to use your `package.json` file across different operating systems.

The best solution appears to be using the `cross-env` package in the `npm` repository; refer to [`www.npmjs.com/package/cross-env`](https://www.npmjs.com/package/cross-env) for more information.

With this package installed, commands in the `scripts` section in `package.json` can set environment variables just as in Bash on Linux/macOS. The use of this package looks as follows:

然后,执行以下命令:


We now have a simple way to ensure the scripts in `package.json` are cross-platform. Our next step is a quick walkthrough of the generated application.

## Walking through the default Express application

We now have a working, blank Express application; let's look at what was generated for us. We do this to familiarize ourselves with Express before diving in to start coding our **Fibonacci** application.

Because we used the `--view=hbs` option, this application is set up to use the Handlebars.js template engine. 

For more information about Handlebars.js, refer to its home page at [`handlebarsjs.com/`](http://handlebarsjs.com/). The version shown here has been packaged for use with Express and is documented at [`github.com/pillarjs/hbs`](https://github.com/pillarjs/hbs). 

Generally speaking, a template engine makes it possible to insert data into generated web pages. The Express.js wiki has a list of template engines for Express ([`github.com/expressjs/express/wiki#template-engines`](https://github.com/expressjs/express/wiki#template-engines)).

Notice that the JavaScript files are generated as CommonJS modules. The `views` directory contains two files—`error.hbs` and `index.hbs`. The `hbs` extension is used for Handlebars files. Another file, `layout.hbs`, is the default page layout. Handlebars has several ways to configure layout templates and even partials (snippets of code that can be included anywhere).

The `routes` directory contains the initial routing setup—that is, code to handle specific URLs. We'll modify this later.

The `public` directory contains assets that the application doesn't generate but are simply sent to the browser. What's initially installed is a CSS file, `public/stylesheets/style.css`. The `package.json` file contains our dependencies and other metadata.

The `bin` directory contains the `www` script that we saw earlier. This is a Node.js script that initializes the `HTTPServer` objects, starts listening on a TCP port, and calls the last file that we'll discuss, `app.js`. These scripts initialize Express and hook up the routing modules, as well as other things.

There's a lot going on in the `www` and `app.js` scripts, so let's start with the application initialization. Let's first take a look at a couple of lines in `app.js`:
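
The relevant lines look roughly like this:

```js
var express = require('express');
// ...
var app = express();
// ...
module.exports = app;
```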

这意味着app.js是一个 CommonJS 模块,它导出了由express模块生成的应用程序对象。我们在app.js中的任务是配置该应用程序对象。但是,这个任务不包括启动HTTPServer对象。

现在,让我们转向bin/www脚本。在这个脚本中启动了 HTTP 服务器。首先要注意的是它以以下行开始:


This is a Unix/Linux technique to make a command script. It says to run the following as a script using the `node` command. In other words, we have Node.js code and we're instructing the operating system to execute that code using the Node.js runtime:

我们还可以看到,`express-generator`已经将该脚本设置为可执行文件。

它调用app.js模块,如下所示:
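
相关的几行大致如下(`normalizePort`、`onError`和`onListening`是同一文件中稍后定义的辅助函数):

```js
var app = require('../app');
var debug = require('debug')('fibonacci:server');
var http = require('http');

var port = normalizePort(process.env.PORT || '3000');
app.set('port', port);

var server = http.createServer(app);

server.listen(port);
server.on('error', onError);
server.on('listening', onListening);
```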


Namely, it loads the module in `app.js`, gives it a port number to use, creates the `HTTPServer` object, and starts it up.

We can see where port `3000` comes from; it's a parameter to the `normalizePort` function. We can also see that setting the `PORT` environment variable will override the default port `3000`. Finally, we can see that the `HTTPServer` object is created here and is told to use the application instance created in `app.js`. Try running the following command:

通过为PORT指定环境变量,我们可以告诉应用程序监听端口4242,您可以在那里思考生活的意义。

接下来将app对象传递给http.createServer()。查看 Node.js 文档告诉我们,这个函数接受requestListener,它只是一个接受我们之前看到的requestresponse对象的函数。因此,app对象是相同类型的函数。

最后,bin/www脚本启动了服务器监听进程,监听我们指定的端口。

现在让我们更详细地了解app.js
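
先看视图引擎相关的两行设置(与生成的代码基本一致):

```js
// view engine setup
app.set('views', path.join(__dirname, 'views'));
app.set('view engine', 'hbs');
```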


This tells Express to look for templates in the `views` directory and to use the Handlebars templating engine.

The `app.set` function is used to set the application properties. It'll be useful to browse the API documentation as we go through ([`expressjs.com/en/4x/api.html`](http://expressjs.com/en/4x/api.html)).

Next is a series of `app.use` calls:
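
In the generated file, they look roughly like this (logger, cookieParser, and the two routers are required at the top of app.js):

```js
app.use(logger('dev'));
app.use(express.json());
app.use(express.urlencoded({ extended: false }));
app.use(cookieParser());
app.use(express.static(path.join(__dirname, 'public')));

app.use('/', indexRouter);
app.use('/users', usersRouter);
```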

app.use函数挂载中间件函数。这是 Express 术语中的重要部分,我们很快会讨论。目前,让我们说中间件函数在处理请求时被执行。这意味着app.js中启用了这里列出的所有功能:

静态文件 Web 服务器安排通过 HTTP 请求提供命名目录中的文件。使用此配置,public/stylesheets/style.css文件可在http://HOST/stylesheets/style.css上访问。

我们不应该感到受限于以这种方式设置 Express 应用程序。这是 Express 团队的建议,但我们并不受限于以另一种方式设置它。例如,在本书的后面部分,我们将完全将其重写为 ES6 模块,而不是坚持使用 CommonJS 模块。一个明显的遗漏是未捕获异常和未处理的 Promise 拒绝的处理程序。我们稍后会在本书中讨论这两者。

接下来,我们将讨论 Express 的中间件函数。

## 理解 Express 中间件

让我们通过讨论 Express 中间件函数为我们的应用程序做了什么来完成对app.js的漫游。中间件函数参与处理请求并将结果发送给 HTTP 客户端。它们可以访问requestresponse对象,并且预期处理它们的数据,也许向这些对象添加数据。例如,cookie 解析中间件解析 HTTP cookie 头,以记录浏览器发送的 cookie 在request对象中。

我们在脚本的最后有一个例子:
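
生成的代码大致如下(`createError`来自文件顶部 require 进来的 http-errors 模块):

```js
// catch 404 and forward to error handler
app.use(function(req, res, next) {
  next(createError(404));
});
```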


The comment says `catch 404 and forward it to the error handler`. As you probably know, an HTTP `404` status means the requested resource was not found. We need to tell the user that their request wasn't satisfied, and maybe show them something such as a picture of a flock of birds pulling a whale out of the ocean. This is the first step in doing this. Before getting to the last step of reporting this error, you need to learn how middleware works.

The name *middleware* implies software that executes in the middle of a chain of processing steps.

Refer to the documentation about middleware at [`expressjs.com/en/guide/writing-middleware.html`](http://expressjs.com/en/guide/writing-middleware.html).

Middleware functions take three arguments. The first two—`request` and `response`—are equivalent to the `request` and `response` objects of the Node.js HTTP request object. Express expands these objects with additional data and capabilities. The last argument, `next`, is a callback function that controls when the request-response cycle ends, and it can be used to send errors down the middleware pipeline.

As an aside, one critique of Express is that it was written prior to the existence of Promises and async functions. Therefore, its design is fully enmeshed with the callback function pattern. We can still use async functions, but integrating with Express requires using the callback functions it provides.

The overall architecture is set up so that incoming requests are handled by zero or more middleware functions, followed by a router function, which sends the response. The middleware functions call `next`, and in a normal case, provide no arguments by calling `next()`. If there is an error, the middleware function indicates the error by calling `next(err)`, as shown here.

For each middleware function that executes, there is, in theory, several other middleware functions that have already been executed, and potentially several other functions still to be run. It is required to call `next` to pass control to the next middleware function.

What happens if `next` is not called? There is one case where we must not call `next`. In all other cases, if `next` is not called, the HTTP request will hang because no response will be given. 

What is the one case where we must not call `next`? Consider the following hypothetical router function:
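
Something along these lines (the route path is made up):

```js
app.get('/api/hello', (req, res) => {
    res.send('Hello, world!');
});
```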

这不调用next,而是调用res.send。对于response对象上的某些函数,如res.sendres.render,会发送 HTTP 响应。这是通过发送响应(res.send)来结束请求-响应循环的正确方法。如果既不调用next也不调用res.send,则请求永远不会得到响应,请求的客户端将挂起。

因此,中间件函数执行以下四种操作中的一种:

  • 执行自己的业务逻辑。前面显示的请求记录中间件就是一个例子。

  • 修改requestresponse对象。body-parser

cookie-parser执行此操作,查找要添加到request对象的数据。

  • 调用next以继续下一个中间件函数,或者以其他方式发出错误信号。

  • 发送响应,结束循环。

中间件执行的顺序取决于它们添加到app对象的顺序。添加的第一个函数首先执行,依此类推。

接下来要理解的是请求处理程序以及它们与中间件函数的区别。

中间件和请求处理程序的对比

到目前为止,我们已经看到了两种中间件函数。在一种中,第一个参数是处理程序函数。在另一种中,第一个参数是包含 URL 片段的字符串,第二个参数是处理程序函数。

实际上,app.use有一个可选的第一个参数:中间件挂载的路径。该路径是对请求 URL 的模式匹配,并且如果 URL 匹配模式,则触发给定的函数。甚至有一种方法可以在 URL 中提供命名参数:


This path specification has a pattern, `id`, and the value will land in `req.params.id`. In an Express route, this `:id` pattern marks a **route parameter**. The pattern will match a URL segment, and the matching URL content will land and be available through the `req.params` object. In this example, we're suggesting a user profile service and that for this URL, we want to display information about the named user.

As Express scans the available functions to execute, it will try to match this pattern against the request URL. If they match, then the router function is invoked.

It is also possible to match based on the HTTP request method, such as `GET` or `PUT`. Instead of `app.use`, we would write `app.METHOD`—for example, `app.get` or `app.put`. The preceding example would, therefore, be more likely to appear as follows:

GET的所需行为是检索数据,而PUT的行为是存储数据。然而,如上所述的示例,当处理程序函数仅对GET动词正确时,它将匹配任一 HTTP 方法。但是,使用app.get,如本例中的情况,确保应用程序正确匹配所需的 HTTP 方法。

最后,我们来到了Router对象。这是一种专门用于根据其 URL 路由请求的中间件。看一下routes/users.js
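
生成的文件内容大致如下:

```js
var express = require('express');
var router = express.Router();

/* GET users listing. */
router.get('/', function(req, res, next) {
  res.send('respond with a resource');
});

module.exports = router;
```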


We have a module that creates a `router` object, then adds one or more `router` functions. It makes the `Router` object available through `module.exports` so that `app.js` can use it. This router has only one route, but `router` objects can have any number of routes that you think is appropriate.

This one route matches a `GET` request on the `/` URL. That's fine until you notice that in `routes/index.js`, there is a similar `router` function that also matches `GET` requests on the `/` URL.

Back in `app.js`, `usersRouter` is added, as follows:

这将router对象及其零个或多个路由函数挂载到/users URL 上。当 Express 寻找匹配的路由函数时,首先扫描附加到app对象的函数,对于任何路由器对象,它也会扫描其函数。然后调用与请求匹配的任何路由函数。

回到/ URL 的问题,router实际上挂载在/users URL 上是很重要的。这是因为它考虑匹配的实际 URL 是挂载点(/users)与router函数中的 URL 连接起来的。

效果是为了匹配附加到router对象的router函数,请求 URL 的挂载前缀被剥离。因此,使用该挂载点,/users/login的传入 URL 将被剥离为/login,以便找到匹配的router函数。

由于并非一切都按计划进行,我们的应用程序必须能够处理错误指示并向用户显示错误消息。

错误处理

现在,我们终于可以回到生成的app.js文件,404 Error page not found错误,以及应用程序可能向用户显示的任何其他错误。

中间件函数通过将值传递给next函数调用来指示错误,即通过调用next(err)。一旦 Express 看到错误,它将跳过任何剩余的非错误路由,并仅将错误传递给错误处理程序。错误处理程序函数的签名与我们之前看到的不同。

在我们正在检查的app.js中,以下是我们的错误处理程序,由express-generator提供:
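
生成的错误处理程序大致如下:

```js
// error handler
app.use(function(err, req, res, next) {
  // set locals, only providing error in development
  res.locals.message = err.message;
  res.locals.error = req.app.get('env') === 'development' ? err : {};

  // render the error page
  res.status(err.status || 500);
  res.render('error');
});
```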


Error handler functions take four parameters, with `err` added to the familiar `req`, `res`, and `next` functions.

Remember that `res` is the response object, and we use it to set up the HTTP response sent to the browser; even though there is an error, we still send a response.

Using `res.status` sets the HTTP response status code. In the simple application that we examined earlier, we used `res.writeHead` to set not only the status code but also the **Multipurpose Internet Mail Extensions** (**MIME**) type of the response.

The `res.render` function takes data and renders it through a template. In this case, we're using the template named `error`. This corresponds to the `views/error.hbs` file, which looks as follows:

在 Handlebars 模板中,{{value}}标记意味着将表达式或变量的值替换到模板中。此模板引用的messageerror是通过设置res.locals提供的,如下所示。

要查看错误处理程序的操作,请将以下内容添加到routes/index.js
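
例如,可以加入类似下面这样的路由(具体写法只是示意,关键是通过`next`传出一个带有`message`的错误对象):

```js
router.get('/error', function(req, res, next) {
  next({ status: 404, message: 'Fake error' });
});
```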


This is a route handler, and going by what we've said, it simply generates an error indication. In a real route handler, the code would make some kind of query, gathering up data to show to the user, and it would indicate an error only if something happened along the way. However, we want to see the error handler in action.

By calling `next(err)`, as mentioned, Express will call the error handler function, causing an error response to pop up in the browser:

![](https://gitee.com/OpenDocCN/freelearn-node-zh/raw/master/docs/node-webdev-5e/img/f918cdd1-1894-448d-afbb-f385f8d2bb2a.png)

Indeed, at the `/error` URL, we get the Fake error message, which matches the error data sent by the route handler function.

In this section, we've created for ourselves a foundation for how Express works. Let's now turn to an Express application that actually performs a function.

# Creating an Express application to compute Fibonacci numbers

As we discussed in Chapter 1, *About Node.js* we'll be using an inefficient algorithm to calculate Fibonacci numbers to explore how to mitigate performance problems, and along the way, we'll learn how to build a simple REST service to offload computation to the backend server.

The Fibonacci numbers are the following integer sequence:

*0, 1, 1, 2, 3, 5, 8, 13, 21, 34, ... *

Each Fibonacci number is the sum of the previous two numbers in the sequence. This sequence was discovered in 1202 by Leonardo of Pisa, who was also known as Fibonacci. One method to calculate entries in the Fibonacci sequence is using the recursive algorithm, which we discussed in Chapter 1, *About Node.js*. We will create an Express application that uses the Fibonacci implementation and along the way, we will get a better understanding of Express applications, as well as explore several methods to mitigate performance problems in computationally intensive algorithms.

Let's start with the blank application we created in the previous step. We named that application `Fibonacci` for a reason—we were thinking ahead!

In `app.js`, make the following changes to the top portion of the file:

这大部分是express-generator给我们的。var语句已更改为const,以获得更多的舒适度。我们明确导入了hbs模块,以便进行一些配置。我们还导入了一个Fibonacci的路由模块,我们马上就会看到。

对于Fibonacci应用程序,我们不需要支持用户,因此已删除了路由模块。我们将在接下来展示的routes/fibonacci.js模块用于查询我们将计算斐波那契数的数字。

在顶级目录中,创建一个名为math.js的文件,其中包含以下极其简单的斐波那契实现:
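
一个与书中等价的简单实现(CommonJS 风格)大致如下:

```js
exports.fibonacci = function(n) {
    if (n === 0) return 0;
    else if (n === 1 || n === 2) return 1;
    else return exports.fibonacci(n - 1) + exports.fibonacci(n - 2);
};
```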


In the `views` directory, look at the file named `layout.hbs`, which was created by `express-generator`:

该文件包含我们将用于 HTML 页面的结构。根据 Handlebars 语法,我们可以看到{{title}}出现在 HTMLtitle标记中。这意味着当我们调用res.render时,我们应该提供一个title属性。{{{body}}}标记是view模板内容的落脚点。

views/index.hbs更改为只包含以下内容:


This serves as the front page of our application. It will be inserted in place of `{{{body}}}` in `views/layout.hbs`. The marker, `{{> navbar}}`, refers to a partially named `navbar` object. Earlier, we configured a directory named `partials` to hold partials. Now, let's create a file, `partials/navbar.html`, containing the following:

这将作为包含在每个页面上的导航栏。

创建一个名为views/fibonacci.hbs的文件,其中包含以下代码:


If `fiboval` is set, this renders a message that for a given number (`fibonum`), we have calculated the corresponding Fibonacci number. There is also an HTML form that we can use to enter a `fibonum` value.

Because it is a `GET` form, when the user clicks on the Submit button, the browser will issue an HTTP `GET` method to the `/fibonacci` URL. What distinguishes one `GET` method on `/fibonacci` from another is whether the URL contains a query parameter named `fibonum`. When the user first enters the page, there is no `fibonum` number and so there is nothing to calculate. After the user has entered a number and clicked on Submit, there is a `fibonum` number and so something to calculate.

Remember that the files in `views` are templates into which data is rendered. They serve the **view** aspect of the **Model-View-Controller** (**MVC**) paradigm, hence the directory name.

In `routes/index.js`, change the `router` function to the following:

传递给res.render的匿名对象包含我们提供给布局和视图模板的数据值。我们现在传递了一个新的欢迎消息。

最后,在routes目录中,创建一个名为fibonacci.js的文件,其中包含以下代码:
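
下面是一个大致等价的实现(为稳妥起见,这里把查询参数显式转换为整数):

```js
const express = require('express');
const router = express.Router();
const math = require('../math');

router.get('/', function(req, res, next) {
  if (req.query.fibonum) {
    // 有查询参数时计算对应的斐波那契数
    res.render('fibonacci', {
      title: 'Calculate Fibonacci numbers',
      fibonum: req.query.fibonum,
      fiboval: math.fibonacci(Math.floor(req.query.fibonum))
    });
  } else {
    // 没有查询参数时只显示输入表单
    res.render('fibonacci', {
      title: 'Calculate Fibonacci numbers',
      fiboval: undefined
    });
  }
});

module.exports = router;
```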


This route handler says it matches the `/` route. However, there is a route handler in `index.js` that matches the same route. We haven't made a mistake, however. The `router` object created by this module becomes `fibonacciRouter` when it lands in `app.js`. Refer back to `app.js` and you will see that `fibonacciRouter` is mounted on `/fibonacci`. The rule is that the actual URL path matched by a router function is the path that the router is mounted on plus the path given for the router function. In this case, that is `/fibonacci` plus `/`, and for a URL, that equates to `/fibonacci`. 

The handler checks for the existence of `req.query.fibonum`. Express automatically parses the HTTP request URL and any query parameters will land in `req.query`. Therefore, this will trigger a URL such as `/fibonacci?fibonum=5`.

If this value is present, then we call `res.render('fibonacci')` with data including `fibonum`, the number for which we want its Fibonacci number, and `fiboval`, the corresponding Fibonacci number. Otherwise, we pass `undefined` for `fiboval`. If you refer back to the template, if `fiboval` is not set, then the user only sees the form to enter a `fibonum` number. Otherwise, if `fiboval` is set, both `fibonum` and `fiboval` are displayed.

The `package.json` file is already set up, so we can use `npm start` to run the script and always have debugging messages enabled. Now, we're ready to do this:

正如这个示例所暗示的,您可以访问http://localhost:3000/,看看我们有什么:

这个页面是从views/index.hbs模板中渲染出来的。只需点击斐波那契的链接,就可以进入下一个页面,当然,这个页面是从views/fibonacci.hbs模板中渲染出来的。在那个页面上,您可以输入一个数字,点击提交按钮,然后得到一个答案(提示-如果您希望在合理的时间内得到答案,请选择一个小于40的数字):

我们要求您输入一个小于40的数字。继续输入一个更大的数字,比如50,但是请喝杯咖啡,因为这将需要一段时间来计算。或者,继续阅读下一节,我们将开始讨论使用计算密集型代码。

## 计算密集型代码和 Node.js 事件循环

这个斐波那契的例子故意效率低下,以演示应用程序的一个重要考虑因素。当长时间计算运行时,Node.js 事件循环会发生什么?为了看到效果,打开两个浏览器窗口,每个窗口查看斐波那契页面。在一个窗口中,输入数字55或更大,而在另一个窗口中,输入10。注意第二个窗口会冻结,如果您让它运行足够长的时间,答案最终会在两个窗口中弹出。Node.js 事件循环中发生的情况是,由于斐波那契算法正在运行并且从不让出事件循环,事件循环被阻塞无法处理事件。

由于 Node.js 具有单个执行线程,处理请求取决于请求处理程序快速返回到事件循环。通常,异步编码风格确保事件循环定期执行。

即使是从地球的另一端加载数据的请求,也是如此,因为异步请求是非阻塞的,并且控制很快返回到事件循环。我们选择的天真的斐波那契函数不符合这个模型,因为它是一个长时间运行的阻塞操作。这种类型的事件处理程序会阻止系统处理请求,并阻止 Node.js 做它应该做的事情-即成为一个速度极快的 Web 服务器。

在这种情况下,长响应时间的问题是显而易见的。计算斐波那契数的响应时间迅速上升到您可以去西藏度假,成为喇嘛,也许在这段时间内转世为秘鲁的羊驼!然而,也有可能创建一个长响应时间的问题,而不像这个问题那么明显。在大型 Web 服务中的无数异步操作中,哪一个既是阻塞的又需要很长时间来计算结果?像这样的任何阻塞操作都会对服务器吞吐量产生负面影响。

为了更清楚地看到这一点,创建一个名为fibotimes.js的文件,其中包含以下代码:


Now, run it. You will get the following output:

这个方法可以快速计算斐波那契数列的前 40 个成员,但是在第 40 个成员之后,每个结果开始花费几秒钟的时间,并且很快就会变得更糟。在依赖快速返回到事件循环的单线程系统上执行这种代码是不可行的。包含这种代码的 Web 服务会给用户带来糟糕的性能。

在 Node.js 中有两种一般的方法来解决这个问题:

  • 算法重构:也许,就像我们选择的斐波那契函数一样,你的某个算法是次优的,可以重写为更快的。或者,如果不更快,它可以被拆分成通过事件循环分派的回调。我们马上就会看到其中一种方法。

  • 创建后端服务:你能想象一个专门用于计算斐波那契数的后端服务器吗?好吧,也许不行,但实现后端服务器以卸载前端服务器的工作是非常常见的,我们将在本章末实现一个后端斐波那契服务器。

考虑到这一点,让我们来看看这些可能性。

算法重构

为了证明我们手头上有一个人为的问题,这里有一个更有效的斐波那契函数:
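
例如,下面这个迭代版本与书中的思路一致:

```js
exports.fibonacciLoop = function(n) {
    const fibos = [];
    fibos[0] = 0;
    fibos[1] = 1;
    fibos[2] = 1;
    for (let fi = 3; fi <= n; fi++) {
        fibos[fi] = fibos[fi - 2] + fibos[fi - 1];
    }
    return fibos[n];
};
```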


If we substitute a call to `math.fibonacciLoop` in place of `math.fibonacci`, the `fibotimes` program runs much faster. Even this isn't the most efficient implementation; for example, a simple, prewired lookup table is much faster at the cost of some memory.

Edit `fibotimes.js` as follows and rerun the script. The numbers will fly by so fast that your head will spin:

有时,你的性能问题会很容易优化,但有时则不会。

这里的讨论不是关于优化数学库,而是关于处理影响 Node.js 服务器事件吞吐量的低效算法。因此,我们将坚持使用低效的斐波那契实现。

可以将计算分成块,然后通过事件循环分派这些块的计算。将以下代码添加到math.js中:
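
下面的实现与书中思路一致(回调的第一个参数是错误,第二个参数是结果):

```js
exports.fibonacciAsync = function(n, done) {
    if (n === 0) done(undefined, 0);
    else if (n === 1 || n === 2) done(undefined, 1);
    else {
        setImmediate(() => {
            exports.fibonacciAsync(n - 1, (err, val1) => {
                if (err) return done(err);
                setImmediate(() => {
                    exports.fibonacciAsync(n - 2, (err2, val2) => {
                        if (err2) return done(err2);
                        done(undefined, val1 + val2);
                    });
                });
            });
        });
    }
};
```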


This converts the `fibonacci` function from a synchronous function into a traditional callback-oriented asynchronous function. We're using `setImmediate` at each stage of the calculation to ensure that the event loop executes regularly and that the server can easily handle other requests while churning away on a calculation. It does nothing to reduce the computation required; this is still the inefficient Fibonacci algorithm. All we've done is spread the computation through the event loop.

In `fibotimes.js`, we can use the following:

我们又回到了一个低效的算法,但是其中的计算是通过事件循环分布的。运行这个fibotimes.js版本会展示它的低效性。为了在服务器中展示它,我们需要做一些改变。

因为它是一个异步函数,我们需要更改我们的路由器代码。创建一个名为routes/fibonacci-async1.js的新文件,其中包含以下代码:


This is the same code as earlier, just rewritten for an asynchronous Fibonacci calculation. The Fibonacci number is returned via a callback function, and even though we have the beginnings of a callback pyramid, it is still manageable.

In `app.js`, make the following change to the application wiring:

有了这个改变,服务器在计算一个大的斐波那契数时不再冻结。当然,计算仍然需要很长时间,但至少应用程序的其他用户不会被阻塞。

您可以通过再次在应用程序中打开两个浏览器窗口来验证这一点。在一个窗口中输入60,在另一个窗口中开始请求较小的斐波那契数。与原始的fibonacci函数不同,使用fibonacciAsync允许两个窗口都给出答案,尽管如果您确实在第一个窗口中输入了60,那么您可能会去西藏度个三个月的假期:

优化代码和处理可能存在的长时间运行的计算是由你和你的具体算法来选择的。

我们创建了一个简单的 Express 应用程序,并演示了一个影响性能的缺陷。我们还讨论了算法重构,这只剩下我们讨论如何实现后端服务了。但首先,我们需要学习如何创建和访问 REST 服务。

# 进行 HTTPClient 请求

另一种缓解计算密集型代码的方法是将计算推送到后端进程。为了探索这种策略,我们将使用HTTPClient对象从后端斐波那契服务器请求计算。然而,在讨论这个之前,让我们先一般性地讨论一下使用HTTPClient对象。

Node.js 包括一个HTTPClient对象,用于进行 HTTP 请求非常有用。它具有发出任何类型的 HTTP 请求的能力。在本节中,我们将使用HTTPClient对象来进行类似调用 REST web 服务的 HTTP 请求。

让我们从受wgetcurl命令启发的一些代码开始,以便进行 HTTP 请求并显示结果。创建一个名为wget.js的文件,其中包含以下代码:
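
下面是一个大致等价的实现(这里用 WHATWG 的 URL 类来解析命令行参数中的 URL):

```js
const http = require('http');
const util = require('util');

// 从命令行取得要请求的 URL,例如 node wget.js http://example.com/
const target = new URL(process.argv[2]);

const options = {
    host: target.hostname,
    port: target.port || 80,
    path: `${target.pathname}${target.search}`,
    method: 'GET'
};

const req = http.request(options);
req.on('response', res => {
    console.log(`STATUS: ${res.statusCode}`);
    console.log(`HEADERS: ${util.inspect(res.headers)}`);
    res.setEncoding('utf8');
    res.on('data', chunk => { console.log(`BODY: ${chunk}`); });
    res.on('error', err => { console.log(`RESPONSE ERROR: ${err}`); });
});
req.on('error', err => { console.log(`REQUEST ERROR: ${err}`); });
req.end();
```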


We invoke an HTTP request by using `http.request`, passing in an `options` object describing the request. In this case, we're making a `GET` request to the server described in a URL we provide on the command line. When the response arrives, the `response` event is fired and we can print out the response. Likewise, an `error` event is fired on errors, and we can print out the error.

This corresponds to the HTTP protocol, where the client sends a request and receives a response.

You can run the script as follows:

是的,example.com是一个真实的网站——有一天去访问它。在打印输出中还有更多内容,即http://example.com/页面的 HTML。我们所做的是演示如何使用http.request函数调用 HTTP 请求。

options对象非常简单,hostportpath字段指定了请求的 URL。method字段必须是 HTTP 动词之一(GETPUTPOST等)。你还可以为 HTTP 请求中的头部提供一个headers数组。例如,你可能需要提供一个 cookie:


The `response` object is itself an `EventEmitter` object that emits the `data` and `error` events. The `data` event is called as data arrives and the `error` event is, of course, called on errors.

The `request` object is a `WritableStream` object, which is useful for HTTP requests containing data, such as `PUT` or `POST`. This means the `request` object has a `write` function, which writes data to the requester. The data format in an HTTP request is specified by the standard MIME type, which was originally created to give us a better email service. Around 1992, the **World Wide Web** (**WWW**) community worked with the MIME standard committee, who were developing a format for multi-part, multi-media-rich electronic mail. Receiving fancy-looking email is so commonplace today that you might not be aware that email used to come in plaintext. MIME types were developed to describe the format of each piece of data, and the WWW community adopted this for use on the web. HTML forms will post with a content type of `multipart/form-data`, for example.

The next step in offloading some computation to a backend service is to implement the REST service and to make HTTP client requests to that service.

# Calling a REST backend service from an Express application

Now that we've seen how to make HTTP client requests, we can look at how to make a REST query within an Express web application. What that effectively means is making an HTTP `GET` request to a backend server, which responds with the Fibonacci number for the value given in the URL. To do so, we'll refactor the Fibonacci application to make a Fibonacci server that is called from the application. While this is overkill for calculating Fibonacci numbers, it lets us see the basics of implementing a multi-tier application stack in Express.

Inherently, calling a REST service is an asynchronous operation. That means calling the REST service will involve a function call to initiate the request and a callback function to receive the response. REST services are accessed over HTTP, so we'll use the `HTTPClient` object to do so. We'll start this little experiment by writing a REST server and exercising it by making calls to the service. Then, we'll refactor the Fibonacci service to call that server.

## Implementing a simple REST server with Express

While Express can also be used to implement a simple REST service, the parameterized URLs we showed earlier (`/user/profile/:id`) can act like parameters to a REST call. Express makes it easy to return data encoded in JSON format.

Now, create a file named `fiboserver.js`, containing the following code:
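
A sketch of the service, close to the book's version (morgan is already a dependency of the generated application):

```js
const express = require('express');
const logger = require('morgan');
const math = require('./math');

const app = express();
app.use(logger('dev'));

app.get('/fibonacci/:n', (req, res, next) => {
    math.fibonacciAsync(Math.floor(req.params.n), (err, val) => {
        if (err) next(`FIBO SERVER ERROR ${err}`);
        else res.send({ n: req.params.n, result: val });
    });
});

app.listen(process.env.SERVERPORT);
```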

这是一个简化的 Express 应用程序,直接提供 Fibonacci 计算服务。它支持的一个路由使用了我们已经使用过的相同函数来处理 Fibonacci 计算。

这是我们第一次看到res.send的使用。这是一种灵活的发送响应的方式,可以接受一个头部值的数组(用于 HTTP 响应头)和一个 HTTP 状态码。在这里使用时,它会自动检测对象,将其格式化为 JSON 文本,并使用正确的Content-Type参数发送它。

package.json中,将以下内容添加到scripts部分:


This automates launching our Fibonacci service.

Note that we're specifying the TCP/IP port via an environment variable and using that variable in the application. Some suggest that putting configuration data in the environment variable is the best practice.

Now, let's run it:

然后,在一个单独的命令窗口中,我们可以使用curl程序对这个服务发出一些请求:


Over in the window where the service is running, we'll see a log of `GET` requests and how long each request took to process:

这很简单——使用curl,我们可以发出 HTTP GET请求。现在,让我们创建一个简单的客户端程序fiboclient.js,以编程方式调用 Fibonacci 服务:
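
内容大致如下(请求的数字列表只是示例,按从大到小的顺序发出):

```js
const http = require('http');

[ 30, 20, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1 ].forEach(fibonum => {
    console.log(`${new Date().toISOString()} requesting /fibonacci/${fibonum}`);
    const req = http.request({
        host: 'localhost',
        port: process.env.SERVERPORT,
        path: `/fibonacci/${fibonum}`,
        method: 'GET'
    }, res => {
        res.on('data', chunk => {
            console.log(`${new Date().toISOString()} BODY: ${chunk}`);
        });
    });
    req.on('error', err => { console.log(`REQUEST ERROR: ${err}`); });
    req.end();
});
```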


This is our good friend `http.request` with a suitable `options` object. We're executing it in a loop, so pay attention to the order that the requests are made versus the order the responses arrive.

Then, in `package.json`, add the following to the `scripts` section:

然后,运行client应用程序:


We're building our way toward adding the REST service to the web application. At this point, we've proved several things, one of which is the ability to call a REST service in our program.

We also inadvertently demonstrated an issue with long-running calculations. You'll notice that the requests were made from the largest to the smallest, but the results appeared in a very different order. Why? This is because of the processing time required for each request, and the inefficient algorithm we're using. The computation time increases enough to ensure that larger request values have enough processing time to reverse the order.

What happens is that `fiboclient.js` sends all of its requests right away, and then each one waits for the response to arrive. Because the server is using `fibonacciAsync`, it will work on calculating all the responses simultaneously. The values that are quickest to calculate are the ones that will be ready first. As the responses arrive in the client, the matching response handler fires, and in this case, the result prints to the console. The results will arrive when they're ready, and not a millisecond sooner.

We now have enough on our hands to offload Fibonacci calculation to a backend service.

## Refactoring the Fibonacci application to call the REST service

Now that we've implemented a REST-based server, we can return to the Fibonacci application, applying what we've learned to improve it. We will lift some of the code from `fiboclient.js` and transplant it into the application to do this. Create a new file, `routes/fibonacci-rest.js`, with the following code:
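
A draft along the lines of the book's approach (for simplicity, it assumes the JSON response arrives in a single data event):

```js
const express = require('express');
const router = express.Router();
const http = require('http');

router.get('/', function(req, res, next) {
  if (req.query.fibonum) {
    // ask the backend REST service to do the calculation
    const httpreq = http.request({
      host: 'localhost',
      port: process.env.SERVERPORT,
      path: `/fibonacci/${Math.floor(req.query.fibonum)}`,
      method: 'GET'
    });
    httpreq.on('response', response => {
      response.on('data', chunk => {
        const data = JSON.parse(chunk);
        res.render('fibonacci', {
          title: 'Calculate Fibonacci numbers',
          fibonum: req.query.fibonum,
          fiboval: data.result
        });
      });
      response.on('error', err => { next(err); });
    });
    httpreq.on('error', err => { next(err); });
    httpreq.end();
  } else {
    res.render('fibonacci', {
      title: 'Calculate Fibonacci numbers',
      fiboval: undefined
    });
  }
});

module.exports = router;
```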

这是 Fibonacci 路由处理程序的一个新变体,这次调用 REST 后端服务。我们将fiboclient.js中的http.request调用移植过来,并将来自client对象的事件与 Express 路由处理程序集成。在正常的执行路径中,HTTPClient发出一个response事件,包含一个response对象。当该对象发出一个data事件时,我们就有了结果。结果是 JSON 文本,我们可以解析然后作为响应返回给浏览器。

app.js中,进行以下更改:


This, of course, reconfigures it to use the new route handler. Then, in `package.json`, change the `scripts` entry to the following:

我们如何为所有三个scripts条目设置相同的SERVERPORT值?答案是该变量在不同的地方使用方式不同。在startrest中,该变量用于routes/fibonacci-rest.js中,以知道 REST 服务运行在哪个端口。同样,在client中,fiboclient.js使用该变量来达到相同的目的。最后,在server中,fiboserver.js脚本使用SERVERPORT变量来知道要监听哪个端口。

startstartrest中,没有为PORT指定值。在这两种情况下,如果没有指定值,bin/www默认为PORT=3000

在命令窗口中,启动后端服务器,在另一个窗口中,启动应用程序。像之前一样,打开一个浏览器窗口,并发出一些请求。你应该会看到类似以下的输出:


The output looks like this for the application:

因为我们没有改变模板,所以屏幕看起来和之前一样。

我们可能会在这个解决方案中遇到另一个问题。我们低效的 Fibonacci 算法的异步实现可能会导致 Fibonacci 服务进程耗尽内存。在 Node.js 的 FAQ 中,github.com/nodejs/node/wiki/FAQ,建议使用--max_old_space_size标志。你可以将这个标志添加到package.json中,如下所示:


然而,FAQ 中还说,如果你遇到最大内存空间问题,你的应用程序可能需要重构。这回到了我们之前提到的一点,解决性能问题有几种方法,其中之一是对应用程序进行算法重构。

为什么要费力开发这个 REST 服务器,而不直接使用`fibonacciAsync`呢?

主要优势是将这种繁重计算的 CPU 负载推送到一个单独的服务器上。这样做可以保留前端服务器的 CPU 容量,以便它可以处理 Web 浏览器。 GPU 协处理器现在广泛用于数值计算,并且可以通过简单的网络 API 访问。重计算可以保持分离,甚至可以部署一个位于负载均衡器后面的后端服务器集群,均匀分发请求。这样的决策一直在不断地制定,以创建多层系统。

我们所展示的是,在几行 Node.js 和 Express 代码中实现简单的多层 REST 服务是可能的。整个练习让我们有机会思考在 Node.js 中实现计算密集型代码的价值,以及将一个较大的服务拆分成多个服务的价值。

当然,Express 并不是唯一可以帮助我们创建 REST 服务的框架。

## 一些 RESTful 模块和框架

以下是一些可用的包和框架,可以帮助您的基于 REST 的项目:

+   **Restify** ([http://restify.com/](http://restify.com/)):这为 REST 事务的两端提供了客户端和服务器端框架。服务器端 API 类似于 Express。

+   **Loopback** ([`loopback.io/`](http://loopback.io/)):这是 StrongLoop 提供的一个产品。它提供了许多功能,并且当然是建立在 Express 之上的。

在这一部分,我们在创建后端 REST 服务方面取得了很大的成就。

# 总结

在本章中,您学到了很多关于 Node.js 的`EventEmitter`模式、`HTTPClient`和服务器对象,至少有两种创建 HTTP 服务的方法,如何实现 Web 应用程序,甚至如何创建一个 REST 客户端和 REST 服务集成到面向客户的 Web 应用程序中。在这个过程中,我们再次探讨了阻塞操作的风险,保持事件循环运行的重要性,以及在多个服务之间分发工作的几种方法。

现在,我们可以继续实现一个更完整的应用程序:一个用于记笔记的应用程序。在接下来的几章中,我们将使用`Notes`应用程序作为一个工具来探索 Express 应用程序框架、数据库访问、部署到云服务或您自己的服务器、用户身份验证、用户之间的半实时通信,甚至加强应用程序对多种攻击的防御。最终,我们将得到一个可以部署到云基础设施的应用程序。

这本书还有很多内容要涵盖,下一章将从创建一个基本的 Express 应用程序开始。


# 第七章:第二部分:开发 Express 应用程序

本书的核心是从最初的概念开始开发一个 Express 应用程序,该应用程序可以将数据存储在数据库中并支持多个用户。

本节包括以下章节:

+   第五章,*你的第一个 Express 应用程序*

+   第六章,*实现移动优先的范例*

+   第七章,*数据存储和检索*

+   第八章,*使用微服务对用户进行身份验证*

+   第九章,*使用 Socket.IO 进行动态客户端/服务器交互*


您的第一个 Express 应用程序

现在我们已经开始为 Node.js 构建 Express 应用程序,让我们开始开发一个执行有用功能的应用程序。我们将构建的应用程序将保留一个笔记列表,并最终会有用户可以互发消息。在本书的过程中,我们将使用它来探索一些真实 Express Web 应用程序的方面。

在本章中,我们将从应用程序的基本结构、初始 UI 和数据模型开始。我们还将为添加持久数据存储和我们将在后续章节中涵盖的所有其他功能奠定基础。

本章涵盖的主题包括以下内容:

+   在 Express 路由器函数中使用 Promises 和 async 函数

+   JavaScript 类定义和 JavaScript 类中的数据隐藏

+   使用 MVC 范例的 Express 应用程序架构

+   构建 Express 应用程序

+   实现 CRUD 范例

+   Express 应用程序主题和 Handlebars 模板

首先,我们将讨论如何将 Express 路由器回调与 async 函数集成。

# 第八章:在 Express 路由器函数中探索 Promises 和 async 函数的主题

在我们开始开发应用程序之前,我们需要深入了解如何在 Express 中使用`Promise`类和 async 函数,因为 Express 是在这些功能存在之前发明的,因此它不直接与它们集成。虽然我们应该尽可能使用 async 函数,但我们必须了解如何在某些情况下正确使用它们,比如在 Express 应用程序中。

Express 处理异步执行的规则如下:

+   同步错误由 Express 捕获,并导致应用程序转到错误处理程序。

+   异步错误必须通过调用`next(err)`来报告。

+   成功执行的中间件函数告诉 Express 通过调用`next()`来调用下一个中间件。

+   返回 HTTP 请求结果的路由器函数不调用`next()`。

在本节中,我们将讨论三种使用 Promises 和 async 函数的方法,以符合这些规则。

Promise 和 async 函数都用于延迟和异步计算,并且可以使深度嵌套的回调函数成为过去的事情:

+   `Promise`类表示尚未完成但预计将来完成的操作。我们已经使用过 Promises,所以我们知道当承诺的结果(或错误)可用时,`.then`或`.catch`函数会异步调用。

+   在异步函数内部,`await`关键字可用于自动等待 Promise 解析。它返回 Promise 的结果,否则在下一行代码的自然位置抛出错误,同时也适应异步执行。

异步函数的魔力在于我们可以编写看起来像同步代码的异步代码。它仍然是异步代码——意味着它与 Node.js 事件循环正确工作——但是结果和错误不再落在回调函数内部,而是自然地作为异常抛出,结果自然地落在下一行代码上。

因为这是 JavaScript 中的一个新功能,所以我们必须正确地整合几种传统的异步编码实践。您可能会遇到一些其他用于管理异步代码的库,包括以下内容:

+   `async`库是一组用于各种异步模式的函数。它最初完全围绕回调函数范式实现,但当前版本可以处理 async 函数,并且作为 ES6 包可用。有关更多信息,请参阅[`www.npmjs.com/package/async`](https://www.npmjs.com/package/async)。

+   在 Promise 标准化之前,至少有两种实现可用:Bluebird ([`bluebirdjs.com/`](http://bluebirdjs.com/))和 Q ([`www.npmjs.com/package/q`](https://www.npmjs.com/package/q))。如今,我们专注于使用标准内置的`Promise`对象,但这两个包都提供了额外的功能。更有可能的是,我们会遇到使用这些库的旧代码。

这些和其他工具的开发是为了更容易编写异步代码并解决**末日金字塔**问题。这是根据代码在几层嵌套后采取的形状而命名的。任何以回调函数编写的多阶段过程都可能迅速升级为嵌套多层的代码。考虑以下例子:

We don't need to worry about the specific functions, but we should instead recognize that one callback tends to lead to another. Before you know it, you've landed in the middle of a deeply nested structure like this. Rewriting this as an async function will make it much clearer. To get there, we need to examine how Promises are used to manage asynchronous results, as well as get a deeper understanding of async functions.

A Promise is either in an unresolved or resolved state. This means that we create a Promise using new Promise, and initially, it is in the unresolved state. The Promise object transitions to the resolved state, where either its resolve or reject functions are called. If the resolve function is called, the Promise is in a successful state, and if instead its reject function is called, the Promise is in a failed state.

More precisely, Promise objects can be in one of three states:

  • Pending: This is the initial state, which is neither fulfilled nor rejected.
  • Fulfilled: This is the final state, where it executes successfully and produces a result.
  • Rejected: This is the final state, where execution fails.

We generate a Promise in the following way:
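
下面是一个示意性的例子(`delayedDouble` 是虚构的函数,用 `setTimeout` 模拟异步操作),展示创建 Promise 的一般方式:

```javascript
function delayedDouble(x) {
  return new Promise((resolve, reject) => {
    // 模拟一个异步操作:稍后产生结果,或在出错时报告错误
    setTimeout(() => {
      if (typeof x !== 'number') reject(new Error('x 必须是数字'));
      else resolve(x * 2);
    }, 100);
  });
}
```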


这样的函数创建了`Promise`对象,给它一个回调函数,在其中是您的异步操作。`resolve`和`reject`函数被传递到该函数中,并在 Promise 解析为成功或失败状态时调用。`new Promise`的典型用法是这样的结构:
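
作为示意,下面把 Node.js 内置的回调式函数 `fs.readFile` 包装成返回 Promise 的函数(这只是一个草图,实际代码中也可以直接使用 `util.promisify` 或 `fs/promises`):

```javascript
import fs from 'fs';

function readFilePromise(filename) {
  return new Promise((resolve, reject) => {
    fs.readFile(filename, 'utf8', (err, data) => {
      if (err) reject(err);   // 失败:进入 rejected 状态
      else resolve(data);     // 成功:进入 fulfilled 状态
    });
  });
}
```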

This is the pattern that we use when promisifying an asynchronous function that uses callbacks. The asynchronous code executes, and in the callback, we invoke either resolve or reject, as appropriate. We can usually use the util.promisify Node.js function to do this for us, but it's very useful to know how to construct this as needed.

Your caller then uses the function, as follows:
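
沿用上面假设的 `readFilePromise`,调用方大致可以这样使用它:

```javascript
readFilePromise('notes.txt')
  .then(data => { console.log(data); })
  .catch(err => { console.error(err); });
```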


`Promise`对象足够灵活,传递给`.then`处理程序的函数可以返回一些东西,比如另一个 Promise,并且可以将`.then`调用链接在一起。在`.then`处理程序中返回的值(如果有的话)将成为一个新的`Promise`对象,通过这种方式,您可以构建一个`.then`和`.catch`调用链来管理一系列异步操作。

使用`Promise`对象,一系列异步操作被称为**Promise 链**,由链接的`.then`处理程序组成,我们将在下一节中看到。

## 在 Express 路由函数中的 Promise 和错误处理

重要的是要正确处理所有错误并将其报告给 Express。对于同步代码,Express 将正确捕获抛出的异常并将其发送到错误处理程序。看下面的例子:
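
例如,下面这个假设的路由函数同步抛出异常(假定 `app` 是已经创建好的 Express 应用对象):

```javascript
app.get('/sync-fail', function (req, res) {
  // 同步抛出的异常会被 Express 捕获,并交给错误处理程序
  throw new Error('同步失败');
});
```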

Express catches that exception and does the right thing, meaning it invokes the error handler, but it does not see a thrown exception in asynchronous code. Consider the following error example:
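
下面是一个示意性的反例,异常是在异步回调中抛出的:

```javascript
app.get('/async-fail', function (req, res) {
  setTimeout(() => {
    // 这个异常发生在另一个调用栈中,Express 看不到它,
    // 请求因此永远得不到响应
    throw new Error('异步失败');
  }, 100);
});
```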


这是一个错误指示器落在回调函数中不方便的地方的例子。异常在一个完全不同的堆栈帧中抛出,而不是由 Express 调用的堆栈帧。即使我们安排返回一个 Promise,就像异步函数的情况一样,Express 也不处理 Promise。在这个例子中,错误被丢失;调用者永远不会收到响应,也没有人知道为什么。

重要的是要可靠地捕获任何错误,并用结果或错误回应调用者。为了更好地理解这一点,让我们重新编写一下“末日金字塔”示例:
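
下面是一个基于 Promise 链的示意性改写(`doSomething`、`doSomethingElse`、`doFinalThing` 仍为假设的函数,它们各自返回 Promise):

```javascript
router.get('/path/to/something', (req, res, next) => {
  doSomething(req.query.arg)
    .then(result1 => doSomethingElse(result1))
    .then(result2 => doFinalThing(result2))
    .then(finalResult => {
      // 链条的最后一步把响应返回给调用者
      res.render('template', { data: finalResult });
    })
    .catch(err => {
      // 把任何错误报告给 Express
      next(err);
    });
});
```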

This is rewritten using a Promise chain, rather than nested callbacks. What had been a deeply nested pyramid of callback functions is now arguably a little cleaner thanks to Promises.

The Promise class automatically captures all the errors and searches down the chain of operations attached to the Promise to find and invoke the first .catch function. So long as no errors occur, each .then function in the chain is executed in turn.

One advantage of this is that error reporting and handling is much easier. With the callback paradigm, the nature of the callback pyramid makes error reporting trickier, and it's easy to miss adding the correct error handling to every possible branch of the pyramid. Another advantage is that the structure is flatter and, therefore, easier to read.

To integrate this style with Express, notice the following:

  • The final step in the Promise chain uses res.render or a similar function to return a response to the caller.
  • The final catch function reports any errors to Express using next(err).

If instead we simply returned the Promise and it was in the rejected state, Express would not handle that failed rejection and the error would be lost.

Having looked at integrating asynchronous callbacks and Promise chains with Express, let's look at integrating async functions.

## Integrating async functions with Express router functions

There are two problems that need to be addressed that are related to asynchronous coding in JavaScript. The first is the pyramid of doom, an unwieldily nested callback structure. The second is the inconvenience of where results and errors are delivered in an asynchronous callback.

To explain, let's reiterate the example that Ryan Dahl gives as the primary Node.js idiom:
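
这个习惯用法大致如下(这里的 `db.query` 只是示意,并不特指某个具体的数据库库):

```javascript
db.query('SELECT ...', (err, resultSet) => {
  if (err) {
    // 在这里处理错误
  } else {
    // 在这里处理 resultSet
  }
});
```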


这里的目标是避免使用长时间操作阻塞事件循环。使用回调函数推迟处理结果或错误是一个很好的解决方案,也是 Node.js 的基本习惯用法。回调函数的实现导致了这个金字塔形的问题。Promise 帮助扁平化代码,使其不再呈现金字塔形状。它们还捕获错误,确保将其传递到有用的位置。在这两种情况下,错误和结果都被埋在一个匿名函数中,并没有传递到下一行代码。

生成器和迭代协议是一个中间的架构步骤,当与 Promise 结合时,会导致异步函数。我们在本书中不会使用这两者,但值得了解。

有关迭代协议的文档,请参阅[`developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Iteration_protocols`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Iteration_protocols)。

有关生成器函数的文档,请参阅[`developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Generator`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Generator)。

我们已经使用了异步函数,并了解了它们如何让我们编写看起来整洁的异步代码。例如,`db.query`作为异步函数的示例如下:
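
大致如下(同样只是示意,假设 `db.query` 返回 Promise):

```javascript
async function queryDatabase(db) {
  const resultSet = await db.query('SELECT ...');
  // 结果直接落在下一行代码;错误会作为异常抛出
  return resultSet;
}
```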

This is much cleaner, with results and errors landing where we want them to.

However, to discuss integration with Express, let's return to the pyramid of doom example from earlier, rewriting it as an async function:
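
一个示意性的改写如下(函数名仍为假设):

```javascript
router.get('/path/to/something', async (req, res, next) => {
  try {
    const result1 = await doSomething(req.query.arg);
    const result2 = await doSomethingElse(result1);
    const finalResult = await doFinalThing(result2);
    res.render('template', { data: finalResult });
  } catch (err) {
    next(err); // 把异步产生的错误通知给 Express
  }
});
```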


除了`try/catch`,这个例子与之前的形式相比非常干净,无论是作为回调金字塔还是 Promise 链。所有样板代码都被抹去,程序员的意图清晰地展现出来。没有东西丢失在回调函数中。相反,一切都方便地落在下一行代码中。

`await`关键字寻找一个 Promise。因此,`doSomething`和其他函数都应该返回一个 Promise,而`await`管理其解析。这些函数中的每一个都可以是一个异步函数,因此自动返回一个 Promise,或者可以显式创建一个 Promise 来管理异步函数调用。生成器函数也涉及其中,但我们不需要知道它是如何工作的。我们只需要知道`await`管理异步执行和 Promise 的解析。

更重要的是,带有`await`关键字的每个语句都是异步执行的。这是`await`的一个副作用——管理异步执行以确保异步结果或错误被正确传递。然而,Express 无法捕获异步错误,需要我们使用`next()`通知它异步结果。

`try/catch`结构是为了与 Express 集成而需要的。基于刚才给出的原因,我们必须显式捕获异步传递的错误,并使用`next(err)`通知 Express。

在本节中,我们讨论了三种通知 Express 有关异步传递错误的方法。接下来要讨论的是一些架构选择,以便结构化代码。

# 在 MVC 范式中架构 Express 应用程序

Express 不会强制规定你应该如何构建应用程序的**模型**、**视图**和**控制器**(**MVC**)模块的结构,或者是否应该完全遵循任何 MVC 范式。MVC 模式被广泛使用,涉及三个主要的架构组件。**控制器**接受用户的输入或请求,将其转换为发送给模型的命令。**模型**包含应用程序操作的数据、逻辑和规则。**视图**用于向用户呈现结果。

正如我们在上一章中学到的,Express 生成器创建的空应用程序提供了 MVC 模型的两个方面:

+   `views`目录包含模板文件,控制显示部分,对应于视图。

+   `routes`目录包含实现应用程序识别的 URL 并协调生成每个 URL 响应所需的数据操作的代码。这对应于控制器。

由于路由器函数还调用函数来使用模板生成结果,我们不能严格地说路由器函数是控制器,`views`模板是视图。然而,这足够接近 MVC 模型,使其成为一个有用的类比。

这让我们面临一个问题,那就是在哪里放置模型代码。由于相同的数据操作可以被多个路由器函数使用,显然路由器函数应该使用一个独立的模块(或模块)来包含模型代码。这也将确保关注点的清晰分离,例如,以便轻松进行每个单元的测试。

我们将使用的方法是创建一个`models`目录,作为`views`和`routes`目录的同级目录。`models`目录将包含处理数据存储和其他我们可能称之为**业务逻辑**的代码的模块。`models`目录中模块的 API 将提供创建、读取、更新或删除数据项的函数——一个**Create, Read, Update, and Delete/Destroy**(**CRUD**)模型——以及视图代码执行其任务所需的其他函数。

CRUD 模型包括持久数据存储的四个基本操作。`Notes`应用程序被构建为一个 CRUD 应用程序,以演示实现这些操作的过程。

我们将使用`create`、`read`、`update`和`destroy`函数来实现每个基本操作。

我们使用`destroy`动词,而不是`delete`,因为`delete`是 JavaScript 中的保留字。

考虑到这个架构决定,让我们继续创建`Notes`应用程序。

# 创建 Notes 应用程序

由于我们正在启动一个新的应用程序,我们可以使用 Express 生成器给我们一个起点。虽然不一定要使用这个工具,因为我们完全可以自己编写代码。然而,优点在于它给了我们一个完全成熟的起点:

As in the previous chapter, we will use cross-env to ensure that the scripts run cross-platform. Start by changing package.json to have the following scripts section:


提供的脚本使用`bin/www`,但很快,我们将重新构造生成的代码,将所有内容放入一个名为`app.mjs`的单个 ES6 脚本中。

然后,安装`cross-env`,如下所示:

With cross-env, the scripts are executable on either Unix-like systems or Windows.

If you wish, you can run npm start and view the blank application in your browser. Instead, let's rewrite this starting-point code using ES6 modules, and also combine the contents of bin/www with app.mjs.

## Rewriting the generated router module as an ES6 module

Let's start with the routes directory. Since we won't have a Users concept right now, delete users.js. We need to convert the JavaScript files into ES6 format, and we can recall that the simplest way for a module to be recognized as an ES6 module is to use the .mjs extension. Therefore, rename index.js to index.mjs, rewriting it as follows:
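
改写后的 `index.mjs` 大致是这样的骨架(草图,路由函数的具体内容稍后补充):

```javascript
import { default as express } from 'express';
export const router = express.Router();

router.get('/', async (req, res, next) => {
  // 稍后在这里列出所有笔记
});
```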


我们稍后会完成这个,但我们所做的是重新构造我们得到的代码。我们可以导入 Express 包,然后导出`router`对象。添加路由函数当然是以相同的方式进行的,无论是 CommonJS 还是 ES6 模块。我们将路由回调设置为异步函数,因为它将使用异步代码。

我们需要遵循相同的模式来创建任何其他路由模块。

将其转换为 ES6 模块后,下一步是将`bin/www`和`app.js`的代码合并到一个名为`app.mjs`的 ES6 模块中。

## 创建 Notes 应用程序连接 - app.mjs

由于`express-generator`工具给了我们一个略显混乱的应用程序结构,没有使用 ES6 模块,让我们适当地重新构思它给我们的代码。首先,`app.mjs`包含了应用程序的“连接”,意味着它配置了构成应用程序的对象和函数,而不包含任何自己的函数。另一个模块`appsupport.mjs`包含了在生成的`app.js`和`bin/www`模块中出现的回调函数。

在`app.mjs`中,从这里开始:

The generated app.js code had a series of require statements. We have rewritten them to use corresponding import statements. We also added code to calculate the __filename and __dirname variables, but presented a little differently. To support this, add a new module, approotdir.mjs, containing the following:
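
`approotdir.mjs` 的内容大致如下(与第三章的 `dirname-fixed.mjs` 做法相同):

```javascript
import { default as path } from 'path';
import { fileURLToPath } from 'url';

const __filename = fileURLToPath(import.meta.url);
const __dirname  = path.dirname(__filename);

// 导出应用程序根目录的路径,供其他模块计算所需的路径名
export const approotdir = __dirname;
```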


在第三章的`dirname-fixed.mjs`示例中,我们从`path`和`url`核心模块中导入了特定的函数。我们使用了那段代码,然后将`__dirname`的值导出为`approotdir`。Notes 应用程序的其他部分只需要应用程序的根目录的路径名,以便计算所需的路径名。

回到`app.mjs`,你会看到路由模块被导入为`indexRouter`和`notesRouter`。目前,`notesRouter`被注释掉了,但我们将在后面的部分中处理它。

现在,让我们初始化`express`应用程序对象:

This should look familiar to the app.js code we used in the previous chapter. Instead of inline functions, however, they're pushed into appsupport.mjs.

The app and port objects are exported in case some other code in the application needs those values.

This section of code creates and configures the Express application instance. To make it a complete running server, we need the following code:
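
这一部分代码大致如下(片段:假设 `http` 模块已在文件顶部导入,`port` 已在前面计算并导出,`onError` 和 `onListening` 来自 `appsupport.mjs`):

```javascript
export const server = http.createServer(app);

server.listen(port);
server.on('error', onError);         // onError 定义在 appsupport.mjs 中
server.on('listening', onListening); // onListening 同样来自 appsupport.mjs
```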


这段代码将 Express 应用程序包装在 HTTP 服务器中,并让它监听 HTTP 请求。`server`对象也被导出,以便其他代码可以访问它。

将`app.mjs`与生成的`app.js`和`bin/www`代码进行比较,你会发现我们已经覆盖了这两个模块中的所有内容,除了内联函数。这些内联函数可以写在`app.mjs`的末尾,但我们选择创建第二个模块来保存它们。

创建`appsupport.mjs`来保存内联函数,从以下开始:
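
第一个函数是端口号的规范化处理,大致如下:

```javascript
export function normalizePort(val) {
  const port = parseInt(val, 10);
  if (isNaN(port)) {
    return val;   // 不是数字:当作命名管道(named pipe)使用
  }
  if (port >= 0) {
    return port;  // 合法的端口号
  }
  return false;
}
```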

This function handles safely converting a port number string that we might be given into a numerical value that can be used in the application. The isNaN test is used to handle cases where instead of a TCP port number, we want to use a named pipe. Look carefully at the other functions and you'll see that they all accommodate either a numerical port number or a string described as a pipe:


前面的代码处理了来自 HTTP 服务器对象的错误。其中一些错误将简单地导致服务器退出:

The preceding code prints a user-friendly message saying where the server is listening for HTTP connections. Because this function needs to reference the server object, we have imported it:


这些以前是实现 Express 应用程序的错误处理的内联函数。

这些更改的结果是`app.mjs`现在没有分散注意力的代码,而是专注于连接构成应用程序的不同部分。由于 Express 没有固定的意见,它并不在乎我们像这样重构代码。我们可以以任何对我们有意义并且正确调用 Express API 的方式来构建代码结构。

由于这个应用程序是关于存储数据的,让我们接下来谈谈数据存储模块。

## 实现 Notes 数据存储模型

请记住,我们之前决定将数据模型和数据存储代码放入一个名为`models`的目录中,以配合`views`和`routes`目录。这三个目录将分别存储 MVC 范例的三个方面。

这个想法是集中存储数据的实现细节。数据存储模块将提供一个 API 来存储和操作应用程序数据,在本书的过程中,我们将对这个 API 进行多次实现。要在不同的存储引擎之间切换,只需要进行配置更改。应用程序的其余部分将使用相同的 API 方法,无论使用的是哪种存储引擎。

首先,让我们定义一对类来描述数据模型。在`models`目录中创建一个名为`Notes.mjs`的文件,并在其中包含以下代码:
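
该文件的内容大致如下(一个近似的草图,字段名和方法集合以正文描述为准):

```javascript
const _note_key   = Symbol('key');
const _note_title = Symbol('title');
const _note_body  = Symbol('body');

export class Note {
  constructor(key, title, body) {
    this[_note_key]   = key;
    this[_note_title] = title;
    this[_note_body]  = body;
  }
  get key() { return this[_note_key]; }
  get title() { return this[_note_title]; }
  set title(newTitle) { this[_note_title] = newTitle; }
  get body() { return this[_note_body]; }
  set body(newBody) { this[_note_body] = newBody; }
}

export class AbstractNotesStore {
  async close() { }
  async update(key, title, body) { }
  async create(key, title, body) { }
  async read(key) { }
  async destroy(key) { }
  async keylist() { }
  async count() { }
}
```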

This defines two classes—Note and AbstractNotesStore—whose purpose is as follows:

  • The Note class describes a single note that our application will manage.
  • The AbstractNotesStore class describes methods for managing some note instances.

In the Note class, key is how we look for the specific note, and title and body are the content of the note. It uses an important data hiding technique, which we'll discuss in a minute.

The AbstractNotesStore class documents the methods that we'll use for accessing notes from a data storage system. Since we want the Notes application to implement the CRUD paradigm, we have the create, read, update, and destroy methods, plus a couple more to assist in searching for notes. What we have here is an empty class that serves to document the API, and we will use this as the base class for several storage modules that we'll implement later.

The close method is meant to be used when we're done with a datastore. Some datastores keep an open connection to a server, such as a database server, and the close method should be used to close that connection.

This is defined with async functions because we'll store data in the filesystem or in databases. In either case, we need an asynchronous API.

Before implementing our first data storage model, let's talk about data hiding in JavaScript classes.

## Data hiding in ES-2015 class definitions

In many programming languages, class definitions let us designate some data fields as private and others as public. This is so that programmers can hide implementation details. However, writing code on the Node.js platform is all about JavaScript, and JavaScript, in general, is very lax about everything. So, by default, fields in an instance of a JavaScript class are open to any code to access or modify.

One concern arises if you have several modules all adding fields or functions to the same object. How do you guarantee that one module won't step on fields added by another module? By default, in JavaScript, there is no such guarantee.

Another concern is hiding implementation details so that the class can be changed while knowing that internal changes won't break other code. By default, JavaScript fields are open to all other code, and there's no guarantee other code won't access fields that are meant to be private.

The technique used in the Note class gates access to the fields through getter and setter functions. These in turn set or get values stored in the instance of the class. By default, those values are visible to any code, and so these values could be modified in ways that are incompatible with the class. The best practice when designing classes is to localize all manipulation of class instance data to the member functions. However, JavaScript makes the fields visible to the world, making it difficult to follow this best practice. The pattern used in the Note class is the closest we can get in JavaScript to data hiding in a class instance.

The technique we use is to name the fields using instances of the Symbol class. Symbol, another ES-2015 feature, is an opaque object with some interesting attributes that make it attractive for use as keys for private fields in objects. Consider the following code:
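
例如,下面这个可以在 Node.js 中直接运行的小片段:

```javascript
const b  = Symbol('b');
const b1 = Symbol('b');

console.log(b === b1);  // false:即使描述字符串相同,两个实例也不相等
console.log(b);         // Symbol(b)
console.log(b1);        // Symbol(b)
```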


创建`Symbol`实例是通过`Symbol('symbol-name')`完成的。生成的`Symbol`实例是一个唯一标识符,即使再次调用`Symbol('symbol-name')`,唯一性也得到保留。每个`Symbol`实例都是唯一的,即使是由相同的字符串形成的。在这个例子中,`b`和`b1`变量都是通过调用`Symbol('b')`形成的,但它们并不相等。

让我们看看如何使用`Symbol`实例来附加字段到一个对象上:

We've created a little object, then used those Symbol instances as field keys to store data in the object. Notice that when we dump the object's contents, the two fields both register as Symbol(b), but they are two separate fields.

With the Note class, we have used the Symbol instances to provide a small measure of data hiding. The actual values of the Symbol instances are hidden inside Notes.mjs. This means the only code that can directly access the fields is the code running inside Notes.mjs:


定义了`Note`类之后,我们可以创建一个`Note`实例,然后转储它并查看结果字段。这些字段的键确实是`Symbol`实例。这些`Symbol`实例被隐藏在模块内部。这些字段本身对模块外部的代码是可见的。正如我们在这里看到的,企图用`note[Symbol('key')] = 'new key'`来破坏实例并不会覆盖字段,而是会添加第二个字段。

定义了我们的数据类型,让我们从一个简单的内存数据存储开始实现应用程序。

## 实现内存中的笔记数据存储

最终,我们将创建一个`Notes`数据存储模块,将笔记持久化到长期存储中。但是为了让我们开始,让我们实现一个内存数据存储,这样我们就可以继续实现应用程序。因为我们设计了一个抽象基类,我们可以很容易地为各种存储服务创建新的实现。

在`models`目录中创建一个名为`notes-memory.mjs`的文件,其中包含以下代码:
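
该模块大致如下(一个草图,以正文对各方法行为的描述为准):

```javascript
import { Note, AbstractNotesStore } from './Notes.mjs';

const notes = [];   // 私有数组,按 key 索引存放 Note 实例

export class InMemoryNotesStore extends AbstractNotesStore {
  async close() { }
  async update(key, title, body) { return crupdate(key, title, body); }
  async create(key, title, body) { return crupdate(key, title, body); }
  async read(key) {
    if (notes[key]) return notes[key];
    else throw new Error(`Note ${key} does not exist`);
  }
  async destroy(key) {
    if (notes[key]) delete notes[key];
    else throw new Error(`Note ${key} does not exist`);
  }
  async keylist() { return Object.keys(notes); }
  async count() { return Object.keys(notes).length; }
}

function crupdate(key, title, body) {
  notes[key] = new Note(key, title, body);
  return notes[key];
}
```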

This should be fairly self-explanatory. The notes are stored in a private array, named notes. The operations, in this case, are defined in terms of adding or removing items in that array. The key object for each Note instance is used as the index to the notes array, which in turn holds the Note instance. This is simple, fast, and easy to implement. It does not support any long-term data persistence, and any data stored in this model will disappear when the server is killed.

We need to initialize an instance of NotesStore so that it can be used in the application. Let's add the following to app.mjs, somewhere near the top:
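
大致是这样两行(假设内存存储的类名为 `InMemoryNotesStore`):

```javascript
import { InMemoryNotesStore } from './models/notes-memory.mjs';
export const NotesStore = new InMemoryNotesStore();
```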


这将创建一个类的实例并将其导出为`NotesStore`。只要我们有一个单一的`NotesStore`实例,这将起作用,但是在第七章中,*数据存储和检索*,我们将改变这一点,以支持动态选择`NotesStore`实例。

我们现在准备开始实现应用程序的网页和相关代码,从主页开始。

## 笔记主页

我们将修改起始应用程序以支持创建、编辑、更新、查看和删除笔记。让我们从更改主页开始,显示一个笔记列表,并在顶部导航栏中添加一个链接到添加笔记页面,这样我们就可以随时添加新的笔记。

`app.mjs`中不需要更改,因为主页是在这个路由模块中控制的。

In app.mjs, we configured the Handlebars template engine to use the partials directory to hold partial files. Therefore, make sure you create that directory.

To implement the home page, update routes/index.mjs to the following:
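
更新后的 `routes/index.mjs` 大致如下(草图):

```javascript
import { default as express } from 'express';
import { NotesStore as notes } from '../app.mjs';
export const router = express.Router();

router.get('/', async (req, res, next) => {
  try {
    const keylist = await notes.keylist();
    // 并行读取每一条笔记
    const keyPromises = keylist.map(key => notes.read(key));
    const notelist = await Promise.all(keyPromises);
    res.render('index', { title: 'Notes', notelist: notelist });
  } catch (err) {
    next(err);
  }
});
```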


我们之前展示了这个概要,并且已经定义了`Notes`数据存储模型,我们可以填写这个函数。

这使用了我们之前设计的`AbstractNotesStore` API。`keylist`方法返回当前应用程序存储的笔记的键值列表。然后,它使用`read`方法检索每个笔记,并将该列表传递给一个模板,该模板呈现主页。这个模板将呈现一个笔记列表。

如何检索所有的笔记?我们可以编写一个简单的`for`循环,如下所示:
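
例如,下面这个片段(位于上面的异步路由函数内部)逐条读取笔记:

```javascript
const keylist = await notes.keylist();
const notelist = [];
for (const key of keylist) {
  // 一次只读取一条笔记
  notelist.push(await notes.read(key));
}
```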

This has the advantage of being simple to read since it's a simple for loop. The problem is that this loop reads the notes one at a time. It's possible that reading the notes in parallel is more efficient since there's an opportunity to interweave the processing.

The Promise.all function executes an array of Promises in parallel, rather than one at a time. The keyPromises variable ends up being an array of Promises, each of which is executing notes.read to retrieve a single note.

The map function in the arrays converts (or maps) the values of an input array to produce an output array with different values. The output array has the same length as the input array, and the entries are a one-to-one mapping of the input value to an output value. In this case, we map the keys in keylist to a Promise that's waiting on a function that is reading each note. Then, Promise.all waits for all the Promises to resolve into either success or failure states.

The output array, notelist, will be filled with the notes once all the Promises succeed. If any Promises fail, they are rejected—in other words, an exception will be thrown instead.

The notelist array is then passed into the view template that we're about to write.

But first, we need a page layout template. Create a file, views/layout.hbs, containing the following:


这是由`express-generator`生成的文件,还添加了一个用于页面标题的`header`部分。

请记住,在斐波那契应用程序中,我们使用了一个*partial*来存储导航的 HTML 片段。部分是 HTML 模板片段,可以在一个或多个模板中重用。在这种情况下,`header`部分将出现在每个页面上,并作为应用程序中的通用导航栏。创建`partials/header.hbs`,包含以下内容:

This simply looks for a variable, title, which should have the page title. It also outputs a navigation bar containing a pair of links—one to the home page and another to /notes/add, where the user will be able to add a new note.

Now, let's rewrite views/index.hbs to this:


这只是简单地遍历笔记数据数组并格式化一个简单的列表。每个项目都链接到`/notes/view` URL,并带有一个`key`参数。我们还没有编写处理该 URL 的代码,但显然会显示笔记。另一个需要注意的是,如果`notelist`为空,将不会生成列表的 HTML。

当然,还有很多东西可以放进去。例如,通过在这里添加适当的`script`标签,可以很容易地为每个页面添加 jQuery 支持。

我们现在已经写了足够的内容来运行应用程序,让我们查看主页:

If we visit http://localhost:3000, we will see the following page:

Because there aren't any notes (yet), there's nothing to show. Clicking on the Home link just refreshes the page. Clicking on the ADD Note link throws an error because we haven't (yet) implemented that code. This shows that the provided error handler in app.mjs is performing as expected.

Having implemented the home page, we need to implement the various pages of the application. We will start with the page for creating new notes, and then we will implement the rest of the CRUD support.

## Adding a new note – create

If we click on the ADD Note link, we get an error because the application doesn't have a route configured for the /notes/add URL; we need to add one. To do that, we need a controller module for the notes that defines all the pages for managing notes in the application.

In app.mjs, uncomment the two lines dealing with notesRouter:


我们最终会在`app.mjs`中得到这个。我们导入两个路由,然后将它们添加到应用程序配置中。

创建一个名为`routes/notes.mjs`的文件来保存`notesRouter`,并以以下内容开始:
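
文件开头大致如下(草图):

```javascript
import { default as express } from 'express';
import { NotesStore as notes } from '../app.mjs';
export const router = express.Router();

// 添加笔记(注意:这里用的是普通的、非 async 的回调)
router.get('/add', (req, res, next) => {
  res.render('noteedit', {
    title: 'Add a Note',
    docreate: true,
    notekey: '',
    note: undefined
  });
});
```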

This handles the /notes/add URL corresponding to the link in partials/header.hbs. It simply renders a template, noteedit, using the provided data.

In the views directory, add the corresponding template, named noteedit.hbs, containing the following:


这个模板支持创建新笔记和更新现有笔记。我们将通过`docreate`标志重用这个模板来支持这两种情况。

请注意,在这种情况下,传递给模板的`note`和`notekey`对象是空的。模板检测到这种情况,并确保输入区域为空。此外,还传递了一个标志`docreate`,以便表单记录它是用于创建还是更新笔记。在这一点上,我们正在添加一个新的笔记,所以没有`note`对象存在。模板代码被防御性地编写,以避免抛出错误。

创建 HTML 表单时,必须小心使用包含值的元素中的空格。考虑一个情况,`<textarea>`元素被格式化如下:

By normal coding practices, this looks alright, right? It's nicely indented, with the code arranged for easy reading. The problem is that extra whitespace ends up being included in the body value when the form is submitted to the server. That extra whitespace is added because of the nicely indented code. To avoid that extra whitespace, we need to use the angle brackets in the HTML elements that are directly adjacent to the Handlebars code to insert the value. Similar care must be taken with the elements with the value= attributes, ensuring no extra whitespace is within the value string.

This template is a form that will post its data to the /notes/save URL. If you were to run the application now, it would give you an error message because no route is configured for that URL.

To support the /notes/save URL, add it to routes/notes.mjs:
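
该路由函数大致如下(草图,假设表单字段名为 `docreate`、`notekey`、`title` 和 `body`):

```javascript
// 保存笔记:既处理创建,也处理更新
router.post('/save', async (req, res, next) => {
  try {
    if (req.body.docreate === 'create') {
      await notes.create(req.body.notekey, req.body.title, req.body.body);
    } else {
      await notes.update(req.body.notekey, req.body.title, req.body.body);
    }
    res.redirect('/notes/view?key=' + req.body.notekey);
  } catch (err) {
    next(err);
  }
});
```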


因为这个 URL 也将用于创建和更新笔记,所以我们检查`docreate`标志来调用适当的模型操作。

`notes.create`和`notes.update`都是异步函数,这意味着我们必须使用`await`。

这是一个 HTTP `POST` 处理程序。由于`bodyParser`中间件,表单数据被添加到`req.body`对象中。附加到`req.body`的字段直接对应于 HTML 表单中的元素。

在这里,以及大多数其他路由函数中,我们使用了我们之前讨论过的`try/catch`结构,以确保错误被捕获并正确转发给 Express。这与前面的`/notes/add`路由函数的区别在于路由器是否使用异步回调函数。在这种情况下,它是一个异步函数,而对于`/notes/add`,它不是异步的。Express 知道如何处理非异步回调中的错误,但不知道如何处理异步回调函数中的错误。

现在,我们可以再次运行应用程序并使用“添加笔记”表单:

![](https://gitee.com/OpenDocCN/freelearn-node-zh/raw/master/docs/node-webdev-5e/img/c949b296-32e0-4690-be97-a94016e40b5e.png)

然而,点击提交按钮后,我们收到了一个错误消息。这是因为还没有任何东西来实现`/notes/view` URL。

您可以修改`Location`框中的 URL 以重新访问`http://localhost:3000`,然后在主页上看到类似以下截图的内容:

![](https://gitee.com/OpenDocCN/freelearn-node-zh/raw/master/docs/node-webdev-5e/img/94936122-5cf1-4959-999d-0fc10d3766b3.png)

笔记实际上已经存在;我们只需要实现`/notes/view`。让我们继续进行。

## 查看笔记-读取

现在我们已经了解了如何创建笔记,我们需要继续阅读它们。这意味着为`/notes/view` URL 实现控制器逻辑和视图模板。

将以下`router`函数添加到`routes/notes.mjs`中:
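
该路由函数大致如下(草图):

```javascript
// 查看单条笔记
router.get('/view', async (req, res, next) => {
  try {
    const note = await notes.read(req.query.key);
    res.render('noteview', {
      title: note ? note.title : '',
      notekey: req.query.key,
      note: note
    });
  } catch (err) {
    next(err);
  }
});
```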

Because this route is mounted on a router handling, /notes, this route handles /notes/view.

The handler simply calls notes.read to read the note. If successful, the note is rendered with the noteview template. If something goes wrong, we'll instead display an error to the user through Express.

Add the noteview.hbs template to the views directory, referenced by the following code:


这很简单;我们从`note`对象中取出数据,并使用 HTML 显示它。底部有两个链接——一个是到`/notes/destroy`用于删除笔记,另一个是到`/notes/edit`用于编辑它。

这两个对应的代码目前都不存在,但这并不妨碍我们继续执行应用程序:

![](https://gitee.com/OpenDocCN/freelearn-node-zh/raw/master/docs/node-webdev-5e/img/1f551683-feb4-41c4-86a4-c5fcdabafde8.png)

正如预期的那样,使用这段代码,应用程序会正确重定向到`/notes/view`,我们可以看到我们的成果。同样,预期之中,点击删除或编辑链接都会给我们一个错误,因为代码还没有被实现。

接下来我们将创建处理编辑链接的代码,稍后再创建处理删除链接的代码。

## 编辑现有的笔记 - 更新

现在我们已经看过了`create`和`read`操作,让我们看看如何更新或编辑一个笔记。

在`routes/notes.mjs`中添加以下路由函数:
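
大致如下(草图):

```javascript
// 编辑已有笔记
router.get('/edit', async (req, res, next) => {
  try {
    const note = await notes.read(req.query.key);
    res.render('noteedit', {
      title: note ? `Edit ${note.title}` : 'Add a Note',
      docreate: false,
      notekey: req.query.key,
      note: note
    });
  } catch (err) {
    next(err);
  }
});
```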

This handles the /notes/edit URL.

We're reusing the noteedit.hbs template because it can be used for both the create and update/edit operations. Notice that we pass false for docreate, informing the template that it is to be used for editing.

In this case, we first retrieve the note object and then pass it through to the template. This way, the template is set up for editing, rather than note creation. When the user clicks on the Submit button, we end up in the same /notes/save route handler shown in the preceding screenshot. It already does the right thing—calling the notes.update method in the model, rather than notes.create.

Because that's all we need to do, we can go ahead and rerun the application:

Click on the Submit button here and you will be redirected to the /notes/view screen, where you will then be able to read the newly edited note. Back at the /notes/view screen, we've just taken care of the Edit link, but the Delete link still produces an error.

Therefore, we next need to implement a page for deleting notes.

## Deleting notes – destroy

Now, let's look at how to implement the /notes/destroy URL to delete notes.

Add the following router function to routes/notes.mjs:


销毁一个笔记是一个重要的步骤,因为如果用户犯了错误,就没有垃圾桶可以从中恢复。因此,我们需要询问用户是否确定要删除笔记。在这种情况下,我们检索笔记,然后呈现以下页面,显示一个问题以确保他们确定要删除笔记。

在`views`目录中添加一个`notedestroy.hbs`模板:

This is a simple form that asks the user to confirm by clicking on the button. The Cancel link just sends them back to the /notes/view page. Clicking on the Submit button generates a POST request on the /notes/destroy/confirm URL.

This URL needs a request handler. Add the following code to routes/notes.mjs:
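
该处理程序大致如下(草图):

```javascript
// 确认后真正删除笔记
router.post('/destroy/confirm', async (req, res, next) => {
  try {
    await notes.destroy(req.body.notekey);
    res.redirect('/');
  } catch (err) {
    next(err);
  }
});
```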


这调用模型中的`notes.destroy`函数。如果成功,浏览器将重定向到主页。如果不成功,会向用户显示错误消息。重新运行应用程序,我们现在可以看到它在运行中的样子:

![](https://gitee.com/OpenDocCN/freelearn-node-zh/raw/master/docs/node-webdev-5e/img/bafe62b3-9c7a-4c38-a32a-e0ca175fad05.png)

现在应用程序中的一切都在运行,您可以点击任何按钮或链接,并保留所有想要的笔记。

我们已经实现了一个简单的笔记管理应用程序。现在让我们看看如何改变外观,因为在下一章中,我们将实现一个移动优先的用户界面。

# 为您的 Express 应用程序设置主题

Express 团队在确保 Express 应用程序一开始看起来不错方面做得相当不错。我们的`Notes`应用程序不会赢得任何设计奖,但至少它不丑陋。现在基本应用程序正在运行,有很多方法可以改进它。让我们快速看看如何为 Express 应用程序设置主题。在第六章*实现移动优先范式*中,我们将深入探讨这一点,重点关注解决移动市场这一重要目标。

如果您正在使用推荐的方法`npm start`运行`Notes`应用程序,控制台窗口中将打印出一条不错的活动日志。其中之一是以下内容:

This is due to the following line of code, which we put into layout.hbs:


这个文件是由 Express 生成器在一开始为我们自动生成的,并且被放在`public`目录中。`public`目录由 Express 静态文件服务器管理,使用`app.mjs`中的以下行:

Therefore, the CSS stylesheet is at public/stylesheets/style.css, so let's open it and take a look:


一个显眼的问题是应用程序内容在屏幕顶部和左侧有很多空白。原因是`body`标签有`padding: 50px`样式。更改它很快。

由于 Express 静态文件服务器中没有缓存,我们可以简单地编辑 CSS 文件并重新加载页面,CSS 也将被重新加载。

让我们做一些调整:

This changes the padding and also adds a gray box around the header area.

As a result, we'll have the following:

We're not going to win any design awards with this either, but there's the beginning of some branding and theming possibilities. More importantly, it proves that we can make edits to the theming.

Generally speaking, through the way that we've structured the page templates, applying a site-wide theme is just a matter of adding appropriate code to layout.hbs, along with appropriate stylesheets and other assets.

In Chapter 6, Implementing the Mobile-First Paradigm, we will look at a simple method to add these frontend libraries to your application.

Before closing out this chapter, we want to think ahead to scaling the application to handle multiple users.

# Scaling up – running multiple Notes instances

Now that we've got ourselves a running application, you'll have played around a bit and created, read, updated, and deleted many notes.

Suppose for a moment that this isn't a toy application, but one that is interesting enough to draw millions of users a day. Serving a high load typically means adding servers, load balancers, and many other things. A core part of this is to have multiple instances of the application running at the same time to spread the load.

Let's see what happens when you run multiple instances of the Notes application at the same time.

The first thing is to make sure the instances are on different ports. In app.mjs, you'll see that setting the PORT environment variable controls the port being used. If the PORT variable is not set, it defaults to http://localhost:3000, or what we've been using all along.

Let's open up package.json and add the following lines to the scripts section:
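
新增的脚本大致如下(`DEBUG` 变量的取值只是示意,可按需调整):

```json
"scripts": {
  "start":   "cross-env DEBUG=notes:* node ./app.mjs",
  "server1": "cross-env DEBUG=notes:* PORT=3001 node ./app.mjs",
  "server2": "cross-env DEBUG=notes:* PORT=3002 node ./app.mjs"
}
```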


`server1`脚本在`PORT 3001`上运行,而`server2`脚本在`PORT 3002`上运行。在一个地方记录所有这些是不是很好?

然后,在一个命令窗口中,运行以下命令:

In another command window, run the following:


这给了我们两个`Notes`应用程序的实例。使用两个浏览器窗口访问`http://localhost:3001`和`http://localhost:3002`。输入一些笔记,你可能会看到类似这样的东西:

![](https://gitee.com/OpenDocCN/freelearn-node-zh/raw/master/docs/node-webdev-5e/img/2c6d2829-f1c9-4df4-9131-d8163de6210a.png)

编辑和添加一些笔记后,您的两个浏览器窗口可能看起来像前面的截图。这两个实例不共享相同的数据池;每个实例都在自己的进程和内存空间中运行。您在一个上添加一个笔记,在另一个屏幕上不会显示。

另外,由于模型代码不会将数据持久化存储在任何地方,笔记也不会被保存。你可能已经写了有史以来最伟大的 Node.js 编程书,但一旦应用服务器重新启动,它就消失了。

通常情况下,你会运行多个应用实例以提高性能。这就是老生常谈的“增加服务器”的把戏。为了使其生效,数据当然必须共享,并且每个实例必须访问相同的数据源。通常情况下,这涉及到数据库,当涉及到用户身份信息时,甚至可能需要武装警卫。

所有这些意味着数据库、更多的数据模型、单元测试、安全实施、部署策略等等。等一下——我们很快就会涉及到所有这些!

# 总结

在本章中,我们走了很长的路。

我们首先看了一下回调地狱,以及 Promise 对象和 async 函数如何帮助我们驯服异步代码。因为我们正在编写一个 Express 应用,我们看了如何在 Express 中使用 async 函数。我们将在本书中始终使用这些技术。

我们迅速转向使用 Express 编写真实应用的基础。目前,我们的应用程序将数据保存在内存中,但它具有成为支持实时协作评论的笔记应用的基本功能。

在下一章中,我们将初步涉足响应式、移动友好的网页设计领域。由于移动计算设备的日益普及,有必要先考虑移动设备,而不是桌面电脑用户。为了每天能够触达数百万用户,"Notes"应用用户在使用智能手机时需要良好的用户体验。

在接下来的章节中,我们将继续扩展"Notes"应用的功能,首先是数据库存储模型。但首先,在下一章中,我们有一个重要的任务——使用 Bootstrap 实现移动优先的用户界面。


实施移动优先范式

现在我们的第一个 Express 应用程序可用,我们应该按照这个软件开发时代的口头禅行事:以移动设备为先。无论是智能手机、平板电脑、汽车仪表盘、冰箱门还是浴室镜子,移动设备正在占领世界。

在为移动设备设计时,主要考虑因素是小屏幕尺寸、触摸导向的交互、没有鼠标以及略有不同的**用户界面**(**UI**)期望。在 1997-8 年,当流媒体视频首次开发时,视频制作人员必须学会如何为视口大小与无花果(一种美国零食)大小相当的视频体验设计。今天,应用程序设计师必须应对与一张扑克牌大小相当的应用程序窗口。

对于*Notes*应用程序,我们的 UI 需求是简单的,而且没有鼠标对我们没有任何影响。

在本章中,我们不会进行太多的 Node.js 开发。相反,我们将进行以下操作:

+   修改 Notes 应用程序模板以获得更好的移动呈现效果。

+   编辑 Bootstrap SASS 文件以自定义应用程序主题。

+   安装第三方 Bootstrap 主题。

+   了解 Bootstrap 4.5,这是一个流行的响应式 UI 设计框架。

截至撰写本文时,Bootstrap v5 刚刚进入 alpha 阶段。这使得现在采用它为时尚早,但我们将来可能会希望这样做。根据迁移指南,Bootstrap 的大部分内容在第 5 版中将保持不变,或者非常相似。然而,第 5 版中最大的变化是不再需要 jQuery。因为我们在第九章*使用 Socket.IO 进行动态客户端/服务器交互*中会相当频繁地使用 jQuery,这是一个重要的考虑因素。

通过完成前面列表中的任务,我们将初步了解成为全栈 Web 工程师意味着什么。本章的目标是获得应用程序开发的一个重要部分,即 UI 的介绍,以及 Web UI 开发的主要工具包之一。

与其仅仅因为它是流行的事物而进行移动优先开发,不如首先尝试理解正在解决的问题。

# 了解问题 - Notes 应用程序不适合移动设备

让我们首先量化问题。我们需要探索应用在移动设备上的表现如何(或者不好)。这很容易做到:

1.  启动*Notes*应用程序。确定主机系统的 IP 地址。

1.  使用您的移动设备,使用 IP 地址连接到服务,并浏览*Notes*应用程序,对其进行测试并记录任何困难。

另一种方法是使用您的桌面浏览器,将其调整为非常窄。Chrome DevTools 还包括移动设备模拟器。无论哪种方式,您都可以在桌面上模拟智能手机的小屏幕尺寸。

要在移动屏幕上看到真正的 UI 问题,请编辑`views/noteedit.hbs`并进行以下更改:

What's changed is that we've added the cols=80 parameter to set its width to be fixed at 80 columns. We want this textarea element to be overly large so that you can experience how a non-responsive web app appears on a mobile device. View the application on a mobile device and you'll see something like one of the screens in this screenshot:

Viewing a note works well on an iPhone 6, but the screen for editing/adding a note is not good. The text entry area is so wide that it runs off the side of the screen. Even though interaction with FORM elements works well, it's clumsy. In general, browsing the Notes application gives an acceptable mobile user experience that doesn't suck, but won't make our users leave rave reviews.

In other words, we have an example of a screen that works well on the developers' laptop but is horrid on the target platform. By following the mobile-first paradigm, the developer is expected to constantly check the behavior in a mobile web browser, or else the mobile view in the Chrome developer tool, and to design accordingly.

This gives us an idea of the sort of problem that responsive web design aims to correct. Before implementing a mobile-first design in our Notes app, let's discuss some of the theory behind responsive web design.

# Learning the mobile-first paradigm theory

Mobile devices have a smaller screen, are generally touch-oriented, and have different user experience expectations than a desktop computer.

To accommodate smaller screens, we use responsive web design techniques. This means designing the application to accommodate the screen size and ensuring websites provide optimal viewing and interaction across a wide range of devices. Techniques include changing font sizes, rearranging elements on the screen, using collapsible elements that open when touched, and resizing images or videos to fit available space. This is called responsive because the application responds to device characteristics by making these changes.

By mobile-first, we mean that you design the application to work well on a mobile device first, and then move on to devices with larger screens. It's about prioritizing mobile devices first.

The primary technique is using media queries in stylesheets to detect device characteristics. Each media query section targets a range of devices, using a CSS declaration to appropriately restyle content.

Let's consult a concrete example. The Twenty Twelve theme for WordPress has a straightforward responsive design implementation. It's not built with any framework, so you can see clearly how the mechanism works, and the stylesheet is small enough to be easily digestible. We're not going to use this code anywhere; instead, it is intended as a useful example of implementing a responsive design.

You can refer to the source code for the Twenty Twelve theme in the WordPress repository at themes.svn.wordpress.org/twentytwelve/1.9/style.css.

The stylesheet starts with a number of resets, where the stylesheet overrides some typical browser style settings with clear defaults. Then, the bulk of the stylesheet defines styling for mobile devices. Toward the bottom of the stylesheet is a section labeled Media queries where, for certain sized screens, the styles defined for mobile devices are overridden to work on devices with larger screens.

It does this with the following two media queries:
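
这两个媒体查询的形式大致如下(其中的具体样式规则从略):

```css
@media screen and (min-width: 600px) {
  /* 针对视口宽度至少为 600px 的设备,覆盖移动端样式 */
}

@media screen and (min-width: 960px) {
  /* 针对视口宽度至少为 960px 的设备,进一步调整布局 */
}
```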


样式表的第一部分配置了所有设备的页面布局。接下来,对于任何至少宽度为`600px`的浏览器视口,重新配置页面以在较大屏幕上显示。然后,对于任何至少宽度为`960px`的浏览器视口,再次重新配置。样式表有一个最终的媒体查询来覆盖打印设备。

这些宽度被称为**断点**。这些阈值视口宽度是设计自身改变的点。您可以通过访问任何响应式网站,然后调整浏览器窗口大小来查看断点的作用。观察设计在特定尺寸处的跳跃。这些是该网站作者选择的断点。

关于选择断点的最佳策略有很多不同的意见。您是要针对特定设备还是要针对一般特征?Twenty Twelve 主题仅使用两个视口大小媒体查询在移动设备上表现得相当不错。CSS-Tricks 博客发布了一个针对每个已知设备的具体媒体查询的广泛列表,可在[`css-tricks.com/snippets/css/media-queries-for-standard-devices/`](https://css-tricks.com/snippets/css/media-queries-for-standard-devices/)上找到。

我们至少应该针对这些设备:

+   **小**:这包括 iPhone 5 SE。

+   **中等**:这可以指平板电脑或更大的智能手机。

+   **大**:这包括更大的平板电脑或更小的台式电脑。

+   **特大**:这指的是更大的台式电脑和其他大屏幕。

+   **横向/纵向**:您可能希望区分横向模式和纵向模式。在两者之间切换当然会改变视口宽度,可能会将其推过断点。但是,您的应用程序可能需要在这两种模式下表现不同。

这就足够让我们开始响应式网页设计的理论。在我们的*Notes*应用程序中,我们将致力于使用触摸友好的 UI 组件,并使用 Bootstrap 根据屏幕尺寸调整用户体验。让我们开始吧。

# 在 Notes 应用程序中使用 Twitter Bootstrap

Bootstrap 是一个移动优先的框架,包括 HTML5、CSS3 和 JavaScript 代码,提供了一套世界级的响应式网页设计组件。它是由 Twitter 的工程师开发的,然后于 2011 年 8 月发布到世界上。

该框架包括将现代功能应用于旧浏览器的代码,响应式的 12 列网格系统,以及用于构建 Web 应用程序和网站的大量组件(其中一些使用 JavaScript)。它旨在为您的应用程序提供坚实的基础。

有关 Bootstrap 的更多详细信息,请参考[`getbootstrap.com`](http://getbootstrap.com)。

通过这个对 Bootstrap 的介绍,让我们继续设置它。

## 设置 Bootstrap

第一步是复制您在上一章中创建的代码。例如,如果您创建了一个名为`chap05/notes`的目录,那么从`chap05/notes`的内容中创建一个名为`chap06/notes`的目录。

现在,我们需要开始在*Notes*应用程序中添加 Bootstrap 的代码。Bootstrap 网站建议从 Bootstrap(和 jQuery)公共 CDN 加载所需的 CSS 和 JavaScript 文件。虽然这很容易做到,但我们不会这样做,原因如下:

+   这违反了将所有依赖项保持本地化到应用程序并且不依赖全局依赖项的原则。

+   这使我们的应用程序依赖于 CDN 是否正常运行。

+   这会阻止我们生成自定义主题。

相反,我们将安装 Bootstrap 的本地副本。有几种方法可以在本地安装 Bootstrap。例如,Bootstrap 网站提供可下载的 TAR/GZIP 存档(tarball)。更好的方法是使用自动化依赖管理工具,幸运的是,npm 存储库中有我们需要的所有包。

最直接的选择是在 npm 存储库中使用 Bootstrap ([`www.npmjs.com/package/bootstrap`](https://www.npmjs.com/package/bootstrap))、Popper.js ([`www.npmjs.com/package/popper.js`](https://www.npmjs.com/package/popper.js))和 jQuery ([`www.npmjs.com/package/jquery`](https://www.npmjs.com/package/jquery))包。这些包不提供 Node.js 模块,而是通过 npm 分发的前端代码。许多前端库都是通过 npm 存储库分发的。

我们使用以下命令安装包:

As we can see here, when we install Bootstrap, it helpfully tells us the corresponding versions of jQuery and Popper.js to use. But according to the Bootstrap website, we are to use a different version of jQuery than what's shown here. Instead, we are to use jQuery 3.5.x instead of 1.9.1, because 3.5.x has many security issues fixed.

On the npm page for the Popper.js package (www.npmjs.com/package/popper.js), we are told this package is deprecated, and that Popper.js v2 is available from the @popperjs/core npm package. However, the Bootstrap project tells us to use this version of Popper.js, so that's what we'll stick with.

The Bootstrap Getting Started documentation explicitly says to use jQuery 3.5.1 and Popper 1.16.0, as of the time of writing, as you can see at getbootstrap.com/docs/4.5/getting-started/introduction/.

What's most important is to see what got downloaded:


在每个目录中都有用于在浏览器中使用的 CSS 和 JavaScript 文件。更重要的是,这些文件位于已知路径名的特定目录中,具体来说,就是我们刚刚检查过的目录。

让我们看看如何在浏览器端配置我们的 Notes 应用程序来使用这三个包,并在页面布局模板中设置 Bootstrap 支持。

## 将 Bootstrap 添加到 Notes 应用程序

在这一部分,我们将首先在页面布局模板中加载 Bootstrap CSS 和 JavaScript,然后确保 Bootstrap、jQuery 和 Popper 包可供使用。我们已经确保这些库安装在`node_modules`中,因此我们需要确保 Notes 知道将这些文件作为静态资产提供给 Web 浏览器。

在 Bootstrap 网站上,他们为页面提供了推荐的 HTML 结构。我们将从他们的建议中插入,以使用刚刚安装的 Bootstrap、jQuery 和 Popper 的本地副本。

请参阅[`getbootstrap.com/docs/4.5/getting-started/introduction/`](https://getbootstrap.com/docs/4.5/getting-started/introduction/)的*入门*页面。

我们将修改`views/layout.hbs`以匹配 Bootstrap 推荐的模板,通过进行粗体文本中显示的更改:

This is largely the template shown on the Bootstrap site, incorporated into the previous content of views/layout.hbs. Our own stylesheet is loaded following the Bootstrap stylesheet, giving us the opportunity to override anything in Bootstrap we want to change. What's different is that instead of loading Bootstrap, Popper.js, and jQuery packages from their respective CDNs, we use the path /assets/vendor/product-name instead.

This is the same as recommended on the Bootstrap website except the URLs point to our own site rather than relying on the public CDN. The pathname prefix, /assets/vendor, is routinely used to hold code provided by a third party.

This /assets/vendor URL is not currently recognized by the Notes application. To add this support, edit app.mjs to add these lines:
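
添加的内容大致如下(草图,假设 `approotdir` 来自前面创建的 `approotdir.mjs`,`path` 已导入):

```javascript
app.use('/assets/vendor/bootstrap', express.static(
  path.join(approotdir, 'node_modules', 'bootstrap', 'dist')));
app.use('/assets/vendor/jquery', express.static(
  path.join(approotdir, 'node_modules', 'jquery', 'dist')));
app.use('/assets/vendor/popper.js', express.static(
  path.join(approotdir, 'node_modules', 'popper.js', 'dist', 'umd')));
```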


我们再次使用`express.static`中间件来为访问*Notes*应用程序的浏览器提供资产文件。每个路径名都是 npm 安装的 Bootstrap、jQuery 和 Popper 库的位置。

Popper.js 库有一个特殊的考虑。在`popper.js/dist`目录中,团队以 ES6 模块语法分发了一个库。此时,我们不能相信所有浏览器都支持 ES6 模块。在`popper.js/dist/umd`中是一个适用于所有浏览器的 Popper.js 库的版本。因此,我们已经适当地设置了目录。

在`public`目录中,我们需要做一些整理。当`express-generator`设置初始项目时,它生成了`public/images`、`public/javascripts`和`public/stylesheets`目录。因此,每个的 URL 都以`/images`、`/javascripts`和`/stylesheets`开头。给这些文件一个以`/assets`目录开头的 URL 更清晰。要实现这个改变,首先要移动文件如下:

We now have our asset files, including Bootstrap, Popper.js, and jQuery, all available to the Notes application under the /assets directory. Referring back to views/layout.hbs, notice that we said to change the URL for our stylesheet to /assets/stylesheets/style.css, which matches this change.

We can now try this out by running the application:


屏幕上的差异很小,但这是 CSS 和 JavaScript 文件被加载的必要证明。我们已经实现了第一个主要目标——使用现代的、移动友好的框架来实现移动优先设计。

在修改应用程序的外观之前,让我们谈谈其他可用的框架。

## 替代布局框架

Bootstrap 并不是唯一提供响应式布局和有用组件的 JavaScript/CSS 框架。当然,所有其他框架都有自己的特点。一如既往,每个项目团队都可以选择他们使用的技术,当然,市场也在不断变化,新的库不断出现。我们在这个项目中使用 Bootstrap 是因为它很受欢迎。这些其他框架也值得一看:

+   Pure.css ([`purecss.io/`](https://purecss.io/)):一个强调小代码占用空间的响应式 CSS 框架。

+   Picnic CSS ([`picnicss.com/`](https://picnicss.com/)):一个强调小尺寸和美观的响应式 CSS 框架。

+   Bulma ([`bulma.io/`](https://bulma.io/)):一个自称非常易于使用的响应式 CSS 框架。

+   Shoelace ([`shoelace.style/`](https://shoelace.style/)):一个强调使用未来 CSS 的 CSS 框架,意味着它使用 CSS 标准化的最前沿的 CSS 构造。由于大多数浏览器不支持这些功能,使用 cssnext ([`cssnext.io/`](http://cssnext.io/)) 来进行支持。Shoelace 使用基于 Bootstrap 网格的网格布局系统。

+   PaperCSS ([`www.getpapercss.com/`](https://www.getpapercss.com/)):一个看起来像手绘的非正式 CSS 框架。

+   Foundation ([`foundation.zurb.com/`](https://foundation.zurb.com/)):自称为世界上最先进的响应式前端框架。

+   Base([`getbase.org/`](http://getbase.org/)):一个轻量级的现代 CSS 框架。

HTML5 Boilerplate([`html5boilerplate.com/`](https://html5boilerplate.com/))是编写 HTML 和其他资产的极其有用的基础。它包含了网页 HTML 代码的当前最佳实践,以及用于规范化 CSS 支持和多个 Web 服务器的配置文件。

浏览器技术也在迅速改进,布局技术是其中之一。Flexbox 和 CSS Grid 布局系统在使 HTML 内容布局比以前的技术更容易方面是一个重大进步。

# Flexbox 和 CSS Grids

这两种新的 CSS 布局方法正在影响 Web 应用程序开发。CSS3 委员会一直在多个方面进行工作,包括页面布局。

在遥远的过去,我们使用嵌套的 HTML 表格进行页面布局。这是一个不愉快的回忆,我们不必再次回顾。最近,我们一直在使用使用`<div>`元素的盒模型,甚至有时使用绝对或相对定位技术。所有这些技术在多种方面都不够理想,有些更甚于其他。

一个流行的布局技术是将水平空间分成列,并为页面上的每个元素分配一定数量的列。使用一些框架,我们甚至可以有嵌套的`<div>`元素,每个都有自己的列集。Bootstrap 3 和其他现代框架使用了这种布局技术。

两种新的 CSS 布局方法,Flexbox([`en.wikipedia.org/wiki/CSS_flex-box_layout`](https://en.wikipedia.org/wiki/CSS_flex-box_layout))和 CSS Grids([`developer.mozilla.org/en-US/docs/Web/CSS/CSS_Grid_Layout`](https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_Grid_Layout)),是对以往所有方法的重大改进。我们提到这些技术是因为它们都值得关注。

在 Bootstrap 4 中,Bootstrap 团队选择了 Flexbox。因此,在底层是 Flexbox CSS 构造。

在设置了 Bootstrap 并学习了一些响应式 Web 设计的背景之后,让我们立即开始在*Notes*中实现响应式设计。

# Notes 应用的移动优先设计

当我们为 Bootstrap 等添加了 CSS 和 JavaScript 时,那只是开始。为了实现响应式的移动友好设计,我们需要修改每个模板以使用 Bootstrap 组件。Bootstrap 的功能在 4.x 版本中分为四个领域:

+   **布局**:声明来控制 HTML 元素的布局,支持基于设备尺寸的不同布局

+   **内容**:用于规范化 HTML 元素、排版、图片、表格等外观

+   **组件**:包括导航栏、按钮、菜单、弹出窗口、表单、轮播图等全面的 UI 元素,使应用程序的实现变得更加容易

+   **实用工具**:用于调整 HTML 元素的呈现和布局的附加工具

Bootstrap 文档中充满了我们可以称之为*配方*的内容,用于实现特定 Bootstrap 组件或效果的 HTML 元素结构。实现的关键在于,通过向每个 HTML 组件添加正确的 HTML 类声明来触发 Bootstrap 效果。

让我们从使用 Bootstrap 进行页面布局开始。

## 奠定 Bootstrap 网格基础

Bootstrap 使用 12 列网格系统来控制布局,为应用程序提供了一个响应式的移动优先基础。当正确设置时,使用 Bootstrap 组件的布局可以自动重新排列组件,以适应从超小屏幕到大型台式电脑的不同尺寸屏幕。该方法依赖于带有类的`<div>`元素来描述布局中每个`<div>`的作用。

Bootstrap 中的基本布局模式如下:
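
大致形式如下(示意:一个容器,两行,第一行两列、第二行三列):

```html
<div class="container-fluid">
  <div class="row">
    <div class="col-sm-3">列 1</div>
    <div class="col-sm-9">列 2</div>
  </div>
  <div class="row">
    <div class="col-sm-3">列 1</div>
    <div class="col-sm-6">列 2</div>
    <div class="col-sm-3">列 3</div>
  </div>
</div>
```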

This is a generic Bootstrap layout example, not anything we're putting into the Notes app. Notice how each layer of the layout relies on different class declarations. This fits Bootstrap's pattern of declaring behavior by using classes.

In this case, we're showing a typical page layout of a container, containing two rows, with two columns on the first row and three columns on the second. The outermost layer uses the .container or .container-fluid elements. Containers provide a means to center or horizontally pad the content. Containers marked as .container-fluid act as if they have width: 100%, meaning they expand to fill the horizontal space.

.row is what it sounds like, a "row" of a structure that's somewhat like a table. Technically, a row is a wrapper for columns. Containers are wrappers for rows, and rows are wrappers for columns, and columns contain the content displayed to our users.

Columns are marked with variations of the .col class. With the basic column class, .col, the columns are divided equally into the available space. You can specify a numerical column count to assign different widths to each column. Bootstrap supports up to 12 numbered columns, hence each row in the example adds up to 12 columns.

You can also specify a breakpoint to which the column applies:

  • Using col-xs targets extra-small devices (smartphones, <576px).
  • Using col-sm targets small devices (>= 576px).
  • Using col-md targets medium devices (>= 768px).
  • Using col-lg targets large devices (>= 992px).
  • Using col-xl targets extra-large devices (>= 1200px).

Specifying a breakpoint, for example, col-sm, means that the declaration applies to devices matching that breakpoint or larger. Hence, in the example shown earlier, the column definitions were applied to col-sm, col-md, col-lg, and col-xl devices, but not to col-xs devices.

The column count is appended to the class name. That means using col-# when not targeting a breakpoint, for example, col-4, or col-{breakpoint}-# when targeting a breakpoint, for example, col-md-4, to target a space four columns wide on medium devices. If the columns add up to more than 12, the columns beyond the twelfth column wrap around to become a new row. The word auto can be used instead of a numerical column count to size the column to the natural width of its contents.

It's possible to mix and match to target multiple breakpoints:


这声明了三种不同的布局,一种用于超小设备,另一种用于中等设备,最后一种用于大型设备。

网格系统可以做更多。详情请参阅[`getbootstrap.com/docs/4.5/layout/overview/`](https://getbootstrap.com/docs/4.5/layout/overview/)中的文档。

这个介绍给了我们足够的知识来开始修改*Notes*应用程序。我们下一个任务是更好地理解应用程序页面的结构。

## *Notes*应用程序的响应式页面结构

我们可以对*Notes*进行整个用户体验分析,或者让设计师参与,并为*Notes*应用程序的每个屏幕设计完美的页面设计。但是当前*Notes*应用程序的设计是开发人员编写的功能性而不是丑陋的页面设计的结果。让我们从讨论我们拥有的页面设计结构的逻辑开始。考虑以下结构:

This is the general structure of the pages in Notes. The page content has two visible rows: the header and the main content. At the bottom of the page are invisible things such as the JavaScript files for Bootstrap and jQuery.

As it currently stands, the header contains a title for each page as well as navigation links so the user can browse the application. The content area is what changes from page to page, and is either about viewing content or editing content. The point is that for every page we have two sections for which to handle layout.

The question is whether views/layout.hbs should have any visible page layout. This template is used for the layout of every page in the application. The content of those pages is different enough that it seems layout.hbs cannot have any visible elements.

That's the decision we'll stick with for now. The next thing to set up is an icon library we can use for graphical buttons.

## Using icon libraries and improving visual appeal

The world around us isn't constructed of words, but instead things. Hence, pictorial elements and styles, such as icons, can help computer software to be more comprehensible. Creating a good user experience should make our users reward us with more likes in the app store.

There are several icon libraries that can be used on a website. The Bootstrap team has a curated list at getbootstrap.com/docs/4.5/extend/icons/. For this project, we'll use Feather Icons (feathericons.com/). It is a conveniently available npm package at www.npmjs.com/package/feather-icons.

To install the package, run this command:


然后您可以检查已下载的包,看到`./node_modules/feather-icons/dist/feather.js`包含了浏览器端的代码,使得使用图标变得容易。

我们通过在`app.mjs`中挂载它来使该目录可用,就像我们为 Bootstrap 和 jQuery 库所做的那样。将此代码添加到`app.mjs`中:

Going by the documentation, we must put this at the bottom of views/layout.hbs to enable feather-icons support:


这会加载浏览器端的库,然后调用该库来使用图标。

要使用其中一个图标,使用`data-feather`属性指定其中一个图标名称,就像这样:

As suggested by the icon name, this will display a circle. The Feather Icons library looks for elements with the data-feather attribute, which the Feather Icons library uses to identify the SVG file to use. The Feather Icons library completely replaces the element where it finds the data-feather attribute. Therefore, if you want the icon to be a clickable link, it's necessary to wrap the icon definition with an <a> tag, rather than adding data-feather to the <a> tag.

Let's now redesign the page header to be a navigation bar, and use one of the Feather icons.

## Responsive page header navigation bar

The header section we designed before contains a page title and a little navigation bar. Bootstrap has several ways to spiff this up, and even give us a responsive navigation bar that neatly collapses to a menu on small devices.

In views/header.hbs, make this change:


添加`class="page-header"`告诉 Bootstrap 这是页面标题。在其中,我们有与之前一样的`<h1>`标题,提供页面标题,然后是一个响应式的 Bootstrap `navbar`。

默认情况下,`navbar`是展开的——这意味着`navbar`内部的组件是可见的——因为有`navbar-expand-md`类。这个`navbar`使用一个`navbar-toggler`按钮来控制`navbar`的响应性。默认情况下,这个按钮是隐藏的,`navbar`的主体是可见的。如果屏幕足够小,`navbar-toggler`会切换为可见状态,`navbar`的主体变为不可见,当点击现在可见的`navbar-toggler`时,会弹出一个包含`navbar`主体的菜单:

![](https://gitee.com/OpenDocCN/freelearn-node-zh/raw/master/docs/node-webdev-5e/img/9e463059-e746-40ad-8de7-86722783d8d1.png)

我们选择了 Feather Icons 的*home*图标,因为该链接指向*主页*。打算`navbar`的中间部分将包含一个面包屑路径,当我们在*Notes*应用程序中导航时。

添加笔记按钮借助一些 Flexbox 技巧被固定在右侧。容器是 Flexbox,这意味着我们可以使用 Bootstrap 类来控制每个项目所占用的空间。面包屑区域是主页图标和添加笔记按钮之间的空白区域。在这种情况下是空的,但是包含它的`<div>`元素已经声明为`class="col"`,这意味着它占据一个列单位。另一方面,添加笔记按钮声明为`class="col-auto"`,这意味着它只占据自己所需的空间。因此,空的面包屑区域将扩展以填充可用空间,而添加笔记按钮只填充自己的空间,因此被推到一边。

因为它是同一个应用程序,所有功能都能正常工作;我们只是在处理演示。我们已经添加了一些笔记,但是在首页上的列表呈现还有很多需要改进的地方。标题的小尺寸不太适合触摸操作,因为它没有为手指提供一个大的目标区域。你能解释为什么`notekey`值必须显示在主页上吗?考虑到这一点,让我们继续修复首页。

## 在首页改进笔记列表

当前的主页有一些简单的文本列表,不太适合触摸操作,并且在行首显示*key*可能会让用户感到困惑。让我们来修复这个问题。

按照以下方式编辑`views/index.hbs`,修改的行用粗体显示:

The first change is to switch away from using a list and to use a vertical button group. The button group is a Bootstrap component that's what it sounds like, a group of buttons. By making the text links look and behave like buttons, we're improving the UI, especially its touch-friendliness. We chose the btn-outline-dark button style because it looks good in the UI. We use large buttons (btn-lg) that fill the width of the container (btn-block).

We eliminated showing the notekey value to the user. This information doesn't add anything to the user experience. Running the application, we get the following:

This is beginning to take shape, with a decent-looking home page that handles resizing very nicely and is touch-friendly. The buttons have been enlarged nicely to be large enough for big fingers to easily tap.

There's still something more to do with this since the header area is taking up a fair amount of space. We should always feel free to rethink a plan as we look at intermediate results. Earlier, we created a design for the header area, but on reflection, that design looks to be too large. The intention had been to insert a breadcrumb trail just to the right of the home icon, and to leave the <h1> title at the top of the header area. But this takes up too much vertical space, so we can tighten up the header and possibly improve the appearance.

Edit partials/header.hbs with the following line in bold:


这会移除页眉区域顶部的`<h1>`标签,立即收紧演示。

在`navbar-collapse`区域内,我们用一个简单的`navbar-text`组件替换了原本意为面包屑的内容,其中包含页面标题。为了保持“添加笔记”按钮固定在右侧,我们保持了`class="col"`和`class="col-auto"`的设置:

![](https://gitee.com/OpenDocCN/freelearn-node-zh/raw/master/docs/node-webdev-5e/img/f36a31a8-1cbd-4018-a2a6-2ca910157907.png)

哪种页眉设计更好?这是一个很好的问题。因为美在于观者的眼中,两种设计可能同样好。我们展示的是通过编辑模板文件轻松更新设计的便利性。

现在让我们来处理查看笔记的页面。

## 清理笔记查看体验

查看笔记并不坏,但用户体验可以得到改善。例如,用户不需要看到`notekey`,这意味着我们可以从显示中删除它。此外,Bootstrap 有更漂亮的按钮可以使用。

在`views/noteview.hbs`中进行以下更改:

We have declared two rows, one for the note, and another for buttons for actions related to the note. Both are declared to consume all 12 columns, and therefore take up the full available width. The buttons are again contained within a button group, but this time a horizontal group rather than vertical.

Running the application, we get the following:

Do we really need to show the notekey to the user? We'll leave it there, but that's an open question for the user experience team. Otherwise, we've improved the note-reading experience.

Next on our list is the page for adding and editing notes.

## Cleaning up the add/edit note form

The next major glaring problem is the form for adding and editing notes. As we said earlier, it's easy to get the text input area to overflow a small screen. Fortunately, Bootstrap has extensive support for making nice-looking forms that work well on mobile devices.

Change the form in views/noteedit.hbs to this:


这里有很多事情要做。我们重新组织了`form`,以便 Bootstrap 可以对其进行正确处理。首先要注意的是我们有几个这样的实例:

The entire form is contained within a container-fluid, meaning that it will automatically stretch to fit the screen. The form has three of these rows with the form-group class.

Bootstrap uses form-group elements to add structure to forms and to encourage proper use of <label> elements, along with other form elements. It's good practice to use a <label> element with every <input> element to improve assistive behavior in the browser, rather than simply leaving some dangling text.

For horizontal layout, notice that for each row there is a <label> with a col-1 class, and the <input> element is contained within a <div> that has a col class. The effect is that the <label> has a controlled width and that the labels all have the same width, while the <input> elements take up the rest of the horizontal space.

Every form element has class="form-control". Bootstrap uses this to identify the controls so it can add styling and behavior.

The placeholder='key' attribute puts sample text in an otherwise empty text input element. It disappears as soon as the user types something and is an excellent way to prompt the user with what's expected.

Finally, we changed the Submit button to be a Bootstrap button. These look nice, and Bootstrap makes sure that they work great:

The result looks good and works well on the iPhone. It automatically sizes itself to whatever screen it's on. Everything behaves nicely. In the preceding screenshot, we've resized the window small enough to cause the navbar to collapse. Clicking on the so-called hamburger icon on the right (the three horizontal lines) causes the navbar contents to pop up as a menu.

We have learned how to improve forms using Bootstrap. We have a similar task in the form to confirm deleting notes.

## Cleaning up the delete-note window

The window used to verify the user's choice to delete a note doesn't look bad, but it can be improved.

Edit views/notedestroy.hbs to contain the following:


我们重新设计了它,以使用类似的 Bootstrap 表单标记。关于删除笔记的问题被包裹在`class="form-text"`中,以便 Bootstrap 可以正确显示它。

按钮与以前一样包裹在`class="btn-group"`中。按钮的样式与其他屏幕上完全相同,使应用程序在整体外观上保持一致:

![](https://gitee.com/OpenDocCN/freelearn-node-zh/raw/master/docs/node-webdev-5e/img/8f9abc52-09b0-4df4-b231-afb832a45aee.png)

存在一个问题,即导航栏中的标题文本没有使用单词`Delete`。在`routes/notes.mjs`中,我们可以进行这个更改:

What we've done is to change the title parameter passed to the template. We'd done this in the /notes/edit route handler and seemingly missed doing so in this handler.

That handles rewriting the Notes application to use Bootstrap. Having a complete Bootstrap-based UI, let's look at what it takes to customize the Bootstrap look and feel.

# Customizing a Bootstrap build

One reason to use Bootstrap is that you can easily build a customized version. The primary reason to customize a Bootstrap build is to adjust the theme from the default. While we can use stylesheet.css to adjust the presentation, it's much more effective to adjust theming the Bootstrap way. That means changing the SASS variables and recompiling Bootstrap to generate a new bootstrap.css file.

Bootstrap stylesheets are built using the build process described in the package.json file. Therefore, customizing a Bootstrap build means first downloading the Bootstrap source tree, making modifications, then using the npm run dist command to build the distribution. By the end of this section, you'll know how to do all that.

The Bootstrap uses SASS, which is one of the CSS preprocessors used to simplify CSS development. In Bootstrap's code, one file (scss/_variables.scss) contains variables used throughout the rest of Bootstrap's .scss files. Change one variable and it automatically affects the rest of Bootstrap.

The official documentation on the Bootstrap website (getbootstrap.com/docs/4.5/getting-started/build-tools/) is useful for reference on the build process.

If you've followed the directions given earlier, you have a directory, chap06/notes, containing the Notes application source code. Create a directory named chap06/notes/theme, within which we'll set up a custom Bootstrap build process.

In order to have a clear record of the steps involved, we'll use a package.json file in that directory to automate the build process. There isn't any Node.js code involved; npm is also a convenient tool to automate the software build processes.

To start, we need a script for downloading the Bootstrap source tree from github.com/twbs/bootstrap. While the bootstrap npm package includes SASS source files, it isn't sufficient to build Bootstrap, and therefore we must download the source tree. What we do is navigate to the GitHub repository, click on the Releases tab, and select the URL for the most recent release. But instead of downloading it manually, let's automate the process.

The theme/package.json file can contain this scripts section:


这将自动下载并解压 Bootstrap 源代码分发包,然后`postdownload`步骤将运行`npm install`来安装 Bootstrap 项目声明的依赖项。这样就可以设置好源代码树,准备修改和构建。

输入以下命令:

This executes the steps to download and unpack the Bootstrap source tree. The scripts we gave will work for a Unix-like system, but if you are on Windows it will be easiest to run this in the Windows Subsystem for Linux.

This much only installs the tools necessary to build Bootstrap. The documentation on the Bootstrap website also discusses installing Bundler from the Ruby Gems repository, but that tool only seems to be required to bundle the built distribution. We do not need that tool, so skip that step.

To build Bootstrap, let's add the following lines to the scripts section in our theme/package.json file:


显然,当发布新的 Bootstrap 版本时,您需要调整这些目录名称。

在 Bootstrap 源代码树中,运行`npm run dist`将使用 Bootstrap`package.json`文件中记录的过程构建 Bootstrap。同样,`npm run watch`设置了一个自动化过程,用于扫描更改的文件并在更改任何文件时重新构建 Bootstrap。运行`npm run clean`将删除 Bootstrap 源代码树。通过将这些行添加到我们的`theme/package.json`文件中,我们可以在终端中启动这个过程,现在我们可以根据需要重新运行构建,而不必绞尽脑汁,努力记住该做什么。

为了避免将 Bootstrap 源代码检入到 Git 存储库中,添加一个`theme/.gitignore`文件:

This will tell Git to not commit the Bootstrap source tree to the source repository. There's no need to commit third-party sources to your source tree since we have recorded in the package.json file the steps required to download the sources.

Now run a build with this command:


构建文件位于`theme/bootstrap-4.5.0/dist`目录中。该目录的内容将与 Bootstrap 的 npm 包的内容相匹配。

在继续之前,让我们看看 Bootstrap 源代码树。`scss`目录包含了将被编译成 Bootstrap CSS 文件的 SASS 源代码。要生成一个定制的 Bootstrap 构建,需要在该目录中进行一些修改。

`bootstrap-4.5.0/scss/bootstrap.scss`文件包含`@import`指令,以引入所有 Bootstrap 组件。文件`bootstrap-4.5.0/scss/_variables.scss`包含了在其余 Bootstrap SASS 源代码中使用的定义。编辑或覆盖这些值将改变使用生成的 Bootstrap 构建的网站的外观。

例如,这些定义确定了主要的颜色值:
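
它们的形式大致如下(具体取值以所用 Bootstrap 版本的 `_variables.scss` 为准):

```scss
$white:    #fff    !default;
$gray-900: #212529 !default;

$body-bg:    $white    !default;
$body-color: $gray-900 !default;
```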

These are similar to normal CSS statements. The !default attribute designates these values as the default. Any !default values can be overridden without editing _variables.scss.

To create a custom theme we could change _variables.scss, then rerun the build. But what if Bootstrap makes a considerable change to _variables.scss that we miss? It's better to instead create a second file that overrides values in _variables.scss.

With that in mind, create a file, theme/_custom.scss, containing the following:
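
例如可以写成下面这样(具体颜色值只是示意):

```scss
// 没有使用 !default,因此会覆盖 _variables.scss 中的默认值
$body-bg:    #212529;
$body-color: #fff;
```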


这会颠倒`_variables.scss`中`$body-bg`和`$body-color`设置的值。Notes 应用现在将使用黑色背景上的白色文本,而不是默认的白色背景和黑色文本。因为这些声明没有使用`!default`,它们将覆盖`_variables.scss`中的值。

然后,在`theme`目录中复制`scss/bootstrap.scss`并进行修改:

This adds an @import header for the _custom.scss file we just created. That way, Bootstrap will load our definitions during the build process.

Finally, add this line to the scripts section of theme/package.json:


使用这些脚本,在构建 Bootstrap 之前,这两个文件将被复制到指定位置,之后,构建后的文件将被复制到名为`dist`的目录中。`prebuild`步骤让我们可以将`_custom.scss`和`bootstrap.scss`的副本提交到我们的源代码库中,同时可以随时删除 Bootstrap 源。同样,`postbuild`步骤让我们可以将构建的自定义主题提交到源代码库中。

接下来,重新构建 Bootstrap:

While that's building, let's modify notes/app.mjs to mount the build directory:


我们所做的是从`node_modules`中的 Bootstrap 配置切换到我们刚在`theme`目录中构建的内容。

然后重新加载应用程序,您将看到颜色的变化。

要获得这个确切的演示,需要进行两个更改。我们之前使用的按钮元素具有`btn-outline-dark`类,这在浅色背景上效果很好。因为背景现在是黑色,这些按钮需要使用浅色着色。

要更改按钮,在`views/index.hbs`中进行以下更改:

Make a similar change in views/noteview.hbs:


很酷,我们现在可以按自己的意愿重新设计 Bootstrap 的颜色方案。不要向您的用户体验团队展示这一点,因为他们会大发雷霆。我们这样做是为了证明我们可以编辑`_custom.scss`并改变 Bootstrap 主题。

接下来要探索的是使用预先构建的第三方 Bootstrap 主题。

## 使用第三方自定义 Bootstrap 主题

如果所有这些对您来说太复杂了,一些网站提供了预先构建的 Bootstrap 主题,或者简化的工具来生成 Bootstrap 构建。让我们先尝试从 Bootswatch([`bootswatch.com/`](https://bootswatch.com/))下载一个主题。这既是一个免费开源主题的集合,也是一个用于生成自定义 Bootstrap 主题的构建系统([`github.com/thomaspark/bootswatch/`](https://github.com/thomaspark/bootswatch/))。

让我们使用 Bootswatch 的**Minty**主题来探索所需的更改。您可以从网站下载主题,或者将以下内容添加到`package.json`的`scripts`部分:
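A sketch of such scripts entries; the URLs follow the pattern used by the Bootswatch site for Bootstrap 4 themes, and the `minty` directory name is an assumption:

```json
"dl-minty": "mkdir -p minty && npm run dl-minty-css && npm run dl-minty-min-css",
"dl-minty-css": "wget https://bootswatch.com/4/minty/bootstrap.css -O minty/bootstrap.css",
"dl-minty-min-css": "wget https://bootswatch.com/4/minty/bootstrap.min.css -O minty/bootstrap.min.css"
```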

This will download the prebuilt CSS files for our chosen theme. In passing, notice that the Bootswatch website offers _variables.scss and _bootswatch.scss files, which should be usable with a workflow similar to what we implemented in the previous section. The GitHub repository matching the Bootswatch website has a complete build procedure for building custom themes.

Perform the download with the following command:


在`app.mjs`中,我们需要更改 Bootstrap 挂载点,分别挂载 JavaScript 和 CSS 文件。使用以下内容:
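A sketch of the two mounts, assuming the theme CSS was downloaded into a `minty` directory and that the templates reference `/assets/vendor/bootstrap/css` and `/assets/vendor/bootstrap/js`:

```js
app.use('/assets/vendor/bootstrap/js', express.static(
  path.join(__dirname, 'node_modules', 'bootstrap', 'dist', 'js')));
app.use('/assets/vendor/bootstrap/css', express.static(
  path.join(__dirname, 'minty')));
```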

Instead of one mount for /vendor/bootstrap, we now have two mounts for each of the subdirectories. While the Bootswatch team provides bootstrap.css and bootstrap.min.css, they do not provide the JavaScript source. Therefore, we use the /vendor/bootstrap/css mount point to access the CSS files you downloaded from the theme provider, and the /vendor/bootstrap/js mount point to access the JavaScript files in the Bootstrap npm package.

Because Minty is a light-colored theme, the buttons now need to use the dark style. We had earlier changed the buttons to use a light style because of the dark background. We must now switch from btn-outline-light back to btn-outline-dark. In partials/header.hbs, the color scheme requires a change in the navbar content:


我们选择了`text-dark`和`btn-dark`类来提供一些与背景的对比。

重新运行应用程序,您将看到类似于这样的东西:

![](https://gitee.com/OpenDocCN/freelearn-node-zh/raw/master/docs/node-webdev-5e/img/bb10d8cb-6641-48f4-a1dd-961043e0e675.png)

有了这个,我们已经完成了对基于 Bootstrap 的应用程序外观和感觉的定制探索。我们现在可以结束本章了。

# 总结

使用 Bootstrap 的可能性是无穷的。虽然我们涵盖了很多内容,但我们只是触及了表面,我们可以在*Notes*应用程序中做更多的事情。但由于本书的重点不是 UI,而是后端 Node.js 代码,我们故意限制了自己,使应用程序在移动设备上能够正常工作。

通过使用 Twitter Bootstrap 框架来实现简单的响应式网站设计,您了解了 Bootstrap 框架的功能。即使我们所做的小改动也改善了*Notes*应用程序的外观和感觉。我们还创建了一个定制的 Bootstrap 主题,并使用了第三方主题,来探索如何轻松地使 Bootstrap 构建看起来独特。

现在,我们想要回到编写 Node.js 代码。我们在第五章中停下,*你的第一个 Express 应用程序*,遇到了持久性的问题,*Notes*应用程序可以在不丢失笔记的情况下停止和重新启动。在第七章中,*数据存储和检索*,我们将深入使用几种数据库引擎来存储我们的数据。


数据存储和检索

在前两章中,我们构建了一个小型且有些有用的存储笔记的应用程序,然后使其在移动设备上运行。虽然我们的应用程序运行得相当不错,但它并没有将这些笔记存储在长期基础上,这意味着当您停止服务器时,笔记会丢失,并且如果您运行多个`Notes`实例,每个实例都有自己的笔记集。我们的下一步是引入一个数据库层,将笔记持久化到长期存储中。

在本章中,我们将研究 Node.js 中的数据库支持,目标是获得对几种数据库的暴露。对于`Notes`应用程序,用户应该在访问任何`Notes`实例时看到相同的笔记集,并且用户应该能够随时可靠地访问笔记。

我们将从前一章中使用的`Notes`应用程序代码开始。我们从一个简单的内存数据模型开始,使用数组来存储笔记,然后使其适用于移动设备。在本章中,我们将涵盖以下主题:

+   数据库和异步代码之间的关系

+   配置操作和调试信息的记录

+   捕获重要的系统错误

+   使用`import()`来启用运行时选择要使用的数据库

+   使用多个数据库引擎为`Notes`对象实现数据持久化

+   设计简单的配置文件与 YAML

第一步是复制上一章的代码。例如,如果你在`chap06/notes`中工作,复制它并将其更改为`chap07/notes`。

让我们从回顾一下在 Node.js 中为什么数据库代码是异步的一些理论开始。

让我们开始吧!

# 记住数据存储需要异步代码

根据定义,外部数据存储系统需要异步编码技术,就像我们在前几章中讨论的那样。Node.js 架构的核心原则是,任何需要长时间执行的操作必须具有异步 API,以保持事件循环运行。从磁盘、另一个进程或数据库检索数据的访问时间总是需要足够的时间来要求延迟执行。

现有的`Notes`数据模型是一个内存数据存储。理论上,内存数据访问不需要异步代码,因此现有的模型模块可以使用常规函数,而不是`async`函数。

我们知道`Notes`应该使用数据库,并且需要一个异步 API 来访问`Notes`数据。因此,现有的`Notes`模型 API 使用`async`函数,所以在本章中,我们可以将 Notes 数据持久化到数据库中。

这是一个有用的复习。现在让我们谈谈生产应用程序所需的一个管理细节——使用日志系统来存储使用数据。

# 记录和捕获未捕获的错误

在我们进入数据库之前,我们必须解决高质量 Web 应用程序的一个属性——管理记录信息,包括正常系统活动、系统错误和调试信息。日志为开发人员提供了对系统行为的洞察。它们为开发人员回答以下问题:

+   应用程序的流量有多大?

+   如果是一个网站,人们最常访问哪些页面?

+   发生了多少错误,以及是什么类型的错误?是否发生了攻击?是否发送了格式不正确的请求?

日志管理也是一个问题。如果管理不当,日志文件很快就会填满磁盘空间。因此,在删除旧日志之前,处理旧日志变得非常重要,希望在删除旧日志之前提取有用的数据。通常,这包括**日志轮换**,即定期将现有日志文件移动到存档目录,然后开始一个新的日志文件。之后,可以进行处理以提取有用的数据,如错误或使用趋势。就像您的业务分析师每隔几周查看利润/损失报表一样,您的 DevOps 团队需要各种报告,以了解是否有足够的服务器来处理流量。此外,可以对日志文件进行筛查以查找安全漏洞。

当我们使用 Express 生成器最初创建`Notes`应用程序时,它使用以下代码配置了一个活动日志系统,使用了`morgan`:
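The generated configuration looks roughly like this (converted to the ES6 module syntax used elsewhere in this book's `app.mjs`):

```js
import { default as logger } from 'morgan';
// ...
app.use(logger('dev'));
```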

This module is what prints messages about HTTP requests on the terminal window. We'll look at how to configure this in the next section.

Visit github.com/expressjs/morgan for more information about morgan.

Another useful type of logging is debugging messages about an application. Debugging traces should be silent in most cases; they should only print information when debugging is turned on, and the level of detail should be configurable.

The Express team uses the debug package for debugging logs. These are turned on using the DEBUG environment variable, which we've already seen in use. We will see how to configure this shortly and put it to use in the Notes application. For more information, refer to www.npmjs.com/package/debug.

Finally, the application might generate uncaught exceptions or unhandled Promises. The uncaughtException and unhandledRejection errors must be captured, logged, and dealt with appropriately. We do not use the word must lightly; these errors must be handled.

Let's get started.

Request logging with morgan

The morgan package generates log files from the HTTP traffic arriving on an Express application. It has two general areas for configuration:

  • Log format
  • Log location

As it stands, Notes uses the dev format, which is described as a concise status output for developers. This can be used to log web requests as a way to measure website activity and popularity. The Apache log format already has a large ecosystem of reporting tools and, sure enough, morgan can produce log files in this format.

To enable changing the logging format, simply change the following line in app.mjs:
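The change is to make the format string configurable, falling back to `dev`:

```js
app.use(logger(process.env.REQUEST_LOG_FORMAT || 'dev'));
```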


这是我们在整本书中遵循的模式;即将默认值嵌入应用程序,并使用环境变量来覆盖默认值。如果我们没有通过环境变量提供配置值,程序将使用`dev`格式。接下来,我们需要运行`Notes`,如下所示:

To revert to the previous logging output, simply do not set this environment variable. If you've looked at Apache access logs, this logging format will look familiar. The ::1 notation at the beginning of the line is IPV6 notation for localhost, which you may be more familiar with as 127.0.0.1.

Looking at the documentation for morgan, we learn that it has several predefined logging formats available. We've seen two of them—the dev format is meant to provide developer-friendly information, while the common format is compatible with the Apache log format. In addition to these predefined formats, we can create a custom log format by using various tokens.

We could declare victory on request logging and move on to debugging messages. However, let's look at logging directly to a file. While it's possible to capture stdout through a separate process, morgan is already installed on Notes and it provides the capability to direct its output to a file.

The morgan documentation suggests the following:


然而,这存在一个问题;无法在不关闭和重新启动服务器的情况下执行日志轮换。术语“日志轮换”指的是 DevOps 实践,其中每个快照覆盖了几小时的活动。通常,应用服务器不会持续打开文件句柄到日志文件,DevOps 团队可以编写一个简单的脚本,每隔几个小时运行一次,并使用`mv`命令移动日志文件,使用`rm`命令删除旧文件。不幸的是,`morgan`在这里配置时,会持续打开文件句柄到日志文件。

相反,我们将使用`rotating-file-stream`包。这个包甚至自动化了日志轮换任务,这样 DevOps 团队就不必为此编写脚本。

有关此内容的文档,请参阅包页面[`www.npmjs.com/package/rotating-file-stream`](https://www.npmjs.com/package/rotating-file-stream)。

首先,安装包:
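In the notes directory:

```
$ npm install rotating-file-stream --save
```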

Then, add the following code to app.mjs:
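A sketch of the logger configuration described below; the option names follow the rotating-file-stream documentation:

```js
import { default as rfs } from 'rotating-file-stream';
// ...
app.use(logger(process.env.REQUEST_LOG_FORMAT || 'dev', {
  stream: process.env.REQUEST_LOG_FILE ?
    rfs.createStream(process.env.REQUEST_LOG_FILE, {
      size: '10M',      // rotate when the file reaches 10 megabytes ...
      interval: '1d',   // ... or after one day
      compress: 'gzip'  // compress rotated log files
    })
    : process.stdout
}));
```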


在顶部的`import`部分,我们将`rotating-file-stream`加载为`rfs`。如果设置了`REQUEST_LOG_FILE`环境变量,我们将把它作为要记录的文件名。`morgan`的`stream`参数只需接受一个可写流。如果`REQUEST_LOG_FILE`没有设置,我们使用`?:`运算符将`process.stdout`的值作为可写流。如果设置了,我们使用`rfs.createStream`创建一个可写流,通过`rotating-file-stream`模块处理日志轮换。

在`rfs.createStream`中,第一个参数是日志文件的文件名,第二个是描述要使用的行为的`options`对象。这里提供了一套相当全面的选项。这里的配置在日志文件达到 10 兆字节大小(或 1 天后)时进行日志轮换,并使用`gzip`算法压缩旋转的日志文件。

可以设置多个日志。例如,如果我们想要将日志记录到控制台,除了记录到文件中,我们可以添加以下`logger`声明:
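A sketch of that additional logger declaration:

```js
if (process.env.REQUEST_LOG_FILE) {
  app.use(logger(process.env.REQUEST_LOG_FORMAT || 'dev'));
}
```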

If the REQUEST_LOG_FILE variable is set, the other logger will direct logging to the file. Then, because the variable is set, this logger will be created and will direct logging to the console. Otherwise, if the variable is not set, the other logger will send logging to the console and this logger will not be created.

We use these variables as before, specifying them on the command line, as follows:


使用这个配置,将在`log.txt`中创建一个 Apache 格式的日志。在进行一些请求后,我们可以检查日志:

As expected, our log file has entries in Apache format. Feel free to add one or both of these environment variables to the script in package.json as well.

We've seen how to make a log of the HTTP requests and how to robustly record it in a file. Let's now discuss how to handle debugging messages.

Debugging messages

How many of us debug our programs by inserting console.log statements? Most of us do. Yes, we're supposed to use a debugger, and yes, it is a pain to manage the console.log statements and make sure they're all turned off before committing our changes. The debug package provides a better way to handle debug tracing, which is quite powerful.

For the documentation on the debug package, refer to www.npmjs.com/package/debug.

The Express team uses DEBUG internally, and we can generate quite a detailed trace of what Express does by running Notes this way:


如果要调试 Express,这非常有用。但是,我们也可以在我们自己的代码中使用这个。这类似于插入`console.log`语句,但无需记住注释掉调试代码。

要在我们的代码中使用这个,需要在任何想要调试输出的模块顶部添加以下声明:
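For the Notes application, the declarations look like this:

```js
import { default as DBG } from 'debug';
const debug = DBG('notes:debug');
const dbgerror = DBG('notes:error');
```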

This creates two functions—debug and dbgerror—which will generate debugging traces if enabled. The Debug package calls functions debuggers. The debugger named debug has a notes:debug specifier, while dbgerror has a notes:error specifier. We'll talk in more detail about specifiers shortly.

Using these functions is as simple as this:
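For example (the interpolated variables here are hypothetical):

```js
debug('some message');
debug(`got data ${JSON.stringify(data)}`);   // 'data' is a hypothetical variable
dbgerror(`failed because ${err.stack}`);     // 'err' likewise
```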


当为当前模块启用调试时,这会导致消息被打印出来。如果当前模块未启用调试,则不会打印任何消息。再次强调,这类似于使用`console.log`,但您可以动态地打开和关闭它,而无需修改您的代码,只需适当设置`DEBUG`变量。

`DEBUG`环境变量包含描述哪些代码将启用调试的标识符。最简单的标识符是`*`,它是一个通配符,可以打开每个调试器。否则,调试标识符使用`identifer:identifier`格式。当我们说要使用`DEBUG=express:*`时,该标识符使用`express`作为第一个标识符,并使用`*`通配符作为第二个标识符。

按照惯例,第一个标识符应该是您的应用程序或库的名称。因此,我们之前使用`notes:debug`和`notes:error`作为标识符。但是,这只是一个惯例;您可以使用任何您喜欢的标识符格式。

要向`Notes`添加调试,让我们添加一些代码。将以下内容添加到`app.mjs`的底部:

This is adapted from the httpsniffer.mjs example from Chapter 4, HTTP Servers and Clients, and for every HTTP request, a little bit of information will be printed.

Then, in appsupport.mjs, let's make two changes. Add the following to the top of the onError function:


这将在 Express 捕获的任何错误上输出错误跟踪。

然后,将`onListening`更改为以下内容:
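A sketch based on the onListening function produced by the Express generator, with the console.log call swapped for debug:

```js
function onListening() {
  const addr = server.address();
  const bind = typeof addr === 'string' ? `pipe ${addr}` : `port ${addr.port}`;
  debug(`Listening on ${bind}`);
}
```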

This changes the console.log call to a debug call so that a Listening on message is printed only if debugging is enabled.

If we run the application with the DEBUG variable set appropriately, we get the following output:


仔细看一下,你会发现输出既是来自`morgan`的日志输出,也是来自`debug`模块的调试输出。在这种情况下,调试输出以`notes:debug`开头。由于`REQUEST_LOG_FORMAT`变量,日志输出是以 Apache 格式的。

我们现在有一个准备好使用的调试跟踪系统。下一个任务是看看是否可能在文件中捕获这个或其他控制台输出。

## 捕获 stdout 和 stderr

重要消息可以打印到`process.stdout`或`process.stderr`,如果您不捕获输出,这些消息可能会丢失。最佳做法是捕获这些输出以供将来分析,因为其中可能包含有用的调试信息。更好的做法是使用系统设施来捕获这些输出流。

**系统设施**可以包括启动应用程序并将标准输出和标准错误流连接到文件的进程管理应用程序。

尽管它缺乏这种设施,但事实证明,在 Node.js 中运行的 JavaScript 代码可以拦截`process.stdout`和`process.stderr`流。在可用的包中,让我们看看`capture-console`。对于可写流,该包将调用您提供的回调函数来处理任何输出。

请参考`capture-console`包页面,了解相关文档:[`www.npmjs.com/package/capture-console`](https://www.npmjs.com/package/capture-console)。

最后一个行政事项是确保我们捕获其他未捕获的错误。

## 捕获未捕获的异常和未处理的拒绝的 Promises

未捕获的异常和未处理的拒绝的 Promises 是其他重要信息可能丢失的地方。由于我们的代码应该捕获所有错误,任何未捕获的错误都是我们的错误。如果我们不捕获这些错误,我们的失败分析可能会缺少重要信息。

Node.js 通过进程对象发送的事件指示这些条件,`uncaughtException`和`unhandledRejection`。在这些事件的文档中,Node.js 团队严厉地表示,在任何一种情况下,应用程序都处于未知状态,因为某些事情失败了,可能不安全继续运行应用程序。

要实现这些处理程序,请将以下内容添加到`appsupport.mjs`中:
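A sketch of the two listeners described below, assuming `util` is imported at the top of appsupport.mjs:

```js
import * as util from 'util';

process.on('uncaughtException', (err) => {
  console.error(`I've crashed!!! - ${(err.stack || err)}`);
});

process.on('unhandledRejection', (reason, pr) => {
  console.error(`Unhandled Rejection at: ${util.inspect(pr)} reason: ${reason}`);
});
```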

Because these are events that are emitted from the process object, the way to handle them is to attach an event listener to these events. That's what we've done here.

The names of these events describe their meaning well. An uncaughtException event means an error was thrown but was not caught by a try/catch construct. Similarly, an unhandledRejection event means a Promise ended in a rejected state, but there was no .catch handler.

Our DevOps team will be happier now that we've handled these administrative chores. We've seen how to generate useful log files for HTTP requests, how to implement debug tracing, and even how to capture it to a file. We wrapped up this section by learning how to capture otherwise-uncaught errors.

We're now ready to move on to the real purpose of this chapter—storing notes in persistent storage, such as in a database. We'll implement support for several database systems, starting with a simple system using files on a disk.

Storing notes in a filesystem

Filesystems are an often-overlooked database engine. While filesystems don't have the sort of query features supported by database engines, they are still a reliable place to store files. The Notes schema is simple enough, so the filesystem can easily serve as its data storage layer.

Let's start by adding two functions to the Note class in models/Notes.mjs:
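These two members go inside the existing Note class; a sketch matching the description that follows:

```js
get JSON() {
  return JSON.stringify({
    key: this.key, title: this.title, body: this.body
  });
}

static fromJSON(json) {
  const data = JSON.parse(json);
  if (typeof data !== 'object'
      || !data.hasOwnProperty('key')   || typeof data.key !== 'string'
      || !data.hasOwnProperty('title') || typeof data.title !== 'string'
      || !data.hasOwnProperty('body')  || typeof data.body !== 'string') {
    throw new Error(`Not a Note: ${json}`);
  }
  return new Note(data.key, data.title, data.body);
}
```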


我们将使用这个将`Note`对象转换为 JSON 格式的文本,以及从 JSON 格式的文本转换为`Note`对象。

`JSON`方法是一个 getter,这意味着它检索对象的值。在这种情况下,`note.JSON`属性/getter(没有括号)将简单地给我们提供笔记的 JSON 表示。我们稍后将使用它来写入 JSON 文件。

`fromJSON` 是一个静态函数,或者工厂方法,用于帮助构造 `Note` 对象,如果我们有一个 JSON 字符串。由于我们可能会得到任何东西,我们需要仔细测试输入。首先,如果字符串不是 JSON 格式,`JSON.parse` 将失败并抛出异常。其次,我们有 TypeScript 社区所谓的**类型保护**,或者 `if` 语句,来测试对象是否符合 `Note` 对象所需的条件。这检查它是否是一个带有 `key`、`title` 和 `body` 字段的对象,这些字段都必须是字符串。如果对象通过了这些测试,我们使用数据来构造一个 `Note` 实例。

这两个函数可以如下使用:
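For example:

```js
const note = new Note('example-key', 'Example title', 'Body text for the example');
const json = note.JSON;                 // a JSON string representation of the note
const note2 = Note.fromJSON(json);      // a new Note instance carrying the same data
```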

This example code snippet produces a simple Note instance and then generates the JSON version of the note. Then, a new note is instantiated from that JSON string using fromJSON().

Now, let's create a new module, models/notes-fs.mjs, to implement the filesystem datastore:


这导入了所需的模块;一个额外的添加是使用 `fs-extra` 模块。这个模块被用来实现与核心 `fs` 模块相同的 API,同时添加了一些有用的额外函数。在我们的情况下,我们对 `fs.ensureDir` 感兴趣,它验证指定的目录结构是否存在,如果不存在,则创建一个目录路径。如果我们不需要 `fs.ensureDir`,我们将简单地使用 `fs.promises`,因为它也提供了在 `async` 函数中有用的文件系统函数。

有关 `fs-extra` 的文档,请参考 [`www.npmjs.com/package/fs-extra`](https://www.npmjs.com/package/fs-extra)。

现在,将以下内容添加到 `models/notes-fs.mjs` 中:

The FSNotesStore class is an implementation of AbstractNotesStore, with a focus on storing the Note instances as JSON in a directory. These methods implement the API that we defined in Chapter 5, Your First Express Application. This implementation is incomplete since a couple of helper functions still need to be written, but you can see that it relies on files in the filesystem. For example, the destroy method simply uses fs.unlink to delete the note from the disk. In keylist, we use fs.readdir to read each Note object and construct an array of keys for the notes.

Let's add the helper functions:


`crupdate` 函数用于支持 `update` 和 `create` 方法。对于这个 `Notes` 存储,这两种方法都是相同的,它们将内容写入磁盘作为一个 JSON 文件。

代码中,笔记存储在由 `notesDir` 函数确定的目录中。这个目录可以在 `NOTES_FS_DIR` 环境变量中指定,也可以在 `Notes` 根目录中的 `notes-fs-data` 中指定(从 `approotdir` 变量中得知)。无论哪种方式,我们都使用 `fs.ensureDir` 来确保目录存在。

`Notes` 的路径名是由 `filePath` 函数计算的。

由于路径名是 `${notesDir}/${key}.json`,因此键不能使用文件名中不能使用的字符。因此,如果键包含 `/` 字符,`crupdate` 将抛出错误。

`readJSON` 函数的功能与其名称所示的一样——它从磁盘中读取一个 `Note` 对象作为 JSON 文件。

我们还添加了另一个依赖项:
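That is, fs-extra must be installed in the notes directory:

```
$ npm install fs-extra --save
```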

We're now almost ready to run the Notes application, but there's an issue that first needs to be resolved, which we'll handle using the import() function.

Dynamically importing ES6 modules

Before we start modifying the router functions, we have to consider how to account for multiple AbstractNotesStore implementations. By the end of this chapter, we will have several of them, and we want an easy way to configure Notes to use any of them. For example, an environment variable, NOTES_MODEL, could be used to specify the Notes data model to use, and the Notes application would dynamically load the correct module.

In Notes, we refer to the Notes datastore module from several places. To change from one datastore to another requires changing the source in each of these places. It would be better to locate that selection in one place, and further, to make it dynamically configurable at runtime.

There are several possible ways to do this. For example, in a CommonJS module, it's possible to compute the pathname to the module for a require statement. It would consult the environment variable, NOTES_MODEL, to calculate the pathname for the datastore module, as follows:
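A sketch of that CommonJS approach (for comparison only; Notes uses ES6 modules):

```js
const notesStore = require(`./notes-${process.env.NOTES_MODEL || 'memory'}`);
```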


然而,我们的意图是使用 ES6 模块,因此让我们看看在这种情况下它是如何工作的。因为在常规的 `import` 语句中,模块名不能像这样是一个表达式,所以我们需要使用 `动态导入` 来加载模块。`动态导入` 功能——即 `import()` 函数——允许我们动态计算要加载的模块名。

为了实现这个想法,让我们创建一个新文件 `models/notes-store.mjs`,其中包含以下内容:
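A sketch matching the description that follows:

```js
let _NotesStore;

export async function useModel(model) {
  try {
    // compute the module name at runtime, then load it dynamically
    const NotesStoreModule = await import(`./notes-${model}.mjs`);
    const NotesStoreClass = NotesStoreModule.default;
    _NotesStore = new NotesStoreClass();
    return _NotesStore;
  } catch (err) {
    throw new Error(`No recognized NotesStore in ${model} because ${err}`);
  }
}

export { _NotesStore as NotesStore };
```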

This is what we might call a factory function. It uses import() to load a module whose filename is calculated from the model parameter. We saw in notes-fs.mjs that the FSNotesStore class is the default export. Therefore, the NotesStoreClass variable gets that class, then we call the constructor to create an instance, and then we stash that instance in a global scope variable. That global scope variable is then exported as NotesStore.

We need to make one small change in models/notes-memory.mjs:


任何实现 `AbstractNotesStore` 的模块都将默认导出定义的类。

在 `app.mjs` 中,我们需要对调用这个 `useModel` 函数进行另一个更改。在第五章中,*你的第一个 Express 应用程序*,我们让 `app.mjs` 导入 `models/notes-memory.mjs`,然后设置 `NotesStore` 包含 `InMemoryNotesStore` 的一个实例。具体来说,我们有以下内容:

We need to remove these two lines of code from app.mjs and then add the following:
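A sketch of the replacement code in app.mjs:

```js
import { useModel as useNotesModel } from './models/notes-store.mjs';

useNotesModel(process.env.NOTES_MODEL || "memory")
  .then(store => { /* nothing further to do on success */ })
  .catch(error => { onError({ code: 'ENOTESSTORE', error }); });
```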


我们导入 `useModel`,将其重命名为 `useNotesModel`,然后通过传入 `NOTES_MODEL` 环境变量来调用它。如果 `NOTES_MODEL` 变量未设置,我们将默认使用“memory” `NotesStore`。由于 `useNotesModel` 是一个 `async` 函数,我们需要处理生成的 Promise。`.then` 处理成功的情况,但由于没有需要执行的操作,所以我们提供了一个空函数。重要的是任何错误都会关闭应用程序,因此我们添加了 `.catch`,它调用 `onError` 来处理错误。

为了支持这个错误指示器,我们需要在 `appsupport.mjs` 的 `onError` 函数中添加以下内容:
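Assuming onError switches on the error code, as in the handler produced by the Express generator, the addition looks like this:

```js
case 'ENOTESSTORE':
  console.error(`Notes data store initialization failure because `, error.error);
  process.exit(1);
  break;
```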

This added error handler will also cause the application to exit.

These changes also require us to make another change. The NotesStore variable is no longer in app.mjs, but is instead in models/notes-store.mjs. This means we need to go to routes/index.mjs and routes/notes.mjs, where we make the following change to the imports:
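In both routes/index.mjs and routes/notes.mjs, the import becomes:

```js
import { NotesStore as notes } from '../models/notes-store.mjs';
```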


我们从`notes-store.mjs`中导入`NotesStore`导出,并将其重命名为`notes`。因此,在两个路由模块中,我们将进行诸如`notes.keylist()`的调用,以访问动态选择的`AbstractNotesStore`实例。

这种抽象层提供了期望的结果——设置一个环境变量,让我们在运行时决定使用哪个数据存储。

现在我们已经拥有了所有的部件,让我们运行`Notes`应用程序并看看它的行为。

## 使用文件系统存储运行 Notes 应用程序

在`package.json`中,将以下内容添加到`scripts`部分:
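A sketch of such scripts; the port assignments are assumptions, but the fs-server1/fs-server2 names are referred to later in this section:

```json
"start-fs": "DEBUG=notes:* NOTES_MODEL=fs node ./app.mjs",
"fs-server1": "DEBUG=notes:* NOTES_MODEL=fs PORT=3001 node ./app.mjs",
"fs-server2": "DEBUG=notes:* NOTES_MODEL=fs PORT=3002 node ./app.mjs"
```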

When you add these entries to package.json, make sure you use the correct JSON syntax. In particular, if you leave a comma at the end of the scripts section, it will fail to parse and npm will throw an error message.

With this code in place, we can now run the Notes application, as follows:


我们可以像以前一样在`http://localhost:3000`上使用应用程序。因为我们没有更改任何模板或 CSS 文件,所以应用程序看起来与您在第六章结束时一样。

因为`notes:*`的调试已打开,我们将看到`Notes`应用程序正在执行的任何操作的日志。通过简单地不设置`DEBUG`变量,可以轻松关闭此功能。

您现在可以关闭并重新启动`Notes`应用程序,并查看完全相同的注释。您还可以使用常规文本编辑器(如**vi**)在命令行中编辑注释。您现在可以在不同端口上启动多个服务器,使用`fs-server1`和`fs-server2`脚本,并查看完全相同的注释。

就像我们在第五章结束时所做的那样,*您的第一个 Express 应用程序*,我们可以在两个单独的命令窗口中启动两个服务器。这将在不同的端口上运行两个应用程序实例。然后,在不同的浏览器窗口中访问这两个服务器,您会发现两个浏览器窗口显示相同的注释。

另一个尝试的事情是指定`NOTES_FS_DIR`以定义一个不同的目录来存储注释。

最后的检查是创建一个带有`/`字符的键的注释。请记住,键用于生成我们存储注释的文件名,因此键不能包含`/`字符。在浏览器打开的情况下,单击“添加注释”,并输入一条注释,确保在“键”字段中使用`/`字符。单击提交按钮后,您将看到一个错误,指出这是不允许的。

我们现在已经演示了向`Notes`添加持久数据存储。但是,这种存储机制并不是最好的,还有其他几种数据库类型可以探索。我们列表中的下一个数据库服务是 LevelDB。

# 使用 LevelDB 数据存储存储注释

要开始使用实际数据库,让我们看一下一个极其轻量级、占用空间小的数据库引擎:`level`。这是一个 Node.js 友好的包装器,它包装了 LevelDB 引擎,并由 Google 开发。它通常用于 Web 浏览器进行本地数据持久化,并且是一个非索引的 NoSQL 数据存储,最初是为在浏览器中使用而设计的。Level Node.js 模块使用 LevelDB API,并支持多个后端,包括 leveldown,它将 C++ LevelDB 数据库集成到 Node.js 中。

访问[`www.npmjs.com/package/level`](https://www.npmjs.com/package/level)了解有关此模块的信息。

要安装数据库引擎,请运行以下命令:

This installs the version of level that the following code was written against.

Then, create the models/notes-level.mjs module, which will contain the AbstractNotesStore implementation:


我们从`import`语句和一些声明开始模块。`connectDB`函数用于连接数据库,`createIfMissing`选项也是如其名所示,如果不存在具有所使用名称的数据库,则创建一个数据库。从模块`level`导入的是一个构造函数,用于创建与第一个参数指定的数据库连接的`level`实例。这个第一个参数是文件系统中的位置,换句话说,是数据库将被存储的目录。

`level`构造函数通过返回一个`db`对象来与数据库进行交互。我们将`db`作为模块中的全局变量存储,以便于使用。在`connectDB`中,如果`db`对象已经设置,我们立即返回它;否则,我们使用构造函数打开数据库,就像刚才描述的那样。

数据库的位置默认为当前目录中的`notes.level`。`LEVELDB_LOCATION`环境变量可以设置,如其名称所示,以指定数据库位置。

现在,让我们添加这个模块的其余部分:

As expected, we're creating a LevelNotesStore class to hold the functions.

In this case, we have code in the close function that calls db.close to close down the connection. The level documentation suggests that it is important to close the connection, so we'll have to add something to app.mjs to ensure that the database closes when the server shuts down. The documentation also says that level does not support concurrent connections to the same database from multiple clients, meaning if we want multiple Notes instances to use the database, we should only have the connection open when necessary.

Once again, there is no difference between the create and update operations, and so we use a crupdate function again. Notice that the pattern in all the functions is to first call connectDB to get db, and then to call a function on the db object. In this case, we use db.put to store the Note object in the database.

In the read function, db.get is used to read the note. Since the Note data was stored as JSON, we use Note.fromJSON to decode and instantiate the Note instance.

The destroy function deletes a record from the database using the db.del function.

Both keylist and count use the createKeyStream function. This function uses an event-oriented interface to stream through every database entry, emitting events as it goes. A data event is emitted for each key in the database, while the end event is emitted at the end of the database, and the error event is emitted on errors. Since there is no simple way to present this as a simple async function, we have wrapped it with a Promise so that we can use await. We then invoke createKeyStream, letting it run its course and collect data as it goes. For keylist, in the data events, we add the data (in this case, the key to a database entry) to an array.

For count, we use a similar process, and in this case, we simply increment a counter. Since we have this wrapped in a Promise, in an error event, we call reject, and in an end event, we call resolve.

Then, we add the following to package.json in the scripts section:


最后,您可以运行`Notes`应用程序:

The printout in the console will be the same, and the application will also look the same. You can put it through its paces to check whether everything works correctly.

Since level does not support simultaneous access to a database from multiple instances, you won't be able to use the multiple Notes application scenario. You will, however, be able to stop and restart the application whenever you want to without losing any notes.

Before we move on to looking at the next database, let's deal with an issue mentioned earlier—closing the database connection when the process exits.

Closing database connections when closing the process

The level documentation says that we should close the database connection with db.close. Other database servers may well have the same requirement. Therefore, we should make sure we close the database connection before the process exits, and perhaps also on other conditions.

Node.js provides a mechanism to catch signals sent by the operating system. What we'll do is configure listeners for these events, then close NotesStore in response.

Add the following code to appsupport.mjs:
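A sketch matching the description that follows, assuming the debug function and the server import are already present in appsupport.mjs:

```js
import { NotesStore } from './models/notes-store.mjs';

async function catchProcessDeath() {
  debug('urk...');
  await NotesStore.close();
  await server.close();
  process.exit(0);
}

// operating system signals that should trigger a graceful shutdown
process.on('SIGTERM', catchProcessDeath);
process.on('SIGINT', catchProcessDeath);
process.on('SIGHUP', catchProcessDeath);

process.on('exit', () => { debug('exiting...'); });
```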


我们导入`NotesStore`以便可以调用其方法,`server`已经在其他地方导入。

前三个`process.on`调用监听操作系统信号。如果您熟悉 Unix 进程信号,这些术语会很熟悉。在每种情况下,事件调用`catchProcessDeath`函数,然后调用`NotesStore`和`server`上的`close`函数,以确保关闭。

然后,为了确认一些事情,我们附加了一个`exit`监听器,这样当进程退出时我们可以打印一条消息。Node.js 文档表示,`exit`监听器被禁止执行需要进一步事件处理的任何操作,因此我们不能在此处理程序中关闭数据库连接。

让我们试一下运行`Notes`应用程序,然后立即按下*Ctrl* + *C*:

Sure enough, upon pressing Ctrl + C, the exit and catchProcessDeath listeners are called.

That covers the level database, and we also have the beginning of a handler to gracefully shut down the application. The next database to cover is an embedded SQL database that requires no server processes.

Storing notes in SQL with SQLite3

To get started with more normal databases, let's see how we can use SQL from Node.js. First, we'll use SQLite3, which is a lightweight, simple-to-set-up database engine eminently suitable for many applications.

To learn more about this database engine, visit www.sqlite.org/.

To learn more about the Node.js module, visit github.com/mapbox/node-sqlite3/wiki/API or www.npmjs.com/package/sqlite3.

The primary advantage of SQLite3 is that it doesn't require a server; it is a self-contained, no-set-up-required SQL database. The SQLite3 team also claims that it is very fast and that large, high-throughput applications have been built with it. The downside to the SQLite3 package is that its API requires callbacks, so we'll have to use the Promise wrapper pattern.

The first step is to install the module:
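In the notes directory:

```
$ npm install sqlite3 --save
```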


当然,这会安装`sqlite3`包。

要管理 SQLite3 数据库,您还需要安装 SQLite3 命令行工具。该项目网站为大多数操作系统提供了预编译的二进制文件。您还会发现这些工具在大多数软件包管理系统中都是可用的。

我们可以使用的一个管理任务是设置数据库表,我们将在下一节中看到。

## SQLite3 数据库模式

接下来,我们需要确保我们的数据库配置了适合`Notes`应用程序的数据库表。这是上一节末尾提到的一个示例数据库管理员任务。为此,我们将使用`sqlite3`命令行工具。`sqlite3.org`网站有预编译的二进制文件,或者该工具可以通过您的操作系统的软件包管理系统安装——例如,您可以在 Ubuntu/Debian 上使用`apt-get`,在 macOS 上使用 MacPorts。

对于 Windows,请确保已经安装了 Chocolatey 软件包管理工具,然后以管理员权限启动 PowerShell,并运行"`choco install sqlite`"。这将安装 SQLite3 的 DLL 和其命令行工具,让您可以运行以下指令。

我们将使用以下的 SQL 表定义作为模式(将其保存为`models/schema-sqlite3.sql`):
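A minimal schema matching the fields used by the Notes model:

```sql
CREATE TABLE IF NOT EXISTS notes (
    notekey VARCHAR(255),
    title   VARCHAR(255),
    body    TEXT
);
```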

To initialize the database table, we run the following command:


虽然我们可以这样做,但最佳实践是自动化所有管理过程。为此,我们应该编写一小段脚本来初始化数据库。

幸运的是,`sqlite3`命令为我们提供了一种方法来做到这一点。将以下内容添加到`package.json`的`scripts`部分:
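A sketch of such an entry; the database filename chap07.sqlite3 is an assumption and should match whatever you configure elsewhere:

```json
"sqlite3-setup": "sqlite3 chap07.sqlite3 --init models/schema-sqlite3.sql"
```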

Run the setup script:


这并不是完全自动化,因为我们必须在`sqlite`提示符下按*Ctrl* + *D*,但至少我们不必费心去记住如何做。我们本可以轻松地编写一个小的 Node.js 脚本来做到这一点;然而,通过使用软件包提供的工具,我们在自己的项目中需要维护的代码更少。

有了数据库表的设置,让我们继续编写与 SQLite3 交互的代码。

## SQLite3 模型代码

我们现在准备为 SQLite3 实现一个`AbstractNotesStore`实现。

创建`models/notes-sqlite3.mjs`文件:

This imports the required packages and makes the required declarations. The connectDB function has a similar purpose to the one in notes-level.mjs: to manage the database connection. If the database is not open, it'll go ahead and open it, and it will even make sure that the database file is created (if it doesn't exist). If the database is already open, it'll simply be returned.

Since the API used in the sqlite3 package requires callbacks, we will have to wrap every function call in a Promise wrapper, as shown here.

Now, add the following to models/notes-sqlite3.mjs:


由于有许多成员函数,让我们逐个讨论它们:

In close, the task is to close the database. There's a little dance done here to make sure the global db variable is unset while making sure we can close the database by saving db as _db. The sqlite3 package will report errors from db.close, so we're making sure we report any errors:


我们现在有理由定义`Notes`模型的`create`和`update`操作是分开的,因为每个函数的 SQL 语句是不同的。`create`函数当然需要一个`INSERT INTO`语句,而`update`函数当然需要一个`UPDATE`语句。

`db.run`函数在这里使用了多次,它执行一个 SQL 查询,同时给我们机会在查询字符串中插入参数。

这遵循了 SQL 编程接口中常见的参数替换范式。程序员将 SQL 查询放在一个字符串中,然后在查询字符串中的任何位置放置一个问号,以便在查询字符串中插入一个值。查询字符串中的每个问号都必须与程序员提供的数组中的一个值匹配。该模块负责正确编码这些值,以便查询字符串格式正确,同时防止 SQL 注入攻击。

`db.run`函数只是运行它所给出的 SQL 查询,并不检索任何数据。

To retrieve data using the sqlite3 module, you use the db.get, db.all, or db.each functions. Since our read method only returns one item, we use the db.get function to retrieve just the first row of the result set. By contrast, the db.all function returns all of the rows of the result set at once, and the db.each function retrieves one row at a time, while still allowing the entire result set to be processed.

By the way, this read function has a bug in it—see whether you can spot the error. We'll read more about this in Chapter 13, Unit Testing and Functional Testing, when our testing efforts uncover the bug:


在我们的`destroy`方法中,我们只需使用`db.run`执行`DELETE FROM`语句来删除相关笔记的数据库条目:

In keylist, the task is to collect the keys for all of the Note instances. As we said, db.get returns only the first entry of the result set, while the db.all function retrieves all the rows of the result set. Therefore, we use db.all, although db.each would have been a good alternative.

The contract for this function is to return an array of note keys. The rows object from db.all is an array of results from the database that contains the data we are to return, but we use the map function to convert the array into the format required by this function:


在`count`中,任务类似,但我们只需要表中行的计数。SQL 提供了一个`count()`函数来实现这个目的,我们已经使用了,然后因为这个结果只有一行,我们可以再次使用`db.get`。

这使我们能够使用`NOTES_MODEL`设置为`sqlite3`运行`Notes`。现在我们的代码已经设置好,我们可以继续使用这个数据库运行`Notes`。

## 使用 SQLite3 运行 Notes

我们现在准备使用 SQLite3 运行`Notes`应用程序。将以下代码添加到`package.json`的`scripts`部分:

This sets up the commands that we'll use to test Notes on SQLite3.

We can run the server as follows:


现在你可以在`http://localhost:3000`上浏览应用程序,并像以前一样运行它。

因为我们还没有对`View`模板或 CSS 文件进行任何更改,所以应用程序看起来和以前一样。

当然,你可以使用`sqlite`命令,或其他 SQLite3 客户端应用程序来检查数据库:

The advantage of installing the SQLite3 command-line tools is that we can perform any database administration tasks without having to write any code.

We have seen how to use SQLite3 with Node.js. It is a worthy database for many sorts of applications, plus it lets us use a SQL database without having to set up a server.

The next package that we will cover is an Object Relations Management (ORM) system that can run on top of several SQL databases.

Storing notes the ORM way with Sequelize

There are several popular SQL database engines, such as PostgreSQL, MySQL, and MariaDB. Corresponding to each are Node.js client modules that are similar in nature to the sqlite3 module that we just used. The programmer is close to SQL, which can be good in the same way that driving a stick shift car is fun. But what if we want a higher-level view of the database so that we can think in terms of objects, rather than rows of a database table? ORM systems provide a suitable higher-level interface, and even offer the ability to use the same data model with several databases. Just as driving an electric car provides lots of benefits at the expense of losing out on the fun of stick-shift driving, ORM produces lots of benefits, while also distancing ourselves from the SQL.

The Sequelize package (www.sequelizejs.com/) is Promise-based, offers strong, well-developed ORM features, and can connect to SQLite3, MySQL, PostgreSQL, MariaDB, and MSSQL databases. Because Sequelize is Promise-based, it will fit naturally with the Promise-based application code we're writing.

A prerequisite to most SQL database engines is having access to a database server. In the previous section, we skirted around this issue by using SQLite3, which requires no database server setup. While it's possible to install a database server on your laptop, right now, we want to avoid the complexity of doing so, and so we will use Sequelize to manage a SQLite3 database. We'll also see that it's simply a matter of using a configuration file to run the same Sequelize code against a hosted database such as MySQL. In Chapter 11, Deploying Node.js Microservices with Docker, we'll learn how to use Docker to easily set up a service, including database servers, on our laptop and deploy the exact same configuration to a live server. Most web-hosting providers offer MySQL or PostgreSQL as part of their service.

Before we start on the code, let's install two modules:
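In the notes directory:

```
$ npm install sequelize js-yaml --save
```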


第一个安装了 Sequelize 包。第二个`js-yaml`是安装的,以便我们可以实现一个以 YAML 格式存储 Sequelize 连接配置的文件。YAML 是一种人类可读的**数据序列化语言**,这意味着它是一种易于使用的文本文件格式,用于描述数据对象。

也许最好了解 YAML 的地方是它的维基百科页面,可以在[`en.wikipedia.org/wiki/YAML`](https://en.wikipedia.org/wiki/YAML)找到。

让我们从学习如何配置 Sequelize 开始,然后我们将为 Sequelize 创建一个`AbstractNotesStore`实例,最后,我们将使用 Sequelize 测试`Notes`。

## 配置 Sequelize 并连接到数据库

我们将以与以前不同的方式组织 Sequelize 支持的代码。我们预见到`Notes`表不是`Notes`应用程序将使用的唯一数据模型。我们可以支持其他功能,比如上传笔记的图片或允许用户评论笔记。这意味着需要额外的数据库表,并建立数据库条目之间的关系。例如,我们可能会有一个名为`AbstractCommentStore`的类来存储评论,它将有自己的数据库表和自己的模块来管理评论数据。`Notes`和`Comments`存储区域都应该在同一个数据库中,因此它们应该共享一个数据库连接。

有了这个想法,让我们创建一个文件`models/sequlz.mjs`,来保存管理 Sequelize 连接的代码:

As with the SQLite3 module, the connectDB function manages the connection through Sequelize to a database server. Since the configuration of the Sequelize connection is fairly complex and flexible, we're not using environment variables for the whole configuration, but instead we use a YAML-formatted configuration file that will be specified in an environment variable. Sequelize uses four items of data—the database name, the username, the password, and a parameters object.

When we read in a YAML file, its structure directly corresponds to the object structure that's created. Therefore, with a YAML configuration file, we don't need to use up any brain cells developing a configuration file format. The YAML structure is dictated by the Sequelize params object, and our configuration file simply has to use the same structure.

We also allow overriding any of the fields in this file using environment variables. This will be useful when we deploy Notes using Docker so that we can configure database connections without having to rebuild the Docker container.

For a simple SQLite3-based database, we can use the following YAML file for configuration and name it models/sequelize-sqlite.yaml:
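A sketch of such a configuration file; the storage filename is an assumption:

```yaml
dbname: notes
username:
password:
params:
    dialect: sqlite
    storage: notes-sequelize.sqlite3
```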


`params.dialect`的值决定了要使用的数据库类型;在这种情况下,我们使用的是 SQLite3。根据方言的不同,`params`对象可以采用不同的形式,比如连接到数据库的连接 URL。在这种情况下,我们只需要一个文件名,就像这样给出的。

`authenticate` 调用是为了测试数据库是否正确连接。

`close` 函数做你期望的事情——关闭数据库连接。

有了这个设计,我们可以很容易地通过添加一个运行时配置文件来更改数据库以使用其他数据库服务器。例如,很容易设置一个 MySQL 连接;我们只需创建一个新文件,比如 `models/sequelize-mysql.yaml`,其中包含类似以下代码的内容:
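A sketch; the credentials and host/port values are placeholders to be replaced with your own:

```yaml
dbname: notes
username: notes-db-user
password: notes-db-password
params:
    host: localhost
    port: 3306
    dialect: mysql
```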

This is straightforward. The username and password fields must correspond to the database credentials, while host and port will specify where the database is hosted. Set the database's dialect parameter and other connection information and you're good to go.

To use MySQL, you will need to install the base MySQL driver so that Sequelize can use MySQL:


运行 Sequelize 对其支持的其他数据库,如 PostgreSQL,同样简单。只需创建一个配置文件,安装 Node.js 驱动程序,并安装/配置数据库引擎。

从 `connectDB` 返回的对象是一个数据库连接,正如我们将看到的,它被 Sequelize 使用。因此,让我们开始这一部分的真正目标——定义 `SequelizeNotesStore` 类。

## 为 Notes 应用程序创建一个 Sequelize 模型

与我们使用的其他数据存储引擎一样,我们需要为 Sequelize 创建一个 `AbstractNotesStore` 的子类。这个类将使用 Sequelize `Model` 类来管理一组注释。

让我们创建一个新文件,`models/notes-sequelize.mjs`:

The database connection is stored in the sequelize object, which is established by the connectDB function that we just looked at (which we renamed connectSequlz) to instantiate a Sequelize instance. We immediately return if the database is already connected.

In Sequelize, the Model class is where we define the data model for a given object. Each Model class corresponds to a database table. The Model class is a normal ES6 class, and we start by subclassing it to define the SQNote class. Why do we call it SQNote? That's because we already defined a Note class, so we had to use a different name in order to use both classes.

By calling SQNote.init, we initialize the SQNote model with the fields—that is, the schema—that we want it to store. The first argument to this function is the schema description and the second argument is the administrative data required by Sequelize.

As you would expect, the schema has three fields: notekey, title, and body. Sequelize supports a long list of data types, so consult the documentation for more on that. We are using STRING as the type for notekey and title since both handle a short text string up to 255 bytes long. The body field is defined as TEXT since it does not need a length limit. In the notekey field, you see it is an object with other parameters; in this case, it is described as the primary key and the notekey values must be unique.

Online documentation can be found at the following locations:
Sequelize class: docs.sequelizejs.com/en/latest/api/sequelize/ Defining models: docs.sequelizejs.com/en/latest/api/model/

That manages the database connection and sets up the schema. Now, let's add the SequelizeNotesStore class to models/notes-sequelize.mjs:


首先要注意的是,在每个函数中,我们调用在 `SQNote` 类中定义的静态方法来执行数据库操作。Sequelize 模型类就是这样工作的,它的文档中有一个全面的这些静态方法的列表。

在创建 Sequelize 模型类的新实例时——在本例中是 `SQNote`——有两种模式可供选择。一种是调用 `build` 方法,然后创建对象和 `save` 方法将其保存到数据库。或者,我们可以像这样使用 `create` 方法,它执行这两个步骤。此函数返回一个 `SQNote` 实例,在这里称为 `sqnote`,如果您查阅 Sequelize 文档,您将看到这些实例有一长串可用的方法。我们的 `create` 方法的约定是返回一个注释,因此我们构造一个 `Note` 对象来返回。

在这个和其他一些方法中,我们不想向调用者返回一个 Sequelize 对象。因此,我们构造了我们自己的 `Note` 类的实例,以返回一个干净的对象。

我们的 `update` 方法首先调用 `SQNote.findOne`。这是为了确保数据库中存在与我们给定的键对应的条目。此函数查找第一个数据库条目,其中 `notekey` 匹配提供的键。在快乐路径下,如果存在数据库条目,我们然后使用 `SQNote.update` 来更新 `title` 和 `body` 值,并通过使用相同的 `where` 子句,确保 `update` 操作针对相同的数据库条目。

Sequelize 的 `where` 子句提供了一个全面的匹配操作符列表。如果您仔细考虑这一点,很明显它大致对应于以下 SQL:
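Roughly, the update call corresponds to this SQL:

```sql
UPDATE SQNotes
   SET title = ?, body = ?
 WHERE notekey = ?;
```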

That's what Sequelize and other ORM libraries do—convert the high-level API into database operations such as SQL queries.

To read a note, we use the findOne operation again. There is the possibility of it returning an empty result, and so we have to throw an error to match. The contract for this function is to return a Note object, so we take the fields retrieved using Sequelize to create a clean Note instance.

To destroy a note, we use the destroy operation with the same where clause to specify which entry to delete. This means that, as in the equivalent SQL statement (DELETE FROM SQNotes WHERE notekey = ?), if there is no matching note, no error will be thrown.

Because the keylist function acts on all Note objects, we use the findAll operation. The difference between findOne and findAll is obvious from the names. While findOne returns the first matching database entry, findAll returns all of them. The attributes specifier limits the result set to include the named field—namely, the notekey field. This gives us an array of objects with a field named notekey. We then use a .map function to convert this into an array of note keys.

For the count function, we can just use the count() method to calculate the required result.

This allows us to use Sequelize by setting NOTES_MODEL to sequelize.

Having set up the functions to manage the database connection and defined the SequelizeNotesStore class, we're now ready to test the Notes application.

Running the Notes application with Sequelize

Now, we can get ready to run the Notes application using Sequelize. We can run it against any database server, but let's start with SQLite3. Add the following declarations to the scripts entry in package.json:


这设置了命令以运行单个服务器实例(或两个)。

然后,按以下方式运行它:

As before, the application looks exactly the same because we haven't changed the View templates or CSS files. Put it through its paces and everything should work.

You will be able to start two instances; use separate browser windows to visit both instances and see whether they show the same set of notes.

To reiterate, to use the Sequelize-based model on a given database server, do the following:

  1. Install and provision the database server instance; otherwise, get the connection parameters for an already-provisioned database server.
  2. Install the corresponding Node.js driver.
  3. Write a YAML configuration file corresponding to the connection parameters.
  4. Create new scripts entries in package.json to automate starting Notes against the database.

By using Sequelize, we have dipped our toes into a powerful library for managing data in a database. Sequelize is one of several ORM libraries available for Node.js. We've already used the word comprehensive several times in this section as it's definitely the best word to describe Sequelize.

An alternative that is worthy of exploration is not an ORM library but is what's called a query builder. knex supports several SQL databases, and its role is to simplify creating SQL queries by using a high-level API.

In the meantime, we have one last database to cover before wrapping up this chapter: MongoDB, the leading NoSQL database.

Storing notes in MongoDB

MongoDB is widely used with Node.js applications, a sign of which is the popular MEAN acronym: MongoDB (or MySQL), Express, Angular, and Node.js. MongoDB is one of the leading NoSQL databases, meaning it is a database engine that does not use SQL queries. It is described as a scalable, high-performance, open source, document-oriented database. It uses JSON-style documents with no predefined, rigid schema and a large number of advanced features. You can visit their website for more information and documentation at www.mongodb.org.

Documentation on the Node.js driver for MongoDB can be found at www.npmjs.com/package/mongodb and mongodb.github.io/node-mongodb-native/.

Mongoose is a popular ORM for MongoDB (mongoosejs.com/). In this section, we'll use the native MongoDB driver instead, but Mongoose is a worthy alternative.

First, you will need a running MongoDB instance. Hosted MongoDB services are available from providers such as Compose (www.compose.io/) and ScaleGrid (scalegrid.io/). Nowadays, it is straightforward to host MongoDB as a Docker container as part of a system built of other Docker containers. We'll do this in Chapter 13, Unit Testing and Functional Testing.

It's possible to set up a temporary MongoDB instance for testing on, say, your laptop. It is available in all the operating system package management systems, or you can download a compiled package from mongodb.com. The MongoDB website also has instructions (docs.mongodb.org/manual/installation/).

For Windows, it may be most expedient to use a cloud-hosted MongoDB instance.

Once installed, it's not necessary to set up MongoDB as a background service. Instead, you can run a couple of simple commands to get a MongoDB instance running in the foreground of a command window, which you can kill and restart any time you like.

In a command window, run the following:


这将创建一个数据目录,然后运行 MongoDB 守护程序来对该目录进行操作。

在另一个命令窗口中,您可以按以下方式进行测试:

This runs the Mongo client program with which you can run commands. The command language used here is JavaScript, which is comfortable for us.

This saves a document in the collection named foo. The second command finds all documents in foo, printing them out for you. There is only one document, the one we just inserted, so that's what gets printed. The _id field is added by MongoDB and serves as a document identifier.

This setup is useful for testing and debugging. For a real deployment, your MongoDB server must be properly installed on a server. See the MongoDB documentation for these instructions.

With a working MongoDB installation in our hands, let's get started with implementing the MongoNotesStore class.

A MongoDB model for the Notes application

The official Node.js MongoDB driver (www.npmjs.com/package/mongodb) is created by the MongoDB team. It is very easy to use, as we will see, and its installation is as simple as running the following command:


这为我们设置了驱动程序包,并将其添加到 `package.json`。

现在,创建一个新文件,`models/notes-mongodb.mjs`:

This sets up the required imports, as well as the functions to manage a connection with the MongoDB database.

The MongoClient class is used to connect with a MongoDB instance. The required URL, which will be specified through an environment variable, uses a straightforward format: mongodb://localhost/. The database name is specified via another environment variable.

The documentation for the MongoDB Node.js driver can be found at mongodb.github.io/node-mongodb-native/.

There are both reference and API documentation available. In the API section, the MongoClient and Db classes are the ones that most relate to the code we are writing (mongodb.github.io/node-mongodb-native/).

The connectDB function creates the database client object. This object is only created as needed. The connection URL is provided through the MONGO_URL environment variable.

The db function is a simple wrapper around the client object to access the database that is used for the Notes application, which we specify via the MONGO_DBNAME environment variable. Therefore, to access the database, the code will have to call db().mongoDbFunction().

Now, we can implement the MongoDBNotesStore class:


MongoDB 将所有文档存储在集合中。*集合* 是一组相关文档,类似于关系数据库中的表。这意味着创建一个新文档或更新现有文档始于将其构造为 JavaScript 对象,然后要求 MongoDB 将对象保存到数据库中。MongoDB 自动将对象编码为其内部表示形式。

`db().collection` 方法为我们提供了一个 `Collection` 对象,我们可以使用它来操作命名集合。在这种情况下,我们使用 `db().collection('notes')` 访问 `notes` 集合。

有关 `Collection` 类的文档,请参阅之前引用的 MongoDB Node.js 驱动程序文档。

在`create`方法中,我们使用`insertOne`;顾名思义,它将一个文档插入到集合中。这个文档用于`Note`类的字段。同样,在`update`方法中,`updateOne`方法首先找到一个文档(在这种情况下,通过查找具有匹配`notekey`字段的文档),然后根据指定的内容更改文档中的字段,然后将修改后的文档保存回数据库。

`read`方法使用`db().findOne`来搜索笔记。

`findOne`方法采用所谓的*查询选择器*。在这种情况下,我们要求与`notekey`字段匹配。MongoDB 支持一套全面的查询选择器操作符。

另一方面,`updateOne`方法采用所谓的*查询过滤器*。作为一个`update`操作,它在数据库中搜索与过滤器匹配的记录,根据更新描述符更新其字段,然后将其保存回数据库。

关于 MongoDB CRUD 操作的概述,包括插入文档、更新文档、查询文档和删除文档,请参阅[`docs.mongodb.com/manual/crud/`](https://docs.mongodb.com/manual/crud/)。

有关查询选择器的文档,请参阅[`docs.mongodb.com/manual/reference/operator/query/#query-selectors`](https://docs.mongodb.com/manual/reference/operator/query/#query-selectors)。

有关查询过滤器的文档,请参阅[`docs.mongodb.com/manual/core/document/#query-filter-documents`](https://docs.mongodb.com/manual/core/document/#query-filter-documents)。

有关更新描述符的文档,请参阅[`docs.mongodb.com/manual/reference/operator/update/`](https://docs.mongodb.com/manual/reference/operator/update/)。

MongoDB 有许多基本操作的变体。例如,`findOne`是基本`find`方法的一个变体。

在我们的`destroy`方法中,我们看到另一个`find`变体,`findOneAndDelete`。顾名思义,它查找与查询描述符匹配的文档,然后删除该文档。

在`keylist`方法中,我们需要处理集合中的每个文档,因此`find`查询选择器为空。`find`操作返回一个`Cursor`,这是一个用于导航查询结果的对象。`Cursor.forEach`方法采用两个回调函数,不是一个 Promise 友好的操作,因此我们必须使用一个 Promise 包装器。第一个回调函数对查询结果中的每个文档都会调用,而在这种情况下,我们只是将`notekey`字段推送到一个数组中。第二个回调函数在操作完成时调用,并且我们通知 Promise 它是成功还是失败。这给我们了我们的键数组,它返回给调用者。

有关`Cursor`类的文档,请参阅[`mongodb.github.io/node-mongodb-native/3.1/api/Cursor.html`](http://mongodb.github.io/node-mongodb-native/3.1/api/Cursor.html)。

在我们的`count`方法中,我们简单地调用 MongoDB 的`count`方法。`count`方法采用查询描述符,并且顾名思义,计算与查询匹配的文档数量。由于我们给出了一个空的查询选择器,它最终计算整个集合。

这使我们可以将`NOTES_MODEL`设置为`mongodb`来使用 MongoDB 数据库运行 Notes。

现在我们已经为 MongoDB 编写了所有的代码,我们可以继续测试`Notes`。

## 使用 MongoDB 运行 Notes 应用程序

我们准备使用 MongoDB 数据库测试`Notes`。到目前为止,你知道该怎么做;将以下内容添加到`package.json`的`scripts`部分:

The MONGO_URL environment variable is the URL to connect with your MongoDB database. This URL is the one that you need to use to run MongoDB on your laptop, as outlined at the top of this section. If you have a MongoDB server somewhere else, you'll be provided with the relevant URL to use.

You can start the Notes application as follows:


`MONGO_URL`环境变量应包含与您的 MongoDB 数据库连接的 URL。这里显示的 URL 对于在本地机器上启动 MongoDB 服务器是正确的,就像您在本节开始时在命令行上启动 MongoDB 一样。否则,如果您在其他地方提供了 MongoDB 服务器,您将被告知访问 URL 是什么,您的`MONGO_URL`变量应该有该 URL。

您可以启动两个`Notes`应用程序实例,并查看它们都共享相同的笔记集。

我们可以验证 MongoDB 数据库最终是否具有正确的值。首先,这样启动 MongoDB 客户端程序:

再次强调,这是基于迄今为止所呈现的 MongoDB 配置,如果您的配置不同,请在命令行上添加 URL。这将启动与 Notes 配置的数据库连接的交互式 MongoDB shell。要检查数据库的内容,只需输入命令:db.notes.find()。这将打印出每个数据库条目。

有了这一点,我们不仅完成了对Notes应用程序中 MongoDB 的支持,还支持了其他几种数据库,因此我们现在准备结束本章。

总结

在本章中,我们经历了不同的数据库技术的真正风暴。虽然我们一遍又一遍地看了同样的七个函数,但接触到各种数据存储模型和完成任务的方式是有用的。即便如此,在 Node.js 中访问数据库和数据存储引擎的选项只是触及了表面。

通过正确抽象模型实现,我们能够轻松地在不改变应用程序其余部分的情况下切换数据存储引擎。这种技术让我们探索了 JavaScript 中子类化的工作原理,以及创建相同 API 的不同实现的概念。此外,我们还对import()函数进行了实际介绍,并看到它可以用于动态选择要加载的模块。

在现实生活中的应用程序中,我们经常为类似的目的创建抽象。它们帮助我们隐藏细节或允许我们更改实现,同时使应用程序的其余部分与更改隔离。我们用于我们的应用程序的动态导入对于动态拼接应用程序非常有用;例如,加载给定目录中的每个模块。

我们避免了设置数据库服务器的复杂性。正如承诺的那样,当我们探索将 Node.js 应用程序部署到 Linux 服务器时,我们将在第十章中进行讨论,将 Node.js 应用程序部署到 Linux 服务器

通过将我们的模型代码专注于存储数据,模型和应用程序应该更容易测试。我们将在第十三章中更深入地研究这一点,单元测试和功能测试

在下一章中,我们将专注于支持多个用户,允许他们登录和退出,并使用 OAuth 2 对用户进行身份验证。

通过微服务对用户进行身份验证

现在我们的 Notes 应用程序可以将数据保存在数据库中,我们可以考虑下一步,即使这成为一个真正的应用程序的下一阶段,即对用户进行身份验证。

登录网站并使用其服务是非常自然的。我们每天都这样做,甚至信任银行和投资机构通过网站上的登录程序来保护我们的财务信息。超文本传输协议(HTTP)是一种无状态协议,网页应用程序无法通过 HTTP 请求比较多了解用户的信息。因为 HTTP 是无状态的,HTTP 请求本身并不知道用户的身份,也不知道驱动网络浏览器的用户是否已登录,甚至不知道 HTTP 请求是否由人发起。

用户身份验证的典型方法是向浏览器发送包含令牌的 cookie,以携带用户的身份,并指示该浏览器是否已登录。

使用 Express,最好的方法是使用express-session中间件,它可以处理带有 cookie 的会话管理。它易于配置,但不是用户身份验证的完整解决方案,因为它不处理用户登录/注销。

在用户身份验证方面,似乎领先的包是 Passport(passportjs.org/)。除了对本地用户信息进行身份验证外,它还支持对长列表的第三方服务进行身份验证。有了这个,可以开发一个网站,让用户使用来自另一个网站(例如 Twitter)的凭据进行注册。

我们将使用 Passport 来对用户进行身份验证,无论是存储在本地数据库中还是 Twitter 账户中。我们还将利用这个机会来探索基于 REST 的微服务,使用 Node.js。

原因是通过将用户信息存储在高度保护的飞地中,可以增加安全性的机会更大。许多应用团队将用户信息存储在一个受到严格控制的 API 和甚至物理访问用户信息数据库的严格控制区域中,尽可能多地实施技术屏障以防止未经批准的访问。我们不会走得那么远,但在本书结束时,用户信息服务将部署在自己的 Docker 容器中。

在本章中,我们将讨论以下三个方面:

  • 创建一个微服务来存储用户资料/身份验证数据。

  • 使用本地存储的密码对用户进行身份验证。

  • 使用 OAuth2 支持通过第三方服务进行身份验证。具体来说,我们将使用 Twitter 作为第三方身份验证服务。

让我们开始吧!

首先要做的是复制上一章节使用的代码。例如,如果你将该代码保存在chap07/notes目录中,那么创建一个新目录chap08/notes

创建用户信息微服务

我们可以通过简单地向现有的Notes应用程序添加用户模型、一些路由和视图来实现用户身份验证和账户。虽然这很容易,但在真实的生产应用程序中是否会这样做呢?

考虑到用户身份信息的高价值和对强大可靠用户身份验证的极大需求。网站入侵经常发生,而似乎最经常被盗窃的是用户身份。因此,我们之前宣布了开发用户信息微服务的意图,但首先我们必须讨论这样做的技术原因。

当然,微服务并不是万能药,这意味着我们不应该试图将每个应用程序都强行塞进微服务的框架中。类比一下,微服务与 Unix 哲学中的小工具相契合,每个工具都做一件事情很好,然后我们将它们混合/匹配/组合成更大的工具。这个概念的另一个词是可组合性。虽然我们可以用这种哲学构建许多有用的软件工具,但它适用于诸如 Photoshop 或 LibreOffice 之类的应用程序吗?

这就是为什么微服务在应用团队中如此受欢迎的原因。如果使用得当,微服务架构更加灵活。正如我们之前提到的,我们的目标是实现高度安全的微服务部署。

决定已经做出,还有两个关于安全性影响的决定需要做。它们如下:

  • 我们要创建自己的 REST 应用程序框架吗?

  • 我们要创建自己的用户登录/身份验证框架吗?

在许多情况下,最好使用一个声誉良好的现有库,其中维护者已经解决了许多 bug,就像我们在上一章中使用 Sequelize ORM (Object-Relational Mapping)库一样,因为它很成熟。我们已经为 Notes 项目的这个阶段确定了两个库。

我们已经提到使用 Passport 来支持用户登录,以及对 Twitter 用户进行身份验证。

对于 REST 支持,我们本可以继续使用 Express,但我们将使用 Restify (restify.com/),这是一个流行的面向 REST 的应用程序框架。

为了测试服务,我们将编写一个命令行工具,用于管理数据库中的用户信息。我们不会在 Notes 应用程序中实现管理用户界面,而是依靠这个工具来管理用户。作为一个副作用,我们将拥有一个用于测试用户服务的工具。

一旦这项服务正常运行,我们将开始修改 Notes 应用程序,以从服务中访问用户信息,同时使用 Passport 来处理身份验证。

第一步是创建一个新目录来保存用户信息微服务。这应该是 Notes 应用程序的同级目录。如果您创建了一个名为chap08/notes的目录来保存 Notes 应用程序,那么请创建一个名为chap08/users的目录来保存微服务。

然后,在chap08/users目录中,运行以下命令:


This gets us ready to start coding. We'll use the `debug` module for logging messages, `js-yaml` to read the Sequelize configuration file, `restify` for its REST framework, and `sequelize/sqlite3` for database access.

In the sections to come, we will develop a database model to store user information, and then create a REST service to manage that data. To test the service, we'll create a command-line tool that uses the REST API.

## Developing the user information model

We'll be storing the user information using a Sequelize-based model in a SQL database. We went through that process in the previous chapter, but we'll do it a little differently this time. Rather than go for the ultimate flexibility of using any kind of database, we'll stick with Sequelize since the user information model is very simple and a SQL database is perfectly adequate.

The project will contain two modules. In this section, we'll create `users-sequelize.mjs`, which will define the SQUser schema and a couple of utility functions. In the next section, we'll start on `user-server.mjs`, which contains the REST server implementation. 

First, let's ponder an architectural preference. Just how much should we separate between the data model code interfacing with the database from the REST server code? In the previous chapter, we went for a clean abstraction with several implementations of the database storage layer. For a simple server such as this, the REST request handler functions could contain all database calls, with no abstraction layer. Which is the best approach? We don't have a hard rule to follow. For this server, we will have database code more tightly integrated to the router functions, with a few shared functions.

Create a new file named `users-sequelize.mjs` in `users` containing the following code:

与我们基于 Sequelize 的 Notes 模型一样,我们将使用YAML Ain't Markup Language (YAML)文件来存储连接配置。我们甚至使用相同的环境变量SEQUELIZE_CONNECT,以及相同的覆盖配置字段的方法。这种方法类似,通过connectDB函数设置连接并初始化 SQUsers 表。

通过这种方法,我们可以使用SEQUELIZE_CONNECT变量中的基本配置文件,然后使用其他环境变量来覆盖其字段。当我们开始部署 Docker 容器时,这将非常有用。

这里显示的用户配置文件模式是从 Passport 提供的规范化配置文件派生出来的,有关更多信息,请参阅www.passportjs.org/docs/profile

Passport 项目通过将多个第三方服务提供的用户信息协调为单个对象定义来开发了这个对象。为了简化我们的代码,我们只是使用了 Passport 定义的模式。

有几个函数需要创建,这些函数将成为管理用户数据的 API。让我们将它们添加到users-sequelize.mjs的底部,从以下代码开始:


In Restify, the route handler functions supply the same sort of `request` and `response` objects we've already seen. We'll go over the configuration of the REST server in the next section. Suffice to say that REST parameters arrive in the request handlers as the `req.params` object, as shown in the preceding code block. This function simplifies the gathering of those parameters into a simple object that happens to match the SQUser schema, as shown in the following code block:

当我们从数据库中获取 SQUser 对象时,Sequelize 显然会给我们一个具有许多额外字段和 Sequelize 使用的函数的 Sequelize 对象。我们不希望将这些数据发送给我们的调用者。此外,我们认为不提供密码数据超出此服务器的边界将增加安全性。这个函数从 SQUser 实例中产生一个简单的、经过消毒的匿名 JavaScript 对象。我们本可以定义一个完整的 JavaScript 类,但那有什么用呢?这个匿名的 JavaScript 类对于这个简单的服务器来说已经足够了,如下面的代码块所示:


The pair of functions shown in the preceding code block provides some database operations that are used several times in the `user-server.mjs` module. 

In `findOneUser`, we are looking up a single SQUser, and then returning a sanitized copy. In `createUser`, we gather the user parameters from the request object, create the SQUser object in the database, and then retrieve that newly created object to return it to the caller.

If you refer back to the `connectDB` function, there is a `SEQUELIZE_CONNECT` environment variable for the configuration file. Let's create one for SQLite3 that we can name `sequelize-sqlite.yaml`, as follows:

这就像我们在上一章中使用的配置文件一样。

这是我们在服务的数据库端所需要的。现在让我们继续创建 REST 服务。

为用户信息创建一个 REST 服务器

用户信息服务是一个用于处理用户信息数据和身份验证的 REST 服务器。我们的目标当然是将其与 Notes 应用程序集成,但在一个真实的项目中,这样的用户信息服务可以与多个 Web 应用程序集成。REST 服务将提供我们在开发 Notes 中用户登录/注销支持时发现有用的功能,我们稍后将在本章中展示。

package.json文件中,将main标签更改为以下代码行:


This declares that the module we're about to create, `user-server.mjs`, is the main package of this project.

Make sure the scripts section contains the following script:

显然,这是我们启动服务器的方式。它使用上一节的配置文件,并指定我们将在端口5858上监听。

然后,创建一个名为user-server.mjs的文件,其中包含以下代码:


We're using Restify, rather than Express, to develop this server. Obviously, the Restify API has similarities with Express, since both point to the Ruby framework Sinatra for inspiration. We'll see even more similarities when we talk about the route handler functions.

What we have here is the core setup of the REST server. We created the server object and added a few things that, in Express, were called *middleware*, but what Restify simply refers to as *handlers*. A Restify handler function serves the same purpose as an Express middleware function. Both frameworks let you define a function chain to implement the features of your service. One calls it a *middleware* function and the other calls it a *handler* function, but they're almost identical in form and function.

We also have a collection of listener functions that print a startup message and handle uncaught errors. You do remember that it's important to catch the uncaught errors?

An interesting thing is that, since REST services are often versioned, Restify has built-in support for handling version numbers. Restify supports **semantic versioning** (**SemVer**) version matching in the `Accept-Version` HTTP header. 

In the *handlers* that were installed, they obviously have to do with authorization and parsing parameters from the **Uniform Resource Locator** (**URL**) query string and from the HTTP body. The handlers with names starting with `restify.plugins` are maintained by the Restify team, and documented on their website.

That leaves the handler simply named *check*. This handler is in `user-server.mjs` and provides a simple mechanism of token-based authentication for REST clients.

Add the following code to the bottom of `user-server.mjs`:

这个处理程序对每个请求都执行,并紧随restify.plugins.authorizationParser。它查找授权数据,特别是 HTTP 基本授权,是否已在 HTTP 请求中提供。然后它循环遍历apiKeys数组中的键列表,如果基本授权参数匹配,则接受调用者。

这不应被视为最佳实践的示例,因为 HTTP 基本认证被广泛认为极不安全,还有其他问题。但它演示了基本概念,并且还表明通过类似的处理程序轻松实现基于令牌的授权。

这也向我们展示了 Restify 处理程序函数的函数签名,即与 Express 中间件使用的相同签名,requestresult对象以及next回调。

Restify 和 Express 在next回调的使用上有很大的区别。在 Express 中,记住中间件函数调用next,除非该中间件函数是处理链上的最后一个函数,例如,如果函数已经调用了res.send(或等效的)来向调用者发送响应。在 Restify 中,每个处理程序函数都调用next。如果处理程序函数知道它应该是处理程序链上的最后一个函数,那么它使用next(false);否则,它调用next()。如果处理程序函数需要指示错误,它调用next(err),其中err是一个对象,instanceof Errortrue

考虑以下假设的处理程序函数:
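A sketch of such a hypothetical handler; lookupData is an invented helper used only to illustrate the three next() cases listed below:

```js
server.get('/example/:id', (req, res, next) => {
  lookupData(req.params.id, (err, data) => {
    if (err) {
      next(new Error('Error description'));  // report an error
    } else if (data) {
      res.send(data);
      next(false);                           // this handler completes the request
    } else {
      next();                                // let the next handler run
    }
  });
});
```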


This shows the following three cases: 

1.  Errors are indicated with `next(new Error('Error description'))`.
2.  Completion is indicated with `next(false)`. 
3.  The continuation of processing is indicated with `next()`. 

We have created the starting point for a user information data model and the matching REST service. The next thing we need is a tool to test and administer the server.

What we want to do in the following sections is two things. First, we'll create the REST handler functions to implement the REST API. At the same time, we'll create a command-line tool that will use the REST API and let us both test the server and add or delete users.

### Creating a command-line tool to test and administer the user authentication server

To give ourselves assurance that the user authentication server works, let's write a tool with which to exercise the server that can also be used for administration. In a typical project, we'd create not only a customer-facing web user interface, but also an administrator-facing web application to administer the service. Instead of doing that here, we'll create a command-line tool.

The tool will be built with Commander, a popular framework for developing command-line tools in Node.js. With Commander, we can easily build a **command-line interface** (**CLI**) tool supporting the `program verb --option optionValue parameter` pattern.

For documentation on Commander, see [`www.npmjs.com/package/commander`](https://www.npmjs.com/package/commander).

Any command-line tool looks at the `process.argv` array to know what to do. This array contains strings parsed from what was given on the command line. The concept for all this goes way back to the earliest history of Unix and the C programming language. 

For documentation on the `process.argv` array, refer to [`nodejs.org/api/process.html#process_process_argv`](https://nodejs.org/api/process.html#process_process_argv).

By using Commander, we have a simpler path of dealing with the command line. It uses a declarative approach to handling command-line parameters. This means we use Commander functions to declare the options and sub-commands to be used by this program, and then we ask Commander to parse the command line the user supplies. Commander then calls the functions we declare based on the content of the command line.

Create a file named `cli.mjs` containing the following code:

This is just the starting point for the command-line tool. For most of the REST handler functions, we will also implement a corresponding sub-command in this tool. We will deal with that code in later sections. For now, let's focus on how the command-line tool is set up.

The Commander project suggests naming the default import `program`, as shown in the preceding code block. As mentioned earlier, we declare command-line options and sub-commands by calling methods on this object.

For the command line to be parsed correctly, the last line of code in `cli.mjs` must be as follows:


The `process.argv` variable is, of course, the command-line arguments split out into an array. Commander, then, is processing those arguments based on the options' declarations.

For the REST client, we use the `restify-clients` package. As the name implies, this is a companion package to Restify and is maintained by the Restify team.

At the top of this script, we declare a few variables to hold connection parameters. The goal is to create a connection URL to access the REST service. The `connect_url` variable is initialized with the default value, which is port `5858` on the localhost. 

The function named `client` looks at the information Commander parses from the command line, as well as a number of environment variables. From that data, it deduces any modification to the `connect_url` variable. The result is that we can connect to this service on any server from our laptop to a faraway cloud-hosted server.

We've also hardcoded the access token and the use of Basic Auth. Put on the backlog a high-priority task to change to a stricter form of authentication.

Where do the values of `program.port`, `program.host`, and `program.url` come from? We declared those variables—that's where they came from.

Consider the following line of code:

This declares an option, either `-p` or `--port`, that Commander will parse out of the command line. Notice that all we did was write a text string, and from that, Commander knows it must parse these options. Isn't that easy?

When it sees one of these options, the `<port>` declaration tells Commander that the option requires an argument. Commander parses that argument out of the command line and then assigns it to `program.port`.

Hence, `program.port`, `program.host`, and `program.url` are all declared in a similar way. When Commander sees these options, it creates the corresponding variables, and our `client` function then takes that data and modifies `connect_url` appropriately.
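As a sketch of the shape of this code (assuming a Commander release where parsed option values are attached directly to the `program` object, as described here; the option text, environment variable handling, and the hardcoded token are illustrative):

```javascript
import { default as program } from 'commander';
import { default as restifyClients } from 'restify-clients';

program
  .option('-p, --port <port>', 'Port number for user server, if using localhost')
  .option('--host <host>', 'Hostname or IP address for user server, if using localhost')
  .option('--url <url>', 'Connection URL for user server, if using a remote server');

// Compute the connection URL from options and environment, then
// create a REST client object carrying the hardcoded Basic Auth token
function client(program) {
  let connect_url = new URL('http://localhost:5858');
  if (typeof process.env.PORT === 'string') connect_url.port = process.env.PORT;
  if (program.port) connect_url.port = program.port;
  if (program.host) connect_url.hostname = program.host;
  if (program.url)  connect_url = new URL(program.url);

  const connection = restifyClients.createJsonClient({
    url: connect_url.href, version: '*'
  });
  connection.basicAuth('them', 'D4ED43C0-SAMPLE-TOKEN'); // illustrative token
  return connection;
}

// ... sub-command declarations are added in the following sections ...

program.parse(process.argv);
```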

A side effect of these declarations is that Commander can automatically generate help text. We will be able to type the following to see the result:


The text comes directly from the descriptive text we put in the declarations. Likewise, each of the sub-commands also takes a `--help` option to print out corresponding help text.

With all that out of the way, let's start creating these commands and REST functions.

### Creating a user in the user information database

We have the starting point for the REST server, and the starting point for a command-line tool to administer the server. Let's start creating the functions—and, of course, the best place to start is to create an SQUser object.

In `user-server.mjs`, add the following route handler:

This function handles a `POST` request on the `/create-user` URL. It should look very similar to an Express route handler function, except for the use of the `next` callback; refer back to the earlier discussion of that point. As we did in the Notes application, we declare the handler callback as an async function and then use a `try/catch` construct so that all errors are caught and reported as errors.

The handler starts with `connectDB` to ensure the database is set up. Then, if you review the `createUser` function, you will see that it gathers the user data from the request parameters and then uses `SQUser.create` to create an entry in the database. Here, we receive the sanitized user object and simply return it to the caller.
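A sketch of what this handler might look like, assuming the `connectDB`, `createUser`, and related helper functions described in this chapter (details may differ from the book's listing):

```javascript
// Sketch of the /create-user route handler
server.post('/create-user', async (req, res, next) => {
  try {
    await connectDB();                     // make sure the database is set up
    const result = await createUser(req);  // gather params, then SQUser.create(...)
    res.send(result);                      // return the sanitized user object
    next(false);                           // this handler ends the chain
  } catch (err) {
    res.send(500, err);
    next(false);
  }
});
```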

Let's also add the following code to `user-server.mjs`:


This is a variation on creating an SQUser. While implementing login support in the Notes application, there was a scenario in which we had an authenticated user that may or may not already have an SQUser object in the database. In this case, we look to see whether the user already exists and, if not, then we create that user.

Let's turn now to `cli.mjs` and implement the sub-commands to handle these two REST functions, as follows:

By using `program.command`, we declare a sub-command, in this case `add`. The `<username>` declaration says that this sub-command takes an argument, and Commander will pass the value of the `username` argument to the `action` callback.

The structure of a `program.command` declaration starts by declaring the syntax of the sub-command. The `description` method supplies user-friendly documentation. The `option` method calls are options specific to this sub-command, rather than global options. Finally, the `action` method is where we supply the callback function that will be invoked when Commander sees this sub-command on the command line.

Any arguments declared in the `program.command` string end up as arguments to that callback function.

The option values for this sub-command land in the `cmdObj` object. By contrast, global option values are attached to the `program` object.

With that understood, we can see that this sub-command gathers information from the command line and then uses the `client` function to connect to the server. It calls the `/create-user` URL, passing along the data gathered from the command line. When the response arrives, it prints either the error or the result object.
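A sketch of the shape of the `add` sub-command (the option list is abbreviated and the field names are illustrative; the real listing declares every SQUser field):

```javascript
// Sketch of the "add" sub-command in cli.mjs
program
  .command('add <username>')
  .description('Add a user to the user server')
  .option('--password <password>', 'Password for the new user')
  .option('--family-name <familyName>', 'Family name (last name)')
  .option('--given-name <givenName>', 'Given name (first name)')
  .action((username, cmdObj) => {
    const topost = {
      username,
      password: cmdObj.password,
      familyName: cmdObj.familyName,
      givenName: cmdObj.givenName,
      provider: 'local'
    };
    client(program).post('/create-user', topost, (err, req, res, obj) => {
      if (err) console.error(err.stack);
      else console.log('Created', obj);
    });
  });
```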

Now let's add the sub-command corresponding to the `/find-or-create` URL, as follows:


This is very similar, except for calling `/find-or-create`.

We have enough here to run the server and try the following two commands:

We run that command in one window to start the server. In another command window, we can run the following commands:


Over in the server window, it will print a trace of the actions taken in response to this. But it's what we expect: the values we gave on the command line are in the database, as shown in the following code block:

Likewise, we successfully used the `find-or-create` command.

This lets us create SQUser objects. Next, let's see how to read them back from the database.

### Reading user data from the user information service

The next thing we want to support is looking up a user in the user information service. This is not a general search facility; rather, we need to retrieve the SQUser object for a given username. We already have a utility function for that purpose; all that's needed is to wire up a REST endpoint.

In `user-server.mjs`, add the following function:


And, as expected, that was easy enough. For the `/find` URL, we need to supply the username in the URL. The code simply looks up the SQUser object using the existing utility function.

A related function retrieves the SQUser objects for all users. Add the following code to `user-server.mjs`:

We know from the previous chapter that the `findAll` operation retrieves all matching objects, and that passing an empty query selector, as we do here, causes `findAll` to match every SQUser object. Therefore, this performs the task we described, retrieving information on all users.

Then, in `cli.mjs`, we add the following sub-command declarations:


This is similarly easy. We pass the username provided on our command line in the `/find` URL and then print out the result. Likewise, for the `list-users` sub-command, we simply call `/list` on the server and print out the result.

After restarting the server, we can test the commands, as follows:

And the results are just as we expected.

The next operation we need is updating an SQUser object.

### Updating user information in the user information service

The next feature to add is updating user information. For this, we can use Sequelize's `update` function and simply expose it as a REST operation.

To do so, add the following code to `user-server.mjs`:


The caller is to provide the same set of user information parameters, which will be picked up by the `userParams` function. We then use the `update` function, as expected, and then retrieve the modified SQUser object, sanitize it, and send it as the result.

To match that function, add the following code to `cli.mjs`:

As expected, this sub-command must take the same set of user information parameters. It then bundles those parameters into an object and posts it to the `/update-user` endpoint on the REST server.

Then, to test the result, we run the following commands:


And, indeed, we managed to change Snuffy's email address.

The next operation is to delete an SQUser object.

### Deleting a user record from the user information service

Our next operation will complete the **create, read, update, and delete** (**CRUD**) operations by letting us delete a user.

Add the following code to `user-server.mjs`:

This is simple enough. We first look up the user to make sure it exists, and then call the `destroy` function on the SQUser object. No result is needed, so we send an empty object.

To exercise this function, add the following code to `cli.mjs`:


This is simply to send a `DELETE` request to the server on the `/destroy` URL. 

And then, to test it, run the following command:

First, we deleted Snuffy's user record and got the expected empty response. Then, we tried to retrieve his record and, as expected, got an error.

While that completes the CRUD operations, there is one last task to take care of.

### Checking a user's password in the user information service

How could we have a user login/logout service without being able to check passwords? The question is: where should the password check happen? Without digging too deeply, it seems clearly best to perform it inside the user information service. We described this decision earlier; it is probably more secure never to expose user passwords beyond the user information service. Therefore, the password check should happen in that service so that passwords never travel outside its boundary.

Let's start with the following function in `user-server.mjs`:


This lets us support the checking of user passwords. There are three conditions to check, as follows:

*   Whether there is no such user
*   Whether the passwords matched
*   Whether the passwords did not match

The code neatly determines all three conditions and returns an object indicating, via the `check` field, whether the user is authenticated. The caller is to send `username` and `password` parameters that will be checked.
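A sketch of the shape of this handler, assuming the SQUser model still stores a plain-text `password` field at this point (later in the chapter we switch to encrypted passwords), and with the endpoint name as an assumption:

```javascript
// Sketch of the password-check route handler in user-server.mjs
server.post('/password-check', async (req, res, next) => {
  try {
    await connectDB();
    const user = await SQUser.findOne({ where: { username: req.params.username } });
    let checked;
    if (!user) {
      checked = { check: false, username: req.params.username,
                  message: 'Could not find user' };
    } else if (user.password === req.params.password) {
      checked = { check: true, username: user.username };
    } else {
      checked = { check: false, username: req.params.username,
                  message: 'Incorrect password' };
    }
    res.send(checked);
    next(false);
  } catch (err) {
    res.send(500, err);
    next(false);
  }
});
```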

To check it out, let's add the following code to `cli.mjs`:

And, as expected, the code to invoke this operation is simple. We take the `username` and `password` parameters from the command line, send them to the server, and then print the result.

To verify that it works, run the following commands:


Indeed, the correct password gives us a `true` indicator, while the wrong password gives us `false`.

We've done a lot in this section by implementing a user information service. We successfully created a REST service while thinking about architectural choices around correctly handling sensitive user data. We were also able to verify that the REST service is functioning using an ad hoc testing tool. With this command-line tool, we can easily try any combination of parameters, and we can easily extend it if the need arises to add more REST operations.

Now, we need to start on the real goal of the chapter: changing the Notes user interface to support login/logout. We will see how to do this in the following sections.

# Providing login support for the Notes application

Now that we have proved that the user authentication service is working, we can set up the Notes application to support user logins. We'll be using Passport to support login/logout, and the authentication server to store the required data.

Among the available packages, Passport stands out for simplicity and flexibility. It integrates directly with the Express middleware chain, and the Passport community has developed hundreds of so-called strategy modules to handle authentication against a long list of third-party services.

Refer to [`www.passportjs.org/`](http://www.passportjs.org/) for information and documentation.

Let's start this by adding a module for accessing the user information REST server we just created.

## Accessing the user authentication REST API

The first step is to create a user data model for the Notes application. Rather than retrieving data from data files or a database, it will use REST to query the server we just created. Recall that we created this REST service with the idea of walling it off, because it contains sensitive user information.

Earlier, we suggested duplicating Chapter 7, *Data Storage and Retrieval*, code for Notes in the `chap08/notes` directory and creating the user information server as `chap08/users`.

Earlier in this chapter, we used the `restify-clients` module to access the REST service. That package is a companion to the Restify library; the `restify` package supports the server side of the REST protocol and `restify-clients` supports the client side. 

However nice the `restify-clients` library is, it doesn't support a Promise-oriented API, as is required to play well with `async` functions. Another library, SuperAgent, does support a Promise-oriented API and plays well in `async` functions, and there is a companion to that package, SuperTest, that's useful in unit testing. We'll use SuperTest in Chapter 13, *Unit Testing and Functional Testing* when we talk about unit testing.

For documentation, refer to [`www.npmjs.com/package/superagent`](https://www.npmjs.com/package/superagent) and [`visionmedia.github.io/superagent/`](http://visionmedia.github.io/superagent/).

To install the package (again, in the Notes application directory), run the following command:

Then, create a new file, `models/users-superagent.mjs`, containing the following code:


The `reqURL` function is similar in purpose to the `connectDB` functions that we wrote in earlier modules. Remember that we used `connectDB` in earlier modules to open a database connection that will be kept open for a long time. With SuperAgent, we don't leave a connection open to the service. Instead, we open a new server connection on each request. For every request, we will formulate the request URL. The base URL, such as `http://localhost:3333/`, is to be provided in the `USER_SERVICE_URL` environment variable. The `reqURL` function modifies that URL, using the new **Web Hypertext Application Technology Working Group** (**WHATWG**) URL support in Node.js, to use a given URL path.

We also added the authentication ID and code required for the server. Obviously, when the backlog task comes up to use a better token authentication system, this will have to change.

To handle creating and updating user records, run the following code:

These are our `create` and `update` functions. In each case, they take the supplied data, construct an anonymous object, and `POST` it to the server. The caller is to supply values corresponding to the SQUser schema. The function bundles the supplied data in the `send` method, sets various parameters, and then sets the basic authentication token.

The SuperAgent library uses an API style known as method chaining. The coder chains method calls together to build up a request. The chain of method calls can end in a `.then` or `.end` clause, either of which takes a callback function. But if you leave off both, the request returns a Promise and, of course, Promises let us use it directly from an async function.

The `res.body` value at the end of each function contains the value returned by the REST server. Throughout this module, we will use the `.auth` clause to set the required authentication key.

These anonymous objects are a little different from normal. We are using an **ECMAScript 2015** (**ES-2015**) feature here that we haven't discussed so far. Instead of specifying an object field with the `fieldName: fieldValue` notation, ES-2015 gives us the option to shorten this when the variable name used for `fieldValue` matches the desired `fieldName`. In other words, we can just list the variable name, and the field name will automatically match it.

In this case, we deliberately chose parameter variable names to match the parameter names used by the server for the object field names. By doing so, we can use this shorthand notation for anonymous objects, and our code is a little cleaner because we consistently use the same variable names.
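To make this concrete, here is a sketch of what the `create` function might look like with SuperAgent; the `reqURL` helper and the auth values follow the description in this section, the field list is abbreviated, and the token value is a placeholder:

```javascript
import { default as request } from 'superagent';

const authid = 'them';
const authcode = 'PLACEHOLDER-TOKEN'; // in practice, the key shared with the user service

// Build the request URL from the base URL in USER_SERVICE_URL
function reqURL(path) {
  const requrl = new URL(process.env.USER_SERVICE_URL);
  requrl.pathname = path;
  return requrl.toString();
}

export async function create(username, password, provider, familyName, givenName) {
  const res = await request
    .post(reqURL('/create-user'))
    .send({ username, password, provider, familyName, givenName })
    .set('Content-Type', 'application/json')
    .set('Accept', 'application/json')
    .auth(authid, authcode);
  return res.body;
}
```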

Now, add the following functions to support retrieving user records:


This is following the same pattern as before. The `set` methods are, of course, used for setting HTTP headers in the REST call. This means having at least a passing knowledge of the HTTP protocol.

The `Content-Type` header says the data sent to the server is in **JavaScript Object Notation** (**JSON**) format. The `Accept` header says that this REST client can handle JSON data. JSON is, of course, easiest for a JavaScript program—such as what we're writing—to utilize.

Let's now create the function for checking passwords, as follows:

One thing worth noting about this method is that it could have taken the parameters in the URL rather than in the request body, as is done here. However, since request URLs are routinely logged to files, putting the username and password parameters in the URL would mean user identity information gets logged to files and becomes part of activity reports. That would obviously be a very bad choice. Putting these parameters in the request body not only avoids that bad outcome, but if an HTTPS connection to the service is used, the transaction will be encrypted.

Then, let's create our find-or-create function, as follows:


The `/find-or-create` function either discovers the user in the database or creates a new user. The `profile` object will come from Passport, but take careful note of what we do with `profile.id`. The Passport documentation says it will provide the username in the `profile.id` field, but we want to store it as `username` instead.

Let's now create a function to retrieve the list of users, as follows:

As before, this is very straightforward.

With this module, we can interface with the user information service, and we can now move on to modifying the Notes user interface.

## Incorporating login and logout routing functions in the Notes application

So far, we have built a user data model and wrapped it with a REST API to create our authentication information service. Then, in the Notes application, we have a module that requests user data from this server. As yet, nothing in the Notes application knows that this user model exists. The next step is to create a routing module for the login/logout URLs and to change the rest of Notes to use user data.

The routing module is where we use `passport` to handle user authentication. The first task is to install the required modules, as follows:


The `passport` module gives us the authentication algorithms. To support different authentication mechanisms, the passport authors have developed several *strategy* implementations—the authentication mechanisms, or strategies, corresponding to the various third-party services that support authentication, such as using OAuth to authenticate against services such as Facebook, Twitter, or GitHub.

Passport also requires that we install Express Session support. Use the following command to install the modules:

Express session support, including all the various session store implementations, is documented on its GitHub project page at [`github.com/expressjs/session`](https://github.com/expressjs/session).

The strategy implemented in the `passport-local` package authenticates solely against data stored locally to the application, such as our user authentication information service. Later, we will add a strategy module that authenticates using OAuth with Twitter.

Let's start by creating the routing module, `routes/users.mjs`, as follows:


This brings in the modules we need for the `/users` router. This includes the two `passport` modules and the REST-based user authentication model. 

In `app.mjs`, we will be adding *session* support so our users can log in and log out. That relies on storing a cookie in the browser, and the cookie name is found in this variable exported from `app.mjs`. We'll be using that cookie in a moment.

Add the following functions to the end of `routes/users.mjs`:

The `initPassport` function will be called from `app.mjs` and installs the Passport middleware into the Express configuration. We'll discuss the implications of this later, when we get to the `app.mjs` changes, but Passport uses sessions to detect whether an HTTP request is authenticated. It looks at every request coming into the application, looks for clues about whether this browser is logged in, and attaches data to the request object as `req.user`.

The `ensureAuthenticated` function will be used by other routing modules and is to be inserted into any route definition that requires an authenticated, logged-in user. For example, editing or deleting a note requires the user to be logged in, so the corresponding routes in `routes/notes.mjs` must use `ensureAuthenticated`. If the user is not logged in, this function redirects them to `/users/login` so that they can log in.

Add the following route handlers in `routes/users.mjs`:


Because this router is mounted on `/users`, all these routes will have `/users` prepended. The `/users/login` route simply shows a form requesting a username and password. When this form is submitted, we land in the second route declaration, with a `POST` on `/users/login`. If `passport` deems this a successful login attempt using `LocalStrategy`, then the browser is redirected to the home page. Otherwise, it is redirected back to the `/users/login` page.

Add the following route for handling logout:

When the user requests to log out of Notes, they are to be sent to `/users/logout`. We'll add a button to the header template for this purpose. The `req.logout` function instructs Passport to erase their login credentials, and they are then redirected to the home page.

This function deviates from what is in the Passport documentation. There, we are told to simply call `req.logout`, but calling only that function sometimes results in the user not being logged out. It is necessary to destroy the session object and to clear the cookie in order to make sure the user is logged out. The cookie name is defined in `app.mjs`, and we imported `sessionCookieName` for this function.

Add the `LocalStrategy` to Passport, as follows:


Here is where we define our implementation of `LocalStrategy`. In the callback function, we call `usersModel.userPasswordCheck`, which makes a REST call to the user authentication service. Remember that this performs the password check and then returns an object indicating whether the user is logged in.

A successful login is indicated when `check.check` is `true`. In this case, we tell Passport to use an object containing `username` in the session object. Otherwise, we have two ways to tell Passport that the login attempt was unsuccessful. In one case, we use `done(null, false)` to indicate an error logging in, and pass along the error message we were given. In the other case, we'll have captured an exception, and pass along that exception.

You'll notice that Passport uses a callback-style API. Passport provides a `done` function, and we are to call that function when we know what's what. While we use an `async` function to make a clean asynchronous call to the backend service, Passport doesn't know how to grok the Promise that would be returned. Therefore, we have to throw a `try/catch` around the function body to catch any thrown exception.
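A sketch of what this strategy definition might look like, given the `userPasswordCheck` function in our REST-based user model (the shape of the user object stored in the session is an assumption):

```javascript
import passport from 'passport';
import passportLocal from 'passport-local';
const LocalStrategy = passportLocal.Strategy;
import * as usersModel from '../models/users-superagent.mjs';

passport.use(new LocalStrategy(async (username, password, done) => {
  try {
    // Ask the user information service to check the password
    const check = await usersModel.userPasswordCheck(username, password);
    if (check.check) {
      // Successful login: store a minimal user object in the session
      done(null, { id: check.username, username: check.username });
    } else {
      // Unsuccessful login: pass along the message we were given
      done(null, false, check.message);
    }
  } catch (err) {
    done(err); // an exception occurred while calling the service
  }
}));
```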

Add the following functions to manipulate data stored in the session cookie:

The preceding functions take care of encoding and decoding the authentication data for the session. All we need to attach to the session is the `username`, as we do in `serializeUser`. The `deserializeUser` function is called while processing an incoming HTTP request, and this is where we look up the user profile data. Passport will attach it to the request object.

### Login/logout changes to app.mjs

A few changes are required in `app.mjs`, some of which we have already touched on. We have carefully isolated the Passport module dependencies to `routes/users.mjs`, and the changes required in `app.mjs` support the code in `routes/users.mjs`.

Add an import to bring in functions from the user router module, as follows:


The User router supports the `/login` and `/logout` URLs, as well as using Passport for authentication. We need to call `initPassport` for a little bit of initialization.

And now, let's import modules for session handling, as follows:

Because Passport uses sessions, we need to enable session support in Express, and these modules do so. The `session-file-store` module saves our session data to disk so that we can kill and restart the application without losing sessions. It's also possible to save sessions to a database with an appropriate module. A filesystem session store is only suitable when all the Notes instances run on the same server computer. For a distributed deployment, you will need a session store that runs on a network-wide service, such as a database.

We define `sessionCookieName` here so that it can be used in multiple places. By default, `express-session` uses a cookie named `connect.sid` to store the session data. As a small security measure, it's useful to use a different, non-default name when there is a published default. Any time we use a default value, an attacker might know of a security flaw that depends on that default.

Add the following code to `app.mjs`:


Here, we initialize the session support. The field named `secret` is used to sign the session ID cookie. The session cookie is an encoded string that is encrypted in part using this secret. In the Express Session documentation, they suggest the `keyboard cat` string for the secret. But, in theory, what if Express has a vulnerability, such that knowing this secret can make it easier to break the session logic on your site? Hence, we chose a different string for the secret, just to be a little different and—perhaps—a little more secure.

Similarly, the default cookie name used by `express-session` is `connect.sid`. Here's where we change the cookie name to a non-default name.

`FileStore` will store its session data records in a directory named `sessions`. This directory will be auto-created as needed.
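A sketch of this initialization, in the context of `app.mjs` where the `app` object and the `initPassport` import already exist; the secret and cookie name are illustrative, and you should pick your own:

```javascript
import session from 'express-session';
import sessionFileStore from 'session-file-store';
const FileStore = sessionFileStore(session);

export const sessionCookieName = 'notescookie.sid';
const sessionSecret = 'keyboard mouse';                   // illustrative secret
const sessionStore = new FileStore({ path: 'sessions' }); // stores sessions on disk

app.use(session({
  store: sessionStore,
  secret: sessionSecret,
  resave: true,
  saveUninitialized: true,
  name: sessionCookieName
}));
initPassport(app);
```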

In case you see errors on Windows related to the files used by `session-file-store`, there are several alternative session store packages that can be used. The attraction of `session-file-store` is that it has no dependency on a service such as a database server. Two other session stores have a similar advantage: `LokiStore` and `MemoryStore`. Both are configured similarly to the `session-file-store` package. For example, to use `MemoryStore`, first use npm to install the `memorystore` package, then use these lines of code in `app.mjs`:

This is the same initialization, but using `MemoryStore` instead of `FileStore`.

To learn more about session store implementations, see [`expressjs.com/en/resources/middleware/session.html#compatible-session-stores`](https://expressjs.com/en/resources/middleware/session.html#compatible-session-stores).

Mount the user router, as follows:


These are the three routers that are used in the Notes application. 

### Login/logout changes in routes/index.mjs

This router module handles the home page. It does not require the user to be logged in, but we want to change the display a little if they are logged in. To do so, run the following code:

Remember that we ensured `req.user` holds the user profile data, which was done in `deserializeUser`. We simply check for this and make sure the data is added when rendering the view template.

We will make similar changes to most of the other route definitions. Afterward, we will cover the changes to the view templates, in which we use `req.user` to show the correct buttons on each page.

### Login/logout changes required in routes/notes.mjs

The changes required here are more significant, but still straightforward, as illustrated in the following code snippet:


We need to use the `ensureAuthenticated` function to protect certain routes from being used by users who are not logged in. Notice how ES6 modules let us import just the function(s) we require. Since that function is in the User router module, we need to import it from there.

Modify the `/add` route handler, as shown in the following code block:

We will make similar changes throughout this module, adding calls to `ensureAuthenticated` and using `req.user` to check whether the user is logged in. The goal is for several routes to ensure they are available only to logged-in users, and for those and other routes to pass the `user` object to the templates.

The first thing we added is the call to `usersRouter.ensureAuthenticated` in the route definition. If the user is not logged in, they will be redirected to `/users/login`, thanks to that function.

Because we have ensured the user is authenticated, we know that `req.user` already has their profile information. We can then simply pass it along to the view template.

For the other routes, we need to make similar changes.

Modify the `/save` route handler, as follows:


The `/save` route only requires this change to call `ensureAuthenticated` in order to ensure that the user is logged in.

Modify the `/view` route handler, as follows:

For this route, we don't require the user to be logged in. We do need the user's profile information, if any, sent to the view template.

Modify the `/edit` and `/destroy` route handlers, as follows:


Remember that throughout this module, we have made the following two changes to router functions:

1.  We protected some routes using `ensureAuthenticated` to ensure that the route is available only to logged-in users.
2.  We passed the `user` object to the template.

For the routes using `ensureAuthenticated`, it is guaranteed that `req.user` will contain the `user` object.  In other cases, such as with the `/view` router function, `req.user` may or may not have a value, and in case it does not, we make sure to pass `undefined`. In all such cases, the templates need to change in order to use the `user` object to detect whether the user is logged in, and whether to show HTML appropriate for a logged-in user.

### Viewing template changes supporting login/logout

So far, we've created a backend user authentication service, a REST module to access that service, a router module to handle routes related to logging in and out of the website, and changes in `app.mjs` to use those modules. We're almost ready, but we've got a number of changes left that need to be made to the templates. We're passing the `req.user` object to every template because each one must be changed to accommodate whether the user is logged in. 

This means that we can test whether the user is logged in simply by testing for the presence of a `user` variable.

In `partials/header.hbs`, make the following additions:

What we're doing here is controlling which buttons to display at the top of the screen, depending on whether the user is logged in. The earlier changes ensure that the `user` variable will be `undefined` if the user is logged out; otherwise, it will hold the user profile object. Therefore, it is sufficient to check the `user` variable, as shown in the preceding code block, to render different user interface elements.

A user who is not logged in doesn't see the ADD Note button and does see a Log in button. Otherwise, the user sees an ADD Note button and a Log Out button. The Log in button takes the user to `/users/login`, while the Log Out button takes them to `/users/logout`. Both are handled in `routes/users.mjs` and perform the expected functions.

The Log Out button has a Bootstrap badge component displaying the username. This provides a small visual indicator of the logged-in username and, as we will see later, serves as a visual cue as to the user's identity.

Because the `nav` now supports the login/logout buttons, we have changed the `navbar-toggler` button so that it controls the `<div>` with `id="navbarLogIn"`.

We need to create `views/login.hbs`, as follows:


This is a simple form decorated with Bootstrap goodness to ask for the username and password. When submitted, it creates a `POST` request to `/users/login`, which invokes the desired handler to verify the login request. The handler for that URL will start the Passport process to decide whether the user is authenticated.

In `views/notedestroy.hbs`, we want to display a message if the user is not logged in. Normally, the form to cause the note to be deleted is displayed, but if the user is not logged in, we want to explain the situation, as illustrated in the following code block:

This is simple: if the user is logged in, display the form; otherwise, display the message in `partials/not-logged-in.hbs`. We determine which of the two to display based on the `user` variable.

We can insert the code shown in the following code block in `partials/not-logged-in.hbs`:


As the text says, this will probably never be shown to users. However, it is useful to put something such as this in place since it may show up during development, depending on the bugs you create.

In `views/noteedit.hbs`, we require a similar change, as follows:

That is, at the bottom, we add a paragraph that, for non-logged-in users, pulls in the `not-logged-in` partial.

The Bootstrap `jumbotron` component makes a nice, large text display that is very eye-catching. However, the user should never see this, since each of these templates is used only when we have pre-verified that the user is logged in.

A message like this is useful for catching bugs in our code. Suppose we slipped up and failed to ensure that these forms are displayed only to logged-in users. Suppose we had other bugs that didn't check that a form submission was requested only by a logged-in user. Fixing the templates this way is another layer of prevention against displaying forms to users who are not allowed to use that functionality.

We have now made all the user interface changes and are ready to test the login/logout functionality.

## Running the Notes application with user authentication

We have created the user information REST service, created a module to access that service from Notes, modified the router modules to correctly access the user information service, and changed whatever else was required to support login/logout.

The last task necessary is to modify the scripts section of `package.json`, as follows:


In the previous chapters, we built up quite a few combinations of models and databases for running the Notes application. Since we don't need those, we can strip most of them out from `package.json`. This leaves us with one, configured to use the Sequelize model for Notes, using the SQLite3 database, and to use the new user authentication service that we wrote earlier. All the other Notes data models are still available, just by setting the environment variables appropriately.

`USER_SERVICE_URL` needs to match the port number that we designated for that service.

In one window, start the user authentication service, as follows:

Then, in another window, start the Notes application, as follows:


You'll be greeted with the following message:

![](https://gitee.com/OpenDocCN/freelearn-node-zh/raw/master/docs/node-webdev-5e/img/ceb549b2-cf18-4dd8-9830-b8ef51822b0e.png)

Notice the new button, Log in, and the lack of an ADD Note button. We're not logged in, and so `partials/header.hbs` is rigged to show only the Log in button.

Click on the Log in button, and you will see the login screen, as shown in the following screenshot:

![](https://gitee.com/OpenDocCN/freelearn-node-zh/raw/master/docs/node-webdev-5e/img/15c349d7-2754-4ef5-8749-7842178385cc.png)

This is our login form from `views/login.hbs`. You can now log in, create a note or three, and you might end up with the following messages on the home page:

![](https://gitee.com/OpenDocCN/freelearn-node-zh/raw/master/docs/node-webdev-5e/img/8dd14aa5-9173-4621-a7fa-9aa53572f658.png)

You now have both Log Out and ADD Note buttons. You'll notice that the Log Out button has the username (me) shown. After some thought and consideration, this seemed the most compact way to show whether the user is logged in, and which user is logged in. This might drive the user experience team nuts, and you won't know whether this user interface design works until it's tested with users, but it's good enough for our purpose at the moment.

In this section, we've learned how to set up a basic login/logout functionality using locally stored user information. This is fairly good, but many web applications find it useful to allow folks to log in using their Twitter or other social media accounts for authentication. In the next section, we'll learn about that by setting up Twitter authentication.

# Providing Twitter login support for the Notes application

If you want your application to hit the big time, it's a great idea to ease the registration process by using third-party authentication. Websites all over the internet allow you to log in using accounts from other services such as Facebook or Twitter. Doing so removes hurdles to prospective users signing up for your service. Passport makes it extremely easy to do this.

Authenticating users with Twitter requires installation of `TwitterStrategy` from the `passport-twitter` package, registering a new application with Twitter, adding a couple of routes to `routes/user.mjs`, and making a small change in `partials/header.hbs`. Integrating other third-party services requires similar steps.

## Registering an application with Twitter

Twitter, as with every other third-party service, uses OAuth to handle authentication. OAuth is a standard protocol through which an application or a person can authenticate with one website by using credentials they have on another website. We use this all the time on the internet. For example, we might use an online graphics application such as [draw.io](http://draw.io) or Canva by logging in with a Google account, and then the service can save files to our Google Drive. 

Any application author must register with any sites you seek to use for authentication. Since we wish to allow Twitter users to log in to Notes using Twitter credentials, we have to register our Notes application with Twitter. Twitter then gives us a pair of authentication keys that will validate the Notes application with Twitter. Any application, whether it is a popular site such as Canva, or a new site such as Joe's Ascendant Horoscopes, must be registered with any desired OAuth authentication providers. The application author must then be diligent about keeping the registration active and properly storing the authentication keys.

The authentication keys are like a username/password pair. Anyone who gets a hold of those keys could use the service as if they were you, and potentially wreak havoc on your reputation or business.

Our task in this section is to register a new application with Twitter, fulfilling whatever requirements Twitter has.

To register a new application with Twitter, go to [`developer.twitter.com/en/apps`](https://developer.twitter.com/en/apps). 

As you go through this process, you may be shown the following message:

![](https://gitee.com/OpenDocCN/freelearn-node-zh/raw/master/docs/node-webdev-5e/img/a2a51eab-1678-4207-b7fb-847230ed20fe.png)

Recall that in recent years, concerns began to arise regarding the misuse of third-party authentication, the potential to steal user information, and the negative results that have occurred thanks to user data being stolen from social networks. As a result, social networks have increased scrutiny over developers using their APIs. It is necessary to sign up for a Twitter developer account, which is an easy process that does not cost anything.

As we go through this, realize that the Notes application needs a minimal amount of data. The ethical approach to this is to request only the level of access required for your application, and nothing more.

Once you're registered, you can log in to `developer.twitter.com/apps` and see a dashboard listing the active applications you've registered. At this point, you probably do not have any registered applications. At the top is a button marked *Create an App*. Click on that button to start the process of submitting a request to register a new application.

Every service offering OAuth authentication has an administrative backend similar to `developer.twitter.com/apps`. The purpose is so that certified application developers can administer the registered applications and authorization tokens. Each such service has its own policies for validating that those requesting authorization tokens have a legitimate purpose and will not abuse the service. The authorization token is one of the mechanisms to verify that API requests come from approved applications. Another mechanism is the URL from which API requests are made. 

In the normal case, an application will be deployed to a regular server, and is accessed through a domain name such as `MyNotes.xyz`. In our case, we are developing a test application on our laptop, and do not have a public IP address, nor is there a domain name associated with our laptop. Not all social networks allow interactions from an application on an untrusted computer—such as a developer's laptop—to make API requests; however, Twitter does.

At the time of writing, there are several pieces of information requested by the Twitter sign-up process, listed as follows:

*   **Name**: This is the application name, and it can be anything you like. It is good form to use "`Test`" in the name, in case Twitter's staff decide to do some checking.
*   **Description**: Descriptive phrase—and again, it can be anything you like. The description is shown to users during the login process. It's good form to describe this as a test application.

*   **Website**: This would be your desired domain name. Here, the help text helpfully suggests *If you don't have a URL yet, just put a placeholder here but remember to change it later*.
*   **Allow this application to be used to sign in with Twitter**: Check this, as it is what we want.
*   **Callback URL**: This is the URL to return to following successful authentication. Since we don't have a public URL to supply, this is where we specify a value referring to your laptop. It's been found that `http://localhost:3000` works just fine. macOS users have another option because of the `.local` domain name that is automatically assigned to their laptop. 
*   **Tell us how this app will be used**: This statement will be used by Twitter to evaluate your request. For the purpose of this project, explain that it is a sample app from a book. It is best to be clear and honest about your intention.

The sign-up process is painless. However, at several points, Twitter reiterated the sensitivity of the information provided through the Twitter API. The last step before granting approval warned that Twitter prohibits the use of its API for various unethical purposes.

The last thing to notice is the extremely sensitive nature of the authentication keys. It's bad form to check these into a source code repository or otherwise put them in a place where anybody can access the key. We'll tackle this issue in Chapter 14, *Security in Node.js Applications*.

The Twitter developers' site has documentation describing best practices for storing authentication tokens. Visit [`developer.twitter.com/en/docs/basics/authentication/guides/authentication-best-practices`](https://developer.twitter.com/en/docs/basics/authentication/guides/authentication-best-practices).

### Storing authentication tokens

The Twitter recommendation is to store configuration values in a `.env` file. The contents of this file are to somehow become environment variables, which we can then access using `process.env`, as we've done before. Fortunately, there is a third-party Node.js package to do just this, called `dotenv`.

Learn about the `dotenv` package at [`www.npmjs.com/package/dotenv.`](https://www.npmjs.com/package/dotenv)

First, install the package, as follows:

The documentation says we should load the `dotenv` package and then call `dotenv.config()` very early in the startup phase of the application, and that we must do so before accessing any environment variables. However, on a closer reading of the documentation, it seems best to add the following code to `app.mjs`:


With this approach, we do not have to explicitly call the `dotenv.config` function. The primary advantage is avoiding issues with referencing environment variables from multiple modules.
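The dotenv documentation describes a preload-style import for exactly this purpose; a sketch of that one line:

```javascript
// Importing 'dotenv/config' runs dotenv.config() as a side effect of the import,
// so the values from .env are in process.env before other modules read them
import 'dotenv/config';
```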

The next step is to create a file, `.env`, in the `notes` directory. The syntax of this file is very simple, as shown in the following code block:

This is just the syntax we would expect, since it is the same syntax used in shell scripts. In this file, we need to define two variables, `TWITTER_CONSUMER_KEY` and `TWITTER_CONSUMER_SECRET`. We will use these variables in the code we write in the next section. Since we have been putting configuration values in the `scripts` section of `package.json`, those environment variables could instead be added to `.env`.

The next step is to avoid committing this file to a source control system such as Git. To make sure this doesn't happen, you should already have a `.gitignore` file in the `notes` directory; make sure its contents are similar to the following:


These values mostly refer to database files we generated in the previous chapter. In the end, we've added the `.env` file, and because of this, Git will not commit this file to the repository.

This means that when deploying the application to a server, you'll have to arrange to add this file to the deployment without it being committed to a source repository. 

With an approved Twitter application, and with our authentication tokens recorded in a configuration file, we can move on to adding the required code to Notes.

## Implementing TwitterStrategy

As with many web applications, we have decided to allow our users to log in using Twitter credentials. The OAuth protocol is widely used for this purpose and is the basis for authentication on one website using credentials maintained by another website.

The application registration process you just followed at `developer.twitter.com` generated for you a pair of API keys: a consumer key, and a consumer secret. These keys are part of the OAuth protocol and will be supplied by any OAuth service you register with, and the keys should be treated with the utmost care. Think of them as the username and password your service uses to access the OAuth-based service (Twitter et al.). The more people who can see these keys, the more likely it becomes that a miscreant can see them and then cause trouble. Anybody with those secrets can access the service API as if they are you.

Let's install the package required to use `TwitterStrategy`, as follows:

routes/users.mjs中,让我们开始做一些更改,如下所示:


This imports the package, and then makes its `Strategy` variable available as `TwitterStrategy`.

Let's now install the `TwitterStrategy`, as follows:

This registers a `TwitterStrategy` instance with `passport`, arranging to call the user authentication service as users register themselves with the Notes application. This `callback` function is called when users successfully authenticate using Twitter.

If the environment variables containing the Twitter tokens are not set, this code does not execute. Clearly, it would be an error to set up Twitter authentication without the keys, so we avoid the error by not executing the code.

To help other code know whether Twitter support is enabled, we export a flag variable, `twitterLogin`.

We defined the `usersModel.findOrCreate` function specifically to handle user registration from third-party services such as Twitter. Its task is to look for the user described in the profile object and, if that user does not exist, to create that user account in Notes.

The `consumerKey` and `consumerSecret` values are supplied by Twitter after you register your application. These keys are used in the OAuth protocol as credentials proving your identity to Twitter.

The `callbackURL` setting in the `TwitterStrategy` configuration is a holdover from Twitter's OAuth1-based API implementation. In OAuth1, the callback URL was passed as part of the OAuth request. Since `TwitterStrategy` uses Twitter's OAuth1 service, we must supply the URL here. We will see in a moment how this URL is implemented in Notes.

The `callbackURL`, `consumerKey`, and `consumerSecret` settings are all injected using environment variables. Earlier, we discussed that it is best practice not to commit the `consumerKey` and `consumerSecret` values to a source repository, and therefore we set up the `dotenv` package and a `.env` file to hold those configuration values. In Chapter 10, *Deploying Node.js Applications to Linux Servers*, we will see that these keys can be declared as environment variables in a Dockerfile.
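A sketch of this registration, assuming the imports of `passport` and `passport-twitter` at the top of `routes/users.mjs` and the `usersModel` module imported earlier; the `TWITTER_CALLBACK_HOST` variable and the profile field mapping are illustrative assumptions:

```javascript
import passport from 'passport';
import passportTwitter from 'passport-twitter';
const TwitterStrategy = passportTwitter.Strategy;

const twittercallback = process.env.TWITTER_CALLBACK_HOST
    ? process.env.TWITTER_CALLBACK_HOST : 'http://localhost:3000';
export let twitterLogin = false;

if (typeof process.env.TWITTER_CONSUMER_KEY !== 'undefined'
 && typeof process.env.TWITTER_CONSUMER_SECRET !== 'undefined') {
  passport.use(new TwitterStrategy({
    consumerKey: process.env.TWITTER_CONSUMER_KEY,
    consumerSecret: process.env.TWITTER_CONSUMER_SECRET,
    callbackURL: `${twittercallback}/users/auth/twitter/callback`
  }, async (token, tokenSecret, profile, done) => {
    try {
      // Register the user with the user information service, if necessary
      const user = await usersModel.findOrCreate({
        id: profile.username, username: profile.username, password: '',
        provider: profile.provider, familyName: profile.displayName,
        givenName: '', middleName: '',
        emails: profile.emails, photos: profile.photos
      });
      done(null, user);
    } catch (err) { done(err); }
  }));
  twitterLogin = true;
}
```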

Add the following route declarations:


To start the user logging in with Twitter, we'll send them to this URL. Remember that this URL is really `/users/auth/twitter` and, in the templates, we'll have to use that URL. When this is called, the passport middleware starts the user authentication and registration process using `TwitterStrategy`.

Once the user's browser visits this URL, the OAuth dance begins. It's called a dance because the OAuth protocol involves carefully designed redirects between several websites. Passport sends the browser over to the correct URL at Twitter, where Twitter asks the user whether they agree to authenticate using Twitter, and then Twitter redirects the user back to your callback URL. Along the way, specific tokens are passed back and forth in a very carefully designed dance between websites.

Once the OAuth dance concludes, the browser lands at the URL designated in the following router declaration:

This route handles the callback URL, and it corresponds to the `callbackURL` setting configured earlier. Depending on whether it indicates a successful registration, Passport redirects the browser either to the home page or back to the `/users/login` page.

Because the router is mounted on `/users`, this URL is actually `/users/auth/twitter/callback`. Hence, the full URL is used when configuring `TwitterStrategy`, and the URL supplied to Twitter is `http://localhost:3000/users/auth/twitter/callback`.

In the process of handling the callback URL, Passport will invoke the callback function shown earlier. Because our callback uses the `usersModel.findOrCreate` function, the user will be automatically registered if necessary.

We're almost ready, but we need to make a couple of small changes elsewhere in Notes.

In `partials/header.hbs`, make the following changes to the code:


This adds a new button that, when clicked, takes the user to `/users/auth/twitter`, which—of course—kicks off the Twitter authentication process. The button is enabled only if Twitter support is enabled, as determined by the `twitterLogin` variable. This means that the router functions must be modified to pass in this variable.

This button includes a little image we downloaded from the official Twitter brand assets page at [`about.twitter.com/company/brand-assets`](https://about.twitter.com/company/brand-assets). Twitter recommends using these branding assets for a consistent look across all services using Twitter. Download the whole set, and then pick the one you like.

For the URL shown here, the corresponding project directory is named `public/assets/vendor/twitter`. Notice that we force the size to be small enough for the navigation bar.

In `routes/index.mjs`, make the following change:

This imports the variable, and then, in the data passed to `res.render`, we add this variable. That ensures the value is carried through to `partials/header.hbs`.

In `routes/notes.mjs`, we need to make a similar change in several route functions:


This is the same change, importing the variable and passing it to `res.render`.

With these changes, we're ready to try logging in with Twitter.

Start the user information server as shown previously, and then start the Notes application server, as shown in the following code block:

Then, use your browser to visit `http://localhost:3000`, as follows:

Notice the new button. It looks about right, thanks to using the official Twitter branding image. The button is a bit large, so maybe you'll want to consult a designer. Obviously, a different design is required if you're going to support dozens of authentication services.

Running it with the Twitter token environment variables left out, the Twitter login button should not appear.

Clicking this button takes the browser to `/users/auth/twitter`, which starts Passport running the OAuth protocol transactions for authentication. However, you may instead receive an error message saying the callback URL is not approved for this client application. The approved callback URLs can be adjusted in your application settings. If this happens, the application configuration needs to be adjusted on `developer.twitter.com`. The error message says explicitly that Twitter saw an unapproved URL.

On the application page, on the App Details tab, click the Edit button. Then, scroll down to the Callback URLs section and add the following entry:

As it explains, this box lists the URLs that are allowed to be used for Twitter OAuth authentication. At the moment, we are hosting the application on our laptop using port `3000`. If you are accessing it from some other base URL, such as `http://MacBook-Pro-4.local`, then that base URL should be used in addition.

Once the callback URL is configured correctly, clicking the Login with Twitter button takes you to the normal Twitter OAuth authentication page. Simply click Approve, and you will be redirected back to the Notes application.

Then, once you are logged in with Twitter, you will see something like the following screenshot:

We are now logged in, and we'll notice that our Notes username is the same as our Twitter username. You can browse around the application and create, edit, or delete notes. In fact, you can do this to any note you like, even ones created by someone else. That's because we have not created any kind of access control or permissions system, so every user has full access to every note. That's a feature to put on the backlog.

By using multiple browsers or computers, you can simultaneously log in as different users, one user per browser.

You can run multiple instances of the Notes application by doing what we did earlier, as follows:


Then, in one command window, run the following command:

In another command window, run the following command:


As previously, this starts two instances of the Notes server, each with a different value in the `PORT` environment variable. In this case, each instance will use the same user authentication service. As shown here, you'll be able to visit the two instances at `http://localhost:3000` and `http://localhost:3002`. As before, you'll be able to start and stop the servers as you wish, see the same notes in each, and see that the notes are retained after restarting the server.

Another thing to try is to fiddle with the **session store**. Our session data is being stored in the `sessions` directory. These are just files in the filesystem, and we can take a look with normal tools such as `ls`, as shown in the following code block:

This is after logging in with a Twitter account. You can see the Twitter account name stored in the session data.

What if you want to clear a session? It is just a file in the filesystem. Deleting the session file erases the session, and the user's browser will be forcibly logged out.

Sessions time out if the user leaves them idle for long enough. One of the `session-file-store` options, `ttl`, controls the timeout period, which defaults to 3,600 seconds (an hour). With the session timed out, the application reverts to the logged-out state.

In this section, we went through the full process of setting up support for login using Twitter's authentication service. We created a Twitter developer account and created an application on Twitter's backend. Then, we implemented the required workflow for integrating with Twitter's OAuth support. To support this, we integrated the service that stores user authorization information.

Our next task is extremely important: keeping user passwords encrypted.

# Keeping secrets and passwords secure

We have cautioned several times about the importance of safely handling user identification information. The intention to handle that data safely is one thing, but it is important to actually do so. While we are using a few good practices so far, as it stands, the Notes application would not withstand any kind of security audit, for the following reasons:

*   User passwords are stored as plain text in the database.
*   The authentication tokens for Twitter and so on are stored in plain text.
*   The authentication service API key is not a cryptographically secure anything; it is simply a plain text **universally unique identifier** (**UUID**).

If you don't recognize the phrase *plain text*, it simply means not encrypted. Anyone could read the text of user passwords or authentication tokens. It is best to keep both encrypted to avoid any information leakage.

Keep this issue in mind, because we will revisit these and other security issues in Chapter 14, *Security in Node.js Applications*.

Before we leave this chapter, let's address the first of these issues: storing passwords as plain text. We said earlier that user information security is extremely important, so we should attend to it from the very beginning.

The `bcrypt` Node.js package makes it easy to store passwords securely. With it, we can encrypt passwords right away and never store an unencrypted password at all.

For `bcrypt` documentation, see [`www.npmjs.com/package/bcrypt`](https://www.npmjs.com/package/bcrypt).

Install `bcrypt` in both the `notes` and `users` directories by executing the following command:


The `bcrypt` documentation says that the correct version of this package must be used precisely for the Node.js version in use. Therefore, you should adjust the version number appropriately to the Node.js version you are using.

The strategy of storing an encrypted password dates back to the earliest days of Unix. The creators of the Unix operating system devised a means for storing an encrypted value in `/etc/passwd`, which was thought sufficiently safe that the password file could be left readable to the entire world.

Let's start with the user information service.

## Adding password encryption to the user information service

Because of our command-line tool, we can easily test end-to-end password encryption. After verifying that it works, we can implement encryption in the Notes application.

In `cli.mjs`, add the following code near the top:

This imports the `bcrypt` package, and then we configure a constant that governs the CPU time required to decrypt a password. The `bcrypt` documentation points to a blog post discussing why the `bcrypt` algorithm is excellent for storing encrypted passwords. The argument boils down to the CPU time required for decryption: a brute-force attack against the password database is harder, and therefore less likely to succeed, when the passwords are encrypted with strong encryption, because testing every password combination takes more CPU time.

The value we assign to `saltRounds` determines that CPU time requirement. The documentation explains this further.

Next, add the following function:


This takes a plain text password and runs it through the encryption algorithm. What's returned is the hash for the password.
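A sketch of this function, using the two-step `genSalt`/`hash` API from the `bcrypt` package:

```javascript
import { default as bcrypt } from 'bcrypt';
const saltRounds = 10;

// Encrypt a plain-text password, returning the hash that gets stored instead
async function hashpass(password) {
  const salt = await bcrypt.genSalt(saltRounds);
  const hashed = await bcrypt.hash(password, salt);
  return hashed;
}
```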

Next, in the commands for `add`, `find-or-create`*,* and `update`, we make this same change, as follows:

That is, in each place, we make the callback function an async function so that we can use `await`, and then we call the `hashpass` function to encrypt the password.

This way, we encrypt the password right away, and the user information server stores only the encrypted password.

Consequently, in `user-server.mjs`, the `password-check` handler must be rewritten to accommodate checking an encrypted password.

At the top of `user-server.mjs`, add the following import:


Of course, we need to bring in the module here to use its decryption function. This module will no longer store a plain text password, but instead, it will now store encrypted passwords. Therefore, it does not need to generate encrypted passwords, but the `bcrypt` package also has a function to compare a plain text password against the encrypted one in the database, which we will use.

Next, scroll down to the `password-check` handler and modify it, like so:

The `bcrypt.compare` function compares the plain-text password, which arrives as `req.params.password`, against the encrypted password we have stored. To handle the encryption, we needed to restructure the checks, but we are testing the same three conditions. More importantly, the same objects are returned for those conditions.

To test it, start the user information server as before, as follows:


In another window, we can create a new user, as follows:

We have done both of these steps before. The difference is in what we do next.

Let's examine the database to see what is stored, as follows:


Indeed, the password field no longer has a plain text password, but what is—surely—encrypted text.

Next, we should check that the `password-check` command behaves as expected: 

We ran the same test earlier, but this time it runs against an encrypted password.

We have verified that the REST call to check passwords works. Our next step is to implement the same changes in the Notes application.

## Implementing encrypted password support in the Notes application

Since we have already proved how to implement encrypted password checking, all we need to do is duplicate some code in the Notes server.

In `users-superagent.mjs`, add the following code to the top:


As before, this imports the `bcrypt` package and configures the complexity that will be used, and we have the same encryption function because we will use it from multiple places.

Next, we must change the functions that interface with the backend server, as follows:

In the appropriate places, we must encrypt the password. No other changes are required.

Because the `password-check` backend performs the same checks and returns the same objects, the frontend code does not need to change.

To test, start both the user information server and the Notes server. Then, use the application to check logging in and logging out with both a Twitter-based user and a local user.

We have learned how to use encryption to safely store user passwords. If someone steals our user database, cracking the passwords will take that much longer thanks to the choices made here.

We are almost done with this chapter. The remaining task is simply to review the application architecture we have created.

# Running the Notes application stack

Did you notice earlier when we said to run the Notes application stack? Now it's time to explain to the marketing team what that phrase means. They may want an architecture diagram to put on marketing brochures or websites. It's also useful for developers like us to step back and draw a picture of what we've created, or what we plan to create.

Here is the sort of diagram an engineer might draw to show the marketing team the system design (the marketing team will, of course, hire a graphics artist to clean it up):

The box labeled Notes Application in the diagram is the public-facing code implemented by the templates and router modules. As currently configured, it is visible on our laptop at port `3000`. It can use one of several data storage services. It communicates with the User Authentication Service backend over port `5858` (or port `3333`, as shown in the preceding diagram).

In Chapter 10, *Deploying Node.js Applications to Linux Servers*, we will expand on this picture as we learn how to deploy to a real server.

# Summary

You covered a lot of ground in this chapter, looking not only at user authentication in Express applications but also at microservice development.

Specifically, you covered session management in Express, user authentication with Passport (including Twitter/OAuth), restricting access with router middleware, creating a REST service with Restify, and when to create a microservice. We even used an encryption algorithm to ensure that we only ever store encrypted passwords.

Knowing how to handle login/logout, and especially OAuth login from third-party services, is an essential skill for web application developers. Now that you've learned this, you'll be able to do the same for your own applications.

In the next chapter, we'll take the Notes application to a new level with semi-real-time communication. To do this, we'll write some browser-side JavaScript and explore how the Socket.IO package lets us send messages between users.

# Dynamic Client/Server Interaction with Socket.IO

The original design model of the web is similar to the way mainframes worked in the 1970s. Both old-style dumb terminals, such as the IBM 3270, and web browsers follow a request-response paradigm: the user sends a request and the faraway computer sends a response. That request-response paradigm is evident in the Node.js HTTP Server API, as shown in the following code:


The paradigm couldn't be more explicit than this. The `request` and the `response` are right there.

It wasn't until JavaScript improved that we had a quite different paradigm. The new paradigm is interactive communication driven by browser-side JavaScript. This change in the web application model is called, by some, the real-time web. In some cases, websites keep an open connection to the web browser, send notifications, or update the page as it changes.

For some deep background on this, read about the Comet application architecture introduced by Alex Russell in his blog in 2006 ([`infrequently.org/2006/03/comet-low-latency-data-for-the-browser/`](http://infrequently.org/2006/03/comet-low-latency-data-for-the-browser/)). That blog post called for a platform very similar to Node.js, years before Node.js existed.

In this chapter, we'll explore interactive dynamically updated content, as well as inter-user messaging, in the Notes application. To do this, we'll lean on the Socket.IO library ([`socket.io/`](http://socket.io/)). This library simplifies two-way communication between the browser and server and can support a variety of protocols with fallback to old-school web browsers. It keeps a connection open continuously between browser and server, and it follows the `EventEmitter` model, allowing us to send events back and forth.

We'll be covering the following topics:

*   An introduction to the Socket.IO library
*   Integrating Socket.IO with an Express application, and with Passport
*   Real-time communications in modern web browsers
*   Using Socket.IO events:
    *   To update application content as it changes
    *   To send messages between users
*   User experience for real-time communication
*   Using Modal windows to support a user experience that eliminates page reloads

These sorts of techniques are widely used in many kinds of websites. This includes online chat with support personnel, dynamically updated pricing on auction sites, and dynamically updated social network sites.

To get started, let's talk about what Socket.IO is and what it does.

# Introducing Socket.IO

The aim of Socket.IO is to make real-time apps possible in every browser and mobile device. It supports several transport protocols, choosing the best one for the specific browser.

Look up the technical definition for the phrase *real-time* and you'll see the real-time web is not truly real-time. The actual meaning of *real-time* involves software with strict time boundaries that must respond to events within a specified time constraint. It is typically used in embedded systems to respond to button presses, for applications as diverse as junk food dispensers and medical devices in intensive care units. Eat too much junk food and you could end up in intensive care, and you'll be served by real-time software in both cases. Try and remember the distinction between different meanings for this phrase.

The proponents of the so-called real-time web should be calling it the pseudo-real-time-web, but that's not as catchy a phrase.

What does it mean that Socket.IO uses the best protocol for the specific browser? If you were to implement your application with WebSockets, it would be limited to the modern browsers supporting that protocol. Because Socket.IO falls back on so many alternative protocols (WebSockets, Flash, XHR, and JSONP), it supports a wider range of web browsers.

As the application author, you don't have to worry about the specific protocol Socket.IO uses with a given browser. Instead, you can implement the business logic and the library takes care of the details for you.

The Socket.IO package includes both a server-side package and a client library. After an easy configuration, the two will communicate back and forth over a socket. The API between the server side and client side is very similar. Because a Socket.IO application runs code in both browser and server, in this chapter we will be writing code for both.

The model that Socket.IO provides is similar to the `EventEmitter` object. The programmer uses the `.on` method to listen for events and the `.emit` method to send them. But with Socket.IO, an event is sent not just using its event name, but is targeted to a combination of two spaces maintained by Socket.IO – the *namespace* and the *room*. Further, the events are sent between the browser and the server rather than being limited to the Node.js process.

Information about Socket.IO is available at [`socket.io/`](https://socket.io/).

On the server side, we wrap the HTTP Server object using the Socket.IO library, giving us the Socket.IO Server object. The Server object lets us create two kinds of communication spaces, *namespaces,* and *rooms*. With it we can send messages, using the `emit` method, either globally or into one of those spaces. We can also listen for messages, using the `on` method, either globally or from a namespace or room.

On the client side, we load the library from the Socket.IO server. Then, client code running in the browser opens one or more communication channels to the server, and the client can connect to namespaces or rooms.

This high-level overview should help to understand the following work. Our next step is to integrate Socket.IO into the initialization of the Notes application.

# Initializing Socket.IO with Express

Socket.IO works by wrapping itself around an HTTP Server object. Think back to Chapter 4, *HTTP Servers and Clients*, where we wrote a module that hooked into HTTP Server methods so that we could spy on HTTP transactions. The HTTP Sniffer attaches a listener to every HTTP event to print out the events. But what if you used that idea to do real work? Socket.IO uses a similar concept, listening to HTTP requests and responding to specific ones by using the Socket.IO protocol to communicate with client code in the browser.

To get started, let's first make a duplicate of the code from the previous chapter. If you created a directory named `chap08` for that code, create a new directory named `chap09` and copy the source tree there.

We won't make changes to the user authentication microservice, but we will use it for user authentication, of course.

In the Notes source directory, install these new modules:

We will use the `passport` module for user authentication, as used in Chapter 8, *Authenticating Users with a Microservice*, in combination with some of the real-time interactions.

At the beginning of `app.mjs`, add this to the `import` statements:


This code brings in the required modules. The `socket.io` package supplies the core event-passing service. The `passport.socketio` module integrates Socket.IO with PassportJS-based user authentication. We will be reorganizing `app.mjs` so that session management will be shared between Socket.IO, Express, and Passport. 

The first change is to move the declaration of some session-related values to the top of the module, as we've done here:

What this does is create a pair of global-scope variables to hold the objects related to the session configuration. We had been using these values as constants when setting up Express session support. Now we need to share those values with both the Socket.IO and Express session managers. When we initialize the Express and Socket.IO session handlers, each takes an initialization object carrying initialization parameters. Into each, we will pass the same values for the `secret` and `sessionStore` fields to ensure they stay consistent.

The next change is to move some of the code related to setting up the server object from the bottom of `app.mjs` to near the top, as follows:


In addition to moving some code from the bottom of `app.mjs`, we've added the initialization for Socket.IO. This is where the Socket.IO library wraps itself around the HTTP server object. Additionally, we're integrating it with the Passport library so that Socket.IO knows which sessions are authenticated.

The creation of the `app` and `server` objects is the same as before. All that's changed is the location in `app.mjs` where that occurred. What's new is the `io` object, which is our entry point into the Socket.IO API, and it is used for all Socket.IO operations. This precise object must be made available to other modules wishing to use Socket.IO operations since this object was created by wrapping the HTTP server object. Hence, the `io` object is exported so that other modules can import it.

By invoking `socketio(server)`, we have given Socket.IO access to the HTTP server. It listens for incoming requests on the URLs through which Socket.IO does its work. That's invisible to us, and we don't have to think about what's happening under the covers.

According to the Socket.IO internals, it looks like Socket.IO uses the `/socket.io` URL. That means our applications must avoid using this URL. See [`socket.io/docs/internals/`](https://socket.io/docs/internals/).

The `io.use` function installs functions in Socket.IO that are similar to Express middleware, which the Socket.IO documentation even calls middleware. In this case, the middleware function is returned by calling `passportSocketIO.authorize`, and is how we integrate Passport authentication into Socket.IO.

Because we are sharing session management between Express and Socket.IO, we must make the following change:

这与我们在第八章 使用微服务对用户进行身份验证中添加的 Express 会话支持的配置相同,但修改为使用我们之前设置的配置变量。这样做,Express 和 Socket.IO 会话处理都是从相同的信息集中管理的。

我们已经完成了在 Express 应用程序中设置 Socket.IO 的基本设置。首先,我们将 Socket.IO 库连接到 HTTP 服务器,以便它可以处理 Socket.IO 服务的请求。然后我们将其与 Passport 会话管理集成。

现在让我们学习如何使用 Socket.IO 在 Notes 中添加实时更新。

Notes 主页的实时更新

我们正在努力实现的目标是,当笔记被编辑、删除或创建时,Notes 主页会自动更新笔记列表。到目前为止,我们已经重构了应用程序启动,以便在 Notes 应用程序中初始化 Socket.IO。但是行为还没有改变。

我们将在创建、更新或删除笔记时发送事件。Notes 应用程序的任何感兴趣的部分都可以监听这些事件并做出适当的反应。例如,Notes 主页路由模块可以监听事件,然后向浏览器发送更新。Web 浏览器中的代码将监听来自服务器的事件,并在响应时重新编写主页。同样,当笔记被修改时,监听器可以向 Web 浏览器发送包含新笔记内容的消息,或者如果笔记被删除,监听器可以发送消息,以便 Web 浏览器重定向到主页。

这些更改是必需的:

  • 重构 Notes Store 实现以发送创建、更新和删除事件

  • 重构模板以支持每个页面上的 Bootstrap 和自定义 Socket.IO 客户端

  • 重构主页和笔记查看路由模块,以侦听 Socket.IO 事件并向浏览器发送更新

我们将在接下来的几节中处理这个问题,所以让我们开始吧。

重构 NotesStore 类以发出事件

为了在笔记更改、删除或创建时自动更新用户界面,NotesStore必须发送事件以通知感兴趣的各方这些更改。我们将使用我们的老朋友EventEmitter类来管理必须发送的事件的监听器。

请记住,我们创建了一个名为AbstractNotesStore的类,每个存储模块都包含AbstractNotesStore的子类。因此,我们可以在AbstractNotesStore中添加监听器支持,使其自动可用于实现。

models/Notes.mjs中,进行以下更改:


We imported the `EventEmitter` class, made `AbstractNotesStore` a subclass of `EventEmitter`, and then added some methods to emit events. As a result, every `NotesStore` implementation now has an `on` and `emit` method, plus these three helper methods.

This is only the first step since nothing is emitting any events. We have to rewrite the create, update, and destroy methods in `NotesStore` implementations to call these methods so the events are emitted. 

In the interest of space, we'll show the modifications to one of the `NotesStore` implementations, and leave the rest as an exercise for you.

Modify these functions in `models/notes-sequelize.mjs` as shown in the following code:
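A sketch of the rewritten methods; the bodies paraphrase typical Sequelize code built on this module's existing `connectDB` helper, `SQNote` model, and `Note` class from Chapter 7, so the field names may differ from your copy. What matters is the `emit*` call added at the end of each method:

```js
async create(key, title, body) {
  const SQNote = await connectDB();
  const sqnote = await SQNote.create({ notekey: key, title, body });
  const note = new Note(sqnote.notekey, sqnote.title, sqnote.body);
  this.emitCreated(note);      // new: notify listeners of the new note
  return note;
}

async update(key, title, body) {
  const SQNote = await connectDB();
  await SQNote.update({ title, body }, { where: { notekey: key } });
  const note = await this.read(key);
  this.emitUpdated(note);      // new: notify listeners of the change
  return note;
}

async destroy(key) {
  const SQNote = await connectDB();
  const sqnote = await SQNote.findOne({ where: { notekey: key } });
  if (sqnote) await sqnote.destroy();
  this.emitDestroyed(key);     // new: notify listeners of the deletion
}
```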

这些更改并未改变这些方法的原始合同,因为它们仍然创建、更新和销毁笔记。其他NotesStore实现需要类似的更改。新的是现在这些方法会为可能感兴趣的任何代码发出适当的事件。

还有一个需要处理的任务是初始化,这必须发生在NotesStore初始化之后。请记住,设置NotesStore是异步的。因此,在NotesStore初始化之后调用.on函数注册事件监听器必须发生在NotesStore初始化之后。

routes/index.mjsroutes/notes.mjs中,添加以下函数:


This function is where that initialization will be placed; for now it is empty, and we will fill it in shortly.

Then, in `app.mjs`, make this change:

这导入了两个init函数,为它们提供了唯一的名称,然后在NotesStore设置完成后调用它们。目前,这两个函数什么也不做,但很快会改变。重要的是这两个init函数将在NotesStore完全初始化后被调用。

我们的NotesStore在创建、更新或销毁笔记时发送事件。现在让我们使用这些事件适当地更新用户界面。

Notes 主页的实时更改

Notes 模型现在在创建、更新或销毁笔记时发送事件。为了让这些事件有用,它们必须显示给我们的用户。使事件对我们的用户可见意味着应用程序的控制器和视图部分必须消耗这些事件。

routes/index.mjs的顶部,将其添加到导入列表中:


Remember that this is the initialized Socket.IO object we use to send messages to and from connected browsers. We will use it to send messages to the Notes home page.

Then refactor the `router` function:
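A sketch of the refactoring; `notes` is the NotesStore instance this router module already imports, and the exact render parameters should match your existing home-page route:

```js
async function getKeyTitlesList() {
  const keylist = await notes.keylist();
  const keyPromises = keylist.map(key => notes.read(key));
  const notelist = await Promise.all(keyPromises);
  // Plain { key, title } objects serialize cleanly over Socket.IO,
  // whereas full Note instances arrive in the browser as empty objects.
  return notelist.map(note => ({ key: note.key, title: note.title }));
}

router.get('/', async (req, res, next) => {
  try {
    const notelist = await getKeyTitlesList();
    res.render('index', {
      title: 'Notes', notelist,
      user: req.user ? req.user : undefined
    });
  } catch (err) { next(err); }
});
```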

这将原本是router函数主体的内容提取到一个单独的函数中。我们不仅需要在主页的router函数中使用这个函数,还需要在为主页发出 Socket.IO 消息时使用它。

我们确实改变了返回值。最初,它包含一个 Note 对象数组,现在它包含一个包含keytitle数据的匿名对象数组。我们之所以这样做,是因为将 Note 对象数组提供给 Socket.IO 会导致发送到浏览器的是一组空对象,而发送匿名对象则可以正常工作。

然后,在底部添加这个:
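A sketch of the new code at the bottom of `routes/index.mjs`; the event names follow the description below:

```js
export function init() {
  io.of('/home').on('connect', socket => {
    // Nothing to do per-connection yet; connecting keeps the namespace active.
  });
  notes.on('notecreated', emitNoteTitles);
  notes.on('noteupdate',  emitNoteTitles);
  notes.on('notedestroy', emitNoteTitles);
}

async function emitNoteTitles() {
  const notelist = await getKeyTitlesList();
  io.of('/home').emit('notetitles', { notelist });
}
```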


The primary purpose of this section is to listen to the create/update/destroy events, so we can update the browser. For each, the current list of Notes is gathered, then sent to the browser.

As we said, the Socket.IO package uses a model similar to the `EventEmitter` class. The `emit` method sends an event, and the policy of event names and event data is the same as with `EventEmitter`.

Calling `io.of('/namespace')` creates a `Namespace` object for the named namespace. Namespaces are named in a pattern that looks like a pathname in Unix-like filesystems.

Calling `io.of('/namespace').on('connect'...)` has the effect of letting server-side code know when a browser connects to the named namespace. In this case, we are using the `/home` namespace for the Notes home page. This has the side-effect of keeping the namespace active after it is created. Remember that `init` is called during the initialization of the server. Therefore, we will have created the `/home` namespace long before any web browser tries to access that namespace by visiting the Notes application home page.

Calling `io.emit(...)` sends a broadcast message. Broadcast messages are sent to every browser connected to the application server. That can be useful in some situations, but in most situations, we want to avoid sending too many messages. To limit network data consumption, it's best to target each event to the browsers that need the event.

Calling `io.of('/namespace').emit(...)` targets the event to browsers connected to the named namespace. When the client-side code connects to the server, it connects with one or more namespaces. Hence, in this case, we target the `notetitles` event to browsers attached to the `/home` namespace, which we'll see later is the Notes home page.

Calling `io.of('/namespace').to('room')` accesses what Socket.IO calls a `room`. Before a browser receives events in a room, it must *join* the room. Rooms and namespaces are similar, but different, things. We'll use rooms later.

The next task accomplished in the `init` function is to create the event listeners for the `notecreated`, `noteupdate`, and `notedestroy` events. The handler function for each emits a Socket.IO event, `notetitles`, containing the list of note keys and titles.

As Notes are created, updated, and destroyed, we are now sending an event to the home page that is intended to refresh the page to match the change. The home page template, `views/index.hbs`, must be refactored to receive that event and rewrite the page to match.

### Changing the home page and layout templates

Socket.IO runs on both the client and the server, with the two communicating back and forth over the HTTP connection. So far, we've seen the server side of using Socket.IO to send events. The next step is to install a Socket.IO client on the Notes home page.

Generally speaking, every application page is likely to need a different Socket.IO client, since each page has different requirements. This means we must change how JavaScript code is loaded in Notes pages. 

Initially, we simply put JavaScript code required by Bootstrap and FeatherJS at the bottom of `layout.hbs`. That worked because every page required the same set of JavaScript modules, but now we've identified the need for different JavaScript code on each page. Because the custom Socket.IO clients for each page use jQuery for DOM manipulation, they must be loaded after jQuery is loaded. Therefore, we need to change `layout.hbs` to not load the JavaScript. Instead, every template will now be required to load the JavaScript code it needs. We'll supply a shared code snippet for loading the Bootstrap, Popper, jQuery, and FeatherJS libraries but beyond that, each template is responsible for loading any additional required JavaScript.

Create a file, `partials/footerjs.hbs`, containing the following code:

这段代码原本位于views/layout.hbs的底部,这是我们刚提到的共享代码片段。这意味着它将用于每个页面模板,并在自定义 JavaScript 之后使用。

现在我们需要修改views/layout.hbs如下:


That is, we'll leave `layout.hbs` pretty much as it was, except for removing the JavaScript tags from the bottom. Those tags are now in `footerjs.hbs`. 

We'll now need to modify every template (`error.hbs`, `index.hbs`, `login.hbs`, `notedestroy.hbs`, `noteedit.hbs`, and `noteview.hbs`) to, at the minimum, load the `footerjs` partial.

有了这个,每个模板都明确地在页面底部加载了 Bootstrap 和 FeatherJS 的 JavaScript 代码。它们以前是在layout.hbs的页面底部加载的。这给我们带来的好处是可以在加载 Bootstrap 和 jQuery 之后加载 Socket.IO 客户端代码。

我们已经更改了每个模板以使用新的加载 JavaScript 的策略。现在让我们来处理主页上的 Socket.IO 客户端。

向 Notes 主页添加 Socket.IO 客户端

请记住我们的任务是在主页添加一个 Socket.IO 客户端,以便主页接收有关创建、更新或删除笔记的通知。

views/index.hbs中,在footerjs部分之后添加以下内容:


This is what we meant when we said that each page will have its own Socket.IO client implementation. This is the client for the home page, but the client for the Notes view page will be different. This Socket.IO client connects to the `/home` namespace, then for `notetitles` events, it redraws the list of Notes on the home page.

The first `<script>` tag is where we load the Socket.IO client library, from `/socket.io/socket.io.js`. You'll notice that we never set up any Express route to handle the `/socket.io` URL. Instead, the Socket.IO library did that for us. Remember that the Socket.IO library handles every request starting with `/socket.io`, and this is one such request. The second `<script>` tag is where the page-specific client code lives.

Having client code within a `$(document).ready(function() { .. })` block is typical when using jQuery. This, as the code implies, waits until the web page is fully loaded, and then calls the supplied function. That way, our client code is not only held within a private namespace; it executes only when the page is fully set up.

On the client side, calling `io()` or `io('/namespace')` creates a `socket` object. This object is what's used to send messages to the server or to receive messages from the server.

In this case, the client connects a `socket` object to the `/home` namespace, which is the only namespace defined so far. We then listen for the `notetitles` events, which is what's being sent from the server. Upon receiving that event, some jQuery DOM manipulation erases the current list of Notes and renders a new list on the screen. The same markup is used in both places.

Additionally, for this script to function, this change is required elsewhere in the template:

您会注意到脚本中引用了$("#notetitles")来清除现有的笔记标题列表,然后添加一个新列表。显然,这需要在这个<div>上有一个id="notetitles"属性。

我们在routes/index.mjs中的代码监听了来自 Notes 模型的各种事件,并相应地向浏览器发送了一个notetitles事件。浏览器代码获取笔记信息列表并重新绘制屏幕。

您可能会注意到我们的浏览器端 JavaScript 没有使用 ES-2015/2016/2017 功能。当然,如果我们这样做,代码会更清晰。我们如何知道我们的访问者是否使用足够现代的浏览器来支持这些语言特性呢?我们可以使用 Babel 将 ES-2015/2016/2017 代码转译为能够在任何浏览器上运行的 ES5 代码。然而,在浏览器中仍然编写 ES5 代码是一种务实的折衷。

使用实时主页更新运行 Notes

我们现在已经实现了足够的功能来运行应用程序并看到一些实时操作。

像之前一样,在一个窗口中启动用户信息微服务:


Then, in another window, start the Notes application:

然后,在浏览器窗口中,转到http://localhost:3000并登录 Notes 应用程序。要查看实时效果,请打开多个浏览器窗口。如果您可以从多台计算机上使用 Notes,则也可以这样做。

在一个浏览器窗口中,创建和删除便签,同时保留其他浏览器窗口查看主页。创建一个便签,它应该立即显示在其他浏览器窗口的主页上。删除一个便签,它也应该立即消失。

您可能要尝试的一个场景需要三个浏览器窗口。在一个窗口中,创建一个新的便签,然后保留显示新创建的便签的浏览器窗口。在另一个窗口中,显示 Notes 主页。在第三个窗口中,显示新创建的便签。现在,删除这个新创建的便签。其中两个窗口被正确更新,现在显示主页。第三个窗口,我们只是在查看便签,仍然显示该便签,即使它已经不存在。

我们很快就会解决这个问题,但首先,我们需要讨论如何调试您的 Socket.IO 客户端代码。

关于在 Socket.IO 代码中启用调试跟踪的说明

If you run into problems, it is useful to inspect what Socket.IO is doing. Fortunately, the Socket.IO package uses the same Debug package as Express, so we can turn on debug tracing by setting the `DEBUG` environment variable. It even supports the same approach on the client side: by setting the `localStorage.debug` variable, we can enable debug tracing in the browser as well.

在服务器端,这是一个有用的DEBUG环境变量设置:
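For example, assuming the application's debug namespaces are prefixed with `notes:` as in earlier chapters:

```
$ DEBUG=notes:*,socket.io:* npm start
```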


This enables debug tracing for the Notes application and the Socket.IO package.

Enabling this in a browser is a little different since there are no environment variables. Simply open up the JavaScript console in your browser and enter this command:

立即,您将开始看到来自 Socket.IO 的不断交谈的消息。您将了解到的一件事是,即使应用程序处于空闲状态,Socket.IO 也在来回通信。

还有其他几个要使用的DEBUG字符串。例如,Socket.IO 依赖于 Engine.IO 包来进行传输。如果您想要对该包进行调试跟踪,将engine*添加到DEBUG字符串中。在测试本章节时,所示的字符串最有帮助。

现在我们已经了解了调试跟踪,我们可以处理将/notes/view页面更改为对正在查看的便签做出反应的问题。

查看便签时的实时操作

现在我们可以看到 Notes 应用程序的一部分实时更改,这很酷。让我们转到/notes/view页面看看我们能做些什么。我想到的是这个功能:

  • 如果其他人编辑便签,则更新便签。

  • 如果其他人删除了便签,将查看者重定向到主页。

  • 允许用户在便签上留下评论。

对于前两个功能,我们可以依赖于来自 Notes 模型的现有事件。因此,我们可以在本节中实现这两个功能。第三个功能将需要一个消息传递子系统,因此我们将在本章的后面进行讨论。

为了实现这一点,我们可以为每个便签创建一个 Socket.IO 命名空间,例如/notes/${notekey}。然后,当浏览器查看便签时,添加到noteview.hbs模板的客户端代码将连接到该命名空间。然而,这引发了如何创建这些命名空间的问题。相反,所选的实现是有一个命名空间/notes,并为每个便签创建一个房间。

routes/notes.mjs中,确保像这样导入io对象:


This, of course, makes the `io` object available to code in this module. We're also importing a function from `index.mjs` that is not currently exported. We will need to cause the home page to be updated, and therefore in `index.mjs`, make this change:

这只是添加了export关键字,以便我们可以从其他地方访问该函数。

然后,将init函数更改为以下内容:
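A sketch of the revised `init` function; `notes` is the NotesStore instance this module already imports, and `socket.handshake.query` is where Socket.IO 2.x exposes a connection's query parameters:

```js
export function init() {
  io.of('/notes').on('connect', socket => {
    // The client supplies the note key as a query parameter; join that room.
    if (socket.handshake.query.key) {
      socket.join(socket.handshake.query.key);
    }
  });

  notes.on('noteupdate', note => {
    const toemit = { key: note.key, title: note.title, body: note.body };
    io.of('/notes').to(note.key).emit('noteupdated', toemit);
    emitNoteTitles();
  });

  notes.on('notedestroy', key => {
    io.of('/notes').to(key).emit('notedestroyed', key);
    emitNoteTitles();
  });
}
```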


First, we handle `connect` events on the `/notes` namespace. In the handler, we're looking for a `query` object containing the `key` for a Note. Therefore, in the client code, when calling `io('/notes')` to connect with the server, we'll have to arrange to send that `key` value. It's easy to do, and we'll learn how in a little while.

Calling `socket.join(roomName)` does what is suggested—it causes this connection to join the named room. Therefore, this connection will be addressed as being in the `/notes` namespace, and in a room whose name is the `key` for a given Note.

The next thing is to add listeners for the `noteupdated` and `notedestroyed` messages. In both, we are using this pattern:

这就是我们如何使用 Socket.IO 向连接到给定命名空间和房间的任何浏览器发送消息。

对于noteupdated,我们只需发送新的笔记数据。我们再次不得不将笔记对象转换为匿名 JavaScript 对象,因为否则浏览器中会收到一个空对象。客户端代码将不得不使用 jQuery 操作来更新页面,我们很快就会看到。

对于notedestroyed,我们只需发送key。由于客户端代码将通过将浏览器重定向到主页来做出响应,我们根本不需要发送任何内容。

在这两者中,我们还调用emitNoteTitles来确保主页在被查看时得到更新。

为实时操作更改笔记视图模板

就像我们在主页模板中所做的那样,这些事件中包含的数据必须对用户可见。我们不仅需要向模板views/noteview.hbs中添加客户端代码;我们还需要对模板进行一些小的更改:


In this section of the template, we add a pair of IDs to two elements. This enables the JavaScript code to target the correct elements.

Add this client code to `noteview.hbs`:
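A sketch of the client script; the `#notetitle` and `#notebody` IDs stand for the two `id=` attributes added to the template above, so adjust them to whatever names you used:

```html
<script>
$(document).ready(function () {
  var socket = io('/notes', { query: { key: '{{ notekey }}' } });
  socket.on('noteupdated', function (note) {
    $('#notetitle').text(note.title);
    $('#notebody').text(note.body);
  });
  socket.on('notedestroyed', function (key) {
    window.location.href = "/";
  });
});
</script>
```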

在此脚本中,我们首先连接到/notes命名空间,然后为noteupdatednotedestroyed事件创建监听器。

连接到/notes命名空间时,我们传递了一个额外的参数。这个函数的可选第二个参数是一个选项对象,在这种情况下,我们传递了query选项。query对象在形式上与URL类的query对象相同。这意味着命名空间就像是一个 URL,比如/notes?key=${notekey}。根据 Socket.IO 文档,我们可以传递一个完整的 URL,如果连接是这样创建的,它也可以工作:


While we could set up the URL query string this way, it's cleaner to do it the other way.

We need to call out a technique being used. These code snippets are written in a Handlebars template, and therefore the syntax `{{ expression }}` is executed on the server, with the result of that expression to be substituted into the template. Therefore, the `{{ expression }}` construct accesses server-side data. Specifically, `query: { key: '{{ notekey }}' }` is a data structure on the client side, but the `{{ notekey }}` portion is evaluated on the server. The client side does not see `{{ notekey }}`, it sees the value `notekey` had on the server.

For the `noteupdated` event, we take the new note content and display it on the screen. For this to work, we had to add `id=` attributes to certain HTML elements so we could use jQuery selectors to manipulate the correct elements.

Additionally in `partials/header.hbs`, we needed to make this change as well:

我们还需要在页面顶部更新标题,这个id属性有助于定位正确的元素。

对于notedestroyed事件,我们只需将浏览器窗口重定向回主页。正在查看的笔记已被删除,用户继续查看不再存在的笔记是没有意义的。

在查看笔记时运行带有伪实时更新的笔记

此时,您现在可以重新运行笔记应用程序并尝试新的实时更新功能。

到目前为止,您已经多次测试了笔记,并知道该怎么做。首先启动用户认证服务器和笔记应用程序。确保数据库中至少有一条笔记;如果需要,添加一条。然后,打开多个浏览器窗口,一个查看主页,两个查看同一条笔记。在查看笔记的窗口中,编辑笔记进行更改,确保更改标题。文本更改应该在主页和查看笔记的页面上都有变化。

然后删除笔记并观察它从主页消失,而且查看笔记的浏览器窗口现在位于主页上。

在本节中,我们处理了很多事情,现在笔记应用程序具有动态更新功能。为此,我们创建了一个基于事件的通知系统,然后在浏览器和服务器中使用 Socket.IO 来往返通信数据。

我们已经实现了我们设定的大部分目标。通过重构笔记存储实现以发送事件,我们能够向浏览器中的 Socket.IO 客户端发送事件。这反过来又用于自动更新笔记主页和/notes/view页面。

剩下的功能是让用户能够在笔记上写评论。在下一节中,我们将通过添加一个全新的数据库表来处理消息。

笔记的用户间聊天和评论

这很酷!现在我们在编辑、删除或创建笔记时可以实时更新笔记。现在让我们把它提升到下一个级别,并实现类似于用户之间聊天的功能。

早些时候,我们列举了在/notes/view页面上可以使用 Socket.IO 做的三件事。我们已经实现了当笔记更改时的实时更新和当笔记被删除时重定向到主页;剩下的任务是允许用户对笔记进行评论。

我们可以将我们的笔记应用程序概念转变,并将其发展成一个社交网络。在大多数这样的网络中,用户发布东西(笔记、图片、视频等),其他用户对这些东西进行评论。如果做得好,这些基本元素可以发展成一个庞大的人群共享笔记的社区。虽然笔记应用程序有点像一个玩具,但它离一个基本的社交网络并不太远。我们现在将要做的评论是朝着这个方向迈出的一小步。

在每个笔记页面上,我们将有一个区域来显示来自笔记用户的消息。每条消息将显示用户名、时间戳和他们的消息。我们还需要一种方法让用户发布消息,并允许用户删除消息。

所有这些操作都将在不刷新屏幕的情况下执行。相反,网页内运行的代码将发送命令到/从服务器,并动态采取行动。通过这样做,我们将学习关于 Bootstrap 模态对话框,以及更多关于发送和接收 Socket.IO 消息的知识。让我们开始吧。

存储消息的数据模型

我们需要首先实现一个用于存储消息的数据模型。所需的基本字段是唯一 ID、发送消息的人的用户名、与消息相关的命名空间和房间、消息,最后是消息发送的时间戳。当接收或删除消息时,必须从数据模型中发出事件,以便我们可以在网页上做正确的事情。我们将消息与房间和命名空间组合关联起来,因为在 Socket.IO 中,该组合已被证明是一种很好的方式来定位笔记应用程序中的特定页面。

这个数据模型实现将被写入 Sequelize。如果您喜欢其他存储解决方案,您可以尽管在其他数据存储系统上重新实现相同的 API。

创建一个新文件models/messages-sequelize.mjs,其中包含以下内容:
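A sketch of the top of the module; the import path for the shared Sequelize connection helper is an assumption based on Chapter 7's layout:

```js
import { DataTypes, Model } from 'sequelize';
import { connectSequlz } from './sequlz.mjs';   // shared connection helper (path assumed)
import EventEmitter from 'events';
import DBG from 'debug';

const debug = DBG('notes:model-messages');
const error = DBG('notes:error-messages');

class MessagesEmitter extends EventEmitter {}
export const emitter = new MessagesEmitter();
```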


This sets up the modules being used and also initializes the `EventEmitter` interface. We're also exporting the `EventEmitter` as `emitter` so other modules can be notified about messages as they're created or deleted.

Now add this code for handling the database connection:
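A sketch of the connection code and schema; the field types and hook bodies follow the description below, and `sanitizedMessage` is defined a little later in this module:

```js
let sequlz;

export class SQMessage extends Model {}

async function connectDB() {
  if (sequlz) return;                    // reuse an existing connection
  sequlz = await connectSequlz();
  SQMessage.init({
    id:        { type: DataTypes.INTEGER, autoIncrement: true, primaryKey: true },
    from:      DataTypes.STRING,
    namespace: DataTypes.STRING,
    room:      DataTypes.STRING,
    message:   DataTypes.TEXT,
    timestamp: DataTypes.DATE
  }, {
    hooks: {
      afterCreate(message, options) {
        emitter.emit('newmessage', sanitizedMessage(message));
      },
      afterDestroy(message, options) {
        emitter.emit('destroymessage', {
          id: message.id, namespace: message.namespace, room: message.room
        });
      }
    },
    sequelize: sequlz,
    modelName: 'SQMessage'
  });
  await SQMessage.sync();
}
```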

connectDB的结构与我们在notes-sequelize.mjs中所做的类似。我们使用相同的connectSequlz函数与相同的数据库连接,并且如果数据库已经连接,我们会立即返回。

通过SQMessage.init,我们在数据库中定义了我们的消息模式。我们有一个相当简单且相当自解释的数据库模式。为了发出关于消息的事件,我们使用了Sequelize的一个特性,在特定时间调用。

id字段不会由调用者提供;相反,它将自动生成。因为它是一个autoIncrement字段,每添加一条消息,数据库将为其分配一个新的id编号。在 MySQL 中的等效操作是在列定义上的AUTO_INCREMENT属性。

namespaceroom字段一起定义了每条消息属于笔记中的哪个页面。请记住,在使用 Socket.IO 发出事件时,我们可以将事件定位到这两个空间中的一个或两个,因此我们将使用这些值将每条消息定位到特定页面。

到目前为止,我们为笔记主页定义了一个命名空间/home,为查看单个笔记定义了另一个命名空间/notes。理论上,笔记应用程序可以扩展到在其他区域显示消息。例如,/private-message命名空间可以用于私人消息。因此,模式被定义为具有namespaceroom字段,以便在将来的笔记应用程序的任何部分中使用消息。

对于我们当前的目的,消息将被存储在namespace等于/homeroom等于给定笔记的key的情况下。

我们将使用timestamp按发送顺序呈现消息。from字段是发送者的用户名。

为了发送有关已创建和已销毁消息的通知,让我们尝试一些不同的方法。如果我们遵循之前使用的模式,我们即将创建的函数将具有带有相应消息的emitter.emit调用。但 Sequelize 提供了一种不同的方法。

使用Sequelize,我们可以创建所谓的钩子方法。钩子也可以被称为生命周期事件,它们是我们可以声明的一系列函数。当 Sequelize 管理的对象存在某些触发状态时,将调用钩子方法。在这种情况下,我们的代码需要知道何时创建消息,以及何时删除消息。

钩子声明如选项对象所示。schema选项对象中的名为hooks的字段定义了钩子函数。对于我们想要使用的每个钩子,添加一个包含钩子函数的适当命名字段。对于我们的需求,我们需要声明hooks.afterCreatehooks.afterDestroy。对于每个钩子,我们声明一个函数,该函数接受刚刚创建或销毁的SQMessage对象的实例。然后,使用该对象,我们调用emitter.emit,使用newmessagedestroymessage事件名称。

继续添加这个函数:


The `sanitizedMessage` function performs the same function as `sanitizedUser`. In both cases, we are receiving a Sequelize object from the database, and we want to return a simple object to the caller. These functions produce that simplified object.

Next, we have several functions to store new messages, retrieve messages, and delete messages. 

The first is this function:
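A sketch of `postMessage`; notice that it does not call `emitter.emit` itself, for the reason discussed below:

```js
export async function postMessage(from, namespace, room, message) {
  await connectDB();
  await SQMessage.create({
    from, namespace, room, message, timestamp: new Date()
  });
  // The afterCreate hook emits the 'newmessage' event on our behalf.
}
```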

当用户发布新评论/消息时将调用此函数。我们将其存储在数据库中,并且钩子发出一个事件,表示消息已创建。

请记住,id字段是在存储新消息时自动创建的。因此,在调用SQMessage.create时不提供它。

这个函数和下一个函数本来可以包含emitter.emit调用来发送newmessagedestroymessage事件。相反,这些事件是在我们之前创建的钩子函数中发送的。问题是是否将emitter.emit放在钩子函数中,还是放在这里。

这里使用的原理是,通过使用钩子,我们可以确保始终发出消息。

然后,添加这个函数:


This is to be called when a user requests that a message should be deleted. With Sequelize, we must first find the message and then delete it by calling its `destroy` method.

Add this function:

这个函数检索最近的消息,立即使用情况是在渲染/notes/view页面时使用。

虽然我们当前的实现是用于查看笔记,但它是通用的,适用于任何 Socket.IO 命名空间和房间。这是为了可能的未来扩展,正如我们之前解释的那样。它找到与给定命名空间和房间组合关联的最近的 20 条消息,然后将一个经过清理的列表返回给调用者。

findAll中,我们指定一个order属性。这类似于 SQL 中的ORDER BY短语。order属性接受一个或多个描述符的数组,声明 Sequelize 应该如何对结果进行排序。在这种情况下,有一个描述符,表示按照时间戳字段降序排序。这将导致最近的消息首先显示。

我们创建了一个简单的模块来存储消息。我们没有实现完整的创建、读取、更新和删除CRUD)操作,因为对于这个任务并不需要。我们即将创建的用户界面只允许用户添加新消息、删除现有消息和查看当前消息。

让我们继续创建用户界面。

为 Notes 路由器添加消息支持

现在我们可以将消息存储到数据库中,让我们将其集成到 Notes 路由器模块中。

将消息集成到/notes/view页面将需要在notesview.hbs模板中添加一些新的 HTML 和 JavaScript,并在routes/notes.mjs中的init函数中添加一些新的 Socket.IO 通信端点。在本节中,让我们处理这些通信端点,然后在下一节中让我们讨论如何在用户界面中设置它。

routes/notes.mjs中,将这个添加到import语句中:


This imports the functions we just created so we can use them. And we also set up `debug` and `error` functions for tracing.

Add these event handlers to the `init` function in `routes/notes.mjs`:

这些接收来自models/messages-sequelize.mjs的新消息或已销毁消息的通知,然后将通知转发到浏览器。请记住,消息对象包含命名空间和房间,因此这让我们能够将此通知发送到任何 Socket.IO 通信通道。

为什么我们不直接在models/messages-sequelize.mjs中进行 Socket.IO 调用呢?显然,将 Socket.IO 调用放在messages-sequelize.mjs中会更有效率,需要更少的代码行,因此减少了错误的机会。但是我们正在保持模型、视图和控制器之间的分离,这是我们在第五章中讨论过的。此外,我们能够自信地预测将来不会有其他用途的消息吗?这种架构允许我们将多个监听器方法连接到这些消息事件,以实现多种目的。

在用户界面中,我们将不得不实现相应的监听器来接收这些消息,然后采取适当的用户界面操作。

init函数中的connect监听器中,添加这两个新的事件监听器:


This is the existing function to listen for connections from `/notes/view` pages, but with two new Socket.IO event handler functions. Remember that in the existing client code in `notesview.hbs`, it connects to the `/notes` namespace and supplies the note `key` as the room to join. In this section, we build on that by also setting up listeners for `create-message` and `delete-message` events when a note `key` has been supplied.

As the event names imply, the `create-message` event is sent by the client side when there is a new message, and the `delete-message` event is sent to delete a given message. The corresponding data model functions are called to perform those functions.

For the `create-message` event, there is an additional feature being used. This uses what Socket.IO calls an acknowledgment function.

So far, we've used the Socket.IO `emit` method with an event name and a data object. We can also include a `callback` function as an optional third parameter. The receiver of the message will receive the function and can call the function, and any data passed to the function is sent to the `callback` function. The interesting thing is this works across the browser-server boundary.

This means our client code will do this:

第三个参数中的函数成为create-message事件处理程序函数中的fn参数。然后,提供给fn调用的任何内容都将作为result参数传递到此函数中。不管是浏览器通过连接到服务器提供该函数,还是在服务器上调用该函数,Socket.IO 都会负责将响应数据传输回浏览器代码并在那里调用确认函数。最后要注意的是,我们在错误报告方面有些懒惰。因此,将一个任务放在待办事项中,以改进向用户报告错误。

下一个任务是在浏览器中实现代码,使所有这些对用户可见。

更改消息的注释视图模板

我们需要再次深入views/noteview.hbs进行更多的更改,以便我们可以查看、创建和删除消息。这一次,我们将添加大量的代码,包括使用 Bootstrap 模态弹出窗口来获取消息,我们刚刚讨论的 Socket.IO 消息,以及 jQuery 操作,使所有内容显示在屏幕上。

我们希望/notes/view页面不会导致不必要的页面重新加载。相反,我们希望用户通过弹出窗口收集消息文本来添加评论,然后新消息将被添加到页面上,而不会导致页面重新加载。同样,如果另一个用户向 Note 添加消息,我们希望消息能够在不重新加载页面的情况下显示出来。同样,我们希望删除消息而不会导致页面重新加载,并且希望消息被删除后,其他查看 Note 的用户也不会导致页面重新加载。

当然,这将涉及浏览器和服务器之间来回传递多个 Socket.IO 消息,以及一些 jQuery DOM 操作。我们可以在不重新加载页面的情况下完成这两个操作,这通常会提高用户体验。

让我们首先实现用户界面来创建新消息。

在 Note 视图页面上撰写消息

/notes/view页面的下一个任务是让用户添加消息。他们将点击一个按钮,弹出窗口让他们输入文本,然后他们将在弹出窗口中点击一个按钮,弹出窗口将被关闭,消息将显示出来。此外,消息将显示给 Note 的其他查看者。

Bootstrap 框架包括对模态窗口的支持。它们与桌面应用程序中的模态对话框具有类似的作用。模态窗口出现在应用程序现有窗口的上方,同时阻止与网页或应用程序其他部分的交互。它们用于向用户提问等目的。典型的交互是点击按钮,然后应用程序弹出一个包含一些 UI 元素的模态窗口,用户与模态交互,然后关闭它。在使用计算机时,您肯定已经与成千上万个模态窗口进行了交互。

让我们首先添加一个按钮,用户将请求添加评论。在当前设计中,笔记文本下方有一排两个按钮。在views/noteview.hbs中,让我们添加第三个按钮:


This is directly out of the documentation for the Bootstrap Modal component. The `btn-outline-dark` style matches the other buttons in this row, and between the `data-toggle` and the `data-target` attributes, Bootstrap knows which Modal window to pop up.

Let's insert the definition for the matching Modal window in `views/noteview.hbs`:

这是直接来自 Bootstrap 模态组件的文档,以及一个简单的表单来收集消息。

请注意,这里有<div class="modal-dialog">,在其中有<div class="model-content">。这两者一起形成了对话框窗口内显示的内容。内容分为<div class="modal-header">用于对话框的顶部行,以及<div class="modal-body">用于主要内容。

最外层元素的id值,id="notes-comment-modal",与按钮中声明的目标匹配,data-target="#notes-comment-modal"。另一个连接是aria-labelledby,它与<h5 class="modal-title">元素的id匹配。

<form id="submit-comment">很简单,因为我们不会使用它通过 HTTP 连接提交任何内容到常规 URL。因此,它没有actionmethod属性。否则,这是一个正常的日常 Bootstrapform,带有fieldset和各种表单元素。

下一步是添加客户端 JavaScript 代码使其功能正常。单击按钮时,我们希望运行一些客户端代码,该代码将发送与我们添加到routes/notes.mjs匹配的create-message事件。

views/noteview.hbs中,我们有一个包含客户端代码的$(document).ready部分。在该函数中,添加一个仅在user对象存在时存在的部分,如下所示:


That is, we want a section of jQuery code that's active only when there is a `user` object, meaning that this Note is being shown to a logged-in user.

Within that section, add this event handler:
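A sketch of the handler; `#notes-comment-modal` comes from the Modal markup above, while the ID of the text input (`#notecomment-input-text` here) and the `user.id` expression are assumptions that must match your form and the logged-in user object:

```js
$('#submitNewComment').on('click', function () {
  socket.emit('create-message', {
    from: '{{ user.id }}',
    namespace: '/notes',
    room: '{{ notekey }}',
    message: $('#notecomment-input-text').val()
  }, function (response) {
    // Acknowledgment callback: close the Modal and clear the text field.
    $('#notecomment-input-text').val('');
    $('#notes-comment-modal').modal('hide');
  });
});
```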

这与我们刚刚创建的表单中的按钮相匹配。通常在type="submit"按钮的事件处理程序中,我们会使用event.preventDefault来防止正常结果,即重新加载页面。但在这种情况下不需要。

该函数从表单元素中收集各种值,并发送create-message事件。如果我们回顾服务器端代码,create-message调用postMessage,将消息保存到数据库,然后发送newmessage事件,该事件传递到浏览器。

因此,我们将需要一个newmessage事件处理程序,我们将在下一节中介绍。与此同时,您应该能够运行 Notes 应用程序,添加一些消息,并查看它们是否已添加到数据库中。

请注意,这有一个第三个参数,一个函数,当调用时会导致模态被关闭,并清除输入的任何消息。这是我们之前提到的确认函数,在服务器上调用,并且 Socket.IO 安排在客户端调用它。

在 Note 视图页面上显示任何现有消息

现在我们可以添加消息了,让我们学习如何显示消息。请记住,我们已经定义了 SQMessage 模式,并且我们已经定义了一个函数recentMessages来检索最近的消息。

在呈现 Note 页面时,我们有两种可能的方法来显示现有消息。一种选择是当页面最初显示时,发送一个事件请求最近的消息,并在接收到消息后在客户端呈现这些消息。另一种选择是在服务器上呈现消息。我们选择了第二种选择,即服务器端呈现。

routes/notes.mjs中,修改/view路由器函数如下:


That's simple enough: we retrieve the recent messages, then supply them to the `noteview.hbs` template. When we retrieve the messages, we supply the `/notes` namespace and a room name of the note `key`. It is now up to the template to render the messages.

In the `noteview.hbs` template, just below the *delete*, edit, and *comment* buttons, add this code:

如果有一个messages对象,这些步骤会遍历数组,并为每个条目设置一个 Bootstrap card组件来显示消息。消息显示在<div id="noteMessages">中,我们稍后会在 DOM 操作中进行定位。每条消息的标记直接来自 Bootstrap 文档,稍作修改。

在每种情况下,card组件都有一个id属性,我们可以用它来与数据库中的特定消息关联。button组件将用于删除消息,并携带数据属性来标识将要删除的消息。

通过这样,我们可以查看一个笔记,并查看已附加的任何消息。我们没有选择消息的排序,但请记住,在models/messages-sequelize.mjs中,数据库查询按照时间顺序相反的顺序排列消息。

无论如何,我们的目标是使消息能够自动添加,而无需重新加载页面。为此,我们需要一个newmessage事件的处理程序,这是上一节遗留下来的任务。

submitNewComment按钮的处理程序下面,添加以下内容:


This is a handler for the Socket.IO `newmessage` event. What we have done is taken the same markup as is in the template, substituted values into it, and used jQuery to prepend the text to the top of the `noteMessages` area.

Remember that we decided against using any ES6 goodness because a template string would sure be handy in this case. Therefore, we have fallen back on an older technique, the JavaScript `String.replace` method.

There is a common question: how do we replace multiple occurrences of a target string in JavaScript? You'll notice that the target `%id%` appears twice. The best answer is to use `replace(/pattern/g, newText)`; in other words, you pass a regular expression and specify the `g` modifier to make the replacement global. Those of us who grew up using `/bin/ed`, and for whom `/usr/bin/vi` was a major advance, will nod in recognition that this is the JavaScript equivalent of `s/pattern/newText/g`.

With this event handler, the message will now appear automatically when it is added by the user. Further, in any other window that is simply viewing the Note, the new message will appear automatically as well.

Because we use the jQuery `prepend` method, the message appears at the top. If you want it to appear at the bottom, then use `append`. And in `models/messages-sequelize.mjs`, you can remove the `DESC` attribute in `recentMessages` to change the ordering.

The last thing to notice is the markup includes a button with the `id="message-del-button"`. This button is meant to be used to delete a message, and in the next section, we'll implement that feature.

### Deleting messages on the Notes view page

To make the `message-del-button` button active, we need to listen to click events on the button. 

Below the `newmessage` event handler, add this button click handler:

socket对象已经存在,并且是与此笔记的 Socket.IO 连接。我们向房间发送一个delete-message事件,其中包含按钮上存储的数据属性的值。

正如我们已经看到的,在服务器上,delete-message事件调用destroyMessage函数。该函数从数据库中删除消息,并发出一个destroymessage事件。routes/notes.mjs中接收到该事件,并将消息转发到浏览器。因此,我们需要在浏览器中添加一个事件监听器来接收destroymessage事件:


回头看看,每条消息显示`card`都有一个符合这里显示模式的`id`参数。因此,jQuery 的`remove`函数负责从显示中删除消息。

### 运行笔记并传递消息

这是很多代码,但现在我们有能力撰写消息,在屏幕上显示它们,并删除它们,而无需重新加载页面。

您可以像我们之前那样运行应用程序,首先在一个命令行窗口中启动用户认证服务器,然后在另一个命令行窗口中启动笔记应用程序:

![](https://gitee.com/OpenDocCN/freelearn-node-zh/raw/master/docs/node-webdev-5e/img/8600d6c4-884c-4758-bf91-e57db6f92371.png)

它显示了笔记上的任何现有消息。

输入消息时,模态框看起来像这样:

![](https://gitee.com/OpenDocCN/freelearn-node-zh/raw/master/docs/node-webdev-5e/img/332a268c-23b4-440f-aa4c-b3084db30df7.png)

尝试在多个浏览器窗口中查看相同的笔记或不同的笔记。这样,您可以验证笔记只显示在相应的笔记窗口上。

# 总结

在本章中,我们走了很长的路,但也许 Facebook 不必担心我们将笔记应用程序转换为社交网络的初步尝试。尽管如此,我们为应用程序添加了一些有趣的新功能,这使我们有机会探索一些真正酷的伪实时通信技术,用于浏览器会话之间的交流。

我们了解了如何使用 Socket.IO 进行伪实时的网络体验。正如我们所学到的,它是一个用于服务器端代码和在浏览器中运行的客户端代码之间动态交互的框架。它遵循一个事件驱动模型,用于在两者之间发送事件。我们的代码使用这个框架,既用于向浏览器发送服务器上发生的事件的通知,也用于希望编写评论的用户。

我们了解了从服务器端代码的一个部分发送到另一个部分的事件的价值。这使我们能够根据服务器上发生的更改进行客户端更新。这使用了`EventEmitter`类和监听器方法,将事件和数据传递到浏览器。

在浏览器中,我们使用 jQuery DOM 操作来响应这些动态发送的消息来改变用户界面。通过使用 Socket.IO 和正常的 DOM 操作,我们能够刷新页面内容,同时避免重新加载页面。

我们还学习了关于模态窗口,利用这种技术来创建评论。当然,还有很多其他事情可以做,比如不同的体验来创建、删除或编辑笔记。

为了支持所有这些,我们添加了另一种数据,*消息*,以及一个由新的 Sequelize 模式管理的相应数据库表。它用于表示我们的用户可以在笔记上发表的评论,但也足够通用,可以用于其他用途。

正如我们所看到的,Socket.IO 为我们提供了丰富的事件基础,可以在服务器和客户端之间传递事件,为用户构建多用户、多通道的通信体验。

在下一章中,我们将探讨 Node.js 应用程序在真实服务器上的部署。在我们的笔记本上运行代码很酷,但要取得成功,应用程序需要得到适当的部署。


# 第三部分:部署

除了使用 systemd 传统部署 Node.js 应用程序的方法外,新的最佳实践是使用 Kubernetes 或类似的系统。

本节包括以下章节:

+   第十章,将 Node.js 应用程序部署到 Linux 服务器

+   第十一章,使用 Docker 部署 Node.js 微服务

+   第十二章,使用 Terraform 在 AWS EC2 上部署 Docker Swarm

+   第十三章,单元测试和功能测试

+   第十四章,Node.js 应用程序中的安全性


将 Node.js 应用程序部署到 Linux 服务器

现在 Notes 应用程序已经相当完整,是时候考虑如何将其部署到真实服务器上了。我们已经创建了一个合作笔记概念的最小实现,效果相当不错。为了发展,Notes 必须离开我们的笔记本电脑,生活在一个真正的服务器上。

要实现的用户故事是访问托管应用程序,即使您的笔记本电脑关闭,也可以进行评估。开发者的故事是识别几种部署解决方案之一,确保系统在崩溃时具有足够的可靠性,以及用户可以在不占用开发者太多时间的情况下访问应用程序。

在本章中,我们将涵盖以下主题:

+   应用程序架构的讨论,以及如何实施部署的想法

+   在 Linux 服务器上进行传统的 LSB 兼容的 Node.js 部署

+   配置 Ubuntu 以管理后台任务

+   调整 Twitter 应用程序认证的设置

+   使用 PM2 可靠地管理后台任务

+   部署到虚拟 Ubuntu 实例,可以是我们笔记本电脑上的虚拟机(VM)或虚拟专用服务器(VPS)提供商

Notes 应用程序由两个服务组成:Notes 本身和用户认证服务,以及相应的数据库实例。为了可靠地向用户提供这些服务,这些服务必须部署在公共互联网上可见的服务器上,并配备系统管理工具,以保持服务运行,处理服务故障,并扩展服务以处理大量流量。一个常见的方法是依赖于在服务器启动期间执行脚本来启动所需的后台进程。

即使我们的最终目标是在具有自动扩展和所有流行词的基于云的平台上部署,您仍必须从如何在类 Unix 系统上后台运行应用程序的基础知识开始。

让我们通过再次审查架构并思考如何在服务器上最佳部署来开始本章。

# 第十四章:Notes 应用程序架构和部署考虑事项

在我们开始部署 Notes 应用程序之前,我们需要审查其架构并了解我们计划做什么。我们已将服务分成两组,如下图所示:

![](https://gitee.com/OpenDocCN/freelearn-node-zh/raw/master/docs/node-webdev-5e/img/27cc847a-c683-4bd3-9897-0a95dc242e1e.png)

用户界面部分是 Notes 服务及其数据库。后端,用户认证服务及其数据库需要更多的安全性。在我们的笔记本电脑上,我们无法为该服务创建设想中的保护墙,但我们即将实施一种形式的保护。

增强安全性的一种策略是尽可能少地暴露端口。这减少了所谓的攻击面,简化了我们在加固应用程序防止安全漏洞方面的工作。对于 Notes 应用程序,我们只需要暴露一个端口:用户访问应用程序的 HTTP 服务。其他端口——两个用于 MySQL 服务器,一个用于用户认证服务端口——不应该对公共互联网可见,因为它们仅供内部使用。因此,在最终系统中,我们应该安排暴露一个 HTTP 端口,并将其他所有内容与公共互联网隔离开来。

在内部,Notes 应用程序需要访问 Notes 数据库和用户认证服务。反过来,该服务需要访问用户认证数据库。Notes 服务不需要访问用户认证数据库,用户认证服务也不需要访问 Notes 数据库。按照目前的设想,不需要外部访问任何数据库或认证服务。

这给了我们一个将要实施的感觉。要开始,让我们学习在 Linux 上部署应用程序的传统方式。

# Node.js 服务的传统 Linux 部署

在本节中,我们将探讨传统的 Linux/Unix 服务部署。我们将在笔记本电脑上运行一个虚拟的 Ubuntu 实例来完成这个目标。目标是创建后台进程,这些进程在启动时自动启动,如果进程崩溃,则重新启动,并允许我们监视日志文件和系统状态。

传统的 Linux/Unix 服务器应用部署使用 init 脚本来管理后台进程。它们在系统启动时启动,并在系统停止时干净地关闭。名称“init 脚本”来自系统中启动的第一个进程的名称,其传统名称为`/etc/init`。init 脚本通常存储在`/etc/init.d`中,并且通常是简单的 shell 脚本。一些操作系统使用其他进程管理器,例如`upstart`、`systemd`或`launchd`,但遵循相同的模型。虽然这是一个简单的模型,但具体情况在一个操作系统(OS)到另一个操作系统(OS)之间差异很大。

Node.js 项目本身不包括任何脚本来管理任何操作系统上的服务器进程。基于 Node.js 实现完整的 Web 服务意味着我们必须创建脚本来与您的操作系统上的进程管理集成。

在互联网上拥有 Web 服务需要在服务器上运行后台进程,并且这些进程必须是以下内容:

+   **可靠性**:例如,当服务器进程崩溃时,它们应该能够自动重新启动。

+   **可管理性**:它们应该与系统管理实践很好地集成。

+   **可观察性**:管理员必须能够从服务中获取状态和活动信息。

为了演示涉及的内容,我们将使用 PM2 来实现*Notes*的后台服务器进程管理。PM2 将自己标榜为*进程管理器*,意味着它跟踪它正在管理的进程的状态,并确保这些进程可靠地执行并且可观察。PM2 会检测系统类型,并可以自动集成到本机进程管理系统中。它将创建一个 LSB 风格的 init 脚本([`wiki.debian.org/LSBInitScripts`](http://wiki.debian.org/LSBInitScripts)),或者根据您的服务器需要创建其他脚本。

本章的目标是探讨如何做到这一点,有几种实现这一目标的途径:

+   传统的虚拟机管理应用程序,包括 VirtualBox、Parallels 和 VMware,让我们在虚拟环境中安装 Ubuntu 或任何其他操作系统。在 Windows 上,Hyper-V 随 Windows 10 Pro 一起提供类似的功能。在这些情况下,您下载引导 CD-ROM 的 ISO 镜像,从该 ISO 镜像引导虚拟机,并运行完整的操作系统安装,就像它是一台普通的计算机一样。

+   您可以从全球数百家网络托管提供商中租用廉价的 VPS。通常选择受限于 Ubuntu 服务器。在这些情况下,您将获得一个预先准备好的服务器系统,可用于安装运行网站的服务器软件。

+   一种新产品 Multipass 是一种基于轻量级虚拟化技术的轻量级虚拟机管理工具,适用于每台台式计算机操作系统。它为您提供了与从托管提供商租用 VPS 或使用 VirtualBox 等 VM 软件获得的完全相同的起点,但对系统的影响要比 VirtualBox 等传统 VM 应用程序低得多。就像在笔记本电脑上获得 VPS 一样。

从启动后台进程的工具和命令的角度来看,这些选择之间没有实际区别。在 VirtualBox 中安装的 Ubuntu 实例与从 Web 托管提供商那里租用的 VPS 上的 Ubuntu 相同,与在 Multipass 实例中启动的 Ubuntu 相同。它是相同的操作系统,相同的命令行工具和相同的系统管理实践。不同之处在于对笔记本电脑性能的影响。使用 Multipass,我们可以在几秒钟内设置一个虚拟的 Ubuntu 实例,并且很容易在笔记本电脑上运行多个实例而几乎不会影响性能。使用 VirtualBox、Hyper-V 或其他 VM 解决方案的体验是,使用笔记本电脑会很快感觉像在糖浆中行走,特别是在同时运行多个 VM 时。

因此,在本章中,我们将在 Multipass 上运行此练习。本章中显示的所有内容都可以轻松转移到 VirtualBox/VMware/等上的 Ubuntu 或从 Web 托管提供商那里租用的 VPS 上。

对于此部署,我们将使用 Multipass 创建两个 Ubuntu 实例:一个用于 Notes 服务,另一个用于用户服务。在每个实例中,都将有一个对应数据库的 MySQL 实例。然后我们将使用 PM2 配置这些系统,在启动时在后台启动我们的服务。

由于 Multipass 和 WSL2 之间存在明显的不兼容性,因此在 Windows 上使用 Multipass 可能会遇到困难。如果遇到问题,我们有一节描述应该怎么做。

第一项任务是复制上一章的源代码。建议您创建一个新目录`chap10`,作为`chap09`目录的同级目录,并将`chap09`中的所有内容复制到`chap10`中。

首先,让我们安装 Multipass,然后我们将开始部署和测试用户认证服务,然后部署和测试 Notes。我们还将涵盖 Windows 上的设置问题。

## 安装 Multipass

Multipass 是由 Canonical 开发的开源工具。它是一个非常轻量级的用于管理 VM 的工具,特别是基于 Ubuntu 的 VM。它足够轻便,可以在笔记本电脑上运行迷你云主机系统。

要安装 Multipass,请从[`multipass.run/`](https://multipass.run/)获取安装程序。它也可能通过软件包管理系统可用。

安装了 Multipass 后,您可以运行以下命令中的一些来尝试它:
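For example (the generated machine name will differ on your system):

```
$ multipass launch
$ multipass list
$ multipass shell <generated-name>
```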

Because we did not supply a name for the machine, Multipass created a random name. It isn't shown in the preceding snippet, but the first command included the download and setup of a VM image. The shell command starts a login shell inside the newly created VM, where you can use tools like ps or htop to see that there is indeed a full complement of processes running already.

Since one of the first things you do with a new Ubuntu install is to update the system, let's do so the Multipass way:


这按预期工作,您会看到`apt-get`首先更新其可用软件包的列表,然后要求您批准下载和安装软件包以进行更新,之后它会这样做。熟悉 Ubuntu 的人会觉得这很正常。不同之处在于从主机计算机的命令行环境中执行此操作。

这很有趣,但我们有一些工作要做,我们对 Multipass 基于野马的机器名称不满意。让我们学习如何删除 Multipass 实例:

We can easily delete a VM image with the delete command; it is then marked as Deleted. To truly remove the VM, we must use the purge command.

We've learned how to create, manage, and delete VMs using Multipass. This was a lot faster than some of the alternative technologies. With VirtualBox, for example, we would have had to find and download an ISO, then boot a VirtualBox VM instance and run the Ubuntu installer, taking a lot more time.

There might be difficulties using Multipass on Windows, so let's talk about that and how to rectify it.

Handling a failure to launch Multipass instances on Windows

The Multipass team makes their application available to run on Windows systems, but issues like the following can crop up:


它通过设置实例的所有步骤,但在最后一步,我们收到了这条消息,而不是成功。运行`multipass list`可能会显示实例处于`Running`状态,但没有分配 IP 地址,运行`multipass shell`也会导致超时。

如果在计算机上安装了 WSL2 和 Multipass,则会观察到此超时。WSL2 是 Windows 的轻量级 Linux 子系统,被称为在 Windows 上运行 Linux 命令的极佳环境。同时运行 WSL2 和 Multipass 可能会导致不希望的行为。

在本章中,WSL2 没有用。这是因为 WSL2 目前不支持安装在重启后持续存在的后台服务,因为它不支持`systemd`。请记住,我们的目标是学习设置持久的后台服务。

可能需要禁用 WSL2。要这样做,请使用 Windows 任务栏中的搜索框查找“打开或关闭 Windows 功能”控制面板。因为 WSL2 是一个功能而不是一个安装或卸载的应用程序,所以可以使用此控制面板来启用或禁用它。只需向下滚动以找到该功能,取消选中复选框,然后重新启动计算机。

Multipass 在线文档中有一个用于 Windows 的故障排除页面,其中包含一些有用的提示,网址为[`multipass.run/docs/troubleshooting-networking-on-windows`](https://multipass.run/docs/troubleshooting-networking-on-windows)。

WSL2 和 Multipass 都使用 Hyper-V。这是 Windows 的虚拟化引擎,它还支持以类似于 VirtualBox 或 VMware 的模式安装 VM。可以轻松下载 Ubuntu 或任何其他操作系统的 ISO 并在 Hyper-V 上安装它。这将导致完整的操作系统,可以在其中进行后台进程部署的实验。您可能更喜欢在 Hyper-V 内部运行这些示例。

安装了虚拟机后,本章其余大部分说明都将适用。具体来说,`install-packages.sh`脚本可用于安装完成说明所需的 Ubuntu 软件包,`configure-svc`脚本可用于将服务“部署”到`/opt/notes`和`/opt/userauth`。建议在虚拟机内部使用 Git 克隆与本书相关的存储库。最后,pm2-single 目录中的脚本可用于在 PM2 下运行 Notes 和 Users 服务。

我们的目的是学习如何在 Linux 系统上部署 Node.js 服务,而无需离开我们的笔记本电脑。为此,我们熟悉了 Multipass,因为它是管理 Ubuntu 实例的绝佳工具。我们还了解了诸如 Hyper-V 或 VirtualBox 之类的替代方案,这些替代方案也可以用于管理 Linux 实例。

让我们开始探索使用用户认证服务进行部署。

## 为用户认证服务配置服务器

由于我们希望拥有分段基础架构,并将用户认证服务放在一个隔离区域中,让我们首先尝试构建该架构。使用 Multipass,我们将创建两个服务器实例`svc-userauth`和`svc-notes`。每个实例将包含自己的 MySQL 实例和相应的基于 Node.js 的服务。在本节中,我们将设置`svc-userauth`,然后在另一节中,我们将复制该过程以设置`svc-notes`。

对于我们的 DevOps 团队,他们要求对所有管理任务进行自动化,我们将创建一些 shell 脚本来管理服务器的设置和配置。

这里显示的脚本处理了部署到两个服务器的情况,其中一个服务器保存认证服务,另一个保存*Notes*应用程序。在本书的 GitHub 存储库中,您将找到其他脚本,用于部署到单个服务器。如果您使用的是 VirtualBox 而不是 Multipass 等较重的虚拟化工具,则可能需要单个服务器方案。

在本节中,我们将创建用户认证后端服务器`svc-userauth`,在后面的部分中,我们将创建*Notes*前端的服务器`svc-notes`。由于这两个服务器实例将设置类似,我们可能会质疑为什么要设置两个服务器。这是因为我们决定的安全模型。

涉及几个步骤,包括一些用于自动化 Multipass 操作的脚本,如下所示:

1.  创建一个名为`chap10/multipass`的目录,用于管理 Multipass 实例的脚本。

1.  然后,在该目录中创建一个名为`create-svc-userauth.sh`的文件,其中包含以下内容:
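A sketch of the script; it assumes the default Ubuntu image and that the `users` directory sits alongside `multipass` in the chap10 tree:

```sh
#!/bin/sh
multipass launch --name svc-userauth
multipass mount "`pwd`" svc-userauth:/build
multipass mount "`pwd`/../users" svc-userauth:/build-users
```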

On Windows, instead create a file named create-svc-userauth.ps1 containing the following:


这两者几乎相同,只是计算当前目录的方法不同。

Multipass 中的`mount`命令将主机目录附加到给定位置的实例中。因此,我们将`multipass`目录附加为`/build`,将`users`附加为`/build-users`。

``pwd``符号是 Unix/Linux shell 环境的一个特性。它意味着运行`pwd`进程并捕获其输出,将其作为命令行参数提供给`multipass`命令。对于 Windows,我们在 PowerShell 中使用`(get-location)`来达到同样的目的。

1.  通过运行脚本创建实例:

Or, on Windows, run this:


运行脚本中的命令,将启动实例并从主机文件系统挂载目录。

1.  创建一个名为`install-packages.sh`的文件,其中包含以下内容:
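A sketch of the script; it uses the NodeSource setup script for Node.js 14.x, and the exact package list may differ from the repository version:

```sh
#!/bin/sh
curl -fsSL https://deb.nodesource.com/setup_14.x | sudo -E bash -
sudo apt-get update
sudo apt-get install -y nodejs build-essential mysql-server mysql-client
```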

This installs Node.js 14.x and sets up other packages required to run the authentication service. This includes a MySQL server instance and the MySQL client.

The Node.js documentation (nodejs.org/en/download/package-manager/) describes how to install Node.js from package managers on several OSes. This script uses the installation method recommended for Debian and Ubuntu systems because that's the OS used in the Multipass instance.

A side effect of installing the mysql-server package is that it launches a running MySQL service with a default configuration. Customizing that configuration is up to you, but for our purposes here and now, the default configuration will work.

  1. Execute this script inside the instance like so:

正如我们之前讨论的,`exec`命令会导致在主机系统上运行此命令,从而在容器内部执行命令。

1.  在`users`目录中,编辑`user-server.mjs`并更改以下内容:
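A sketch of the change, assuming the Restify `listen` call that has been in `user-server.mjs` since Chapter 8; the listen callback stays whatever it already was:

```js
// Before: server.listen(process.env.PORT, 'localhost', callback);
server.listen(process.env.PORT,
  process.env.REST_LISTEN ? process.env.REST_LISTEN : 'localhost',
  () => { /* existing listen callback, unchanged */ });
```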

Previously, we had specified a hardcoded 'localhost' here. The effect of this was that the user authentication service only accepted connections from the same computer. To implement our vision of Notes and the user authentication services running on different computers, this service must support connections from elsewhere.

This change introduces a new environment variable, REST_LISTEN, where we will declare where the server should listen for connections.

As you edit the source files, notice that the changes are immediately reflected inside the Multipass machine in the /build-users directory.

  1. Create a file called users/sequelize-mysql.yaml containing the following:

这是允许用户服务与本地 MySQL 实例连接的配置。`dbname`、`username`和`password`参数必须与之前显示的配置脚本中的值匹配。

1.  然后,在`users/package.json`文件中,将这些条目添加到`scripts`部分:

The on-server script contains the runtime configuration we'll use on the server.

  1. Next, in the users directory, run this command:

由于我们现在正在使用 MySQL,我们必须安装驱动程序包。

1.  现在创建一个名为`configure-svc-userauth.sh`的文件,其中包含以下内容:
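A sketch of the script; the database name, user, and password are placeholders that must match `users/sequelize-mysql.yaml`, and `/opt/userauth` is the deployment directory referred to later in this chapter:

```sh
#!/bin/sh
sudo mysql --user=root <<EOF
CREATE DATABASE userauth;
CREATE USER 'userauth'@'localhost' IDENTIFIED BY 'userauth';
GRANT ALL PRIVILEGES ON userauth.* TO 'userauth'@'localhost';
EOF

sudo mkdir -p /opt/userauth
sudo cp -r /build-users/* /opt/userauth
# Remove files that should not be carried over, most importantly node_modules
sudo rm -rf /opt/userauth/node_modules /opt/userauth/package-lock.json
( cd /opt/userauth && sudo npm install )
```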

This script is meant to execute inside the Ubuntu system managed by Multipass. The first section sets a user identity in the database. The second section copies the user authentication service code, from /build-users to /userauth, into the instance, followed by installing the required packages.

Since the MySQL server is already running, the mysql command will access the running server to create the database, and create the userauth user. We will use this user ID to connect with the database from the user authentication service.

But, why are some files removed before copying them into the instance? The primary goal is to delete the node_modules directory; the other files are simply unneeded. The node_modules directory contains modules that were installed on your laptop, and surely your laptop has a different OS than the Ubuntu instance running on the server? Therefore, rerunning npm install on the Ubuntu server ensures the packages are installed correctly.

  1. Run the configure-svc-userauth script like so:

请记住源代码中的`multipass`目录被挂载到实例内部作为`/build`。一旦我们创建了这个文件,它就会出现在`/build`目录中,我们可以在实例内部执行它。

在本书中,我们已经多次谈到了明确声明所有依赖关系和自动化一切的价值。这证明了这个价值,因为现在,我们只需运行几个 shell 脚本,服务器就配置好了。而且我们不必记住如何启动服务器,因为`package.json`中的`scripts`部分。

1.  现在我们可以启动用户认证服务器,就像这样:
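Roughly as follows:

```
$ multipass shell svc-userauth
ubuntu@svc-userauth:~$ cd /opt/userauth
ubuntu@svc-userauth:/opt/userauth$ npm run on-server
```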

Notice that our notation is to use $ to represent a command typed on the host computer, and ubuntu@svc-userauth:~$ to represent a command typed inside the instance. This is meant to help you understand where the commands are to be executed.

In this case, we've logged into the instance, changed directory to /opt/userauth, and started the server using the corresponding npm script.

Testing the deployed user authentication service

Our next step at this point is to test the service. We created a script, cli.mjs, for that purpose. In the past, we ran this script on the same computer where the authentication service was running. But this time, we want to ensure the ability to access the service remotely.

Notice that the URL printed is http://[::]:5858. This is shorthand for listening to connections from any IP address.

On our laptop, we can see the following:


Multipass 为实例分配了一个 IP 地址。您的 IP 地址可能会有所不同。

在我们的笔记本电脑上有源代码的副本,包括`cli.mjs`的副本。这意味着我们可以在笔记本电脑上运行`cli.mjs`,告诉它访问`svc-userauth`上的服务。这是因为我们提前考虑并添加了`--host`和`--port`选项到`cli.mjs`。理论上,使用这些选项,我们可以在互联网上的任何地方访问这个服务器。目前,我们只需要在笔记本电脑的虚拟环境中进行访问。

在您的笔记本电脑上,而不是在 Multipass 内部的常规命令环境中,运行这些命令:

Make sure to specify the correct host IP address and port number.

If you remember, the script retrieves the newly created user entry and prints it out. But we need to verify this and can do so using the list-users command. But let's do something a little different, and learn how to access the database server.

In another command window on your laptop, type these commands:


这显示了我们创建的用户的数据库条目。请注意,当登录到 Multipass 实例时,我们可以使用任何 Ubuntu 命令,因为我们面前有完整的操作系统。

我们不仅在 Ubuntu 服务器上启动了用户认证服务,而且还验证了我们可以从服务器外部访问该服务。

在本节中,我们设置了我们想要运行的两个服务器中的第一个。我们仍然需要创建`svc-notes`服务器。

但在此之前,我们首先需要讨论在 Windows 上运行脚本。

## 在 Windows 上使用 PowerShell 执行脚本

在本章中,我们将编写几个 shell 脚本。其中一些脚本需要在您的笔记本电脑上运行,而不是在 Ubuntu 托管的服务器上运行。一些开发人员使用 Windows,因此我们需要讨论在 PowerShell 上运行脚本。

在 Windows 上执行脚本是不同的,因为它使用 PowerShell 而不是 Bash,还有许多其他考虑因素。对于这个和接下来的脚本,做出以下更改。

PowerShell 脚本文件名必须以`.ps1`扩展名结尾。对于大多数这些脚本,所需的只是将`.sh`脚本复制为`.ps1`文件,因为脚本非常简单。要执行脚本,只需在 PowerShell 窗口中键入`.\scriptname.ps1`。换句话说,在 Windows 上,刚才显示的脚本必须命名为`configure-svc-userauth.ps1`,并且以`.\configure-svc-userauth.ps1`执行。

要执行这些脚本,您可能需要更改 PowerShell 执行策略:

Obviously, there are security considerations with this change, so change the execution policy back when you're done.

A simpler method on Windows is to simply paste these commands into a PowerShell window.

It was useful to discuss script execution on PowerShell. Let's return to the task at hand, which is provisioning the Notes stack on Ubuntu. Since we have a functioning user authentication service, the remaining task is the Notes service.

Provisioning a server for the Notes service

So far, we have set up the user authentication service on Multipass. Of course, to have the full Notes application stack running, the Notes service must also be running. So let's take care of that now.

The first server, svc-userauth, is running the user authentication service. Of course, the second server will be called svc-notes, and will run the Notes service. What we'll do is very similar to how we set up svc-userauth.

There are several tasks in the multipass directory to prepare this second server. As we did with the svc-userauth server, here, we set up the svc-notes server by installing and configuring required Ubuntu packages, then set up the Notes application:

  1. Create a script named multipass/create-svc-notes.sh containing the following:

这个任务是启动 Multipass 实例,并且与`create-svc-userauth`非常相似,但是更改为使用单词`notes`。

对于 Windows,创建一个名为`multipass/create-svc-notes.ps1`的文件,其中包含以下内容:

This is the same as before, but using (get-location) this time.

  1. Create the instance by running the script as follows:

或者,在 Windows 上,运行以下命令:

Either one runs the commands in the scripts that will launch the instance and mount directories from the host filesystem.

  1. Install the required packages like so:

此脚本安装了 Node.js、MySQL 服务器和其他一些必需的软件包。

1.  现在创建一个文件,`notes/models/sequelize-mysql.yaml`,其中包含以下内容:

This is the database name, username, and password credentials for the database configured previously.

  1. Because we are now using MySQL, run this command:

我们需要 MySQL 驱动程序包来使用 MySQL。

1.  然后,在`notes/package.json`文件中,将此条目添加到`scripts`部分:

This uses the new database configuration for the MySQL server and the IP address for the user authentication service. Make sure that the IP address matches what Multipass assigned to svc-userauth.

You'll, of course, get the IP address in the following way:


`on-server`脚本将需要相应地更新。

1.  复制`multipass/configure-svc-userauth.sh`以创建一个名为`multipass/configure-svc-notes.sh`的脚本,并将最后两个部分更改为以下内容:

This is also similar to what we did for svc-userauth. This also changes things to use the word notes where we used userauth before.

Something not explicitly covered here is ensuring the .env file you created to hold Twitter secrets is deployed to this server. We suggested ensuring this file is not committed to a source repository. That means you'll be handling it semi-manually perhaps, or you'll have to use some developer ingenuity to create a process for managing this file securely.

  1. Run the configure-svc-notes script like so:

请记住,源树中的`multipass`目录被挂载到实例内部作为`/build`。一旦我们创建了这个文件,它就会出现在`/build`目录中,并且我们可以在实例内部执行它。

1.  现在可以使用以下命令运行 Notes 服务:

As with `svc-userauth`, we shell into the server, change the directory to `/opt/notes`, and run the `on-server` script. If you want Notes to be visible on port `80`, simply change the `PORT` environment variable. After that, the URL in the `TWITTER_CALLBACK_HOST` variable must contain the port number on which Notes is listening. For that to work, the `on-server` script needs to run as `root`, so we will run the following:


更改是使用`sudo`以`root`身份执行命令。

为了测试这一点,我们当然需要使用浏览器连接到 Notes 服务。为此,我们需要使用`svc-notes`的 IP 地址,这是我们之前从 Multipass 学到的。使用这个例子,URL 是`http://172.23.89.142:3000`。

您会发现,由于我们在外观和感觉类别中没有改变任何内容,我们的*Notes*应用程序看起来一直都是这样。从功能上讲,您将无法使用 Twitter 凭据登录,但可以使用我们在测试期间创建的本地帐户之一登录。

一旦两个服务都在运行,您可以使用浏览器与*Notes*应用程序进行交互,并通过其功能运行它。

我们已经构建了两个服务器,`svc-userauth`和`svc-notes`,在这两个服务器上运行 Notes 应用程序堆栈。这给了我们两个 Ubuntu 实例,每个实例都配置了数据库和 Node.js 服务。我们能够手动运行身份验证和 Notes 服务,并从一个 Ubuntu 实例连接到另一个 Ubuntu 实例,每个实例都与其相应的数据库一起工作。要将其作为完全部署的服务器,我们将在后面的部分中使用 PM2。

我们已经学到了一些关于配置 Ubuntu 服务器的知识,尽管运行服务作为后台进程仍然存在问题。在解决这个问题之前,让我们纠正一下 Twitter 登录功能的情况。Twitter 登录的问题在于应用现在位于不同的 IP 地址,因此为了解决这个问题,我们现在必须在 Twitter 的管理后端中添加该 IP 地址。

# 调整 Twitter 身份验证以在服务器上工作

正如我们刚才指出的,当前部署的*Notes*应用程序不支持基于 Twitter 的登录。任何尝试都会导致错误。显然,我们不能这样部署它。

我们之前为*Notes*设置的 Twitter 应用程序将无法工作,因为引用我们笔记本电脑的身份验证 URL 对于服务器来说是不正确的。要使 OAuth 在这个新服务器上与 Twitter 一起工作,请转到`developer.twitter.com/en/apps`并重新配置应用程序以使用服务器的 IP 地址。

该页面是您已在 Twitter 注册的应用程序的仪表板。单击`Details`按钮,您将看到配置的详细信息。单击`Edit`按钮,编辑回调 URL 的列表如下:

![](https://gitee.com/OpenDocCN/freelearn-node-zh/raw/master/docs/node-webdev-5e/img/100f13e0-7055-4914-b2b6-19352f3bc230.png)

当然,您必须替换服务器的 IP 地址。如果您的 Multipass 实例被分配了 IP 地址`192.168.64.9`,则此处显示的 URL 是正确的。这将通知 Twitter 使用一个新的正确的回调 URL。同样,如果您已经配置*Notes*监听端口`80`,那么您指向 Twitter 的 URL 也必须使用端口`80`。您必须为将来使用的任何回调 URL 更新此列表。

接下来要做的是更改*Notes*应用程序,以便在`svc-notes`服务器上使用这个新的回调 URL。在`routes/users.mjs`中,默认值是`http://localhost:3000`,用于我们的笔记本电脑。但是现在我们需要使用服务器的 IP 地址。幸运的是,我们事先考虑到了这一点,软件有一个环境变量来实现这个目的。在`notes/package.json`中,将以下环境变量添加到`on-server`脚本中:

Use the actual IP address or domain name assigned to the server being used. In a real deployment, we'll have a domain name to use here.

Additionally, to enable Twitter login support, it is required to supply Twitter authentication tokens in the environment variables:


这不应该添加在`package.json`中,而应通过其他方式提供。我们还没有找到合适的方法,但我们确实发现将这些变量添加到`package.json`中意味着将它们提交到源代码存储库,这可能会导致这些值泄漏给公众。

目前,服务器可以这样启动:

This is still a semi-manual process of starting the server and specifying the Twitter keys, but you'll be able to log in using Twitter credentials. Keep in mind that we still need a solution for this that avoids committing these keys to a source repository.

The last thing for us to take care of is ensuring the two service processes restart when the respective servers restart. Right now, the services are running at the command line. If we ran multipass restart, the service instances will reboot and the service processes won't be running.

In the next section, we'll learn one way to configure a background process that reliably starts when a computer is booted.

Setting up PM2 to manage Node.js processes

We have two servers, svc-notes and svc-userauth, configured so we can run the two services making up the Notes application stack. A big task remaining is to ensure the Node.js processes are properly installed as background processes.

To see the problem, start another command window and run these commands:


服务器实例正在 Multipass 下运行,`restart`命令导致命名实例`stop`,然后`start`。这模拟了服务器的重启。由于两者都在前台运行,您将看到每个命令窗口退出到主机命令 shell,并且再次运行`multipass list`将显示两个实例处于`Running`状态。最重要的是,两个服务都不再运行。

有许多方法可以管理服务器进程,以确保在进程崩溃时重新启动等。我们将使用**PM2**([`pm2.keymetrics.io/`](http://pm2.keymetrics.io/)),因为它针对 Node.js 进程进行了优化。它将进程管理和监控捆绑到一个应用程序中。

现在让我们看看如何使用 PM2 来正确地管理 Notes 和用户身份验证服务作为后台进程。我们将首先熟悉 PM2,然后创建脚本来使用 PM2 来管理服务,最后,我们将看到如何将其与操作系统集成,以便正确地将服务作为后台进程进行管理。

## 熟悉 PM2

为了熟悉 PM2,让我们使用`svc-userauth`服务器设置一个测试。我们将创建一个目录来保存`pm2-userauth`项目,在该目录中安装 PM2,然后使用它来启动用户身份验证服务。在此过程中,我们将学习如何使用 PM2。

首先在`svc-userauth`服务器上运行以下命令:

The result of these commands is an npm project directory containing the PM2 program and a package.json file that we can potentially use to record some scripts.

Now let's start the user authentication server using PM2:
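A sketch of the command; `PORT` and `REST_LISTEN` match the variables introduced earlier, while the Sequelize configuration variable name is an assumption carried over from the `on-server` script:

```
ubuntu@svc-userauth:~$ cd /opt/userauth
ubuntu@svc-userauth:/opt/userauth$ PORT=5858 REST_LISTEN=0.0.0.0 \
    SEQUELIZE_CONNECT=sequelize-mysql.yaml \
    /home/ubuntu/pm2-userauth/node_modules/.bin/pm2 start ./user-server.mjs
```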


这归结为运行`pm2 start ./user-server.mjs`,只是我们添加了包含配置值的环境变量,并且指定了 PM2 的完整路径。这样可以在后台运行我们的用户服务器。

我们可以重复使用`cli.mjs`来列出已知的身份验证服务器用户的测试:

Since we had previously launched this service and tested it, there should be user IDs already in the authentication server database. The server is running, but because it's not in the foreground, we cannot see the output. Try this command:


因为 PM2 捕获了服务器进程的标准输出,任何输出都被保存起来。`logs`命令让我们查看那些输出。

其他一些有用的命令如下:

+   `pm2 status`:列出 PM2 当前正在管理的所有命令及其状态

+   `pm2 stop SERVICE`:停止命名服务

+   `pm2 start SERVICE`或`pm2 restart SERVICE`:启动命名服务

+   `pm2 delete SERVICE`:使 PM2 忘记命名服务

还有其他几个命令,PM2 网站包含了完整的文档。[`pm2.keymetrics.io/docs/usage/pm2-doc-single-page/`](https://pm2.keymetrics.io/docs/usage/pm2-doc-single-page/)

暂时,让我们关闭它并删除受管进程:

We have familiarized ourselves with PM2, but this setup is not quite suitable for any kind of deployment. Let's instead set up scripts that will manage the Notes services under PM2 more cleanly.

Scripting the PM2 setup on Multipass

We have two Ubuntu systems onto which we've copied the Notes and user authentication services, and also configured a MySQL server for each machine. On these systems, we've manually run the services and know that they work, and now it's time to use PM2 to manage these services as persistent background processes.

With PM2 we can create a file, ecosystem.json, to describe precisely how to launch the processes. Then, with a pair of PM2 commands, we can integrate the process setup so it automatically starts as a background process.

Let's start by creating two directories, multipass/pm2-notes and multipass/pm2-userauth. These will hold the scripts for the corresponding servers.

In pm2-notes, create a file, package.json, containing the following:


这为我们记录了对 PM2 的依赖,因此可以轻松安装它,以及一些有用的脚本可以在 PM2 上运行。

然后在同一目录中,创建一个包含以下内容的`ecosystem.json`文件:
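A sketch of `ecosystem.json`; the environment variable names mirror the `on-server` script in `notes/package.json`, and both IP addresses are placeholders that must be replaced with the addresses `multipass list` reports for `svc-userauth` and `svc-notes`:

```json
{
  "apps": [
    {
      "name": "Notes",
      "cwd": "/opt/notes",
      "script": "./app.mjs",
      "env": {
        "PORT": "80",
        "NOTES_MODEL": "sequelize",
        "SEQUELIZE_CONNECT": "models/sequelize-mysql.yaml",
        "USER_SERVICE_URL": "http://192.168.64.8:5858",
        "TWITTER_CALLBACK_HOST": "http://192.168.64.9:80"
      }
    }
  ]
}
```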

The ecosystem.json file is how we describe a process to be monitored to PM2.

In this case, we've described a single process, called Notes. The cwd value declares where the code for this process lives, and the script value describes which script to run to launch the service. The env value is a list of environment variables to set.

This is where we would specify the Twitter authentication tokens. But since this file is likely to be committed to a source repository, we shouldn't do so. Instead, we'll forego Twitter login functionality for the time being.

The USER_SERVICE_URL and TWITTER_CALLBACK_HOST variables are set according to the multipass list output we showed earlier. These values will, of course, vary based on what was selected by your host system.

These environment variables are the same as we set in notes/package.json – except, notice that we've set PORT to 80 so that it runs on the normal HTTP port. To successfully specify port 80, PM2 must execute as root.

In `pm2-userauth`, create a file named `package.json` containing the following:


这与`pm2-notes`相同,只是名称不同。

然后,在`pm2-userauth`中,创建一个名为`ecosystem.json`的文件,其中包含以下内容:

This describes the user authentication service. On the server, it is stored in the /userauth directory and is launched using the user-server.mjs script, with that set of environment variables.

Next, on both servers create a directory called /opt/pm2. Copy the files in pm2-notes to the /opt/pm2 directory on svc-notes, and copy the files in pm2-userauth to the /opt/pm2 directory on svc-userauth.

On both svc-notes and svc-userauth, you can run these commands:


这样做会启动两个服务器实例上的服务。 `npm run logs` 命令让我们可以实时查看日志输出。我们已经在更符合 DevOps 的日志配置中配置了两个服务,没有启用 DEBUG 日志,并且使用了*common*日志格式。

对于测试,我们访问与之前相同的 URL,但是端口改为`80`而不是`3000`。

因为`svc-notes`上的 Notes 服务现在在端口`80`上运行,我们需要再次更新 Twitter 应用程序的配置,如下所示:

![](https://gitee.com/OpenDocCN/freelearn-node-zh/raw/master/docs/node-webdev-5e/img/86efbfc1-e3a4-402c-8cce-22a36e1d88da.png)

这将从服务器的 URL 中删除端口`3000`。应用程序不再在端口`3000`上运行,而是在端口`80`上运行,我们需要告诉 Twitter 这个变化。

## 将 PM2 设置集成为持久后台进程

*Notes*应用程序应该完全正常运行。还有一个小任务要完成,那就是将其与操作系统集成。

在类 Unix 系统上的传统方法是在`/etc`目录中的一个目录中添加一个 shell 脚本。Linux 社区为此目的定义了 LSB Init Script 格式,但由于每个操作系统对于管理后台进程的脚本有不同的标准,PM2 有一个命令可以为每个操作系统生成正确的脚本。

让我们从`svc-userauth`开始,运行这些命令:
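That is, using the scripts defined in `/opt/pm2/package.json`:

```
ubuntu@svc-userauth:/opt/pm2$ sudo npm run save
ubuntu@svc-userauth:/opt/pm2$ sudo npm run startup
```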

With npm run save, we run the pm2 save command. This command saves the current configuration into a file in your home directory.

With npm run startup, we run the pm2 startup command. This converts the saved current configuration into a script for the current OS that will manage the PM2 system. PM2, in turn, manages the set of processes you've configured with PM2.

In this case, it identified the presence of the systemd init system, which is the standard for Ubuntu. It generated a file, /etc/systemd/system/pm2-root.service, that tells Ubuntu about PM2. In amongst the output, it tells us how to use systemctl to start and stop the PM2 service.

Do the same on svc-notes to implement the background service there as well.

And now we can test restarting the two servers with the following commands:


机器应该能够正确重启,并且在我们不进行干预的情况下,服务将会运行。您应该能够对*Notes*应用程序进行测试,并查看它是否正常工作。此时 Twitter 登录功能将无法使用,因为我们没有提供 Twitter 令牌。

在每台服务器上运行这个命令尤其有益:

The monit command starts a monitoring console showing some statistics including CPU and memory use, as well as logging output.

When done, run the following command:


当然,这将关闭服务实例。由于我们所做的工作,您随时可以重新启动它们。

在这一部分,我们学到了很多关于将*Notes*应用程序配置为受管后台进程的知识。通过一系列 shell 脚本和配置文件,我们组建了一个系统,使用 PM2 来管理这些服务作为后台进程。通过编写我们自己的脚本,我们更清楚地了解了底层的工作原理。

有了这些,我们就可以结束本章了。

# 总结

在本章中,我们开始了解将 Node.js 服务部署到生产服务器的过程。目标是学习部署到云托管,但为了达到这个目标,我们学习了在 Linux 系统上获得可靠后台进程的基础知识。

我们首先回顾了 Notes 应用程序的架构,并看到这将如何影响部署。这使我们能够了解服务器部署的要求。

然后我们学习了在 Linux 上使用 init 脚本部署服务的传统方法。为此,我们学习了如何使用 PM2 来管理进程,并将其集成为持久后台进程。PM2 是 Unix/Linux 系统上管理后台进程的有用工具。部署和管理持久性是任何开发 Web 应用程序的关键技能。

虽然这是在您的笔记本电脑上执行的,但完全相同的步骤可以在公共服务器上执行,比如从 Web 托管公司租用的 VPS。通过一点工作,我们可以使用这些脚本在公共 VPS 上设置一个测试服务器。我们需要更好的自动化工作,因为 DevOps 团队需要完全自动化的部署。

即使在云托管平台的时代,许多组织仍然使用我们在本章讨论的相同技术部署服务。他们不使用基于云的部署,而是租用一个或几个 VPS。但即使在使用 Docker、Kubernetes 等云基部署时,开发人员也必须知道如何在类 Unix 系统上实现持久服务。Docker 容器通常是 Linux 环境,必须包含可靠的持久后台任务,这些任务是可观察和可维护的。

在下一章中,我们将转向不同的部署技术:Docker。Docker 是一种流行的系统,用于将应用程序代码打包在一个*容器*中,在我们的笔记本电脑上执行,或者在云托管平台上按比例执行而不改变。


使用 Docker 部署 Node.js 微服务

现在我们已经体验了传统的 Linux 部署应用程序的方式,让我们转向 Docker,这是一种流行的新的应用程序部署方式。

Docker(http://docker.com)是软件行业中一个很酷的新工具。它被描述为*面向开发人员和系统管理员的分布式应用程序的开放平台*。它是围绕 Linux 容器化技术设计的,并专注于描述在任何 Linux 变体上的软件配置。

Docker 容器是 Docker 镜像的运行实例。Docker 镜像是一个包含特定 Linux 操作系统、系统配置和应用程序配置的捆绑包。Docker 镜像使用 Dockerfile 来描述,这是一个相当简单的编写脚本,描述如何构建 Docker 镜像。Dockerfile 首先通过指定一个基础镜像来开始构建,这意味着我们从其他镜像派生 Docker 镜像。Dockerfile 的其余部分描述了要添加到镜像中的文件,要运行的命令以构建或配置镜像,要公开的网络端口,要在镜像中挂载的目录等等。

Docker 镜像存储在 Docker 注册服务器上,每个镜像存储在自己的存储库中。最大的注册表是 Docker Hub,但也有第三方注册表可用,包括您可以安装在自己硬件上的注册服务器。Docker 镜像可以上传到存储库,并且可以从存储库部署到任何 Docker 服务器。

我们实例化一个 Docker 镜像来启动一个 Docker 容器。通常,启动容器非常快速,而且通常情况下,容器会在短时间内实例化,然后在不再需要时被丢弃。

运行的容器感觉像是在虚拟机上运行的虚拟服务器。然而,Docker 容器化与诸如 VirtualBox 或 Multipass 之类的虚拟机系统非常不同。容器不是完整计算机的虚拟化。相反,它是一个极其轻量级的外壳,创建了已安装操作系统的外观。例如,容器内运行的进程实际上是在主机操作系统上运行的,使用某些 Linux 技术(cgroups、内核命名空间等)创建了运行特定 Linux 变体的幻觉。您的主机操作系统可以是 Ubuntu,容器操作系统可以是 Fedora 或 OpenSUSE,甚至是 Windows;Docker 使所有这些都能运行。

虽然 Docker 主要针对 x86 版本的 Linux,但它也适用于几种基于 ARM 的操作系统,以及其他处理器。甚至可以在单板计算机上运行 Docker,比如树莓派,用于面向硬件的物联网(IoT)项目。

Docker 生态系统包含许多工具,它们的数量正在迅速增加。对于我们的目的,我们将专注于以下两个工具:

+   **Docker 引擎**:这是协调一切的核心执行系统。它在 Linux 主机系统上运行,公开一个基于网络的 API,客户端应用程序使用它来进行 Docker 请求,比如构建、部署和运行容器。

+   **Docker Compose**:这有助于您在一个文件中定义一个多容器应用程序及其所有定义的依赖关系。

还有其他与 Docker 密切相关的工具,比如 Kubernetes,但一切都始于构建一个容器来容纳您的应用程序。通过学习 Docker,我们学会了如何将应用程序容器化,这是我们可以在 Docker 和 Kubernetes 中使用的技能。

学习如何使用 Docker 是学习其他流行系统的入门,比如 Kubernetes 或 AWS ECS。这两个是用于在云托管基础设施上大规模管理容器部署的流行编排系统。通常,容器是 Docker 容器,但它们是由其他系统部署和管理的,无论是 Kubernetes、ECS 还是 Mesos。这使得学习如何使用 Docker 成为学习这些其他系统的绝佳起点。

在本章中,我们将涵盖以下主题:

+   在我们的笔记本电脑上安装 Docker

+   开发我们自己的 Docker 容器并使用第三方容器

+   在 Docker 中设置用户认证服务及其数据库

+   在 Docker 中设置 Notes 服务及其数据库

+   在 Docker 中部署 MySQL 实例,并为 Docker 中的应用程序提供数据持久性,例如数据库

+   使用 Docker Compose 描述完整应用程序的 Docker 部署

+   在 Docker 基础设施中扩展容器实例并使用 Redis 来缓解扩展问题

第一项任务是复制上一章的源代码。建议您创建一个新目录`chap11`,作为`chap10`目录的兄弟目录,并将`chap10`中的所有内容复制到`chap11`中。

在本章结束时,您将对使用 Docker、创建 Docker 容器以及使用 Docker Compose 管理 Notes 应用程序所需的服务有扎实的基础。

借助 Docker,我们将在笔记本电脑上设计第十章中显示的系统,*将 Node.js 应用程序部署到 Linux 服务器*。这一章,以及第十二章,*使用 Terraform 在 AWS EC2 上部署 Docker Swarm*,形成了一个覆盖 Node.js 三种部署风格的弧线。

# 第十五章:在您的笔记本电脑或计算机上设置 Docker

学习如何在笔记本电脑上安装 Docker 的最佳地方是 Docker 文档。我们要找的是 Docker **Community Edition**(CE),这就是我们所需要的:

+   macOS 安装:[`docs.docker.com/docker-for-mac/install/`](https://docs.docker.com/docker-for-mac/install/)

+   Windows 安装:[`docs.docker.com/docker-for-windows/install/`](https://docs.docker.com/docker-for-windows/install/)

+   Ubuntu 安装:[`docs.docker.com/install/linux/docker-ce/ubuntu/`](https://docs.docker.com/install/linux/docker-ce/ubuntu/)

还有其他几种发行版的安装说明。一些有用的 Linux 后安装说明可在[`docs.docker.com/install/linux/linux-postinstall/`](https://docs.docker.com/install/linux/linux-postinstall/)找到。

Docker 在 Linux 上本地运行,安装只是 Docker 守护程序和命令行工具。要在 macOS 或 Windows 上运行 Docker,您需要安装 Docker for Windows 或 Docker for Mac 应用程序。这些应用程序在轻量级虚拟机中管理一个虚拟 Linux 环境,在其中运行着一个在 Linux 上运行的 Docker Engine 实例。在过去(几年前),我们不得不手工设置这个环境。必须感谢 Docker 团队,他们使得这一切像安装应用程序一样简单,所有复杂性都被隐藏起来。结果非常轻量级,Docker 容器可以在后台运行而几乎不会产生影响。

现在让我们学习如何在 Windows 或 macOS 机器上安装 Docker。

## 使用 Docker for Windows 或 macOS 安装和启动 Docker

Docker 团队使得在 Windows 或 macOS 上安装 Docker 变得非常简单。您只需下载安装程序,并像大多数其他应用程序一样运行安装程序。它会负责安装并为您提供一个应用程序图标,用于启动 Docker。在 Linux 上,安装稍微复杂一些,因此最好阅读并遵循官方说明。

在 Windows 或 macOS 上启动 Docker 非常简单,一旦您遵循了安装说明。您只需找到并双击应用程序图标。有可用的设置,使得 Docker 在每次启动笔记本电脑时自动启动。

在 Docker for Windows 和 Docker for Mac 上,CPU 必须支持**虚拟化**。Docker for Windows 和 Docker for Mac 中内置了一个超轻量级的 hypervisor,而这又需要 CPU 的虚拟化支持。

对于 Windows,这可能需要 BIOS 配置。有关更多信息,请参阅[`docs.docker.com/docker-for-windows/troubleshoot/#virtualization-must-be-enabled`](https://docs.docker.com/docker-for-windows/troubleshoot/#virtualization-must-be-enabled)。

对于 macOS,这需要 2010 年或之后的硬件,具有英特尔对**内存管理单元**(**MMU**)虚拟化的硬件支持,包括**扩展页表**(**EPTs**)和无限制模式。您可以通过运行`sysctl kern.hv_support`来检查此支持。还需要 macOS 10.11 或更高版本。

安装完软件后,让我们尝试并熟悉 Docker。

## 熟悉 Docker

完成设置后,我们可以使用本地 Docker 实例创建 Docker 容器,运行一些命令,并且通常学习如何使用它。

就像许多软件之旅一样,这一切都始于“Hello World”:
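
(命令如下,也就是接下来正文中提到的 `docker run hello-world`:)

```bash
$ docker run hello-world
```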

The docker run command downloads a Docker image, named on the command line, initializes a Docker container from that image, and then runs that container. In this case, the image, named hello-world, was not present on the local computer and had to be downloaded and initialized. Once that was done, the hello-world container was executed and it printed out these instructions.

The docker run hello-world command is a quick way to verify that Docker is installed correctly.

Let's follow the suggestion and start an Ubuntu container:
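
(A minimal form of that command, matching the description below—`-it` for an interactive terminal, `bash` as the command to run inside the container:)

```bash
$ docker run -it ubuntu bash
```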


“无法找到镜像”这个短语意味着 Docker 尚未下载命名的镜像。因此,它不仅下载了 Ubuntu 镜像,还下载了它所依赖的镜像。任何 Docker 镜像都可以分层构建,这意味着我们总是根据基础镜像定义镜像。在这种情况下,我们看到 Ubuntu 镜像总共需要四层。

镜像由 SHA-256 哈希标识,并且有长格式标识符和短格式标识符。我们可以在此输出中看到长标识符和短标识符。

`docker run`命令会下载镜像,完成其运行配置,然后执行该镜像。`-it`标志表示在终端中以交互方式运行该镜像。

在`docker run`命令行中,镜像名称后面的部分会作为命令参数传递到容器中执行。在这种情况下,该参数表示要运行`bash`,即默认的命令 shell。事实上,我们得到了一个命令提示符,可以在其中运行 Linux 命令。

您可以查询您的计算机,看到`hello-world`容器已经执行并完成,但它仍然存在:

![](https://gitee.com/OpenDocCN/freelearn-node-zh/raw/master/docs/node-webdev-5e/img/fc4a87f1-2ca6-44b6-aa62-a8b206cf9a98.png)

`docker ps`命令列出正在运行的 Docker 容器。正如我们在这里看到的,`hello-world`容器不再运行,但 Ubuntu 容器在运行。使用`-a`开关,`docker ps`还会显示那些存在但当前未运行的容器。

最后一列是容器名称。由于在启动容器时我们没有指定容器名称,Docker 为我们创建了一个半随机的名称。

使用容器后,您可以使用以下命令进行清理:

The clever_napier name is the container name automatically generated by Docker. While the image name was hello-world, that was not the container name. Docker generated the container name so that you have a more user-friendly identifier for the containers than the hex ID shown in the CONTAINER ID column:


也可以指定十六进制 ID。但是,相对于十六进制 ID,为容器指定一个名称当然更加用户友好。在创建容器时,可以轻松地指定任何您喜欢的容器名称。

我们已经在笔记本电脑或计算机上安装了 Docker,并尝试了一些简单的命令来熟悉 Docker。现在让我们开始一些工作。我们将首先在 Docker 容器中设置用户认证服务。

# 在 Docker 中设置用户认证服务

在我们的脑海中有这么多理论,现在是时候做一些实际的事情了。让我们首先设置用户认证服务。我们将称之为 AuthNet,并且它包括一个用于存储用户数据库的 MySQL 实例,认证服务器和一个私有子网来连接它们。

最好让每个容器专注于提供一个服务。每个容器提供一个服务是一个有用的架构决策,因为我们可以专注于为特定目的优化每个容器。另一个理由与扩展有关,因为每个服务有不同的要求来满足其提供的流量。在我们的情况下,根据流量负载,我们可能需要一个单独的 MySQL 实例和 10 个用户认证实例。

Docker Hub([`hub.docker.com`](https://hub.docker.com))上有大量预定义的 Docker 镜像库。最好重用其中一个镜像作为构建我们所需服务的起点。

Docker 环境不仅让我们定义和实例化 Docker 容器,还可以定义容器之间的网络连接。这就是我们之前所说的*私有子网*。通过 Docker,我们不仅可以管理容器,还可以配置子网、数据存储服务等等。

在接下来的几节中,我们将仔细地将用户认证服务基础架构 docker 化。我们将学习如何为 Docker 设置一个 MySQL 容器,并在 Docker 中启动一个 Node.js 服务。

让我们首先学习如何在 Docker 中启动一个 MySQL 容器。

## 在 Docker 中启动一个 MySQL 容器

在公开可用的 Docker 镜像中,有超过 11,000 个适用于 MySQL 的镜像。幸运的是,MySQL 团队提供的`mysql/mysql-server`镜像易于使用和配置,所以让我们使用它。

可以指定 Docker 镜像名称,以及通常是软件版本号的*标签*。在这种情况下,我们将使用`mysql/mysql-server:8.0`,其中`mysql/mysql-server`是镜像存储库 URL,`mysql-server`是镜像名称,`8.0`是标签。截至撰写本文时,MySQL 8.x 版本是当前版本。与许多项目一样,MySQL 项目使用版本号标记 Docker 镜像。

按照以下方式下载镜像:
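
(下载命令如下,镜像名称与标签即正文所述的 `mysql/mysql-server:8.0`:)

```bash
$ docker pull mysql/mysql-server:8.0
```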

The docker pull command retrieves an image from a Docker repository and is conceptually similar to the git pull command, which retrieves changes from a git repository.

This downloaded four image layers in total because this image is built on top of three other images. We'll see later how that works when we learn how to build a Dockerfile.

We can query which images are stored on our laptop with the following command:


目前有两个可用的镜像——我们刚刚下载的`mysql-server`镜像和之前运行的`hello-world`镜像。

我们可以使用以下命令删除不需要的镜像:

Notice that the actual delete operation works with the SHA256 image identifier.

A container can be launched with the image, as follows:
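
(A sketch of that command, matching the description that follows—the root password value is just a placeholder:)

```bash
$ docker run --name mysql -it \
    --env MYSQL_ROOT_PASSWORD=<your-root-password> \
    mysql/mysql-server:8.0
```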


`docker run`命令接受一个镜像名称,以及各种参数,并将其作为运行中的容器启动。

我们在前台启动了这项服务,当 MySQL 初始化其容器时,会有大量的输出。由于`--name`选项,容器的名称是`mysql`。通过环境变量,我们告诉容器初始化`root`密码。

既然我们有一个运行中的服务器,让我们使用 MySQL CLI 来确保它实际上正在运行。在另一个窗口中,我们可以在容器内运行 MySQL 客户端,如下所示:
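
(示例命令如下:`mysql` 是前面给容器起的名字,其后的 `mysql -u root -p` 在容器内部启动 MySQL 客户端:)

```bash
$ docker exec -it mysql mysql -u root -p
```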

The docker exec command lets you run programs inside the container. The -it option says the command is run interactively on an assigned terminal. In this case, we used the mysql command to run the MySQL client so that we could interact with the database. Substitute bash for mysql, and you will land in an interactive bash command shell.

This mysql command instance is running inside the container. The container is configured by default to not expose any external ports, and it has a default my.cnf file.

Docker containers are meant to be ephemeral, created and destroyed as needed, while databases are meant to be permanent, with lifetimes sometimes measured in decades. A very important discussion on this point and how it applies to database containers is presented in the next section.

It is cool that we can easily install and launch a MySQL instance. However, there are several considerations to be made:

  • Access to the database from other software, specifically from another container
  • Storing the database files outside the container for a longer lifespan
  • Custom configuration, because database admins love to tweak the settings
  • We need a path to connect the MySQL container to the AuthNet network that we'll be creating

Before proceeding, let's clean up. In a terminal window, type the following:


这关闭并清理了我们创建的容器。重申之前提到的观点,容器中的数据库已经消失了。如果那个数据库包含重要信息,你刚刚丢失了它,没有机会恢复数据。

在继续之前,让我们讨论一下这对我们服务设计的影响。

## Docker 容器的短暂性

Docker 容器被设计为易于创建和销毁。在试验过程中,我们已经创建并销毁了三个容器。

在过去(几年前),设置数据库需要提供特别配置的硬件,雇佣具有特殊技能的数据库管理员,并仔细地为预期的工作负载进行优化。在短短几段文字中,我们已经实例化和销毁了三个数据库实例。这是多么崭新的世界啊!

在数据库和 Docker 容器方面,数据库相对是永恒的,而 Docker 容器是短暂的。数据库预计会持续数年,甚至数十年。在计算机年代,那几乎是不朽的。相比之下,一个被使用后立即丢弃的 Docker 容器只是与数据库预期寿命相比的短暂时间。

这些容器可以快速创建和销毁,这给了我们很大的灵活性。例如,编排系统,如 Kubernetes 或 AWS ECS,可以自动增加或减少容器的数量以匹配流量,重新启动崩溃的容器等等。

但是数据库容器中的数据存放在哪里?在前一节中运行的命令中,数据库数据目录位于容器内部。当容器被销毁时,数据目录也被销毁,我们数据库中的任何数据都被永久删除。显然,这与我们在数据库中存储的数据的生命周期要求不兼容。

幸运的是,Docker 允许我们将各种大容量存储服务附加到 Docker 容器。容器本身可能是短暂的,但我们可以将永久数据附加到短暂的容器。只需配置数据库容器,使数据目录位于正确的存储系统上。

足够的理论,现在让我们做点什么。具体来说,让我们为身份验证服务创建基础架构。

## 定义身份验证服务的 Docker 架构

Docker 支持在容器之间创建虚拟桥接网络。请记住,Docker 容器具有已安装的 Linux 操作系统的许多功能。每个容器都可以有自己的 IP 地址和公开的端口。Docker 支持创建类似虚拟以太网段的东西,称为**桥接网络**。这些网络仅存在于主机计算机中,并且默认情况下,外部计算机无法访问它们。

因此,Docker 桥接网络的访问受到严格限制。连接到桥接网络的任何 Docker 容器都可以与连接到该网络的其他容器进行通信,并且默认情况下,该网络不允许外部流量。容器通过主机名找到彼此,并且 Docker 包含一个嵌入式 DNS 服务器来设置所需的主机名。该 DNS 服务器配置为不需要域名中的点,这意味着每个容器的 DNS/主机名只是容器名称。我们将在后面发现,容器的主机名实际上是`container-name.network-name`,并且 DNS 配置允许您跳过使用`network-name`部分的主机名。使用主机名来标识容器的策略是 Docker 对服务发现的实现。

在`users`和`notes`目录的同级目录中创建名为`authnet`的目录。我们将在该目录中处理`authnet`。

在该目录中创建一个名为`package.json`的文件,我们将仅使用它来记录管理 AuthNet 的命令:
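
(一个最小的草稿如下;除正文描述的 `build-authnet` 脚本外,其余字段仅为占位:)

```json
{
    "name": "authnet",
    "version": "1.0.0",
    "description": "Scripts to manage AuthNet",
    "scripts": {
        "build-authnet": "docker network create --driver bridge authnet"
    }
}
```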

We'll be adding more scripts to this file. The build-authnet command builds a virtual network using the bridge driver, as we just discussed. The name for this network is authnet.

Having created authnet, we can attach containers to it so that the containers can communicate with one another.

Our goal for the Notes application stack is to use private networking between containers to implement a security firewall around the containers. The containers will be able to communicate with one another, but the private network is not reachable by any other software and is, therefore, more or less safe from intrusion.

Type the following command:


这将创建一个 Docker 桥接网络。长编码字符串是此网络的标识符。`docker network ls`命令列出当前 Docker 系统中的现有网络。除了短十六进制 ID 外,网络还具有我们指定的名称。

使用以下命令查看有关网络的详细信息:

At the moment, this won't show any containers attached to authnet. The output shows the network name, the IP range of this network, the default gateway, and other useful network configuration information. Since nothing is connected to the network, let's get started with building the required containers:


此命令允许我们从 Docker 系统中删除网络。但是,由于我们需要此网络,重新运行命令以重新创建它。

我们已经探讨了设置桥接网络,因此我们的下一步是用数据库服务器填充它。

## 为身份验证服务创建 MySQL 容器

现在我们有了一个网络,我们可以开始将容器连接到该网络。除了将 MySQL 容器连接到私有网络外,我们还将能够控制与数据库一起使用的用户名和密码,并且还将为其提供外部存储。这将纠正我们之前提到的问题。

要创建容器,可以运行以下命令:
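
(下面给出一个示意版本,各项选项与后文的说明一一对应;密码取值仅为占位,请自行替换:)

```bash
$ docker run --name db-userauth \
    --env MYSQL_ROOT_PASSWORD=w0rdw0rd \
    --env MYSQL_USER=userauth \
    --env MYSQL_PASSWORD=userauth \
    --env MYSQL_DATABASE=userauth \
    --mount type=bind,src=`pwd`/userauth-data,dst=/var/lib/mysql \
    --network authnet \
    -p 3306:3306 \
    mysql/mysql-server:8.0 \
    --bind_address=0.0.0.0 \
    --socket=/tmp/mysql.sock
```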

This does several useful things all at once. It initializes an empty database configured with the named users and passwords, it mounts a host directory as the MySQL data directory, it attaches the new container to authnet, and it exposes the MySQL port to connections from outside the container.

The docker run command is only run the first time the container is started. It combines building the container by running it for the first time. With the MySQL container, its first run is when the database is initialized. The options that are passed to this docker run command are meant to tailor the database initialization.

The --env option sets environment variables inside the container. The scripts driving the MySQL container look to these environment variables to determine the user IDs, passwords, and database to create.

In this case, we configured a password for the root user, and we configured a second user—userauth—with a matching password and database name.

There are many more environment variables available.

The official MySQL Docker documentation provides more information on configuring a MySQL Docker container (dev.mysql.com/doc/refman/8.0/en/docker-mysql-more-topics.html).

The MySQL server recognizes an additional set of environment variables (dev.mysql.com/doc/refman/8.0/en/environment-variables.html).

The MySQL server recognizes a long list of configuration options that can be set on the command line or in the MySQL configuration file (dev.mysql.com/doc/refman/8.0/en/server-option-variable-reference.html).

The --network option attaches the container to the authnet network.

The -p option exposes a TCP port from inside the container so that it is visible outside the container. By default, containers do not expose any TCP ports. This means we can be very selective about what to expose, limiting the attack surface for any miscreants seeking to gain illicit access to the container.

The --mount option is meant to replace the older --volume option. It is a powerful tool for attaching external data storage to a container. In this case, we are attaching a host directory, userauth-data, to the /var/lib/mysql directory inside the container. This ensures that the database is not inside the container, and that it will last beyond the lifetime of the container. For example, while creating this example, we deleted this container several times to fine-tune the command line, and it kept using the same data directory.

We should also mention that the --mount option requires the src= option be a full pathname to the file or directory that is mounted. We are using pwd to determine the full path to the file. However, this is, of course, specific to Unix-like OSes. If you are on Windows, the command should be run in PowerShell and you can use the $PSScriptRoot variable. Alternatively, you can hardcode an absolute pathname.

It is possible to inject a custom my.cnf file into the container by adding this option to the docker run command:


换句话说,Docker 不仅允许您挂载目录,还允许您挂载单个文件。

命令行遵循以下模式:

So far, we have talked about the options for the docker run command. Those options configure the characteristics of the container. Next on the command line is the image name—in this case, mysql/mysql-server:8.0. Any command-line tokens appearing after the image name are passed into the container. In this case, they are interpreted as arguments to the MySQL server, meaning we can configure this server using any of the extensive sets of command-line options it supports. While we can mount a my.cnf file in the container, it is possible to achieve most configuration settings this way.

The first of these options, --bind_address, tells the server to listen for connections from any IP address.

The second, --socket=/tmp/mysql.sock, serves two purposes. One is security, to ensure that the MySQL Unix domain socket is accessible only from inside the container. By default, the scripts inside the MySQL container put this socket in the /var/lib/mysql directory, and when we attach the data directory, the socket is suddenly visible from outside the container.

On Windows, if this socket is in /var/lib/mysql, when we attach a data directory to the container, that would put the socket in a Windows directory. Since Windows does not support Unix domain sockets, the MySQL container will mysteriously fail to start and give a misleadingly obtuse error message. The --socket option ensures that the socket is instead on a filesystem that supports Unix domain sockets, avoiding the possibility of this failure.

When experimenting with different options, it is important to delete the mounted data directory each time you recreate the container to try a new setting. If the MySQL container sees a populated data directory, it skips over most of the container initialization scripts and will not run. A common mistake when trying different container MySQL configuration options is to rerun docker run without deleting the data directory. Since the MySQL initialization doesn't run, nothing will have changed and it won't be clear why the behavior isn't changing.

Therefore, to try a different set of MySQL options, execute the following command:
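
(A sketch of that cleanup-and-recreate sequence; the `userauth-data` directory name follows the `--mount` example shown earlier:)

```bash
$ docker stop db-userauth
$ docker rm db-userauth
$ rm -rf userauth-data && mkdir userauth-data
$ docker run --name db-userauth ...   # rerun the docker run command with the new options
```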


这将确保您每次都从新数据库开始,并确保容器初始化运行。

这也暗示了一个行政模式要遵循。每当您希望更新到较新的 MySQL 版本时,只需停止容器,保留数据目录。然后,删除容器,并使用新的`mysql/mysql-server`标签重新执行`docker run`命令。这将导致 Docker 使用不同的镜像重新创建容器,但使用相同的数据目录。使用这种技术,您可以通过拉取更新的镜像来更新 MySQL 版本。

一旦 MySQL 容器运行,输入以下命令:

![](https://gitee.com/OpenDocCN/freelearn-node-zh/raw/master/docs/node-webdev-5e/img/f8b581a3-5b69-4a4c-9725-32697e2a774b.png)

这将显示当前容器状态。如果我们使用`docker ps -a`,我们会看到`PORTS`列显示`0.0.0.0:3306->3306/tcp, 33060/tcp`。这表示容器正在监听从任何地方(`0.0.0.0`)到端口`3306`的访问,这个流量将连接到容器内部的端口`3306`。此外,还有一个端口`33060`可用,但它没有暴露到容器外部。

尽管它配置为监听整个世界,但容器附加到`authnet`,限制了连接的来源。限制可以连接到数据库的进程的范围是一件好事。但是,由于我们使用了`-p`选项,数据库端口暴露给了主机,这并不像我们想要的那样安全。我们稍后会修复这个问题。

### 数据库容器中的安全性

一个要问的问题是是否像这样设置`root`密码是一个好主意。`root`用户对整个 MySQL 服务器有广泛的访问权限,而其他用户,如`userauth`,对给定数据库的访问权限有限。由于我们的目标之一是安全性,我们必须考虑这是否创建了一个安全或不安全的数据库容器。

我们可以使用以下命令以`root`用户身份登录:

This executes the MySQL CLI client inside the newly created container. There are a few commands we can run to check the status of the root and userauth user IDs. These include the following:


连接到 MySQL 服务器包括用户 ID、密码和连接的来源。这个连接可能来自同一台计算机内部,也可能来自另一台计算机的 TCP/IP 套接字。为了批准连接,服务器会在`mysql.user`表中查找与`user`、`host`(连接来源)和`password`字段匹配的行。用户名和密码是作为简单的字符串比较进行匹配的,但主机值是一个更复杂的比较。与 MySQL 服务器的本地连接将与主机值为`localhost`的行匹配。

对于远程连接,MySQL 会将连接的 IP 地址和域名与`host`列中的条目进行比较。`host`列可以包含 IP 地址、主机名或通配符模式。SQL 的通配符字符是`%`。单个`%`字符匹配任何连接源,而`172.%`的模式匹配第一个 IPv4 八位是`172`的任何 IP 地址,或者`172.20.%.%`匹配`172.20.x.x`范围内的任何 IP 地址。

因此,由于`userauth`的唯一行指定了`%`的主机值,我们可以从任何地方使用`userauth`。相比之下,`root`用户只能在`localhost`连接中使用。

下一个任务是检查`userauth`和`root`用户 ID 的访问权限:
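
(可以在 MySQL 客户端中执行类似下面的查询;具体的授权输出取决于前面创建容器时的设置:)

```sql
SHOW GRANTS FOR 'userauth'@'%';
SHOW GRANTS FOR 'root'@'localhost';
```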

This says that the userauth user has full access to the userauth database. The root user, on the other hand, has full access to every database and has so many permissions that the output of that does not fit here. Fortunately, the root user is only allowed to connect from localhost.

To verify this, try connecting from different locations using these commands:


我们展示了访问数据库的四种模式,表明`userauth` ID 确实可以从同一容器或远程容器访问,而`root` ID 只能从本地容器使用。

使用`docker run -it --rm ... container-name ...`启动一个容器,运行给定的命令,然后在命令完成后退出容器并自动将其删除。

因此,通过这两个命令,我们创建了一个单独的`mysql/mysql-server:8.0`容器,连接到`authnet`,以运行`mysql`CLI 程序。`mysql`参数是使用给定的用户名(`root`或`userauth`)连接到名为`db-userauth`的主机上的 MySQL 服务器。这演示了从一个独立的连接器连接到数据库,并显示我们可以使用`userauth`用户远程连接,但不能使用`root`用户。

然后,最终的访问实验涉及省略`--network`选项:

This demonstrates that if the container is not attached to authnet, it cannot access the MySQL server because the db-userauth hostname is not even known.

Where did the db-userauth hostname come from? We can find out by inspecting a few things:


换句话说,`authnet`网络具有`172.20.0.0/16`网络号,而`db-userauth`容器被分配了`172.20.0.2`IP 地址。这种细节很少重要,但在第一次仔细检查设置时是有用的,这样我们就能理解我们正在处理的内容。

存在一个严重的安全问题,违反了我们的设计:数据库端口对主机是可见的,因此,任何可以访问主机的人都可以访问数据库。之所以会这样,是因为我们误以为必须使用`-p 3306:3306`选项,下一节中的`svc-userauth`才能访问数据库。我们将通过删除该选项来解决这个问题。

现在我们已经为认证服务设置了数据库实例,让我们看看如何将其 Docker 化。

## Docker 化认证服务

*Dockerize*一词意味着为软件创建一个 Docker 镜像。然后可以与他人共享 Docker 镜像,或部署到服务器上。在我们的情况下,目标是为用户认证服务创建一个 Docker 镜像。它必须连接到`authnet`,以便可以访问我们刚刚在`db-userauth`容器中配置的数据库服务器。

我们将命名这个新容器为`svc-userauth`,以表示这是用户认证 REST 服务,而`db-userauth`容器是数据库。

Docker 镜像是使用 Dockerfile 定义的,Dockerfile 是描述在服务器上安装应用程序的文件。它们记录了 Linux 操作系统的设置,安装的软件以及 Docker 镜像中所需的配置。这实际上是一个名为`Dockerfile`的文件,其中包含 Dockerfile 命令。Dockerfile 命令用于描述镜像的构建方式。

请参考[`docs.docker.com/engine/reference/builder/`](https://docs.docker.com/engine/reference/builder/)获取文档。

### 创建认证服务 Dockerfile

在`users`目录中,创建一个名为`Dockerfile`的文件,其中包含以下内容:
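
(下面是该 Dockerfile 的一个示意;基础镜像标签、端口号和环境变量取值均为假设,应与 `users/package.json` 中启动认证服务所用的设置保持一致:)

```dockerfile
FROM node:14

RUN apt-get update -y \
    && apt-get upgrade -y \
    && apt-get install -y build-essential

ENV DEBUG="users:*"
ENV PORT="5858"
ENV SEQUELIZE_CONNECT="sequelize-docker-mysql.yaml"

RUN mkdir -p /userauth
COPY package.json *.mjs *.yaml /userauth/
WORKDIR /userauth
RUN npm install --unsafe-perm

EXPOSE 5858
CMD [ "node", "./user-server.mjs" ]
```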

The FROM command specifies a pre-existing image, called the base image, from which to derive a given image. Frequently, you define a Docker image by starting from an existing image. In this case, we're using the official Node.js Docker image (hub.docker.com/_/node/), which, in turn, is derived from debian.

Because the base image, node, is derived from the debian image, the commands available are what are provided on a Debian OS. Therefore, we use apt-get to install more packages.

The RUN commands are where we run the shell commands required to build the container. The first one installs required Debian packages, such as the build-essential package, which brings in compilers required to install native-code Node.js packages.

It's recommended that you always combine apt-get updateapt-get upgrade, and apt-get install in the same command line like this because of the Docker build cache. Docker saves each step of the build to avoid rerunning steps unnecessarily. When rebuilding an image, Docker starts with the first changed step. Therefore, in the set of Debian packages to install changes, we want all three of those commands to run.

Combining them into a single command ensures that this will occur. For a complete discussion, refer to the documentation at docs.docker.com/develop/develop-images/dockerfile_best-practices/.

The ENV commands define environment variables. In this case, we're using the same environment variables that were defined in the package.json script for launching the user authentication service.

Next, we have a sequence of lines to create the /userauth directory and to populate it with the source code of the user authentication service. The first line creates the /userauth directory. The COPY command, as its name implies, copies the files for the authentication service into that directory. The WORKDIR command changes the working directory to /userauth. This means that the last RUN command, npm install, is executed in /userauth, and therefore, it installs the packages described in /userauth/package.json in /userauth/node_modules.

There is a new SEQUELIZE_CONNECT configuration file mentioned: sequelize-docker-mysql.yaml. This will describe the Sequelize configuration required to connect to the database in the db-userauth container.

Create a new file named users/sequelize-docker-mysql.yaml containing the following:
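
(A sketch of that file; the field layout follows the SEQUELIZE_CONNECT files used in earlier chapters, and the password value is a placeholder:)

```yaml
dbname: userauth
username: userauth
password: userauth
params:
    host: db-userauth
    port: 3306
    dialect: mysql
```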


不同之处在于,我们使用`db-userauth`而不是`localhost`作为数据库主机。之前,我们探索了`db-userauth`容器,并确定这是容器的主机名。通过在这个文件中使用`db-userauth`,认证服务将使用容器中的数据库。

`EXPOSE`命令通知 Docker 容器监听指定的 TCP 端口。这不会将端口暴露到容器之外。`-p`标志是将给定端口暴露到容器之外的方式。

最后,`CMD`命令记录了在执行容器时启动的过程。`RUN`命令在构建容器时执行,而`CMD`表示容器启动时执行的内容。

我们本可以在容器中安装`PM2`,然后使用`PM2`命令来启动服务。然而,Docker 能够实现相同的功能,因为它自动支持在服务进程死掉时重新启动容器。

### 构建和运行认证服务 Docker 容器

现在我们已经在 Dockerfile 中定义了镜像,让我们来构建它。

在`users/package.json`中,将以下行添加到`scripts`部分:
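
(例如下面这样;镜像标签 `svc-userauth` 与正文中的容器命名保持一致:)

```json
"scripts": {
    "docker-build": "docker build -t svc-userauth ."
}
```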

As has been our habit, this is an administrative task that we can record in package.json, making it easier to automate this task.

We can build the authentication service as follows:


`docker build`命令从 Dockerfile 构建一个镜像。请注意,构建一步一步进行,每个步骤都与 Dockerfile 中的命令完全对应。

每个步骤都存储在缓存中,因此不必重新运行。在后续构建中,执行的唯一步骤是更改的步骤和所有后续步骤。

在`authnet/package.json`中,我们需要相当多的脚本来管理用户认证服务:
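
(下面是一组示意性的脚本;脚本名称与取值均为假设,其中 `docker run` 的选项与前面手工执行的命令一致,只是按正文所述加上了 `--detach`,并去掉了 `-p 3306:3306`:)

```json
"scripts": {
    "build-authnet": "docker network create --driver bridge authnet",
    "build-userauth": "cd ../users && npm run docker-build",
    "launch-db-userauth": "docker run --detach --name db-userauth --env MYSQL_ROOT_PASSWORD=w0rdw0rd --env MYSQL_USER=userauth --env MYSQL_PASSWORD=userauth --env MYSQL_DATABASE=userauth --mount type=bind,src=`pwd`/userauth-data,dst=/var/lib/mysql --network authnet mysql/mysql-server:8.0 --bind_address=0.0.0.0 --socket=/tmp/mysql.sock",
    "launch-svc-userauth": "docker run --detach --name svc-userauth --network authnet svc-userauth",
    "stop-authnet": "docker stop db-userauth svc-userauth",
    "start-authnet": "docker start db-userauth svc-userauth"
}
```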

This is the set of commands that were found to be useful to manage building the images, starting the containers, and stopping the containers.

Look carefully and you will see that we've added --detach to the docker run commands. So far, we've used docker run without that option, and the container remained in the foreground. While this was useful to see the logging output, it's not so useful for deployment. With the --detach option, the container becomes a background task.

On Windows, for the --mount option, we need to change the src= parameter (as discussed earlier) to use a Windows-style hard-coded path. That means it should read:


此选项需要绝对路径名,并且以这种方式指定路径在 Windows 上有效。

另一个需要注意的是`-p 3306:3306`选项的缺失。有两个原因确定这是不必要的。首先,该选项将数据库暴露给主机,`db-userauth`的安全模型要求不这样,因此删除该选项可以获得所需的安全性。其次,`svc-userauth`在删除此选项后仍然能够访问`db-userauth`数据库。

有了这些命令,我们现在可以输入以下内容来构建,然后运行容器:

These commands build the pieces required for the user authentication service. As a side effect, the containers are automatically executed and will launch as background tasks.

Once it is running, you can test it using the cli.mjs script as before. You can shell into the svc-userauth container and run cli.mjs there; or, since the port is visible to the host computer, you can run it from outside the container.

Afterward, we can manage the whole service as follows:
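
(For example—these two commands stop and restart the `db-userauth` and `svc-userauth` containers; they can also be wrapped as scripts in the package.json shown above:)

```bash
$ docker stop db-userauth svc-userauth
$ docker start db-userauth svc-userauth
```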


这将停止并启动构成用户认证服务的两个容器。

我们已经创建了托管用户认证服务的基础设施,以及一系列脚本来管理该服务。我们的下一步是探索我们创建的内容,并了解 Docker 为我们创建的基础设施的一些情况。

## 探索 AuthNet

请记住,AuthNet 是认证服务的连接介质。为了了解这个网络是否提供了我们正在寻找的安全性增益,让我们探索一下我们刚刚创建的内容:

This prints out a large JSON object describing the network, along with its attached containers, which we've looked at before. If everything went well, we will see that there are now two containers attached to authnet where there'd previously have just been one.

Let's go into the svc-userauth container and poke around:


`/userauth`目录位于容器内,包含使用`COPY`命令放置在容器中的文件,以及`node_modules`中安装的文件:

We can run the cli.mjs script to test and administer the service. To get these database entries set up, use the add command with the appropriate options:


进程列表是值得研究的。进程`PID 1`是 Dockerfile 中的`node ./user-server.mjs`命令。我们在`CMD`行中使用的格式确保`node`进程最终成为进程 1。这很重要,以便正确处理进程信号,从而允许 Docker 正确管理服务进程。以下博客文章的末尾有关于这个问题的很好讨论:

[`www.docker.com/blog/keep-nodejs-rockin-in-docker/`](https://www.docker.com/blog/keep-nodejs-rockin-in-docker/)

`ping`命令证明两个容器作为与容器名称匹配的主机名可用:

From outside the containers, on the host system, we cannot ping the containers. That's because they are attached to authnet and are not reachable.

We have successfully Dockerized the user authentication service in two containers—db-userauth and svc-userauth. We've poked around the insides of a running container and found some interesting things. However, our users need the fantastic Notes application to be running, and we can't afford to rest on our laurels.

Since this was our first time setting up a Docker service, we went through a lot of details. We started by launching a MySQL database container, and what is required to ensure that the data directory is persistent. We then set up a Dockerfile for the authentication service and learned how to connect containers to a common Docker network and how containers can communicate with each other over the network. We also studied the security benefits of this network infrastructure, since we can easily wall off the service and its database from intrusion.

Let's now move on and Dockerize the Notes application, making sure that it is connected to the authentication server.

# Creating FrontNet for the Notes application

We have the back half of our system set up in Docker containers, as well as the private bridge network to connect the backend containers. It's now time to do the same for the front half of the system: the Notes application (svc-notes) and its associated database (db-notes). Fortunately, the tasks required to build FrontNet are more or less the same as what we did for AuthNet.

The first task is to set up another private bridge network, frontnet. Like authnet, this will be the infrastructure for the front half of the Notes application stack.

Create a directory, frontnet, and in that directory, create a package.json file that will contain the scripts to manage frontnet:
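
(A minimal starting point might look like this; only the network-creation script is taken from the text, the other fields are placeholders:)

```json
{
    "name": "frontnet",
    "version": "1.0.0",
    "description": "Scripts to manage FrontNet",
    "scripts": {
        "build-frontnet": "docker network create --driver bridge frontnet"
    }
}
```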


与`authnet`一样,这只是起点,因为我们还有几个脚本要添加。

让我们继续创建`frontnet`桥接网络:

We have two virtual bridge networks. Over the next few sections, we'll set up the database and Notes application containers, connect them to frontnet, and then see how to manage everything.

## MySQL container for the Notes application

As with authnet, the task is to construct a MySQL server container using the mysql/mysql-server image. We must configure the server to be compatible with the SEQUELIZE_CONNECT file that we'll use in the svc-notes container. For that purpose, we'll use a database named notes and a notes user ID.

For that purpose, add the following to the scripts section of the package.json file:
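
(A sketch of that script—it mirrors the db-userauth command with `notes` substituted; the script name and password values are assumptions:)

```json
"launch-db-notes": "docker run --detach --name db-notes --env MYSQL_ROOT_PASSWORD=w0rdw0rd --env MYSQL_USER=notes --env MYSQL_PASSWORD=notes12345 --env MYSQL_DATABASE=notes --mount type=bind,src=`pwd`/notes-data,dst=/var/lib/mysql --network frontnet mysql/mysql-server:8.0 --bind_address=0.0.0.0 --socket=/tmp/mysql.sock"
```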


这与`db-userauth`几乎相同,只是将`userauth`替换为`notes`。请记住,在 Windows 上,`--mount`选项需要 Windows 风格的绝对路径名。

现在让我们运行脚本: 

This database will be available in the db-notes domain name on frontnet. Because it's attached to frontnet, it won't be reachable by containers connected to authnet. To verify this, run the following command:


由于`db-notes`位于不同的网络段,我们已经实现了隔离。但我们可以注意到一些有趣的事情。`ping`命令告诉我们,`db-userauth`的完整域名是`db-userauth.authnet`。因此,可以推断`db-notes`也被称为`db-notes.frontnet`。但无论如何,我们无法从`authnet`上的容器访问`frontnet`上的容器,因此我们已经实现了所需的隔离。

我们能够更快地移动以构建 FrontNet,因为它非常类似于 AuthNet。我们只需要做以前做过的事情,并微调名称。

在本节中,我们创建了一个数据库容器。在下一节中,我们将为 Notes 应用程序创建 Dockerfile。

## Docker 化 Notes 应用程序

我们的下一步当然是将 Notes 应用程序 Docker 化。这始于创建一个 Dockerfile,然后添加另一个 Sequelize 配置文件,最后通过向`frontnet/package.json`文件添加更多脚本来完成。

在`notes`目录中,创建一个名为`Dockerfile`的文件,其中包含以下内容:
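
(下面是该 Dockerfile 的一个示意;基础镜像标签、端口、环境变量取值以及要复制的子目录列表均为假设,应与 `notes/package.json` 及项目实际目录结构保持一致:)

```dockerfile
FROM node:14

RUN apt-get update -y \
    && apt-get upgrade -y \
    && apt-get install -y build-essential

ENV DEBUG="notes:*,messages:*"
ENV SEQUELIZE_CONNECT="models/sequelize-docker-mysql.yaml"
ENV NOTES_MODEL="sequelize"
ENV USER_SERVICE_URL="http://svc-userauth:5858"
ENV PORT="3000"
ENV NOTES_SESSION_DIR="/sessions"
ENV TWITTER_CALLBACK_HOST="http://localhost:3000"

RUN mkdir -p /notesapp /notesapp/models /notesapp/routes \
             /notesapp/views /notesapp/partials /notesapp/public
COPY package.json *.mjs /notesapp/
COPY models/ /notesapp/models/
COPY routes/ /notesapp/routes/
COPY views/ /notesapp/views/
COPY partials/ /notesapp/partials/
COPY public/ /notesapp/public/

WORKDIR /notesapp
RUN npm install --unsafe-perm

EXPOSE 3000
CMD [ "node", "./app.mjs" ]
```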

This is similar to the Dockerfile we used for the authentication service. We're using the environment variables from notes/package.json, plus a new one: NOTES_SESSION_DIR.

The most obvious change is the number of COPY commands. The Notes application is a lot more involved, given the number of sub-directories full of files that must be installed. We start by creating the top-level directories of the Notes application deployment tree. Then, one by one, we copy each sub-directory into its corresponding sub-directory in the container filesystem.

In a COPY command, the trailing slash on the destination directory is important. Why? Because the Docker documentation says that the trailing slash is important, that's why.

The big question is why use multiple COPY commands like this? This would have been incredibly simple:
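
(A one-line sketch; the `/notesapp` destination follows the directory layout assumed in the Dockerfile above:)

```dockerfile
COPY . /notesapp
```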


然而,多个`COPY`命令让我们可以精确控制复制的内容。避免复制`node_modules`目录是最重要的。不仅是主机上的`node_modules`文件很大,如果复制到容器中会使容器膨胀,而且它是为主机操作系统而不是容器操作系统设置的。`node_modules`目录必须在容器内部构建,安装过程发生在容器的操作系统上。这个约束导致选择明确地将特定文件复制到目标位置。

我们还有一个新的`SEQUELIZE_CONNECT`文件。创建`models/sequelize-docker-mysql.yaml`,其中包含以下内容:
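
(该文件大致如下;字段布局沿用前面的 SEQUELIZE_CONNECT 文件,密码取值仅为示意:)

```yaml
dbname: notes
username: notes
password: notes12345
params:
    host: db-notes
    port: 3306
    dialect: mysql
```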

This will access a database server on the db-notes domain name using the named database, username, and password.

Notice that the USER_SERVICE_URL variable no longer accesses the authentication service at localhost, but at svc-userauth. The svc-userauth domain name is currently only advertised by the DNS server on AuthNet, but the Notes service is on FrontNet. Therefore, this will cause a failure for us when we get to running the Notes application, and we'll have to make some connections so that the svc-userauth container can be accessed from svc-notes.

In Chapter 8, Authenticating Users with a Microservice, we discussed the need to protect the API keys supplied by Twitter. We could copy the .env file to the Dockerfile, but this may not be the best choice, and so we've left it out of the Dockerfile.

Unfortunately, this does not protect the Twitter credentials to the level required. The .env file is available as plaintext inside the container. Docker has a feature, Docker Secrets, that can be used to securely store data of this sort. Unfortunately, it is only available when using Swarm mode, which we are not doing at this time; but we will use this feature in Chapter 12, Deploying a Docker Swarm to AWS EC2 Using Terraform.

The value of TWITTER_CALLBACK_HOST needs to reflect where Notes is deployed. Right now, it is still on your laptop, but if it is deployed to a server, this variable will require the IP address or domain name of the server.

In notes/package.json, add the following scripts entry:


与身份验证服务器一样,这使我们能够为 Notes 应用程序服务构建容器镜像。

然后,在`frontnet/package.json`中添加这些脚本:

Now, we can build the container image:


这将创建容器镜像,然后启动容器。

注意,暴露的端口`3000`与`-p 80:3000`映射到正常的 HTTP 端口。由于我们准备在真实服务上部署,我们可以停止使用端口`3000`。

此时,我们可以将浏览器连接到`http://localhost`并开始使用 Notes 应用程序。但是,我们很快就会遇到一个问题:

![](https://gitee.com/OpenDocCN/freelearn-node-zh/raw/master/docs/node-webdev-5e/img/11e5e001-7757-4392-9277-62ce82f78a64.png)

用户体验团队将对这个丑陋的错误消息大声疾呼,所以把它放在您的待办事项中,生成一个更漂亮的错误屏幕。例如,一群鸟将鲸鱼从海洋中拉出是很受欢迎的。

这个错误意味着 Notes 无法访问名为`svc-userauth`的主机上的任何内容。该主机确实存在,因为容器正在运行,但它不在`frontnet`上,并且无法从`notes`容器中访问。相反,它在`authnet`上,目前无法被`svc-notes`访问:

We can reach db-notes from svc-notes but not svc-userauth. This is as expected since we have attached these containers to different networks.

If you inspect FrontNet and AuthNet, you'll see that the containers attached to each do not overlap:


在第十章中呈现的架构图中,*将 Node.js 应用程序部署到 Linux 服务器*,我们展示了`svc-notes`和`svc-userauth`容器之间的连接。这种连接是必需的,以便 Notes 可以对其用户进行身份验证。但是这种连接尚不存在。

Docker 要求您采取第二步将容器连接到第二个网络:
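
(也就是执行类似下面的命令,把 `svc-notes` 再连接到 `authnet` 上:)

```bash
$ docker network connect authnet svc-notes
```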

With no other change, the Notes application will now allow you to log in and start adding and editing notes. Furthermore, start a shell in svc-notes and you'll be able to ping both svc-userauth and db-userauth.

There is a glaring architecture question staring at us. Do we connect the svc-userauth service to frontnet, or do we connect the svc-notes service to authnet? We just connected svc-notes to authnet, but maybe that's not the best choice. To verify which network setup solves the problem, run the following commands:


首先,我们将`svc-notes`连接到`authnet`,然后将其从`authnet`断开,再将`svc-userauth`连接到`frontnet`。这意味着我们尝试了两种组合,并且如预期的那样,在这两种情况下,`svc-notes`和`svc-userauth`都能够通信。

这是一个安全专家的问题,因为考虑到任何入侵者可用的攻击向量。假设 Notes 存在安全漏洞,允许入侵者访问。我们如何限制通过该漏洞可达到的内容?

主要观察是通过将`svc-notes`连接到`authnet`,`svc-notes`不仅可以访问`svc-userauth`,还可以访问`db-userauth`。要查看这一点,请运行以下命令:
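
(例如下面这样;假设容器内已安装 `ping`,如果没有,可先用 `apt-get install -y iputils-ping` 安装:)

```bash
$ docker network connect authnet svc-notes
$ docker exec -it svc-notes ping -c 3 svc-userauth
$ docker exec -it svc-notes ping -c 3 db-userauth
```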

This sequence reconnects svc-notes to authnet and demonstrates the ability to access both the svc-userauth and db-userauth containers. Therefore, a successful invader could access the db-userauth database, a result we wanted to prevent. Our diagram in Chapter 10, Deploying Node.js Applications to Linux Servers, showed no such connection between svc-notes and db-userauth.

Given that our goal for using Docker was to limit the attack vectors, we have a clear distinction between the two container/network connection setups. Attaching svc-userauth to frontnet limits the number of containers that can access db-userauth. For an intruder to access the user information database, they must first break into svc-notes, and then break into svc-userauth; unless, that is, our amateur attempt at a security audit is flawed.

For this and a number of other reasons, we arrive at this final set of scripts for frontnet/package.json:


主要是添加一个命令`connect-userauth`,将`svc-userauth`连接到`frontnet`。这有助于我们记住如何加入容器的决定。我们还借此机会进行了一些重新组织。

在本节中,我们学到了很多关于 Docker 的知识——使用 Docker 镜像,从镜像创建 Docker 容器,并在考虑一些安全约束的情况下配置一组 Docker 容器。我们在本节中实现了我们最初的架构想法。我们有两个私有网络,容器连接到它们适当的网络。唯一暴露的 TCP 端口是 Notes 应用程序,可在端口`80`上看到。其他容器使用不可从容器外部访问的 TCP/IP 连接相互连接。

在继续下一部分之前,您可能希望关闭我们启动的服务。只需执行以下命令:

Because we've automated many things, it is this simple to administer the system. However, it is not as automated as we want it to be. To address that, let's learn how to make the Notes stack more easily deployable by using Docker Compose to describe the infrastructure.

# Managing multiple containers with Docker Compose

It is cool that we can create encapsulated instantiations of the software services that we've created. In theory, we can publish these images to Docker repositories, and then launch the containers on any server we want. For example, our task in Chapter 10, Deploying Node.js Applications to Linux Servers, would be greatly simplified with Docker. We could simply install Docker Engine on the Linux host and then deploy our containers on that server, and not have to deal with all those scripts and the PM2 application.

But we haven't properly automated the process. The promise was to use the Dockerized application for deployment on cloud services. In other words, we need to take all this learning and apply it to the task of simplifying deployment.

We've demonstrated that, with Docker, Notes can be built using four containers that have a high degree of isolation from each other and from the outside world.

There is a glaring problem: our process in the previous section was partly manual, partly automated. We created scripts to launch each portion of the system, which is good practice. However, we did not automate the entire process to bring up Notes and the authentication services, nor is this solution scalable beyond one machine.

Let's start with the last issue first—scalability. Within the Docker ecosystem, several Docker orchestrator services are available. An orchestrator automatically deploys and manages Docker containers over a group of machines. Some examples of Docker orchestrators are Docker Swarm, Kubernetes, CoreOS Fleet, and Apache Mesos. These are powerful systems that can automatically increase/decrease resources as needed to move containers from one host to another, and more. We mention these systems for you to further study as your needs grow. In Chapter 12, Deploying a Docker Swarm to AWS EC2 with Terraform, we will build on the work we're about to do in order to deploy Notes in a Docker Swarm cluster that we'll build on AWS EC2 infrastructure.

Docker Compose (docs.docker.com/compose/overview/) will solve the other problems we've identified. It lets us easily define and run several Docker containers together as a complete application. It uses a YAML file, docker-compose.yml, to describe the containers, their dependencies, the virtual networks, and the volumes. While we'll be using it to describe deployment on a single host machine, Docker Compose can be used for multi-machine deployments. Namely, Docker Swarm directly uses compose files to describe the services you launch in a swarm. In any case, learning about Docker Compose will give you a headstart on understanding the other systems.

Before proceeding, ensure that Docker Compose is installed. If you've installed Docker for Windows or Docker for Mac, everything that is required is installed. On Linux, you must install it separately by following the instructions in the links provided earlier.

## Docker Compose file for the Notes stack

We just talked about Docker orchestration services, but Docker Compose is not itself such a service. Instead, Docker Compose uses a specific YAML file structure to describe how to deploy Docker containers. With a Docker Compose file, we can describe one or more containers, networks, and volumes involved in launching a Docker-based service.

Let's start by creating a directory, compose-local, as a sibling to the users and notes directories. In that directory, create a file named docker-compose.yml:
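
(A sketch of that file, assembled from the container settings developed earlier in this chapter; the passwords, the published authentication port, and the build paths are assumptions, while the overall structure matches the description that follows:)

```yaml
version: '3'

services:
    db-userauth:
        image: "mysql/mysql-server:8.0"
        container_name: db-userauth
        networks:
            - authnet
        volumes:
            - db-userauth-data:/var/lib/mysql
        environment:
            MYSQL_ROOT_PASSWORD: "w0rdw0rd"
            MYSQL_USER: userauth
            MYSQL_PASSWORD: userauth
            MYSQL_DATABASE: userauth
        restart: always

    svc-userauth:
        build: ../users
        container_name: svc-userauth
        depends_on:
            - db-userauth
        networks:
            - authnet
            - frontnet
        ports:
            - "5858:5858"
        restart: always

    db-notes:
        image: "mysql/mysql-server:8.0"
        container_name: db-notes
        networks:
            - frontnet
        volumes:
            - db-notes-data:/var/lib/mysql
        environment:
            MYSQL_ROOT_PASSWORD: "w0rdw0rd"
            MYSQL_USER: notes
            MYSQL_PASSWORD: notes12345
            MYSQL_DATABASE: notes
        restart: always

    svc-notes:
        build: ../notes
        container_name: svc-notes
        depends_on:
            - db-notes
        networks:
            - frontnet
        ports:
            - "80:3000"
        restart: always

networks:
    authnet:
        driver: bridge
    frontnet:
        driver: bridge

volumes:
    db-userauth-data:
    db-notes-data:
```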


这是整个 Notes 部署的描述。它在相当高的抽象级别上,大致相当于我们迄今为止使用的命令行工具中的选项。它相当简洁和自解释,正如我们将看到的,`docker-compose`命令使这些文件成为管理 Docker 服务的便利方式。

`version`行表示这是一个版本 3 的 Compose 文件。版本号由`docker-compose`命令检查,以便它可以正确解释其内容。完整的文档值得阅读,网址是[`docs.docker.com/compose/compose-file/`](https://docs.docker.com/compose/compose-file/)。

这里使用了三个主要部分:`services`、`volumes`和`networks`。`services`部分描述了正在使用的容器,`networks`部分描述了网络,`volumes`部分描述了卷。每个部分的内容都与我们之前创建的容器相匹配。我们已经处理过的配置都在这里,只是重新排列了一下。

有两个数据库容器——`db-userauth`和`db-notes`——以及两个服务容器——`svc-userauth`和`svc-notes`。服务容器是从`build`属性中指定的目录中的 Dockerfile 构建的。数据库容器是从 Docker Hub 下载的镜像实例化的。两者都直接对应于我们之前所做的,使用`docker run`命令创建数据库容器,并使用`docker build`生成服务的镜像。

`container_name`属性等同于`--name`属性,并为容器指定了一个用户友好的名称。我们必须指定容器名称,以便指定容器主机名以实现 Docker 风格的服务发现。

`networks`属性列出了此容器必须连接的网络,与`--net`参数完全相同。即使`docker`命令不支持多个`--net`选项,我们可以在 Compose 文件中列出多个网络。在这种情况下,网络是桥接网络。与之前一样,网络本身必须单独创建,在 Compose 文件中,这是在`networks`部分完成的。

`ports`属性声明要发布的端口及其与容器端口的映射。在`ports`声明中,有两个端口号,第一个是要发布的端口号,第二个是容器内部的端口号。这与之前使用的`-p`选项完全相同。

`depends_on`属性允许我们控制启动顺序。依赖于另一个容器的容器将等待直到被依赖的容器正在运行。

`volumes`属性描述了容器目录到`host`目录的映射。在这种情况下,我们定义了两个卷名称——`db-userauth-data`和`db-notes-data`——然后将它们用于卷映射。但是,当我们部署到 AWS EC2 上的 Docker Swarm 时,我们需要改变这个实现方式。

请注意,我们没有为卷定义主机目录。Docker 会为我们分配一个目录,我们可以使用`docker volume inspect`命令了解这个目录。

`restart`属性控制容器死亡时或者何时发生的情况。当容器启动时,它运行`CMD`指令中指定的程序,当该程序退出时,容器也退出。但是,如果该程序是要永远运行的,Docker 不应该知道它应该重新启动该进程吗?我们可以使用后台进程监视器,如 Supervisord 或 PM2。但是,Docker 的`restart`选项会处理这个问题。

`restart`属性可以取以下四个值之一:

+   `no`: 不重新启动。

+   `on-failure:count`: 最多重新启动*N*次。

+   `always`: 总是重新启动。

+   `unless-stopped`: 除非明确停止,否则启动容器。

在本节中,我们学习了如何通过创建描述 Notes 应用程序堆栈的文件来构建 Docker Compose 文件。有了这个,让我们看看如何使用这个工具来启动容器。

## 使用 Docker Compose 构建和运行 Notes 应用程序

使用 Docker Compose CLI 工具,我们可以管理任何可以在`docker-compose.yml`文件中描述的 Docker 容器集。我们可以构建容器,启动和关闭它们,查看日志等。在 Windows 上,我们可以无需更改地运行本节中的命令。

我们的第一个任务是通过运行以下命令来创建一个干净的状态:

We first needed to stop and delete any existing containers left over from our previous work. We can also use the scripts in the frontnet and authnet directories to do this. docker-compose.yml used the same container names, so we need the ability to launch new containers with those names.

To get started, use this command:
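
(That is, in the compose-local directory:)

```bash
$ docker-compose build
```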


这将构建`docker-compose.yml`中列出的镜像。请注意,我们最终得到的镜像名称都以`compose-local`开头,这是包含该文件的目录的名称。因为这相当于在每个目录中运行`docker build`,它只构建镜像。

构建了容器之后,我们可以使用`docker-compose up`或`docker-compose start`一次性启动它们所有:

We can use docker-compose stop to shut down the containers. With docker-compose start, the containers run in the background.

We can also run docker-compose up to get a different experience:


如果需要,`docker-compose up`将首先构建容器。此外,它将保持所有容器在前台运行,以便我们可以查看日志。它将所有容器的日志输出合并在一起,每行开头显示容器名称。对于像 Notes 这样的多容器系统,这非常有帮助。

我们可以使用此命令检查状态:

This is related to running docker ps, but the presentation is a little different and more compact.

In docker-compose.yml, we insert the following declaration for svc-userauth:


这意味着`svc-userauth`的 REST 服务端口已经发布。确实,在状态输出中,我们看到端口已经发布。这违反了我们的安全设计,但它确实让我们可以从笔记本电脑上使用`users/cli.mjs`运行测试。也就是说,我们可以像以前那样向数据库添加用户。

只要它保持在我们的笔记本电脑上,这种安全违规是可以接受的。`compose-local`目录的命名是专门用于在我们的笔记本电脑上与 Docker Compose 一起使用的。

或者,我们可以像以前一样在`svc-userauth`容器内运行命令:

We started the Docker containers using docker-compose, and we can use the docker-compose command to interact with the containers. In this case, we demonstrated using both the docker-compose and docker commands to execute a command inside one of the containers. While there are slight differences in the command syntax, it's the same interaction with the same results.

Another test is to go into the containers and explore:


从那里,我们可以尝试 ping 每个容器,以查看哪些容器可以被访问。这将作为一个简单的安全审计,以确保我们创建的内容符合我们期望的安全模型。

在执行此操作时,我们发现`svc-userauth`可以 ping 通每个容器,包括`db-notes`。这违反了安全计划,必须更改。

幸运的是,这很容易解决。只需通过更改配置,我们可以在`docker-compose.yml`中添加一个名为`svcnet`的新网络:
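
(改动大致如下,只展示需要调整的部分;`svcnet` 的用法与正文描述一致,其余服务和网络保持不变:)

```yaml
services:
    svc-userauth:
        networks:
            - authnet
            - svcnet
    svc-notes:
        networks:
            - frontnet
            - svcnet

networks:
    authnet:
        driver: bridge
    frontnet:
        driver: bridge
    svcnet:
        driver: bridge
```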

svc-userauth is no longer connected to frontnet, which is how we could ping db-notes from svc-userauth. Instead, svc-userauth and svc-notes are both connected to a new network, svcnet, which is meant to connect the service containers. Therefore, both service containers have exactly the required access to match the goals outlined at the beginning.

That's an advantage of Docker Compose. We can quickly reconfigure the system without rewriting anything other than the docker-compose.yml configuration file. Furthermore, the new configuration is instantly reflected in a file that can be committed to our source repository.

When you're done testing the system, simply type *Ctrl* + *C* in the terminal:


如图所示,这将停止整组容器。偶尔,它会退出用户到 shell,并且容器仍然在运行。在这种情况下,用户将不得不使用其他方法来关闭容器:

The docker-compose commands—startstop, and restart—all serve as ways to manage the containers as background tasks. The default mode for the docker-compose up command is, as we've seen, to start the containers in the foreground. However, we can also run docker-compose up with the -d option, which says to detach the containers from the terminal to run in the background.

We're getting closer to our end goal. In this section, we learned how to take the Docker containers we've designed and create a system that can be easily brought up and down as a unit by running the docker-compose command.

While preparing to deploy this to Docker Swarm on AWS EC2, a horizontal scaling issue was found, which we can fix on our laptop. It is fairly easy with Docker Compose files to test multiple svc-notes instances to see whether we can scale Notes for higher traffic loads. Let's take a look at that before deploying to the swarm.

# Using Redis for scaling the Notes application stack

In the previous section, we learned how to use Docker Compose to manage the Notes application stack. Looking ahead, we can see the potential need to use multiple instances of the Notes container when we deploy to Docker Swarm on AWS EC2. In this section, we will make a small modification to the Docker Compose file for an ad hoc test with multiple Notes containers. This test will show us a couple of problems. Among the available solutions are two packages that fix both problems by installing a Redis instance.

A common tactic for handling high traffic loads is to deploy multiple service instances as needed. This is called horizontal scaling, where we deploy multiple instances of a service to multiple servers. What we'll do in this section is learn a little about horizontal scaling in Docker by starting two Notes instances to see how it behaves.

As it currently exists, Notes stores some data—the session data—on the local disk space. As orchestrators such as Docker Swarm, ECS, and Kubernetes scale containers up and down, containers are constantly created and destroyed or moved from one host to another. This is done in the name of handling the traffic while optimizing the load on the available servers. In this case, whatever active data we're storing on a local disk will be lost. Losing the session data means users will be randomly logged out. The users will be rightfully upset and will then send us support requests asking what's wrong and whether we have even tested this thing!

In this section, we will learn that Notes does not behave well when we have multiple instances of svc-notes. To address this problem, we will add a Redis container to the Docker Compose setup and configure Notes to use Redis to solve the two problems that we have discovered. This will ensure that the session data is shared between multiple Notes instances via a Redis server.

Let's get started by performing a little ad hoc testing to better understand the problem.

## Testing session management with multiple Notes service instances

We can easily verify whether Notes properly handles session data if there are multiple svc-notes instances. With a small modification to compose-local/docker-compose.yml, we can start two svc-notes instances, or more. They'll be on separate TCP ports, but it will let us see how Notes behaves with multiple instances of the Notes service.

Create a new service, svc-notes-2, by duplicating the svc-notes declaration. The only thing to change is the container name, which should be svc-notes-2, and the published port, which should be port 3020.

For example, add the following to compose-local/docker-compose.yml:
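
(A sketch of that duplicated service definition—everything mirrors svc-notes except the container name and port, as described below:)

```yaml
svc-notes-2:
    build: ../notes
    container_name: svc-notes-2
    depends_on:
        - db-notes
    networks:
        - frontnet
        - svcnet
    environment:
        PORT: "3020"
    ports:
        - "3020:3020"
    restart: always
```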


这是我们刚刚描述的`svc-notes-2`容器的服务定义。因为我们设置了`PORT`变量,所以容器将在端口`3020`上监听,这也是在`ports`属性中宣传的端口。

与以前一样,当我们快速重新配置网络配置时,注意到只需对 Docker Compose 文件进行简单编辑就足以改变事物。

然后,按照以下步骤重新启动 Notes 堆栈:

In this case, there was no source code change, only a configuration change. Therefore, the containers do not need to be rebuilt, and we can simply relaunch with the new configuration.

That will give us two Notes containers on different ports. Each is configured as normal; for example, they connect to the same user authentication service. Using two browser windows, visit both at their respective port numbers. You'll be able to log in with one browser window, but you'll encounter the following situation:

The browser window on port 3020 is logged out, while the window open to port 3000 is logged in. Remember that port 3020 is svc-notes-2, while port 3000 is svc-notes. However, as you use the two windows, you'll observe some flaky behavior with regard to staying logged in.

The issue is that the session data is not shared between svc-notes and svc-notes-2. Instead, the session data is in files stored within each container.

We've identified a problem whereby keeping the session data inside the container makes it impossible to share session data across all instances of the Notes service. To fix this, we need a session store that shares the session data across processes.

## Storing Express/Passport session data in a Redis server

Looking back, we saw that we might have multiple instances of svc-notes deployed on Docker Swarm. To test this, we created a second instance, svc-notes-2, and found that user sessions were not maintained between the two Notes instances. This told us that we must store session data in a shared data storage system.

There are several choices when it comes to storing sessions. While it is tempting to use the express-session-sequelize package, because we're already using Sequelize to manage a database, we have another issue to solve that requires the use of Redis. We'll discuss this other issue later.

For a list of Express session stores, go to expressjs.com/en/resources/middleware/session.html#compatible-session-stores.

Redis is a widely used key-value data store that is known for being very fast. It is also very easy to install and use. We won't have to learn anything about Redis, either.

Several steps are required in order to set up Redis:

  1. In compose-local/docker-compose.yml, add the following definition to the services section:
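
(A sketch of that service definition; the image tag and the networks it joins are assumptions, while the container name redis matches the text below:)

```yaml
redis:
    image: "redis:5.0"
    container_name: redis
    networks:
        - frontnet
        - svcnet
    restart: always
```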

这在一个名为`redis`的容器中设置了一个 Redis 服务器。这意味着想要使用 Redis 的其他服务将在名为`redis`的主机上访问它。

对于您定义的任何`svc-notes`服务(`svc-notes`和`svc-notes-2`),我们现在必须告诉 Notes 应用程序在哪里找到 Redis 服务器。我们可以通过使用环境变量来实现这一点。

1.  在`compose-local/docker-compose.yml`中,向任何此类服务添加以下环境变量声明:

Add this to both the svc-notes and svc-notes-2 service declarations. This passes the Redis hostname to the Notes service.

  1. Next, install the package:
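
(In the notes directory, that is—package versions are left to npm's defaults:)

```bash
$ npm install redis connect-redis --save
```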

这将安装所需的软件包。`redis`软件包是用于从 Node.js 使用 Redis 的客户端,而`connect-redis`软件包是 Redis 的 Express 会话存储。

1.  我们需要更改`app.mjs`中的初始化,以使用`connect-redis`包来存储会话数据:
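
(改动大致如下,仅作示意;这里假设此前章节使用 `session-file-store` 作为基于文件的会话存储,变量名以现有 `app.mjs` 为准:)

```javascript
import session from 'express-session';
import sessionFileStore from 'session-file-store';
import ConnectRedis from 'connect-redis';
import redis from 'redis';

const RedisStore = ConnectRedis(session);
const FileStore = sessionFileStore(session);

// 如果设置了 REDIS_ENDPOINT,则把会话数据存入 Redis;
// 否则回退到基于文件的会话存储。
export const sessionStore = process.env.REDIS_ENDPOINT
    ? new RedisStore({
          client: redis.createClient({ host: process.env.REDIS_ENDPOINT })
      })
    : new FileStore({ path: process.env.NOTES_SESSION_DIR || 'sessions' });
```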

This brings in the Redis-based session store provided by connect-redis.

The configuration for these packages is taken directly from the relevant documentation.

For connect-redis, refer to www.npmjs.com/package/connect-redis. For redis, refer to github.com/NodeRedis/node-redis.

This imports the two packages and then configures the connect-redis package to use the redis package. We consulted the REDIS_ENDPOINT environment variable to configure the redis client object. The result landed in the same sessionStore variable we used previously. Therefore, no other change is required in app.mjs.

If no Redis endpoint is specified, we instead revert to the file-based session store. We might not always deploy Notes in a context where we can run Redis; for example, while developing on our laptop. Therefore, we require the option of not using Redis, and, at the moment, the choice looks to be between using Redis or the filesystem to store session data.

With these changes, we can relaunch the Notes application stack. It might help to relaunch the stack using the following command:


由于源文件发生了更改,需要重新构建容器。这些选项确保了这一点。

现在我们将能够连接到`http://localhost:3000`(`svc-notes`)上的 Notes 服务和`http://localhost:3020`(`svc-notes-2`)上的服务,并且它将处理两个服务上的登录会话。

然而,还应该注意另一个问题,即实时通知在两个服务器之间没有发送。要看到这一点,设置四个浏览器窗口,两个用于每个服务器。将它们全部导航到相同的笔记。然后,添加和删除一些评论。只有连接到相同服务器的浏览器窗口才会动态显示评论的更改。连接到另一个服务器的浏览器窗口不会。

这是第二个水平扩展问题。幸运的是,它的解决方案也涉及使用 Redis。

## 使用 Redis 分发 Socket.IO 消息

在测试多个`svc-notes`容器时,我们发现登录/注销不可靠。我们通过安装基于 Redis 的会话存储来解决了这个问题,以便将会话数据存储在可以被多个容器访问的地方。但我们也注意到另一个问题:基于 Socket.IO 的消息传递并不能可靠地在所有浏览器窗口中引发更新。

请记住,我们希望在浏览器中发生的更新是由对`SQNotes`或`SQMessages`表的更新触发的。更新任一表时由服务器进行更新时发出的事件。发生在一个服务容器中的更新(比如`svc-notes-2`)将从该容器发出一个事件,但不会从另一个容器(比如`svc-notes`)发出。没有机制让其他容器知道它们应该发出这样的事件。

Socket.IO 文档谈到了这种情况:

[`socket.io/docs/using-multiple-nodes/`](https://socket.io/docs/using-multiple-nodes/)

Socket.IO 团队提供了`socket.io-redis`包作为解决这个问题的方案。它确保通过 Socket.IO 由任何服务器发出的事件将传递到其他服务器,以便它们也可以发出这些事件。

由于我们已经安装了 Redis 服务器,我们只需要按照说明安装包并进行配置。再次强调,我们不需要学习有关 Redis 的任何内容:

This installs the socket.io-redis package.

Then, we configure it in app.mjs, as follows:
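
(A sketch of that change—only the adapter lines are new; the io object and the server variable follow the existing app.mjs, and the port number is Redis's default:)

```javascript
import { default as socketio } from 'socket.io';
import redisIO from 'socket.io-redis';

export const io = socketio(server);

// Attach the socket.io-redis adapter only when a Redis endpoint is configured
if (process.env.REDIS_ENDPOINT) {
    io.adapter(redisIO({ host: process.env.REDIS_ENDPOINT, port: 6379 }));
}
```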


唯一的变化是添加粗体字中的行。`socket.io-redis`包是 Socket.IO 团队称之为适配器的东西。通过使用`io.adapter`调用,可以将适配器添加到 Socket.IO 中。

只有在指定了 Redis 端点时,我们才连接这个适配器。与以前一样,这是为了需要时可以在没有 Redis 的情况下运行 Notes。

不需要其他任何东西。如果重新启动 Notes 应用程序堆栈,现在将在连接到 Notes 服务的每个实例的每个浏览器窗口中接收更新。

在这一部分,我们提前考虑了部署到云托管服务的情况。知道我们可能想要实现多个 Notes 容器,我们在笔记本上测试了这种情况,并发现了一些问题。通过安装 Redis 服务器并添加一些包,这些问题很容易解决。

我们准备完成本章,但在此之前有一项任务要处理。`svc-notes-2`容器对于临时测试很有用,但不是部署多个 Notes 实例的正确方式。因此,在`compose-local/docker-compose.yml`中,注释掉`svc-notes-2`的定义。

这让我们对一个广泛使用的新工具——Redis 有了宝贵的了解。我们的应用现在似乎也已经准备好部署。我们将在下一章处理这个问题。

# 总结

在本章中,我们迈出了一个巨大的步伐,朝着在云托管平台上部署 Notes 的愿景迈进。Docker 容器在云托管系统上被广泛用于应用程序部署。即使我们最终不使用 Docker Compose 文件,我们仍然可以进行部署,并且我们已经解决了如何将 Notes 堆栈的每个方面都 Docker 化。

在本章中,我们不仅学习了如何为 Node.js 应用程序创建 Docker 镜像,还学习了如何启动包括 Web 应用程序在内的一整套服务系统。我们了解到,Web 应用程序不仅涉及应用程序代码,还涉及数据库、我们使用的框架,甚至其他服务,比如 Redis。

为此,我们学习了如何创建自己的 Docker 容器以及如何使用第三方容器。我们学习了如何使用`docker run`和 Docker Compose 启动容器。我们学习了如何使用 Dockerfile 构建自定义 Docker 容器,以及如何自定义第三方容器。

为了连接容器,我们学习了关于 Docker 桥接网络。这在单主机 Docker 安装中非常有用,它是一个私有通信通道,容器可以在其中找到彼此。作为一个私有通道,桥接网络相对安全,可以让我们安全地将服务绑定在一起。我们有机会尝试 Docker 内部的不同网络架构,并探索每种架构的安全影响。我们了解到 Docker 提供了一个在主机系统上安全部署持久服务的绝佳方式。

展望将 Notes 部署到云托管服务的任务,我们对 Notes 服务的多个实例进行了一些临时测试。这凸显了多个实例可能出现的一些问题,我们通过将 Redis 添加到应用程序堆栈中来解决了这些问题。

这使我们全面了解了如何准备 Node.js 服务以在云托管提供商上部署。请记住,我们的目标是将 Notes 应用程序作为 Docker 容器部署到 AWS EC2 上,作为云部署的一个示例。在本章中,我们探讨了 Docker 化 Node.js 应用程序堆栈的不同方面,为我们提供了在 Docker 上部署服务的坚实基础。我们现在已经准备好将这个应用程序部署到公共互联网上的服务器上。

在下一章中,我们将学习两种非常重要的技术。第一种是**Docker Swarm**,它是一个与 Docker 捆绑在一起的 Docker 编排器。我们将学习如何在 AWS EC2 基础设施上构建的 Swarm 中将我们的 Docker 堆栈部署为服务。我们将学习的第二种技术是 Terraform,它是一种用于描述云托管系统上服务配置的开源工具。我们将使用它来描述 Notes 应用程序堆栈的 AWS EC2 配置。


# 使用 Terraform 将 Docker Swarm 部署到 AWS EC2

到目前为止,在本书中,我们已经创建了一个基于 Node.js 的应用程序堆栈,包括两个 Node.js 微服务、一对 MySQL 数据库和一个 Redis 实例。在上一章中,我们学习了如何使用 Docker 轻松启动这些服务,打算在云托管平台上这样做。Docker 被广泛用于部署我们这样的服务,对于在公共互联网上部署 Docker,我们有很多可用的选项。

由于 Amazon Web Services(AWS)是一个成熟且功能丰富的云托管平台,我们选择在那里部署。在 AWS 上有许多可用于托管 Notes 的选项。我们在第十一章《使用 Docker 部署 Node.js 微服务》中的工作中,最直接的路径是在 AWS 上创建一个 Docker Swarm 集群。这使我们能够直接重用我们创建的 Docker compose 文件。

Docker Swarm 是可用的 Docker 编排系统之一。这些系统管理一个或多个 Docker 主机系统上的一组 Docker 容器。换句话说,构建一个 Swarm 需要为一个或多个服务器系统进行配置,安装 Docker Engine,并启用 Swarm 模式。Docker Swarm 内置于 Docker Engine 中,只需几个命令即可将这些服务器加入到 Swarm 中。然后,我们可以将基于 Docker 的服务部署到 Swarm 中,Swarm 会在服务器系统之间分发容器,监视每个容器,重新启动任何崩溃的容器等。

Docker Swarm 可以在具有多个 Docker 主机系统的任何情况下使用。它不受 AWS 的限制,因为我们可以从世界各地的数百家 Web 托管提供商那里租用合适的服务器。它足够轻量级,以至于您甚至可以在笔记本电脑上使用虚拟机实例(Multipass、VirtualBox 等)来尝试 Docker Swarm。

在本章中,我们将使用一组 AWS Elastic Compute Cloud(EC2)实例。EC2 是 AWS 的虚拟专用服务器(VPS)的等价物,我们可以从 Web 托管提供商那里租用。EC2 实例将部署在 AWS 虚拟私有云(VPC)中,以及我们将在其上实施之前概述的部署架构的网络基础设施。

让我们谈谈成本,因为 AWS 可能成本高昂。AWS 提供了所谓的免费层,对于某些服务,只要保持在一定阈值以下,成本就为零。在本章中,我们将努力保持在免费层内,除了我们将有三个 EC2 实例部署一段时间,这超出了 EC2 使用的免费层。如果您对成本敏感,可以通过在不需要时销毁 EC2 实例来将其最小化。我们将在稍后讨论如何做到这一点。

本章将涵盖以下主题:

+   注册 AWS 并配置 AWS 命令行界面(CLI)

+   要部署的 AWS 基础设施概述

+   使用 Terraform 创建 AWS 基础设施

+   在 AWS EC2 上设置 Docker Swarm 集群

+   为 Notes Docker 镜像设置 Elastic Container Registry(ECR)存储库

+   为部署到 Docker Swarm 创建 Docker 堆栈文件

+   为完整的 Docker Swarm 配置 EC2 实例

+   将 Notes 堆栈文件部署到 Swarm

在本章中,您将学到很多东西,从如何开始使用 AWS 管理控制台,设置 AWS 上的身份和访问管理(IAM)用户,到如何设置 AWS 命令行工具。由于 AWS 平台如此庞大,重要的是要对其内容和我们在本章中将使用的功能有一个概述。然后,我们将学习 Terraform,这是一种在各种云平台上配置服务的主要工具。我们将学习如何使用它来配置 AWS 资源,如 VPC、相关的网络基础设施,以及如何配置 EC2 实例。接下来,我们将学习 Docker Swarm,这是内置在 Docker 中的编排系统,以及如何设置一个 Swarm,以及如何在 Swarm 中部署应用程序。

为此,我们将学习 Docker 镜像注册表、AWS 弹性容器注册表(ECR)、如何将镜像推送到 Docker 注册表,以及如何在 Docker 应用程序堆栈中使用来自私有注册表的镜像。最后,我们将学习创建 Docker 堆栈文件,该文件允许您描述要在群集中部署的 Docker 服务。

让我们开始吧。

# 注册 AWS 并配置 AWS CLI

要使用 AWS 服务,当然必须拥有 AWS 账户。AWS 账户是我们向 AWS 进行身份验证的方式,也是 AWS 向我们收费的方式。

首先,访问[`aws.amazon.com`](https://aws.amazon.com)并注册一个账户。

Amazon 免费套餐是一种零成本体验 AWS 服务的方式:[`aws.amazon.com/free/`](https://aws.amazon.com/free/)。文档可在[`docs.aws.amazon.com`](https://docs.aws.amazon.com)找到。

AWS 有两种我们可以使用的账户,如下:

+   **根账户**是我们注册 AWS 账户时创建的账户。根账户对 AWS 服务拥有完全访问权限。

+   IAM 用户账户是您可以在根账户中创建的权限较低的账户。根账户的所有者创建 IAM 账户,并为每个 IAM 账户分配权限范围。

直接使用根账户是不好的行为,因为根账户对 AWS 资源拥有完全访问权限。如果根账户的凭据泄露给公众,可能会对您的业务造成重大损害。如果 IAM 用户账户的凭据泄露,损害仅限于该用户账户控制的资源以及该账户被分配的权限。此外,IAM 用户凭据可以随时被撤销,然后生成新的凭据,防止持有泄露凭据的任何人进一步造成损害。另一个安全措施是为所有账户启用多因素身份验证(MFA)。

如果您还没有这样做,请前往上述链接之一的 AWS 网站并注册一个账户。请记住,以这种方式创建的账户是您的 AWS 根账户。

我们的第一步是熟悉 AWS 管理控制台。

## 找到 AWS 账户的方法

由于 AWS 平台上有如此多的服务,看起来就像是一个迷宫。但是,稍微了解一下,我们就能找到自己的路。

首先,看一下窗口顶部的导航栏。右侧有三个下拉菜单。第一个是您的账户名称,并有与账户相关的选项。第二个可以让您选择 AWS 区域的默认设置。AWS 将其基础设施划分为*区域*,基本上意味着 AWS 数据中心所在的世界地区。第三个可以让您联系 AWS 支持。

左侧是一个标有“服务”的下拉菜单。这会显示所有 AWS 服务的列表。由于服务列表很长,AWS 为您提供了一个搜索框。只需输入服务的名称,它就会显示出来。AWS 管理控制台首页也有这个搜索框。

在我们找到自己的路的同时,让我们记录根帐户的帐户号。我们以后会需要这些信息。在帐户下拉菜单中,选择“我的帐户”。帐户 ID 在那里,以及您的帐户名称。

建议在 AWS 根帐户上设置 MFA。MFA 的意思就是用多种方式对用户进行身份验证。例如,服务除了要求输入密码外,还可能要求输入通过短信发送的验证码作为第二种身份验证方式。理论上,如果服务既验证了我们输入的密码正确,又确认我们携带着平时使用的那部手机,那么它就能更加确定我们的身份。

要在根帐户上设置 MFA,请转到“我的安全凭据”仪表板。在 AWS 管理控制台菜单栏中可以找到指向该仪表板的链接。这将带您到一个页面,控制与 AWS 的所有形式的身份验证。从那里,您可以按照 AWS 网站上的说明进行操作。有几种可能的工具可用于实施 MFA。最简单的工具是在智能手机上使用 Google Authenticator 应用程序。设置 MFA 后,每次登录到根帐户都需要从验证器应用程序输入代码。

到目前为止,我们已经处理了在线 AWS 管理控制台。我们真正的目标是使用命令行工具,为此,我们需要在笔记本电脑上安装和配置 AWS CLI。让我们接下来处理这个问题。

## 使用 AWS 身份验证凭据设置 AWS CLI

AWS CLI 工具是通过 AWS 网站提供的下载。在幕后,它使用 AWS 应用程序编程接口(API),并且还要求我们下载和安装身份验证令牌。

一旦您有了帐户,我们就可以准备 AWS CLI 工具。

AWS CLI 使您能够从笔记本电脑的命令行与 AWS 服务进行交互。它具有与每个 AWS 服务相关的广泛的子命令集。

安装 AWS CLI 的说明可以在此处找到:[`docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html`](https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html)。

配置 AWS CLI 的说明可以在此处找到:[`docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html`](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html)。

一旦在笔记本电脑上安装了 AWS CLI 工具,我们必须配置所谓的*配置文件*。

AWS 提供了支持广泛的工具来操作 AWS 基础架构的 AWS API。AWS CLI 工具使用该 API,第三方工具如 Terraform 也使用该 API。使用 API 需要访问令牌,因此 AWS CLI 和 Terraform 都需要相同的令牌。

要获取 AWS API 访问令牌,请转到“我的安全凭据”仪表板,然后单击“访问密钥”选项卡。

单击此按钮,将显示两个安全令牌,即访问密钥 ID 和秘密访问密钥。您将有机会下载包含这些密钥的逗号分隔值(CSV)文件。CSV 文件如下所示:

You will receive a file that looks like this. These are the security tokens that identify your account. Don't worry, as no secrets are being leaked in this case. Those particular credentials have been revoked. The good news is that you can revoke these credentials at any time and download new credentials.

Now that we have the credentials file, we can configure an AWS CLI profile.

The aws configure command, as the name implies, takes care of configuring your AWS CLI environment. This asks a series of questions, the first two of which are those keys. The interaction looks like this:
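
(The interaction is roughly as follows; the profile name is an example, and you paste the two keys you downloaded at the first two prompts:)

```bash
$ aws configure --profile notes-app-root
AWS Access Key ID [None]: AKIAxxxxxxxxxxxxxxxx
AWS Secret Access Key [None]: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Default region name [None]: us-west-2
Default output format [None]: json
```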


对于前两个提示,粘贴您下载的密钥。区域名称提示选择您的服务将在其中提供服务的默认 Amazon AWS 数据中心。AWS 在世界各地都有设施,每个地点都有一个代码名称,例如`us-west-2`(位于俄勒冈州)。最后一个提示询问您希望 AWS CLI 如何向您呈现信息。

对于区域代码,在 AWS 控制台中,查看区域下拉菜单。这会显示可用的区域,描述区域和每个区域的区域代码。对于这个项目,最好使用靠近您的 AWS 区域。对于生产部署,最好使用最接近您的受众的区域。可以配置跨多个区域工作的部署,以便您可以为多个地区的客户提供服务,但这种实现远远超出了我们在本书中涵盖的范围。

通过使用`--profile`选项,我们确保创建了一个命名的配置文件。如果我们省略该选项,我们将创建一个名为`default`的配置文件。对于任何`aws`命令,`--profile`选项选择要使用的配置文件。顾名思义,默认配置文件是如果我们省略`--profile`选项时使用的配置文件。

在使用 AWS 身份时,最好始终明确。一些指南建议根本不创建默认的 AWS 配置文件,而是始终使用`--profile`选项以确保始终使用正确的 AWS 配置文件。

验证 AWS 配置的一种简单方法是运行以下命令:
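
(例如下面这样;配置文件名 `notes-app-root` 仅为示例,错误消息的具体措辞可能随 AWS CLI 版本略有不同:)

```bash
$ aws s3 ls
Unable to locate credentials. You can configure credentials by running "aws configure".

$ aws s3 ls --profile notes-app-root

$ AWS_PROFILE=notes-app-root aws s3 ls
```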

The AWS Simple Storage Service (S3) is a cloud file-storage system, and we are running these commands solely to verify the correct installation of the credentials.  The ls command lists any files you have stored in S3. We don't care about the files that may or may not be in an S3 bucket, but whether this executes without error.

The first command shows us that execution with no --profile option, and no default profile, produces an error. If there were a default AWS profile, that would have been used. However, we did not create a default profile, so therefore no profile was available and we got an error. The second shows the same command with an explicitly named profile. The third shows the AWS_PROFILE environment variable being used to name the profile to be deployed.

Using the environment variables supported by the AWS CLI tool, such as AWS_PROFILE, lets us skip using command-line options such as --profile while still being explicit about which profile to use.

As we said earlier, it is important that we interact with AWS via an IAM user, and therefore we must learn how to create an IAM user account. Let's do that next.

## Creating an IAM user account, groups, and roles

We could do everything in this chapter using our root account but, as we said, that's bad form. Instead, it is recommended to create a second user—an IAM user—and give it only the permissions required by that user.

To get to the IAM dashboard, click on Services in the navigation bar, and enter IAM. IAM stands for Identity and Access Management. Also, the My Security Credentials dashboard is part of the IAM service, so we are probably already in the IAM area.

The first task is to create a role. In AWS, roles are used to associate privileges with a user account. You can create roles with extremely limited privileges or an extremely broad range of privileges.

In the IAM dashboard, you'll find a navigation menu on the left. It has sections for users, groups, roles, and other identity management topics. Click on the Roles choice. Then, in the Roles area, click on Create Role. Perform the following steps:

  1. Under Type of trusted identity, select Another AWS account. Enter the account ID, which you will have recorded earlier while familiarizing yourself with the AWS account. Then, click on Next.
  2. On the next page, we select the permissions for this role. For our purpose, select AdministratorAccess, a privilege that grants full access to the AWS account. Then, click on Next.
  3. On the next page, you can add tags to the role. We don't need to do this, so click Next.
  4. On the last page, we give a name to the role. Enter admin because this role has administrator permissions. Click on Create Role.

You'll see that the role, admin, is now listed in the Role dashboard. Click on admin and you will be taken to a page where you can customize the role further. On this page, notice the characteristic named Role ARN. Record this Amazon Resource Name (ARN) for future reference.

ARNs are identifiers used within AWS. You can reliably use this ARN in any area of AWS where we can specify a role. ARNs are used with almost every AWS resource.

Next, we have to create an administrator group. In IAM, users are assigned to groups as a way of passing roles and other attributes to a group of IAM user accounts. To do this, perform the following steps:

  1. In the left-hand navigation menu, click on Group, and then, in the group dashboard, click on Create Group.
  2. For the group name, enter Administrators.
  3. Skip the Attach Policy page, click Next Step, and then, on the Review page, simply click Create Group.
  4. This creates a group with no permissions and directs you back to the group dashboard.
  5. Click on the Administrators group, and you'll be taken to the overview page. Record the ARN for the group.
  6. Click on Permissions to open that tab, and then click on the Inline policies section header. We will be creating an inline policy, so click on the Click here link.
  7. Click on Custom Policy, and you'll be taken to the policy editor.
  8. For the policy name, enter AssumeAdminRole. Below that is an area where we enter a block of JavaScript Object Notation (JSON) code describing the policy. Once that's done, click the Apply Policy button.

The policy document to use is as follows:
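
The exact policy text is not reproduced here; the following is a sketch of a typical AssumeRole policy, with a placeholder ARN that you should replace with the one recorded earlier:

```json
{
    "Version": "2012-10-17",
    "Statement": {
        "Effect": "Allow",
        "Action": "sts:AssumeRole",
        "Resource": "arn:aws:iam::ACCOUNT-ID:role/admin"
    }
}
```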


This describes the policy attached to the Administrators group. It grants the group the ability to assume the permissions we specified earlier in the admin role. The Resource field is where we enter the ARN of the admin role created earlier. Make sure to put the entire ARN into this field.

Navigate back to the Groups area and click on Create Group again. We will create a group named `NotesDeveloper` for use by developers assigned to the Notes project. It will give those user accounts some additional privileges. Perform the following steps:

1.  Enter `NotesDeveloper` as the group name. Then, click Next Step.

2.  On the Attach Policy page, there is a long list of policies to consider; for example, `AmazonRDSFullAccess`, `AmazonEC2FullAccess`, `IAMFullAccess`, `AmazonEC2ContainerRegistryFullAccess`, `AmazonS3FullAccess`, `AdministratorAccess`, and `AmazonElasticFileSystemFullAccess`.

3.  Then, click Next Step and, if everything looks correct on the review page, click **Create Group**.

These policies cover the services required to complete this chapter. When an AWS error message says a user has insufficient privileges to access a feature, it does a good job of telling you which privilege is required. If it is a privilege the user needs, come back to this group and add it.

In the left-hand navigation, click on Users, and then on Create user. This starts the steps involved in creating an IAM user, as follows:

1.  For the username, enter `notes-app`, since this user will manage all resources related to the Notes application. For the access type, check both Programmatic access and AWS Management Console access, because we will use both. The first grants the ability to use the AWS CLI tools, while the second covers the AWS console. Then, click Next.

2.  For permissions, select Add user to group, and select both the Administrators and NotesDeveloper groups. This adds the user to the groups you select. Then, click Next.

3.  There is nothing else to do, so keep clicking Next until you reach the review page. If you are satisfied, click Create user.

You will be taken to a page announcing success. On this page, AWS makes available the access tokens (also known as security credentials) that can be used with this account. Download those credentials before doing anything else. You can revoke these credentials at any time and generate new access tokens.

Your newly created user is now listed in the Users section. Click on that entry, because there are a couple of data items to record. The first, obviously, is the ARN for the user account. The second is a **Uniform Resource Locator** (**URL**) you can use to sign in to AWS as this user. For that URL, click on the Security credentials tab, and the sign-in link will be there.

It is also recommended to set up MFA for the IAM account. The My Security Credentials choice in the AWS taskbar takes you to a screen containing a button for setting up MFA. Refer back to the earlier discussion of setting up MFA for the root account.

To test the new user account, sign out, then go to the sign-in URL. Enter the username and password for the account, and sign in.

Before finishing this section, return to the command line and run the following command:
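
Presumably this configures a second profile using the new IAM user's credentials, along these lines:

```
$ aws configure --profile notes-app
```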

This will create another AWS CLI profile, this time for the notes-app IAM user.

Using the AWS CLI, we can list the users in our account, as follows:
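
A plausible form of that command, using the new profile:

```
$ aws iam list-users --profile notes-app
```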


This is another way to verify that the AWS CLI is correctly installed. The command queries user information from AWS, and if it executes without error, the CLI has been configured correctly.

AWS CLI commands follow a similar structure: the `aws` command, followed by a series of sub-commands and then options. In this case, the sub-commands are `iam` and `list-users`. The AWS website has extensive online documentation for the AWS CLI tools.

### Creating an EC2 key pair

Since we will be using EC2 instances in this exercise, we need an EC2 key pair. This is an encrypted certificate that serves the same purpose as the regular **Secure Shell** (**SSH**) key we use for passwordless login to a server. In fact, the key-pair file serves exactly that purpose, allowing passwordless SSH login to EC2 instances. Perform the following steps:

1.  Log in to the AWS Management Console and select the region you are using.

2.  Next, navigate to the EC2 dashboard, for example, by entering `EC2` in the search box.

3.  In the navigation sidebar, there is a section named Network & Security containing a link named Key Pairs.

4.  Click on that link. In the top-right corner is a button labeled Create Key Pair. Click on this button, and you will be taken to the following screen:

![](https://gitee.com/OpenDocCN/freelearn-node-zh/raw/master/docs/node-webdev-5e/img/dfe865a3-6172-4b2b-ad97-760498cd6af6.png)

5.  Enter the desired name for the key pair. Depending on the SSH client you use, choose the `.pem` (for the `ssh` command) or `.ppk` (for PuTTY) key-pair file format.

6.  Click Create Key Pair, and you will be returned to the dashboard, with the key-pair file downloading in your browser.

7.  After downloading the key-pair file, it must be made read-only, which you can do with the following command (see the example after this list):
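
Assuming the pair was named `notes-app-key-pair` and the file landed in your downloads folder, something like this:

```
$ chmod 400 ~/Downloads/notes-app-key-pair.pem
```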

Substitute here the pathname where your browser downloaded the file.

For now, just make sure this file is correctly stored somewhere. When we deploy EC2 instances, we'll talk more about how to use it.

We have familiarized ourselves with the AWS Management Console, and created for ourselves an IAM user account. We have proved that we can log in to the console using the sign-in URL. While doing that, we copied down the AWS access credentials for the account.

We have completed the setup of the AWS command-line tools and user accounts. The next step is to set up Terraform.

An overview of the AWS infrastructure to be deployed

AWS is a complex platform with dozens of services available to us. This project will touch on only the part required to deploy Notes as a Docker swarm on EC2 instances. In this section, let's talk about the infrastructure and AWS services we'll put to use.

An AWS VPC is what it sounds like—namely, a service within AWS where you build your own private cloud service infrastructure. The AWS team designed the VPC service to look like something that you would construct in your own data center, but implemented on the AWS infrastructure. This means that the VPC is a container to which everything else we'll discuss is attached.

The AWS infrastructure is spread across the globe into what AWS calls regions. For example, us-west-1 refers to Northern California, us-west-2 refers to Oregon, and eu-central-1 refers to Frankfurt. For production deployment, it is recommended to use a region nearer your customers, but for experimentation, it is good to use the region closest to you. Within each region, AWS further subdivides its infrastructure into availability zones (a.k.a. AZs). An AZ might correspond to a specific building at an AWS data center site, but AWS often recommends that we deploy infrastructure to multiple AZs for reliability. In case one AZ goes down, the service can continue in the AZs that are running.

When we allocate a VPC, we specify an address range for resources deployed within the VPC. The address range is specified with a Classless Inter-Domain Routing (CIDR) specifier. These are written as 10.3.0.0/16 or 10.3.20.0/24, which means any Internet Protocol version 4 (IPv4) address starting with 10.3 and 10.3.20, respectively.

Every device we attach to a VPC will be attached to a subnet, a virtual object similar to an Ethernet segment. Each subnet will be assigned a CIDR from the main range. A VPC assigned the 10.3.0.0/16 CIDR might have a subnet with a CIDR of 10.3.20.0/24. Devices attached to the subnet will have an IP address assigned within the range indicated by the CIDR for the subnet.

EC2 is AWS's answer to a VPS that you might rent from any web hosting provider. An EC2 instance is a virtual computer in the same sense that Multipass or VirtualBox lets you create a virtual computer on your laptop. Each EC2 instance is assigned a central processing unit (CPU), memory, disk capacity, and at least one network interface. Hence, an EC2 instance is attached to a subnet and is assigned an IP address from the subnet's assigned range.

By default, a device attached to a subnet has no internet access. The internet gateway and network address translation (NAT) gateway resources on AWS play a critical role in connecting resources attached to a VPC via the internet. Both are what is known as an internet router, meaning that both handle the routing of internet traffic from one network to another. Because a VPC contains a private network, these gateways handle traffic between that network and the public internet, as follows:

  • Internet gateway: This handles two-way routing, allowing a resource allocated in a VPC to be reachable from the public internet. An internet gateway allows external traffic to enter the VPC, and it also allows resources in the VPC to access resources on the public internet.

  • NAT gateway: This handles one-way routing, meaning that resources on the VPC will be able to access resources on the public internet, but does not allow external traffic to enter the VPC. To understand the NAT gateway, think about a common home Wi-Fi router because they also contain a NAT gateway. Such a gateway will manage a local IP address range such as 192.168.0.0/16, while the internet service provider (ISP) might assign a public IP address such as 107.123.42.231 to the connection. Local IP addresses, such as 192.168.1.45, will be assigned to devices connecting to the NAT gateway. Those local IP addresses do not appear in packets sent to the public internet. Instead, the NAT gateway translates the IP addresses to the public IP address of the gateway, and then when reply packets arrive, it translates the IP address to that of the local device. NAT translates IP addresses from the local network to the IP address of the NAT gateway.

In practical terms, this determines the difference between a private subnet and a public subnet. A public subnet has a routing table that sends traffic for the public internet to an internet gateway, whereas a private subnet sends its public internet traffic to a NAT gateway.

Routing tables describe how to route internet traffic. Inside any internet router, such as an internet gateway or a NAT gateway, is a function that determines how to handle internet packets destined for a location other than the local subnet. The routing function matches the destination address against routing table entries, and each routing table entry says where to forward matching packets.

Attached to each device deployed in a VPC is a security group. A security group is a firewall controlling what kind of internet traffic can enter or leave that device. For example, an EC2 instance might have a web server supporting HTTP (port 80) and HTTPS (port 443) traffic, and the administrator might also require SSH access (port 22) to the instance. The security group would be configured to allow traffic from any IP address on ports 80 and 443 and to allow traffic on port 22 from IP address ranges used by the administrator.

A network access control list (ACL) is another kind of firewall that's attached to subnets. It, too, describes which traffic is allowed to enter or leave the subnet. The security groups and network ACLs are part of the security protections provided by AWS.

If a device connected to a VPC does not seem to work correctly, there might be an error in the configuration of these parts. It's necessary to check the security group attached to the device, and to the NAT gateway or internet gateway, and that the device is connected to the expected subnet, the routing table for the subnet, and any network ACLs.

Using Terraform to create an AWS infrastructure

Terraform is an open source tool for configuring a cloud hosting infrastructure. It uses a declarative language to describe the configuration of cloud services. Through a long list of plugins, called providers, it has support for a variety of cloud services. In this chapter, we'll use Terraform to describe AWS infrastructure deployments.

To install Terraform, download an installer from www.terraform.io/downloads.html.

Alternatively, you will find the Terraform CLI available in many package management systems.

Once installed, you can view the Terraform help with the following command:


Terraform files have a `.tf` extension and use a fairly simple, easy-to-understand declarative syntax. Terraform does not care what filenames you use or in what order you create the files. It simply reads every file with a `.tf` extension and looks for resources to deploy. These files do not contain executable code; they contain declarations. Terraform reads the files, builds a dependency graph, and works out how to realize those declarations on the cloud infrastructure being used.

An example declaration looks like this:
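
A sketch matching the discussion that follows; the attribute values are placeholders:

```hcl
variable "vpc_cidr" {
  default = "10.0.0.0/16"
}

resource "aws_vpc" "main" {
  cidr_block = var.vpc_cidr
}
```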

The first word, resource or variable, is the block type, and in this case, we are declaring a resource and a variable. Within the curly braces are the arguments to the block, and it is helpful to think of these as attributes.

Blocks have labels—in this case, the labels are aws_vpc and main. We can refer to this specific resource elsewhere by joining the labels together as aws_vpc.main. The name, aws_vpc, comes from the AWS provider and refers to VPC elements. In many cases, a block—be it a resource or another kind—will support attributes that can be accessed. For example, the CIDR for this VPC can be accessed as aws_vpc.main.cidr_block.

The general structure is as follows:


Block types include `resource`, which declares something related to the cloud infrastructure, `variable`, which declares a named value, `output`, which declares a result of a module, and several other types.

The structure of the block labels varies by block type. For `resource` blocks, the first block label refers to the type of resource, while the second is the name of a specific instance of that resource.

The types of the arguments also vary by block type. The Terraform documentation has an extensive reference for each variant.

A Terraform module is a directory containing Terraform scripts. When the `terraform` command is run in a directory, it reads every script in that directory to build a tree of objects.

Within a module, we deal with several kinds of values. We have already discussed resources, variables, and outputs. Resources are essentially object values corresponding to something on the cloud hosting platform. Variables can be thought of as inputs to a module, since there are several ways to provide a value for a variable. Output values are, as the name implies, the outputs of a module. When a module is executed, the outputs can be printed on the console, or saved to a file and then used by other modules. Related code can be seen in the following snippet:
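
A sketch of variable and output declarations of the kind described below (the names are illustrative):

```hcl
variable "aws_region" {
  type        = string
  description = "The AWS region in which to deploy the project"
  default     = "us-west-2"
}

output "vpc_arn" {
  value = aws_vpc.main.arn
}
```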

This is what the variable and output declarations look like. Every value has a data type. For variables, we can attach a description to aid in their documentation. The declaration uses the word default rather than value because there are multiple ways (such as Terraform command-line arguments) to specify a value for a variable. Terraform users can override the default value in several ways, such as the --var or --var-file command-line options.

Another type of value is local. Locals exist only within a module because they are neither input values (variables) nor output values, as illustrated in the following code snippet:
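
A sketch of a locals block along the lines the next paragraph describes, assuming the `vpc_cidr` variable shown earlier:

```hcl
locals {
  # Compute per-subnet CIDRs such as 10.0.1.0/24 from the VPC CIDR
  cidr_public1  = cidrsubnet(var.vpc_cidr, 8, 1)
  cidr_private1 = cidrsubnet(var.vpc_cidr, 8, 2)
}
```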


In this case, we define several locals related to the CIDRs of the subnets to be created in the VPC. The `cidrsubnet` function is used to compute subnet masks such as `10.1.1.0/24`.

Another important Terraform feature is provider plugins. Each cloud system supported by Terraform requires a plugin module defining the specifics of how Terraform is used with that platform.

One effect of provider plugins is that Terraform does not try to be platform-agnostic. Instead, the declarable resources for a given platform are unique to that platform. You cannot directly reuse Terraform scripts written for AWS on another system, such as Azure, because the resource objects are all different. What you can reuse is the knowledge of how Terraform handles cloud resource declarations.

Another task is to look for a Terraform extension for your programming editor. Several editors support Terraform, with syntax coloring, checking for simple errors, and even code completion.

That's enough theory, though. To really learn this, we need to start using Terraform. In the next section, we will begin by implementing the VPC structure, and then deploy the Notes application stack within it.

## Configuring an AWS VPC with Terraform

An AWS VPC is what the name implies: a service within AWS that houses the cloud services you define. The AWS team designed the VPC service to look somewhat like something you would construct in your own data center, but implemented on the AWS infrastructure.

In this section, we will build a VPC containing a public subnet and a private subnet, an internet gateway, and security group definitions.

In the project workspace, create a directory named `terraform-swarm` that is a sibling of the `notes` and `users` directories.

In that directory, create a file named `main.tf` containing the following:
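
A sketch of what `main.tf` contains, following the description below:

```hcl
provider "aws" {
  profile = "notes-app"
  region  = var.aws_region
}
```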

This says to use the AWS provider plugin. It also configures this script to execute using the named AWS profile. Clearly, the AWS provider plugin requires AWS credential tokens in order to use the AWS API. It knows how to access the credentials file set up by aws configure.

To learn more about configuring the AWS provider plugin, refer to www.terraform.io/docs/providers/aws/index.html.

As shown here, the AWS plugin will look for the AWS credentials file in its default location, and use the notes-app profile name.

In addition, we have specified which AWS region to use. The reference, var.aws_region, is a Terraform variable. We use variables for any value that can legitimately vary. Variables can be easily customized to any value in several ways.

To support the variables, we create a file named variables.tf, starting with this:
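
Something along these lines, assuming the default region used in this chapter:

```hcl
variable "aws_region" {
  default = "us-west-2"
}
```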


The `default` attribute sets a default value for the variable. As we saw earlier, the declaration can also specify the data type and a description for the variable.

With this in place, we can now run our first Terraform command, as follows:
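
The command in question:

```
$ terraform init
```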

This initializes the current directory as a Terraform workspace. You'll see that it creates a directory, .terraform, and a file named terraform.tfstate containing data collected by Terraform. The .tfstate files are what is known as state files. These are in JSON format and store the data Terraform collects from the platform (in this case, AWS) regarding what has been deployed. State files must not be committed to source code repositories because it is possible for sensitive data to end up in those files. Therefore, a .gitignore file listing the state files is recommended.

The instructions say we should run terraform plan, but before we do that, let's declare a few more things.

To declare the VPC and its related infrastructure, let's create a file named vpc.tf. Start with the following content:
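
A sketch of the VPC declaration the following paragraphs describe (the resource name is illustrative):

```hcl
resource "aws_vpc" "notes" {
  cidr_block           = var.vpc_cidr
  enable_dns_support   = true
  enable_dns_hostnames = true

  tags = {
    Name = "${var.project_name}-vpc"
  }
}
```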


This declares the VPC. It will be the container for the infrastructure we are creating.

The `cidr_block` attribute determines the IPv4 address space that will be used for this VPC. The CIDR notation is an internet standard, for example `10.0.0.0/16`. That CIDR covers any IP address beginning with `10.0`.

The `enable_dns_support` and `enable_dns_hostnames` attributes determine whether **Domain Name System** (**DNS**) names are generated for certain resources attached to the VPC. DNS names help one resource find other resources at runtime.

The `tags` attribute is used to attach name/value pairs to resources. The Name tag is used by AWS to set the display name of a resource. Every AWS resource has a computer-generated, user-unfriendly name consisting of a long coded string, and of course we humans need friendlier names. The Name tag is useful here, and the AWS Management Console responds by showing this name in its dashboards.

In `variables.tf`, add the following to support these resource declarations:
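
A plausible pair of declarations for the values used throughout the project:

```hcl
variable "project_name" {
  default = "notes"
}

variable "vpc_cidr" {
  default = "10.0.0.0/16"
}
```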

These values will be used throughout the project. For example, var.project_name will be widely used as the basis for creating name tags for deployed resources.

Add the following to vpc.tf:
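
A sketch of the data block described below:

```hcl
data "aws_availability_zones" "available" {
  state = "available"
}
```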


A `resource` block declares something on the hosting platform (in this case, AWS), while a `data` block retrieves data from the hosting platform. In this case, we are retrieving the list of AZs for the currently selected region. We will use this data later when declaring certain resources.

### Configuring the AWS gateway and subnet resources

Remember that public subnets are associated with an internet gateway, and private subnets with a NAT gateway. This distinction determines the kind of internet access available to devices attached to each subnet.

Create a file named `gw.tf` containing the following:
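
A sketch of the gateway declarations, assuming the VPC and public subnet names used in this chapter's examples:

```hcl
resource "aws_internet_gateway" "igw" {
  vpc_id = aws_vpc.notes.id
  tags   = { Name = "${var.project_name}-IGW" }
}

resource "aws_eip" "gw" {
  vpc        = true
  depends_on = [aws_internet_gateway.igw]
}

resource "aws_nat_gateway" "gw" {
  subnet_id     = aws_subnet.public1.id
  allocation_id = aws_eip.gw.id
  tags          = { Name = "${var.project_name}-NAT" }
}
```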

This declares the internet gateway and the NAT gateway. Remember that internet gateways are used with public subnets, and NAT gateways are used with private subnets.

An Elastic IP (EIP) resource is how a public internet IP address is assigned. Any device that is to be visible to the public must be on a public subnet and have an EIP. Because the NAT gateway faces the public internet, it must have an assigned public IP address and an EIP.

For the subnets, create a file named subnets.tf containing the following:
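
A sketch of the subnet declarations; additional subnets would follow the same pattern:

```hcl
resource "aws_subnet" "public1" {
  vpc_id            = aws_vpc.notes.id
  cidr_block        = var.public1_cidr
  availability_zone = data.aws_availability_zones.available.names[0]
  tags              = { Name = "${var.project_name}-net-public1" }
}

resource "aws_subnet" "private1" {
  vpc_id            = aws_vpc.notes.id
  cidr_block        = var.private1_cidr
  availability_zone = data.aws_availability_zones.available.names[0]
  tags              = { Name = "${var.project_name}-net-private1" }
}
```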


This declares the public and private subnets. Note that these subnets are assigned to specific AZs. It would be easy to extend this to support more subnets by adding subnets named `public2`, `public3`, `private2`, `private3`, and so on. If you do so, it is best to spread those subnets across different AZs. Deploying to multiple AZs is recommended so that, if one AZ goes down, the application keeps running in the AZs that remain up.

That notation with `[0]` is what it looks like: an array. The value `data.aws_availability_zones.available.names` is an array, and adding `[0]` does indeed access the first element of that array, just as you would expect. Arrays are just one of the data structures Terraform provides.

Each subnet has its own CIDR (IP address range), and to support this, we need to list these CIDR assignments in `variables.tf`, as follows:
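
For example, carving the subnet ranges out of the 10.0.0.0/16 VPC CIDR:

```hcl
variable "public1_cidr" {
  default = "10.0.1.0/24"
}

variable "private1_cidr" {
  default = "10.0.3.0/24"
}
```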

These are the CIDRs corresponding to the resources declared earlier.

For these pieces to work together, we need appropriate routing tables to be configured. Create a file named routing.tf containing the following:
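
A sketch of the routing declarations the next paragraphs walk through; resource names are illustrative:

```hcl
# Route public internet traffic from the VPC's main route table to the internet gateway
resource "aws_default_route_table" "main" {
  default_route_table_id = aws_vpc.notes.default_route_table_id
  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.igw.id
  }
}

resource "aws_route_table_association" "public1" {
  subnet_id      = aws_subnet.public1.id
  route_table_id = aws_default_route_table.main.id
}

# The private subnet sends its public internet traffic to the NAT gateway instead
resource "aws_route_table" "private" {
  vpc_id = aws_vpc.notes.id
  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.gw.id
  }
}

resource "aws_route_table_association" "private1" {
  subnet_id      = aws_subnet.private1.id
  route_table_id = aws_route_table.private.id
}
```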


To configure the route table for the public subnet, we modify the main route table attached to the VPC. What we do here is add a rule to that table specifying that public internet traffic is sent to the internet gateway. There is also a route table association declaring that the public subnet uses this route table.

For `aws_route_table.private`, the route table for the private subnet, the declaration says to send public internet traffic to the NAT gateway. In the route table association, this table is used for the private subnet.

Earlier, we said that the difference between a public and a private subnet is whether public internet traffic is sent to an internet gateway or a NAT gateway. These declarations are how that is implemented.

In this section, we declared the VPC, subnets, gateways, and route tables: in other words, the infrastructure within which we will deploy the Docker swarm.

Before attaching the EC2 instances that will house the swarm, let's deploy this to AWS and explore what gets set up.

## Deploying the infrastructure to AWS using Terraform

We have now declared the basic structure of the AWS infrastructure we need: the VPC, subnets, and route tables. Let's deploy it to AWS and use the AWS console to explore what was created.

Earlier, we ran `terraform init` to initialize Terraform in our working directory. When we did so, it suggested that we run the following command:
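
That suggested command is:

```
$ terraform plan
```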

This command scans the Terraform files in the current directory and first determines that everything has the correct syntax, that all the values are known, and so forth. If any problems are encountered, it stops right away with error messages such as the following:


Terraform error messages are usually self-explanatory. In this case, the cause was the decision to use only one public and one private subnet. This code was left over from a two-subnet configuration, so the error referred to stale code that was easy to delete.

Another purpose of `terraform plan` is to build a graph of all the declarations and print a listing. This gives you an idea of what Terraform intends to deploy on the chosen cloud platform. It is therefore your opportunity to review the intended infrastructure and make sure it is what you want.

Once you are satisfied, run the following command:
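
The deployment command is:

```
$ terraform apply
```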

With terraform apply, the report shows the difference between the actual deployed state and the desired state as reflected by the Terraform files. In this case, there is no deployed state, so therefore everything that is in the files will be deployed. In other cases, you might have deployed a system and have made a change, in which case Terraform will work out which changes have to be deployed based on the changes you've made. Once it calculates that, Terraform asks for permission to proceed. Finally, if we have said yes, it will proceed and launch the desired infrastructure.

Once finished, it tells you what happened. One result is the values of the output commands in the scripts. These are both printed on the console and are saved in the backend state file.

To see what was created, let's head to the AWS console and navigate to the VPC area, as follows:

Compare the VPC ID in the screenshot with the one shown in the Terraform output, and you'll see that they match. What's shown here is the main routing table, and the CIDR, and other settings we made in our scripts. Every AWS account has a default VPC that's presumably meant for experiments. It is a better form to create a VPC for each project so that resources for each project are separate from other projects.

The sidebar contains links for further dashboards for subnets, route tables, and other things, and an example dashboard can be seen in the following screenshot:

For example, this is the NAT gateway dashboard showing the one created for this project.

Another way to explore is with the AWS CLI tool. Just because we have Terraform doesn't mean we are prevented from using the CLI. Have a look at the following code block:
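
For example, a command along these lines lists the VPCs visible to the profile:

```
$ aws ec2 describe-vpcs --profile notes-app --region us-west-2
```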


This lists the parameters of the VPC we created.

Remember to either configure the `AWS_PROFILE` environment variable or use `--profile` on the command line.

To list data on the subnets, run the following command:
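
A plausible invocation, substituting the VPC ID printed by Terraform:

```
$ aws ec2 describe-subnets --profile notes-app --region us-west-2 \
      --filters "Name=vpc-id,Values=VPC-ID-HERE"
```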

To focus on the subnets for a given VPC, we use the --filters option, passing in the filter named vpc-id and the VPC ID for which to filter.

Documentation for the AWS CLI can be found at docs.aws.amazon.com/cli/latest/reference/index.html. For documentation relating to the EC2 sub-commands, refer to docs.aws.amazon.com/cli/latest/reference/ec2/index.html.

The AWS CLI tool has an extensive list of sub-commands and options. These are enough to almost guarantee getting lost, so read carefully.

In this section, we learned how to use Terraform to set up the VPC and related infrastructure resources, and we also learned how to navigate both the AWS console and the AWS CLI to explore what had been created.

Our next step is to set up an initial Docker Swarm cluster by deploying an EC2 instance to AWS.

Setting up a Docker Swarm cluster on AWS EC2

What we have set up is essentially a blank slate. AWS has a long list of offerings that could be deployed to the VPC that we've created. What we're looking to do in this section is to set up a single EC2 instance to install Docker, and set up a single-node Docker Swarm cluster. We'll use this to familiarize ourselves with Docker Swarm. In the remainder of the chapter, we'll build more servers to create a larger swarm cluster for full deployment of Notes.

A Docker Swarm cluster is simply a group of servers running Docker that have been joined together into a common pool. The code for the Docker Swarm orchestrator is bundled with the Docker Engine server but it is disabled by default. To create a swarm, we simply enable swarm mode by running docker swarm init and then run a docker swarm join command on each system we want to be part of the cluster. From there, the Docker Swarm code automatically takes care of a long list of tasks. The features for Docker Swarm include the following:

  • Horizontal scaling: When deploying a Docker service to a swarm, you tell it the desired number of instances as well as the memory and CPU requirements. The swarm takes that and computes the best distribution of tasks to nodes in the swarm.
  • Maintaining the desired state: From the services deployed to a swarm, the swarm calculates the desired state of the system and tracks its current actual state. Suppose one of the nodes crashes—the swarm will then readjust the running tasks to replace the ones that vaporized because of the crashed server.
  • Multi-host networking: The overlay network driver automatically distributes network connections across the network of machines in the swarm.
  • Secure by default: Swarm mode uses strong Transport Layer Security (TLS) encryption for all communication between nodes.
  • Rolling updates: You can deploy an update to a service in such a manner where the swarm intelligently brings down existing service containers, replacing them with updated newer containers.

For an overview of Docker Swarm, refer to docs.docker.com/engine/swarm/.

We will use this section to not only learn how to set up a Docker Swarm but to also learn something about how Docker orchestration works.

To get started, we'll set up a single-node swarm on a single EC2 instance in order to learn some basics, before we move on to deploying a multi-node swarm and deploying the full Notes stack.

Deploying a single-node Docker Swarm on a single EC2 instance

For a quick introduction to Docker Swarm, let's start by installing Docker on a single EC2 node. We can kick the tires by trying a few commands and exploring the resulting system.

This will involve deploying Ubuntu 20.04 on an EC2 instance, configuring it to have the latest Docker Engine, and initializing swarm mode.

Adding an EC2 instance and configuring Docker

To launch an EC2 instance, we must first select which operating system to install. There are thousands of operating system configurations available. Each of these configurations is identified by an AMI code, where AMI stands for Amazon Machine Image.

To find your desired AMI, navigate to the EC2 dashboard on the AWS console. Then, click on the Launch Instance button, which starts a wizard-like interface to launch an instance. You can, if you like, go through the whole wizard since that is one way to learn about EC2 instances. We can search the AMIs via the first page of that wizard, where there is a search box.

For this exercise, we will use Ubuntu 20.04, so enter Ubuntu and then scroll down to find the correct version, as illustrated in the following screenshot:

This is what the desired entry looks like. The AMI code starts with ami- and we see one version for x86 CPUs, and another for ARM (previously Advanced RISC Machine). ARM processors, by the way, are not just for your cell phone but are also used in servers. There is no need to launch an EC2 instance from here since we will instead do so with Terraform.

Another attribute to select is the instance size. AWS supports a long list of sizes that relate to the amount of memory, CPU cores, and disk space. For a chart of the available instance types, click on the Select button to proceed to the second page of the wizard, which shows a table of instance types and their attributes. For this exercise, we will use the t2.micro instance type because it is eligible for the free tier.

Create a file named ec2-public.tf containing the following:
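
A sketch of the EC2 declaration the following paragraphs describe; variable and resource names are illustrative:

```hcl
resource "aws_instance" "public" {
  ami                         = var.ami_id
  instance_type               = var.instance_type
  subnet_id                   = aws_subnet.public1.id
  key_name                    = var.key_pair
  associate_public_ip_address = true
  vpc_security_group_ids      = [aws_security_group.ec2-public-sg.id]
  depends_on                  = [aws_vpc.notes, aws_internet_gateway.igw]

  user_data = join("\n", [
    file("sh/docker_install.sh"),
    "docker swarm init",
    "sudo hostnamectl set-hostname notes-public"
  ])

  tags = { Name = "${var.project_name}-ec2-public" }
}
```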


In the Terraform AWS provider, the resource name for an EC2 instance is `aws_instance`. Since this instance is attached to our public subnet, we call it `aws_instance.public`. Because it is a public-facing EC2 instance, the `associate_public_ip_address` attribute is set to `true`.

The attributes include the AMI ID, the instance type, the subnet ID, and more. The `key_name` attribute refers to the name of the SSH key pair we will use to log in to the EC2 instance; we will discuss these key pairs shortly. The `vpc_security_group_ids` attribute refers to the security group we will apply to the EC2 instance. The `depends_on` attribute causes Terraform to wait for the creation of the resources named in the array. The `user_data` attribute is a shell script that is executed inside the instance once it has been created.

For the AMI, instance type, and key-pair data, add these entries to `variables.tf`, as follows:
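
For example (the AMI ID here is only a placeholder; look up the current Ubuntu 20.04 AMI for your region):

```hcl
variable "ami_id"        { default = "ami-XXXXXXXXXXXXXXXXX" }
variable "instance_type" { default = "t2.micro" }
variable "key_pair"      { default = "notes-app-key-pair" }
```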

The AMI ID shown here is specifically for Ubuntu 20.04 in us-west-2. There will be other AMI IDs in other regions. The key_pair name shown here should be the key-pair name you selected when creating your key pair earlier.

It is not necessary to add the key-pair file to this directory, nor to reference the file you downloaded in these scripts. Instead, you simply give the name of the key pair. In our example, we named it notes-app-key-pair, and downloaded notes-app-key-pair.pem.

The user_data feature is very useful since it lets us customize an instance after creation. We're using this to automate the Docker setup on the instances. This field is to receive a string containing a shell script that will execute once the instance is launched. Rather than insert that script inline with the Terraform code, we have created a set of files that are shell script snippets. The Terraform file function reads the named file, returning it as a string. The Terraform join function takes an array of strings, concatenating them together with the delimiter character in between. Between the two we construct a shell script. The shell script first installs Docker Engine, then initializes Docker Swarm mode, and finally changes the hostname to help us remember that this is the public EC2 instance.

Create a directory named sh in which we'll create shell scripts, and in that directory create a file named docker_install.sh. To this file, add the following:
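
A sketch of such a script, following the official Docker Engine installation steps for Ubuntu that the next paragraph summarizes:

```sh
#!/bin/sh
# Allow apt to use repositories over HTTPS
sudo apt-get update
sudo apt-get -y install apt-transport-https ca-certificates curl gnupg-agent software-properties-common
# Add Docker's package repository
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository \
    "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
# Install Docker Engine and related tools
sudo apt-get update
sudo apt-get -y install docker-ce docker-ce-cli containerd.io
# Let the default ubuntu user run docker commands
sudo groupadd docker
sudo usermod -aG docker ubuntu
```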


This script is derived from the official instructions for installing Docker Engine **Community Edition** (**CE**) on Ubuntu. The first part enables `apt-get` to download packages from HTTPS repositories. It then configures the Docker package repository in Ubuntu, after which it installs Docker and related tools. Finally, it ensures that the `docker` group is created and that the `ubuntu` user ID is a member of that group. The Ubuntu AMI uses `ubuntu` as the user ID that the EC2 administrator uses by default.

For this EC2 instance, we also run `docker swarm init` to initialize the Docker swarm. We do not run this command on the other EC2 instances. The method used to initialize the `user_data` attribute lets us easily set up a custom configuration script for each EC2 instance. For the other instances, we run only `docker_install.sh`, while for this one we also initialize the swarm.

Back in `ec2-public.tf`, we have two more things to do before we can launch the EC2 instance. Take a look at the following code block:
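
A sketch of the security group, with the ingress and egress rules discussed below:

```hcl
resource "aws_security_group" "ec2-public-sg" {
  name        = "${var.project_name}-public-sg"
  description = "Allow SSH and HTTP traffic"
  vpc_id      = aws_vpc.notes.id

  ingress {
    description = "SSH"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    description = "HTTP"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```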

This is the security group declaration for the public EC2 instance. Remember that a security group describes the rules of a firewall that is attached to many kinds of AWS objects. This security group was already referenced in declaring aws_instance.public.

The main feature of security groups is the ingress and egress rules. As the words imply, ingress rules describe the network traffic allowed to enter the resource, and egress rules describe what's allowed to be sent by the resource. If you have to look up those words in a dictionary, you're not alone.

We have two ingress rules, and the first allows traffic on port 22, which covers SSH traffic. The second allows traffic on port 80, covering HTTP. We'll add more Docker rules later when they're needed.

The egress rule allows the EC2 instance to send any traffic to any machine on the internet.

These ingress rules are obviously very strict and limit the attack surface any miscreants can exploit.

The final task is to add these output declarations to ec2-public.tf, as follows:
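
A sketch of the outputs, assuming the `aws_instance.public` resource declared above:

```hcl
output "ec2-public-arn"  { value = aws_instance.public.arn }
output "ec2-public-dns"  { value = aws_instance.public.public_dns }
output "ec2-public-ip"   { value = aws_instance.public.public_ip }
output "ec2-private-dns" { value = aws_instance.public.private_dns }
output "ec2-private-ip"  { value = aws_instance.public.private_ip }
```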


This will tell us the public IP address and public DNS name. If we are interested, the outputs also tell us the private IP address and DNS name.

### Launching the EC2 instance on AWS

We have added the Terraform declarations for creating an EC2 instance.

We are now ready to deploy this to AWS and see what we can do with it. We already know what to do, so let's run the familiar commands:

If the VPC infrastructure were already running, you would get output similar to this. The addition is two new objects, aws_instance.public and aws_security_group.ec2-public-sg. This looks good, so we proceed to deployment, as follows:


This builds our EC2 instance, and we now have an IP address and a domain name. Because the initialization script takes a few minutes to run, it is best to wait a little while before testing the system.

The `ec2-public-ip` value is the public IP address of the EC2 instance. In the following examples, we will write the text `PUBLIC-IP-ADDRESS`, and you must of course substitute the IP address assigned to your EC2 instance.

We can log in to the EC2 instance like so:
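
On Linux or macOS, the login looks something like this:

```
$ ssh -i ~/Downloads/notes-app-key-pair.pem ubuntu@PUBLIC-IP-ADDRESS
```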

On a Linux or macOS system where we're using SSH, the command is as shown here. The -i option lets us specify the Privacy Enhanced Mail (PEM) file that was provided by AWS for the key pair. If on Windows using PuTTY, you'd instead tell it which PuTTY Private Key (PPK) file to use, and the connection parameters will otherwise be similar to this.

This lands us at the command-line prompt of the EC2 instance. We see that it is Ubuntu 20.04, and the hostname is set to notes-public, as reflected in Command Prompt and the output of the hostname command. This means that our initialization script ran because the hostname was the last configuration task it performed.

Handling the AWS EC2 key-pair file

Earlier, we said to safely store the key-pair file somewhere on your computer.  In the previous section, we showed how to use the PEM file with SSH to log in to the EC2 instance. Namely, we use the PEM file like so:


It can be inconvenient to remember to add the `-i` flag every time you use SSH. To avoid having to use this option, run this command:
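
For example:

```
$ ssh-add ~/Downloads/notes-app-key-pair.pem
```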

As the command name implies, this adds the authentication file to SSH. This has to be rerun on every reboot of the computer, but it conveniently lets us access EC2 instances without remembering to specify this option.

Testing the initial Docker Swarm

We have an EC2 instance and it should already be configured with Docker, and we can easily verify that this is the case as follows:


The setup script should also have initialized this EC2 instance as a Docker Swarm node, and the following command verifies whether that happened:
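
Namely:

```
$ docker info
```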

The docker info command, as the name implies, prints out a lot of information about the current Docker instance. In this case, the output includes verification that it is in Docker Swarm mode and that this is a Docker Swarm manager instance.

Let's try a couple of swarm commands, as follows:
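
For example:

```
$ docker node ls
$ docker service ls
```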


The `docker node` command is for managing the nodes in a swarm. In this case, there is only one node—this one—and it is shown as not only a manager but the leader of the swarm. Being the leader is apparently easy when you are the only node in the swarm.

The `docker service` command is for managing the services deployed in the swarm. In this context, a service is roughly equivalent to an entry in the `services` section of a Docker compose file. In other words, a service is not the running container; rather, it is an object describing the configuration for launching one or more instances of a given container.

To see what that means, let's launch an `nginx` service, as follows:
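
A plausible invocation matching the description that follows:

```
$ docker service create --name nginx --replicas 1 -p 80:80 nginx
```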

We started one service using the nginx image. We said to deploy one replica and to expose port 80. We chose the nginx image because it has a simple default HTML file that we can easily view, as illustrated in the following screenshot:

Simply paste the IP address of the EC2 instance into the browser location bar, and we're greeted with that default HTML.

We also see by using docker node ls and docker service ps that there is one instance of the service. Since this is a swarm, let's increase the number of nginx instances, as follows:
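
For example:

```
$ docker service update --replicas 3 nginx
```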


Once a service is deployed, we can use the `docker service update` command to modify the deployment. In this case, we told it to increase the number of instances using the `--replicas` option, and there are now three instances of the `nginx` container running on the `notes-public` node.

We can also run the normal `docker ps` command to see the actual containers, as shown in the following code block:

This verifies that the nginx service with three replicas is actually three nginx containers.

In this section, we were able to launch an EC2 instance and set up a single-node Docker swarm in which we launched a service, which gave us the opportunity to familiarize ourselves with what this can do.

While we're here, there is another thing to learn—namely, how to set up the remote control of Docker hosts.

Setting up remote control access to a Docker Swarm hosted on EC2

A feature that's not well documented in Docker is the ability to control Docker nodes remotely. This will let us, from our laptop, run Docker commands on a server. By extension, this means that we will be able to manage the Docker Swarm from our laptop.

One method for remotely controlling a Docker instance is to expose the Docker Transmission Control Protocol (TCP) port. Be aware that miscreants are known to scan an internet infrastructure for Docker ports to hijack. The following technique does not expose the Docker port but instead uses SSH.

The following setup is for Linux and macOS, relying on features of SSH. To do this on Windows would rely on installing OpenSSH. From October 2018, OpenSSH became available for Windows, and the following commands may work in PowerShell (failing that, you can run these commands from a Multipass or Windows Subsystem for Linux (WSL) 2 instance on Windows):


Exit the shell on the EC2 instance so that you are back at the command line of your laptop.

Run the following command:

We discussed this command earlier, noting that it lets us log in to EC2 instances without having to use the -i option to specify the PEM file.  This is more than a simple convenience when it comes to remotely accessing Docker hosts. The following steps are dependent on having added the PEM file to SSH, as shown here.

To verify you've done this correctly, use this command:


Normally, with an EC2 instance, we would use the `-i` option, as shown earlier. However, after running `ssh-add`, the `-i` option is no longer required.

This enables us to create the following environment variable:
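
For example:

```
$ export DOCKER_HOST=ssh://ubuntu@PUBLIC-IP-ADDRESS
```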

The DOCKER_HOST environment variable enables the remote control of Docker hosts. It relies on a passwordless SSH login to the remote host. Once you have that, it's simply a matter of setting the environment variable and you've got remote control of the Docker host, and in this case, because the host is a swarm manager, a remote swarm.

But this gets even better by using the Docker context feature. A context is a configuration required to access a remote node or swarm. Have a look at the following code snippet:


We start by removing the environment variable, since we are about to replace it with something better, as follows:
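
A sketch of the commands, assuming a context name of `ec2`:

```
$ unset DOCKER_HOST
$ docker context create ec2 --docker host=ssh://ubuntu@PUBLIC-IP-ADDRESS
$ docker context use ec2
```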

We create a context using docker context create, specifying the same SSH URL we used in the DOCKER_HOST variable. We can then use it either with the --context option or by using docker context use to switch between contexts.

With this feature, we can easily maintain configurations for multiple remote servers and switch between them with a simple command.

For example, the Docker instance on our laptop is the default context. Therefore, we might find ourselves doing this:


We sometimes have to be aware of which Docker context is current, and which context to use when. This will be useful in the next section, when we learn how to push images to the AWS ECR.

We have learned a lot in this section, so before moving on to the next task, let's clean up our AWS infrastructure. There is no need to keep this EC2 instance running, since we used it only for a quick familiarization tour. We can easily delete the instance while keeping the rest of the infrastructure configured. The most effective way is to rename `ec2-public.tf` to `ec2-public.tf-disable` and rerun `terraform apply`, as shown in the following code block:
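
Roughly like this:

```
$ mv ec2-public.tf ec2-public.tf-disable
$ terraform apply
```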

The effect of changing the name of one of the Terraform files is that Terraform will not scan those files for objects to deploy. Therefore, when Terraform maps out the state we want Terraform to deploy, it will notice that the deployed EC2 instance and security group are not listed in the local files, and it will, therefore, destroy those objects. In other words, this lets us undeploy some infrastructure with very little fuss.

This tactic can be useful for minimizing costs by turning off unneeded facilities. You can easily redeploy the EC2 instances by renaming the file back to ec2-public.tf and rerunning terraform apply.

In this section, we familiarized ourselves with Docker Swarm by deploying a single-node swarm on an EC2 instance on AWS. We first added suitable declarations to our Terraform files. We then deployed the EC2 instance on AWS. Following deployment, we set about verifying that, indeed, Docker Swarm was already installed and initialized on the server and that we could easily deploy Docker services on the swarm. We then learned how to set up remote control of the swarm from our laptop.

Taken together, this proved that we can easily deploy Docker-based services to EC2 instances on AWS. In the next section, let's continue preparing for a production-ready deployment by setting up a build process to push Docker images to image repositories.

Setting up ECR repositories for Notes Docker images

We have created Docker images to encapsulate the services making up the Notes application. So far, we've used those images to instantiate Docker containers on our laptop. To deploy containers on the AWS infrastructure will require the images to be hosted in a Docker image repository.

This requires a build procedure by which the svc-notes and svc-userauth images are correctly pushed to the container repository on the AWS infrastructure. We will go over the commands required and create a few shell scripts to record those commands.

A site such as Docker Hub is what's known as a Docker Registry. Registries are web services that store Docker images by hosting Docker image repositories. When we used the redis or mysql/mysql-server images earlier, we were using Docker image repositories located on the Docker Hub Registry.

The AWS team offers a Docker image registry, ECR. An ECR instance is available for each account in each AWS region. All we have to do is log in to the registry, create repositories, and push images to the repositories.

It is extremely important to run commands in this section in the default Docker context on your laptop. The reason is that Docker builds must not happen on the Swarm host but on some other host, such as your laptop.

Because it is important to not run Docker build commands on the Swarm infrastructure, execute this command:
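
The command in question:

```
$ docker context use default
```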


This command switches the Docker context to the local system.

To hold the scripts and other files related to managing the AWS ECR repositories, create a directory named `ecr` as a sibling of the `notes`, `users`, and `terraform-swarm` directories.

The build process requires several commands to create the Docker images, tag them, and push them to the remote repositories. To simplify this, let's create some shell scripts, as well as PowerShell scripts, to record those commands.

The first task is to connect to the AWS ECR service. For this, create a file named `login.sh` containing the following:
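
A sketch of such a login script, using the environment variables discussed in the next section:

```sh
aws ecr get-login-password --region $AWS_REGION \
    | docker login --username AWS \
        --password-stdin $AWS_USER.dkr.ecr.$AWS_REGION.amazonaws.com
```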

This command, and others, are available in the ECR dashboard. If you navigate to that dashboard and then create a repository there, a button labeled View Push Command is available. This and other useful commands are listed there, but we have substituted a few variable names to make this configurable.

If you are instead using Windows PowerShell, AWS recommends the following:


That relies on the AWS Tools for PowerShell (see [`aws.amazon.com/powershell/`](https://aws.amazon.com/powershell/)), which appear to offer some powerful tools for working with AWS services. However, in testing, that command did not behave well.

Instead, the following command was found to work better, and you can put it in a file named `login.ps1`:
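
The same login, with Windows-style environment variable references:

```powershell
aws ecr get-login-password --region $env:AWS_REGION |
    docker login --username AWS --password-stdin "$env:AWS_USER.dkr.ecr.$env:AWS_REGION.amazonaws.com"
```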

This is the same command as is used for Unix-like systems, but with Windows-style references to environment variables.

You may wish to explore the cross-var package, since it can convert Unix-style environment variable references to Windows. For the documentation, refer to www.npmjs.com/package/cross-var.

Several environment variables are being used, but just what are those variables being used and how do we set them?

Using environment variables for AWS CLI commands

Look carefully and you will see that some environment variables are being used. The AWS CLI commands know about those environment variables and will use them instead of command-line options. The environment variables we're using are the following:

  • AWS_PROFILE: The AWS profile to use with this project.
  • AWS_REGION: The AWS region to deploy the project to.
  • AWS_USER: The numeric user ID for the account being used. This ID is available on the IAM dashboard page for the account.

The AWS CLI recognizes some of these environment variables, and others. For further details, refer to docs.aws.amazon.com/cli/latest/userguide/cli-configure-envvars.html.

The AWS command-line tools will use those environment variables in place of the command-line options. Earlier, we discussed using the AWS_PROFILE variable instead of the --profile option. The same holds true for other command-line options.

This means that we need an easy way to set those variables. These Bash commands can be recorded in a shell script like this, which you could store as env-us-west-2:
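
A sketch of such a script (the account ID is a placeholder):

```sh
export AWS_PROFILE=notes-app
export AWS_REGION=us-west-2
export AWS_USER=123456789012
```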


This script, of course, follows Bash shell syntax. For other command environments, you must transliterate it appropriately. To set these variables in the Bash shell, run the following command:
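
For example:

```
$ source ./env-us-west-2
```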

For other command environments, again transliterate appropriately. For example, in Windows and in PowerShell, the variables can be set with these commands:
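
For example, a PowerShell equivalent might be:

```powershell
$env:AWS_PROFILE = "notes-app"
$env:AWS_REGION = "us-west-2"
$env:AWS_USER = "123456789012"
```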


These are the same values, just in the syntax recognized on Windows.

We have defined the environment variables being used. Now let's get back to defining the process for building Docker images and pushing them to the ECR.

## Defining a process to build Docker images and push them to the AWS ECR

We were exploring a build process for pushing Docker images to the ECR repositories before we started talking about environment variables. Let's return to the task at hand, which is to easily build the Docker images, create the ECR repositories, and push the images to the ECR.

As mentioned at the beginning of this section, make sure you have switched to the *default* Docker context. We must do this because the Docker Swarm policy is to not use the swarm hosts to build Docker images.

To build the images, let's add a file named `build.sh` containing the following:
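
A sketch of the build script, assuming each service directory contains its own `Dockerfile`:

```sh
( cd ../notes && docker build -t svc-notes --file Dockerfile . )
( cd ../users && docker build -t svc-userauth --file Dockerfile . )
```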

This handles running docker build commands for both the Notes and user authentication services. It is expected to be executed in the ecr directory and takes care of executing commands in both the notes and users directories.

Let's now create and delete a pair of registries to hold our images. We have two images to upload to the ECR, and therefore we create two registries.

Create a file named create.sh containing the following:
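
Presumably along these lines:

```sh
aws ecr create-repository --repository-name svc-notes
aws ecr create-repository --repository-name svc-userauth
```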


Also create a companion file named `delete.sh` containing the following:
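
For example:

```sh
aws ecr delete-repository --force --repository-name svc-notes
aws ecr delete-repository --force --repository-name svc-userauth
```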

Between these scripts, we can create and delete the ECR repositories for our Docker images. These scripts are directly usable on Windows; simply change the filenames to create.ps1 and delete.ps1.

In aws ecr delete-repository, the --force option means to delete the repositories even if they contain images.

With the scripts we've written so far, they are executed in the following order:
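
A plausible order of execution:

```
$ sh login.sh
$ sh create.sh
```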


The `aws ecr create-repository` command outputs descriptors for these image repositories. The important data item to note is the `repositoryUri` value. This will be used later in the Docker stack file to name the image to retrieve.

The `create.sh` script only needs to be executed once.

Beyond creating the repositories, the workflow is as follows:

+   Build the images, for which we have already created a script named `build.sh`.

+   Tag the images with the **Uniform Resource Identifier** (**URI**) of the ECR repository.

+   Push the images to the ECR repository.

For the latter two steps, we still have some scripts to create.

Create a file named `tag.sh` containing the following:
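
A sketch of the tagging script, building the repository URIs from the environment variables:

```sh
docker tag svc-notes:latest \
    $AWS_USER.dkr.ecr.$AWS_REGION.amazonaws.com/svc-notes:latest
docker tag svc-userauth:latest \
    $AWS_USER.dkr.ecr.$AWS_REGION.amazonaws.com/svc-userauth:latest
```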

The docker tag command we have here takes svc-notes:latest, or svc-userauth:latest, and adds what's called a target image to the local image storage area. The target image name we've used is the same as what will be stored in the ECR repository.

For Windows, you should create a file named tag.ps1 using the same commands, but with Windows-style environment variable references.

Then, create a file named push.sh containing the following:
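
And the corresponding push commands:

```sh
docker push $AWS_USER.dkr.ecr.$AWS_REGION.amazonaws.com/svc-notes:latest
docker push $AWS_USER.dkr.ecr.$AWS_REGION.amazonaws.com/svc-userauth:latest
```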


The `docker push` command sends the target images to the ECR repositories. Again, for Windows, create a file named `push.ps1` containing the same commands, but with Windows-style environment variable references.

In the `tag` and `push` scripts, we used the repository URI value, but with two environment variables inserted. This keeps the scripts generic, so they will still work if we deploy Notes to another AWS region.

We have now implemented the workflow as scripts, so let's see how to run it, as follows:
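
For example:

```
$ sh build.sh
```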

This builds the Docker images. When we run docker build, it stores the built image in an area on our laptop where Docker maintains images. We can inspect that area using the docker images command, like this:


If we do not specify a tag, the `docker build` command automatically adds the tag `latest`.

Then, to push the images to the ECR repositories, we execute the following commands:
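
For example:

```
$ sh login.sh
$ sh tag.sh
$ sh push.sh
```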

Since the images are rather large, it will take a long time to upload them to the AWS ECR. We should add a task to the backlog to explore ways to trim Docker image sizes. In any case, expect this to take a while.

After a period of time, the images will be uploaded to the ECR repositories, and you can inspect the results on the ECR dashboard.

Once the Docker images are pushed to the AWS ECR repository, we no longer need to stay with the default Docker context. You will be free to run the following command at any time:
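
Namely:

```
$ docker context use ec2
```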


Remember not to use the swarm hosts to build Docker images. At the beginning of this section, we switched to the default context so that the builds would happen on our laptop.

In this section, we learned how to set up a build procedure to push our Docker images to repositories on the AWS ECR service. This included creating a few scripts that keep an otherwise complex build procedure manageable.

Our next step is to learn how to use a Docker compose file to describe deployment to Docker Swarm.

# Creating a Docker stack file for deployment to Docker Swarm

In the earlier sections, we learned how to set up AWS infrastructure with Terraform. We designed a VPC that will house the Notes application stack, we built a single-node Docker Swarm cluster on a single EC2 instance, and we set up a process for pushing Docker images to the ECR.

Our next task is to prepare a Docker stack file for deployment to the swarm. A stack file is nearly identical to the Docker compose file we used in Chapter 11, *Deploying Node.js Microservices with Docker*. Compose files are used with a normal Docker host, while stack files are used with a swarm. To turn it into a stack file, we add some new tags and change a few things, including the network implementation.

Earlier, we tested Docker Swarm using the `docker service create` command to launch a service on the swarm. While that was easy, it does not constitute code that can be committed to a source repository, nor is it an automated process.

In swarm mode, a service is the definition of the tasks to execute on the swarm nodes. Each service consists of a number of tasks, the number depending on the replicas setting. Each task is a container deployed to a node in the swarm. There are, of course, other configuration parameters, such as network ports, volume connections, and environment variables.

The Docker platform allows a compose file to be used for deploying services to a swarm. When used this way, the compose file is referred to as a stack file. There is a set of `docker stack` commands for handling stack files, as follows:

+   On a normal Docker host, the `docker-compose.yml` file is called a compose file. We use the `docker-compose` command with compose files.

+   On a Docker swarm, the `docker-compose.yml` file is called a stack file. We use the `docker stack` command with stack files.

Remember that a compose file has a `services` tag, and each entry within that tag is a container configuration to deploy. When used as a stack file, each entry under the `services` tag is, of course, a service in the sense just described. This means that, just as there are many similarities between the `docker run` command and a container definition in a compose file, there is a degree of similarity between the `docker service create` command and a service entry in a stack file.

An important consideration is the policy that builds should not happen on the swarm hosts. Instead, those machines must be used solely for deploying and executing containers. This means that any `build` tag in a service listed in a stack file is ignored. Instead, there is a `deploy` tag holding parameters for deployment to a swarm, and the `deploy` tag is ignored when the file is used with compose. Put more simply, we can use the same file both as a compose file (with the `docker compose` command) and as a stack file (with the `docker stack` command), under the following conditions:

+   When used as a compose file, the `build` tag is used and the `deploy` tag is ignored.

+   When used as a stack file, the `build` tag is ignored and the `deploy` tag is used.

Another consequence of this policy is the need to switch Docker contexts as appropriate. We have already discussed this: we use the *default* Docker context on our laptop for building images, and the EC2 context when interacting with the swarm on the AWS EC2 instances.

To get started, create a directory named `compose-stack` that is a sibling of the `compose-local`, `notes`, `terraform-swarm`, and other directories. Then, copy `compose-local/docker-compose.yml` into `compose-stack`. That way, we start from something we know works well.

This means we will be creating a Docker stack file from our compose file. That involves several steps, which we will cover over the next few sections: adding deployment tags, configuring networking for the swarm, controlling the placement of services in the swarm, storing secrets in the swarm, and other tasks.

## Creating a Docker stack file from the Notes Docker compose file

With that theory under our belt, let's now take a look at the existing Docker compose file and see how to make it useful for deployment to a swarm.

Since we will need some more advanced `docker-compose.yml` features, update the version number to the following:
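
For example, a newer compose file format such as:

```yaml
version: '3.8'
```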

For the Compose file we started with, version '3' was adequate, but to accomplish the tasks in this chapter the higher version number is required, to enable newer features.

Fortunately, most of this is straightforward and will require very little code.

Deployment parameters: These are expressed in the deploy tag, which covers things such as the number of replicas, and memory or CPU requirements. For documentation, refer to docs.docker.com/compose/compose-file/#deploy.

For the deployment parameters, simply add a deploy tag to each service. Most of the options for this tag have perfectly reasonable defaults. To start with, let's add this to every service, as follows:
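
For instance:

```yaml
    deploy:
      replicas: 1
```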


This tells Docker that we want one instance of each service. Later, we will experiment with adding more service instances. We will add other parameters, such as placement constraints, shortly, and later we will try multiple replicas for `svc-notes` and `svc-userauth`. It is tempting to put CPU and memory limits on the services, but it is not necessary.

It is good to know that, with swarm mode, we can change the number of instances simply by changing the `replicas` setting.

The next thing to take care of is the image names. Although the `build` tag is present, remember that it is ignored. For the Redis and database containers, we are already using images from Docker Hub, but for `svc-notes` and `svc-userauth`, we are building our own containers. That is why, earlier in this chapter, we set up the procedure for pushing images to the ECR repositories. Now we can reference those images from the stack file. This means we must make the following changes:
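
A sketch of the affected entries; the image prefix is the `repositoryUri` value noted earlier, and the account ID shown is a placeholder:

```yaml
  svc-userauth:
    build: ../users
    image: 123456789012.dkr.ecr.us-west-2.amazonaws.com/svc-userauth
    # ...

  svc-notes:
    build: ../notes
    image: 123456789012.dkr.ecr.us-west-2.amazonaws.com/svc-notes
    # ...
```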

If we use this with docker-compose, it will perform the build in the named directories, and then tag the resulting image with the tag in the image field. In this case, the deploy tag will be ignored as well. However, if we use this with docker stack deploy, the build tag will be ignored, and the images will be downloaded from the repositories listed in the image tag. In this case, the deploy tag will be used.

For documentation on the build tag, refer to docs.docker.com/compose/compose-file/#build. For documentation on the image tag, refer to docs.docker.com/compose/compose-file/#image.

When running the compose file on our laptop, we used bridge networking. This works fine for a single host, but with swarm mode, we need another network mode that handles multi-host deployments. The Docker documentation clearly says to use the overlay driver in swarm mode, and the bridge driver for a single-host deployment.

Virtual networking for containers: Since bridge networking is designed for a single-host deployment, we must use overlay networking in swarm mode. For documentation, refer to docs.docker.com/compose/compose-file/#network-configuration-reference.

To use overlay networking, change the networks tag to the following:
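
A sketch, assuming the network names used in the compose file from the previous chapter:

```yaml
networks:
  frontnet:
    driver: overlay
    # driver: bridge
  authnet:
    driver: overlay
    # driver: bridge
```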


To support using this file either with a swarm or for a single-host deployment, we can keep the `bridge` network settings but comment them out. Then, depending on the context, we can change which one is commented out, switching between `overlay` and `bridge` networking.

The `overlay` network driver sets up a virtual network across the swarm nodes. This network supports communication between containers and also facilitates access to externally published ports.

The `overlay` network configures the containers in the swarm so that they are automatically assigned domain names matching the service names. As with the `bridge` network used earlier, containers find each other via domain names. For a service with multiple instances deployed, the `overlay` network ensures that a request can be routed to any instance of that container. If you connect to a container and there is no instance of that container on the same host, the `overlay` network routes the request to an instance on another host. This is a simple approach to service discovery, by using domain names, but scaled across multiple hosts in a swarm.

This takes care of the simple tasks of converting a compose file into a stack file. There are, however, a few other tasks that need more attention.

### Placing containers across the swarm

We have not done this yet, but we will be adding multiple EC2 instances to the swarm. By default, swarm mode distributes tasks (containers) evenly across the swarm nodes. However, we have two considerations that should force certain containers to be deployed on particular Docker hosts, namely the following:

1.  We have two database containers and need to arrange persistent storage for their data files. This means the databases must be deployed to the same instance every time so that they can use the same data directories.

2.  The public EC2 instance, named `notes-public`, will be part of the swarm. To maintain the security model, most of the services should not be deployed on this instance, but rather on the instances attached to the private subnet. Therefore, we should strictly control which containers are deployed to `notes-public`.

Swarm mode lets us declare placement requirements for any service. There are several ways to implement this, such as matching against the hostname or against labels assigned to each node.

Documentation for the stack file `placement` tag can be found at [`docs.docker.com/compose/compose-file/#placement`](https://docs.docker.com/compose/compose-file/#placement). The documentation for the `docker stack create` command includes further explanation of the deployment parameters: [`docs.docker.com/engine/reference/commandline/service_create`](https://docs.docker.com/engine/reference/commandline/service_create).

Add a `deploy` tag to the `db-userauth` service declaration:
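
A sketch showing both placement styles described below (the hostname variant is commented out):

```yaml
  db-userauth:
    # ...
    deploy:
      replicas: 1
      placement:
        # constraints: [ "node.hostname==notes-private-db1" ]
        constraints: [ "node.labels.type==db" ]
```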

The placement tag governs where the containers are deployed. Rather than Docker evenly distributing the containers, we can influence the placement with the fields in this tag. In this case, we have two examples, such as deploying a container to a specific node based on the hostname or selecting a node based on the labels attached to the node.

To set a label on a Docker swarm node, we run the following command:
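
For example:

```
$ docker node update --label-add type=public notes-public
```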


This command attaches a label named `type`, with the value `public`, to the node named `notes-public`. We use this command to set labels and, as you can see, a label can have any name and any value. Labels and other attributes can then be used to influence the placement of containers on the swarm nodes.

For the rest of the stack file, add the following placement constraints:
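
A sketch of the remaining constraints, matching the three labels discussed below:

```yaml
  db-notes:
    deploy:
      placement:
        constraints: [ "node.labels.type==db" ]

  svc-userauth:
    deploy:
      placement:
        constraints: [ "node.labels.type==svc" ]

  svc-notes:
    deploy:
      placement:
        constraints: [ "node.labels.type==public" ]

  redis:
    deploy:
      placement:
        constraints: [ "node.labels.type!=public" ]
```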

This gives us three labels to assign to our EC2 instances: db, svc, and public. These constraints will cause the databases to be placed on nodes where the type label is db, the user authentication service is on the node of type svc, the Notes service is on the public node, and the Redis service is on any node that is not the public node.

The reasoning stems from the security model we designed. The containers deployed on the private network should be more secure behind more layers of protection. This placement leaves the Notes container as the only one on the public EC2 instance. The other containers are split between the db and svc nodes. We'll see later how these labels will be assigned to the EC2 instances we'll create.

Configuring secrets in Docker Swarm

With Notes, as is true for many kinds of applications, there are some secrets we must protect. Primarily, this is the Twitter authentication tokens, and we've claimed it could be a company-ending event if those tokens were to leak to the public. Maybe that's overstating the danger, but leaked credentials could be bad. Therefore, we must take measures to ensure that those secrets do not get committed to a source repository as part of any source code, nor should they be recorded in any other file.

For example, the Terraform state file records all information about the infrastructure, and the Terraform team makes no effort to detect any secrets and suppress recording them. It's up to us to make sure the Terraform state file does not get committed to source code control as a result.

Docker Swarm supports a very interesting method for securely storing secrets and for making them available in a secure manner in containers.

The process starts with the following command:
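
Something along these lines, substituting your own tokens:

```
$ printf 'YOUR-CONSUMER-KEY' | docker secret create TWITTER_CONSUMER_KEY -
$ printf 'YOUR-CONSUMER-SECRET' | docker secret create TWITTER_CONSUMER_SECRET -
```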


This is how we store a secret in the Docker swarm. The `docker secret create` command takes the name of the secret, followed by a specifier for a file containing the text of the secret. This means we can either store the secret's data in a file or—as in this case—use `-` to say the data comes from the standard input. Here, we use the `printf` command, available on both macOS and Linux, to send the value to the standard input.

Docker Swarm securely records the encrypted data as a secret. Once you have handed a secret to Docker, you cannot inspect the value of that secret.

In `compose-stack/docker-compose.yml`, add this declaration at the end:
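
Namely, a top-level `secrets` section like this:

```yaml
secrets:
  TWITTER_CONSUMER_KEY:
    external: true
  TWITTER_CONSUMER_SECRET:
    external: true
```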

This lets Docker know that this stack requires the value of those two secrets.

The declaration for svc-notes also needs the following additions:
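
A sketch of those additions, following the `_FILE` convention described below:

```yaml
  svc-notes:
    # ...
    secrets:
      - TWITTER_CONSUMER_KEY
      - TWITTER_CONSUMER_SECRET
    environment:
      # ...
      TWITTER_CONSUMER_KEY_FILE: /var/run/secrets/TWITTER_CONSUMER_KEY
      TWITTER_CONSUMER_SECRET_FILE: /var/run/secrets/TWITTER_CONSUMER_SECRET
```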


This informs the swarm that the Notes service requires these two secrets. In response, the swarm makes the data for the secrets available in the container's filesystem, as `/var/run/secrets/TWITTER_CONSUMER_KEY` and `/var/run/secrets/TWITTER_CONSUMER_SECRET`. They are stored as in-memory files and are relatively secure.

To summarize, the required steps are as follows:

+   Register the secret data with the swarm using `docker secret create`.

+   In the stack file, declare the `secrets` in a top-level secrets tag.

+   In the services that need them, declare a `secrets` tag listing the secrets required by that service.

+   In the environment tag of the service, create environment variables pointing to the `secrets` files.

The Docker team has a suggested convention for environment variable configuration. You can supply a configuration setting directly in an environment variable, such as `TWITTER_CONSUMER_KEY`. However, if the configuration setting is in a file, then the filename should be given in a different environment variable whose name has `_FILE` appended. For example, we would use either `TWITTER_CONSUMER_KEY` or `TWITTER_CONSUMER_KEY_FILE`, depending on whether the value is supplied directly or in a file.

This means we must rewrite Notes to support reading these values from files, in addition to the existing environment variables.

To support reading from files, add this import to the top of `notes/routes/users.mjs`:
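
Namely:

```js
import fs from 'fs';
```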

Then, we'll find the code corresponding to these environment variables further down the file. We should rewrite that section as follows:
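
A sketch of how that section might be reorganized; the surrounding names (such as `twittercallback` and the `TwitterStrategy` verify callback) are assumed from the code developed in earlier chapters:

```js
export let twitterLogin = false;
let consumer_key, consumer_secret;

if (typeof process.env.TWITTER_CONSUMER_KEY !== 'undefined'
 && typeof process.env.TWITTER_CONSUMER_SECRET !== 'undefined') {
    // Tokens supplied directly in the environment
    consumer_key = process.env.TWITTER_CONSUMER_KEY;
    consumer_secret = process.env.TWITTER_CONSUMER_SECRET;
    twitterLogin = true;
} else if (typeof process.env.TWITTER_CONSUMER_KEY_FILE !== 'undefined'
        && typeof process.env.TWITTER_CONSUMER_SECRET_FILE !== 'undefined') {
    // Tokens supplied in files, for example as Docker Swarm secrets
    consumer_key = fs.readFileSync(process.env.TWITTER_CONSUMER_KEY_FILE, 'utf8').trim();
    consumer_secret = fs.readFileSync(process.env.TWITTER_CONSUMER_SECRET_FILE, 'utf8').trim();
    twitterLogin = true;
}

if (twitterLogin) {
    passport.use(new TwitterStrategy({
        consumerKey: consumer_key,
        consumerSecret: consumer_secret,
        callbackURL: `${twittercallback}/users/auth/twitter/callback`
    }, async (token, tokenSecret, profile, done) => {
        // ... same verify callback as before
    }));
}
```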


This is similar to the code we have already used, but organized a little differently. It first tries to read the Twitter tokens from the environment. Failing that, it tries to read them from the named files. Because this code executes in the global context, we must use `readFileSync` to read the files.

If the tokens can be obtained from either source, the `twitterLogin` variable is set, and then we enable support for `TwitterStrategy`. Otherwise, Twitter support is disabled. We have already organized the view templates so that, if `twitterLogin` is `false`, the Twitter login buttons do not appear.

This is what we did in Chapter 8, *Authenticating Users with a Microservice*, with the addition of reading the tokens from files.

### Persisting data in Docker Swarm

The data persistence strategy we used in Chapter 11, *Deploying Node.js Microservices with Docker*, required the database files to be stored in volumes. The volume directories live outside the containers and survive when we destroy and recreate the containers.

That strategy relies on having a single Docker host running the containers. The volume data is stored in a directory on the host filesystem. In swarm mode, however, volumes do not work in a compatible fashion.

With Docker Swarm, unless we use placement criteria, containers can be deployed to any swarm node. The default behavior for named volumes in Docker is that the data is stored on the current Docker host. If the container is redeployed, the volume is destroyed on one host and a new volume is created on the new host. Clearly, that means the data in the volume is not persistent.

For documentation on using volumes with Docker Swarm, refer to [`docs.docker.com/compose/compose-file/#volumes-for-services-swarms-and-stack-files`](https://docs.docker.com/compose/compose-file/#volumes-for-services-swarms-and-stack-files).

The approach suggested in the documentation is to use placement criteria to force such containers to be deployed to specific hosts. For example, the criteria we discussed earlier deploy the databases to a node whose `type` label equals `db`.

In the next section, we will make sure the swarm has exactly one such node. To ensure the database data directories are in a known location, let's change the declarations for the `db-userauth` and `db-notes` containers, as follows:
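
A sketch of the changed volume declarations, using bind mounts into the host directories created by the EC2 setup script:

```yaml
  db-userauth:
    # ...
    volumes:
      # - db-userauth-data:/var/lib/mysql
      - type: bind
        source: /data/users
        target: /var/lib/mysql

  db-notes:
    # ...
    volumes:
      # - db-notes-data:/var/lib/mysql
      - type: bind
        source: /data/notes
        target: /var/lib/mysql
```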

In compose-local/docker-compose.yml, we used the named volumes, db-userauth-data and db-notes-data. The top-level volumes tag is required when doing this. In compose-stack/docker-compose.yml, we've commented all of that out. Instead, we are using a bind mount, to mount specific host directories in the /var/lib/mysql directory of each database.

Therefore, the database data directories will be in /data/users and /data/notes, respectively.

This result is fairly good, in that we can destroy and recreate the database containers at will and the data directories will persist. However, this is only as persistent as the EC2 instance this is deployed to. The data directories will vaporize as soon as we execute terraform destroy.

That's obviously not good enough for a production deployment, but it is good enough for a test deployment such as this.

It is preferable to use a volume instead of the bind mount we just implemented. Docker volumes have a number of advantages, but to make good use of a volume requires finding the right volume driver for your needs. Two examples are as follows:

  1. In the Docker documentation, at docs.docker.com/storage/volumes/, there is an example of mounting a Network File System (NFS) volume in a Docker container. AWS offers an NFS service—the Elastic Filesystem (EFS) service—that could be used, but this may not be the best choice for a database container.
  2. The REX-Ray project (github.com/rexray/rexray) aims to advance the state of the art for persistent data storage in various containerization systems, including Docker.

Another option is to completely skip running our own database containers and instead use the Relational Database Service (RDS). RDS is an AWS service offering several Structured Query Language (SQL) database solutions, including MySQL. It offers a lot of flexibility and scalability, at a price. To use this, you would eliminate the db-notes and db-userauth containers, provision RDS instances, and then update the SEQUELIZE_CONNECT configuration in svc-notes and svc-userauth to use the database host, username, and password you configured in the RDS instances.

For our current requirements, this setup, with a bind mount to a directory on the EC2 host, will suffice. These other options are here for your further exploration.

In this section, we converted our Docker compose file to be useful as a stack file. While doing this, we discussed the need to influence which swarm host has which containers. The most critical thing is ensuring that the database containers are deployed to a host where we can easily persist the data—for example, by running a database backup every so often to external storage. We also discussed storing secrets in a secure manner so that they may be used safely by the containers.

At this point, we cannot test the stack file that we've created because we do not have a suitable swarm to deploy to. Our next step is writing the Terraform configuration to provision the EC2 instances. That will give us the Docker swarm that lets us test the stack file.

Provisioning EC2 instances for a full Docker swarm

So far in this chapter, we have used Terraform to create the required infrastructure on AWS, and then we set up a single-node Docker swarm on an EC2 instance to learn about Docker Swarm. After that, we pushed the Docker images to ECR, and we have set up a Docker stack file for deployment to a swarm. We are ready to set up the EC2 instances required for deploying a full swarm.

Docker Swarm is able to handle Docker deployments to large numbers of host systems. Of course, the Notes application only has delusions of grandeur and doesn't need that many hosts. We'll be able to do everything with three or four EC2 instances. We have declared one so far, and will declare two more that will live on the private subnet. But from this humble beginning, it would be easy to expand to more hosts.

Our goal in this section is to create an infrastructure for deploying Notes on EC2 using Docker Swarm. This will include the following:

  • Configuring additional EC2 instances on the private subnet, installing Docker on those instances, and joining them together in a multi-host Docker Swarm
  • Creating semi-automated scripting, thereby making it easy to deploy and configure the EC2 instances for the swarm
  • Using an nginx container on the public EC2 instance as a proxy in front of the Notes container

That's quite a lot of things to take care of, so let's get started.

Configuring EC2 instances and connecting to the swarm

We have one EC2 instance declared for the public subnet, and it is necessary to add two more for the private subnet. The security model we discussed earlier focused on keeping as much as possible in a private secure network infrastructure. On AWS, that means putting as much as possible on the private subnet.

Earlier, you may have renamed ec2-public.tf to ec2-public.tf-disable. If so, you should now change back the filename to ec2-public.tf. Remember that this tactic is useful for minimizing AWS resource usage when it is not needed.

Create a new file in the terraform-swarm directory named ec2-private.tf, as follows:


This declares two EC2 instances attached to the private subnet. Because they are on the private subnet, they are not assigned public IP addresses. Apart from their names and a couple of settings described next, the two instances are declared identically.

Because we will use the `private-db1` instance for the databases, we allocate 50 GB for its root device. The `root_block_device` block customizes the root disk of an EC2 instance. Among the available settings, `volume_size` sets its size, in gigabytes.

Another difference in `private-db1` is the `instance_type`, which we have hardcoded to `t2.medium`. The issue is that we will be deploying both database containers to this server. A `t2.micro` instance has 1 GB of memory, and it was observed that two databases overwhelm such a server. If you want to debug that situation yourself, change this value to `var.instance_type`, which defaults to `t2.micro`, and then read the section at the end of this chapter on debugging what happens.

Notice that, for the `user_data` script, we only send the script that installs Docker support, not the one that initializes the swarm. The swarm is initialized on the public EC2 instance. The other instances must join the swarm using the `docker swarm join` command. Later, we will go over how the swarm is initialized and see how this process is completed. For the `private-db1` instance, we also create the `/data/notes` and `/data/users` directories that will hold the database data directories.

Add the following code to `ec2-private.tf`:

This is the security group for these EC2 instances. It allows any traffic from inside the VPC to enter the EC2 instances. This is the sort of security group we'd create when in a hurry and should tighten up the ingress rules, since this is very lax.

Likewise, the ec2-public-sg security group needs to be equally lax. We'll find that there is a long list of IP ports used by Docker Swarm and that the swarm will fail to operate unless those ports can communicate. For our immediate purposes, the easiest option is to allow any traffic, and we'll leave a note in the backlog to address this issue in Chapter 14, Security in Node.js Applications.

In ec2-public.tf, edit the ec2-public-sg security group to be the following:


This is definitely not a best practice, because it allows any network traffic from any IP address to reach the public EC2 instance. However, it does give us the freedom, at this point, to develop the code without worrying about protocols. We will revisit this later and implement security best practices. Have a look at the following code snippet:

This outputs the useful attributes of the EC2 instances.

In this section, we declared EC2 instances for deployment on the private subnet. Each will have Docker initialized. However, we still need to do what we can to automate the setup of the swarm.

Implementing semi-automatic initialization of the Docker Swarm

Ideally, when we run terraform apply, the infrastructure is automatically set up and ready to go. Automated setup reduces the overhead of running and maintaining the AWS infrastructure. We'll get as close to that goal as possible.

For this purpose, let's revisit the declaration of aws_instance.public in ec2-public.tf. Let's rewrite it as follows:


This is largely the same as before, but with two changes. The first is adding references to the private EC2 instances in the `depends_on` attribute. This delays construction of the public EC2 instance until the other two instances are running.

The other change is to extend the shell script attached to the `user_data` attribute. The first addition to that script sets the `type` label on the `notes-public` node. That label is used for service placement.

The final change is a script we will use to set up the swarm. Rather than setting up the swarm directly in the `user_data` script, we generate a script that will create the swarm. In the `sh` directory, create a file named `swarm-setup.sh` containing the following:

This generates a shell script that will be used to initialize the swarm. Because the setup relies on executing commands on the other EC2 instances, the PEM file for the AWS key pair must be present on the notes-public instance. However, it is not possible to send the key-pair file to the notes-public instance when running terraform apply. Therefore, we use the pattern of generating a shell script, which will be run later.

The pattern being followed is shown in the following code snippet:


The section between `<<EOF` and `EOF` is supplied as the standard input of the `cat` command. As a result, `/home/ubuntu/swarm-setup.sh` ends up containing the text between those markers. Another detail is that some variable references are escaped, as in `PEM=\$1`. This is necessary so that those variables are not evaluated while this script is being set up, but are present in the generated script.

This script is processed with the `templatefile` function so that we can use template directives. The main one is a `%{for .. }` loop that generates the commands to configure each EC2 instance. You will notice that an array of data for each instance is passed through the `templatefile` call.

As a result, the `swarm-setup.sh` script will contain a copy of the following pair of commands for each EC2 instance:

The first line uses SSH to execute the swarm join command on the EC2 instance. For this to work, we need to supply the AWS key pair, which must be specified on the command file so that it becomes the PEM variable. The second line adds the type label with the named value to the named swarm node.

What is the $join variable? It has the output of running docker swarm join-token, so let's take a look at what it is.

Docker uses a swarm join token to facilitate connecting Docker hosts as a node in a swarm. The token contains cryptographically signed information that authenticates the attempt to join the swarm. We get the token by running the following command:


The word `manager` here means we are requesting a token for joining as a manager node. To connect a node as a worker instead, simply substitute `worker` for `manager`.

Once the EC2 instances are deployed, we could log in to `notes-public`, run this command to get the join token, and then run that command on each EC2 instance. However, the `swarm-setup.sh` script handles all of that for us. Once the EC2 hosts are deployed, all we have to do is log in to `notes-public` and run this script.

It runs the `docker swarm join-token manager` command, stripping away the user-friendly text with a few `sed` commands. That leaves the `join` variable containing the text of the `docker swarm join` command, which is then executed on each instance using SSH.
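The generated pair of commands for one instance might therefore look roughly like this (the IP address and node name are placeholders, not the exact template output):

```sh
# Join this instance to the swarm, then label it for service placement
ssh -i $PEM ubuntu@10.0.3.10 $join
docker node update --label-add type=db notes-private-db1
```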

In this section, we looked at how to automate the setup of the Docker swarm as much as possible.

Now, let's go ahead and do it.

## Preparing the Docker Swarm before deploying the Notes stack

When you make an omelet, it is best to chop all the vegetables and sausage, have the butter ready, and whisk the milk and eggs together before heating the pan. In other words, you prepare all the ingredients before the critical operation. So far, we have prepared all the elements needed to successfully deploy the Notes stack to AWS using Docker Swarm. Now it is time to turn on the pan and see how well it works.

We have declared everything in the Terraform files, and can deploy our complete system with the following command:

This deploys the EC2 instances on AWS. Make sure to record all the output parameters. We're especially interested in the domain names and IP addresses for the three EC2 instances.

As before, the notes-public instance should have a Docker swarm initialized. We have added two more instances, notes-private-db1 and notes-private-svc1. Both will have Docker installed, but they are not joined to the swarm. Instead, we need to run the generated shell script for them to become nodes in the swarm, as follows:


We have already run `ssh-add` on our laptop, so SSH and **Secure Copy** (**SCP**) commands can run without explicitly referencing the PEM file. However, SSH on the `notes-public` EC2 instance does not have the PEM file. Therefore, to access the other EC2 instances, we need the PEM file to be available there, so we used `scp` to copy it to the `notes-public` instance.

If you want to verify that the instances are running and that Docker is active, type the following command:

In this case, we are testing the private EC2 instances from a shell running on the public EC2 instance. That means we must use the private IP addresses printed when we ran Terraform. This command verifies SSH connectivity to an EC2 instance and verifies its ability to download and execute a Docker image.

Next, we can run swarm-setup.sh. On the command line, we must give the filename for the PEM file as the first argument, as follows:


We can see it using SSH to execute the `docker swarm join` command on each EC2 instance, making both systems join the swarm, and setting the labels on the instances, as shown in the following code snippet:

Indeed, these systems are now part of the cluster.

The swarm is ready to go, and we no longer need to be logged in to notes-public. Exiting back to our laptop, we can create the Docker context to control the swarm remotely, as follows:


We have already seen how this works. Having done this, we can run Docker commands on our laptop; for example, have a look at the following code snippet:

From our laptop, we can query the state of the remote swarm that's hosted on AWS. Of course, this isn't limited to querying the state; we can run any other Docker command.

We also need to run the following commands, now that the swarm is set up:


Remember that the newly created swarm does not have any secrets. To install the secrets, these commands need to be rerun.

If you would like to create a shell script to automate this process, consider the following:

This script executes the same commands we just went over to prepare the swarm on the EC2 hosts. It requires the environment variables to be set, as follows:

  • AWS_KEY_PAIR: The filename for the PEM file
  • NOTES_PUBLIC_IP: The IP address of the notes-public EC2 instance
  • TWITTER_CONSUMER_KEY, TWITTER_CONSUMER_SECRET: The access tokens for Twitter authentication

In this section, we have deployed more EC2 instances and set up the Docker swarm. While the process was not completely automated, it's very close. All that's required, after using Terraform to deploy the infrastructure, is to execute a couple of commands to get logged in to notes-public where we run a script, and then go back to our laptop to set up remote access.

We have set up the EC2 instances and verified we have a working swarm. We still have the outstanding issue of verifying the Docker stack file created in the previous section. To do so, our next step is to deploy the Notes app on the swarm.

Deploying the Notes stack file to the swarm

We have prepared all the elements required to set up a Docker Swarm on the AWS EC2 infrastructure, we have run the scripts required to set up that infrastructure, and we have created the stack file required to deploy Notes to the swarm.

What's required next is to run docker stack deploy from our laptop, to deploy Notes on the swarm. This will give us the chance to test the stack file created earlier. You should still have the Docker context configured for the remote server, making it possible to remotely deploy the stack. However, there are four things to handle first, as follows:

  1. Install the secrets in the newly deployed swarm.
  2. Update the svc-notes environment configuration for the IP address of notes-public.
  3. Update the Twitter application for the IP address of notes-public.
  4. Log in to the ECR instance.

Let's take care of those things and then deploy the Notes stack.

Preparing to deploy the Notes stack to the swarm

We are ready to deploy the Notes stack to the swarm that we've launched. However, we have realized that we have a couple of tasks to take care of.

The environment variables for svc-notes configuration require a little adjustment. Have a look at the following code block:


Our primary requirement is to adjust the `TWITTER_CALLBACK_HOST` variable. The domain name of the `notes-public` instance changes every time the AWS infrastructure is deployed. Therefore, `TWITTER_CALLBACK_HOST` must be updated to match.

Likewise, we must go to the Twitter developer dashboard and update the URLs in the application settings. As we already know, this is required every time we host Notes at a different IP address or domain name. To log in with Twitter, we must change the list of URLs that Twitter recognizes.

Updating `TWITTER_CALLBACK_HOST` and the Twitter application settings will let us log in to Notes using a Twitter account.

While we are here, we should review the other variables and make sure they are correct as well.

The final preparation step is to log in to the ECR repository. To do so, simply execute the following command:

This has to be rerun every so often since the tokens that are downloaded time out after a few hours.

We only need to run login.sh, and none of the other scripts in the ecr directory.
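For reference, an ECR login script of this kind usually boils down to a single AWS CLI command piped into `docker login` (the account ID and region here are placeholders for your own values):

```sh
aws ecr get-login-password --region us-west-2 \
    | docker login --username AWS --password-stdin \
      123456789012.dkr.ecr.us-west-2.amazonaws.com
```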

In this section, we prepared to run the deployment. We should now be ready to deploy Notes to the swarm, so let's do it.

Deploying the Notes stack to the swarm

We just did the final preparation for deploying the Notes stack to the swarm. Take a deep breath, yell out Smoke Test, and type the following command:


This deploys the services, and the swarm responds by attempting to launch each one. The `--with-registry-auth` option sends the Docker registry authentication to the swarm so that it can download the container images from the ECR repositories. This is why we had to log in to ECR first.

### Verifying the correct launch of the Notes application stack

It is useful to monitor the launch process using the following commands:

The service ls command lists the services, with a high-level overview. Remember that the service is not the running container and, instead, the services are declared by entries in the services tag in the stack file. In our case, we declared one replica for each service, but we could have given a different amount. If so, the swarm will attempt to distribute that number of containers across the nodes in the swarm.

Notice that the pattern for service names is the name of the stack that was given in the docker stack deploy command, followed by the service name listed in the stack file. When running that command, we named the stack notes; so, the services are notes_db-notes, notes_svc-userauth, notes_redis, and so on.

The service ps command lists information about the tasks deployed for the service. Remember that a task is essentially the same as a running container. We see here that one instance of the svc-notes container has been deployed, as expected, on the notes-public host.

Sometimes, the notes_svc-notes service doesn't launch, and instead, we'll see the following message:


The error `no suitable node` means that the swarm could not find a node matching the placement criteria. In this case, the `type=public` label is probably not set correctly.

The following command is helpful:

Notice that the Labels entry is empty. In such a case, you can add the label by running this command:


Once you have run this command, the swarm will place the `svc-notes` service on the `notes-public` node.

If this happens, it may be useful to add the following command to the `user_data` script for `aws_instance.public` (in `ec2-public.tf`), just before the command that sets the `type=public` label:

It would appear that this provides a small window of opportunity to allow the swarm to establish itself.

Diagnosing a failure to launch the database services

Another possible deployment problem is that the database services might fail to launch, and the notes-private-db1 node might become Unavailable. Refer back to the docker node ls output and you will see a column marked Status. Normally, this column says Reachable, meaning that the swarm can reach and communicate with the swarm agent on that node. But with the deployment as it stands, this node might instead show an Unavailable status, and in the docker service ls output, the database services might never show as having deployed.

With remote access from our laptop, we can run the following command:


The output will tell you the current status, such as any errors in deploying the services. However, to investigate connectivity with the EC2 instance, we must log in to the `notes-public` instance, as follows:

That gets us access to the public EC2 instance. From there, we can try to ping the notes-private-db1 instance, as follows:


This should work, and yet the `docker node ls` output might show the node as `Unreachable`. Ask yourself: what happens when a computer does not have enough memory? Then recognize that we have deployed two database instances to an EC2 instance with only 1 GB of memory, which is the capacity of a `t2.micro` EC2 instance at the time of writing. Ask yourself whether the services you have deployed to a given server might have overloaded that server.

To test this theory, make the following change in `ec2-private.tf`:

This changes the instance type from t2.micro to t2.medium, or even t2.large, thereby giving the server more memory.

To implement this change, run terraform apply to update the configuration. If the swarm does not automatically correct itself, then you may need to run terraform destroy and then run through the setup again, starting with terraform apply.

Once the notes-private-db1 instance has sufficient memory, the databases should successfully deploy.

In this section, we deployed the Notes application stack to the swarm cluster on AWS. We also talked a little about how to verify the fact that the stack deployed correctly, and how to handle some common problems.

Next, we have to test the deployed Notes stack to verify that it works on AWS.

Testing the deployed Notes application

Having set up everything required to deploy Notes to AWS using Docker Swarm, we have done so. That means our next step is to put Notes through its paces. We've done enough ad hoc testing on our laptop to have confidence it works, but the Docker swarm deployment might show up some issues.

In fact, the deployment we just made very likely has one or two problems. We can learn a lot about AWS and Docker Swarm by diagnosing those problems together.

The first test is obviously to open the Notes application in the browser. In the outputs from running terraform apply was a value labeled ec2-public-dns. This is the domain name for the notes-public EC2 instance. If we simply paste that domain name into our browser, the Notes application should appear.

However, we cannot do anything because there are no user IDs available to log in with.

Logging in with a regular account on Notes

Obviously, in order to test Notes, we must log in and add some notes, make some comments, and so forth. It will be instructive to log in to the user authentication service and use cli.mjs to add a user ID.

The user authentication service is on one of the private EC2 instances, and its port is purposely not exposed to the internet. We could change the configuration to expose its port and then run cli.mjs from our laptop, but that would be a security problem and we need to learn how to access the running containers anyway.

We can find out which node the service is deployed on by using the following command:


The `notes_svc-userauth` task has been deployed to `notes-private-svc1`, as expected.

To run `cli.mjs`, we must get shell access inside the container. Since it is deployed on a private instance, this means we must first SSH to the `notes-public` instance; from there, SSH to the `notes-private-svc1` instance; and then, on that instance, run a `docker exec` command to launch a shell in the running container, as shown in the following code block:

We SSHd to the notes-public server and, from there, SSHd to the notes-private-svc1 server. On that server, we ran docker ps to find out the name of the running container. Notice that Docker generated a container name that includes a coded string, called a nonce, that guarantees the container name is unique. With that container name, we ran docker exec -it ... bash to get a root shell inside the container.

Once there, we can run the following command:


This verifies that the user authentication server works and that it can communicate with the database. To verify this further, we can access the database instance, as follows:

From there, we can explore the database and see that, indeed, Ashildr's user ID exists.

With this user ID set up, we can now use our browser to visit the Notes application and log in with that user ID.

Diagnosing an inability to log in with Twitter credentials

The next step will be to test logging in with Twitter credentials. Remember that earlier, we said to ensure that the TWITTER_CALLBACK_HOST variable has the domain name of the EC2 instance, and likewise that the Twitter application configuration does as well.

Even with those settings in place, we might run into a problem. Instead of logging in, we might get an error page with a stack trace, starting with the message: Failed to obtain request token.

There are a number of possible issues that can cause this error. For example, the error can occur if the Twitter authentication tokens are not deployed. However, if you followed the directions correctly, they will be deployed correctly.

In notes/appsupport.mjs, there is a function, basicErrorHandler, which will be invoked by this error. In that function, add this line of code:


This prints the full error, including the original error that caused the failure. Among the messages printed, you may see this: `getaddrinfo EAI_AGAIN api.twitter.com`. This can be puzzling because that domain name is certainly available. However, due to the DNS configuration, it may not be available inside the `svc-notes` container.

From the `notes-public` instance, we will be able to ping that domain name, as follows:

However, if we attempt this inside the svc-notes container, this might fail, as illustrated in the following code snippet:


Ideally, this would also work inside the container. If it fails inside the container, it means the Notes service cannot reach Twitter to handle the OAuth dance required for logging in with Twitter credentials.

The problem is that, in this situation, Docker set up an incorrect DNS configuration, and the container cannot make DNS queries for many domain names. In the Docker Compose documentation, it is suggested to use the following code in the service definition:

These two DNS servers are operated by Google, and indeed this solves the problem. Once this change has been made, you should be able to log in to Notes using Twitter credentials.

In this section, we tested the Notes application and discussed how to diagnose and remedy a couple of common problems. While doing so, we learned how to navigate our way around the EC2 instances and the Docker Swarm.

Let's now see what happens if we change the number of instances for our services.

Scaling the Notes instances

By now, we have deployed the Notes stack to the cluster on our EC2 instances. We have tested everything and know that we have a correctly functioning system deployed on AWS. Our next task is to increase the number of instances and see what happens.

To increase the instances for svc-notes, edit compose-swarm/docker-compose.yml as follows:


This increases the number of replicas. Because of the existing placement constraints, both instances will be deployed to the node with the `type` label set to `public`. To update the service, all that is required is to rerun the following command:

Earlier, this command described its actions with the word Creating, and this time it used the word Updating. This means that the services are being updated with whatever new settings are in the stack file.

After a few minutes, you may see this:


Indeed, it shows two instances of the `svc-notes` service. The `2/2` means that two instances are currently running out of the two that were requested.

To see the details, run the following command:

As we saw earlier, this command lists to which swarm nodes the service has been deployed. In this case, we'll see that both instances are on notes-public, due to the placement constraints.

Another useful command is the following:


Ultimately, every service deployed to a Docker swarm consists of one or more running containers.

You will notice that this shows `svc-notes` listening on port `3000`. In the environment settings, we did not set the `PORT` variable, so `svc-notes` defaults to listening on port `3000`. Looking back at the `docker service ls` output, you should see this: `*:80->3000/tcp`, which means Docker is handling the mapping from port `80` to port `3000`.

That is caused by the following setting in `docker-swarm/docker-compose.yml`:

This says to publish port `80` and map it to port `3000` on the container.
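The mapping being described is simply this (shown here in the short compose syntax; the stack file may use the equivalent long form):

```yaml
  svc-notes:
    ports:
      # Publish port 80 on the swarm, forwarding to port 3000 in the container
      - "80:3000"
```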

In the Docker documentation, we learn that services deployed in a swarm are reachable through a so-called *routing mesh*. Connecting to a published port routes the connection to one of the containers handling that service. Docker therefore acts as a load balancer, distributing traffic among the service instances you have configured.

In this section, we finally deployed the Notes application stack to the cloud hosting environment we built on AWS EC2 instances. We created a Docker swarm, configured that swarm, created a stack file to deploy our services, and deployed it to that infrastructure. We then tested the deployed system and found that it works well.

With that, we can wrap up this chapter.

Summary

This chapter was the culmination of our journey of learning how to deploy Node.js applications. We developed an application that existed only on our laptop and added many useful features. To deploy that application on a public server where we could get feedback, we went through three kinds of deployment. In Chapter 10, *Deploying Node.js Applications to Linux Servers*, we learned how to use PM2 to launch persistent background tasks on Linux. In Chapter 11, *Deploying Node.js Microservices with Docker*, we learned how to Dockerize the Notes application stack and how to run it with Docker.

In this chapter, we built on that foundation and learned how to deploy our Docker containers on a Docker Swarm cluster. AWS is a powerful and comprehensive cloud hosting platform with a long list of available services. We used EC2 instances and the related infrastructure within a VPC.

To accomplish this, we used Terraform, a popular tool for describing cloud deployments, not just on AWS but on many other cloud platforms. Both AWS and Terraform are widely used on projects large and small.

Along the way, we learned a lot about AWS, Terraform, and how to use Terraform to deploy infrastructure on AWS; how to set up a Docker Swarm cluster; and how to deploy a multi-container service on that infrastructure.

We started by creating an AWS account, setting up the AWS CLI tools on our laptop, and setting up Terraform. We then used Terraform to define a VPC and the networking infrastructure in which to deploy EC2 instances. We learned how to use Terraform to automate most of the EC2 configuration details so that the Docker swarm could be initialized quickly.

We learned that Docker Compose files and Docker stack files are very similar things. The latter is used with Docker Swarm and is a powerful tool for describing the deployment of Docker services.

In the next chapter, we will learn about unit testing and functional testing. While a core principle of test-driven development is to write the tests before writing the application, we have done it the other way around and put the chapter about unit testing near the end of the book. That is not to say unit testing is unimportant, because it most certainly is important.

Unit Testing and Functional Testing

Unit testing has become a major part of good software development practice. It is a method of ensuring that individual units of source code work correctly by testing them. Each unit is, in theory, the smallest testable part of an application.

In unit testing, each unit is tested separately, isolating the unit under test from the rest of the application as much as possible. If a test fails, you want it to be due to a bug in your code rather than a bug in a package that your code happens to use. A common technique is to use mock objects or mock data to isolate the individual parts of the application from one another.

Functional testing, on the other hand, does not try to test individual components; instead, it tests the whole system. Generally speaking, unit testing is performed by the development team, while functional testing is performed by a **quality assurance** (**QA**) or **quality engineering** (**QE**) team. Both testing models are needed to fully certify an application. An analogy might be that unit testing is similar to making sure each word in a sentence is correctly spelled, while functional testing makes sure the paragraph containing that sentence is well structured.

Writing a book requires more than making sure the words are spelled correctly; it requires making sure the words string together into useful, grammatically correct sentences and into chapters that convey the intended meaning. Likewise, a successful software application requires far more than making sure each "unit" behaves correctly. Does the system as a whole perform the intended actions?

In this chapter, we will cover the following topics:

  • Assertions as the basis of software tests

  • The Mocha unit testing framework and the Chai assertion library

  • Using tests to find bugs, and fixing those bugs

  • Using Docker to manage test infrastructure

  • Testing REST backend services

  • UI functional testing in a real web browser using Puppeteer

  • Improving UI testability with element ID attributes

By the end of this chapter, you will know how to use Mocha, and how to write test cases both for code that is invoked directly and for code that is accessed through a REST service. You will also have learned how to use Docker Compose to manage test infrastructure, both on your laptop and on the AWS EC2 swarm infrastructure from Chapter 12, *Deploying a Docker Swarm to AWS EC2 with Terraform*.

That is a lot of territory to cover, so let's get started.

# Assert – the basis of testing methodologies

Node.js has a useful built-in testing tool, the `assert` module. Its functionality is similar to assert libraries in other languages. In other words, it is a collection of functions for testing conditions, and the `assert` functions throw exceptions when a condition indicates an error. While it is not a complete testing framework, it can still be used for a certain amount of testing.

In its simplest form, a test suite is a series of `assert` calls that validate the behavior of the thing being tested. For example, a test suite could instantiate the user authentication service, then make an API call and use `assert` methods to validate the result, then make another API call to validate its result, and so on.

Consider the following code snippet, which you can save in a file named `deleteFile.mjs`:


The first thing to notice is this contains several layers of asynchronous callback functions. This presents a couple of challenges:  

*   Capturing errors from deep inside a callback
*   Detecting conditions where the callbacks are never called
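As an illustration of the kind of code being described (a sketch with assumed behavior; the actual `deleteFile.mjs` in the book's repository may differ), a callback-style `deleteFile` nests one asynchronous callback inside another:

```javascript
import fs from 'fs';

// Delete a file, but only after checking that it exists.
// Callback-style code like this buries errors several layers deep.
export function deleteFile(dir, fname, callback) {
  fs.stat(`${dir}/${fname}`, (err, stats) => {
    if (err) {
      // The file does not exist, or stat failed for another reason
      callback(new Error(`the file ${fname} does not exist in ${dir}`));
    } else {
      fs.unlink(`${dir}/${fname}`, err2 => {
        if (err2) callback(new Error(`could not delete ${fname}`));
        else callback();
      });
    }
  });
}
```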

The following is an example of using `assert` for testing. Create a file named `test-deleteFile.mjs` containing the following:

This is what is called a negative test scenario; it tests whether requesting the deletion of a nonexistent file throws the correct error. If the file to be deleted does not exist, the `deleteFile` function throws an error containing the text *does not exist*. This test makes sure the correct error is thrown, and it fails if the wrong error is thrown or if no error is thrown at all.

If you are looking for a quick way to test, the `assert` module can be useful when used this way. Each test case calls a function, then uses one or more `assert` statements to test the results. In this case, the `assert` statements first make sure `err` has a value of some kind, then make sure that value is an `Error` instance, and finally make sure the `message` property has the expected text. If it runs and no messages are printed, the test passes. But what happens if the `deleteFile` callback is never called? Will this test case catch that error?


No news is good news, meaning it ran without messages and therefore the test passed.

The `assert` module is used by many of the test frameworks as a core tool for writing test cases. What the test frameworks do is create a familiar test suite and test case structure to encapsulate your test code, plus create a context in which a series of test cases are robustly executed.

For example, we asked about the error of the callback function never being called. Test frameworks usually have a timeout so that if no result of any kind is supplied within a set number of milliseconds, then the test case is considered an error.

There are many styles of assertion libraries available in Node.js. Later in this chapter, we'll use the Chai assertion library ([`chaijs.com/`](http://chaijs.com/)), which gives you a choice between three different assertion styles (should, expect, and assert).

# Testing a Notes model

Let's start our unit testing journey with the data models we wrote for the Notes application. Because this is unit testing, the models should be tested separately from the rest of the Notes application.

In the case of most of the Notes models, isolating their dependencies implies creating a mock database. Are you going to test the data model or the underlying database? Mocking out a database means creating a fake database implementation, which does not look like a productive use of our time. You can argue that testing a data model is really about testing the interaction between your code and the database. Since mocking out the database means not testing that interaction, we should test our code against the database engine in order to validate that interaction.

With that line of reasoning in mind, we'll skip mocking out the database, and instead run the tests against a database containing test data. To simplify launching the test database, we'll use Docker to start and stop a version of the Notes application stack that's set up for testing.

Let's start by setting up the tools.

## Mocha and Chai­ – the chosen test tools

If you haven't already done so, duplicate the source tree so that you can use it in this chapter. For example, if you have a directory named `chap12`, create one named `chap13`, copying everything from `chap12` into `chap13`.

In the `notes` directory, create a new directory named `test`.

Mocha ([`mochajs.org/`](http://mochajs.org/)) is one of many test frameworks available for Node.js. As you'll see shortly, it helps us write test cases and test suites, and it provides a test results reporting mechanism. It was chosen over the alternatives because it supports Promises. It fits very well with the Chai assertion library mentioned earlier. 

While in the `notes/test` directory, type the following to install Mocha and Chai:

This, of course, sets up a `package.json` file and installs the required packages.

Besides Mocha and Chai, we also installed two additional tools. The first is `cross-env`, which we have used before; it gives us cross-platform support for setting environment variables on the command line. The second is `npm-run-all`, which simplifies using `package.json` to drive a build or test process.

For the documentation on `cross-env`, see www.npmjs.com/package/cross-env.

For the documentation on `npm-run-all`, see www.npmjs.com/package/npm-run-all.

With the tools set up, we can move on to creating tests.

## The Notes model test suite

Because we have several Notes models, the test suite should be able to run against any of them. We can write the tests using the NotesStore API, and an environment variable should be used to declare the model to be tested. Therefore, the test script will load `notes-store.mjs` and call functions on the object it supplies. Other environment variables will be used for other configuration settings.

Because we wrote the Notes application using ES6 modules, there is a small issue to consider. Older versions of Mocha only supported running tests in CommonJS modules, which would have required us to jump through some hoops to test the Notes modules. But the current version of Mocha supports them, meaning we are free to use ES6 modules.

We will start by writing a single test case and go through the steps of running that test and getting the results. After that, we will write more test cases and even find a few bugs. Those bugs will give us a chance to debug the application and resolve any problems. We will close this section by discussing how to run tests that require background services to be set up.

### Creating the initial Notes model test case

In the `test` directory, create a file named `test-model.mjs` containing the following. This will be the outer shell of the test suite:


This loads in the required modules and implements the first test case.

The Chai library supports three flavors of assertions. We're using the `assert` style here, but it's easy to use a different style if you prefer.

For the other assertion styles supported by Chai, see [`chaijs.com/guide/styles/`](http://chaijs.com/guide/styles/).

Chai's assertions include a very long list of useful assertion functions. For the documentation, see [`chaijs.com/api/assert/`](http://chaijs.com/api/assert/).

To load the model to be tested, we call the `useModel` function (renamed as `useNotesModel`). You'll remember that this uses the `import()` function to dynamically select the actual NotesStore implementation to use. The `NOTES_MODEL` environment variable is used to select which to load.

Calling `this.timeout` adjusts the time allowed for completing the test. By default, Mocha allows 2,000 milliseconds (2 seconds) for a test case to be completed. This particular test case might take longer than that, so we've given it more time.

The test function is declared as `async`.  Mocha can be used in a callback fashion, where Mocha passes in a callback to the test to invoke and indicate errors. However, it can also be used with `async` test functions, meaning that we can throw errors in the normal way and Mocha will automatically capture those errors to determine if the test fails.

Generally, Mocha looks to see if the function throws an exception or whether the test case takes too long to execute (a timeout situation). In either case, Mocha will indicate a test failure. That's, of course, simple to determine for non-asynchronous code. But Node.js is all about asynchronous code, and Mocha has two models for testing asynchronous code. In the first (not seen here), Mocha passes in a callback function, and the test code is to call the callback function. In the second, as seen here, it looks for a Promise being returned by the test function and determines a pass/fail regarding whether the Promise is in the `resolve` or `reject` state.

We are keeping the NotesStore model in the global `store` variable so that it can be used by all tests. The test, in this case, is whether we can load a given NotesStore implementation. As the comment states, if this executes without throwing an exception, the test has succeeded.  The other purpose of this test is to initialize the variable for use by other test cases.

It is useful to notice that this code carefully avoids loading `app.mjs`. Instead, it loads the test driver module, `models/notes-store.mjs`, and whatever module is loaded by `useNotesModel`. The `NotesStore` implementation is what's being tested, and the spirit of unit testing says to isolate it as much as possible.

Before we proceed further, let's talk about how Mocha structures tests.

With Mocha, a test suite is contained within a `describe` block. The first argument is a piece of descriptive text that you use to tailor the presentation of test results. The second argument is a `function` that contains the contents of the given test suite.

The `it` function is a test case. The intent is for us to read this as *it should successfully load the module*. Then, the code within the `function` is used to check that assertion.

With Mocha, it is important to not use arrow functions in the `describe` and `it` blocks. By now, you will have grown fond of arrow functions because of how much easier they are to write. However, Mocha calls these functions with a `this` object containing useful functions for Mocha. Because arrow functions avoid setting up a `this` object, Mocha would break.
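As a minimal illustration of that structure (not the book's actual test file), a suite and a test case look roughly like this:

```javascript
import { assert } from 'chai';

describe('example test suite', function () {
  it('should pass a trivial assertion', async function () {
    this.timeout(5000);   // works because this is not an arrow function
    assert.equal(1 + 1, 2);
  });
});
```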

Now that we have a test case written, let's learn how to run tests.

### Running the first test case

Now that we have a test case, let's run the test. In the `package.json` file, add the following `scripts` section:

What we have done here is create a `test-all` script that runs the test suite against the individual NotesStore implementations. We can run this script to run every test combination, or we can run a specific script to test just one combination. For example, `test-notes-sequelize-sqlite` runs the tests against `SequelizeNotesStore` using the SQLite3 database.

It uses `npm-run-all` to support running the tests in series. Normally, in a `package.json` script, we would write it like this:


This runs a series of steps one after another, relying on a feature of the Bash shell. The `npm-run-all` tool serves the same purpose, namely running one `package.json` script after another in the series. The first advantage is that the code is simpler and more compact, making it easier to read, while the other advantage is that it is cross-platform. We're using `cross-env` for the same purpose so that the test scripts can be executed on Windows as easily as they can be on Linux or macOS.

For the `test-notes-sequelize-sqlite` test, look closely. Here, you can see that we need a database configuration file named `sequelize-sqlite.yaml`. Create that file with the following code:

As the test script name suggests, this uses SQLite3 as the underlying database, storing it in the named file.

We are missing two combinations: `test-notes-sequelize-mysql`, for `SequelizeNotesStore` using MySQL, and `test-notes-mongodb`, which tests `MongoDBNotesStore`. We will implement those combinations later.

Having automated the running of all the test combinations, we can try it out:


If all has gone well, you'll get this result for every test combination currently supported in the `test-all` script.

This completes the first test, which was to demonstrate how to create tests and execute them. All that remains is to write more tests.

### Adding some tests

That was easy, but if we want to find what bugs we created, we need to test some functionality. Now, let's create a test suite for testing `NotesStore`, which will contain several test suites for different aspects of `NotesStore`.

What does that mean? Remember that the `describe` function is the container for a test suite and that the `it` function is the container for a test case. By simply nesting `describe` functions, we can contain a test suite within a test suite. It will be clearer what that means after we implement this:

在这里,我们有一个describe函数,它定义了一个包含另一个describe函数的测试套件。这是嵌套测试套件的结构。

目前在it函数中没有测试用例,但是我们有beforeafter函数。这两个函数的功能就像它们的名字一样;before函数在所有测试用例之前运行,而after函数在所有测试用例完成后运行。before函数旨在设置将被测试的条件,而after函数旨在进行拆卸。

在这种情况下,before函数向NotesStore添加条目,而after函数删除所有条目。其想法是在每个嵌套测试套件执行后都有一个干净的状态。

beforeafter函数是 Mocha 称之为钩子的函数。其他钩子是beforeEachafterEach。区别在于Each钩子在每个测试用例执行之前或之后触发。

这两个钩子也充当测试用例,因为createdestroy方法可能会失败,如果失败,钩子也会失败。

beforeafter钩子函数之间,添加以下测试用例:


As suggested by the description for this test suite, the functions all test the `keylist` method.

For each test case, we start by calling `keylist`, then using `assert` methods to check different aspects of the array that is returned. The idea is to call `NotesStore` API functions, then test the results to check whether they matched the expected results.
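For instance, a `keylist` test case along these lines would fit here (a sketch, not the book's exact code; it assumes the `before` hook created notes with keys `n1`, `n2`, and `n3`):

```javascript
describe('check keylist', function () {
  it('should have three entries', async function () {
    const keyz = await store.keylist();
    assert.exists(keyz);
    assert.isArray(keyz);
    assert.lengthOf(keyz, 3);
  });

  it('should have the expected keys', async function () {
    const keyz = await store.keylist();
    assert.include(keyz, 'n1');
    assert.include(keyz, 'n2');
    assert.include(keyz, 'n3');
  });
});
```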

Now, we can run the tests and get the following:

将输出与describeit函数中的描述字符串进行比较。您会发现,此输出的结构与测试套件和测试用例的结构相匹配。换句话说,我们应该将它们结构化,使其具有良好结构化的测试输出。

正如他们所说,测试永远不会完成,只会耗尽。所以,在我们耗尽之前,让我们看看我们能走多远。

Notes 模型的更多测试

这还不足以进行太多测试,所以让我们继续添加一些测试:


These tests check the `read` method. In the first test case, we check whether it successfully reads a known Note, while in the second test case, we have a negative test of what happens if we read a non-existent Note.

Negative tests are very important to ensure that functions fail when they're supposed to fail and that failures are indicated correctly.

The Chai Assertions API includes some very expressive assertions. In this case, we've used the `deepEqual` method, which does a deep comparison of two objects. You'll see that for the first argument, we pass in an object and that for the second, we pass an object that's used to check the first. To see why this is useful, let's force it to indicate an error by inserting `FAIL` into one of the test strings.

After running the tests, we get the following output:

This is what a failed test looks like. Instead of a checkmark, there is a number, and the number corresponds to a report below. In the failure report, the `deepEqual` function gives us clear information about how the object fields differ. In this case, it is the test where we deliberately made the `deepEqual` function fail, because we wanted to see how it works.

Notice that for the negative tests, where the test passes if an error is thrown, we run the code in a `try/catch` block. The `throw new Error` line in each case should not execute because the preceding code should throw an error. Therefore, we can check whether the message in the caught error is the message from that `throw new Error`, and fail the test if that is the case.

### Diagnosing test failures

We could add more tests because, obviously, these tests are not enough to be able to ship Notes to the public. After doing so, and running the tests against the different test combinations, we will find this in the results for the SQLite3 combination:


Our test suite found two errors, one of which is the error we mentioned in Chapter 7, *Data Storage and Retrieval*. Both failures came from the negative test cases. In one case, the test calls `store.read("badkey12")`, while in the other, it calls `store.delete("badkey12")`.

It is easy enough to insert `console.log` calls and learn what is going on.

For the `read` method, SQLite3 gave us `undefined` for `row`. The test suite successfully calls the `read` function multiple times with a `notekey` value that does exist. Obviously, the failure is limited to the case of an invalid `notekey` value. In such cases, the query gives an empty result set and SQLite3 invokes the callback with `undefined` in both the error and the row values. Indeed, the equivalent `SQL SELECT` statement does not throw an error; it simply returns an empty result set. An empty result set isn't an error, so we received no error and an undefined `row`.

However, we defined `read` to throw an error if no such Note exists. This means this function must be written to detect this condition and throw an error.

There is a difference between the `read` functions in `models/notes-sqlite3.mjs` and `models/notes-sequelize.mjs`. On the day we wrote `SequelizeNotesStore`, we must have thought through this function more carefully than we did on the day we wrote `SQLITE3NotesStore`. In `SequelizeNotesStore.read`, there is an error that's thrown when we receive an empty result set, and it has a check that we can adapt. Let's rewrite the `read` function in `models/notes-sqlite3.mjs` so that it reads as follows:

If this receives an empty result, it throws an error. While the database does not treat an empty result set as an error, Notes does. Furthermore, Notes already knows how to handle a thrown error in this situation. Make this change, and that particular test case passes.
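The shape of that fix is roughly as follows (a sketch only, assuming the `connectDB` helper and `Note` class used elsewhere in this model; the book's actual listing may differ in detail):

```javascript
async read(key) {
  const db = await connectDB();
  const note = await new Promise((resolve, reject) => {
    db.get('SELECT * FROM notes WHERE notekey = ?', [key], (err, row) => {
      if (err) return reject(err);
      if (!row) {
        // An empty result set is not a SQL error, but Notes treats it as one
        return reject(new Error(`No note found for ${key}`));
      }
      resolve(new Note(row.notekey, row.title, row.body));
    });
  });
  return note;
}
```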

destroy逻辑中还有第二个类似的错误。在 SQL 中,如果这个 SQL(来自models/notes-sqlite3.mjs)没有删除任何内容,这显然不是一个 SQL 错误:


Unfortunately, SQL does not give us a way to make this statement fail if it deletes no records. Therefore, we must add a check to verify that the record exists, namely the following:

Therefore, we read the Note and, as a by-product, verify that the Note exists. If it does not exist, `read` throws an error and the `DELETE` operation does not even run.

When we run `test-notes-sequelize-sqlite`, there is a similar failure in its `destroy` method. In `models/notes-sequelize.mjs`, make the following change:


This is the same change; that is, to first `read` the Note corresponding to the given `key`, and if the Note does not exist, to throw an error.

Likewise, when running `test-level`, we get a similar failure, and the solution is to edit `models/notes-level.mjs` to make the following change:

As in the other NotesStore implementations, we read the Note before destroying it. If the `read` operation fails, the test case sees the expected error.

These are the bugs we mentioned in Chapter 7, *Data Storage and Retrieval*. We simply forgot to check for these conditions in these particular models. Thankfully, our diligent testing caught the problem. At least, that is the story to tell the manager, rather than telling them we forgot to check for something we already knew could happen.

### Testing against databases that require server setup – MySQL and MongoDB

That is nice, but obviously we will not run Notes in production with a database like SQLite3 or Level. We can run Notes against the SQL databases supported by Sequelize (such as MySQL) and against MongoDB. Clearly, we have been remiss in not testing those two combinations.

Our test results matrix reads as follows:

  • `notes-fs`: PASS

  • `notes-memory`: PASS

  • `notes-level`: 1 failure, now fixed

  • `notes-sqlite3`: 2 failures, now fixed

  • `notes-sequelize` with SQLite3: 1 failure, now fixed

  • `notes-sequelize` with MySQL: untested

  • `notes-mongodb`: untested

Both of the untested NotesStore implementations require us to set up a database server. We avoided testing those combinations, but our manager will not accept that excuse because the CEO needs to know we have completed the test cycle. Notes must be tested with a configuration resembling the production environment.

In production, we will use a regular database server, with MySQL or MongoDB being the primary choices. Therefore, we need a low-overhead way to run tests against those databases. Testing against the production configuration must be so easy that we feel no resistance to doing it, so that tests are run often enough to have the desired impact.

In this section, we have made a lot of progress and have a good start on a test suite for the NotesStore database modules. We learned how to set up test suites and test cases in Mocha, and how to get useful test reporting. We learned how to use `package.json` to drive test suite execution. We also learned about negative test scenarios and how to diagnose the errors that came up.

But we need to solve the problem of testing against database servers. Fortunately, we have already worked with a technology that supports easily creating and destroying deployment infrastructure. Hello, Docker!

In the next section, we will learn how to repurpose the Docker Compose deployment as test infrastructure.

# Using Docker Swarm to manage test infrastructure

One of the advantages Docker gives us is the ability to install the production environment on our laptop. In Chapter 12, *Deploying a Docker Swarm to AWS EC2 with Terraform*, we converted a Docker setup that ran on our laptop into one that could be deployed on real cloud hosting infrastructure. That relied on converting a Docker Compose file into a Docker stack file, along with customizations for the environment we built on AWS EC2 instances.

In this section, we will repurpose the stack file as test infrastructure deployed to a Docker swarm. One approach is to simply run the same deployment, to AWS EC2, substituting new values for the `var.project_name` and `var.vpc_name` variables. In other words, the EC2 infrastructure could be deployed like so:


This would deploy a second VPC with a different name that's explicitly for test execution and that would not disturb the production deployment. It's quite common in Terraform to customize the deployment this way for different targets.

In this section, we'll try something different. We can use Docker Swarm in other contexts, not just the AWS EC2 infrastructure we set up. Specifically, it is easy to use Docker Swarm with the Docker for Windows or Docker for macOS that's running on our laptop.

What we'll do is configure Docker on our laptop so that it supports swarm mode and create a slightly modified version of the Stack file in order to run the tests on our laptop. This will solve the issue of running tests against a MySQL database server, and also lets us test the long-neglected MongoDB module. This will demonstrate how to use Docker Swarm for test infrastructure and how to perform semi-automated test execution inside the containers using a shell script.

Let's get started.

## Using Docker Swarm to deploy test infrastructure

We had a great experience using Docker Compose and Swarm to orchestrate Notes application deployment on both our laptop and our AWS infrastructure. The whole system, with five independent services, is easily described in `compose-local/docker-compose.yml` and `compose-swarm/docker-compose.yml`. What we'll do is duplicate the Stack file, then make a couple of small changes required to support test execution in a local swarm.

To configure the Docker installation on our laptop for swarm mode, simply type the following:

As before, this prints a message about the join token. If you are interested, and have multiple computers in your office, you might like to experiment with setting up a local swarm. For this exercise, it does not matter, because we can do everything required with a single-node swarm.

This is not a one-way street, meaning it is easy to turn off swarm mode when you are done with this exercise. Simply shut down anything deployed to the local swarm and run the following command:


Normally, this is used for a host that you wish to detach from an existing swarm. If there is only one host remaining in a swarm, the effect will be to shut down the swarm.

Now that we know how to initialize swarm mode on our laptop, let's set about creating a stack file suitable for use on our laptop.

Create a new directory, `compose-stack-test-local`, as a sibling to the `notes`, `users`, and `compose-local` directories. Copy `compose-stack/docker-compose.yml` to that directory. We'll be making several small changes to this file and no changes to the existing Dockerfiles. As much as it is possible, it is important to test the same containers that are used in the production deployment. This means it's acceptable to inject test files into the containers, but not modify them.

Make every `deploy` tag look like this:

This removes the placement constraints we declared for AWS EC2 and sets each service to one replica. For a single-node swarm, we obviously do not have to worry about placement, and there is no need for more than one instance of each service.
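In other words, each service's `deploy` tag might end up as minimal as this sketch:

```yaml
    deploy:
      replicas: 1
```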

For the database services, delete the `volumes` tag. That tag is required when the data in the database data directory needs to persist, but for test infrastructure the data directory does not matter and can be thrown away at will. Likewise, delete the top-level `volumes` tag.

For the `svc-notes` and `svc-userauth` services, make the following changes:


This injects the files required for testing into the `svc-notes` container. Obviously, this is the `test` directory that we created in the previous section for the Notes service. Those tests also require the SQLite3 schema file since it is used by the corresponding test script. In both cases, we can use `bind` mounts to inject the files into the running container.

The Notes test suite follows a normal practice for Node.js projects of putting `test` files in the test directory. When building the container, we obviously don't include the test files because they're not required for deployment. But running tests requires having that directory inside the running container. Fortunately, Docker makes this easy. We simply mount the directory into the correct place.
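A sketch of what those bind mounts might look like for the `svc-notes` service (the source and target paths are assumptions based on the description, not the book's exact file):

```yaml
  svc-notes:
    volumes:
      # Inject the test suite into the running container
      - type: bind
        source: ../notes/test
        target: /notesapp/test
      # The test scripts also rely on the SQLite3 schema file
      - type: bind
        source: ../notes/models/schema-sqlite3.sql
        target: /notesapp/models/schema-sqlite3.sql
```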

The bottom line is this approach gives us the following advantages:

*   The test code is in `notes/test`, where it belongs.
*   The test code is not copied into the production container.
*   In test mode, the `test` directory appears where it belongs.

For Docker (using `docker run`) and Docker Compose, the volume is mounted from a directory on the localhost. But for swarm mode, with a multi-node swarm, the container could be deployed on any host matching the placement constraints we declare. In a swarm, bind volume mounts like the ones shown here will try to mount from a directory on the host that the container has been deployed in. But we are not using a multi-node swarm; instead, we are using a single-node swarm. Therefore, the container will mount the named directory from our laptop, and all will be fine. But as soon as we decide to run testing on a multi-node swarm, we'll need to come up with a different strategy for injecting these files into the container.

We've also changed the `ports` mappings. For `svc-userauth`, we've made its port visible to give ourselves the option of testing the REST service from the host computer. For the `svc-notes` service, this will make it appear on port `3000`. In the `environment` section, make sure you did not set a `PORT` variable. Finally, we adjust `TWITTER_CALLBACK_HOST` so that it uses `localhost:3000` since we're deploying on the localhost.

For both services, we're changing the image tag from the one associated with the AWS ECR repository to one of our own designs. We won't be publishing these images to an image repository, so we can use any image tag we like.  

For both services, we are using the Sequelize data model, using the existing MySQL-oriented configuration file, and setting the `SEQUELIZE_DBHOST` variable to refer to the container holding the database. 

We've defined a Docker Stack file that should be useful for deploying the Notes application stack in a Swarm. The difference between the deployment on AWS EC2 and here is simply the configuration. With a few simple configuration changes, we've mounted test files into the appropriate container, reconfigured the volumes and the environment variables, and changed the deployment descriptors so that they're suitable for a single-node swarm running on our laptop.

Let's deploy this and see how well we did.

## Executing tests under Docker Swarm

We've repurposed our Docker Stack file so that it describes deploying to a single-node swarm, ensuring the containers are set up to be useful for testing. Our next step is to deploy the Stack to a swarm and execute the tests inside the Notes container.

To set it up, run the following commands:

我们运行swarm init在我们的笔记本上打开 swarm 模式,然后将两个TWITTER秘密添加到 swarm 中。由于它是单节点 swarm,我们不需要运行docker swarm join命令来添加新节点到 swarm 中。

然后,在compose-stack-test-local目录中,我们可以运行这些命令:


Because a Stack file is also a Compose file, we can run `docker-compose build` to build the images. Because of the `image` tags, this will automatically tag the images so that they match the image names we specified.

Then, we use `docker stack deploy`, as we did when deploying to AWS EC2\. Unlike the AWS deployment, we do not need to push the images to repositories, which means we do not need to use the `--with-registry-auth` option. This will behave almost identically to the swarm we deployed to EC2, so we explore the deployed services in the same way:

Because this is a single-host swarm, we do not need to use SSH to access the swarm nodes, nor do we need to set up remote access using `docker context`. Instead, we run Docker commands and they execute against the Docker instance on the localhost.

The `docker ps` command will tell us the precise container name for each service. With that knowledge, we can run the following command to gain access:


Because, in swarm mode, the containers have unique names, we have to run `docker ps` to get the container name, then paste it into this command to start a Bash shell inside the container.

Inside the container, we see the `test` directory is there as expected. But we have a couple of setup steps to perform. The first is to install the SQLite3 command-line tools since the scripts in `package.json` use that command. The second is to remove any existing `node_modules` directory because we don't know if it was built for this container or for the laptop. After that, we need to run `npm install` to install the dependencies.

Having done this, we can run the tests:

The tests should execute just as they did on our laptop, but this time they are running inside the container. However, the MySQL tests will not run because the `package.json` scripts are not set up to run them automatically. Therefore, we can add this to `package.json`:


This is the command that's required to execute the test suite against the MySQL database.

Then, we can run the tests against MySQL, like so:

The tests should execute correctly against MySQL.

To automate this process, we can create a file named `run.sh` containing the following code:


The script executes each script in `notes/test/package.json` individually. If you prefer, you can replace these with a single line that executes `npm run test-all`.
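As a sketch of what such a `run.sh` might contain (the working directory and the exact list of scripts are assumptions; adapt them to your own stack), each line runs one test combination inside the `svc-notes` container:

```sh
#!/bin/sh
# First argument: name of the container running svc-notes
SVC_NOTES=$1

docker exec -it --workdir /notesapp/test -e DEBUG= $SVC_NOTES \
    npm run test-notes-memory
docker exec -it --workdir /notesapp/test -e DEBUG= $SVC_NOTES \
    npm run test-notes-fs
docker exec -it --workdir /notesapp/test -e DEBUG= $SVC_NOTES \
    npm run test-notes-sqlite3
docker exec -it --workdir /notesapp/test -e DEBUG= $SVC_NOTES \
    npm run test-notes-sequelize-sqlite
docker exec -it --workdir /notesapp/test -e DEBUG= $SVC_NOTES \
    npm run test-notes-sequelize-mysql
```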

This script takes a command-line argument for the container name holding the `svc-notes` service. Since the tests are located in that container, that's where the tests must be run. The script can be executed like so:

This runs the preceding script, running each test combination individually and making sure the `DEBUG` variable is not set. That variable is set in the Dockerfile and causes debugging information to be printed in the test results output. Within the script, the `--workdir` option sets the current directory of the executed command to the `test` directory, to simplify running the test scripts.

Of course, this script will not execute as-is on Windows. To convert it for use with PowerShell, save the text starting at the second line into `run.ps1`, then change the `SVC_NOTES` references to `%SVC_NOTES%` references.

We have successfully automated the execution of most of the test matrix. However, there is a glaring hole in the test matrix, namely the lack of testing against MongoDB. Plugging that hole will let us see how to set up MongoDB under Docker.

### Setting up MongoDB under Docker and testing Notes against it

In Chapter 7, *Data Storage and Retrieval*, we developed MongoDB support for Notes. Since then, we have focused on `Sequelize`. To make up for that, let's make sure we at least test our MongoDB support. Testing on MongoDB simply requires defining a container for a MongoDB database and a little bit of configuration.

Visit hub.docker.com/_/mongo/ for the official MongoDB container. You will be able to retrofit what we do here to deploy the Notes application running on MongoDB.

Add the following code to `compose-stack-test-local/docker-compose.yml`:


That's all that's required to add a MongoDB container to a Docker Compose/Stack file. We've connected it to `frontnet` so that the database is accessible by `svc-notes`. If we wanted the `svc-notes` container to use MongoDB, we'd need some environment variables (`MONGO_URL`, `MONGO_DBNAME`, and `NOTES_MODEL`) to tell Notes to use MongoDB. 
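For reference, such a service entry might look roughly like this (the service name and image tag are assumptions for illustration):

```yaml
  db-notes-mongo:
    image: mongo:4.2
    networks:
      - frontnet
```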

But we'd also run into a problem that we created for ourselves in Chapter 9, *Dynamic Client/Server Interaction with Socket.IO*. In that chapter, we created a messaging subsystem so that our users can leave messages for each other. That messaging system is currently implemented to store messages in the same Sequelize database where the Notes are stored. But to run Notes with no Sequelize database would mean a failure in the messaging system. Obviously, the messaging system can be rewritten, for instance, to allow storage in a MongoDB database, or to support running both MongoDB and Sequelize at the same time.

Because we were careful, we can execute code in `models/notes-mongodb.mjs` without it being affected by other code. With that in mind, we'll simply execute the Notes test suite against MongoDB and report the results.

Then, in `notes/test/package.json`, we can add a line to facilitate running tests on MongoDB:

We simply added the MongoDB container to `frontnet`, making the database available at the URL shown here. It is now a simple matter of running the test suite with the Notes MongoDB model.

The `--no-timeouts` option is necessary to avoid a spurious error when running the test suite against MongoDB. This option instructs Mocha not to check whether a test case takes too long to execute.

The final requirement is to add the following line to `run.sh` (or `run.ps1` for Windows):


This ensures MongoDB can be tested alongside the other test combinations. But when we run this, an error might crop up:

The problem is that the initializer for the MongoClient object has changed slightly. Therefore, we must modify `notes/models/notes-mongodb.mjs` to use this new `connectDB` function:


This adds a pair of useful configuration options, including the option explicitly named in the error message. Otherwise, the code is unchanged.
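The change is likely along these lines (a sketch assuming the MongoDB Node.js driver's `useNewUrlParser` and `useUnifiedTopology` options, one of which is the option named in the error message; the surrounding module structure is simplified here):

```javascript
import { MongoClient } from 'mongodb';

let client;

async function connectDB() {
  if (!client) {
    client = await MongoClient.connect(process.env.MONGO_URL, {
      // Options recommended by the newer MongoDB driver
      useNewUrlParser: true,
      useUnifiedTopology: true
    });
  }
  return {
    db: client.db(process.env.MONGO_DBNAME),
    client: client
  };
}
```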

To make sure the container is running with the updated code, rerun the `docker-compose build` and `docker stack deploy` steps shown earlier. Doing so rebuilds the images, and then updates the services. Because the `svc-notes` container will relaunch, you'll need to install the Ubuntu `sqlite3` package again.

Once you've done that, the tests will all execute correctly, including the MongoDB combination.

We can now report the final test results matrix to the manager:

*   `models-fs`: PASS
*   `models-memory`: PASS
*   `models-levelup`: 1 failure, now fixed, PASS
*   `models-sqlite3`: Two failures, now fixed, PASS
*   `models-sequelize` with SQLite3: 1 failure, now fixed, PASS
*   `models-sequelize` with MySQL: PASS
*   `models-mongodb`: PASS

The manager will tell you "good job" and then remember that the models are only a portion of the Notes application. We've left two areas completely untested:

*   The REST API for the user authentication service
*   Functional testing of the user interface

In this section, we've learned how to repurpose a Docker Stack file so that we can launch the Notes stack on our laptop. It took a few simple reconfigurations of the Stack file and we were ready to go, and we even injected the files that are useful for testing. With a little bit more work, we finished testing against all configuration combinations of the Notes database modules.

Our next task is to handle testing the REST API for the user authentication service.

# Testing REST backend services

It's now time to turn our attention to the user authentication service. We've mentioned testing this service, saying that we'll get to them later. We developed a command-line tool for both administration and ad hoc testing. While that has been useful all along, it's time to get cracking with some real tests.

There's a question of which tool to use for testing the authentication service. Mocha does a good job of organizing a series of test cases, and we should reuse it here. But the thing we have to test is a REST service. The customer of this service, the Notes application, uses it through the REST API, giving us a perfect rationalization to test the REST interface rather than calling the functions directly. Our ad hoc scripts used the SuperAgent library to simplify making REST API calls. There happens to be a companion library, SuperTest, that is meant for REST API testing. It's easy to use that library within a Mocha test suite, so let's take that route.

For the documentation on SuperTest, look here: [`www.npmjs.com/package/supertest`](https://www.npmjs.com/package/supertest).

Create a directory named `compose-stack-test-local/userauth`. This directory will contain a test suite for the user authentication REST service. In that directory, create a file named `test.mjs` that contains the following code:

This sets up Mocha and the SuperTest client. The `URL_USERS_TEST` environment variable specifies the base URL of the server to run the tests against. Given the configuration we used earlier, you will almost certainly use `http://localhost:5858`, but it can be any URL pointing at any host. SuperTest is initialized a little differently from SuperAgent.

The `SuperTest` module supplies a function, which we call with the `URL_USERS_TEST` variable. This gives us an object, which we've called `request`, that is used to interact with the service under test.

We also set up a pair of variables to store the authentication user ID and key. These are the same values as in the user authentication server; we simply need to supply them when making API calls.

Finally, this is the outer shell of the Mocha test suite. So, let's start filling it in with the `before` and `after` test cases:


These are our `before` and `after` tests. We'll use them to establish a user and then clean them up by removing the user at the end.

This gives us a taste of how the `SuperTest` API works. If you refer back to `cli.mjs`, you'll see the similarities to `SuperAgent`.

The `post` and `delete` methods we can see here declare the HTTP verb to use. The `send` method provides an object for the `POST` operation. The `set` method sets header values, while the `auth` method sets up authentication:
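As an illustration of those methods working together (a sketch with assumed endpoints and field names; the real authentication service's API may differ), the `before` and `after` hooks might look like this:

```javascript
before(async function () {
  await request.post('/create-user')
    .send({ username: 'me', password: 'w0rd', provider: 'local' })
    .set('Content-Type', 'application/json')
    .set('Accept', 'application/json')
    .auth(authUser, authKey);
});

after(async function () {
  await request.delete('/destroy/me')
    .set('Content-Type', 'application/json')
    .set('Accept', 'application/json')
    .auth(authUser, authKey);
});
```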

Now, we can test some of the API methods, such as the `/list` operation.

We have already ensured, in the `before` method, that there is one account, so `/list` should give us an array with one entry.

This follows the general pattern for testing a REST API method with Mocha. First, we call the API method using SuperTest's `request` object and `await` its result. Once we have the result, we use `assert` methods to verify that it matches what is expected.

Add the following test cases:


We are checking the `/find` operation in two ways:

*   **Positive test**: Looking for the account we know exists – failure is indicated if the user account is not found
*   **Negative test**: Looking for the one we know does not exist – failure is indicated if we receive something other than an error or an empty object

Add the following test case:

Finally, we should check the `/destroy` operation. This operation is already checked in the `after` method, where we `destroy` a known user account. We also need to perform the negative test and verify its behavior against an account we know does not exist.

The desired behavior is that an error is thrown, or that the result shows an HTTP `status` indicating an error. In fact, the current authentication server code gives a 500 status code, along with some other information.

This gives us enough tests to move ahead and automate the test runs.

In `compose-stack-test-local/docker-compose.yml`, we need to inject the `test.mjs` script into the `svc-userauth-test` container. We will add it here:


This injects the `userauth` directory into the container as the `/userauth/test` directory. As we did previously, we then must get into the container and run the test script.

The next step is creating a `package.json` file to hold any dependencies and a script to run the test:

In the dependencies, we list Mocha, Chai, SuperTest, and cross-env. Then, in the `test` script, we run Mocha along with the required environment variables. That should run the tests.

We can use this test suite from our laptop. Because the test directory is injected into the container, we can also run the tests inside the container. To do so, add the following code to `run.sh`:


This adds a second argument – in this case, the container name for `svc-userauth`. We can then run the test suite, using this script to run them inside the container. The first two commands ensure the installed packages were installed for the operating system in this container, while the last runs the test suite.

Now, if you run the `run.sh` test script, you'll see the required packages get installed. Then, the test suite will be executed.

The result will look like this:

因为URL_USERS_TEST可以使用任何 URL,我们可以针对用户认证服务的任何实例运行测试套件。例如,我们可以使用适当的URL_USERS_TEST值从我们的笔记本电脑上测试在 AWS EC2 上部署的实例。

我们取得了很好的进展。我们现在已经为笔记和用户认证服务准备了测试套件。我们已经学会了如何使用 REST API 测试 REST 服务。这与直接调用内部函数不同,因为它是对完整系统的端到端测试,扮演服务的消费者角色。

我们的下一个任务是自动化测试结果报告。

# Automating test results reporting

We have automated test execution, and Mocha makes the test results look nice with all those checkmarks. But what if management wants a graph showing trends in test failures? There can be many reasons for reporting test results as data rather than as a user-friendly printout on the console.

For example, tests are often run not on a developer's laptop, nor by a tester on the quality team, but by an automated background system. The CI/CD model is widely used, in which tests are run by the CI/CD system on every commit to the shared code repository. When fully implemented, if all tests pass on a particular commit, the system automatically deploys to a server, possibly the production server. In such a situation, a user-friendly test results report is not useful; the results must instead be delivered as data that can be displayed on a CI/CD results dashboard website.

Mocha reports test results using what it calls a *Reporter*. A Mocha Reporter is a module that prints the data in whatever format it supports. More information on this can be found on the Mocha website: mochajs.org/#reporters.

You will find the list of currently available `reporters` like so:


Then, you can use a specific Reporter, like so:
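For example, selecting the TAP reporter from the npm script might look like this:

```sh
$ npm run test -- --reporter tap
```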

npm run script-name命令中,我们可以注入命令行参数,就像我们在这里所做的那样。--标记告诉 npm 将其命令行的其余部分附加到执行的命令上。效果就像我们运行了这个命令:


For Mocha, the `--reporter` option selects which Reporter to use. In this case, we selected the TAP reporter, and the output follows that format.

**Test Anything Protocol** (**TAP**) is a widely used test results format that increases the possibility of finding higher-level reporting tools. Obviously, the next step would be to save the results into a file somewhere, after mounting a host directory into the container.

In this section, we learned about the test results reporting formats supported by Mocha. This will give you a starting point for collecting long-term results tracking and other useful software quality metrics. Often, software teams rely on quality metrics trends as part of deciding whether a product can be shipped to the public.

In the next section, we'll round off our tour of testing methodologies by learning about a framework for frontend testing.

# Frontend headless browser testing with Puppeteer

A big cost area in testing is manual user interface testing. Therefore, a wide range of tools has been developed to automate running tests at the HTTP level. Selenium is a popular tool implemented in Java, for example. In the Node.js world, we have a few interesting choices. The *chai-http* plugin to Chai would let us interact at the HTTP level with the Notes application while staying within the now-familiar Chai environment. 

However, in this section, we'll use Puppeteer ([`github.com/GoogleChrome/puppeteer`](https://github.com/GoogleChrome/puppeteer)). This tool is a high-level Node.js module used to control a headless Chrome or Chromium browser, using the DevTools protocol. This protocol allows tools to instrument, inspect, debug, and profile Chromium or Chrome browser instances. The key result is that we can test the Notes application in a real browser so that we have greater assurance it behaves correctly for users. 

The Puppeteer website has extensive documentation that's worth reading: [`pptr.dev/`](https://pptr.dev/).

Puppeteer is meant to be a general-purpose test automation tool and has a strong feature set for that purpose. Because it's easy to make web page screenshots with Puppeteer, it can also be used in a screenshot service.

Because Puppeteer is controlling a real web browser, your user interface tests will be very close to live browser testing, without having to hire a human to do the work. Because it uses a headless version of Chrome, no visible browser window will show on your screen, and tests can be run in the background instead. It can also drive other browsers by using the DevTools protocol.

First, let's set up a directory to work in.

## Setting up a Puppeteer-based testing project directory

First, let's set up the directory that we'll install Puppeteer in, as well as the other packages that will be required for this project:
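Something along these lines, assuming the directory is named `notesui` as used later in this section:

```sh
$ mkdir notesui && cd notesui
$ npm init -y
$ npm install puppeteer mocha chai supertest --save
```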

这不仅安装了 Puppeteer,还安装了 Mocha、Chai 和 Supertest。我们还将使用package.json文件记录脚本。

在安装过程中,您会发现 Puppeteer 会导致 Chromium 被下载,就像这样:


The Puppeteer package will launch that Chromium instance as needed, managing it as a background process and communicating with it using the DevTools protocol.

The approach we'll follow is to test against the Notes stack we've deployed in the test Docker infrastructure. Therefore, we need to launch that infrastructure:

根据您的需求,可能还需要执行docker-compose build。无论如何,这都会启动测试基础架构,并让您看到运行中的系统。

我们可以使用浏览器访问http://localhost:3000等网址。因为这个系统不包含任何用户,我们的测试脚本将不得不添加一个测试用户,以便测试可以登录并添加笔记。

另一个重要的事项是测试将在一个匿名的 Chromium 实例中运行。即使我们在正常的桌面浏览器中使用 Chrome,这个 Chromium 实例也与我们正常的桌面设置没有任何连接。从可测试性的角度来看,这是一件好事,因为这意味着您的测试结果不会受到个人网络浏览器配置的影响。另一方面,这意味着无法进行 Twitter 登录测试,因为该 Chromium 实例没有 Twitter 登录会话。

记住这些,让我们编写一个初始的测试套件。我们将从一个简单的初始测试用例开始,以证明我们可以在 Mocha 中运行 Puppeteer。然后,我们将测试登录和注销功能,添加笔记的能力,以及一些负面测试场景。我们将在本节中讨论如何改进 HTML 应用程序的可测试性。让我们开始吧。

为 Notes 应用程序堆栈创建一个初始的 Puppeteer 测试

我们的第一个测试目标是建立一个测试套件的大纲。我们需要按顺序执行以下操作:

  1. 向用户身份验证服务添加一个测试用户。

  2. 启动浏览器。

  3. 访问首页。

  4. 验证首页是否正常显示。

  5. 关闭浏览器。

  6. 删除测试用户。

这将确保我们有能力与启动的基础架构进行交互,启动浏览器并查看 Notes 应用程序。我们将继续执行策略并在测试后进行清理,以确保后续测试运行的干净环境,并添加,然后删除,一个测试用户。

notesui目录中,创建一个名为uitest.mjs的文件,其中包含以下代码:


This imports and configures the required modules. This includes setting up `bcrypt` support in the same way that is used in the authentication server. We've also copied in the authentication key for the user authentication backend service. As we did for the REST test suite, we will use the `SuperTest` library to add, verify, and remove the test user using the REST API snippets copied from the REST tests.

Add the following test block:
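A hedged sketch of that block; the profile fields mirror the test user from the REST suite and are assumptions:

```javascript
describe('Initialize test user', function() {
    this.timeout(100000);

    it('should create test user', async function() {
        const res = await request.post('/create-user')
            .send({
                username: 'testme',
                password: await bcrypt.hash('w0rd', saltRounds),
                provider: 'local',
                familyName: 'Einarrsdottir', givenName: 'Ashildr', middleName: '',
                emails: [], photos: []
            })
            .set('Content-Type', 'application/json')
            .set('Accept', 'application/json')
            .auth(authUser, authKey);
        assert.exists(res.body);
        assert.equal(res.body.username, 'testme');
    });
});
```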

这将向身份验证服务添加一个用户。回顾一下,您会发现这与 REST 测试套件中的测试用例类似。如果您需要验证阶段,还有另一个测试用例调用/find/testme端点来验证结果。由于我们已经验证了身份验证系统,因此我们不需要在这里重新验证它。我们只需要确保我们有一个已知的测试用户,可以在需要浏览器登录的场景中使用。

将此代码放在uitest.mjs的最后:
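And the matching cleanup block, as a sketch:

```javascript
describe('Destroy test user', function() {
    this.timeout(100000);

    it('should destroy test user', async function() {
        await request.delete('/destroy/testme')
            .set('Content-Type', 'application/json')
            .set('Accept', 'application/json')
            .auth(authUser, authKey);
    });
});
```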


At the end of the test execution, we should run this to delete the test user. The policy is to clean up after we execute the test. Again, this was copied from the user authentication service test suite. Between those two, add the following:
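A sketch of that outer block; the CSS selector for the Login button is an assumption, and the nested test suites shown later in this chapter go inside this `describe` so they can share the `browser` and `page` objects:

```javascript
describe('Notes', function() {
    this.timeout(100000);

    let browser;
    let page;

    before(async function() {
        // headless: false shows a browser window; slowMo slows the interaction
        // down so we can watch what is happening
        browser = await puppeteer.launch({ headless: false, slowMo: 100 });
        page = await browser.newPage();
    });

    after(async function() {
        await page.close();
        await browser.close();
    });

    it('should visit the home page', async function() {
        await page.goto(NOTES_HOME_URL);
        // Wait until the Login button appears in the page header
        await page.waitForSelector('a.btn[href="/users/login"]');
    });

    // Login/logout, note creation, and negative-test suites are added here
});
```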

记住,在describe中,测试是it块。before块在所有it块之前执行,after块在之后执行。

before函数中,我们通过启动 Puppeteer 实例并启动一个新的 Page 对象来设置 Puppeteer。因为puppeteer.launchheadless选项设置为false,我们将在屏幕上看到一个浏览器窗口。这将很有用,因为我们可以看到发生了什么。sloMo选项也通过减慢浏览器交互来帮助我们看到发生了什么。在after函数中,我们调用这些对象的close方法来关闭浏览器。puppeteer.launch方法接受一个options对象,其中有很多值得学习的属性。

browser对象代表正在运行测试的整个浏览器实例。相比之下,page对象代表的是实质上是浏览器中当前打开的标签页。大多数 Puppeteer 函数都是异步执行的。因此,我们可以使用async函数和await关键字。

timeout设置是必需的,因为有时浏览器实例启动需要很长时间。我们慷慨地设置了超时时间,以最小化偶发测试失败的风险。

对于it子句,我们进行了少量的浏览器交互。作为浏览器标签页的包装器,page对象具有与管理打开标签页相关的方法。例如,goto方法告诉浏览器标签页导航到给定的 URL。在这种情况下,URL 是笔记主页,作为环境变量传递。

waitForSelector方法是一组等待特定条件的方法之一。这些条件包括waitForFileChooserwaitForFunctionwaitForNavigationwaitForRequestwaitForResponsewaitForXPath。这些方法以及waitFor方法都会导致 Puppeteer 异步等待浏览器中发生的某些条件。这些方法的目的是给浏览器时间来响应某些输入,比如点击按钮。在这种情况下,它会等到网页加载过程中在给定的 CSS 选择器下有一个可见的元素。该选择器指的是在页眉中的登录按钮。

换句话说,这个测试访问笔记主页,然后等待直到登录按钮出现。我们可以称之为一个简单的冒烟测试,快速执行并确定基本功能是否存在。

执行初始的 Puppeteer 测试

我们已经启动了使用docker-compose的测试基础设施。要运行测试脚本,请将以下内容添加到package.json文件的脚本部分:
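Roughly like this; plain environment-variable syntax is shown, so add `cross-env` to the dependencies if the tests must also run on Windows:

```json
"scripts": {
  "test": "URL_USERS_TEST=http://localhost:5858 NOTES_HOME_URL=http://localhost:3000 mocha uitest.mjs"
}
```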


The test infrastructure we deployed earlier exposes the user authentication service on port `5858` and the Notes application on port `3000`. If you want to test against a different deployment, adjust these URLs appropriately. Before running this, the Docker test infrastructure must be launched, which should have already happened.

Let's try running this initial test suite:

我们已经成功地创建了可以运行这些测试的结构。我们已经设置了 Puppeteer 和相关的包,并创建了一个有用的测试。主要的收获是有一个结构可以在其基础上构建更多的测试。

我们的下一步是添加更多的测试。

在笔记中测试登录/注销功能

在上一节中,我们创建了测试笔记用户界面的大纲。关于应用程序的测试并不多,但我们证明了可以使用 Puppeteer 测试笔记。

在本节中,我们将添加一个实际的测试。也就是说,我们将测试登录和注销功能。具体步骤如下:

  1. 使用测试用户身份登录。

  2. 验证浏览器是否已登录。

  3. 注销。

  4. 验证浏览器是否已注销。

uitest.js中,插入以下测试代码:


This is our test implementation for logging in and out. We have to specify the `timeout` value because it is a new `describe` block.

The `click` method takes a CSS selector, meaning this first click event is sent to the Login button. A CSS selector, as the name implies, is similar to or identical to the selectors we'd write in a CSS file. With a CSS selector, we can target specific elements on the page.

To determine the selector to use, look at the HTML for the templates and learn how to describe the element you wish to target. It may be necessary to add ID attributes into the HTML to improve testability.

The Puppeteer documentation refers to the CSS Selectors documentation on the Mozilla Developer Network website: [`developer.mozilla.org/en-US/docs/Web/CSS/CSS_Selectors`](https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_Selectors).

Clicking on the Login button will, of course, cause the Login page to appear. To verify this, we wait until the page contains a form that posts to `/users/login`. That form is in `login.hbs`.

The `type` method acts as a user typing text. In this case, the selectors target the `Username` and `Password` fields of the login form. The `delay` option inserts a pause of 100 milliseconds after typing each character. It was noted in testing that sometimes, the text arrived with missing letters, indicating that Puppeteer can type faster than the browser can accept.

The `page.keyboard` object has various methods related to keyboard events. In this case, we're asking to generate the equivalent to pressing *Enter* on the keyboard. Since, at that point, the focus is in the Login form, that will cause the form to be submitted to the Notes application. Alternatively, there is a button on that form, and the test could instead click on the button.

The `waitForNavigation` method has a number of options for waiting on page refreshes to finish. The selected option causes a wait until the DOM content of the new page is loaded.

The `$` method searches the DOM for elements matching the selector, returning an array of matching elements. If no elements match, `null` is returned instead. Therefore, this is a way to test whether the application got logged in, by looking to see if the page has a Logout button.

To log out, we click on the Logout button. Then, to verify the application logged out, we wait for the page to refresh and show a Login button:

有了这些,我们的新测试都通过了。请注意,执行一些测试所需的时间相当长。在调试测试时观察到了更长的时间,这就是我们设置长超时时间的原因。

这很好,但当然,还有更多需要测试的,比如添加笔记的能力。

测试添加笔记的能力

我们有一个测试用例来验证登录/注销功能。这个应用程序的重点是添加笔记,所以我们需要测试这个功能。作为副作用,我们将学习如何使用 Puppeteer 验证页面内容。

为了测试这个功能,我们需要按照以下步骤进行:

  1. 登录并验证我们已经登录。

  2. 点击“添加笔记”按钮进入表单。

  3. 输入笔记的信息。

  4. 验证我们是否显示了笔记,并且内容是正确的。

  5. 点击删除按钮并确认删除笔记。

  6. 验证我们最终进入了主页。

  7. 注销。

你可能会想“再次登录不是重复的吗?”之前的测试集中在登录/注销上。当然,浏览器可能已经处于登录状态了吧?如果浏览器仍然登录,这个测试就不需要再次登录。虽然这是真的,但这会导致登录/注销场景的测试不完整。每个场景在用户是否登录方面都应该是独立的。为了避免重复,让我们稍微重构一下测试。

最外层的描述块中,添加以下两个函数:
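These might be sketched as follows; as noted, this duplicates the steps from the login test, and the selectors remain assumptions:

```javascript
async function doLogin() {
    await page.click('a.btn[href="/users/login"]');
    await page.waitForSelector('form[action="/users/login"]');
    await page.type('[name=username]', 'testme', { delay: 100 });
    await page.type('[name=password]', 'w0rd', { delay: 100 });
    await page.keyboard.press('Enter');
    await page.waitForNavigation({ waitUntil: 'domcontentloaded' });
}

async function checkLogin() {
    // The Logout button is only present while the browser is logged in
    assert.isNotNull(await page.$('a[href="/users/logout"]'));
}
```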


This is the same code as the code for the body of the test cases shown previously, but we've moved the code to their own functions. With this change, any test case that wishes to log into the test user can use these functions.

Then, we need to change the login/logout tests to this:

我们所做的只是将此处的代码移动到它们自己的函数中。这意味着我们可以在其他测试中重用这些函数,从而避免重复的代码。

将以下代码添加到uitest.mjs中的笔记创建测试套件:
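A sketch of that suite; the form field names, element IDs, and note text are assumptions made for illustration:

```javascript
describe('Add and delete a Note', function() {
    this.timeout(100000);

    it('should add and then delete a note', async function() {
        await doLogin();
        await checkLogin();

        // Go to the Add Note form and fill it in
        await page.click('a[href="/notes/add"]');
        await page.waitForSelector('form[action="/notes/save"]');
        await page.type('[name=notekey]', 'testkey', { delay: 100 });
        await page.type('[name=title]', 'Test Note Subject', { delay: 100 });
        await page.type('[name=body]', 'Lorem ipsum dolor sit amet', { delay: 100 });
        await page.click('button[type="submit"]');

        // Verify the browser is showing the note, with the correct content
        await page.waitForNavigation({ waitUntil: 'domcontentloaded' });
        assert.include(await page.$eval('#notetitle', el => el.textContent),
                       'Test Note Subject');
        assert.include(await page.$eval('#notebody', el => el.textContent),
                       'Lorem ipsum dolor sit amet');
        assert.include(page.url(), '/notes/view');

        // Delete the note, confirming the deletion when asked
        assert.isNotNull(await page.$('a#notedestroy'));
        await page.click('a#notedestroy');
        await page.waitForSelector('form[action="/notes/destroy/confirm"]');
        await page.click('button[type="submit"]');

        // We should land back on the home page; finally, log out
        await page.waitForSelector('a[href="/users/logout"]');
        await page.click('a[href="/users/logout"]');
        await page.waitForSelector('a.btn[href="/users/login"]');
    });
});
```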


These are our test cases for adding and deleting Notes. We start with the `doLogin` and `checkLogin` functions to ensure the browser is logged in.

After clicking on the Add Note button and waiting for the browser to show the form in which we enter the Note details, we need to enter text into the form fields. The `page.type` method acts as a user typing on a keyboard and types the given text into the field identified by the selector.

The interesting part comes when we verify the note being shown. After clicking the **Submit** button, the browser is, of course, taken to the page to view the newly created Note. To do this, we use `page.$eval` to retrieve text from certain elements on the screen.

The `page.$eval` method scans the page for matching elements, and for each, it calls the supplied callback function. The callback function is given the element, and in our case, we call the `textContent` method to retrieve the textual form of the element. Then, we're able to use the `assert.include` function to test that the element contains the required text.

The `page.url()` method, as its name suggests, returns the URL currently being viewed. We can test whether that URL contains `/notes/view` to be certain the browser is viewing a note.

To delete the note, we start by verifying that the **Delete** button is on the screen. Of course, this button is there if the user is logged in. Once the button is verified, we click on it and wait for the `FORM` that confirms that we want to delete the Note. Once it shows up, we can click on the button, after which we are supposed to land on the home page.

Notice that to find the Delete button, we need to refer to `a#notedestroy`. As it stands, the template in question does not have that ID anywhere. Because the HTML for the Delete button was not set up so that we could easily create a CSS selector, we must edit `views/noteedit.hbs` to change the Delete button to this:

我们所做的就是添加了 ID 属性。这是改进可测试性的一个例子,我们稍后会讨论。

我们使用的一种技术是调用page.$来查询给定元素是否在页面上。这种方法检查页面,返回一个包含任何匹配元素的数组。我们只是测试返回值是否非空,因为如果没有匹配元素,page.$会返回null。这是一种简单的测试元素是否存在的方法。

点击注销按钮退出登录。

创建了这些测试用例后,我们可以再次运行测试套件:


We have more passing tests and have made good progress. Notice how one of the test cases took 18 seconds to finish. That's partly because we slowed text entry down to make sure it is correctly received in the browser, and there is a fair amount of text to enter. There was a reason we increased the timeout.

In earlier tests, we had success with negative tests, so let's see if we can find any bugs that way.

## Implementing negative tests with Puppeteer

Remember that a negative test is used to purposely invoke scenarios that will fail. The idea is to ensure the application fails correctly, in the expected manner.

We have two scenarios for an easy negative test:

*   Attempt to log in using a bad user ID and password
*   Access a bad URL

Both of these are easy to implement, so let's see how it works.

### Testing login with a bad user ID

A simple way to ensure we have a bad username and password is to generate random text strings for both. An easy way to do that is with the `uuid` package. This package generates Universally Unique IDs (UUIDs), and one of its modes simply generates a unique random string. That's all we need for this test, because the string is guaranteed to be unique.

To make this crystal clear, by using a unique random string, we ensure that we don't accidentally use a username that might be in the database. Therefore, we will be certain of supplying an unknown username when trying to log in.

In `uitest.mjs`, add the following to the imports:
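Assuming the `uuid` package has been added to the project (`npm install uuid --save`), the import is:

```javascript
import { v4 as uuidv4 } from 'uuid';
```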

uuid包支持几种方法,v4方法是生成随机字符串的方法。

然后,添加以下场景:
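A sketch, nested inside the outer `describe` block as before; the selectors are the same assumptions used earlier:

```javascript
describe('Log in with a bad user ID', function() {
    this.timeout(100000);

    it('should fail to log in with an unknown username', async function() {
        await page.click('a.btn[href="/users/login"]');
        await page.waitForSelector('form[action="/users/login"]');

        // A random UUID string is guaranteed not to match any existing user
        await page.type('[name=username]', uuidv4(), { delay: 100 });
        await page.type('[name=password]', uuidv4(), { delay: 100 });
        await page.keyboard.press('Enter');

        // Notes simply redisplays the login form after a failed login attempt
        await page.waitForSelector('form[action="/users/login"]');
        assert.isNotNull(await page.$('a.btn[href="/users/login"]'));
    });
});
```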


This starts with the login scenario. Instead of a fixed username and password, we instead use the results of calling `uuidv4()`, or the random UUID string.

This does the login action, and then we wait for the resulting page. In trying this manually, we learn that it simply returns us to the login screen and that there is no additional message. Therefore, the test looks for the login form and ensures there is a Login button. Between the two, we are certain the user is not logged in.

We did not find a code error with this test, but there is a user experience error: namely, the fact that, for a failed login attempt, we simply show the login form and do not provide a message (that is, *unknown username or password*), which leads to a bad user experience. The user is left feeling confused over what just happened. So, let's put that on our backlog to fix.

### Testing a response to a bad URL 

Our next negative test is to try a bad URL in Notes. We coded Notes to return a 404 status code, which means the page or resource was not found. The test is to ask the browser to visit the bad URL, then verify that the result uses the correct error message.

Add the following test case:
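A sketch of that test case; the text checked on the error page is an assumption:

```javascript
describe('Bad URLs', function() {
    this.timeout(100000);

    it('should receive a 404 for an unknown URL', async function() {
        // Compute a known-bad URL from the home page URL
        const badURL = new URL(NOTES_HOME_URL);
        badURL.pathname = '/bad-unknown-url';

        const response = await page.goto(badURL.toString());
        await page.waitForSelector('header');

        assert.equal(response.status(), 404);
        assert.include(await page.$eval('h1', el => el.textContent), 'Not Found');
    });
});
```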

通过获取主页的 URL(NOTES_HOME_URL)并将 URL 的pathname部分设置为/bad-unknown-url来计算错误的 URL。由于在笔记中没有这条路径,我们肯定会收到一个错误。如果我们想要更确定,似乎可以使用uuidv4()函数使 URL 变得随机。

调用page.goto()只是让浏览器转到请求的 URL。对于后续页面,我们等到出现一个带有header元素的页面。因为这个页面上没有太多内容,所以header元素是确定我们是否有了后续页面的最佳选择。

要检查 404 状态码,我们调用response.status(),这是在 HTTP 响应中收到的状态码。然后,我们调用page.$eval从页面中获取一些项目,并确保它们包含预期的文本。

在这种情况下,我们没有发现任何代码问题,但我们发现了另一个用户体验问题。错误页面非常丑陋且不友好。我们知道用户体验团队会对此大声抱怨,所以将其添加到待办事项中,以改进此页面。

在这一部分中,我们通过创建一些负面测试来结束了测试开发。虽然这并没有导致发现代码错误,但我们发现了一对用户体验问题。我们知道这将导致与用户体验团队进行不愉快的讨论,因此我们已经主动将修复这些页面的任务添加到了待办事项中。但我们也学会了随时留意沿途出现的任何问题。众所周知,由开发或测试团队发现的问题的修复成本最低。当用户社区报告问题时,修复问题的成本会大大增加。

在我们结束本章之前,我们需要更深入地讨论一下可测试性。

改进笔记 UI 的可测试性

虽然 Notes 应用程序在浏览器中显示良好,但我们如何编写测试软件来区分一个页面和另一个页面?正如我们在本节中看到的,UI 测试经常执行一个导致页面刷新的操作,并且必须等待下一个页面出现。这意味着我们的测试必须能够检查页面,并确定浏览器是否显示了正确的页面。一个错误的页面本身就是应用程序中的一个错误。一旦测试确定它是正确的页面,它就可以验证页面上的数据。

底线是,每个 HTML 元素必须能够轻松地使用 CSS 选择器进行定位。

While in most cases it is easy to write a CSS selector for each element, in a few cases it is difficult. The **software quality engineering** (**SQE**) manager has asked for our help. What's at stake is the testing budget: if the SQE team can automate its tests, that budget will stretch much further.

所需的只是为 HTML 元素添加一些idclass属性,以提高可测试性。有了一些标识符和对这些标识符的承诺,SQE 团队可以编写可重复的测试脚本来验证应用程序。

我们已经看到了一个例子:views/noteview.hbs中的删除按钮。我们无法为该按钮编写 CSS 选择器,因此我们添加了一个 ID 属性,让我们能够编写测试。

总的来说,可测试性是为了软件质量测试人员的利益而向 API 或用户界面添加东西。对于 HTML 用户界面来说,这意味着确保测试脚本可以定位 HTML DOM 中的任何元素。正如我们所见,idclass属性在满足这一需求方面起到了很大作用。

在这一部分,我们学习了用户界面测试作为功能测试的一种形式。我们使用了 Puppeteer,一个用于驱动无头 Chromium 浏览器实例的框架,作为测试 Notes 用户界面的工具。我们学会了如何自动化用户界面操作,以及如何验证显示的网页是否与其正确的行为匹配。这包括覆盖登录、注销、添加笔记和使用错误的用户 ID 登录的测试场景。虽然这没有发现任何明显的失败,但观察用户交互告诉我们 Notes 存在一些可用性问题。

有了这些,我们准备结束本章。

总结

在本章中,我们涵盖了很多领域,并查看了三个不同的测试领域:单元测试、REST API 测试和 UI 功能测试。确保应用程序经过充分测试是通往软件成功的重要一步。一个不遵循良好测试实践的团队往往会陷入修复回归问题的泥潭。

首先,我们谈到了只使用断言模块进行测试的潜在简单性。虽然测试框架,比如 Mocha,提供了很好的功能,但我们可以用一个简单的脚本走得更远。

测试框架,比如 Mocha,有其存在的价值,至少是为了规范我们的测试用例并生成测试结果报告。我们用 Mocha 和 Chai 做到了这一点,这些工具非常成功。我们甚至在一个小的测试套件中发现了一些错误。

在开始单元测试之路时,一个设计考虑是模拟依赖关系。但并不总是一个好的做法用模拟版本替换每个依赖。因此,我们对一个实时数据库运行了我们的测试,但使用了测试数据。

为了减轻运行测试的行政负担,我们使用 Docker 来自动设置和拆除测试基础设施。就像 Docker 在自动部署 Notes 应用程序方面很有用一样,它在自动化测试基础设施部署方面也很有用。

最后,我们能够在真实的 Web 浏览器中测试 Notes 网络用户界面。我们不能指望单元测试能够找到每一个错误;有些错误只会在 Web 浏览器中显示。

在本书中,我们已经涵盖了 Node.js 开发的整个生命周期,从概念、通过各个开发阶段,到部署和测试。这将为您提供一个坚实的基础,从而开始开发 Node.js 应用程序。

在下一章中,我们将探讨另一个关键领域——安全性。我们将首先使用 HTTPS 对用户访问 Notes 进行加密和认证。我们将使用几个 Node.js 包来减少安全入侵的机会。

Node.js 应用程序中的安全性

我们即将结束学习 Node.js 的旅程。但还有一个重要的话题需要讨论:安全。您的应用程序的安全性非常重要。您想因为您的应用程序是自 Twitter 以来最伟大的东西而上新闻,还是因为通过您的网站发起的大规模网络安全事件而闻名?

多年来,全球各地的网络安全官员一直呼吁加强互联网安全。诸如互联网连接的安全摄像头之类的东西中的安全漏洞已被不法分子武器化为庞大的僵尸网络,并用于殴打网站或进行其他破坏。在其他情况下,由于安全入侵而导致的猖獗身份盗窃对我们所有人构成了财务威胁。几乎每天,新闻中都会有更多关于网络安全问题的揭示。

我们在本书中多次提到了这个问题。从第十章开始,即在 Linux 上部署 Node.js 应用程序,我们讨论了需要将 Notes 的部署分段以对抗入侵,并特别是将用户数据库隔离在受保护的容器中。您在关键系统周围放置的安全层越多,攻击者进入的可能性就越小。虽然 Notes 是一个玩具应用程序,但我们可以用它来学习如何实施 Web 应用程序安全。

安全不应该是事后才考虑的,就像测试不应该是事后才考虑的一样。两者都非常重要,即使只是为了避免公司因错误原因而上新闻。

在本章中,我们将涵盖以下主题:

  • 在 AWS ECS 上为 Express 应用程序实施 HTTPS/SSL

  • 使用 Helmet 库为内容安全策略、DNS 预取控制、帧选项、严格传输安全性和减轻 XSS 攻击实施标头

  • 防止跨站点请求伪造攻击表单

  • SQL 注入攻击

  • 对已知漏洞的软件包进行预部署扫描

  • 审查 AWS 上可用的安全设施

对于一般建议,Express 团队在expressjs.com/en/advanced/best-practice-security.html上有一个出色的安全资源页面。

如果尚未这样做,请复制第十三章,单元测试和功能测试,源树,您可能已经称为chap13,以创建一个安全源树,您可以称为chap14

在本章结束时,您将了解到提供 SSL 证书的详细信息,使用它们来实施 HTTPS 反向代理。之后,您将了解有关改进 Node.js Web 应用程序安全性的几种工具。这应该为您提供 Web 应用程序安全的基础。

让我们从为部署的 Notes 应用程序实施 HTTPS 支持开始。

Implementing HTTPS in Docker for the deployed Node.js application

当前的最佳实践是每个网站都必须使用 HTTPS 访问。传输未加密信息的时代已经过去。这种旧模式容易受到中间人攻击和其他威胁的影响。

使用 SSL 和 HTTPS 意味着互联网连接经过身份验证和加密。加密足够好,可以阻止除最先进的窥探者之外的所有人,而身份验证意味着我们确信网站就是它所说的那样。HTTPS 使用 HTTP 协议,但使用 SSL 或安全套接字层进行加密。实施 HTTPS 需要获取 SSL 证书并在 Web 服务器或 Web 应用程序中实施 HTTPS 支持。

给定一个合适的 SSL 证书,Node.js 应用程序可以很容易地实现 HTTPS,因为只需少量代码就可以给我们一个 HTTPS 服务器。但还有另一种方法,可以提供额外的好处。NGINX 是一个备受推崇的 Web 服务器和代理服务器,非常成熟和功能丰富。我们可以使用它来实现 HTTPS 连接,并同时获得另一层保护,防止潜在的不法分子和 Notes 应用程序之间的攻击。

我们已经在 AWS EC2 集群上使用 Docker swarm 部署了 Notes。使用 NGINX 只是简单地向 swarm 添加另一个容器,配置所需的工具来提供 SSL 证书。为此,我们将使用一个将 NGINX 与 Let's Encrypt 客户端程序结合在一起,并编写脚本来自动更新证书的 Docker 容器。Let's Encrypt 是一个非营利性组织,提供免费 SSL 证书的优秀服务。使用他们的命令行工具,我们可以根据需要提供和管理 SSL 证书。

在这一部分,我们将做以下工作:

  1. 配置一个域名指向我们的 swarm

  2. 整合一个包含 NGINX、Cron 和 Certbot(Let's Encrypt 客户端工具之一)的 Docker 容器

  3. 在该容器中实现自动化流程来管理证书的更新

  4. 配置 NGINX 监听端口443(HTTPS)以及端口80(HTTP)

  5. 配置 Twitter 应用程序以支持网站的 HTTPS

这可能看起来是很多工作,但每项任务都很简单。让我们开始吧。

为部署在 AWS EC2 上的应用程序分配一个域名

Notes 应用程序是使用在 AWS EC2 实例上构建的 Docker swarm 部署的。其中一个实例有一个由 AWS 分配的公共 IP 地址和域名。最好给 EC2 实例分配一个域名,因为 AWS 分配的名称不仅用户不友好,而且在下次重新部署集群时会更改。给 EC2 实例分配一个域名需要有一个注册的域名,添加一个列出其 IP 地址的 A 记录,并在 EC2 IP 地址更改时更新 A 记录。

添加 A 记录意味着什么?域名系统DNS)是让我们可以使用geekwisdom.net这样的名称来访问网站,而不是 IP 地址216.239.38.21。在 DNS 协议中,有几种类型的记录可以与系统中的域名条目相关联。对于这个项目,我们只需要关注其中一种记录类型,即 A 记录,用于记录域名的 IP 地址。一个被告知访问任何域的网络浏览器会查找该域的 A 记录,并使用该 IP 地址发送网站内容的 HTTP(S)请求。

将 A 记录添加到域的 DNS 条目的具体方法在不同的域注册商之间差异很大。例如,一个注册商(Pair Domains)有这样的屏幕:

在特定域的仪表板中,可能有一个用于添加新 DNS 记录的部分。在这个注册商中,下拉菜单可以让你在记录类型中进行选择。选择 A 记录类型,然后在你的域名中在右侧框中输入 IP 地址,在左侧框中输入子域名。在这种情况下,我们正在创建一个子域,notes.geekwisdom.net,这样我们就可以部署一个测试站点,而不会影响到托管在该域上的主站点。这也让我们避免了为这个项目注册一个新域名的费用。

一旦你点击“添加记录”按钮,A 记录就会被发布。由于 DNS 记录通常需要一些时间来传播,你可能无法立即访问域名。如果这需要超过几个小时,你可能做错了什么。

一旦 A 记录成功部署,你的用户就可以访问notes.geekwisdom.net这样一个漂亮的域名的 Notes 应用程序。

请注意,每次重新部署 EC2 实例时,IP 地址都会更改。如果重新部署 EC2 实例,则需要更新新地址的 A 记录。

在本节中,我们已经了解了将域名分配给 EC2 实例。这将使我们的用户更容易访问 Notes,同时也让我们可以提供 HTTPS/SSL 证书。

添加域名意味着更新 Twitter 应用程序配置,以便 Twitter 知道该域名。

更新 Twitter 应用程序

Twitter 需要知道哪些 URL 对我们的应用程序有效。到目前为止,我们已经告诉 Twitter 我们笔记本上的测试 URL。我们在一个真实域上有 Notes,我们需要告诉 Twitter 这一点。

我们已经做过这个几次了,所以你已经知道该怎么做了。前往developers.twitter.com,使用您的 Twitter 帐户登录,然后转到应用程序仪表板。编辑与您的 Notes 实例相关的应用程序,并将您的域名添加到 URL 列表中。

我们将为 Notes 应用程序实现 HTTP 和 HTTPS,因此 Notes 将具有http://https:// URL。这意味着您不仅必须将 HTTP URL 添加到 Twitter 配置站点,还必须将 HTTPS URL 添加到其中。

compose-stack/docker-compose.yml文件中,svc-notes配置中的TWITTER_CALLBACK_HOST环境变量也必须使用该域名进行更新。

现在我们已经有了与 EC2 集群关联的域名,并且我们已经通知了 Twitter 该域名。我们应该能够重新部署 Notes 到集群,并能够使用该域名。这包括能够使用 Twitter 登录,创建和删除笔记等。在这一点上,您不能将 HTTPS URL 放入TWITTER_CALLBACK_HOST,因为我们还没有实现 HTTPS 支持。

这些步骤为在 Notes 上使用 Let's Encrypt 实现 HTTPS 做好了准备。但首先,让我们来了解一下 Let's Encrypt 的工作原理,以便更好地为 Notes 实现它。

规划如何使用 Let's Encrypt

与每个 HTTPS/SSL 证书提供商一样,Let's Encrypt 需要确保您拥有您正在请求证书的域。成功使用 Let's Encrypt 需要在发出任何 SSL 证书之前进行成功验证。一旦域名注册到 Let's Encrypt,注册必须至少每 90 天更新一次,因为这是他们 SSL 证书的到期时间。域名注册和证书更新因此是我们必须完成的两项主要任务。

在本节中,我们将讨论注册和更新功能的工作原理。我们的目标是了解我们将如何管理我们计划使用的任何域的 HTTPS 服务。

Let's Encrypt 支持 API,并且有几个客户端应用程序用于此 API。Certbot 是 Let's Encrypt 请求的推荐用户界面。它可以轻松安装在各种操作系统上。例如,它可以通过 Debian/Ubuntu 软件包管理系统获得。

有关 Let's Encrypt 文档,请参阅letsencrypt.org/docs/

有关 Certbot 文档,请参阅certbot.eff.org/docs/intro.html

验证域名所有权是 HTTPS 的核心特性,这使得它成为任何 SSL 证书供应商确保正确分发 SSL 证书的核心要求。Let's Encrypt 有几种验证策略,在这个项目中,我们将专注于其中一种,即 HTTP-01 挑战。

HTTP-01 挑战涉及 Let's Encrypt 服务向 URL 发出请求,例如http://<YOUR_DOMAIN>/.well-known/acme-challenge/<TOKEN><TOKEN>是 Let's Encrypt 提供的编码字符串,Certbot 工具将其写入目录中的文件。我们的任务是以某种方式允许 Let's Encrypt 服务器使用此 URL 检索该文件。

一旦 Certbot 成功地将域名注册到 Let's Encrypt,它将收到一对 PEM 文件,包括 SSL 证书。Certbot 跟踪各种管理细节和 SSL 证书,通常在/etc/letsencrypt目录中。然后必须使用 SSL 证书来实现 Notes 的 HTTPS 服务器。

Let's Encrypt SSL 证书在 90 天后过期,我们必须创建一个自动化的管理任务来更新证书。Certbot 也用于证书更新,通过运行certbot renew。这个命令查看在这台服务器上注册的域名,并对任何需要更新的域名重新运行验证过程。因此,必须保持启用 HTTP-01 挑战所需的目录。

拥有 SSL 证书后,我们必须配置一些 HTTP 服务器实例来使用这些证书来实现 HTTPS。非常有可能配置svc-notes服务来独立处理 HTTPS。在 Node.js 运行时中有一个 HTTPS 服务器对象,可以处理这个要求。在notes/app.mjs中进行小的重写以适应 SSL 证书来实现 HTTPS,以及 HTTP-01 挑战。

但还有另一种可能的方法。诸如 NGINX 之类的 Web 服务器非常成熟、稳健、经过充分测试,最重要的是支持 HTTPS。我们可以使用 NGINX 来处理 HTTPS 连接,并使用所谓的反向代理将流量传递给svc-notes作为 HTTP。也就是说,NGINX 将被配置为接受入站 HTTPS 流量,将其转换为 HTTP 流量发送到svc-notes

除了实现 HTTPS 的安全目标之外,这还有一个额外的优势,即使用一个备受推崇的 Web 服务器(NGINX)来作为对抗某些类型攻击的屏障。

在查看了 Let's Encrypt 文档之后,我们知道了如何继续。有一个可用的 Docker 容器,可以处理我们需要在 NGINX 和 Let's Encrypt 中进行的所有操作。在下一节中,我们将学习如何将该容器与 Notes 堆栈集成,并实现 HTTPS。

使用 NGINX 和 Let's Encrypt 在 Docker 中为 Notes 实现 HTTPS

我们刚刚讨论了如何使用 Let's Encrypt 为 Notes 实现 HTTPS。我们将采取的方法是使用一个预先制作的 Docker 容器,Cronginx(hub.docker.com/r/robogeek/cronginx),其中包括 NGINX、Certbot(Let's Encrypt 客户端)和一个用于管理 SSL 证书更新的 Cron 服务器和 Cron 作业。这只需要向 Notes 堆栈添加另一个容器,进行一些配置,并运行一个命令来注册我们的域名到 Let's Encrypt。

在开始本节之前,请确保您已经设置了一个域名,我们将在这个项目中使用。

在 Cronginx 容器中,Cron 用于管理后台任务以更新 SSL 证书。是的,Cron,Linux/Unix 管理员几十年来一直用来管理后台任务的服务器。

NGINX 配置将同时处理 HTTP-01 挑战并为 HTTPS 连接使用反向代理。代理服务器充当中间人;它接收来自客户端的请求,并使用其他服务来满足这些请求。反向代理是一种从一个或多个其他服务器检索资源的代理服务器,同时使其看起来像资源来自代理服务器。在这种情况下,我们将配置 NGINX 以访问http://svc-notes:3000上的 Notes 服务,同时使 Notes 服务看起来是由 NGINX 代理托管的。

如果您不知道如何配置 NGINX,不用担心,因为我们将准确地展示该怎么做,而且相对简单。

添加 Cronginx 容器以支持 Notes 上的 HTTPS

我们已经确定,添加 HTTPS 支持需要向 Notes 堆栈添加另一个容器。这个容器将处理 HTTPS 连接,并集成用于管理从 Let's Encrypt 获取的 SSL 证书的工具。

compose-stack目录中,编辑docker-compose.yml如下:


Because the `svc-notes` container will not be handling inbound traffic, we start by disabling its `ports` tag. This has the effect of ensuring it does not export any ports to the public. Instead, notice that in the `cronginx` container we export both port `80` (HTTP) and port `443` (HTTPS). That container will take over interfacing with the public internet.

Another change on `svc-notes` is to set the `TWITTER_CALLBACK_HOST` environment variable. Set this to the domain name you've chosen. Remember that correctly setting this variable is required for successful login using Twitter. Until we finish implementing HTTPS, this should have an HTTP URL.

The `deploy` tag for Cronginx is the same as for `svc-notes`. In theory, because `svc-notes` is no longer interacting with the public it could be redeployed to an EC2 instance on the private network. Because both are attached to `frontnet`, either will be able to access the other with a simple domain name reference, which we'll see in the configuration file.

This container uses the same DNS configuration, because Certbot needs to be able to reach the Let's Encrypt servers to do its work.

The final item of interest is the volume mounts. In the previous section, we discussed certain directories that must be mounted into this container. As with the database containers, the purpose is to persist the data in those directories while letting us destroy and recreate the Cronginx container as needed. Each directory is mounted from `/home/ubuntu` because that's the directory that is available on the EC2 instances. The three directories are as follows:

*   `/etc/letsencrypt`: As discussed earlier, Certbot uses this directory to track administrative information about domains being managed on the server. It also stores the SSL certificates in this directory.
*   `/webroots`: This directory will be used in satisfying the HTTP-01 request to the `http://<YOUR_DOMAIN>/.well-known/acme-challenge/<TOKEN>` URL.
*   `/etc/nginx/conf.d`: This directory holds the NGINX configuration files for each domain we'll handle using this Cronginx instance.

For NGINX configuration, there is a default config file at `/etc/nginx/nginx.conf`. That file automatically includes any configuration file in `/etc/nginx/conf.d`, within an `http` context. What that means is each such file should have one or more `server` declarations. It won't be necessary to go deeper into learning about NGINX since the config files we will use are very straightforward.

We will be examining NGINX configuration files. If you need to learn more about these files, the primary documentation is at [`nginx.org/en/docs/`](https://nginx.org/en/docs/).

Further documentation for the commercial NGINX Plus product is at [`www.nginx.com/resources/admin-guide/`](https://www.nginx.com/resources/admin-guide/).

The NGINX website has a *Getting Started* section with many useful recipes at [`www.nginx.com/resources/wiki/start/`](https://www.nginx.com/resources/wiki/start/).

It will be a useful convention to follow to have one file in the `/etc/nginx/conf.d` directory for each domain you are hosting. That means, in this project, you will have one domain, and therefore you'll store one file in the directory named `YOUR-DOMAIN.conf`. For the example domain we configured earlier, that file would be `notes.geekwisdom.net.conf`.

### Creating an NGINX configuration to support registering domains with Let's Encrypt

At this point, you have selected a domain you will use for Notes. To register a domain with Let's Encrypt, we need a web server configured to satisfy requests to the `http://<YOUR_DOMAIN>/.well-known/acme-challenge/<TOKEN>` URL, and where the corresponding directory is writable by Certbot. All the necessary elements are contained in the Cronginx container. 

What we need to do is create an NGINX configuration file suitable for handling registration, then run the shell script supplied inside Cronginx. After registration is handled, there will be another NGINX configuration file that's suitable for HTTPS. We'll go over that in a later section.

Create a file for your domain named `initial-YOUR-DOMAIN.conf`, named this way because it's the initial configuration file for the domain. It will contain this:
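A sketch of that configuration; the log file paths are assumptions:

```nginx
server {
    listen 80;
    # listen [::]:80;          # uncomment to also listen on IPv6

    server_name YOUR-DOMAIN;
    access_log /var/log/nginx/YOUR-DOMAIN.access.log;
    error_log  /var/log/nginx/YOUR-DOMAIN.error.log;

    # Serve the Let's Encrypt HTTP-01 challenge files
    location /.well-known/ {
        root /webroots/YOUR-DOMAIN/;
    }

    # Reverse proxy everything else to the Notes service
    location / {
        proxy_pass http://svc-notes:3000/;
    }
}
```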

正如我们所说,NGINX 配置文件相对简单。这声明了一个服务器,本例中监听端口为80(HTTP)。如果需要,可以轻松开启 IPv6 支持。

server_name字段告诉 NGINX 要处理哪个域名。access_logerror_log字段,顾名思义,指定了日志输出的位置。

location块描述了如何处理域的 URL 空间的部分。在第一个块中,它表示/.well-known URL 上的 HTTP-01 挑战是通过从/webroots/YOUR-DOMAIN读取文件来处理的。我们已经在docker-compose.yml文件中看到了该目录的引用。

第二个location块描述了反向代理配置。在这种情况下,我们配置它以在端口3000上对svc-notes容器运行 HTTP 代理。这对应于docker-compose.yml文件中的配置。

这就是配置文件,但在部署到 swarm 之前,我们需要做一些工作。

在 EC2 主机上添加所需的目录

我们已经确定了三个用于 Cronginx 的目录。请记住,每个 EC2 主机都是由我们在 Terraform 文件的user_data字段中提供的 shell 脚本进行配置的。该脚本安装 Docker 并执行另一个设置。因此,我们应该使用该脚本来创建这三个目录。

terraform-swarm中,编辑ec2-public.tf并进行以下更改:


There is an existing shell script that performs the Docker setup. These three lines are appended to that script and create the directories.

With this in place, we can redeploy the EC2 cluster, and the directories will be there ready to be used.

### Deploying the EC2 cluster and Docker swarm

Assuming that the EC2 cluster is currently not deployed, we can set it up as we did in Chapter 12, *Deploying a Docker Swarm to AWS EC2 with Terraform*. In `terraform-swarm`, run this command:

到目前为止,你已经做了几次这样的事情,知道该怎么做。等待部署完成,记录 IP 地址和其他数据,然后初始化 swarm 集群并设置远程控制访问,这样你就可以在笔记本上运行 Docker 命令。

一个非常重要的任务是获取 IP 地址并转到您的 DNS 注册商,更新域的 A 记录为新的 IP 地址。

我们需要将 NGINX 配置文件复制到/home/ubuntu/nginx-conf-d,操作如下:


The `chown` command is required because when Terraform created that directory it became owned by the `root` user. It needs to be owned by the `ubuntu` user for the `scp` command to work.

At this point make sure that, in `compose-swarm/docker-compose.yml`, the `TWITTER_CALLBACK_HOST` environment variable for `svc-notes` is set to the HTTP URL (`http://YOUR-DOMAIN`) rather than the HTTPS URL. Obviously you have not yet provisioned HTTPS and can only use the HTTP domain.

With those things set up, we can run this:

这将向 swarm 添加所需的秘密,并部署 Notes 堆栈。几分钟后,所有服务应该都已启动。请注意,Cronginx 是其中之一。

一旦完全启动,您应该能够像以往一样使用 Notes,但使用您配置的域名。您甚至可以使用 Twitter 登录。

使用 Let's Encrypt 注册域名

我们刚刚在 AWS EC2 基础设施上部署了 Notes 堆栈。这次部署的一部分是 Cronginx 容器,我们将用它来处理 HTTPS 配置。

我们已经在 swarm 上部署了 Notes,cronginx容器充当 HTTP 代理。在该容器内预先安装了 Certbot 工具和一个脚本(register.sh)来帮助注册域名。我们必须在cronginx容器内运行register.sh,一旦域名注册完成,我们将需要上传一个新的 NGINX 配置文件。

cronginx容器内启动 shell 可能会很容易:


You see there is a file named `register.sh` containing the following:
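Its contents are approximately the following:

```sh
#!/bin/sh
# Usage: sh register.sh YOUR-DOMAIN
mkdir -p /webroots/$1/.well-known/acme-challenge
certbot certonly --webroot -w /webroots/$1 -d $1
```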

该脚本旨在创建/webroots中所需的目录,并使用 Certbot 注册域名并提供 SSL 证书。参考配置文件,您将看到/webroots目录的使用方式。

certbot certonly命令只检索 SSL 证书,不会在任何地方安装它们。这意味着它不会直接集成到任何服务器中,而只是将证书存储在一个目录中。该目录位于/etc/letsencrypt层次结构内。

--webroot选项意味着我们正在与现有的 Web 服务器合作。必须配置它以从指定为-w选项的目录中提供/.well-known/acme-challenge文件,这就是我们刚刚讨论过的/webroots/YOUR-DOMAIN目录。-d选项是要注册的域名。

简而言之,register.sh与我们创建的配置文件相匹配。

脚本的执行方式如下:


We run the shell script using `sh -x register.sh` and supply our chosen domain name as the first argument. Notice that it creates the `/webroots` directory, which is required for the Let's Encrypt validation. It then runs `certbot certonly`, and the tool starts asking questions required for registering with the service.

The registration process ends with this message:

关键数据是构成 SSL 证书的两个 PEM 文件的路径名。它还告诉您定期运行certbot renew来更新证书。我们已经通过安装 Cron 作业来处理了这个问题。

正如他们所说,将这个目录持久化存储在其他地方是很重要的。我们已经采取了第一步,将其存储在容器外部,这样我们可以随意销毁和重新创建容器。但是当需要销毁和重新创建 EC2 实例时怎么办?在您的待办事项中安排一个任务来设置备份程序,然后在 EC2 集群初始化期间从备份中安装这个目录。

现在我们的域名已经注册到 Let's Encrypt,让我们修改 NGINX 配置以支持 HTTPS。

使用 Let's Encrypt 证书实现 NGINX HTTPS 配置

好了,我们离加密如此之近,我们可以感受到它的味道。我们已经将 NGINX 和 Let's Encrypt 工具部署到了笔记应用程序堆栈中。我们已经验证了仅支持 HTTP 的 NGINX 配置是否正确。我们已经使用 Certbot 为 HTTPS 从 Let's Encrypt 提供 SSL 证书。现在是时候重写 NGINX 配置以支持 HTTPS,并将该配置部署到笔记堆栈中。

compose-stack/cronginx中创建一个新文件,YOUR-DOMAIN.conf,例如notes.geekwisdom.net.conf。之前的文件有一个前缀initial,因为它在实现 HTTPS 的初始阶段为我们提供了服务。现在域名已经注册到 Let's Encrypt,我们需要一个不同的配置文件:


This reconfigures the HTTP server to do permanent redirects to the HTTPS site. When an HTTP request results in a 301 status code, that is a permanent redirect. Any redirect tells web browsers to visit a URL provided in the redirect. There are two kinds of redirects, temporary and permanent, and the 301 code makes this a permanent redirect. For permanent redirects, the browser is supposed to remember the redirect and apply it in the future. In this case, the redirect URL is computed to be the request URL, rewritten to use the HTTPS protocol.

Therefore our users will silently be sent to the HTTPS version of Notes, with no further effort on our part.

To implement the HTTPS server, add this to the config file:
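A sketch of the HTTPS server declaration; the certificate paths follow Certbot's usual layout, and the proxy settings are the standard ones for upgrading Socket.IO connections to WebSocket:

```nginx
server {
    listen 443 ssl;
    server_name YOUR-DOMAIN;
    access_log /var/log/nginx/YOUR-DOMAIN.access.log;
    error_log  /var/log/nginx/YOUR-DOMAIN.error.log;

    # Paths reported by Certbot when the certificate was issued
    ssl_certificate     /etc/letsencrypt/live/YOUR-DOMAIN/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/YOUR-DOMAIN/privkey.pem;

    # Future Let's Encrypt validation requests, over HTTPS
    location /.well-known/ {
        root /webroots/YOUR-DOMAIN/;
    }

    # Socket.IO requires the connection to be upgraded to the WebSocket protocol
    location /socket.io/ {
        proxy_pass http://svc-notes:3000/socket.io/;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
    }

    # Reverse proxy for the rest of the Notes application
    location / {
        proxy_pass http://svc-notes:3000/;
    }
}
```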

这是 NGINX 中的 HTTPS 服务器实现。与 HTTP 服务器声明有许多相似之处,但也有一些特定于 HTTPS 的项目。它在端口443上监听,这是 HTTPS 的标准端口,并告诉 NGINX 使用 SSL。它具有相同的服务器名称和日志配置。

下一部分告诉 NGINX SSL 证书的位置。只需用 Certbot 给出的路径名替换它。

下一部分处理了/.well-known的 URL,用于将来使用 Let's Encrypt 进行验证请求。HTTP 和 HTTPS 服务器定义都已配置为从同一目录处理此 URL。我们不知道 Let's Encrypt 是否会通过 HTTP 或 HTTPS URL 请求验证,因此我们可能会在两个服务器上都支持这一点。

下一部分是一个代理服务器,用于处理/socket.io的 URL。这需要特定的设置,因为 Socket.IO 必须从 HTTP/1.1 升级到 WebSocket。否则,JavaScript 控制台会打印错误,并且 Socket.IO 功能将无法工作。有关更多信息,请参见代码中显示的 URL。

最后一部分是设置一个反向代理,将 HTTPS 流量代理到运行在端口3000上的 HTTP 后端服务器上的笔记应用程序。

创建了一个新的配置文件后,我们可以将其上传到notes-public EC2 实例中,方法如下:


The next question is how do we restart the NGINX server so it reads the new configuration file? One way is to send a SIGHUP signal to the NGINX process, causing it to reload the configuration:
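For example, from a shell inside the cronginx container (the PID file path is the usual NGINX default and may differ):

```sh
kill -HUP `cat /var/run/nginx.pid`
```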

nginx.pid文件包含 NGINX 进程的进程 ID。许多 Unix/Linux 系统上的后台服务都将进程 ID 存储在这样的文件中。这个命令向该进程发送 SIGHUP 信号,NGINX 在接收到该信号时会重新读取其配置。SIGHUP 是标准的 Unix/Linux信号之一,通常用于导致后台进程重新加载其配置。有关更多信息,请参见signal(2)手册页。

但是,使用 Docker 命令,我们可以这样做:


That will kill the existing container and start a new one.

Instead of that rosy success message, you might get this instead:

这表示 Docker swarm 看到容器退出了,因此无法重新启动服务。

在 NGINX 配置文件中很容易出错。首先仔细查看配置,看看可能出了什么问题。诊断的下一阶段是查看 NGINX 日志。我们可以使用docker logs命令来做到这一点,但我们需要知道容器的名称。因为容器已经退出,我们必须运行这个命令:


The `-a` option causes `docker ps` to return information about every container, even the ones that are not currently running. With the container name in hand, we can run this:

事实上,问题是语法错误,它甚至会友好地告诉您行号。

一旦您成功重新启动了cronginx服务,请访问您部署的 Notes 服务并验证它是否处于 HTTPS 模式。

在本节中,我们成功地为基于 AWS EC2 的 Docker 集群部署了 Notes 应用程序堆栈的 HTTPS 支持。我们使用了上一节中创建的 Docker 容器文件,并将更新后的 Notes 堆栈部署到了集群中。然后我们运行 Certbot 来注册我们的域名并使用 Let's Encrypt。然后我们重写了 NGINX 配置以支持 HTTPS。

我们的下一个任务是验证 HTTPS 配置是否正常工作。

测试 Notes 应用程序的 HTTPS 支持

在本书中,我们对 Notes 进行了临时测试和更正式的测试。因此,您知道要确保 Notes 在这个新环境中正常工作需要做什么。但是还有一些特定于 HTTPS 的事项需要检查。

在浏览器中,转到您托管应用程序的域名。如果一切顺利,您将会看到应用程序,并且它将自动重定向到 HTTPS 端口。

为了让我们人类知道网站是在 HTTPS 上,大多数浏览器在地址栏中显示一个图标。

您应该能够单击该锁图标,浏览器将显示一个对话框,提供有关证书的信息。证书将验证这确实是正确的域,并且还将显示证书是由 Let's Encrypt 通过Let's Encrypt Authority X3颁发的。

您应该能够浏览整个应用程序并仍然看到锁图标。

您应该注意mixed content警告。这些警告将出现在 JavaScript 控制台中,当 HTTPS 加载的页面上的某些内容使用 HTTP URL 加载时会出现。混合内容场景不够安全,因此浏览器会向用户发出警告。消息可能会出现在浏览器内的 JavaScript 控制台中。如果您正确地按照本书中的说明操作,您将不会看到此消息。

最后,前往 Qualys SSL Labs SSL 实现测试页面。该服务将检查您的网站,特别是 SSL 证书,并为您提供一个分数。要检查您的分数,请参阅www.ssllabs.com/ssltest/

完成了这项任务后,您可能希望关闭 AWS EC2 集群。在这样做之前,最好先从 Let's Encrypt 中注销域名。这也只需要运行带有正确命令的 Certbot:
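For example (the container name will differ in your deployment):

```sh
$ docker ps                          # note the cronginx container name
$ docker exec -it <cronginx-container> bash
# certbot delete --cert-name YOUR-DOMAIN
```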


As before, we run `docker ps` to find out the exact container name. With that name, we start a command shell inside the container. The actual act is simple, we just run `certbot delete` and specify the domain name.

Certbot doesn't just go ahead and delete the registration. Instead, it asks you to verify that's what you want to do, then it deletes the registration.

In this section, we have finished implementing HTTPS support for Notes by learning how to test that it is implemented correctly.

We've accomplished a redesign of the Notes application stack using a custom NGINX-based container to implement HTTPS support. This approach can be used for any service deployment, where an NGINX instance is used as the frontend to any kind of backend service.

But we have other security fish to fry. Using HTTPS solves only part of the security problem. In the next section, we'll look at Helmet, a tool for Express applications to set many security options in the HTTP headers.

# Using Helmet for across-the-board security in Express applications

While it was useful to implement HTTPS, that's not the end of implementing security measures. It's hardly the beginning of security, for that matter. The browser makers working with the standards organizations have defined several mechanisms for telling the browser what security measures to take. In this section, we will go over some of those mechanisms, and how to implement them using Helmet.

Helmet ([`www.npmjs.com/package/helmet`](https://www.npmjs.com/package/helmet)) is, as the development team says, not a security silver bullet (do Helmet's authors think we're trying to protect against vampires?). Instead, it is a toolkit for setting various security headers and taking other protective measures in Node.js applications. It integrates with several packages that can be either used independently or through Helmet.

Using Helmet is largely a matter of importing the library into `node_modules`, making a few configuration settings, and integrating it with Express.

In the `notes` directory, install the package like so:
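```sh
$ npm install helmet --save
```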

然后将此添加到notes/app.mjs中:
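Roughly this, placed where the other middleware is set up:

```javascript
import helmet from 'helmet';
// ...
app.use(helmet());
```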


That's enough for most applications. Using Helmet out of the box provides a reasonable set of default security options. We could be done with this section right now, except that it's useful to examine closely what Helmet does, and its options.

Helmet is actually a cluster of 12 modules for applying several security techniques. Each can be individually enabled or disabled, and many have configuration settings to make. One option is instead of using that last line, to initialize and configure the sub-modules individually. That's what we'll do in the following sections.

## Using Helmet to set the Content-Security-Policy header

The **Content-Security-Policy** (**CSP**) header can help to protect against injected malicious JavaScript and other file types.

We would be remiss to not point out a glaring problem with services such as the Notes application. Our users could enter any code they like, and an improperly behaving application will simply display that code. Such applications can be a vector for JavaScript injection attacks among other things.

To try this out, edit a note and enter something like this:

单击保存按钮,您将看到此代码显示为文本。Notes 的危险版本将在 notes 视图页面中插入<script>标签,以便加载恶意 JavaScript 并为访问者造成问题。相反,<script>标签被编码为安全的 HTML,因此它只会显示为屏幕上的文本。我们并没有为这种行为做任何特殊处理,Handlebars 为我们做了这个。

实际上,这更有趣一些。如果我们查看 Handlebars 文档,handlebarsjs.com/expressions.html,我们会了解到这个区别:


In Handlebars, a value appearing in a template using two curly braces (`{{encoded}}`) is encoded using HTML coding. For the previous example, the angle bracket is encoded as `&lt;` and so on for display, rendering that JavaScript code as neutral text rather than as HTML elements. If instead, you use three curly braces (`{{{notEncoded}}}`), the value is not encoded and is instead presented as is. The malicious JavaScript would be executed in your visitor's browser, causing problems for your users.

We can see this problem by changing `views/noteview.hbs` to use raw HTML output:

我们不建议这样做,除非作为一个实验来看看会发生什么。效果是,正如我们刚才说的,允许用户输入 HTML 代码并将其原样显示。如果 Notes 以这种方式行事,任何笔记都可能携带恶意 JavaScript 片段或其他恶意软件。

让我们回到 Helmet 对 Content-Security-Policy 头的支持。有了这个头部,我们指示 Web 浏览器可以从哪个范围下载某些类型的内容。具体来说,它让我们声明浏览器可以从哪些域下载 JavaScript、CSS 或字体文件,以及浏览器允许连接哪些域进行服务。

因此,这个标头解决了所命名的问题,即我们的用户输入恶意 JavaScript 代码。但它还处理了恶意行为者入侵并修改模板以包含恶意 JavaScript 代码的类似风险。在这两种情况下,告诉浏览器特定的允许域名列表意味着恶意网站的 JavaScript 引用将被阻止。从pirates.den加载的恶意 JavaScript 不会运行。

要查看此 Helmet 模块的文档,请参阅helmetjs.github.io/docs/csp/

有很多选项。例如,您可以导致浏览器将任何违规行为报告给您的服务器,这样您就需要为/report-violation实现一个路由处理程序。这段代码对 Notes 来说已经足够了:
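A sketch of such a configuration; the exact domain list depends on your deployment, and the Google Fonts and WebSocket entries are examples:

```javascript
app.use(helmet.contentSecurityPolicy({
    directives: {
        defaultSrc: ["'self'"],
        // Inline JavaScript is used in noteview.hbs and index.hbs
        scriptSrc: ["'self'", "'unsafe-inline'"],
        // CSS and fonts come from the local server and from Google Fonts
        styleSrc: ["'self'", 'fonts.googleapis.com'],
        fontSrc: ["'self'", 'fonts.gstatic.com'],
        // The WebSockets channel used by Socket.IO
        connectSrc: ["'self'", 'wss://notes.geekwisdom.net']
    }
}));
```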


For better or for worse, the Notes application implements one security best practice—all CSS and JavaScript files are loaded from the same server as the application. Therefore, for the most part, we can use the `'self'` policy. There are several exceptions:

*   `scriptSrc`: Defines where we are allowed to load JavaScript. We do use inline JavaScript in `noteview.hbs` and `index.hbs`, which must be allowed.
*   `styleSrc`, `fontSrc`: We're loading CSS files from both the local server and from Google Fonts.
*   `connectSrc`: The WebSockets channel used by Socket.IO is declared here.

To develop this, we can open the JavaScript console or Chrome DevTools while browsing the website. Errors will show up listing any domains of failed download attempts. Simply add such domains to the configuration object.

### Making the ContentSecurityPolicy configurable

Obviously, the ContentSecurityPolicy settings shown here should be configurable. If nothing else the setting for `connectSrc` must be, because it can cause a problem that prevents Socket.IO from working. As shown here, the `connectSrc` setting includes the URL `wss://notes.geekwisdom.net`. The `wss` protocol here refers to WebSockets and is designed to allow Socket.IO to work while Notes is hosted on `notes.geekwisdom.net`. But what about when we want to host it on a different domain?

To experiment with this problem, change the hard coded string to a different domain name then redeploy it to your server. In the JavaScript console in your browser you will get an error like this:

发生的情况是,静态定义的常量不再与 Notes 部署的域兼容。您已重新配置此设置,以限制连接到不同域,例如notes.newdomain.xyz,但服务仍托管在现有域,例如notes.geekwisdom.net。浏览器不再相信连接到notes.geekwisdom.net是安全的,因为您的配置说只信任notes.newdomain.xyz

最好的解决方案是通过声明另一个环境变量来使其成为可配置的设置,以便根据需要进行设置以自定义行为。

app.mjs中,将contentSecurityPolicy部分更改为以下内容:


This lets us define an environment variable, `CSP_CONNECT_SRC_URL`, which will supply a URL to be added into the array passed to the `connectSrc` parameter. Otherwise, the `connectSrc` setting will be limited to `"'self'"`.

Then in `compose-swarm/docker-compose.yml`, we can declare that variable like so:
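For example:

```yaml
    environment:
      # ... other variables ...
      CSP_CONNECT_SRC_URL: "wss://notes.geekwisdom.net"
```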

我们现在可以在配置中设置它,根据需要进行更改。

重新运行docker stack deploy命令后,错误消息将消失,Socket.IO 功能将开始工作。

在本节中,我们了解了网站向浏览器发送恶意脚本的潜力。接受用户提供内容的网站,如 Notes,可能成为恶意软件的传播途径。通过使用这个标头,我们能够通知网络浏览器在访问这个网站时信任哪些域名,从而阻止任何恶意内容被恶意第三方添加。

接下来,让我们学习如何防止过多的 DNS 查询。

使用头盔设置 X-DNS-Prefetch-Control 标头

DNS Prefetch 是一些浏览器实现的一种便利,其中浏览器将预先为给定页面引用的域名进行 DNS 请求。如果页面有指向其他网站的链接,它将为这些域名进行 DNS 请求,以便填充本地 DNS 缓存。这对用户很好,因为它提高了浏览器的性能,但它也是一种侵犯隐私的行为,并且可能使人看起来好像访问了他们没有访问的网站。有关文档,请参阅helmetjs.github.io/docs/dns-prefetch-control

使用以下内容设置 DNS 预取控制:
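With Helmet, that is a one-liner:

```javascript
app.use(helmet.dnsPrefetchControl({ allow: false }));
```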


In this case, we learned about preventing the browser from making premature DNS queries. The risk is that excess DNS queries give a false impression of which websites someone has visited.

Let's next look at how to control which browser features can be enabled.

## Using Helmet to control enabled browser features using the Feature-Policy header

Web browsers nowadays have a long list of features that can be enabled, such as vibrating a phone, or turning on the camera or microphone, or reading the accelerometer. These features are interesting and very useful in some cases, but can be used maliciously. The Feature-Policy header lets us notify the web browser about which features to allow to be enabled, or to deny enabling.

For Notes we don't need any of those features, though some look intriguing as future possibilities. For instance, we could pivot to taking on Instagram if we allowed people to upload photos, maybe? In any case, this configuration is very strict:
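A sketch of a strict policy using Helmet's feature-policy submodule; the feature list can be extended with any of the Feature-Policy names:

```javascript
app.use(helmet.featurePolicy({
    features: {
        accelerometer: ["'none'"],
        camera: ["'none'"],
        geolocation: ["'none'"],
        microphone: ["'none'"],
        payment: ["'none'"],
        vibrate: ["'none'"]
    }
}));
```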

要启用一个功能,要么将其设置为'self'以允许网站启用该功能,要么将其设置为第三方网站的域名,以允许启用该功能。例如,启用支付功能可能需要添加'paypal.com'或其他支付处理器。

在本节中,我们学习了允许启用或禁用浏览器功能。

在下一节中,让我们学习如何防止点击劫持。

使用头盔设置 X-Frame-Options 标头

点击劫持与劫持汽车无关,而是一种巧妙的技术,用于诱使人们点击恶意内容。这种攻击使用一个包含恶意代码的不可见<iframe>,放置在看起来诱人点击的东西上。然后用户会被诱使点击恶意内容。

Helmet 的frameguard模块将设置一个标头,指示浏览器如何处理<iframe>。有关文档,请参阅helmetjs.github.io/docs/frameguard/
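For Notes, denying all framing looks like this:

```javascript
app.use(helmet.frameguard({ action: 'deny' }));
```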


This setting controls which domains are allowed to put this page into an `<iframe>`. Using `deny`, as shown here, prevents all sites from embedding this content using an `<iframe>`. Using `sameorigin` allows the site to embed its own content. We can also list a single domain name to be allowed to embed this content.

In this section, you have learned about preventing our content from being embedded into another website using `<iframe>`.

Now let's learn about hiding the fact that Notes is powered by Express.

## Using Helmet to remove the X-Powered-By header

The `X-Powered-By` header can give malicious actors a clue about the software stack in use, informing them of attack algorithms that are likely to succeed. The Hide Powered-By submodule for Helmet simply removes that header.

Express can disable this feature on its own:
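```javascript
app.disable('x-powered-by');
```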

或者您可以使用 Helmet 来这样做:
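```javascript
app.use(helmet.hidePoweredBy());
```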


Another option is to masquerade as some other stack like so:
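For instance, pretending to be a PHP stack:

```javascript
app.use(helmet.hidePoweredBy({ setTo: 'PHP 4.2.0' }));
```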

没有什么比让坏人迷失方向更好的了。

我们已经学会了如何让您的 Express 应用程序隐身,以避免给坏人提供关于如何闯入的线索。接下来让我们学习一下如何声明对 HTTPS 的偏好。

通过严格传输安全性改进 HTTPS

在实现了 HTTPS 支持之后,我们还没有完全完成。正如我们之前所说的,最好让我们的用户使用 Notes 的 HTTPS 版本。在我们的 AWS EC2 部署中,我们强制用户使用 HTTPS 进行重定向。但在某些情况下,我们无法这样做,而必须试图鼓励用户访问 HTTPS 站点而不是 HTTP 站点。

严格传输安全性标头通知浏览器应该使用站点的 HTTPS 版本。由于这只是一个通知,还需要实现从 HTTP 到 HTTPS 版本的重定向。

我们设置严格传输安全性如下:
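For example, a sixty-day policy:

```javascript
const sixtyDaysInSeconds = 60 * 24 * 60 * 60;
app.use(helmet.hsts({ maxAge: sixtyDaysInSeconds }));
```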


This tells the browser to stick with the HTTPS version of the site for the next 60 days, and never visit the HTTP version.

And, as long as we're on this issue, let's learn about `express-force-ssl`, which is another way to implement a redirect so the users use HTTPS. After adding a dependency to that package in `package.json`, add this in `app.mjs`:
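The usage is simply:

```javascript
import forceSSL from 'express-force-ssl';
// ...
app.use(forceSSL);
```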

安装了这个软件包后,用户不必被鼓励使用 HTTPS,因为我们在默默地强制他们这样做。

在我们在 AWS EC2 上的部署中,使用这个模块会导致问题。因为 HTTPS 是在负载均衡器中处理的,Notes 应用程序不知道访问者正在使用 HTTPS。相反,Notes 看到的是一个 HTTP 连接,如果使用了forceSSL,它将强制重定向到 HTTPS 站点。但是因为 Notes 根本没有看到 HTTPS 会话,它只看到 HTTP 请求,而forceSSL将始终以重定向方式响应。

这些设置并非在所有情况下都有用。您的环境可能需要这些设置,但对于像我们在 AWS EC2 上部署的环境来说,这根本不需要。对于这些有用的站点,我们已经了解到如何通知 Web 浏览器使用我们网站的 HTTPS 版本,以及如何强制重定向到 HTTPS 站点。

接下来让我们学习一下跨站脚本XSS)攻击。

使用 Helmet 减轻 XSS 攻击

XSS 攻击试图将 JavaScript 代码注入到网站输出中。通过在另一个网站中注入恶意代码,攻击者可以访问他们本来无法检索的信息,或者引起其他类型的麻烦。 X-XSS-Protection 标头可以防止某些 XSS 攻击,但并非所有类型的 XSS 攻击,因为 XSS 攻击有很多种类型:
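With Helmet this protection is enabled as follows:

```javascript
app.use(helmet.xssFilter());
```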


This causes an X-XSS-Protection header to be sent specifying `1; mode=block`. This mode tells the browser to look for JavaScript in the request URL that also matches JavaScript on the page, and it then blocks that code. This is only one type of XSS attack, and therefore this is of limited usefulness. But it is still useful to have this enabled.

In this section, we've learned about using Helmet to enable a wide variety of security protections in web browsers. With these settings, our application can work with the browser to avoid a wide variety of attacks, and therefore make our site significantly safer.

But with this, we have exhausted what Helmet provides. In the next section, we'll learn about another package that prevents cross-site request forgery attacks.

# Addressing Cross-Site Request Forgery (CSRF) attacks

CSRF attacks are similar to XSS attacks in that both occur across multiple sites. In a CSRF attack, malicious software forges a bogus request on another site. To prevent such an attack, CSRF tokens are generated for each page view. The tokens are to be included as hidden values in HTML FORMs and then checked when the FORM is submitted. A mismatch on the tokens causes the request to be denied.

The `csurf` package is designed to be used with Express [`www.npmjs.com/package/csurf`](https://www.npmjs.com/package/csurf) . In the `notes` directory, run this:
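```sh
$ npm install csurf --save
```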

这将安装csurf软件包,并在package.json中记录依赖关系。

然后像这样安装中间件:
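Roughly as follows, placed after the `cookieParser` middleware:

```javascript
import csrf from 'csurf';
// ...
app.use(csrf({ cookie: true }));
```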


The `csurf` middleware must be installed following the `cookieParser` middleware.

Next, for every page that includes a FORM, we must generate and send a token with the page. That requires two things, in the `res.render` call we generate the token, sending the token with other data for the page, and then in the view template we include the token as a hidden INPUT on any form in the page. We're going to be touching on several files here, so let's get started.

In `routes/notes.mjs,` add the following as a parameter to the `res.render` call for the `/add`, `/edit`, `/view`, and `/destroy` routes:
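For example, for the `/add` route (the other `res.render` calls get the same extra parameter):

```javascript
res.render('noteedit', {
    // ... existing parameters ...
    csrfToken: req.csrfToken()
});
```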

这将生成 CSRF 令牌,确保它与其他数据一起发送到模板。同样,在routes/users.mjs中的/login路由也要这样做。我们的下一个任务是确保相应的模板将令牌呈现为隐藏的输入。

views/noteedit.hbsviews/notedestroy.hbs中,添加以下内容:


This is a hidden INPUT, and whenever the FORM containing this is submitted this value will be carried along with the FORM parameters.

The result is that code on the server generates a token that is added to each FORM. By adding the token to FORMs, we ensure it is sent back to the server on FORM submission. Other software on the server can then match the received token to the tokens that have been sent. Any mismatched token will cause the request to be rejected.

In `views/login.hbs`, make the same addition but adding it inside the FORM like so:

views/noteview.hbs中,有一个用于提交评论的表单。做出以下更改:


In every case, we are adding a hidden INPUT field. These fields are not visible to the user and are therefore useful for carrying a wide variety of data that will be useful to receive on the server. We've already used hidden INPUT fields in Notes, such as in `noteedit.hbs` for the `docreate` flag.

This `<input>` tag renders the CSRF token into the FORM. When the FORM is submitted, the `csurf` middleware checks it for the correctness and rejects any that do not match.

In this section, we have learned how to stop an important type of attack, CSRF.

# Denying SQL injection attacks

SQL injection is another large class of security exploits, where the attacker puts SQL commands into input data. See [`www.xkcd.com/327/`](https://www.xkcd.com/327/) for an example.

The best practice for avoiding this problem is to use parameterized database queries, allowing the database driver to prevent SQL injections simply by correctly encoding all SQL parameters. For example, we do this in the SQLite3 model:
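A sketch of that pattern, assuming the table and column names used by the Notes SQLite3 model:

```javascript
db.get("SELECT * FROM notes WHERE notekey = ?", [ key ], (err, row) => {
    // ... handle err or use row ...
});
```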

这使用了一个参数化字符串,key的值被编码并插入到问号的位置。大多数数据库驱动程序都有类似的功能,并且它们已经知道如何将值编码到查询字符串中。即使坏人将一些 SQL 注入到key的值中,因为驱动程序正确地对key的内容进行了编码,最坏的结果也只是一个 SQL 错误消息。这自动使任何尝试的 SQL 注入攻击无效。

与我们本可以编写的另一种选择形成对比:
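Namely, building the query with a template string, which must be avoided:

```javascript
// DO NOT do this: the value of key is neither screened nor encoded
db.get(`SELECT * FROM notes WHERE notekey = ${key}`, (err, row) => {
    // ...
});
```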


The template strings feature of ES6 is very tempting to use everywhere. But it is not appropriate in all circumstances. In this case, the database query parameter would not be screened nor encoded, and if a miscreant can get a custom string to that query it could cause havoc in the database.

In this section, we learned about SQL injection attacks. We learned that the best defense against this sort of attack is the coding practice all coders should follow anyway, namely to use parameterized query methods offered by the database driver.

In the next section, we will learn about an effort in the Node.js community to screen packages for vulnerabilities.

# Scanning for known vulnerabilities in Node.js packages

Built-in to the npm command-line tool is a command, `npm audit`, for reporting known vulnerabilities in the dependencies of your application. To support this command is a team of people, and software, who scan packages added to the npm registry. Every third-party package used by your application is a potential security hole.

It's not just that a query against the application might trigger buggy code, whether in your code or third-party packages. In some cases, packages that explicitly cause harm have been added to the npm registry.

Therefore the security audits of packages in the npm registry are extremely helpful to every Node.js developer.

The `audit` command consults the vulnerability data collected by the auditing team and tells you about vulnerabilities in packages your application uses.

When running `npm install`, the output might include a message like this:

这告诉我们,当前安装的软件包中有八个已知的漏洞。每个漏洞在这个规模上被分配了一个严重性等级(docs.npmjs.com/about-audit-reports):

  • 严重: 立即处理

  • : 尽快处理

  • 中等: 尽可能快地处理

  • : 自行处理

在这种情况下,运行npm audit告诉我们,所有低优先级问题都在minimist软件包中。例如,报告中包括了这样的内容:


In this case, `minimist` is reported because `hbs` uses `handlebars`, which uses `optimist`, which uses `minimist`. There are six more instances where `minimist` is used by some package that's used by another package that our application is using.

In this case, we're given a recommendation, to upgrade to `hbs@4.1.1`, because that release results in depending on the correct version of `minimist`.

In another case, the chain of dependencies is this:

在这种情况下,没有推荐的修复方法,因为这些软件包都没有发布依赖于正确版本的minimist的新版本。这种情况的推荐解决方案是向每个相应的软件包团队提交问题,要求他们将其依赖项更新为有问题软件包的后续版本。

在最后一种情况下,是我们的应用直接依赖于有漏洞的软件包:


Therefore it is our responsibility to fix this problem because it is in our code. The good news is that this particular package is not executed on the server side since jQuery is a client-side library that just so happens to be distributed through the npm repository.

The first step is to read the advisory to learn what the issue is. That way, we can evaluate for ourselves how serious this is, and what we must do to correctly fix the problem.

What's not recommended is to blindly update to a later package release just because you're told to do so. What if the later release is incompatible with your application? The best practice is to test that the update does not break your code. You may need to develop tests that illustrate the vulnerability. That way, you can verify that updating the package dependency fixes the problem.

In this case, the advisory says that jQuery releases before 3.5.0 have an XSS vulnerability. We are using jQuery in Notes because it is required by Bootstrap, and on the day we read the Bootstrap documentation we were told to use a much earlier jQuery release. Today, the Bootstrap documentation says to use jQuery 3.5.1\. That tells us the Bootstrap team has already tested against jQuery 3.5.1, and we are therefore safe to go ahead with updating the dependency.

In this section, we have learned about the security vulnerability report we can get from the npm command-line tool. Unfortunately for Yarn users, it appears that Yarn doesn't support this command. In any case, this is a valuable resource for being warned about known security issues.

In the next section, we'll learn about the best practices for cookie management in Express applications.

# Using good cookie practices

Some nutritionists say eating too many sweets, such as cookies, is bad for your health. Web cookies, however, are widely used for many purposes including recording whether a browser is logged in or not. One common use is for cookies to store session data to aid in knowing whether someone is logged in or not.

In the Notes application, we're already following the good practices described in the Express security guidelines:

*   We're using an Express session cookie name different from the default shown in the documentation.
*   The Express session cookie secret is not the default shown in the documentation.
*   We use the `express-session` middleware, which only stores a session ID in the cookie, rather than the whole session data object.

Taken together, an attacker can't exploit any known vulnerability that relies on the default values for these items. While it is convenient that many software products have default values, such as passwords, those defaults could be security vulnerabilities. For example, the default Raspberry Pi login/password is *pi* and *raspberry*. While that's cute, any Raspbian-based IoT device that's left with the default login/password is susceptible to attack.

But there is more customization we can do to the cookie used with `express-session`. That package has a few options available for improving security. See [`www.npmjs.com/package/express-session`](https://www.npmjs.com/package/express-session), and then consider this change to the configuration:
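A sketch of the adjusted session setup; the surrounding option values are whatever the application already uses, and the two-hour lifetime is only an example:

```javascript
app.use(session({
    store: sessionStore,
    secret: sessionSecret,
    resave: true,
    saveUninitialized: true,
    name: sessionCookieName,
    cookie: {
        secure: true,                  // send the cookie only over HTTPS
        maxAge: 2 * 60 * 60 * 1000     // two hours, in milliseconds
    }
}));
```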

这些是看起来有用的额外属性。secure属性要求 Cookie 只能通过 HTTPS 连接发送。这确保了 Cookie 数据通过 HTTPS 加密进行加密。maxAge属性设置了 Cookie 有效的时间,以毫秒表示。

Cookie 在 Web 浏览器中是一个非常有用的工具,即使有很多对网站如何使用 Cookie 的过度炒作的担忧。与此同时,滥用 Cookie 并造成安全问题是可能的。在这一部分,我们学习了如何通过会话 Cookie 来减轻风险。

在下一节中,我们将回顾 AWS ECS 部署的最佳实践。

加固 AWS EC2 部署

还有一个问题留在了第十二章中,使用 Terraform 在 AWS EC2 上部署 Docker Swarm,即 EC2 实例的安全组配置。我们配置了具有宽松安全组的 EC2 实例,最好是严格定义它们。我们当时确实描述了这不是最佳实践,并承诺稍后解决这个问题。这就是我们要做的地方。

在 AWS 中,要记住安全组描述了一个防火墙,根据 IP 端口和 IP 地址允许或禁止流量。这个工具存在是为了减少不法分子获取我们系统非法访问的潜在攻击面。

对于ec2-public-sg安全组,编辑ec2-public.tf并将其更改为以下内容:


This declares many specific network ports used for specific protocols. Each rule names the protocol in the `description` attribute. The `protocol` attribute says whether it is a UDP or TCP protocol. Remember that TCP is a stream-oriented protocol that ensures packets are delivered, and UDP, by contrast, is a packet-oriented protocol that does not ensure delivery. Each has characteristics making them suitable for different purposes.

Something missing is an `ingress` rule for port `3306`, the MySQL port. That's because the `notes-public` server will not host a MySQL server based on the placement constraints.

Another thing to note is which rules allow traffic from public IP addresses, and which limit traffic to IP addresses inside the VPC. Many of these ports are used in support of the Docker swarm, and therefore do not need to communicate anywhere but other hosts on the VPC.

An issue to ponder is whether the SSH port should be left open to the entire internet. If you, or your team, only SSH into the VPC from a specific network, such as an office network, then this setting could list that network. And because the `cidr_blocks` attribute takes an array, it's possible to configure a list of networks, such as a company with several offices each with their own office network.

In `ec2-private.tf`, we must make a similar change to `ec2-private-sg`:

这基本上是相同的,但有一些具体的区别。首先,因为私有 EC2 实例可以有 MySQL 数据库,我们声明了端口3306的规则。其次,除了一个规则外,所有规则都限制流量到 VPC 内的 IP 地址。

在这两个安全组定义之间,我们严格限制了 EC2 实例的攻击面。这将在任何不法分子试图侵入 Notes 服务时设置一定的障碍。

虽然我们已经为 Notes 服务实施了几项安全最佳实践,但总是还有更多可以做的。在下一节中,我们将讨论如何获取更多信息。

AWS EC2 安全最佳实践

在设计 Notes 应用程序堆栈部署的开始,我们描述了一个应该导致高度安全部署的安全模型。我们是那种可以在餐巾纸背面设计安全部署基础设施的安全专家吗?可能不是。但 AWS 团队确实雇佣了具有安全专业知识的工程师。当我们转向 AWS EC2 进行部署时,我们了解到它提供了一系列我们在原始计划中没有考虑到的安全工具,最终我们得到了一个不同的部署模型。

在这一部分,让我们回顾一下我们做了什么,还要回顾一些 AWS 上可用的其他工具。

AWS 虚拟私有云 (VPC) 包含许多实现安全功能的方法,我们使用了其中的一些:

  • 安全组充当一个严格控制进出受安全组保护的事物流量的防火墙。安全组附加到我们使用的每个基础设施元素上,在大多数情况下,我们配置它们只允许绝对必要的流量。

  • 我们确保数据库实例是在 VPC 内创建的,而不是托管在公共互联网上。这样可以将数据库隐藏起来,避免公共访问。

虽然我们没有实施最初设想的分割,但围绕 Notes 的屏障足够多,应该相对安全。

在审查 AWS VPC 安全文档时,还有一些其他值得探索的设施。

AWS 虚拟私有云中的安全性:docs.aws.amazon.com/vpc/latest/userguide/security.html

在本节中,您有机会审查部署到 AWS ECS 的应用程序的安全性。虽然我们做得相当不错,但还有更多可以利用 AWS 提供的工具来加强应用程序的内部安全性。

有了这些,现在是时候结束本章了。

总结

在本章中,我们涵盖了一个非常重要的主题,应用程序安全。由于 Node.js 和 Express 社区的辛勤工作,我们能够通过在各处添加一些代码来配置安全模块,从而加强安全性。

我们首先启用了 HTTPS,因为这是现在的最佳实践,并且对我们的用户有积极的安全收益。通过 HTTPS,浏览器会对网站进行身份验证。它还可以防止中间人安全攻击,并加密用于在互联网上传输的通信,防止大部分窥探。

helmet包提供了一套工具,用于设置安全头,指示 Web 浏览器如何处理我们的内容。这些设置可以防止或减轻整类安全漏洞。通过csurf包,我们能够防止跨站点请求伪造(CSRF)攻击。

这些几个步骤是确保 Notes 应用程序安全的良好开端。但是你不应该就此止步,因为有一系列永无止境的安全问题需要解决。我们任何人都不能忽视我们部署的应用程序的安全性。

在本书的过程中,旅程是关于学习开发和部署 Node.js 网络应用程序所需的主要生命周期步骤。这始于使用 Node.js 的基础知识,然后是应用程序概念的开发,然后我们涵盖了开发、测试和部署应用程序的每个阶段。

在整本书中,我们学习了高级 JavaScript 功能,如异步函数和 ES6 模块在 Node.js 应用程序中的使用。为了存储我们的数据,我们学习了如何使用几种数据库引擎,以及一种使在不同引擎之间轻松切换的方法。

在当今的环境中,移动优先开发非常重要,为了实现这一目标,我们学习了如何使用 Bootstrap 框架。

实时通信在各种网站上都是期望的,因为先进的 JavaScript 功能意味着我们现在可以在网络应用程序中提供更多的互动服务。为了实现这一目标,我们学习了如何使用 Socket.IO 实时通信框架。

将应用程序服务部署到云主机是被广泛使用的,既可以简化系统设置,也可以扩展服务以满足用户需求。为了实现这一目标,我们学会了使用 Docker,然后学会了如何使用 Terraform 将 Docker 服务部署到 AWS ECS。我们不仅在生产部署中使用 Docker,还用它来部署测试基础设施,其中我们可以运行单元测试和功能测试。
