【Kubernetes 系列五】在 AWS 中使用 Kubernetes:EKS
1. 概述
Amazon Elastic Kubernetes Service (Amazon EKS) 是一项托管服务,可让您在 AWS 上轻松运行 Kubernetes,而无需支持或维护您自己的 Kubernetes 控制层面。
Amazon EKS 跨多个可用区运行 Kubernetes 控制层面实例以确保高可用性。Amazon EKS 可以自动检测和替换运行状况不佳的控制层面实例,并为它们提供自动版本升级和修补。
Amazon EKS 还与许多 AWS 服务集成以便为您的应用程序提供可扩展性和安全性,包括:
- 用于容器镜像的 Amazon ECR
- 用于负载分配的 Elastic Load Balancing
- 用于身份验证的 IAM
- 用于隔离的 Amazon VPC
2. 版本
K8S 版本 | K8S 发布时间 | EKS 平台版本 | EKS 发布日志 |
---|---|---|---|
1.13.7 | 2019.6.7 | eks.1 | Initial release of Kubernetes 1.13 for Amazon EKS. For more information, see Kubernetes 1.13. |
1.12.6 | 2019.2.27 | eks.2 | New platform version to support custom DNS names in the Kubelet certificate and improve etcd performance. This fixes a bug that caused worker node Kubelet daemons to request a new certificate every few seconds. |
1.12.6 | 2019.2.27 | eks.1 | Initial release of Kubernetes 1.12 for Amazon EKS. |
1.11.8 | 2019.3.1 | eks.3 | New platform version to support custom DNS names in the Kubelet certificate and improve etcd performance. |
1.11.8 | 2019.3.1 | eks.2 | New platform version updating Amazon EKS Kubernetes 1.11 clusters to patch level 1.11.8 to address CVE-2019-1002100. |
3. 预备
3.1. 操作环境
3.1.1 Python
- 版本要求:>= 2.7.9
- 用途:安装 aws cli
3.1.2 aws cli
- 版本要求:>= 1.16.156
- 用途:操作 aws 资源
- 安装过程:
pip install awscli --upgrade --user
3.1.3 eksctl
- 版本要求:>= 0.1.37
- 用途:操作 aws eks 资源
- 安装过程:
curl --silent --location "https://github.com/weaveworks/eksctl/releases/download/latest_release/eksctl_$(uname -s)_amd64.tar.gz" | tar xz -C /tmp
sudo mv /tmp/eksctl /usr/local/bin
eksctl version
3.1.4 kubectl
- 版本要求:最新版本或不低于 Kubernetes 版本 1 个小版本号。
- 用途:操作 Kubernetes 集群
- 安装过程:
curl -LO https://storage.googleapis.com/kubernetes-release/release/`curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt`/bin/linux/amd64/kubectl
chmod +x ./kubectl
sudo mv ./kubectl /usr/local/bin/kubectl
kubectl version
3.2. 角色权限
参考:
- Amazon EKS 基于身份的策略示例
- https://github.com/weaveworks/eksctl/issues/204#issuecomment-450280786(这位小哥说他亲自试了 30 多次才补全的,而我试了将近 40 次)
- https://docs.aws.amazon.com/autoscaling/ec2/userguide/control-access-using-iam.html
注意:要有适量网关、VPC 和 IP 数量空余,否则会报达到最大限制错误。
3.2.1. CloudFormation 完全权限
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"cloudformation:*"
],
"Resource": "*"
}
]
}
3.2.2. EKS 读写权限
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "VisualEditor0",
"Effect": "Allow",
"Action": [
"eks:ListClusters",
"eks:CreateCluster"
],
"Resource": "*"
},
{
"Sid": "VisualEditor1",
"Effect": "Allow",
"Action": [
"eks:UpdateClusterVersion",
"eks:ListUpdates",
"eks:DescribeUpdate",
"eks:DescribeCluster",
"eks:ListClusters",
"eks:CreateCluster"
],
"Resource": "arn:aws:eks:*:*:cluster/*"
}
]
}
3.2.3. EC2 相关权限
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "VisualEditor0",
"Effect": "Allow",
"Action": [
"ec2:CreateInternetGateway",
"ec2:CreateVpc",
"ec2:Describe*",
"ec2:createTags",
"ec2:ModifyVpcAttribute",
"ec2:CreateSubnet",
"ec2:CreateSubnet",
"ec2:CreateRouteTable",
"ec2:CreateSecurityGroup",
"ec2:DeleteSecurityGroup",
"ec2:AttachInternetGateway",
"ec2:CreateRoute",
"ec2:AuthorizeSecurityGroupIngress",
"ec2:AuthorizeSecurityGroupEgress",
"ec2:RevokeSecurityGroupEgress",
"ec2:RevokeSecurityGroupIngress",
"ec2:AssociateRouteTable",
"ec2:CreateNatGateway",
"ec2:AllocateAddress",
"ec2:DeleteInternetGateway",
"ec2:DeleteNatGateway",
"ec2:DeleteRoute",
"ec2:DeleteRouteTable",
"ec2:DeleteSubnet",
"ec2:DeleteTags",
"ec2:DeleteVpc",
"ec2:DescribeInternetGateways",
"ec2:DescribeNatGateways",
"ec2:DescribeRouteTables",
"ec2:DescribeSecurityGroups",
"ec2:DescribeSubnets",
"ec2:DescribeTags",
"ec2:DescribeVpcAttribute",
"ec2:DetachInternetGateway",
"ec2:DisassociateRouteTable",
"ec2:RunInstances",
"ec2:ReleaseAddress"
],
"Resource": "*"
}
]
}
3.2.4. CloudWatch 相关权限
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"cloudwatch:ListMetrics",
"cloudwatch:GetMetricStatistics",
"cloudwatch:Describe*"
],
"Resource": "*"
},
]
}
3.2.5. autoscaling 相关权限
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"autoscaling:CreateAutoScalingGroup",
"autoscaling:DeleteAutoScalingGroup",
"autoscaling:DeleteLaunchConfiguration",
"autoscaling:DescribeAutoScalingGroups",
"autoscaling:DescribeLaunchConfigurations",
"autoscaling:DescribeScalingActivities",
"autoscaling:UpdateAutoScalingGroup"
],
"Resource": "*"
}
]
}
3.2.6. elasticloadbalancing 相关权限
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": "elasticloadbalancing:Describe*",
"Resource": "*"
}
]
}
3.2.7. iam 相关权限
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "VisualEditor0",
"Effect": "Allow",
"Action": [
"iam:CreateRole",
"iam:AttachRolePolicy",
"iam:DetachRolePolicy",
"iam:GetRole",
"iam:PassRole",
"iam:CreateInstanceProfile",
"iam:AddRoleToInstanceProfile",
"iam:RemoveRoleFromInstanceProfile",
"iam:GetInstanceProfile",
"iam:PutRolePolicy",
"iam:DeleteRolePolicy",
"iam:GetRolePolicy",
"iam:ListInstanceProfiles",
"iam:CreateServiceLinkedRole",
"iam:ListInstanceProfilesForRole"
],
"Resource": "*"
}
]
}
3.2.8. LaunchTemplate 相关权限
{
"Sid": "VisualEditor2",
"Effect": "Allow",
"Action": [
"autoscaling:CreateLaunchConfiguration",
"ec2:DeleteLaunchTemplate",
"ec2:ModifyLaunchTemplate",
"ec2:DeleteLaunchTemplateVersions",
"ec2:CreateLaunchTemplateVersion"
],
"Resource": [
"arn:aws:autoscaling:*:*:launchConfiguration:*:launchConfigurationName/*",
"arn:aws:ec2:*:*:launch-template/*"
]
}
3.3. 安装 aws-iam-authenticator
参见:https://docs.aws.amazon.com/zh_cn/eks/latest/userguide/install-aws-iam-authenticator.html
curl -o aws-iam-authenticator https://amazon-eks.s3-us-west-2.amazonaws.com/1.13.7/2019-06-11/bin/linux/amd64/aws-iam-authenticator
chmod +x ./aws-iam-authenticator
mkdir -p $HOME/bin && cp ./aws-iam-authenticator $HOME/bin/aws-iam-authenticator && export PATH=$HOME/bin:$PATH
echo 'export PATH=$HOME/bin:$PATH' >> ~/.bashrc
// 获取 token?
aws-iam-authenticator token -i <cluster name>
// 查看调用者?
aws sts get-caller-identity
3.4. 创建 kubeconfig
参见:https://docs.aws.amazon.com/zh_cn/eks/latest/userguide/create-kubeconfig.html
使用以下命令自动生成 kubeconfig
// 生成 kubeconfig
aws eks --region <your region> update-kubeconfig --name <cluster name>
// 查看 kubeconfig
cat ~/.kube/config
4. 开始使用
4.1. 创建集群
使用以下命令开始创建集群,其原理是:通过 aws cli 调用 CloudFormation 的相关 API,启动一个创建 EKS Cluster 的 Stack 和一个创建 EKS nodes 的 Stack 去创建集群所需的各种资源(包括网关、IP、VPC、EC2 等等)。
eksctl create cluster \
--name prod \
--version 1.13 \
--nodegroup-name standard-workers \
--node-type t3.medium \
--nodes 3 \
--nodes-min 1 \
--nodes-max 4 \
--node-ami auto
注意:如果选择 P2 或 P3 实例类型和 Amazon EKS 优化的 AMI(具有 GPU 支持),则必须使用以下命令在集群上将适用于 Kubernetes 的 NVIDIA 设备插件用作守护程序集。
kubectl apply -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/1.0.0-beta/nvidia-device-plugin.yml
4.2. 查看集群状态
// 查看节点状态
kubectl get nodes
// 查看服务状态
kubectl get svc
// 查看事件
kubectl get events --all-namespaces
4.3. 部署 Dashboard
参见:
- https://aws.amazon.com/cn/premiumsupport/knowledge-center/eks-cluster-kubernetes-dashboard/
- https://docs.aws.amazon.com/zh_cn/eks/latest/userguide/dashboard-tutorial.html
- https://www.youtube.com/watch?v=JcZJqSa65Yc
// 将 Kubernetes 控制面板部署到集群
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v1.10.1/src/deploy/recommended/kubernetes-dashboard.yaml
// 部署 heapster 以在集群上启用容器集群监控和性能分析
kubectl apply -f https://raw.githubusercontent.com/kubernetes/heapster/master/deploy/kube-config/influxdb/heapster.yaml
// 将 heapster 的 influxdb 后端部署到集群
kubectl apply -f https://raw.githubusercontent.com/kubernetes/heapster/master/deploy/kube-config/influxdb/influxdb.yaml
// 为控制面板创建 heapster 集群角色绑定
kubectl apply -f https://raw.githubusercontent.com/kubernetes/heapster/master/deploy/kube-config/rbac/heapster-rbac.yaml
// 创建一个具有新集群管理权限的新服务账户
cat > eks-admin-service-account.yaml << EOF
apiVersion: v1
kind: ServiceAccount
metadata:
name: eks-admin
namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
name: eks-admin
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: cluster-admin
subjects:
- kind: ServiceAccount
name: eks-admin
namespace: kube-system
EOF
// 将此服务账户和集群角色绑定应用到您的集群
kubectl apply -f eks-admin-service-account.yaml
// 检索 eks-admin 服务账户的身份验证令牌。从输出中复制 <authentication_token> 值。您可以使用此令牌连接到控制面板
kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep eks-admin | awk '{print $1}')
// 将所有请求从您的 Amazon EC2 实例本地主机端口转发到 Kubernetes 控制面板端口
kubectl port-forward svc/kubernetes-dashboard -n kube-system 6443:443
// 从带 SSH 隧道的本地计算机访问端口
ssh -i EC2KeyPair.pem ec2-user@IP -L 6443:127.0.0.1:6443
访问 https://127.0.0.1:6443 输入 Token 即可访问 Dashboard。
4.4. 删除集群
eksctl delete cluster --region=<your region> --name=<cluster name>
4.5. 更多操作
参见: