# Building a Kubernetes Cluster
Based on the [easzlab/kubeasz](https://github.com/easzlab/kubeasz) project, which installs a K8s cluster using Ansible playbooks, explains how the components interact, is simple and direct to use, and is not affected by mainland China network restrictions.
## Architecture Planning
### Hardware and Software Requirements
- CPU and memory: master at least 1C/2GB, 2C/4GB recommended; node at least 1C/2GB
- Linux with a kernel of at least 3.10; CentOS 7 / RHEL 7 recommended
- Docker: at least version 1.9; 1.12+ recommended
- etcd: at least version 2.0; 3.0+ recommended
### Node Planning
Role | Count | Description |
---|---|---|
deploy node | 1 | Runs the ansible/ezctl commands; a dedicated node is recommended |
etcd node | 3 | The etcd cluster needs an odd number of members (1, 3, 5, ...); usually co-located with the master nodes |
master node | 2 | An HA cluster needs at least 2 master nodes, plus an additional master VIP (virtual IP) |
node (worker) | 3 | Runs the application workloads; raise machine specs or add nodes as needed |
### Machine Planning
IP | Hostname | Role |
---|---|---|
192.168.93.128 | master1 | deploy, master1, lb1, etcd |
192.168.93.129 | master2 | master2, lb2, etcd |
192.168.93.130 | node1 | etcd, node |
192.168.93.131 | node2 | node |
192.168.93.135 | | VIP |
## Cluster Deployment
### Cluster Preparation
Run the following commands on all four servers:

```bash
yum install -y epel-release
yum update
yum install -y python
```
### Deploy Node
#### Installing Ansible
Install and prepare Ansible on the deploy node by running:

```bash
yum install -y python-pip git
pip install --upgrade pip -i http://mirrors.aliyun.com/pypi/simple/ --trusted-host mirrors.aliyun.com
# If the upgrade fails because the pip version jump is too large, first upgrade to an intermediate version:
python -m pip install --upgrade pip==20.2.4
pip install --no-cache-dir ansible -i http://mirrors.aliyun.com/pypi/simple/ --trusted-host mirrors.aliyun.com
# or, without the mirror:
pip install ansible
```
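To confirm the toolchain is ready before moving on, a quick check:

```bash
ansible --version        # prints the installed ansible version
python -m pip --version  # confirms which pip/python the install used
```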
#### Configuring the Deploy Node
Passwordless login configuration:

- Generate a key pair on the deploy node with `ssh-keygen`
- Copy the public key to the other machines:

  ```bash
  for ip in 129 130 131; do ssh-copy-id 192.168.93.$ip; done
  ```
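A quick check that key-based login works; each host should print its hostname without asking for a password:

```bash
for ip in 129 130 131; do ssh 192.168.93.$ip hostname; done
```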
#### Orchestrating the K8s Installation from the Deploy Node
```bash
# Download the ezdown tool script; this example uses kubeasz release 3.1.1
export release=3.1.1
wget https://github.com/easzlab/kubeasz/releases/download/${release}/ezdown
chmod +x ./ezdown
# Use the tool script to download everything
./ezdown -D
```
After the script above runs successfully, all of the files (the kubeasz code, binaries, and offline images) are in the /etc/kubeasz directory.
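As a quick sanity check (ezdown installs docker on the deploy node as part of the download step), the kubeasz files and offline images should now be visible:

```bash
ls /etc/kubeasz   # kubeasz code, binaries, and cluster templates
docker images     # offline images pulled by ezdown
```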
Create a cluster configuration instance by running `ezctl new k8s-01` from /etc/kubeasz:
```
[root@master1 kubeasz]# ./ezctl new k8s-01
2021-11-17 08:39:22 DEBUG generate custom cluster files in /etc/kubeasz/clusters/k8s-01
2021-11-17 08:39:22 DEBUG set version of common plugins
2021-11-17 08:39:22 DEBUG cluster k8s-01: files successfully created.
2021-11-17 08:39:22 INFO next steps 1: to config '/etc/kubeasz/clusters/k8s-01/hosts'
2021-11-17 08:39:22 INFO next steps 2: to config '/etc/kubeasz/clusters/k8s-01/config.yml'
```
As the output suggests, next configure `/etc/kubeasz/clusters/k8s-01/hosts` and `/etc/kubeasz/clusters/k8s-01/config.yml`: edit the hosts file according to the node plan above, along with the other main cluster-level options; settings for the other cluster components can be changed in the config.yml file.
### Cluster Parameter Configuration
Contents of the `/etc/kubeasz/clusters/k8s-01/hosts` file (the external apiserver VIP is the 192.168.93.135 address from the machine plan):

```ini
# 'etcd' cluster should have odd member(s) (1,3,5,...)
[etcd]
192.168.93.128
192.168.93.129
192.168.93.130

# master node(s)
[kube_master]
192.168.93.128
192.168.93.129

# work node(s)
[kube_node]
192.168.93.130
192.168.93.131

# [optional] harbor server, a private docker registry
# 'NEW_INSTALL': 'true' to install a harbor server; 'false' to integrate with existed one
[harbor]
#192.168.1.8 NEW_INSTALL=false

# [optional] loadbalance for accessing k8s from outside
[ex_lb]
192.168.93.128 LB_ROLE=backup EX_APISERVER_VIP=192.168.93.135 EX_APISERVER_PORT=8443
192.168.93.129 LB_ROLE=master EX_APISERVER_VIP=192.168.93.135 EX_APISERVER_PORT=8443
```
Contents of the `/etc/kubeasz/clusters/k8s-01/config.yml` file; the master and DNS IPs need to match what is in the hosts file:

```yaml
MASTER_CERT_HOSTS:
  - "192.168.93.128"
  - "192.168.93.129"
  - "10.1.1.1"
  - "k8s.test.io"
  #- "www.test.com"

# automatic coredns installation
dns_install: "yes"
corednsVer: "1.8.4"
ENABLE_LOCAL_DNS_CACHE: true
dnsNodeCacheVer: "1.17.0"
# address of the local dns cache
LOCAL_DNS_CACHE: "10.68.0.2"
```
### Installation
#### Installation Overview
One-shot installation: `ezctl setup k8s-01 all`

Step-by-step installation: run the stages one at a time with `ezctl setup k8s-01 01`, `ezctl setup k8s-01 02`, and so on; `ezctl help setup` shows the help for step-by-step installation.
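For reference, here are the seven stages annotated with what each one does; the sections below walk through them in order:

```bash
ezctl setup k8s-01 01   # create certificates and prepare the environment
ezctl setup k8s-01 02   # install the etcd cluster
ezctl setup k8s-01 03   # install the container runtime (docker)
ezctl setup k8s-01 04   # install the master components
ezctl setup k8s-01 05   # install the node components
ezctl setup k8s-01 06   # deploy the cluster network
ezctl setup k8s-01 07   # install the main cluster add-ons (DNS, dashboard, metrics)
```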
#### Step-by-Step Installation
- **Create certificates and prepare the environment**

  Run `ezctl setup k8s-01 01` to start the environment preparation work.
- **Install the etcd cluster**

  Run `ezctl setup k8s-01 02` to install the etcd cluster. After it completes, verify the etcd cluster status:

  ```bash
  # $NODE_IPS holds the etcd node IP addresses
  export NODE_IPS="192.168.93.128 192.168.93.130 192.168.93.131"
  for ip in ${NODE_IPS}; do
    ETCDCTL_API=3 etcdctl \
      --endpoints=https://${ip}:2379 \
      --cacert=/etc/kubernetes/ssl/ca.pem \
      --cert=/etc/kubernetes/ssl/etcd.pem \
      --key=/etc/kubernetes/ssl/etcd-key.pem \
      endpoint health
  done
  ```

  Output like the following means the etcd nodes are working properly:

  ```
  https://192.168.93.128:2379 is healthy: successfully committed proposal: took = 17.886794ms
  https://192.168.93.130:2379 is healthy: successfully committed proposal: took = 23.42423ms
  https://192.168.93.131:2379 is healthy: successfully committed proposal: took = 18.927204ms
  ```
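  You can also list the cluster membership from any etcd node; a sketch reusing the same TLS flags as above:

  ```bash
  # each of the three etcd members should show up as 'started'
  ETCDCTL_API=3 etcdctl \
    --endpoints=https://192.168.93.128:2379 \
    --cacert=/etc/kubernetes/ssl/ca.pem \
    --cert=/etc/kubernetes/ssl/etcd.pem \
    --key=/etc/kubernetes/ssl/etcd-key.pem \
    member list
  ```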
- **Install the container runtime (docker)**

  Run `ezctl setup k8s-01 03` to install the docker service on every machine in the cluster.
- **Install the master nodes**

  A master node runs three main components: `apiserver`, `scheduler`, and `controller-manager`.

  `apiserver` exposes the REST API for cluster management, covering authentication/authorization, data validation, and cluster state changes:
  - Only the API Server operates on etcd directly
  - All other modules query or modify data through the API Server
  - It is the hub for data exchange and communication between the other modules

  `scheduler` assigns and schedules Pods onto the cluster's nodes:
  - It watches kube-apiserver for Pods that have not yet been assigned a Node
  - It assigns nodes to those Pods according to the scheduling policy

  `controller-manager` consists of a series of controllers; it monitors the state of the whole cluster through the apiserver and keeps the cluster in the desired working state.
  Run `ezctl setup k8s-01 04` to install the master nodes. When it finishes, first check the service status on the masters:

  ```bash
  # check process status
  systemctl status kube-apiserver
  systemctl status kube-controller-manager
  systemctl status kube-scheduler
  # check process logs
  journalctl -u kube-apiserver
  journalctl -u kube-controller-manager
  journalctl -u kube-scheduler
  ```
  Use `kubectl get componentstatus` (or the short form `kubectl get cs`) to check the master status:

  ```
  NAME                 STATUS      MESSAGE                                                                                       ERROR
  controller-manager   Unhealthy   Get "https://127.0.0.1:10257/healthz": dial tcp 127.0.0.1:10257: connect: connection refused
  scheduler            Healthy     ok
  etcd-1               Healthy     {"health":"true","reason":""}
  etcd-2               Healthy     {"health":"true","reason":""}
  etcd-0               Healthy     {"health":"true","reason":""}
  ```
  If controller-manager shows Unhealthy, edit the `/etc/systemd/system/kube-controller-manager.service` file and change `bind-address` to 127.0.0.1:

  ```ini
  [Service]
  ExecStart=/opt/kube/bin/kube-controller-manager \
    --bind-address=127.0.0.1 \
  ```
  Then run `systemctl daemon-reload` and `systemctl restart kube-controller-manager` to restart the service, and check the master status again:

  ```
  [root@master1 ~]# kubectl get cs
  Warning: v1 ComponentStatus is deprecated in v1.19+
  NAME                 STATUS    MESSAGE                         ERROR
  scheduler            Healthy   ok
  etcd-1               Healthy   {"health":"true","reason":""}
  controller-manager   Healthy   ok
  etcd-2               Healthy   {"health":"true","reason":""}
  etcd-0               Healthy   {"health":"true","reason":""}
  ```
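  To script the same fix instead of hand-editing, a sketch along these lines works, assuming the unit file originally passes `--bind-address=0.0.0.0` (check your file first):

  ```bash
  # rebind kube-controller-manager to loopback, then restart it
  sed -i 's/--bind-address=0.0.0.0/--bind-address=127.0.0.1/' \
    /etc/systemd/system/kube-controller-manager.service
  systemctl daemon-reload
  systemctl restart kube-controller-manager
  ```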
- **Install the node components**

  `kube_node` nodes run the cluster's workloads. The `kube_master` nodes must be deployed first as a prerequisite. Each node needs the following components:
  - kubelet: the most important component on a kube_node
  - kube-proxy: publishes application services and load-balances traffic to them
  - haproxy: forwards requests to the multiple apiservers; see the HA-2x architecture
  - calico: configures the container network (or another network plugin)
  Run `ezctl setup k8s-01 05` to install the node components. This step also installs the node components on the master nodes, but leaves scheduling disabled on them. After installation, check the node component services:

  ```bash
  systemctl status kubelet      # check status
  systemctl status kube-proxy
  journalctl -u kubelet         # check logs
  journalctl -u kube-proxy
  ```

  ```
  [root@master1 ~]# kubectl get node
  NAME             STATUS                     ROLES    AGE   VERSION
  192.168.93.128   Ready,SchedulingDisabled   master   17m   v1.22.2
  192.168.93.129   Ready,SchedulingDisabled   master   17m   v1.22.2
  192.168.93.130   Ready                      node     16m   v1.22.2
  192.168.93.131   Ready                      node     16m   v1.22.2
  ```
- **Deploy the cluster network**

  First, recall the K8s network design principles; keep them in mind whenever you configure a network plugin or deploy K8s applications/services:
  - Every Pod has its own IP address, and all containers in a Pod share one network namespace
  - All Pods in the cluster live in a single, directly connected, flat network and can reach each other by IP
  - All containers can reach each other without NAT
  - All nodes and all containers can reach each other without NAT
  - The IP a container sees for itself is the same IP that others see for it
  - Service cluster IPs are reachable only inside the cluster; external requests must go through NodePort, LoadBalancer, or Ingress
  A Kubernetes Pod's network is created as follows:
  1. Besides the containers specified at creation time, every Pod gets an infra ("pause") container specified when kubelet starts, for example `easzlab/pause-amd64` or `registry.access.redhat.com/rhel7/pod-infrastructure`
  2. kubelet first creates this infra container, which establishes the Pod's network namespace
  3. kubelet then calls the network CNI driver, which invokes the concrete CNI plugin according to its configuration
  4. The CNI plugin configures the network for the infra container
  5. Finally, the other containers in the Pod share the infra container's network (see the quick look below)
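  On any node you can see these infra containers running next to the application containers; a quick look, assuming the docker runtime installed earlier:

  ```bash
  # every running Pod on this node contributes one pause (infra) container
  docker ps --format '{{.Image}}\t{{.Names}}' | grep pause
  ```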
  Run `ezctl setup k8s-01 06` to install the network plugin. After it completes, check the network pods:

  ```
  [root@master1 ~]# kubectl get pod -n kube-system
  NAME                          READY   STATUS    RESTARTS      AGE
  kube-flannel-ds-amd64-fvchk   1/1     Running   1 (12m ago)   18m
  kube-flannel-ds-amd64-hm5kd   1/1     Running   1 (11m ago)   18m
  kube-flannel-ds-amd64-k65mj   1/1     Running   1 (13m ago)   18m
  kube-flannel-ds-amd64-twtql   1/1     Running   1 (11m ago)   18m
  ```
- **Install the main cluster add-ons**

  This step installs add-ons such as DNS and the dashboard. Run `ezctl setup k8s-01 07` to install them, then check the services in the `kube-system` namespace:

  ```
  [root@master1 ~]# kubectl get svc -n kube-system
  NAME                        TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                  AGE
  dashboard-metrics-scraper   ClusterIP   10.68.209.199   <none>        8000/TCP                 28m
  kube-dns                    ClusterIP   10.68.0.2       <none>        53/UDP,53/TCP,9153/TCP   28m
  kube-dns-upstream           ClusterIP   10.68.31.79     <none>        53/UDP,53/TCP            28m
  kubernetes-dashboard        NodePort    10.68.24.112    <none>        443:32707/TCP            28m
  metrics-server              ClusterIP   10.68.129.80    <none>        443/TCP                  28m
  node-local-dns              ClusterIP   None            <none>        9253/TCP                 28m
  ```
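  Since kubernetes-dashboard is exposed as a NodePort service, it is reachable on that port of any node; a quick check (32707 is the NodePort from the listing above and will differ per cluster):

  ```bash
  # -k skips TLS verification; the dashboard serves a self-signed certificate
  curl -k https://192.168.93.130:32707/
  ```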
## Testing
First look at the cluster info, using the `kubectl cluster-info` command:
```
[root@master1 ~]# kubectl cluster-info
Kubernetes control plane is running at https://127.0.0.1:6443
CoreDNS is running at https://127.0.0.1:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
KubeDNSUpstream is running at https://127.0.0.1:6443/api/v1/namespaces/kube-system/services/kube-dns-upstream:dns/proxy
kubernetes-dashboard is running at https://127.0.0.1:6443/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
```
Check node/pod resource usage with the following commands:

```bash
kubectl top node
kubectl top pod --all-namespaces
```
```
[root@master1 ~]# kubectl top node
NAME             CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
192.168.93.128   130m         6%     839Mi           35%
192.168.93.129   93m          4%     766Mi           32%
192.168.93.130   67m          3%     473Mi           20%
192.168.93.131   37m          1%     390Mi           16%

[root@master1 ~]# kubectl top pod --all-namespaces
NAMESPACE     NAME                                         CPU(cores)   MEMORY(bytes)
kube-system   coredns-67cb59d684-8lflh                     2m           14Mi
kube-system   dashboard-metrics-scraper-856586f554-g6jwq   1m           10Mi
kube-system   kube-flannel-ds-amd64-fvchk                  2m           19Mi
kube-system   kube-flannel-ds-amd64-hm5kd                  2m           21Mi
kube-system   kube-flannel-ds-amd64-k65mj                  2m           23Mi
kube-system   kube-flannel-ds-amd64-twtql                  2m           12Mi
kube-system   kubernetes-dashboard-65b659dd64-qfmz8        1m           14Mi
kube-system   metrics-server-7d567f6489-s9zqs              2m           24Mi
kube-system   node-local-dns-8mmqk                         2m           9Mi
kube-system   node-local-dns-qsg7c                         2m           11Mi
```
DNS test:
- First create an nginx server; `kubectl run nginx --image=nginx --expose --port=80` quickly creates both the nginx service and pod:

  ```
  [root@master1 ~]# kubectl run nginx --image=nginx --expose --port=80
  service/nginx created
  pod/nginx created
  [root@master1 ~]# kubectl get pod
  NAME    READY   STATUS    RESTARTS   AGE
  nginx   1/1     Running   0          12s
  [root@master1 ~]# kubectl get svc
  NAME         TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
  kubernetes   ClusterIP   10.68.0.1      <none>        443/TCP   43m
  nginx        ClusterIP   10.68.177.44   <none>        80/TCP    62s
  ```
- Then run a busybox pod, enter it, and run nslookup to check DNS:

  ```
  [root@master1 ~]# kubectl run busybox --rm -it --image=busybox /bin/sh
  If you don't see a command prompt, try pressing enter.
  / # nslookup nginx.default.svc.cluster.local
  Server:         10.68.0.2
  Address:        10.68.0.2:53

  Name:   nginx.default.svc.cluster.local
  Address: 10.68.252.155

  *** Can't find nginx.default.svc.cluster.local: No answer
  ```

  (The trailing "No answer" typically comes from busybox's nslookup also querying for a missing AAAA/IPv6 record and can be ignored.)
The 10.68.0.2 above is the DNS server address, listening on port 53. Take a look at the cluster IPs of the services across all namespaces:

```
[root@master1 ~]# kubectl get svc --all-namespaces
NAMESPACE     NAME                        TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                  AGE
default       kubernetes                  ClusterIP   10.68.0.1       <none>        443/TCP                  56m
default       nginx                       ClusterIP   10.68.252.155   <none>        80/TCP                   6m6s
kube-system   dashboard-metrics-scraper   ClusterIP   10.68.209.199   <none>        8000/TCP                 50m
kube-system   kube-dns                    ClusterIP   10.68.0.2       <none>        53/UDP,53/TCP,9153/TCP   50m
kube-system   kube-dns-upstream           ClusterIP   10.68.31.79     <none>        53/UDP,53/TCP            50m
kube-system   kubernetes-dashboard        NodePort    10.68.24.112    <none>        443:32707/TCP            50m
kube-system   metrics-server              ClusterIP   10.68.129.80    <none>        443/TCP                  50m
kube-system   node-local-dns              ClusterIP   None            <none>        9253/TCP                 50m
```
Inside busybox you can ping service NAMEs directly and DNS resolves them to the cluster IP, but a bare name for a service in the kube-system namespace does not resolve, because short names are only searched in the pod's own namespace (default here):

```
[root@master1 ~]# kubectl run busybox --rm -it --image=busybox /bin/sh
If you don't see a command prompt, try pressing enter.
/ # ping nginx
PING nginx (10.68.252.155): 56 data bytes
64 bytes from 10.68.252.155: seq=0 ttl=64 time=0.069 ms
64 bytes from 10.68.252.155: seq=1 ttl=64 time=0.082 ms
^C
--- nginx ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.069/0.075/0.082 ms
/ # ping kubernetes
PING kubernetes (10.68.0.1): 56 data bytes
64 bytes from 10.68.0.1: seq=0 ttl=64 time=0.042 ms
64 bytes from 10.68.0.1: seq=1 ttl=64 time=0.096 ms
^C
--- kubernetes ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.042/0.069/0.096 ms
/ # ping kubernetes-dashboard
ping: bad address 'kubernetes-dashboard'
```
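A service in another namespace does resolve once the name is qualified with its namespace; a quick check to run inside the same busybox shell:

```bash
# inside busybox: qualify the service name with its namespace
nslookup kubernetes-dashboard.kube-system
# or use the fully qualified service domain
nslookup kubernetes-dashboard.kube-system.svc.cluster.local
```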
## Other Features
- To add a new node, just run `ezctl add-node <cluster> <node-ip>`; in the example above our cluster is `k8s-01` (see the sketch after this list).
- Backups and restores can likewise be performed with the ezctl command; run `ezctl -h` to see the specific usage.
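For instance, adding a hypothetical new worker at 192.168.93.132 to this cluster would look like:

```bash
# the IP below is illustrative; substitute your actual new node's address
ezctl add-node k8s-01 192.168.93.132
```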