背景
K8s 集群或者集群内的组件(容器)发生故障时,一个是要快速恢复环境,还有一个是要获取一份容器运行环境的完整快照, 便于事后分析、定位故障原因。在收集容器运行环境信息后,就可以迅速恢复现场,提高无故障时间,这对有服务等级要求的系统来说,尤为重要。
因为实际生产需要,作者编写了shell脚本文件,用于收集 k8s 集群中 app 的运行环境信息。其结果以markdown
格式存储在k8s-app-info_$(date +%s).md
文件中
收集的容器运行环境信息如下列表:
- 资源事件和资源描述文件
- ingress
- service
- endpoints
- deployment
- replicaset
- daemonset
- cronjob
- job
- pod
- configmaps
- secrets
- pvc
- pv
- 容器日志
- 集群事件
- 集群状态信息
脚本使用
脚本运行在 Linux/Unix
操作系统,使用 BASH
解释器。请确保 kubectl
客户端正常连接到 k8s 集群并具有资源获取的相应权限
下载
wget https://cdn.jsdelivr.net/gh/lework/script@master/shell/k8s/k8s-app-info.sh
chmod +x k8s-app-info.sh
脚本参数
# ./k8s-app-info.sh
Get k8s app info.
Usage:
k8s-app-info.sh [flag]
Flag:
-ns,--namespace namespace
-n,--name name
-l,--selector selector
- -ns,–namespace 指定命名空间
- -n,–name 指定资源名称
- -l,–selector 指定标签选择
获取 app 信息
使用资源名称获取信息
# kubectl get deployment
NAME READY UP-TO-DATE AVAILABLE AGE
ingress-demo-app 2/2 2 2 26h
# ./k8s-app-info.sh -ns default -n ingress-demo-app
Get Selector ............................ OK
Get Ingress ............................. 1
Get Service ............................. 1
Get Endpoints ........................... 1
Get Deployment .......................... 1
Get Replicaset .......................... 1
Get Daemonset ........................... 0
Get Cronjob ............................. 0
Get Job ................................. 0
Get Pod ................................. 2
Get Configmaps .......................... 0
Get Secrets ............................. 0
Get Pvc ................................. 0
Get Pv .................................. 0
Get Pod log ............................. 2
Get Pod: ingress-demo-app-78ccc7c466-9kfdb
Get Pod: ingress-demo-app-78ccc7c466-tmq6v
Get k8s Event ........................... OK
Get Cluster ............................. OK
File: k8s-app-info_1606892512.md
使用标签选择器获取信息
# kubectl get all -n kube-system -l k8s-app=calico-node
NAME READY STATUS RESTARTS AGE
pod/calico-node-8xclc 1/1 Running 0 20h
pod/calico-node-j4dkc 1/1 Running 0 20h
pod/calico-node-mjprr 1/1 Running 0 20h
pod/calico-node-ptqq7 1/1 Running 0 20h
pod/calico-node-z2kzd 1/1 Running 0 20h
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/calico-node 5 5 5 5 5 kubernetes.io/os=linux 26h
# ./k8s-app-info.sh -ns kube-system -l k8s-app=calico-node
Get Selector ............................ OK
Get Ingress ............................. 0
Get Service ............................. 0
Get Endpoints ........................... 0
Get Deployment .......................... 0
Get Replicaset .......................... 0
Get Daemonset ........................... 1
Get Cronjob ............................. 0
Get Job ................................. 0
Get Pod ................................. 5
Get Configmaps .......................... 0
Get Secrets ............................. 0
Get Pvc ................................. 0
Get Pv .................................. 0
Get Pod log ............................. 5
Get Pod: calico-node-8xclc
Get Pod: calico-node-j4dkc
Get Pod: calico-node-mjprr
Get Pod: calico-node-ptqq7
Get Pod: calico-node-z2kzd
Get k8s Event ........................... OK
Get Cluster ............................. OK
File: k8s-app-info_1606892642.md
通过 Typora
工具就可以清晰的看到 app 运行环境的信息。