Lework Study hard, improve every day.

快速保存 Kubernetes 集群容器运行环境

2020-11-20
lework
本文 2949 字,阅读全文约需 9 分钟

背景

K8s 集群或者集群内的组件(容器)发生故障时,一个是要快速恢复环境,还有一个是要获取一份容器运行环境的完整快照, 便于事后分析、定位故障原因。在收集容器运行环境信息后,就可以迅速恢复现场,提高无故障时间,这对有服务等级要求的系统来说,尤为重要。

因为实际生产需要,作者编写了shell脚本文件,用于收集 k8s 集群中 app 的运行环境信息。其结果以markdown 格式存储在k8s-app-info_$(date +%s).md 文件中

收集的容器运行环境信息如下列表:

  • 资源事件和资源描述文件
    • ingress
    • service
    • endpoints
    • deployment
    • replicaset
    • daemonset
    • cronjob
    • job
    • pod
    • configmaps
    • secrets
    • pvc
    • pv
  • 容器日志
  • 集群事件
  • 集群状态信息

脚本使用

脚本运行在 Linux/Unix 操作系统,使用 BASH 解释器。请确保 kubectl 客户端正常连接到 k8s 集群并具有资源获取的相应权限

下载

wget https://cdn.jsdelivr.net/gh/lework/script@master/shell/k8s/k8s-app-info.sh
chmod +x k8s-app-info.sh

脚本参数

# ./k8s-app-info.sh 

Get k8s app info.

Usage:
  k8s-app-info.sh [flag]
  
Flag:
  -ns,--namespace  namespace
  -n,--name        name
  -l,--selector    selector
  • -ns,–namespace 指定命名空间
  • -n,–name 指定资源名称
  • -l,–selector 指定标签选择

获取 app 信息

使用资源名称获取信息

# kubectl  get deployment
NAME               READY   UP-TO-DATE   AVAILABLE   AGE
ingress-demo-app   2/2     2            2           26h

# ./k8s-app-info.sh -ns default -n ingress-demo-app
Get Selector  ............................  OK
Get Ingress  .............................  1
Get Service  .............................  1
Get Endpoints  ...........................  1
Get Deployment  ..........................  1
Get Replicaset  ..........................  1
Get Daemonset  ...........................  0
Get Cronjob  .............................  0
Get Job  .................................  0
Get Pod  .................................  2
Get Configmaps  ..........................  0
Get Secrets  .............................  0
Get Pvc  .................................  0
Get Pv  ..................................  0
Get Pod log  .............................  2
Get Pod: ingress-demo-app-78ccc7c466-9kfdb
Get Pod: ingress-demo-app-78ccc7c466-tmq6v
Get k8s Event  ...........................  OK
Get Cluster  .............................  OK

File: k8s-app-info_1606892512.md

使用标签选择器获取信息

# kubectl get all  -n kube-system -l k8s-app=calico-node                  
NAME                    READY   STATUS    RESTARTS   AGE
pod/calico-node-8xclc   1/1     Running   0          20h
pod/calico-node-j4dkc   1/1     Running   0          20h
pod/calico-node-mjprr   1/1     Running   0          20h
pod/calico-node-ptqq7   1/1     Running   0          20h
pod/calico-node-z2kzd   1/1     Running   0          20h

NAME                         DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE
daemonset.apps/calico-node   5         5         5       5            5           kubernetes.io/os=linux   26h

# ./k8s-app-info.sh -ns kube-system -l k8s-app=calico-node
Get Selector  ............................  OK
Get Ingress  .............................  0
Get Service  .............................  0
Get Endpoints  ...........................  0
Get Deployment  ..........................  0
Get Replicaset  ..........................  0
Get Daemonset  ...........................  1
Get Cronjob  .............................  0
Get Job  .................................  0
Get Pod  .................................  5
Get Configmaps  ..........................  0
Get Secrets  .............................  0
Get Pvc  .................................  0
Get Pv  ..................................  0
Get Pod log  .............................  5
Get Pod: calico-node-8xclc
Get Pod: calico-node-j4dkc
Get Pod: calico-node-mjprr
Get Pod: calico-node-ptqq7
Get Pod: calico-node-z2kzd
Get k8s Event  ...........................  OK
Get Cluster  .............................  OK

File: k8s-app-info_1606892642.md

通过 Typora 工具就可以清晰的看到 app 运行环境的信息。

原文地址 https://lework.github.io/2020/11/20/k8s-app-info/

Comments

Content