Skip to content

etcd 常见问题

增强版 backup 方案

增强版backup方案

etcd 数据加密

  • https://kubernetesjo/docs/tasks/administer-cluster/encrypt-data/
apiVersion: API Server.config.k8s.io/v1 
kind: Encryptionconfiguration
resources:
    - resources:
        - secrets 
        providers:
        - identity: {}
        - aesgcm: 
            keys:
            - name: key1
              secret: c2VjcmV0IGlzIHNlY3VyZQ==
            - name: key2
              secret: dGhpcyBpcyBwYXNzd29yZA==
        - aescbc:
            keys:
            - name: key1 
              secret: c2VjcmV0IGlzIHNlY3VyZQ==
        - secretbox:
            keys:
            - name: key1
              secret: YWJjZGVmZ2hpamtsbW5vcHFyc3R1dnd4eXoxMjM0NTY= 
        - kms:
            name: myKmsPlugin 
            endpoint: unix:///tmp/socketfile.sock 
            cachesize: 100

Kubernetes 中数据分离

  • 对于大规模集群/大量的事件会对etcd造成压力
  • API server启动脚本中指定etcd servers集群
/usr/local/bin/kube-apiserver --etcd-servers=https://localhost:4001 --etcd- cafile=/etc/ssl/kubernetes/ca.crt--storage-backend=etcd3 --etcd-servers- overrides=/events#https://localhost:4002

查询 APIServer

返回某namespace中的所有Pod

GET /api/vl /namespaces/test/pods
200 OK
Content-Type: application/json
(
    "kind": "PodList",
    "apiVersion": "vl",
    "metadata": ("resourceversion":"!0245"},
    "items": [...]
}

从 12345 版本开始,监听所有对象变化

GET /api/vl /namespaces/test/pods?watch=1&resourceVersion=10245
200 OK
Transfer-Encoding: chunked
Content-Type: application/json
{
"type": "ADDED",
"object": {"kind": "Pod", "apiVersion": "vl", "metadata": ("resourceversion": "10596",...}
)
(
"type": "MODIFIED",
"object": {"kind": "Pod", "apiVersion": "vl", "metadata": {"resourceVersion": "11020",...),...)
}

分页查询

GET /api/v1/pods?limit=500
---
200 OK
Content-Type: application/json
{
    "kind": "PodList,
    "apiVersion": "vl",
    "metadata": {
        "resourceVersion":"10245",
        "continue": "HENCODED_CONTINUE_TOKEN",
        },
    "items": [...] // returns pods 1-500
}
GET /api/v1/pods?limit=500&contnue=ENCODED_CONTINUE_TOKEN
---
200 OK
Content-Type: application/json
{
    "kind": "PodList",
    "apiVersion": "v1",
    "metadata": {
        "resourceVersionn":"10245",
        "continue": "ENCODED_CONTINUE_TOKEN_2,
        },
    "items": [...] // returns pods 501-1000
}

Resourceversion

  • 单个对象的 resourceversion
    • 对象的最后修改 resourceversion
  • List 对象的 resourceversion
    • 生成 list response 时的 resourceversion
  • List 行为
    • List 对象时,如果不加 resourceversion,意味着需要 Most Recent 数据,请求会击穿 APIServer 缓存,直接发送至 etcd
    • APIServer 通过 Label 过滤对象查询时,过滤动作是在 APIServer 端,APIServer 需要向 etcd 发起全量查询请求

Resourceversion

遭遇到的陷阱

频繁的 leader election
etcd 分裂
etcd 不响应
与 apiserver 之间的链路阻塞
磁盘暴涨

少数 etcd 成员 Down

少数etcd成员Down

Master 节点出现网络分区

Case:网络分区出现
Group#1:master-1,master-2
Group#2:master-3,master-4,master-5
Master节点出现网络分区

课后练习 5.2

在 Kubernetes 集群中创建一个高可用的 etcd 集群

参考资料

B树和B+树: https://segmentfault.com/a/1190000020416577
Etcd流程分析: https://www.jianshu.com/p/2614fdb5d1c3