The Kubernetes Database etcd: Day-to-Day Operations and Tips

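etcd is a distributed key-value store built on the Raft algorithm and designed for clustering from the start. Because Raft needs votes from more than half of the nodes to make a decision, an etcd cluster is generally recommended to have an odd number of nodes, such as 3, 5, or 7. Adding a fourth node to a 3-node cluster raises the required majority from 2 to 3 without letting the cluster survive any additional failures.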
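The majority rule is easy to check with a little arithmetic. The following is a minimal sketch; the function names quorum and tolerance are made up for illustration and are not part of etcd.

```shell
#!/bin/sh
# Quorum and fault tolerance for a cluster of a given size.

# quorum N: smallest majority of N members, floor(N/2) + 1.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# tolerance N: how many members can fail while a majority remains.
tolerance() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 3 4 5 6 7; do
  echo "members=$n quorum=$(quorum "$n") tolerated_failures=$(tolerance "$n")"
done
```

Clusters of 3 and 4 members both tolerate one failure, and clusters of 5 and 6 both tolerate two, which is why the even sizes are usually skipped.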

Member operations

Starting point: an existing 3-node cluster (storage is persisted)

One node has crashed and cannot rejoin the cluster after a restart

1. Shut down the node (set its replica count to 0) and delete its data

2. Remove the crashed member

etcdctl --endpoints=172.16.176.179:2379 member remove <crashed-member-id>

3. Add the member back

etcdctl --endpoints=172.16.176.179:2379 member add etcd-0 --peer-urls=http://etcd-headless-0.apisix-etcd.svc.nbugs.local:2380


4. List the members (member list)


5. Start the node

ETCD_INITIAL_CLUSTER_STATE must be set to existing
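For illustration, the environment for the restarted node might be assembled as below. This is only a sketch: the helper initial_cluster is made up, and the peer URLs for etcd-1 and etcd-2 are assumptions patterned on the etcd-0 URL used in the member add command, so substitute the values from your own cluster.

```shell
#!/bin/sh
# Build the environment for restarting a replaced member.
# ASSUMPTION: etcd-1 and etcd-2 follow the same peer URL pattern as etcd-0.

domain="apisix-etcd.svc.nbugs.local"

# initial_cluster NAME...: join "name=peer-url" pairs with commas.
# The body runs in a subshell so its variables do not leak.
initial_cluster() (
  out=""
  for name in "$@"; do
    idx="${name##*-}"   # trailing ordinal, e.g. 0 from etcd-0
    out="${out:+$out,}${name}=http://etcd-headless-${idx}.${domain}:2380"
  done
  echo "$out"
)

export ETCD_NAME="etcd-0"
export ETCD_INITIAL_CLUSTER="$(initial_cluster etcd-0 etcd-1 etcd-2)"
# "existing" tells the node to join the running cluster instead of bootstrapping a new one.
export ETCD_INITIAL_CLUSTER_STATE="existing"

env | grep '^ETCD_' | sort
```

The member add command also prints suggested values for these variables, which is usually the safer source to copy from.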

6. The node may report Error: etcdserver: re-configuration failed due to not enough started members

7. Remove the member

8. Wait for the node to join the cluster
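The remove and add steps can be sketched as a dry run that only prints the commands. The sample member list below is made up for illustration (apart from the ID 2d9c4bf1e501bb91, borrowed from an error message in this post) and assumes the default comma-separated output of etcdctl v3 member list; the helper member_id is hypothetical.

```shell
#!/bin/sh
# Dry-run sketch: look up a member ID by name, then print the remove/add commands.

endpoint="172.16.176.179:2379"
peer_url="http://etcd-headless-0.apisix-etcd.svc.nbugs.local:2380"

# member_id NAME: read "member list" output on stdin and print the ID
# of the member whose name column (third field) equals NAME.
member_id() {
  awk -F', ' -v name="$1" '$3 == name { print $1 }'
}

# ILLUSTRATIVE SAMPLE ONLY. Real input would come from:
#   etcdctl --endpoints="$endpoint" member list
sample=$(cat <<'EOF'
2d9c4bf1e501bb91, started, etcd-0, http://peer-0.example:2380, http://client-0.example:2379, false
aaaaaaaaaaaaaaaa, started, etcd-1, http://peer-1.example:2380, http://client-1.example:2379, false
bbbbbbbbbbbbbbbb, started, etcd-2, http://peer-2.example:2380, http://client-2.example:2379, false
EOF
)

id=$(printf '%s\n' "$sample" | member_id etcd-0)

# An empty ID would make "member remove" fail with "bad member ID arg".
if [ -z "$id" ]; then
  echo "member etcd-0 not found" >&2
  exit 1
fi

echo "etcdctl --endpoints=$endpoint member remove $id"
echo "etcdctl --endpoints=$endpoint member add etcd-0 --peer-urls=$peer_url"
```

Checking that the ID is non-empty before calling member remove avoids one of the errors listed in the next section.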

Error handling

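Error: etcdserver: re-configuration failed due to not enough started members
member remove <crashed-member-id>
This usually appears when a node has been initialized but cannot join the cluster. Remove the member with the command above and wait for the node to join the cluster; there is no need to delete its data.

Error: bad member ID arg (strconv.ParseUint: parsing "": invalid syntax), expecting ID in Hex
The member ID passed to the command was empty. Reinitialize the node.

failed to publish local member to cluster through raft
Reinitialize the node.

member 2d9c4bf1e501bb91 has already been bootstrapped
Reinitialize the node.

discovery failed","error":"couldn't find local name \"etcd-1\" in the initial cluster configuration
Reinitialize the node.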
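As a rough aid, the mapping from message to remedy can be written as a small lookup. This is only a sketch of the list above; suggest_action is a made-up helper, and it matches on message fragments rather than parsing real etcd logs.

```shell
#!/bin/sh
# suggest_action LINE: map a known error message to the remedy listed above.
suggest_action() {
  case "$1" in
    *"not enough started members"*)
      echo "remove the member, then wait for the node to join" ;;
    *"bad member ID arg"*|*"failed to publish local member"*|*"has already been bootstrapped"*|*"couldn't find local name"*)
      echo "reinitialize the node" ;;
    *)
      echo "unknown" ;;
  esac
}

suggest_action "member 2d9c4bf1e501bb91 has already been bootstrapped"
suggest_action "Error: etcdserver: re-configuration failed due to not enough started members"
```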

etcdctl commands

Because etcd is configured with certificates, every command has to pass them, for example:

ETCDCTL_API=3 ./etcdctl --endpoints=https://0:2379,https://1:2379,https://2:2379 --cacert /etc/etcd/ssl/ca.pem --cert /etc/etcd/ssl/etcd.pem --key /etc/etcd/ssl/etcd-key.pem endpoint status -w=table
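  • version: show the version
  • member list: list the members
  • endpoint status: node status, including which node is the leader
  • endpoint health: health status and response time
  • set app demo: write a key (v2 API; with ETCDCTL_API=3 use put app demo)
  • get app: read a key
  • update app demo1: update a key (v2 API; with ETCDCTL_API=3 use put app demo1)
  • rm app: delete a key (v2 API; with ETCDCTL_API=3 use del app)
  • mkdir demo: create a directory (v2 API only; the v3 keyspace is flat)
  • rmdir dir: remove a directory (v2 API only)
  • backup: back up the data (v2 API; with ETCDCTL_API=3 use snapshot save <file>)
  • watch key: watch a key for changes
  • get / --prefix --keys-only: list all keys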
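To avoid retyping the certificate flags, they can be wrapped once in a small function. The sketch below is a dry run that only prints the command lines; the function name etcd3 is made up, and the endpoints and certificate paths are the ones from the example command above.

```shell
#!/bin/sh
# Wrap the endpoint and TLS flags once so each command stays short.

endpoints="https://0:2379,https://1:2379,https://2:2379"
tls_flags="--cacert /etc/etcd/ssl/ca.pem --cert /etc/etcd/ssl/etcd.pem --key /etc/etcd/ssl/etcd-key.pem"

# etcd3 ARGS...: print the full etcdctl v3 command line for ARGS.
# Remove the leading "echo" to run against a real cluster.
# $tls_flags is intentionally unquoted so it splits into separate flags.
etcd3() {
  echo ETCDCTL_API=3 ./etcdctl --endpoints="$endpoints" $tls_flags "$@"
}

etcd3 endpoint health
etcd3 put app demo
etcd3 get app
etcd3 get / --prefix --keys-only
etcd3 del app
```

The examples use the v3 subcommands (put, get, del), since the command above sets ETCDCTL_API=3.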