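Kubernetes CKA Course Notes 68
Consider a disaster scenario in which we lose the components we defined ourselves in the K8s cluster.
To simulate it, I delete some of those components and then restore them from the etcd store backup file made in the previous note.
First, check the current environment (that is, the components captured in the backup file from note 66) by running: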
kubectl get all
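I could also have deleted some deployments for this simulation (in principle that works too), but the following is enough.
Start by deleting the service components:
kubectl delete service {service-name-1} {service-name-2} {service-name-3}
Checking again gives:
We can also delete the secrets and configMaps we created ourselves.
The configMaps are deleted the same way:
That removes many of the custom components, leaving the situation shown below:
But now we have: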
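Restoring K8s components: recovering from the etcd store backup
See the official documentation:
Step 1: Restore the backup file from the earlier note (note 66) into data files the etcd server can run on
Build the restored etcd store directly from that backup file by running:
ETCDCTL_API=3 etcdctl snapshot restore {absolute path of the backup file} --data-dir {new etcd store location, where the backup is restored into files usable by the etcd server}
Reference:
We want to take the backup file (/tmp/etcdbackup.db) and produce the restored
data files for the etcd server to use,
replacing the old ones (the files under /var/lib/etcd, assumed to be damaged).
So the restore command is: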
ETCDCTL_API=3 etcdctl snapshot restore /tmp/etcdbackup.db --data-dir /var/lib/etcdbackup
Adjusted to run with sudo:
sudo ETCDCTL_API=3 etcdctl snapshot restore /tmp/etcdbackup.db --data-dir /var/lib/etcdbackup
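On my machine this fails (unlike for the instructor, whose run succeeded with the same command),
even though the /var/lib/etcdbackup directory does get created.
Searching around afterwards, I found this post:
Check the help text of etcdctl's snapshot restore subcommand by running: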
ETCDCTL_API=3 etcdctl snapshot restore --help
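I decided to simply skip the hash check; the help text for --skip-hash-check also says the option is required when the snapshot was copied from a data directory.
So first remove the /var/lib/etcdbackup directory left behind by the failed attempt,
then rerun the restore command: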
sudo ETCDCTL_API=3 etcdctl snapshot restore /tmp/etcdbackup.db --data-dir /var/lib/etcdbackup --skip-hash-check true
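The restore command that finally worked is the one below (yes, the only difference is an equals sign...). A boolean flag only takes its value in the --flag=value form; written with a space, true is read as an extra positional argument: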
sudo ETCDCTL_API=3 etcdctl snapshot restore /tmp/etcdbackup.db --data-dir /var/lib/etcdbackup --skip-hash-check=true
Check the restored directory:
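The restore attempts above can be wrapped in a small helper that checks the two preconditions this note ran into before calling etcdctl. This is only a sketch: the function name restore_etcd_snapshot is hypothetical, and it assumes that a leftover target directory should be removed by hand first and that --skip-hash-check=true is acceptable for this snapshot, as it was in the steps above.

```shell
# Hypothetical helper (not part of etcdctl): validate inputs, then restore.
# $1: snapshot file, $2: directory to restore into (must not exist yet)
restore_etcd_snapshot() {
    snapshot=$1
    data_dir=$2

    # The snapshot file must exist before we try anything.
    if [ ! -f "$snapshot" ]; then
        echo "snapshot file not found: $snapshot" >&2
        return 1
    fi

    # A directory left over from a failed attempt has to be removed first.
    if [ -e "$data_dir" ]; then
        echo "target already exists, remove it first: $data_dir" >&2
        return 2
    fi

    # Same command as in the note; the boolean flag uses the '=' form.
    ETCDCTL_API=3 etcdctl snapshot restore "$snapshot" \
        --data-dir "$data_dir" --skip-hash-check=true
}
```

From a root shell it would be called as restore_etcd_snapshot /tmp/etcdbackup.db /var/lib/etcdbackup; the helper itself does not call sudo.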
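Step 2: Point the etcd server's data storage at the restored directory
The simple way is to edit /etc/kubernetes/manifests/etcd.yaml directly,
because, as covered earlier, kubelet continuously watches the component YAML files under /etc/kubernetes/manifests
and applies any change made to them, so: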
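The exact manifest change is not shown here, so the following is a sketch under an assumption: in a kubeadm-generated etcd.yaml, the data directory is a hostPath volume whose path is /var/lib/etcd, and the edit replaces that path with the restored directory. The function name retarget_etcd_data_dir is hypothetical.

```shell
# Hypothetical helper: rewrite the hostPath that points at /var/lib/etcd.
# $1: new host directory; manifest text is read from stdin, result goes to stdout.
retarget_etcd_data_dir() {
    # The trailing '$' anchor keeps other paths (for example the
    # /etc/kubernetes/pki/etcd certificate volume) untouched.
    sed "s#path: /var/lib/etcd\$#path: $1#"
}

# Example (commented out so the block has no side effects):
# retarget_etcd_data_dir /var/lib/etcdbackup \
#     < /etc/kubernetes/manifests/etcd.yaml > /tmp/etcd.yaml
```

The rewritten file would then be copied back over /etc/kubernetes/manifests/etcd.yaml with sudo. Keeping a copy of the original manifest outside the manifests directory makes it easy to revert.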
Right after the edit, kubectl commands hang for a while (because the etcd Pod is restarting):
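Rather than retrying kubectl by hand while etcd restarts, a small retry loop can poll until the API answers again. A minimal sketch; wait_until is a hypothetical helper name, and the attempt count and delay are arbitrary choices.

```shell
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# $1: maximum attempts, $2: seconds to sleep between attempts, rest: the command
wait_until() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        # Discard output; only the exit status matters here.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    return 1
}

# Example (commented out): poll the API server for up to about 150 seconds.
# wait_until 30 5 kubectl get nodes
```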
Step 3: Confirm that the K8s components I deleted at the top have been restored
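One way to make this check less manual is to save the list of object names before the deletion and compare it with the list after the restore. This is a sketch, not what was done in these notes: the function name missing_after_restore and the file names are hypothetical, and the lists are assumed to come from kubectl get all,secret,configmap -o name.

```shell
# Hypothetical helper: print objects present before the deletion
# but absent after the restore.
# $1: list saved before the deletion, $2: list taken after the restore
missing_after_restore() {
    # -F fixed strings, -x whole-line match, -v invert, -f patterns from file;
    # '|| true' keeps the exit status 0 when nothing is missing.
    grep -Fxv -f "$2" "$1" || true
}

# Example (commented out):
# kubectl get all,secret,configmap -o name > /tmp/before.txt   # before deleting
# kubectl get all,secret,configmap -o name > /tmp/after.txt    # after restoring
# missing_after_restore /tmp/before.txt /tmp/after.txt
```

Empty output means every object from the first list is present again.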
That completes the recovery of the lost K8s components!
Before wrapping up, though, I noticed:
Check the logs by running:
kubectl logs etcd-master -n kube-system
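which shows:
When I tried restoring to the path /var/lib/etcd instead, the log looked like this:
With a bad restore in place, that is, while the etcd Pod is still Pending, creating a Pod hangs completely:
So for now I reverted to the original /var/lib/etcd directory:
An excerpt of the log from the stuck etcd Pod follows: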
2021-11-26 14:52:48.176036 N | etcdmain: the server is already initialized as member before, starting as etcd member...
[WARNING] Deprecated '--logger=capnslog' flag is set; use '--logger=zap' flag instead
2021-11-26 14:52:48.176102 I | embed: peerTLS: cert = /etc/kubernetes/pki/etcd/peer.crt, key = /etc/kubernetes/pki/etcd/peer.key, trusted-ca = /etc/kubernetes/pki/etcd/ca.crt, client-cert-auth = true, crl-file =
2021-11-26 14:52:48.176896 I | embed: name = master
2021-11-26 14:52:48.176911 I | embed: data dir = /var/lib/etcd
2021-11-26 14:52:48.176916 I | embed: member dir = /var/lib/etcd/member
2021-11-26 14:52:48.176920 I | embed: heartbeat = 100ms
2021-11-26 14:52:48.176924 I | embed: election = 1000ms
2021-11-26 14:52:48.176933 I | embed: snapshot count = 10000
2021-11-26 14:52:48.176942 I | embed: advertise client URLs = https://172.31.32.35:2379
2021-11-26 14:52:48.176947 I | embed: initial advertise peer URLs = https://172.31.32.35:2380
2021-11-26 14:52:48.176953 I | embed: initial cluster =
2021-11-26 14:52:48.186945 I | etcdserver: recovered store from snapshot at index 1
2021-11-26 14:52:48.187788 I | mvcc: restore compact to 4680451
2021-11-26 14:52:48.200563 I | etcdserver: restarting member 8e9e05c52164694d in cluster cdf818194e3a8c32 at commit index 2274
raft2021/11/26 14:52:48 INFO: 8e9e05c52164694d switched to configuration voters=(10276657743932975437)
raft2021/11/26 14:52:48 INFO: 8e9e05c52164694d became follower at term 2
raft2021/11/26 14:52:48 INFO: newRaft 8e9e05c52164694d [peers: [8e9e05c52164694d], term: 2, commit: 2274, applied: 1, lastindex: 2274, lastterm: 2]
2021-11-26 14:52:48.202507 I | etcdserver/membership: added member 8e9e05c52164694d [http://localhost:2380] to cluster cdf818194e3a8c32 from store
2021-11-26 14:52:48.203946 W | auth: simple token is not cryptographically signed
2021-11-26 14:52:48.205743 I | mvcc: restore compact to 4680451
2021-11-26 14:52:48.211095 I | etcdserver: starting server... [version: 3.4.13, cluster version: to_be_decided]
2021-11-26 14:52:48.211495 I | etcdserver: 8e9e05c52164694d as single-node; fast-forwarding 9 ticks (election ticks 10)
2021-11-26 14:52:48.212001 N | etcdserver/membership: set the initial cluster version to 3.4
2021-11-26 14:52:48.212121 I | etcdserver/api: enabled capabilities for version 3.4
2021-11-26 14:52:48.213965 I | embed: ClientTLS: cert = /etc/kubernetes/pki/etcd/server.crt, key = /etc/kubernetes/pki/etcd/server.key, trusted-ca = /etc/kubernetes/pki/etcd/ca.crt, client-cert-auth = true, crl-file =
2021-11-26 14:52:48.214306 I | embed: listening for metrics on http://127.0.0.1:2381
2021-11-26 14:52:48.214640 I | embed: listening for peers on 172.31.32.35:2380
raft2021/11/26 14:52:48 INFO: 8e9e05c52164694d is starting a new election at term 2
raft2021/11/26 14:52:48 INFO: 8e9e05c52164694d became candidate at term 3
raft2021/11/26 14:52:48 INFO: 8e9e05c52164694d received MsgVoteResp from 8e9e05c52164694d at term 3
raft2021/11/26 14:52:48 INFO: 8e9e05c52164694d became leader at term 3
raft2021/11/26 14:52:48 INFO: raft.node: 8e9e05c52164694d elected leader 8e9e05c52164694d at term 3
2021-11-26 14:52:48.304019 I | etcdserver: published {Name:master ClientURLs:[https://172.31.32.35:2379]} to cluster cdf818194e3a8c32
2021-11-26 14:52:48.304142 I | embed: ready to serve client requests
2021-11-26 14:52:48.304686 I | embed: ready to serve client requests
2021-11-26 14:52:48.307152 I | embed: serving client requests on 127.0.0.1:2379
2021-11-26 14:52:48.308149 I | embed: serving client requests on 172.31.32.35:2379
2021-11-26 14:55:12.351618 I | mvcc: store.index: compact 4680915
2021-11-26 14:55:12.352863 I | mvcc: finished scheduled compaction at 4680915 (took 830.881µs)
The problematic parts are the lines marked in bold above (to be investigated!)