Interactive Lab · CKA Guaranteed Task · Intermediate

etcd BACKUP & RESTORE

$ ETCDCTL_API=3 etcdctl snapshot save /backup/etcd-snapshot.db
⚠ This task appears in almost every CKA exam. Master it.
🗄 What is etcd and why does it matter?
  1. etcd is the brain of your Kubernetes cluster. It stores every piece of cluster state: nodes, pods, secrets, configmaps, deployments, RBAC rules, everything.
  2. If etcd is lost and you have no backup, your entire cluster configuration is gone. The worker nodes keep running their current pods, but you can no longer manage the cluster.
  3. The etcdctl tool is used to interact with etcd. Set ETCDCTL_API=3 before running commands. etcdctl 3.4 and later already default to API v3, but older builds default to v2, which has no snapshot subcommand and cannot see data written through v3.
  4. On the CKA exam, the task gives you the etcd certificates and endpoint. You just need to know the exact command structure.
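A minimal sketch of point 3: exporting the variable once means every later etcdctl call in the same shell inherits it. The command -v guard is only there so the snippet does not fail on a machine without etcdctl installed.

```shell
# Export once so every etcdctl call in this shell session uses the v3 API.
export ETCDCTL_API=3

# Print the client and API versions if etcdctl is installed.
if command -v etcdctl >/dev/null 2>&1; then
  etcdctl version
else
  echo "etcdctl not found in PATH"
fi
```

On the exam it is still worth prefixing individual commands with ETCDCTL_API=3, in case you open a new shell where the export no longer applies.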
Cluster Architecture
  🗄 ETCD: Healthy (port 2379)
  🎛 API SERVER: Running (port 6443)
  📅 SCHEDULER: Running (leader elected)
  🔄 CONTROLLER: Running (reconciling)
💡 etcd runs as a static pod on the control plane node. Its manifest is /etc/kubernetes/manifests/etcd.yaml, and the cert paths and endpoint are listed there. On the exam, always check this file first.
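One way to pull the relevant values out of the manifest without scrolling through it is sketched below. The function name etcd_manifest_flags is made up for this example, and the default path assumes a kubeadm-style cluster.

```shell
# Print the etcd flags that matter for etcdctl from a static pod manifest.
# The default path is the kubeadm location; pass another path as $1 if needed.
etcd_manifest_flags() {
  manifest="${1:-/etc/kubernetes/manifests/etcd.yaml}"
  # Match only the client-facing flags; --peer-* variants are skipped.
  grep -E -- '--(listen-client-urls|advertise-client-urls|cert-file|key-file|trusted-ca-file|data-dir)=' "$manifest" \
    | sed 's/^[[:space:]]*-[[:space:]]*//'
}
```

The etcd server flags map onto etcdctl flags as follows: --trusted-ca-file becomes --cacert, --cert-file becomes --cert, and --key-file becomes --key.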
Check etcd health
# Check etcd endpoint health
ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  endpoint health
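Because the same four connection flags recur in every command that talks to the server, a small wrapper can save typing. This is only a sketch: etcdctl_secure and the ETCDCTL_BIN override are names invented here, and the paths are the kubeadm defaults used throughout this lab.

```shell
# Wrapper that adds the endpoint and TLS flags to any etcdctl subcommand.
# Paths are the kubeadm defaults from this lab; adjust them to match etcd.yaml.
# ETCDCTL_BIN is a hypothetical override, handy for a dry run with "echo".
etcdctl_secure() {
  ETCDCTL_API=3 "${ETCDCTL_BIN:-etcdctl}" \
    --endpoints=https://127.0.0.1:2379 \
    --cacert=/etc/kubernetes/pki/etcd/ca.crt \
    --cert=/etc/kubernetes/pki/etcd/server.crt \
    --key=/etc/kubernetes/pki/etcd/server.key \
    "$@"
}
```

With the wrapper defined, etcdctl_secure endpoint health repeats the check above, and etcdctl_secure member list or etcdctl_secure endpoint status --write-out=table give a little more detail.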
💾 Step 2: Take an etcd Snapshot
  1. The backup command is etcdctl snapshot save followed by the path where you want to save the snapshot file.
  2. You must always provide three certificate flags: --cacert, --cert, and --key. The exam task always gives you these paths.
  3. Fill in the backup path below and run it. The simulator will show you the exact output you would see on a real cluster.
⚠ On the real exam: after saving the snapshot, always run etcdctl snapshot status on the file to verify it is not corrupt before moving on.
snapshot save command
ETCDCTL_API=3 etcdctl snapshot save /backup/etcd-snapshot.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key
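The save-then-verify sequence can be sketched as a single function. The name etcd_backup is made up for this example, the paths are the kubeadm defaults from this lab, and creating the target directory is an addition of this sketch rather than something the exam task requires.

```shell
# Save a snapshot to $1, then verify it. Fails if either step fails.
# Endpoint and certificate paths are the kubeadm defaults from this lab.
etcd_backup() {
  snapshot="${1:-/backup/etcd-snapshot.db}"

  # Make sure the target directory exists before etcdctl writes to it.
  mkdir -p "$(dirname "$snapshot")" || return 1

  ETCDCTL_API=3 etcdctl snapshot save "$snapshot" \
    --endpoints=https://127.0.0.1:2379 \
    --cacert=/etc/kubernetes/pki/etcd/ca.crt \
    --cert=/etc/kubernetes/pki/etcd/server.crt \
    --key=/etc/kubernetes/pki/etcd/server.key || return 1

  # status reads the file directly, so no endpoint or cert flags are needed.
  ETCDCTL_API=3 etcdctl snapshot status "$snapshot" --write-out=table
}
```

The status table lists the snapshot's hash, revision, total keys, and total size. A snapshot that cannot be read makes the second command fail, which is the signal to retake the backup.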
💥 Step 3: Simulate a Cluster Failure
  1. In this step we simulate etcd data loss, the scenario where you actually need to use the backup.
  2. On a real cluster this could happen due to disk corruption, accidental deletion of etcd data, or a failed upgrade. The symptoms: kubectl get pods hangs and the API server becomes unreachable.
  3. Click Simulate Failure below to see what a broken etcd looks like, then move to Step 4 to restore it.
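On a real control plane node, the symptoms from point 2 can be checked with something like the sketch below. The function name etcd_triage is invented for this example, and crictl is assumed to be installed and configured for the node's container runtime, which is common on kubeadm clusters but not guaranteed.

```shell
# Quick triage on the control plane node when kubectl stops answering.
etcd_triage() {
  # Bound the wait so an unreachable API server does not hang the terminal.
  if kubectl get pods -A --request-timeout=5s >/dev/null 2>&1; then
    echo "API server is answering"
    return 0
  fi

  echo "API server is not answering; checking the etcd container"
  # kubectl is unusable at this point, so ask the container runtime directly.
  crictl ps -a --name etcd
  return 1
}
```

An etcd container that is missing, or listed as exited and restarting repeatedly, points at etcd rather than at the API server itself.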
Cluster State
  🗄 ETCD: Healthy (port 2379)
  🎛 API SERVER: Running (port 6443)
  📦 WORKLOADS: Running (15 pods active)
🔄 Step 4: Restore from Snapshot
  1. The restore command is etcdctl snapshot restore. It creates a new data directory from the snapshot. You then point etcd at this new directory.
  2. After restoring, edit /etc/kubernetes/manifests/etcd.yaml so that --data-dir, the matching volumeMounts mountPath, and the hostPath volume all point to the new restore path. The kubelet notices the changed manifest and restarts the etcd static pod automatically.
  3. Fill in the snapshot path and restore path, then run. Watch the cluster recover.
snapshot restore command
ETCDCTL_API=3 etcdctl snapshot restore /backup/etcd-snapshot.db \
  --data-dir=/var/lib/etcd-restore
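A slightly more defensive version of the restore is sketched below. The function name etcd_restore and both guard checks are additions of this example, not part of etcdctl.

```shell
# Restore snapshot $1 into a fresh data directory $2.
etcd_restore() {
  snapshot="${1:-/backup/etcd-snapshot.db}"
  data_dir="${2:-/var/lib/etcd-restore}"

  # Guard: fail early with a clear message if the snapshot is missing.
  if [ ! -f "$snapshot" ]; then
    echo "snapshot not found: $snapshot" >&2
    return 1
  fi

  # Guard: never restore on top of an existing directory.
  if [ -e "$data_dir" ]; then
    echo "refusing to overwrite: $data_dir" >&2
    return 1
  fi

  # Restore works on the file alone; no endpoint or cert flags are needed.
  ETCDCTL_API=3 etcdctl snapshot restore "$snapshot" --data-dir="$data_dir"
}
```

Newer etcd releases (3.5 onward) also ship this subcommand as etcdutl snapshot restore, so check which binary is available on the node you are given.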
Update etcd manifest
# Edit the etcd static pod manifest
vi /etc/kubernetes/manifests/etcd.yaml

# Find this flag and update the path:
- --data-dir=/var/lib/etcd-restore

# Update the matching volumeMounts entry so the container sees that path:
mountPath: /var/lib/etcd-restore

# Also update the hostPath volume to match:
path: /var/lib/etcd-restore
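If you prefer not to edit by hand, the same change can be sketched with sed. The function name etcd_repoint_manifest is invented for this example, and it assumes the kubeadm layout, where --data-dir, the volumeMounts mountPath, and the hostPath path each sit on a line that ends with the old directory.

```shell
# Repoint every reference to the old data directory at the restored one.
# Covers --data-dir, mountPath, and the hostPath path in one pass.
etcd_repoint_manifest() {
  manifest="${1:-/etc/kubernetes/manifests/etcd.yaml}"
  old_dir="${2:-/var/lib/etcd}"
  new_dir="${3:-/var/lib/etcd-restore}"
  backup="${TMPDIR:-/tmp}/etcd.yaml.bak"

  # Keep the backup outside the manifests directory so the kubelet
  # does not try to load it as a second static pod.
  cp "$manifest" "$backup" || return 1

  # "$" anchors the match to the end of the line, so a path that already
  # ends in -restore is not rewritten a second time.
  sed "s|${old_dir}\$|${new_dir}|" "$backup" > "$manifest"
}
```

Afterwards, grep etcd-restore /etc/kubernetes/manifests/etcd.yaml should show three lines. Changing --data-dir without the matching mountPath would leave etcd writing to an unmounted path inside the container, so it would start with an empty database.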
Cluster Recovery
  🗄 ETCD: Failed (data corrupted)
  🎛 API SERVER: Unreachable (connection refused)
  📦 WORKLOADS: Unknown (no etcd connection)
Backup Command
  ETCDCTL_API=3            Always set this. API v3 is required for snapshot commands.
  snapshot save <path>     Creates a point-in-time snapshot of etcd data.
  --endpoints              etcd server address. Usually https://127.0.0.1:2379 on the control plane node.
  --cacert                 /etc/kubernetes/pki/etcd/ca.crt
  --cert                   /etc/kubernetes/pki/etcd/server.crt
  --key                    /etc/kubernetes/pki/etcd/server.key
Restore Command
  snapshot restore <file>  Restores etcd from a snapshot file into a new data directory.
  --data-dir               Where to restore the data. Use a new path, not the existing etcd dir.
  After restore            Edit /etc/kubernetes/manifests/etcd.yaml and update --data-dir, the volumeMounts mountPath, and the hostPath volume to the new path.
  Verify restore           Run snapshot status on the snapshot file before restoring. Then watch kubectl get pods until the API server responds.
Exam Day Checklist
1. Find cert paths: cat /etc/kubernetes/manifests/etcd.yaml
2. Set API version: export ETCDCTL_API=3
3. Take snapshot: etcdctl snapshot save <path> with all cert flags
4. Verify snapshot: etcdctl snapshot status <path>
5. Restore if needed: etcdctl snapshot restore <path> --data-dir=<new>
6. Update manifest: edit etcd.yaml so --data-dir, mountPath, and hostPath use the new path
7. Wait for restart: watch kubectl get pods -n kube-system
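For item 7, a bounded polling loop is an alternative to watching the screen. This is a sketch only: wait_for_api is a made-up name, and the 180-second default and 5-second interval are arbitrary choices.

```shell
# Poll until the API server answers again, or give up after $1 seconds.
wait_for_api() {
  timeout_s="${1:-180}"
  elapsed=0

  until kubectl get pods -n kube-system --request-timeout=5s >/dev/null 2>&1; do
    if [ "$elapsed" -ge "$timeout_s" ]; then
      echo "API server still down after ${timeout_s}s" >&2
      return 1
    fi
    sleep 5
    elapsed=$((elapsed + 5))
  done

  echo "API server is back after about ${elapsed}s"
}
```

If the loop times out, recheck the three paths in etcd.yaml first, since a mismatch there is a common reason for etcd failing to come back.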