CSI

How to Repair a File System Provided by a Kubernetes Ceph Persistent Volume

One day, a hardware fault in an underlying VM affected the entire self-hosted Kubernetes cluster, and the MySQL application deployed in that environment could no longer mount its PV. The pod reported the following error:

```
MountVolume.SetUp failed for volume "pvc-78cf22fa-d776-43a4-98d7-d594f02ea018" : mount command failed, status: Failure, reason: failed to mount volume /dev/rbd13 [xfs] to /var/lib/kubelet/plugins/ceph.rook.io/rook-ceph/mounts/pvc-78cf22fa-d776-43a4-98d7-d594f02ea018, error mount failed: exit status 32
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/plugins/ceph.rook.io/rook-ceph/mounts/pvc-78cf22fa-d776-43a4-98d7-d594f02ea018 --scope -- mount -t xfs -o rw,defaults /dev/rbd13 /var/lib/kubelet/plugins/ceph.rook.io/rook-ceph/mounts/pvc-78cf22fa-d776-43a4-98d7-d594f02ea018
Output: Running scope as unit run-r2594d6c82152421c8891bfa8761e8c05.scope.
mount: mount /dev/rbd13 on /var/lib/kubelet/plugins/ceph.rook.io/rook-ceph/mounts/pvc-78cf22fa-d776-43a4-98d7-d594f02ea018 failed: Structure needs cleaning
```

Resolution: use the Ceph tools to re-map the image behind this PV and try to repair the file system with fsck.
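A minimal sketch of that repair, with the assumptions labeled: the Rook cluster uses a pool named replicapool, the RBD image name matches the PVC name (verify both with `rbd ls` from the rook-ceph toolbox), and the commands run on a host with the rbd kernel module. Because the volume is XFS, fsck.xfs is a no-op wrapper, so the actual repair tool is xfs_repair:

```bash
# Stop the consumer first so nothing holds the device (deployment name is hypothetical)
kubectl scale deployment mysql --replicas=0

# Map the image that backs the broken PV; pool and image names are assumptions,
# confirm them with `rbd ls <pool>` before running this
rbd map replicapool/pvc-78cf22fa-d776-43a4-98d7-d594f02ea018   # prints e.g. /dev/rbd0

# Repair the XFS metadata while the device is unmounted
xfs_repair /dev/rbd0

# Release the device and let kubelet mount the PV again
rbd unmap /dev/rbd0
kubectl scale deployment mysql --replicas=1
```

If xfs_repair refuses to run because of a dirty log, the usual escalation is to mount and cleanly unmount the volume once to replay the log, or, as a last resort, rerun with -L to zero the log at the cost of recent metadata changes.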

CSI · Ingress · Kubernetes

On-Premises K8s Installation

Load balancer & VIP with HAProxy & Keepalived (a configuration sketch follows this section)

Init HA control plane & worker nodes

```
sudo kubeadm init --control-plane-endpoint "LOAD_BALANCER_DNS:LOAD_BALANCER_PORT" --upload-certs
...
You can now join any number of control-plane node by running the following command on each as a root:

  kubeadm join 192.168.0.200:6443 --token 9vr73a.a8uxyaju799qwdjv --discovery-token-ca-cert-hash sha256:7c2e69131a36ae2a042a339b33381c6d0d43887e2de83720eff5359e26aec866 --control-plane --certificate-key f8902e114ef118304e561c3ecd4d0b543adc226b7a07f675f56564185ffe0c07

Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use kubeadm init phase upload-certs to reload certs afterward.

Then you can join any number of worker nodes by running the following on each as root:

  kubeadm join 192.168.0.200:6443 --token 9vr73a.a8uxyaju799qwdjv --discovery-token-ca-cert-hash sha256:7c2e69131a36ae2a042a339b33381c6d0d43887e2de83720eff5359e26aec866
```

Networking with Weave Net

```
$ kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"
```

https://www.weave.works/docs/net/latest/overview/
https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/high-availability/

Ingress controller with Nginx (an example Ingress object is sketched below)

```
$ kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.3.1/deploy/static/provider/cloud/deploy.yaml
```

https://kubernetes.github.io/ingress-nginx/

Storage with Ceph (Rook)

```
$ git clone --single-branch --branch v1.10.1 https://github.com/rook/rook.git
$ cd rook/deploy/examples
$ kubectl create -f crds.yaml -f common.yaml -f operator.yaml
$ kubectl create -f cluster.yaml
```

https://rook.io/docs/rook/v1.10/Getting-Started/intro/

Monitoring with Prometheus & Grafana

```
$ helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
$ helm repo update
$ helm install -n [NAMESPACE] [RELEASE_NAME] prometheus-community/kube-prometheus-stack
```

https://prometheus.io
https://grafana.com/oss/grafana/

Logs with Loki

```
$ helm repo add grafana https://grafana.github.io/helm-charts
$ helm repo update
$ helm upgrade --install --namespace=[NAMESPACE] [RELEASE_NAME] grafana/loki-stack
```

https://grafana.com/oss/loki/
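The post names HAProxy and Keepalived for the load balancer and VIP but shows no configuration. A minimal sketch of what that pair might look like, assuming the VIP is 192.168.0.200 (the address in the kubeadm join output above), the NIC is eth0, and the control-plane nodes sit at the hypothetical addresses 192.168.0.201 to 203:

```
# /etc/haproxy/haproxy.cfg (sketch): forward the VIP's port 6443 to every API server
frontend kube-apiserver
    bind *:6443
    mode tcp
    option tcplog
    default_backend kube-apiserver

backend kube-apiserver
    mode tcp
    option tcp-check
    balance roundrobin
    server cp1 192.168.0.201:6443 check   # hypothetical node addresses
    server cp2 192.168.0.202:6443 check
    server cp3 192.168.0.203:6443 check
```

```
# /etc/keepalived/keepalived.conf (sketch): float the VIP between the HAProxy hosts
vrrp_instance VI_1 {
    state MASTER              # BACKUP on the other hosts
    interface eth0            # assumption: adjust to your NIC
    virtual_router_id 51
    priority 101              # lower on BACKUP hosts
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass k8s-vip
    }
    virtual_ipaddress {
        192.168.0.200         # the VIP used as LOAD_BALANCER_DNS above
    }
}
```

This assumes the HAProxy hosts are separate from the control-plane nodes; if they are colocated, the frontend cannot bind the same port 6443 the API server already uses.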

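Deploying the ingress-nginx controller only installs the data path; traffic still needs an Ingress object to route it. A minimal example, where the hostname and backend Service are hypothetical:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: demo                        # hypothetical name
spec:
  ingressClassName: nginx           # matches the controller installed above
  rules:
  - host: demo.example.com          # hypothetical host
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: demo-svc          # hypothetical Service in the same namespace
            port:
              number: 80
```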
CNI · CSI · Kubernetes
