Jeff’s note
Posts
Terraform, an Infrastructure as Code management tool
IaC Concept
- fast deployment for business requirements
- infrastructure consistency
- security compliance
- automation

Azure cloud infra architecture overview
https://learn.microsoft.com/en-us/azure/architecture/reference-architectures/containers/aks-microservices/aks-microservices

Proposed GitOps flow
- change request: stakeholder notification, plan
- IaC modularization for reuse
- version control, branches: main, dev, new_net, new_db, stg, prd, …
- automated testing
- continuous deployment
- push update status to the related notification channel (see the sketch after the references below)

HCL code example

variable "aks_node_pool" {
  description = ""
  type = map(object({
    vm_size         = string
    node_count      = number
    zones           = list(string)
    os_disk_size_gb = number
    os_disk_type    = string
  }))
  default = {
    "mqpool" : {
      vm_size         = "Standard_DS4_v2",
      node_count      = 6,
      zones           = ["1", "2"],
      os_disk_size_gb = 64,
      os_disk_type    = "Ephemeral"
    }
    "tbcorepool" : {
      vm_size         = "Standard_DS3_v2",
      node_count      = 3,
      zones           = ["1", "2"],
      os_disk_size_gb = 64,
      os_disk_type    = "Ephemeral"
    }
    "tbrepool" : {
      vm_size         = "Standard_DS3_v2",
      node_count      = 25,
      zones           = ["1", "2"],
      os_disk_size_gb = 64,
      os_disk_type    = "Ephemeral"
    }
    "tbtranspool" : {
      vm_size         = "Standard_DS3_v2",
      node_count      = 12,
      zones           = ["1", "2"],
      os_disk_size_gb = 64,
      os_disk_type    = "Ephemeral"
    }
    "tbjspool" : {
      vm_size         = "Standard_DS2_v2",
      node_count      = 6,
      zones           = ["1", "2"],
      os_disk_size_gb = 64,
      os_disk_type    = "Ephemeral"
    }
  }
}

module "aks" {
  source                               = "../../modules/azure/kubernetes"
  resource_group                       = var.resource_group
  location                             = var.location
  vnet_app_name                        = local.vnet_app_name
  subnet_aks_name                      = var.subnet_aks_name
  aks_name                             = local.aks_name
  aks_namespace_admin_group_object_ids = var.aks_namespace_admin_group_object_ids
  aks_admin_group_object_ids           = var.aks_admin_group_object_ids
  apg_id                               = module.apg.apg_id
  acr_id                               = module.acr.acr_id
  aks_tags                             = var.aks_tags
  aks_default_node_pool                = var.aks_default_node_pool
  aks_node_pool                        = var.aks_node_pool
  vnet_depends_on = [
    module.network.vnet_app_name,
    module.apg.apg_id,
    module.acr.acr_id
  ]
}

For more detail, please check: https://github.com/chienfuchen32/terraform-azure

Further: IaC with a documentation tool
For more detail, please check: Cloud Native Taiwan User Group meetup #65 slide

Ref
https://www.redhat.com/en/topics/automation/what-is-infrastructure-as-code-iac
https://about.gitlab.com/topics/gitops/
https://developer.hashicorp.com/terraform/language/modules
https://developer.hashicorp.com/terraform/language/tests
https://learn.microsoft.com/en-us/azure/devops/service-hooks/services/teams?view=azure-devops
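As an illustration of the notification-channel step above, here is a minimal sketch (not part of the original post) that runs terraform plan and posts a one-line summary to an assumed Teams-style incoming webhook; the webhook URL is hypothetical and only standard-library calls plus the documented terraform plan flags are used.

import json
import subprocess
import urllib.request

# Hypothetical incoming-webhook URL for the team's notification channel.
WEBHOOK_URL = "https://example.webhook.office.com/webhookb2/CHANGE-ME"

# terraform plan -detailed-exitcode: 0 = no changes, 1 = error, 2 = changes present.
result = subprocess.run(
    ["terraform", "plan", "-detailed-exitcode", "-no-color", "-input=false"],
    capture_output=True, text=True,
)
status = {0: "no changes", 1: "plan failed", 2: "changes pending review"}.get(
    result.returncode, f"unexpected exit code {result.returncode}")

# Teams-style incoming webhooks accept a simple {"text": ...} JSON payload.
payload = json.dumps({"text": f"terraform plan: {status}"}).encode()
req = urllib.request.Request(
    WEBHOOK_URL, data=payload, headers={"Content-Type": "application/json"})
urllib.request.urlopen(req, timeout=10)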
Continuous Deployment · Continuous Integration · DevOps · IaC · SRE
264 words
2 minutes
Thingsboard Microservice Architecture
version: v3.5.1

Cloud Infrastructure Environment
- Azure Kubernetes v1.26.3. Tier: Standard
- Load Balancer. sku: Standard
- Azure Application Gateway. Tier: WAF V2 (auto scale, 1-3 instances)
- Azure PostgreSQL Flexible Server. version: 14.8, sku: 1 * D2ds_v4 (2 vCores, 8 GiB RAM, 128 GiB 500 IOPS disk)
- Azure Cache for Redis. sku: 1 * Premium P1 (6 GB cache size)
- Azure Managed Cassandra. version: 4, sku: 9 * D8s_v5 (8 vCPUs, 32 GB RAM, 2 * P30 1024 GiB, 5000 IOPS, 200 MB/sec disk)
Distributed System · High Availability · IoT · Kubernetes · Performance Test · Thingsboard
8413 words
40 minutes
Purpose
TODO

Example from LeetCode 51. N-Queens (problem link)

Implementation
Backtrack through all possible placements, checking whether each new queen sits inside the attack range of a queen already on the board.

from typing import List


class Solution:
    def solveNQueens(self, n: int) -> List[List[str]]:
        # place[i] is the column of the queen in row i; -1 means the row is empty.
        place = [-1 for _ in range(n)]
        row = 0
        place[0] = 0
        ans = []
        while True:
            if self.is_valid(place, n) is True:
                if row == n - 1:
                    # Every row holds a queen: record the board built by is_valid.
                    ans.append([''.join(b for b in self.board[i]) for i in range(n)])
                else:
                    # Go one row deeper and start it at column 0.
                    row += 1
                    place[row] = 0
                    continue
            # Invalid placement (or a solution just recorded): advance the queen in the
            # current row, backtracking through rows that are already exhausted.
            for j in range(row, -1, -1):
                if place[j] == n - 1:
                    place[j] = -1
                    row -= 1
                    if j == 0:
                        # Row 0 is exhausted as well: the search is complete.
                        return ans
                else:
                    place[j] += 1
                    break

    def is_valid(self, place, n) -> bool:
        self.board = [['.' for _ in range(n)] for _ in range(n)]
        for i in range(n):
            if place[i] != -1:
                self.board[i][place[i]] = 'Q'
        # column clashes (rows are distinct by construction)
        for i in range(n):
            for j in range(i + 1, n):
                if place[j] != -1 and place[i] == place[j]:
                    return False
        # diagonal clashes
        for i in range(n):
            if place[i] == -1:
                continue
            for j in range(1, n):
                for o, p in ((i + j, place[i] + j), (i + j, place[i] - j),
                             (i - j, place[i] + j), (i - j, place[i] - j)):
                    if 0 <= o < n and 0 <= p < n and self.board[o][p] == 'Q':
                        return False
        return True
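For comparison, a minimal recursive sketch (not part of the original post) that tracks attacked columns and diagonals in sets, so the "is this square attacked?" check becomes O(1) per placement instead of rebuilding the board:

from typing import List


def solve_n_queens(n: int) -> List[List[str]]:
    ans, place = [], []                       # place[i] = column of the queen in row i
    cols, diag1, diag2 = set(), set(), set()  # attacked columns and diagonals

    def backtrack(row: int) -> None:
        if row == n:
            ans.append(['.' * c + 'Q' + '.' * (n - c - 1) for c in place])
            return
        for c in range(n):
            if c in cols or (row - c) in diag1 or (row + c) in diag2:
                continue                      # attacked by a previously placed queen
            place.append(c)
            cols.add(c); diag1.add(row - c); diag2.add(row + c)
            backtrack(row + 1)
            place.pop()
            cols.discard(c); diag1.discard(row - c); diag2.discard(row + c)

    backtrack(0)
    return ans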
Algorithm · Backtracking · Data Structure
311 words
2 minutes
Purpose
TODO

Example from LeetCode 146. LRU Cache (problem link)

Implementation
TODO: comment

class Node:
    def __init__(self, key, value, prev=None, nxt=None):
        self.key = key
        self.value = value
        self.prev = prev
        self.next = nxt


class DoubleLinkList:
    """Doubly linked list: head holds the least recently used entry, tail the most recent."""

    def __init__(self):
        self.head = None
        self.tail = None
        self.length = 0

    def add_node_to_tail(self, node: Node) -> None:
        self.length += 1
        if self.head is None:
            self.head = node
            return
        if self.tail is None:
            self.tail = node
            self.tail.prev = self.head
            self.head.next = self.tail
            return
        self.tail.next = node
        node.prev = self.tail
        self.tail = node

    def remove_node_from_head(self) -> int:
        # Evict the least recently used node and return its key.
        self.length -= 1
        old_key = self.head.key
        self.head = self.head.next
        if self.head is not None:
            self.head.prev = None
        return old_key

    def remove_node_from_node(self, node) -> None:
        # Unlink an arbitrary node, fixing head/tail pointers when needed.
        self.length -= 1
        if node.prev is not None:
            node.prev.next = node.next
        if node.next is not None:
            node.next.prev = node.prev
        if node == self.tail:
            if self.tail.prev is not None:
                self.tail.prev.next = None
            self.tail = self.tail.prev
        if node == self.head:
            if self.head.next is not None:
                self.head.next.prev = None
            self.head = self.head.next

    def print_all_node(self) -> None:
        ptr = self.head
        while ptr is not None:
            print(ptr.key, ptr.value)
            ptr = ptr.next


class LRUCache:
    def __init__(self, capacity: int):
        self.cache = {}  # {key: Node}
        self.link_list = DoubleLinkList()
        self.capacity = capacity

    def get(self, key: int) -> int:
        if self.cache.get(key, None) is not None:
            # Hit: move the entry to the tail (most recently used position).
            node = self.cache[key]
            value = node.value
            new_node = Node(key, value)
            self.link_list.remove_node_from_node(node)
            self.cache[key] = new_node
            self.link_list.add_node_to_tail(new_node)
            return value
        return -1

    def put(self, key: int, value: int) -> None:
        if self.cache.get(key, None) is not None:
            # Existing key: drop the old node, then re-insert it at the tail below.
            node = self.cache[key]
            del self.cache[key]
            self.link_list.remove_node_from_node(node)
        else:
            if self.link_list.length >= self.capacity:
                # Cache full: evict the least recently used entry from the head.
                head_key = self.link_list.remove_node_from_head()
                del self.cache[head_key]
        new_node = Node(key, value)
        self.cache[key] = new_node
        self.link_list.add_node_to_tail(new_node)
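For comparison, the same behaviour can be sketched with the standard library's collections.OrderedDict, which already keeps entries in recency order (not part of the original post):

from collections import OrderedDict


class LRUCacheOD:
    """LRU cache sketch backed by OrderedDict: most recently used keys sit at the end."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key: int) -> int:
        if key not in self.data:
            return -1
        self.data.move_to_end(key)        # mark as most recently used
        return self.data[key]

    def put(self, key: int, value: int) -> None:
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the least recently used entry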
265 words
2 minutes
How to Repair File System Provided From Kubernetes Ceph Persistent Volume
One day, a hardware failure on the underlying VMs took down part of our self-hosted Kubernetes cluster. The MySQL application deployed in that environment could no longer mount its PV; the pod reported the following error:

MountVolume.SetUp failed for volume "pvc-78cf22fa-d776-43a4-98d7-d594f02ea018" : mount command failed, status: Failure,
reason: failed to mount volume /dev/rbd13 [xfs] to /var/lib/kubelet/plugins/ceph.rook.io/rook-ceph/mounts/pvc-78cf22fa-d776-43a4-98d7-d594f02ea018,
error mount failed: exit status 32
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/plugins/ceph.rook.io/rook-ceph/mounts/pvc-78cf22fa-d776-43a4-98d7-d594f02ea018 --scope -- mount -t xfs -o rw,defaults /dev/rbd13 /var/lib/kubelet/plugins/ceph.rook.io/rook-ceph/mounts/pvc-78cf22fa-d776-43a4-98d7-d594f02ea018
Output: Running scope as unit run-r2594d6c82152421c8891bfa8761e8c05.scope.
mount: mount /dev/rbd13 on /var/lib/kubelet/plugins/ceph.rook.io/rook-ceph/mounts/pvc-78cf22fa-d776-43a4-98d7-d594f02ea018 failed: Structure needs cleaning

Resolution: use the ceph tools to map the PV's image again and try to repair the file system with fsck.
CSI · Ingress · Kubernetes
937 words
5 minutes
Continuous Integration Continuous Deployment
DevOps concept & practice

|-----------|        |---------|  merge  |-------|          |--------------------------|
| local dev |------->| version |-------->| build |--------->|* security check          |
|-----------| commit | control | trigger | agent | pipeline |* build / push artifact   |
     ^               |---------| webhook |-------|   task   |* unit / integration test |
     |                                                      |--------------------------|
     |                                                                         |
|---------------------| analysis |------------| operation |----------------|   |
| release             |<---------|  monitor   |---------->| staging        |<--|
| * next release plan |          | metrics log|           | production env |
| * issues action     |          |------------|           |----------------|
|---------------------|
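As a small illustration of the webhook-triggered step in the flow above, here is a minimal sketch (not from the original post) of a build-agent endpoint that accepts a version-control webhook and kicks off a pipeline task; the payload field names and the run_pipeline() helper are assumptions.

import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def run_pipeline(repo: str, branch: str) -> None:
    # Placeholder for the real pipeline task (security check, build/push artifact, tests).
    print(f"starting pipeline for {repo}@{branch}")


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        event = json.loads(self.rfile.read(length) or b"{}")
        # Field names depend on the version-control system; these keys are assumptions.
        run_pipeline(event.get("repository", "unknown"), event.get("branch", "main"))
        self.send_response(202)  # accepted: a real agent would run the pipeline asynchronously
        self.end_headers()


if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), WebhookHandler).serve_forever()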
Continuous Deployment · Continuous Integration · DevOps
80 words
1 minute
Nginx Ingress Controller K8S Guide
Nginx introduction
Nginx is an asynchronous, event-driven web server that can also be used as a reverse proxy, load balancer, and HTTP cache.

Configuration example

server {
    listen       80;
    listen  [::]:80;
    server_name  localhost;

    listen 443 ssl default_server;
    ssl_certificate /tls/server.crt;
    ssl_certificate_key /tls/server.key;

    #charset koi8-r;
    #access_log  /var/log/nginx/host.access.log  main;

    location / {
        #root   /usr/share/nginx/html;
        #index  index.html index.htm;
        proxy_pass http://192.168.24.100:9090/;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "Upgrade";
        proxy_set_header Host $host;
        proxy_connect_timeout 60s;
        proxy_read_timeout 300s;
        proxy_send_timeout 300s;
    }

    location /socket.io {
        proxy_pass http://192.168.24.100:9091/;
        #Version 1.1 is recommended for use with keepalive connections
        proxy_http_version 1.1;
        #WebSocket
        proxy_set_header Upgrade $http_upgrade;
        #WebSocket
        proxy_set_header Connection $connection_upgrade;
        #WebSocket
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto https;
        proxy_set_header Cookie $http_cookie;
    }

    error_page   500 502 503 504  /50x.html;
    location = /50x.html {
        root   /usr/share/nginx/html;
    }
}

You can run nginx -s reload to make nginx apply the new configuration.
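To sanity-check the proxy_set_header directives above, a throwaway upstream like the following sketch (not part of the original post; the port only matches the proxy_pass target by assumption) can be run behind nginx to show which headers actually arrive:

from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHeadersHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Reply with the headers nginx forwarded (Host, Upgrade, X-Forwarded-For, ...).
        body = "\n".join(f"{k}: {v}" for k, v in self.headers.items()).encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    # Assumed to listen where proxy_pass points, e.g. port 9090 in the config above.
    HTTPServer(("0.0.0.0", 9090), EchoHeadersHandler).serve_forever()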
Ingress · Kubernetes · Reverse Proxy
546 words
3 minutes
Kubernetes Guide for Developer
Environment
Before you use any client tool to connect to Kubernetes, check the endpoint in the “server” field of the “cluster” entry in the credential config file (kubeconfig), and make sure you can reach the kube-apiserver at that endpoint. In the example below the endpoint is “k8s-cluster”:

apiVersion: v1
kind: Config
clusters:
  - name: team-a-admin@kubernetes
    cluster:
      server: 'https://k8s-cluster:8443'
      certificate-authority-data: >-
        xxxxx
users:
  - name: team-a-admin
    user:
      token: >-
        xxxxx
contexts:
  - name: team-a-admin@kubernetes
    context:
      user: team-a-admin
      cluster: team-a-admin@kubernetes
      namespace: team-a
current-context: team-a-admin@kubernetes

This example is based on a self-hosted Kubernetes v1.18 environment; your machine might not be able to resolve the target hostname, …
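As a quick connectivity check, here is a sketch using the official kubernetes Python client (assuming the package is installed and the kubeconfig above is saved locally; the file path is hypothetical, the context name and namespace come from the example):

from kubernetes import client, config

# Load the kubeconfig shown above and select its context (path/name are assumptions).
config.load_kube_config(config_file="./team-a-kubeconfig.yaml",
                        context="team-a-admin@kubernetes")

v1 = client.CoreV1Api()
# Listing pods in the context's namespace confirms the "server" endpoint is reachable
# and the token is accepted by the kube-apiserver.
for pod in v1.list_namespaced_pod(namespace="team-a").items:
    print(pod.metadata.name, pod.status.phase)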
497 words
3 minutes
Disk encryption: LUKS (Linux Unified Key Setup) with Tang
Key components
- tang server: helps dracut decrypt the target disk. It does not store any client keys.
- encrypted server: requires clevis and dracut, which provide an easier way to integrate with the tang server to decrypt the LUKS disk.

Network topology

|-------------------------|                       |------------|
|LUKS encrypted server    |-- disk decryption --> |tang server |
|(clevis, dracut) [env]   |<----- response ------ |[tang]      |
|-------------------------|                       |____________|

Tang server
Software installation via apt on x86_64 Ubuntu 20.04

adm@tang:~$ sudo apt-get install tang -y
## check version
adm@tang:~$ apt list --installed | grep tang
tang/focal,now 7-1build1 amd64 [installed]
## Enable the tangd service
adm@tang:~$ sudo systemctl enable tangd.socket

Create an override file with port 7500 to prevent a port conflict

adm@tang:~$ sudo systemctl edit tangd.socket
# tangd.socket
[Socket]
ListenStream=
ListenStream=7500

adm@tang:~$ sudo systemctl daemon-reload
## Check that your configuration is working:
adm@tang:~$ sudo systemctl show tangd.socket -p Listen
Listen=[::]:7500 (Stream)
## Start the tangd service
adm@tang:~$ sudo systemctl restart tangd.socket
adm@tang:~$ sudo systemctl status tangd.socket
● tangd.socket - Tang Server socket
     Loaded: loaded (/lib/systemd/system/tangd.socket; enabled; vendor preset: enabled)
    Drop-In: /etc/systemd/system/tangd.socket.d
             └─override.conf
     Active: active (listening) since Mon 2022-03-14 00:54:03 UTC; 1h 25min ago
   Triggers: ● tangd@0.service
     Listen: [::]:7500 (Stream)
   Accepted: 0; Connected: 0;
      Tasks: 0 (limit: 984)
     Memory: 44.0K
     CGroup: /system.slice/tangd.socket

Mar 14 00:54:03 d systemd[1]: Listening on Tang Server socket.

encrypted server: try clevis, luks to bind with tang
Assume the tang server is now running on 192.168.100.10:7500; we need to run clevis to bind the local encrypted disk (/dev/md0 in this case) with tang.
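Before binding, a quick reachability check of the tang advertisement endpoint can be done from the client side; a sketch using only the standard library and the address from the example above (not part of the original post):

import json
import urllib.request

# Tang publishes its signed key advertisement (a JWS) over plain HTTP at /adv.
TANG_URL = "http://192.168.100.10:7500/adv"  # address/port taken from the example above

with urllib.request.urlopen(TANG_URL, timeout=5) as resp:
    adv = json.load(resp)

# A valid advertisement is JSON containing a JWS payload plus signature data.
print("tang reachable, advertisement fields:", sorted(adv.keys()))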
Encryption · Linux Unified Key Setup
1782 words
9 minutes
Load balancer & VIP
HAProxy & Keepalived

Init HA control plane & worker node

sudo kubeadm init --control-plane-endpoint "LOAD_BALANCER_DNS:LOAD_BALANCER_PORT" --upload-certs
...
...
You can now join any number of control-plane node by running the following command on each as a root:

  kubeadm join 192.168.0.200:6443 --token 9vr73a.a8uxyaju799qwdjv --discovery-token-ca-cert-hash sha256:7c2e69131a36ae2a042a339b33381c6d0d43887e2de83720eff5359e26aec866 --control-plane --certificate-key f8902e114ef118304e561c3ecd4d0b543adc226b7a07f675f56564185ffe0c07

Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use kubeadm init phase upload-certs to reload certs afterward.

Then you can join any number of worker nodes by running the following on each as root:

  kubeadm join 192.168.0.200:6443 --token 9vr73a.a8uxyaju799qwdjv --discovery-token-ca-cert-hash sha256:7c2e69131a36ae2a042a339b33381c6d0d43887e2de83720eff5359e26aec866

Networking with Weave Net

$ kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"

https://www.weave.works/docs/net/latest/overview/
https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/high-availability/

Ingress controller with Nginx

$ kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.3.1/deploy/static/provider/cloud/deploy.yaml

https://kubernetes.github.io/ingress-nginx/

Storage Ceph

$ git clone --single-branch --branch v1.10.1 https://github.com/rook/rook.git
$ cd rook/deploy/examples
$ kubectl create -f crds.yaml -f common.yaml -f operator.yaml
$ kubectl create -f cluster.yaml

https://rook.io/docs/rook/v1.10/Getting-Started/intro/

Monitoring Prometheus, Grafana

$ helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
$ helm repo update
$ helm install -n [NAMESPACE] [RELEASE_NAME] prometheus-community/kube-prometheus-stack

https://prometheus.io
https://grafana.com/oss/grafana/

Log Loki

$ helm repo add grafana https://grafana.github.io/helm-charts
$ helm repo update
$ helm upgrade --install --namespace=[NAMESPACE] [RELEASE_NAME] grafana/loki-stack

https://grafana.com/oss/loki/
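A small sketch (not from the original post) to confirm that the HAProxy/Keepalived VIP is actually fronting a healthy kube-apiserver: it queries /healthz, which kubeadm clusters serve to anonymous clients by default. The endpoint is the same placeholder used with kubeadm init above, and TLS verification is skipped only because this demo client does not trust the apiserver certificate.

import ssl
import urllib.request

# Placeholder: the load-balancer endpoint used with kubeadm init above.
ENDPOINT = "https://LOAD_BALANCER_DNS:LOAD_BALANCER_PORT/healthz"

# /healthz (like /readyz and /livez) needs no token on a default kubeadm setup,
# so this is a basic liveness probe of the control plane behind the VIP.
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE  # demo only: do not skip verification in production

with urllib.request.urlopen(ENDPOINT, context=ctx, timeout=5) as resp:
    print(resp.status, resp.read().decode())  # expect: 200 ok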
CNI · CSI · Kubernetes
209 words
1 minute