Kubernetes
- 好文共賞 - How Kubernetes And Kafka Will Get You Fired - MyApollo
- CNCF Certified Kubernetes Administrator (CKA) 證照心得 - Jasper Sui | Home
- 從題目中學習k8s :: 第 12 屆 iThome 鐵人賽
- Head-first k8s
- Kubernetes 簡介 - Huan-Lin 學習筆記
- Ivan on Containers, Kubernetes, and Backend Development
- 改進容器化部署 - 成功導入 K8S 的經驗與挑戰.pdf - Google 雲端硬碟
- GitHub - Mozart4242/kubernetes-real-world: This is your kubernetes dream that WILL come true, a Real-World Kubernetes cluster with needed production enterprise services.
- 解決K8s網路定址缺陷 Antrea Egress從頭學(一) | 網管人
- Kubernetes 部署在虛機好還是裸機好? - 魂系架構 Phil's Workspace
- Kubernetes 概觀 - iT 邦幫忙
- Automate All the Boring Kubernetes Operations with Python | Martin Heinz | Personal Website & Blog
- 从 Helm 到 Operator:Kubernetes应用管理的进化 | crossoverJie's Blog
- 如何 Debug Kubernetes Pod - Yowko's Notes
- The Containerization Tech Stack | Medium
- Day 20- Kubernetes 中的命名空間與資源隔離 - iT 邦幫忙
- k0sproject/k0s: k0s - The Zero Friction Kubernetes
- How to run Slurm workloads on OpenShift with Slinky operator | Red Hat Developer
- Kubectl Cheatsheet - Collabnix
- Kubetools - A Curated List of Kubernetes Tools | kubetools
- Test
- log
- Dashboard/UI
- VMWare Tanzu
- monitor
- HPC
- Orchestrating Kubernetes Clusters on HPC Infrastructure - Elia Oggian - YouTube
- The Pros and Cons of Kubernetes for HPC - HPCwire
- The Convergence of HPC, AI and Cloud
- Kubernetes and Batch
- Is there anything like SLURM for k8s/eks (machine learning GPU workloads) : r/kubernetes
- Introducing SUNK: A Slurm on Kubernetes Implementation for HPC and Large Scale AI — CoreWeave
- Troubleshooting
Under the hood
Getting Pods to run on a Node => kubelet plus the container runtime engine
Forwarding traffic between Services and Pods => kube-proxy
The kubelet receives instructions and places Pods onto the Node; the container runtime engine starts the containers inside each Pod; kube-proxy forwards traffic to the Pods
Control plane components: kube-apiserver, etcd, kube-controller-manager, kube-scheduler
What happens when a user tells the api-server (via CLI or API) to create a Pod?
1. The api-server authenticates the user and confirms they are authorized to create Pods.
2. The api-server writes the new Pod's information into etcd.
3. The api-server notifies the scheduler to find a suitable Node for the new Pod.
4. Once a suitable Node is found, the scheduler records it in etcd through the api-server.
5. Up to this point, the Pod's spec and its placement exist only in etcd; nothing is actually running yet. The kubelet learns via the api-server that a Pod needs to be created, pulls the required images, calls the Container Runtime Engine to create the containers, and calls the CNI plugin to assign the Pod an IP.
6. Whether or not the Pod starts successfully, its status is reported back to the api-server and written to etcd
CRI (Container Runtime Interface), CNI (Container Network Interface), CSI (Container Storage Interface)
Pod ReplicaSet Deployment StatefulSet
Service
Namespace
Static pods: /etc/kubernetes/manifests/
kubectl api-resources --namespaced=true
Service DNS name: {service-name}.{namespace}.svc.cluster.local
NodePort range: 30000~32767
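The Service DNS name and NodePort range above can be seen together in a minimal Service manifest (name, namespace, and selector below are hypothetical):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-svc
  namespace: demo
spec:
  type: NodePort
  selector:
    app: web
  ports:
    - port: 80          # cluster-internal port; reachable as web-svc.demo.svc.cluster.local
      targetPort: 8080  # port on the backing Pods
      nodePort: 30080   # must fall inside the 30000~32767 range
```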
Awesome Ref
- :star:從Software Developer的角度一起認識 Kubernetes :: 2023 iThome 鐵人賽
- :star:從Software Developer的角度一起認識 Kubernetes (二) :: 2024 iThome 鐵人賽
- Albert Weng for K8S
- :star:關於我怎麼把一年內學到的新手 IT/SRE 濃縮到 30 天筆記這檔事 :: 2022 iThome 鐵人賽
- Kubernetes 基礎教學(一)原理介紹 | Cheng-Wei Hu
- Kubernetes 基礎教學(二)實作範例:Pod、Service、Deployment、Ingress | Cheng-Wei Hu
- Kubernetes 基礎教學(三)Helm 介紹與建立 Chart | Cheng-Wei Hu
- cookbooks/kubernetes at master · thedatabaseme/cookbooks
- CKA
- :star:入門 Kubernetes 到考取 CKA 證照 :: 2024 iThome 鐵人賽
- CKA 2025 實戰經驗分享:DevOps日常的系統化學習之路 - iT 邦幫忙
- https://killer.sh/dashboard
- killercoda 的 CKA 專區
- Testing Environment Requirements
- 2025 questions explained - DumbITGuy - YouTube
- theplatformlab/CKA-Certified-Kubernetes-Administrator: CKA Certification Exam Guide 2026
kubectl
kubectl get pod
kubectl get pod busybox -o jsonpath='{.status.podIP}'
kubectl get pod busybox --show-labels
kubectl get pod busybox -o wide
kubectl get pod busybox -o yaml
kubectl get pod -l <key1>=<value1>,<key2>=<value2>,...
kubectl get pod -l app=nginx
kubectl get pod --all-namespaces
kubectl describe pod <pod-name>
kubectl logs <pod-name>
kubectl logs -f <pod-name>
kubectl logs <pod-name> -c <container-name>
kubectl logs svc/<service-name>
kubectl delete pod <pod-name> <pod-name> ...
kubectl delete pod <pod-name> --force --grace-period=0
kubectl run <pod-name> --image <image> --command -- <command> <arg1> <arg2> ... <argN>
kubectl run busybox --image busybox --command -- sleep 300
kubectl exec <pod-name> -- <command> <arg1> <arg2> ... <argN>
kubectl exec busybox -- echo hello
kubectl exec -it <pod-name> -- /bin/bash
kubectl set image pod <pod-name> <container-name>=<new-image>
kubectl set image pod webapp web=nginx:1.15
kubectl set image deployment nginx-deploy nginx=nginx:1.15
kubectl label <object-type> <object-name> <key>=<value>
kubectl scale deploy nginx-deploy --replicas 5
kubectl cp <local-file-path> <pod-name>:<pod-file-path>
kubectl cp <pod-name>:<pod-file-path> <local-file-path>
kubectl expose <pod|deploy> <name> --type=<service-type> --name=<service-name> --port=<service-port> --target-port=<target-port>
kubectl expose deploy nginx-deploy --type=NodePort --name=nginx-deploy-svc --port=80 --target-port=80
kubectl create service nodeport my-service --tcp=80:80 --node-port=30010
kubectl port-forward <type>/<name> <local-port>:<target-port>
kubectl rollout history deploy nginx
kubectl rollout history deploy nginx --revision=3
kubectl annotate deployment/nginx kubernetes.io/change-cause="Rollout to first revision"
kubectl rollout undo deployment/nginx
kubectl rollout undo deployment/nginx --to-revision=3
kubectl rollout restart deployment/nginx
Network
- ClusterIP vs NodePort vs LoadBalancer: Key Differences & When to Use
- In Kubernetes, the ClusterIP Service is used for Pod-to-Pod communication within the same cluster.
- This means that a client running outside of the cluster, such as a user accessing an application over the internet, cannot directly access a ClusterIP Service.
- The NodePort Service provides a way to expose your application to external clients.
- An external client is anyone who is trying to access your application from outside of the Kubernetes cluster.
- The NodePort Service does this by opening the port you choose (in the range of 30000 to 32767) on all worker nodes in the cluster.
- One disadvantage of the NodePort Service is that it doesn't do any kind of load balancing across multiple nodes.
- 理解 K8s 如何處理不同類型 service 的封包來源 IP - HackMD
- How does Ingress Nginx Controller work? - HackMD
- Ingress Controller is a reverse proxy + HTTP router that runs inside Kubernetes.
- It allows you to expose multiple HTTP/HTTPS applications using a single external IP, based on the hostname (app1.example.com, app2.example.com) and URL path.
- Host-based routing (different domains)
- Path-based routing (/api, /web)
- SSL/TLS termination (HTTPS)
- Load balancing for HTTP traffic
- Features like rate limiting, authentication, rewrites, etc.
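A minimal Ingress sketch combining the features above (hostnames, backend service names, and the TLS secret are hypothetical; assumes an nginx ingress class is installed):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress
spec:
  ingressClassName: nginx
  tls:
    - hosts: [app1.example.com]
      secretName: app1-tls       # SSL/TLS terminated at the controller
  rules:
    - host: app1.example.com     # host-based routing
      http:
        paths:
          - path: /api           # path-based routing
            pathType: Prefix
            backend:
              service:
                name: api-svc
                port:
                  number: 80
```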
simulate the air-gapped env - K8s can't live without a default route · Issue #123120 · kubernetes/kubernetes
CPU limit
Volume, PV, PVC, StorageClass
- Day 13 -【Storage】:Volume 的三種基本應用 --- emptyDir、hostPath、configMap & secret
- emptyDir is just an empty directory; when the Pod is deleted, the data in the emptyDir is deleted with it. Its purpose is not to persist data but to let multiple containers in a Pod share data
- "Using emptyDir to share data between containers in a Pod" is a common pattern
- hostPath mounts a directory on the Node into the Pod for containers to access
- The "host" here refers to the Node the Pod runs on
- Note that the specified hostPath does not necessarily exist on every Node; if the scheduler places the Pod on another Node, the data becomes unreadable.
- For that reason, hostPath is usually only used for testing in single-node clusters
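The "share data between containers via emptyDir" pattern can be sketched as a Pod manifest (image and commands are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-data-demo
spec:
  containers:
    - name: writer
      image: busybox
      command: ["sh", "-c", "echo hello > /data/msg && sleep 3600"]
      volumeMounts:
        - name: shared
          mountPath: /data    # writer's view of the shared dir
    - name: reader
      image: busybox
      command: ["sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: shared
          mountPath: /data    # reader sees the same files
  volumes:
    - name: shared
      emptyDir: {}            # lives only as long as the Pod
```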
- Day 14 -【Storage】: PV、PVC & StorageClass
- When a developer's Pod needs storage, they only state the requirement (e.g. "how many GiB"), and the actual storage is left to the k8s administrator. This lightens the developer's load and decouples the storage lifecycle from the Pod lifecycle
- Persistent Volume (PV): storage with an independent lifecycle.
- Persistent Volume Claim (PVC): a request for storage.
- The K8s administrator's job is to supply a matching PV whenever a PVC is created. Supplying PVs is called "provisioning" and comes in two modes
- Static: the administrator defines and creates PVs by hand, e.g. via yaml
- Dynamic: the administrator configures a "StorageClass", which then creates PVs automatically
- After a PVC is created, K8s automatically looks for a PV that satisfies it; if no matching PV exists, the PVC stays Pending until one is created
- PV and PVC bind one-to-one; once a PV is bound to a PVC, it cannot be used by any other PVC for the duration of the binding
- hostPath - HostPath volume (for single node testing only; WILL NOT WORK in a multi-node cluster; consider using local volume instead)
- The full effect of a StorageClass: users only create PVCs, and the StorageClass handles creating and deleting the PVs
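The static-provisioning flow above, sketched as a hand-written PV plus a PVC that binds to it (names, path, and capacity are illustrative):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-demo
spec:
  capacity:
    storage: 1Gi
  accessModes: [ReadWriteOnce]
  hostPath:                 # single-node testing only
    path: /mnt/data
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-demo
spec:
  accessModes: [ReadWriteOnce]
  resources:
    requests:
      storage: 1Gi          # K8s binds this claim to a matching PV (pv-demo above)
```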
- Day26 了解 K8S 的 Volumes & StorageClass
NFS provisioner
- Kubernetes-note/on-prem/nfs.md at master · michaelchen1225/Kubernetes-note
helm repo add nfs-subdir-external-provisioner https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner/
helm install nfs-subdir-external-provisioner nfs-subdir-external-provisioner/nfs-subdir-external-provisioner --set nfs.server=192.168.122.222 --set nfs.path=/srv/nfs-share --set storageClass.name=nfs-storage
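Once the provisioner is installed, dynamic provisioning is just a PVC referencing the nfs-storage class configured above (claim name and size are illustrative; ReadWriteMany is the typical access mode for NFS):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-pvc-demo
spec:
  storageClassName: nfs-storage   # matches storageClass.name from the helm install
  accessModes: [ReadWriteMany]
  resources:
    requests:
      storage: 2Gi                # the provisioner creates the backing PV automatically
```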
cloud-native storage orchestrator
GPU
Horizontal Pod Autoscaling(HPA)
Prerequisite: Metrics Server
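With Metrics Server installed, a minimal HPA manifest (target Deployment name and thresholds below are illustrative) looks like:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50   # scale out when average CPU exceeds 50%
```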
Real-world requirements (job posting example)
- DevOps Taiwan | 邱牛徵幫手,真正的 k8s 高手可以考慮去台積挑戰 | Facebook
- Build and manage a platform providing Container/VM compute services for all of IT; the scope runs from Day 1 to Day 2 and covers everything, including a regular on-call rotation to keep the system stable
- Fix the flaws of the existing compute platform and harden its stability to support 7x24 operations
- Work closely with each product's SRE team to build stable, operable services so the SREs can better run their own applications
- Pick a specialty from Bare Metal, OS, Container, K8S, Network, Storage, Compute, Security, IaC, ServiceMesh, BGP, KVM, etc. and become the team's expert in it; an expert must be able to read the Open Source Code and explain how it works underneath
- Design and drive Capacity Planning and performance tuning to support stable system growth
- Drive automation and engineering discipline to reduce manual work; every operation should be automated, manual operations are forbidden
- Establish a standardized Incident Response process and run Blameless Postmortems for continuous improvement.
- Manage the largest on-prem compute platform in Taiwan (Container, VM); be able to tackle extreme-scale challenges and find possible solutions in OSS.
- Use your own skills and experience to lead the team and grow together
Requirements
1. 8+ years of experience, strong programming ability, and a solid development and testing mindset; the goal is to become the laziest engineer
2. Deep understanding of how Kubernetes works and how to use it; experience managing Kubernetes Clusters
3. Willing to understand the architecture and current state of TSMC's on-prem environment, and to accept the constraints of manufacturing rather than trying to remake the world overnight
4. Familiar with Linux; able to do deep performance analysis and troubleshooting on Linux
5. Very strong, deep experience in at least one of Storage/Network/IaC/Observability/Security
Nice to Have
1. Multi-cluster / Multi-region architecture experience
2. Kubernetes expertise strong enough to explain things I don't know
3. Experience testing large-scale architectures; able to design and build complex integration tests
4. Experience introducing or championing AI tools in an organization (e.g. Claude, GitHub Copilot, Codex)
5. Familiar with AI Agents / Agentic Workflows / Skills and able to apply them to SRE / Platform Engineering scenarios
6. Experience designing AI-assisted work and operations processes
7. Team leadership experience, from project management to people management; able to lead a team and grow together
RBAC
- Using RBAC Authorization | Kubernetes
- Kubernetes RBAC Overview:賦予安全與彈性的管理|方格子 vocus
- Day 19 - 老闆!我可以做什麼:RBAC - iT 邦幫忙
- 【從題目中學習k8s】-【Day20】第十二題 - RBAC - iT 邦幫忙
- How to bind a Kubernetes Role to a user or group | LabEx
- Limiting access to Kubernetes resources with RBAC
A Role defines a role scoped to a single namespace, while a ClusterRole is a cluster-wide role. Besides everything a Role can grant, a ClusterRole can also grant access to
- cluster-scoped resources, e.g. Nodes
- non-resource endpoints, e.g. /healthz
- resources across all namespaces, e.g. kubectl get pods --all-namespaces
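The namespace-scoped side can be sketched as a Role plus a RoleBinding (namespace, user, and object names below are hypothetical):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: dev            # only grants access inside this namespace
rules:
  - apiGroups: [""]         # "" = core API group (pods, services, ...)
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: dev
subjects:
  - kind: User
    name: jane              # hypothetical user
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role                # bind a ClusterRole here to reuse cluster-wide definitions
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```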
Resource Quota
Deployment
- Deployment tools for local and production
- Air-Gapped Environments
- Method 1: kubespray/docs/operations/offline-environment.md at master · kubernetes-sigs/kubespray
- Method 2: kubespray-offline/kubespray-offline: kubespray offline support scripts
- Method 3: GmrezaH/airgapped-kubespray: This repository provides documentation and scripts to help you install a production-ready Kubernetes cluster using Kubespray in air-gapped (offline) environments.
Use KVM + Cockpit + kubespray or kubeadm
KVM
# KVM
sudo apt-get -y install bridge-utils cpu-checker libvirt-clients libvirt-daemon libvirt-daemon-system qemu qemu-kvm
sudo kvm-ok
lsmod | grep kvm
modinfo kvm
modinfo kvm_amd
systemctl status libvirtd
Cockpit
# Web Console for VM
sudo apt-get install cockpit cockpit-machines
sudo systemctl start cockpit
sudo systemctl status cockpit
sudo ufw allow 9090/tcp
sudo ufw reload
Launch the Ubuntu Guest OS
ssh-keygen -t ed25519 -C "KVM VM Instance" -f ~/.ssh/kvm-vm-instance
# Launch the Ubuntu Guest OS
wget -O /var/lib/libvirt/images/jammy-server-cloudimg-amd64.img https://cloud-images.ubuntu.com/jammy/current/jammy-server-cloudimg-amd64.img
cat <<EOF >> /root/meta-data
# meta-data
instance-id: k8s-ubuntu-vm
local-hostname: k8s-ubuntu-vm
EOF
# ubuntu/foo@123
# note: the shell would interpret the $ characters in the hashed password as variable expansions, so quote the heredoc delimiter: <<'EOF'
cat <<'EOF' >> /root/user-data
#cloud-config
ssh_pwauth: true
users:
- name: root
lock_passwd: false
hashed_passwd: $6$P2LlhzIYWXC4XyCa$JzSWM6UBQ3BNLtQXO2jKTIhkBQyNl8DuhJ6tx8kovCtiick0mXMWE6z12HGBhTzAPW0EpDglWn7j.W9XZoaBl0
- name: ubuntu
lock_passwd: false
hashed_passwd: $6$P2LlhzIYWXC4XyCa$JzSWM6UBQ3BNLtQXO2jKTIhkBQyNl8DuhJ6tx8kovCtiick0mXMWE6z12HGBhTzAPW0EpDglWn7j.W9XZoaBl0
shell: /bin/bash
sudo: ALL=(ALL) NOPASSWD:ALL
#ssh_authorized_keys:
# - ssh-ed25519 AAAAC3NzaC1lZDI1NT... your-public-key-here
EOF
qemu-img create -F qcow2 -b /var/lib/libvirt/images/jammy-server-cloudimg-amd64.img -f qcow2 /var/lib/libvirt/images/k8s-ubuntu-vm-worker-2.img 30G
virt-install --name k8s-worker-2 --ram 4096 --vcpus 4 --import --disk path=/var/lib/libvirt/images/k8s-ubuntu-vm-worker-2.img,format=qcow2 --cloud-init root-password-generate=on,disable=on,meta-data=/root/meta-data,user-data=/root/user-data --os-variant ubuntu22.04 --network bridge=virbr0 --graphics vnc,listen=0.0.0.0 --console pty,target_type=serial --noautoconsole
(optional): if bridge interface br0 is configured in this environment
# note: give this VM its own disk first, e.g.
# qemu-img create -F qcow2 -b /var/lib/libvirt/images/jammy-server-cloudimg-amd64.img -f qcow2 /var/lib/libvirt/images/k8s-ubuntu-vm-control-1.img 30G
virt-install --name k8s-control-1 --ram 4096 --vcpus 4 --import --disk path=/var/lib/libvirt/images/k8s-ubuntu-vm-control-1.img,format=qcow2 --cloud-init root-password-generate=on,disable=on,meta-data=/root/meta-data,user-data=/root/user-data --os-variant ubuntu22.04 --network bridge=br0 --network bridge=virbr0 --graphics vnc,listen=0.0.0.0 --console pty,target_type=serial --noautoconsole
(optional): prepare key access
# prepare key access
ssh-keygen -t ed25519 -C "KVM VM Instance" -f ~/.ssh/kvm-vm-instance
ssh-copy-id -i ~/.ssh/kvm-vm-instance.pub ubuntu@192.168.122.131
ssh-copy-id -i ~/.ssh/kvm-vm-instance.pub ubuntu@192.168.122.200
ssh-copy-id -i ~/.ssh/kvm-vm-instance.pub ubuntu@192.168.122.188
# test
ssh -i ~/.ssh/kvm-vm-instance ubuntu@192.168.122.188
(scenario 1) kubeadm
- Day 03 -【Basic Concept】:建立 Kubeadm Cluster + Bonus Tips
- 標準配置 Kubernetes (K8s) 叢集安裝筆記 - Ubuntu 篇 - 清新下午茶
- 當 Kubernetes (K8s) 遇到 GPU 詳細裝機筆記 - Redhat 篇 - 清新下午茶
(optional) to use a legacy k8s version, e.g. 1.25.x
echo 'deb [trusted=yes] https://pkgs.k8s.io/core:/stable:/v1.25/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list
(scenario 2) kubespray
- Day 06 使用 Kubespray 建立自己的 K8S(一) - iT 邦幫忙
- Day 07 使用 Kubespray 建立自己的 K8S(二) - iT 邦幫忙
apt install -y python3-venv
git clone --depth 1 --branch v2.28.0 https://github.com/kubernetes-sigs/kubespray.git
# (optional) for legacy kubernetes which kubespray supports
# git clone --depth 1 --branch v2.23.2 https://github.com/kubernetes-sigs/kubespray.git kubespray-2.23
python3 -m venv kubespray-venv
source kubespray-venv/bin/activate
cd kubespray
pip install -U -r requirements.txt
cp -rfp inventory/sample inventory/mycluster
kubespray/inventory/mycluster/inventory.ini
[kube_control_plane]
k8s-ubuntu-vm-control-1 ansible_host=192.168.122.188 ansible_user=ubuntu
[etcd:children]
kube_control_plane
[kube_node]
k8s-ubuntu-vm-worker-1 ansible_host=192.168.122.200 ansible_user=ubuntu
k8s-ubuntu-vm-worker-2 ansible_host=192.168.122.131 ansible_user=ubuntu
(optional) if the certificate needs a Subject Alternative Name (SAN) for an FQDN or multiple NIC interfaces, edit kubespray/inventory/mycluster/group_vars/k8s_cluster/k8s-cluster.yml
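A sketch of the relevant k8s-cluster.yml setting: `supplementary_addresses_in_ssl_keys` adds extra SANs to the API-server certificate (the IP and FQDN below are placeholders):

```yaml
# kubespray/inventory/mycluster/group_vars/k8s_cluster/k8s-cluster.yml
supplementary_addresses_in_ssl_keys:
  - 10.0.0.10              # secondary NIC address (placeholder)
  - k8s-api.example.lab    # FQDN clients will use (placeholder)
```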
Use the kubespray to manage the K8S cluster
# deploy
ansible-playbook -i inventory/mycluster/inventory.ini --private-key=~/.ssh/kvm-vm-instance --become --become-user=root cluster.yml
# (optional) deploy a specific version, e.g. 1.25.6
# kubespray v2.23.2 is required here, because v2.24.0 dropped support for Kubernetes 1.25.x
ansible-playbook -i inventory/mycluster/inventory.ini --private-key=~/.ssh/kvm-vm-instance --become --become-user=root cluster.yml -e kube_version=v1.25.6
# add the nodes
ansible-playbook -i inventory/mycluster/inventory.ini --private-key=~/.ssh/kvm-vm-instance --become --become-user=root scale.yml
# remove the nodes
ansible-playbook -i inventory/mycluster/inventory.ini --private-key=~/.ssh/kvm-vm-instance --become --become-user=root remove-node.yml --extra-vars "node=k8s-ubuntu-vm-worker-3"
# install add-on apps such as the dashboard and helm
ansible-playbook -i inventory/mycluster/inventory.ini --private-key=~/.ssh/kvm-vm-instance --become --become-user=root cluster.yml --tags=apps
Dashboard
- The Kubernetes Dashboard project has been archived and is no longer actively maintained. For new installations, consider using Headlamp.
Get the application URL (http://{IP}:8080) by running these commands - Kubernetes port forwarding: cleaning up orphaned ports | Tech & Code with Kris
export POD_NAME=$(kubectl get pods --namespace kube-system -l "app.kubernetes.io/name=headlamp,app.kubernetes.io/instance=my-headlamp" -o jsonpath="{.items[0].metadata.name}")
export CONTAINER_PORT=$(kubectl get pod --namespace kube-system $POD_NAME -o jsonpath="{.spec.containers[0].ports[0].containerPort}")
kubectl --namespace kube-system port-forward $POD_NAME 8080:$CONTAINER_PORT --address 0.0.0.0
Get the token using
# tokens expire after 1 hour by default
kubectl create token my-headlamp --namespace kube-system
# expires after 8 hours
kubectl create token my-headlamp --namespace kube-system --duration 8h
Dashboard with a service account and NodePort. In newer Kubernetes versions (1.24+), secrets for service accounts aren't created automatically.
kubectl create serviceaccount dashboard-admin-sa
kubectl create clusterrolebinding dashboard-admin-sa --clusterrole=cluster-admin --serviceaccount=default:dashboard-admin-sa
kubectl create token dashboard-admin-sa
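Since `kubectl create token` issues short-lived tokens, a long-lived token for the dashboard-admin-sa account above can be minted by hand-creating a service-account token Secret (the Secret name is arbitrary):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: dashboard-admin-sa-token
  namespace: default
  annotations:
    kubernetes.io/service-account.name: dashboard-admin-sa   # must match the ServiceAccount
type: kubernetes.io/service-account-token                    # controller fills in the token
```

Read it back with: kubectl get secret dashboard-admin-sa-token -o jsonpath='{.data.token}' | base64 -d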
Dashboard quick test with login skipped (insecure; for testing only)
containers:
- args:
- --namespace=kube-system
- --auto-generate-certificates
- --enable-skip-login
image: docker.io/kubernetesui/dashboard:v2.7.0
kubectl proxy --address='0.0.0.0' --accept-hosts='.*'
# http://172.19.30.115:8001/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/#/login
management tools
kubectl
# version v1.32.5
curl -LO "https://dl.k8s.io/release/v1.32.5/bin/linux/amd64/kubectl"
curl -LO "https://dl.k8s.io/v1.32.5/bin/linux/amd64/kubectl.sha256"
echo "$(cat kubectl.sha256) kubectl" | sha256sum --check
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
source <(kubectl completion bash) # set up autocomplete in bash into the current shell, bash-completion package should be installed first.
echo "source <(kubectl completion bash)" >> ~/.bashrc # add autocomplete permanently to your bash shell.
echo 'alias k=kubectl' >> ~/.bashrc
echo 'complete -o default -F __start_kubectl k' >> ~/.bashrc
source ~/.bashrc
helm
# Method 1
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
# Method 2
curl -LO https://mirror.openshift.com/pub/openshift-v4/clients/helm/3.18.4/helm-linux-amd64
sudo install -o root -g root -m 0755 helm-linux-amd64 /usr/local/bin/helm
minikube
kind
Rancher
- Day 2 - 何謂 Rancher | hwchiu learning note
- How to use Rancher in Kubernetes
- (55) [Rancher] 用網頁建立並管理K8s Cluster - YouTube
kubespray
Helm
- Kubernetes 基礎教學(三)Helm 介紹與建立 Chart | Cheng-Wei Hu
- Day20 - 使用 Helm 管理 Kubernetes 的應用佈署 - iT 邦幫忙
- Kubernetes 遇見 Helm charts - iT 邦幫忙
- 可觀測性宇宙的第九天 - Helm 安裝包管理器介紹 - iT 邦幫忙::一起幫忙解決難題,拯救 IT 人的一天
- Day 21- Kubernetes 的套件管理工具 Helm - iT 邦幫忙
- example: Kubernetes Dashboard
- example: mysql
- Day 3 - Helm 介紹 - iT 邦幫忙
- kubernetes - HELM vs K8s Operators - Stack Overflow
- Operator vs. Helm: Finding the Best Fit for Your Kubernetes Applications | Datadog
- [Kubernetes] Helm chart 的匯出匯入 (helm export import) 與離線安裝 (docker offline install) - 清新下午茶
- Helm Smart Resource:讓你的 Chart 學會與既有資源和平共處 | omegaatt
- Helm | Cheat Sheet
- https://artifacthub.io/
- Bitnami
- repository
- https://charts.helm.sh/stable
- https://helm.ngc.nvidia.com/nvidia
- https://prometheus-community.github.io/helm-charts
helm version
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo list
helm install my-nginx bitnami/nginx
# helm upgrade -n poc-scope --install my-nginx bitnami/nginx
helm list -A
helm list -n poc-scope
helm status my-nginx
helm history my-nginx
helm uninstall my-nginx
To override settings in values.yaml, pass your own file with -f to overlay the chart's default values.yaml, or use --set to override individual values
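A minimal sketch of an override file, assuming the bitnami/nginx chart used above exposes `replicaCount` and `service.type` (verify with `helm show values bitnami/nginx` first):

```yaml
# custom-values.yaml (assumed value names for bitnami/nginx)
replicaCount: 2
service:
  type: NodePort
# apply either way:
#   helm install my-nginx bitnami/nginx -f custom-values.yaml
#   helm install my-nginx bitnami/nginx --set replicaCount=2 --set service.type=NodePort
```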
Operator
KubeVirt
- 在 K8S 上也能跑 VM!KubeVirt 簡介與建立(介紹篇) | Gemini Open Cloud 雙子星雲端運算股份有限公司
- Re-Imagining Virtualization with Kubernetes and KubeVirt
Kubeflow
KubeRay
- High Performance Computing (HPC) on Kubernetes
- Day19 - Ray Cluster 安裝之一: 基礎環境準備 - iT 邦幫忙::一起幫忙解決難題,拯救 IT 人的一天
kueue
Yunikorn
Volcano
hkube
MetalLB
- :star:MetalLB and NGINX Ingress // Setup External Access for Kubernetes Applications - YouTube
- kubernetes/clusters/homelab-k8s/apps/metallb-plus-nginx-ingress at main · morrismusumi/kubernetes
- how to debug
- in bare-metal, on-prem, or homelab clusters, Kubernetes does not know how to assign external IPs to services. This is where MetalLB comes in.
- To assign external IPs to Services of type LoadBalancer
- To make your services accessible from outside the cluster using a stable IP address
- To "fake" the cloud load-balancer behavior on bare metal
- Day 16 MetalLB 簡介&安裝 - iT 邦幫忙
- Day 17 MetalLB 使用 - iT 邦幫忙
- 我的 K8S DevOps 實驗環境 - 服務負載均衡器
- MetalLB: K8S 叢集的網路負載平衡器. 今天跟大家分享在地端資料中心內建立Kubernetes叢集之後,如何針對網路進行… | by Albert Weng | Medium
- MetalLB 登場:無縫部署. 延續上篇的內容,在了解了MetalLB的基本概念之後,我們就進入實際上部署的動作… | by Albert Weng | Medium
install by helm
helm repo add metallb https://metallb.github.io/metallb
helm install metallb metallb/metallb -n metallb-system --create-namespace
Configure the IP pool (metallb-ipaddresspool.yaml)
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
name: first-pool # name of the IP pool
namespace: metallb-system # Namespace
spec:
addresses:
- 192.168.122.201-192.168.122.209
Configure L2-mode advertisement (l2-advertisement.yaml)
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
name: example
namespace: metallb-system
spec: # leaving spec empty applies the advertisement to every IPAddressPool
ipAddressPools:
- first-pool # advertise only addresses from first-pool
Done; Services with type: LoadBalancer can now be assigned external IPs
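For example (selector and ports below are hypothetical), MetalLB then assigns the Service an address from first-pool:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-lb
spec:
  type: LoadBalancer   # MetalLB picks an external IP from first-pool (192.168.122.201-209)
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 8080
```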
Nginx Ingress
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update
helm install nginx-ingress ingress-nginx/ingress-nginx --namespace ingress-nginx --create-namespace
Cert Manager
kube-prometheus-stack
- Day 24 Prometheus + Grafana 監控整合工具 kube-prometheus-stack - iT 邦幫忙
- 可觀測性宇宙的第十三天 - Kube-Prometheus-Stack 實戰(一) - iT 邦幫忙
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm search repo prometheus-community
helm show values prometheus-community/kube-prometheus-stack --version 75.12.0 > values.yaml
helm install -f values.yaml prometheus-stack prometheus-community/kube-prometheus-stack -n prometheus-stack --version 75.12.0 --create-namespace
kubectl --namespace prometheus-stack get secrets prometheus-stack-grafana -o jsonpath="{.data.admin-password}" | base64 -d ; echo
# kubectl port-forward service/prometheus-stack-prometheus 9090:9090 -n prometheus-stack
# kubectl port-forward service/prometheus-stack-grafana 3000:80 -n prometheus-stack
loki
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
helm search repo grafana
helm show values grafana/loki-stack --version 2.10.2 > loki-stack-values.yaml
helm install -f loki-stack-values.yaml loki grafana/loki-stack -n loki-stack --version 2.10.2 --create-namespace
harbor
helm repo add harbor https://helm.goharbor.io
helm repo update
helm search repo harbor --versions
helm show values harbor/harbor --version 1.11.4 > harbor-values.yaml
# type, harborAdminPassword
helm install harbor harbor/harbor --version 1.11.4 -f harbor-values.yaml -n harbor-registry --create-namespace
kubectl get secrets harbor-core -n harbor-registry -o jsonpath='{.data.tls\.crt}' | base64 -d > harbor.ca
CKA
- Certification Expiration Policy Change 2024 - Linux Foundation - Education
- IMPORTANT Policy Change: All certification products with a 36-month certification period will change to a 24-month certification period starting for exams taken April 1, 2024, 00:00 UTC.
kubectl run multi-container-pod --image=nginx --dry-run=client -o yaml > sidecar.yaml
kubectl create deployment --image=nginx nginx --replicas=4 --dry-run=client -o yaml > nginx-deployment.yaml
# Create a Service named redis-service of type ClusterIP to expose pod redis on port 6379
# (This will automatically use the pod's labels as selectors)
kubectl expose pod redis --port=6379 --name redis-service --dry-run=client -o yaml
# Create a Service named nginx of type NodePort to expose pod nginx's port 80 on port 30080 on the nodes
kubectl expose pod nginx --type=NodePort --port=80 --name=nginx-service --dry-run=client -o yaml
kubectl get pod <pod-name> -o yaml > pod-dump.yaml
kubectl create clusterrole --help
kubectl api-resources
kubectl run debug --image=curlimages/curl -it --rm -- sh
kubectl expose pod messaging --port=6379 --name messaging-service
kubectl get priorityclasses
kubectl create priorityclass high-priority --value=999 --description="high priority"
kubectl patch deployment -n priority busybox-logger -p '{"spec":{"template":{"spec":{"priorityClassName":"high-priority"}}}}'
Gateway API - certificateRefs - https://gemini.google.com/share/554de4ea8760
HorizontalPodAutoscaler Walkthrough - scaleDown
kubectl autoscale deployment my-app --cpu-percent=50 --min=1 --max=10
kubectl get hpa
kubectl delete hpa my-app
kubectl events --for hpa/my-app | grep -i "ScalingReplicaSet"
Manual scale
- The kubectl scale command can be used to scale both deployments and statefulsets. When scaling a statefulset, Kubernetes ensures that the state and order of the pods are maintained, unlike in deployments where pods can be created and destroyed in any order.
- When you scale a deployment to a higher number of replicas than the cluster can support due to resource constraints, Kubernetes will create as many replicas as possible within the available resources. The remaining replicas will be in a pending state until sufficient resources are freed up or added to the cluster.
helm list
helm list -A
helm get manifest <release_name> -n <namespace>
helm repo list
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update
helm search repo <foo>
helm search repo <foo> --versions
helm upgrade <foo> <bar> --version <version>
helm uninstall <release_name> -n <namespace>
helm lint ./new-version
helm install webpage-server-02 ./new-version
helm template argocd argocd/argo-cd --version 7.7.3 --set crds.install=false --namespace argocd > /root/argo-helm.yaml
keyword for document - CKA-note/附錄/CKA 常用的官方文件.md at main · michaelchen1225/CKA-note
TBD