Kubernetes The Hard Way (kelseyhightower)
Posted by Manish Panchmatia on Sunday, May 24, 2020
Labels: CKA, DevOps, k8s
CFSSL consists of:
- a set of packages useful for building custom TLS PKI tools
- the cfssl program, the canonical command-line utility using the CFSSL packages
- the multirootca program, a certificate authority server that can use multiple signing keys
- the mkbundle program, used to build certificate pool bundles
- the cfssljson program, which takes the JSON output from the cfssl and multirootca programs and writes certificates, keys, CSRs, and bundles to disk
The cfssl command-line tool takes a command specifying the operation to carry out:
sign            sign a certificate
bundle          build a certificate bundle
genkey          generate a private key and a certificate request
gencert         generate a private key and a certificate
serve           start the API server
version         print the current version
selfsign        generate a self-signed certificate
print-defaults  print default configurations
Use cfssl [command] -help to find out more about a command. The version command takes no arguments.
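As a sketch of how these commands fit together when bootstrapping a CA (assuming cfssl and cfssljson are installed, and that the ca-csr.json, ca-config.json, and server-csr.json files exist as described in the TLS section below; the profile name "kubernetes" comes from the config file):

```sh
# Initialize a CA: writes ca.pem, ca-key.pem and ca.csr
cfssl gencert -initca ca-csr.json | cfssljson -bare ca

# Sign a server certificate against that CA using a signing profile
cfssl gencert \
  -ca=ca.pem \
  -ca-key=ca-key.pem \
  -config=ca-config.json \
  -profile=kubernetes \
  server-csr.json | cfssljson -bare server
```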
Networking
gcloud compute networks : kubernetes-the-hard-way
gcloud compute networks subnets : 10.240.0.0/24
gcloud compute firewall-rules
1. tcp, udp, icmp : source-ranges 10.240.0.0/24,10.200.0.0/16
2. tcp:22,tcp:6443,icmp : source-ranges 0.0.0.0/0
gcloud compute firewall-rules list
Now create a public IP address:
gcloud compute addresses
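Expanded into full commands, the networking setup might look like this (flag spellings and resource names follow the tutorial's use of the gcloud CLI):

```sh
gcloud compute networks create kubernetes-the-hard-way --subnet-mode custom

gcloud compute networks subnets create kubernetes \
  --network kubernetes-the-hard-way \
  --range 10.240.0.0/24

# 1. internal: tcp, udp, icmp from the node and pod ranges
gcloud compute firewall-rules create kubernetes-the-hard-way-allow-internal \
  --allow tcp,udp,icmp \
  --network kubernetes-the-hard-way \
  --source-ranges 10.240.0.0/24,10.200.0.0/16

# 2. external: ssh, the API server port, and icmp from anywhere
gcloud compute firewall-rules create kubernetes-the-hard-way-allow-external \
  --allow tcp:22,tcp:6443,icmp \
  --network kubernetes-the-hard-way \
  --source-ranges 0.0.0.0/0

gcloud compute firewall-rules list --filter="network:kubernetes-the-hard-way"

# public address for the API front end
gcloud compute addresses create kubernetes-the-hard-way \
  --region $(gcloud config get-value compute/region)
```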
Compute
3 K8s controllers:
controller-0: 10.240.0.10
controller-1: 10.240.0.11
controller-2: 10.240.0.12
Pod CIDR: 10.200.0.0/16
3 worker nodes:
worker-0: 10.240.0.20 pod-cidr 10.200.0.0/24
worker-1: 10.240.0.21 pod-cidr 10.200.1.0/24
worker-2: 10.240.0.22 pod-cidr 10.200.2.0/24
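As a sketch, one controller and one worker could be created like this (machine type, OS image, and scopes are omitted for brevity; the --metadata pod-cidr value is what the worker bootstrap later reads):

```sh
gcloud compute instances create controller-0 \
  --private-network-ip 10.240.0.10 \
  --subnet kubernetes \
  --can-ip-forward \
  --tags kubernetes-the-hard-way,controller

gcloud compute instances create worker-0 \
  --private-network-ip 10.240.0.20 \
  --subnet kubernetes \
  --can-ip-forward \
  --metadata pod-cidr=10.200.0.0/24 \
  --tags kubernetes-the-hard-way,worker
```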
TLS Certificates
TLS certificates for the following components:
* etcd,
* kube-apiserver,
* kube-controller-manager,
* kube-scheduler,
* kubelet, and
* kube-proxy.
A public key infrastructure (PKI) is a set of roles, policies, hardware, software and procedures needed to create, manage, distribute, use, store and revoke digital certificates and manage public-key encryption. In cryptography, a PKI is an arrangement that binds public keys with respective identities of entities (like people and organizations).
Generate
1. Create the CA config file (ca-config.json). Usage:
"signing",
"key encipherment",
"server auth",
"client auth"
2. Generate CSR JSON file
Output: Private key and Certificate for CA
3. Generate various CSR JSON files. Use the CA key, CA certificate, and CA config file.
Output: private key and certificate
3.1. Admin
3.2. for each worker node for kubelet.
3.3 for kube-controller-manager
3.4 kube-proxy
3.5 kube-scheduler
4. Generate K8s API server certificate.
For -hostname argument pass
KUBERNETES_HOSTNAMES=kubernetes,kubernetes.default,kubernetes.default.svc,kubernetes.default.svc.cluster,kubernetes.svc.cluster.local, K8s master node public IP, K8s all master nodes' private IP addresses.
5. Generate Service Account pair
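All the CSR JSON files in steps 2-4 share one shape; a minimal ca-csr.json might look like this (the values under names are illustrative placeholders):

```json
{
  "CN": "Kubernetes",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "US",
      "O": "Kubernetes",
      "OU": "CA"
    }
  ]
}
```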
scp
6. To each worker node, copy (scp) the following files:
ca.pem
worker-N-key.pem
worker-N.pem
7. To all master nodes, copy (scp) the following files:
ca.pem
ca-key.pem
kubernetes.pem
kubernetes-key.pem
service-account.pem
service-account-key.pem
Client Authentication Configuration
The kube-proxy, kube-controller-manager, kube-scheduler, and kubelet client certificates will be used to generate client authentication configuration files, also known as kubeconfigs. A kubeconfig enables Kubernetes clients to locate and authenticate to the Kubernetes API servers.
Node authorization is a special-purpose authorization mode that specifically authorizes API requests made by kubelets. https://kubernetes.io/docs/reference/access-authn-authz/node/
1. Generate kubeconfig file for each worker node, with user name as system:node:workerN. The output is worker-N.kubeconfig
2. Generate kubeconfig file for the kube-proxy service. The output is kube-proxy.kubeconfig
3. Generate a kubeconfig file for the kube-controller-manager service. here server is 127.0.0.1 and output is kube-controller-manager.kubeconfig
4. Generate a kubeconfig file for the kube-scheduler service. here server is 127.0.0.1 and output is kube-scheduler.kubeconfig
5. Generate a kubeconfig file for the admin user. here server is 127.0.0.1 and output is admin.kubeconfig
To generate .kubeconfig file, we will use these three commands:
kubectl config set-cluster
kubectl config set-credentials
kubectl config set-context
Files for worker nodes:
- worker-N.kubeconfig
- kube-proxy.kubeconfig
Files for master nodes
- admin.kubeconfig
- kube-controller-manager.kubeconfig
- kube-scheduler.kubeconfig
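For one worker node, the three commands might be combined like this (assuming KUBERNETES_PUBLIC_ADDRESS holds the API front-end IP and the certificates from the TLS section are in the current directory):

```sh
kubectl config set-cluster kubernetes-the-hard-way \
  --certificate-authority=ca.pem \
  --embed-certs=true \
  --server=https://${KUBERNETES_PUBLIC_ADDRESS}:6443 \
  --kubeconfig=worker-0.kubeconfig

kubectl config set-credentials system:node:worker-0 \
  --client-certificate=worker-0.pem \
  --client-key=worker-0-key.pem \
  --embed-certs=true \
  --kubeconfig=worker-0.kubeconfig

kubectl config set-context default \
  --cluster=kubernetes-the-hard-way \
  --user=system:node:worker-0 \
  --kubeconfig=worker-0.kubeconfig

kubectl config use-context default --kubeconfig=worker-0.kubeconfig
```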
Data Encryption Config and Key
1. Generate encryption key with command
head -c 32 /dev/urandom | base64
2. Generate the encryption-config.yaml file using that encryption key.
Upload it to all three master nodes.
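Both steps can be sketched together; the EncryptionConfig shape below follows the tutorial, and the key name key1 is arbitrary:

```shell
# 32 random bytes, base64-encoded, as the AES-CBC key
ENCRYPTION_KEY=$(head -c 32 /dev/urandom | base64)

# Manifest consumed by kube-apiserver via --encryption-provider-config
cat > encryption-config.yaml <<EOF
kind: EncryptionConfig
apiVersion: v1
resources:
  - resources:
      - secrets
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: ${ENCRYPTION_KEY}
      - identity: {}
EOF
```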
Bootstrap etcd
On each master node
1. download and install etcd
2. Copy these 3 files to /etc/etcd:
ca.pem
kubernetes-key.pem
kubernetes.pem
3. Create the /etc/systemd/system/etcd.service file. It opens ports 2379 and 2380 for etcd.
4. Start etcd service
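A trimmed sketch of the unit file (the flags shown are a representative subset; ETCD_NAME and INTERNAL_IP vary per node and are expanded when the file is written via a shell heredoc):

```ini
[Unit]
Description=etcd

[Service]
Type=notify
ExecStart=/usr/local/bin/etcd \
  --name ${ETCD_NAME} \
  --cert-file=/etc/etcd/kubernetes.pem \
  --key-file=/etc/etcd/kubernetes-key.pem \
  --trusted-ca-file=/etc/etcd/ca.pem \
  --client-cert-auth \
  --peer-client-cert-auth \
  --listen-peer-urls https://${INTERNAL_IP}:2380 \
  --listen-client-urls https://${INTERNAL_IP}:2379,https://127.0.0.1:2379 \
  --advertise-client-urls https://${INTERNAL_IP}:2379 \
  --data-dir=/var/lib/etcd
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
```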
Bootstrap K8s Control Plane: API Server, Controller Manager, Scheduler
On each master node
1. download and install
kube-apiserver
kube-controller-manager
kube-scheduler
kubectl
2. Move all binaries to /usr/local/bin
3. Move the following files to /var/lib/kubernetes/
ca.pem ca-key.pem kubernetes-key.pem kubernetes.pem service-account-key.pem service-account.pem encryption-config.yaml
kube-controller-manager.kubeconfig
kube-scheduler.kubeconfig
4. Create a .service file for each of them at /etc/systemd/system/
For the API server, specify etcd and other parameters:
--service-cluster-ip-range=10.32.0.0/24 \
--service-node-port-range=30000-32767 \
We can configure nginx for a healthcheck of any service. Copy the kubernetes.default.svc.cluster.local file to /etc/nginx/sites-available/:
server {
  listen 80;
  server_name kubernetes.default.svc.cluster.local;

  location /healthz {
    proxy_pass https://127.0.0.1:6443/healthz;
    proxy_ssl_trusted_certificate /var/lib/kubernetes/ca.pem;
  }
}
RBAC for Kubelet Authorization
Let's set the Kubelet --authorization-mode flag to Webhook. Webhook mode uses the SubjectAccessReview API to determine authorization.
1. Create the system:kube-apiserver-to-kubelet ClusterRole with permissions to access the Kubelet API and perform most common tasks associated with managing pods:
2. Bind the system:kube-apiserver-to-kubelet ClusterRole to the kubernetes user:
It is sufficient to run these kubectl commands once, from any one controller node.
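A sketch of the two objects (the node subresources listed match the Kubelet API endpoints; verbs are wildcarded as in the tutorial):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:kube-apiserver-to-kubelet
rules:
- apiGroups: [""]
  resources:
  - nodes/proxy
  - nodes/stats
  - nodes/log
  - nodes/spec
  - nodes/metrics
  verbs: ["*"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:kube-apiserver
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:kube-apiserver-to-kubelet
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: User
  name: kubernetes
```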
K8s Frontend LoadBalancer
Bootstrapping the Kubernetes Worker Nodes
1. First install
socat conntrack ipset
The socat binary enables support for the kubectl port-forward command.
2. Turn off swap
sudo swapoff -a
3. Download and install:
cri-tools (crictl), runc, CNI networking plugins, containerd, kubelet, and kube-proxy.
4. Create the installation directories:
/etc/cni/net.d \
/opt/cni/bin \
/var/lib/kubelet \
/var/lib/kube-proxy \
/var/lib/kubernetes \
/var/run/kubernetes
5. Create network configuration file at /etc/cni/net.d/
10-bridge.conf
99-loopback.conf
6. configure containerd service
7. configure Kubelet
8. configure kube-proxy
9. Start services: containerd kubelet kube-proxy
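The bridge config from step 5 can be sketched like this (the shape follows the tutorial; POD_CIDR is replaced by the node's range, e.g. 10.200.0.0/24 on worker-0):

```json
{
  "cniVersion": "0.3.1",
  "name": "bridge",
  "type": "bridge",
  "bridge": "cnio0",
  "isGateway": true,
  "ipMasq": true,
  "ipam": {
    "type": "host-local",
    "ranges": [
      [{"subnet": "POD_CIDR"}]
    ],
    "routes": [{"dst": "0.0.0.0/0"}]
  }
}
```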
Configuring kubectl for Remote Access
Use the following commands
kubectl config set-cluster // --certificate-authority=ca.pem
kubectl config set-credentials // --client-certificate=admin.pem --client-key=admin-key.pem
kubectl config set-context // --user=admin
kubectl config use-context
Provisioning Pod Network Routes
For each node, add a route for its pod CIDR, with the node's internal IP address as the next hop.
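On GCP this is one route per worker; for worker-0 it might look like:

```sh
gcloud compute routes create kubernetes-route-10-200-0-0-24 \
  --network kubernetes-the-hard-way \
  --next-hop-address 10.240.0.20 \
  --destination-range 10.200.0.0/24
```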
Deploying the DNS Cluster Add-on
https://storage.googleapis.com/kubernetes-the-hard-way/coredns.yaml
Ref:
https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/high-availability/#steps-for-the-first-control-plane-node
CKAD: Tips
1. How to run a pod on the master node?
nodeName: master
2. How to specify command and args:
command: ["/bin/sh", "-c", "COMMAND"]
3. rolling update
Rolling update YAML
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 1
    maxUnavailable: 1
4. Inside the container:
volumeMounts:
- name: VOLUME_NAME
  mountPath: /PATH
5. Useful command
k explain pods --recursive
6. Environment Variables
env:
- name: ENV_NAME
  valueFrom:
    configMapKeyRef:
      name: CM
      key: KEY
- name: ENV_NAME
  value: "VALUE"
envFrom:
- configMapRef:
    name: CM_NAME
The same applies for secrets (secretKeyRef / secretRef).
7. Empty Dir volume
volumes:
- name: VOL
  emptyDir: {}
8. Ports inside container
ports:
- containerPort: AAAA
9. CPU request
resources:
  requests:
    cpu: "0.2"
10. PVC at Pod
volumes:
- name: V_NAME
  persistentVolumeClaim:
    claimName: PVC_NAME
11.
A. Security Context for container
securityContext:
  capabilities:
    add:
    - SYS_TIME
    drop:
    - SYS_TIME
securityContext:
  runAsUser: UID
  runAsGroup: GID
  fsGroup: NA
  fsGroupChangePolicy: NA
  allowPrivilegeEscalation: true | false
  privileged: true | false
B. Security Context for pod
securityContext:
  sysctls:
  - name: NAME
    value: VALUE
12. Ingress
spec:
  rules:
  - host: HOST_URL
    http:
      paths:
      - path: /PATH
        backend:
          serviceName: K8S_SVC
          servicePort: PORT (the service port, not NODE_PORT)
For testing, HOST_URL can be specified with the -H option:
curl -H "Host: HOST_URL" http://IP_ADDRESS/PATH
13. PV
persistentVolumeReclaimPolicy: Retain | Recycle | Delete
14. netpol
Also define the port of the service.
podSelector:
  matchLabels:
    KEY: VALUE
policyTypes:
- Ingress
- Egress
ingress:
- from:
  - ipBlock:
      cidr: 172.17.0.0/16
      except:
      - 172.17.1.0/24
  - namespaceSelector:
      matchLabels:
        KEY: VALUE
  - podSelector:
      matchLabels:
        KEY: VALUE
The same structure applies for egress; there we use "to" instead of "from".
15. Job
activeDeadlineSeconds
completions
parallelism
restartPolicy: { Never | OnFailure }. The default is Always, which is not suitable for Jobs.
backoffLimit
ttlSecondsAfterFinished (default: never expire)
16. Probes
A. livenessProbe
B. readinessProbe
C. startupProbe
A:
exec:
  command:
  - COMMAND1
  - COMMAND2
B:
httpGet:
  path: /PATH
  port: PORT
  httpHeaders:
  - name: Custom-Header
    value: VALUE
C:
tcpSocket:
  port: PORT
For all:
initialDelaySeconds: 15
periodSeconds: 20
failureThreshold
17. Volumes at pod level using a secret and a configmap:
volumes:
- name: VOLUME_NAME
  configMap:
    name: CM_NAME
- name: VOLUME_NAME
  secret:
    secretName: S_NAME
18. For the 'k create' command, first specify the name of the K8s object and then other parameters. The exception is svc: first specify the type of svc, then its name, then other parameters.
19. Inside a YAML file, all fields with plural names are lists, e.g. volumes, volumeMounts, containers. The exception is command: it is singular, yet a list. However, args is plural, so no exception.
20. Find the API version with command
k explain OBJECT --recursive | grep VERSION
21. Compared to
k get po POD_NAME -o yaml
the command below is better:
k get po POD_NAME -o yaml --export
(Note: --export was deprecated and later removed in newer kubectl versions.)
22. To change namespace
k config set-context --current --namespace=NAMESPACE
StatefulSet
Purpose
1. Creation order is guaranteed unless podManagementPolicy: Parallel. The default podManagementPolicy value is OrderedReady.
2. Pod names remain the same even after restart.
3. Use volumeClaimTemplates. It is an array; each element has the same content as a PVC. Each pod gets its own PV.
If we delete a StatefulSet, all pods may not get deleted. First scale the StatefulSet to 0, then delete it. After that, manually delete the PVCs.
For a StatefulSet, hostname and pod name are the same.
kubectl patch statefulset
It can be used to update:
- label
- annotation
- container image
- resource requests
- resource limits
Two types of updateStrategy:
- RollingUpdate: updates all Pods in a StatefulSet in reverse ordinal order
- OnDelete
During an update, if any pod that is not currently being updated fails, it is restored to its pre-failure version. Suppose there are N pods: the update goes from pod-N down to pod-1, changing the container image from version 1 to version 2. If the update for pod-i is in progress and pod-j crashes, then for j > i the pod is restored to version 2 (already updated), and for j < i it is restored to version 1 (not yet updated).
A headless service adds DNS entries:
- for each pod: "pod name"."headless service name"."namespace name".svc.cluster.local
  (the pod's IP address is not used in the name)
- for the headless service itself: its DNS name maps to all pods' DNS records.
To create a headless service, specify clusterIP: None
1. Headless service with a Deployment:
The pod must set subdomain to the same value as the headless service's name.
Also specify hostname; only then is the pod's DNS A record created. But all pods will then have the same hostname.
2. With a StatefulSet, there is no need to specify (1) subdomain (2) hostname.
Instead of subdomain, specify serviceName.
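Putting the StatefulSet notes together, a minimal sketch (the names web and web-hs, the nginx image, and the 1Gi size are illustrative placeholders):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-hs
spec:
  clusterIP: None        # headless: DNS records, no virtual IP
  selector:
    app: web
  ports:
  - port: 80
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: web-hs    # instead of subdomain/hostname
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: nginx
        image: nginx
        volumeMounts:
        - name: data
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:  # each pod gets its own PVC/PV
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi
```

Pods are then resolvable as web-0.web-hs.NAMESPACE.svc.cluster.local, web-1.web-hs.NAMESPACE.svc.cluster.local, and so on.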