K8s Dashboard


CKAD : 4.Design


Each component should be decoupled from other resources.

All resources have a transient relationship with others.

K8s orchestration works with a series of agents = controllers = watch loops.

The scheduler uses the PodSpec to determine the best node for deployment.

One CPU =
- 1 AWS vCPU or
- 1 GCP Core or
- 1 Azure vCore or
- 1 Hyperthread on a bare-metal Intel CPU with Hyperthreading

Label and Selector 

Label can be
- production / development
- department name
- team name
- primary application developer

Selectors are namespace scoped, unless --all-namespaces is used
--selector key=value
OR
-l key=value

If we use the command
kubectl create deployment design2 --image=nginx
then a pod is created with label app=design2.
Here we cannot specify replicas in the same command.
We need to run one more command (e.g. kubectl scale).
If we edit the label of that pod, then the design2 deployment will create another pod with label app=design2.
If we delete deployment design2, then only those pods whose label is app=design2 get deleted. 

To show all labels, use the --show-labels option with the kubectl get command. 
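The label and selector commands above can be sketched as follows (the pod name and label values are examples):

```shell
# add or overwrite a label on an existing pod
kubectl label pod design2-abc123 env=production --overwrite

# list pods matching a selector (namespace scoped), showing their labels
kubectl get pods -l env=production --show-labels

# the same selector across all namespaces
kubectl get pods --selector env=production --all-namespaces
```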

Job

A Job is for scenarios where you don't want the process to keep running indefinitely.  

It supports parallel processing of a set of independent but related work items. These might be 
- emails to be sent, 
- frames to be rendered, 
- files to be transcoded, 
- ranges of keys in a NoSQL database to scan, and so on.

A Replication Controller manages Pods which are not expected to terminate (e.g. web servers), and a Job manages Pods that are expected to terminate (e.g. batch tasks).

Jobs are part of the batch API group.
A Job has the following parameters:
1. activeDeadlineSeconds : the Job can remain alive for only that many seconds.
2. completions : how many successful instances? Default 1. We can edit it using
k edit job 'job name' 
3. parallelism : how many should run at a time? Default 1. With value 0, the Job is paused. 
4. restartPolicy : { Never | OnFailure }. The pod-spec default is Always, which is not suitable for a Job.
5. backoffLimit : the Job is retried backoffLimit times (default 6) before being marked Failed.
6. ttlSecondsAfterFinished : how long a finished Job is kept; default 'never'.

* If parallelism > completions, then effectively parallelism = completions 

So the logic: create 'parallelism' number of pods in one shot. Check how many are successful. If that is equal to or more than 'completions', then stop. Else create another set of 'parallelism' pods. Continue until the retry count reaches backoffLimit. 

While debugging, set restartPolicy = Never. This policy applies to the pod, not to the Job.

The Job status is Failed if
- it restarted more than backoffLimit times OR
- it ran for more than activeDeadlineSeconds
else the status is Complete
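The parameters above can be combined into a minimal Job manifest; the name, image, and numbers are illustrative:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: batch-demo             # example name
spec:
  completions: 5               # 5 successful pods needed
  parallelism: 2               # run 2 pods at a time
  backoffLimit: 6              # retries before the Job is marked Failed
  activeDeadlineSeconds: 120   # hard time limit for the whole Job
  template:
    spec:
      restartPolicy: Never     # Always is not valid for a Job
      containers:
      - name: worker
        image: busybox
        command: ["sh", "-c", "echo processing one work item"]
```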

To create a Job with an imperative command: 

k create job 'name of job' --image='container image'

(On older clusters: k run 'name of job' --image='container image' --restart=OnFailure)

Delete Job
With --cascade=orphan (--cascade=false on older kubectl), only the Job gets deleted, not its pods. The default is to delete the pods too. 

kubectl delete job job_name --cascade=orphan

CronJob
Linux-style cron syntax:
minute hour day-of-month month day-of-week
It can be a comma-separated list: 1,2
It can be a range with a hyphen: 1-5
It can be * to indicate all
It can be */ and a number to indicate periodic: */2
* and ? have the same meaning. 
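A few illustrative schedule strings:

```
*/2 * * * *    every 2 minutes
0 3 * * *      daily at 03:00
30 9 * * 1-5   09:30 on weekdays (Mon-Fri)
0 0 1 * *      midnight on the 1st of every month
```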

CronJob creates multiple jobs as per schedule. The CronJob is only responsible for creating Jobs that match its schedule, and the Job in turn is responsible for the management of the Pods it represents.

It has following parameters

1. If a CronJob's job sleeps for 30 seconds and activeDeadlineSeconds is 10, then none of the Jobs created by the CronJob reach the Complete state. 

2. startingDeadlineSeconds : if a Job cannot be scheduled within this time, it is counted as missed. A miss can also happen because of the Forbid policy. After 100 such misses, no more Jobs will get scheduled. 

Note: If startingDeadlineSeconds is set, then the miss count is evaluated over the last startingDeadlineSeconds; it should stay below 100. 

3. concurrencyPolicy 
Allow: concurrent Jobs are allowed (the default)
Forbid: if a second Job is scheduled before the earlier Job finished, it is skipped
Replace: the running Job is replaced by the new one

4. suspend
If true, all subsequent Jobs will not be scheduled. 

5. successfulJobsHistoryLimit and failedJobsHistoryLimit
How many finished Jobs shall be kept

To create a CronJob (i.e. cj) using imperative commands:

k create cronjob 'name of job' --image='container image' --schedule="* * * * *"

(On older clusters: k run 'name of job' --image='container image' --restart=OnFailure --schedule="* * * * *")
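A minimal CronJob sketch combining the parameters above (name and image are examples; clusters older than 1.21 use apiVersion batch/v1beta1):

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cron-demo              # example name
spec:
  schedule: "*/2 * * * *"      # every 2 minutes
  concurrencyPolicy: Forbid    # skip a run if the previous one is still going
  startingDeadlineSeconds: 60
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 1
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: task
            image: busybox
            command: ["sh", "-c", "date; echo scheduled work"]
```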
Terms for multi-container pods
1. Ambassador : communicates with outside resources / outside the cluster. E.g. Envoy Proxy
- proxies local connections
- reverse proxy
- limits HTTP requests
- re-routes to the outside world
2. Adapter : modifies data that the primary container generates
3. Sidecar : helps to provide a service that is not found in the primary container. E.g. logging
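A minimal sidecar sketch (all names are examples): the primary container writes logs into a shared emptyDir volume and the sidecar streams them.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: sidecar-demo           # example name
spec:
  volumes:
  - name: logs
    emptyDir: {}               # shared between the two containers
  containers:
  - name: main                 # primary application
    image: busybox
    command: ["sh", "-c", "while true; do date >> /var/log/app.log; sleep 5; done"]
    volumeMounts:
    - name: logs
      mountPath: /var/log
  - name: log-sidecar          # helper that streams the logs
    image: busybox
    command: ["sh", "-c", "tail -f /var/log/app.log"]
    volumeMounts:
    - name: logs
      mountPath: /var/log
```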

Flexibility : one application per pod
Granular scalability : one application per pod
Best inter-container performance : multiple applications per pod

Containerizing an application

- It should be stateless
- It should be transient
- Remove the environment configuration; it should come via ConfigMaps and Secrets
- It is like converting a city bus into scooters 

After containerization of application, just ask

Q1 : Is my application as decoupled as it could possibly be?
Q2 : Are all components designed considering that other components are transient? Will it work with Chaos Monkey?
Q3 : Can I scale any particular component?
Q4 : Have I used stable and open standards to meet my needs?

Managing Resource Usage

If a pod asks for more CPU than defined, then
- nothing happens (the CPU is throttled)

If a pod asks for more memory than defined, then the behavior is undefined:
- the container may be restarted OR
- the pod may be evicted from the node

If a pod asks for more memory than the node has, then
- the pod is evicted from the node

If a pod asks for more storage than defined, then 
- the pod is evicted from the node

Resource Limits

1. CPU: cpu
2. Memory: memory
3. Huge Pages: hugepages-2Mi
4. Ephemeral Storage: ephemeral-storage
They apply at the container level.
The pod-level value is the summation of all containers' values.

The resources can also be specified at namespace level (ResourceQuota). 

limits:
  cpu: "1"
  memory: "1Gi"
requests:
  cpu: "0.5"
  memory: "500Mi"
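Embedded in a container spec, the fragment above looks like this (pod and container names are examples):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: resource-demo          # example name
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:
        cpu: "0.5"             # scheduler guarantees this much
        memory: "500Mi"
      limits:
        cpu: "1"               # throttled above this
        memory: "1Gi"          # OOM-killed above this
```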

To see a node's capacity and current resource allocation:
k describe node "Node Name"

We can specify LimitRange object with default values. It is applicable within namespace. It is applicable if admission controller LimitRanger is enabled. 

apiVersion: v1
kind: LimitRange
metadata:
  name: limit-mem-cpu-per-container
spec:
  limits:
  - max:
      cpu: "800m"
      memory: "1Gi"
    min:
      cpu: "100m"
      memory: "99Mi"
    default:
      cpu: "700m"
      memory: "900Mi"
    defaultRequest:
      cpu: "110m"
      memory: "111Mi"
    type: Container

- While creating a container, if neither memory request nor limit is specified, 
then the defaults from the LimitRange apply.  
- While creating a container, if the memory request is not specified but the limit is, then the request value is set equal to the limit
- While creating a container, if the memory request is specified and the limit is not, then the default limit from the LimitRange applies
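One way to observe the defaulting, assuming the LimitRange above is present in the namespace (pod name is an example):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: lr-demo                # example name
spec:
  containers:
  - name: app
    image: nginx
    resources:
      limits:
        memory: "512Mi"        # only the limit is set; the LimitRanger
                               # admission controller defaults the request
```

Check the result with: kubectl get pod lr-demo -o jsonpath='{.spec.containers[0].resources}' — the memory request should equal the limit.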


CNI 

* Some CNI plugins support Network Policies. E.g. Calico, Canal, Kube-router, Romana, Weave Net

* Some CNI plugins support encryption of UDP and TCP traffic. E.g. Calico, Kopeio, Weave Net

* Some CNI plugins allow VXLAN. E.g. Canal, Flannel, Kopeio-networking, Weave Net

* CNI plugins are layer 2 or layer 3:
Layer 2: Canal, Flannel, Kopeio-networking, Weave Net
Layer 3: Calico, Romana, Kube-router

* kubenet is a basic network plugin. It relies on the cloud provider for routing and cross-node networking

./enter_pod.sh "pod name"


#!/bin/sh
# Usage: ./enter_pod.sh "pod name"
# Resolves the pod's first container ID, finds its host PID via
# docker inspect, then enters the container's namespaces with nsenter.

containerId=$(kubectl get pod "$1" -o jsonpath='{.status.containerStatuses[0].containerID}' | sed -e "s/^docker:\/\///")
pid=$(docker inspect --format '{{.State.Pid}}' "$containerId")
echo "$pid"
sudo nsenter --target "$pid" --mount --uts --ipc --net --pid sh

CKAD : 3. Build


App Container (appc) is an open specification that defines several aspects of how to run applications in containers: an image format, runtime environment, and discovery protocol. rkt's native image format and runtime environment are those defined by the specification.

Clear Containers (from Intel) uses the kvmtool mini-hypervisor, so it is a VM with quick boot-up and a low memory footprint. Not comparable with Docker, but acceptable for many use cases. 

If we create a file inside a Docker container, then it is actually located at 
/var/snap/docker/common/var-lib-docker/aufs/diff/  
OR 
/var/lib/docker/aufs/diff/

Tools
1. Docker
2. buildah
- creates OCI images
- with or without a Dockerfile
- no superuser privilege needed
- Go-based API for easy integration
3. podman (pod manager)
- replacement for "docker run"
- it is for container LCM (lifecycle management)
4. Kompose

sudo kompose convert -f docker-compose.yaml -o localregistry.yaml

latest is just a string. We need a process to name and rename the latest version as "latest" as and when it becomes available. Else, there is no point. 

k exec -it 'pod name' -- /bin/bash
Here, instead of /bin/bash, any tool available inside the container can be used. 

readinessProbe and livenessProbe can be:
1. exec statement
2. HTTP GET: healthy if the status code is 200-399
3. TCP: try to open a connection on a predetermined port
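A sketch of the two probe styles on one container (names and timings are examples):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: probe-demo             # example name
spec:
  containers:
  - name: web
    image: nginx
    readinessProbe:            # gate traffic until the app is ready
      httpGet:
        path: /                # healthy if HTTP status is 200-399
        port: 80
      initialDelaySeconds: 5
      periodSeconds: 10
    livenessProbe:             # restart the container if this fails
      tcpSocket:
        port: 80
      initialDelaySeconds: 15
      periodSeconds: 20
```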

To get logs generated by etcd:
k -n kube-system logs etcd-'master node name'

The events can be listed with
k describe pod 'pod name'

We can use "--dry-run -o yaml" (newer kubectl: "--dry-run=client -o yaml") just to generate a YAML file

Minikube

To access K8s service on Minikube we have few approaches

1. Make it as NodePort Service

1.1 We can change the service type to NodePort by
k patch svc 'svc name' -p '{"spec":{"type":"NodePort"}}'
Now to access the NodePort service on Minikube, we need the IP address of the VirtualBox VM. 

1.2
minikube ip
This command gives the IP address of the (combined) worker+master node.
Use the command 
curl http://192.168.99.108:31754/v2/
i.e. curl http://"Minikube IP":"NodePort"/v2/

1.3

Use the command
minikube service 'service name' --url
and you will get the service endpoint, e.g.
http://192.168.99.108:31754

i.e. http://"Minikube IP":"NodePort"

We can open this URL in the default browser using the command
minikube service 'service name'

2.1

We can use the ClusterIP by adding a host route via the Minikube VM:

sudo route add -host 10.100.88.2 gw 192.168.99.108

Registry

We should add the registry as an insecure registry to Docker, using its ClusterIP 

sudo vim /etc/docker/daemon.json

{ "insecure-registries":["10.110.186.162:5000"] }

Then Restart Docker Service
sudo systemctl restart docker.service

CKAD : 2. K8s Architecture


Key take away points
  • All the configuration is defined in YAML and stored in JSON format (in etcd)
  • Container creation tools: Buildah, Podman, cri-o, containerd, frakti, 
  • Mesos has multi level scheduler for data center cluster
  • Evolution: Borg -> Mesos, Cloud Foundry, K8s, Omega
  • Replication Controller functionality is now provided by
  • - Deployment controller
  • - ReplicaSet
  • ReplicaSet supports set-based selectors, 
  • ReplicationController supports only equality-based selectors
  • If we edit an rs, it is applicable only to new pods
  • To get details about a container image, "k describe pod | rs" is the better command
  • rs has metadata -> name. If the rs is part of a deploy and we delete the rs, then a new rs is created with the same name. If the rs is not part of a deploy, then first we need to store it with "k get rs 'rs name' -o yaml > 'file name'", then create it again with "k create -f 'file name'"
  • If a pod is part of an rs, or part of a deploy (so it is part of an rs too), and we delete the pod, then it is recreated with a new name, because the pod has metadata -> generateName
  • Deployment ensures that resources are available, such as (1) IP address and (2) storage, then deploys a ReplicaSet
  • So if we delete a ReplicaSet, the Deployment recreates it. 
  • If we delete a Deployment, then the ReplicaSet (and its pods) also gets deleted, but the Service remains
  • If we delete a Service, the pods do not get deleted.
  • Annotations are not for K8s itself; they are for 3rd-party tools
  • 'Cloud controller manager' is optional on the master node. If it is present, the kubelet shall be started with the option --cloud-provider=external
  • The pause container is used to hold the pod's network namespace (IP address)
  • We can edit only a few fields of a running pod: (1) image (2) activeDeadlineSeconds (3) tolerations. However, if we edit a deploy, we can edit any param and the pods will be automatically restarted. 
Taints and Tolerations
  • A node has taints to discourage pod assignment, unless the pod has a matching toleration. A taint is expressed as key=value:effect. There are 3 effects: (1) NoSchedule (2) PreferNoSchedule (3) NoExecute, which means existing pods will be evicted. 
  • The master node has the taint NoSchedule 
  • To create a taint the command is: k taint nodes 'node name' key=value:effect
  • To remove the taint the commands are: k taint nodes 'node name' key=value:effect- , k taint nodes 'node name' key- & k taint nodes 'node name' key:effect-
  • A pod has tolerations = "key", operator ("Equal"), "value", "effect" (same as above)
  • We can edit a pod's tolerations with the k edit command.
  • If the toleration's operator = Exists, then only key + effect shall match. 
  • If the toleration's operator = Equal (the default), then key + value + effect shall match. 
  • Taints and tolerations are not about which pod will get scheduled on which node. They only tell the node whether a given pod can be accepted or not. 
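A pod toleration sketch matching a hypothetical taint dedicated=batch:NoSchedule (key and value are examples):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: toleration-demo        # example name
spec:
  tolerations:
  - key: "dedicated"           # matches taint dedicated=batch:NoSchedule
    operator: "Equal"          # the default operator
    value: "batch"
    effect: "NoSchedule"
  containers:
  - name: app
    image: nginx
```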
Node Affinity
  • To schedule a pod on specific node we have node affinity. 
Dockerfile and pod relation
  • Whatever additional parameters we pass with the docker command replace CMD and are appended to ENTRYPOINT of the Dockerfile
  • The docker command can replace ENTRYPOINT also, with the --entrypoint option. 
  • Dockerfile ## Docker Command ## Pod YAML
  • ENTRYPOINT ## --entrypoint ## command:
  • CMD ## (trailing args) ## args:
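A sketch of the mapping (the image and its Dockerfile are hypothetical): command: in the pod YAML overrides ENTRYPOINT, and args: overrides CMD.

```yaml
# Dockerfile of the image (for reference):
#   ENTRYPOINT ["sleep"]
#   CMD ["5"]
apiVersion: v1
kind: Pod
metadata:
  name: cmd-demo               # example name
spec:
  containers:
  - name: sleeper
    image: ubuntu-sleeper      # hypothetical image built from the Dockerfile above
    command: ["sleep"]         # overrides ENTRYPOINT
    args: ["10"]               # overrides CMD
```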


Useful commands



To use any command in a different namespace:
(1) kubectl (2) verb (3) -n 'namespace name' (4) then the rest of the command. 
We cannot add 
-n 'namespace name'
after a "--" separator at the end of the command; flags placed there are passed to the container command instead of kubectl

1. 
A. To run pod, without YAML file 

k run newpod --image=nginx --generator=run-pod/v1

k run newpod --image=nginx --dry-run --restart=Never -o yaml

Here the pod name and container name will be identical

We can specify label with "-l key=value"

B. To create deployment without YAML file 

k run firstpod --image=nginx

k run firstpod --image=nginx --dry-run -o yaml

k create deployment firstpod --image=nginx
k create deployment firstpod --image=nginx --dry-run -o yaml

We should mention the container port in the deployment:

        ports:

        - containerPort: 3306


C. To create a service. kubectl create service assumes the pods have the label app=svcname; we cannot pass a different label name to the svc

kubectl expose pod pod_name --port=6379 --name svcname --dry-run -o yaml

kubectl create service clusterip svcname --tcp=6379:6379 --dry-run -o yaml

kubectl create service nodeport svcname --tcp=6379:6379 --node-port=32080 --dry-run -o yaml

If we do not have any pod with label app=svcname, the service will still be created. However, when we list ep, we find there is no pod (endpoint) behind that svc

2. To know all taints k describe nodes | grep -i taint

3.1 To know about all resources k api-resources

Here are list of shortcuts

ConfigMap --- cm
EndPoints --- ep
Namespace --- ns
Node --- no
PersistentVolumeClaim --- pvc
PersistentVolume --- pv
Pod --- po
ReplicationController --- rc
ServiceAccount --- sa
Service --- svc
CustomResourceDefinition --- crd
DaemonSet --- ds
Deployment --- deploy
ReplicaSet --- rs
StatefulSet --- sts
HorizontalPodAutoscaler --- hpa
CronJob --- cj
CertificateSigningRequest --- csr
Ingress --- ing
NetworkPolicy --- netpol
PodSecurityPolicy --- psp
StorageClass --- sc

3.2 We can list associated verbs with command k api-resources -o wide

3.3 We can list multiple resources as comma separated list
kubectl get deploy,rs,po,svc,ep

4. Under container we write:
ports:
- containerPort: 80

5. Under Service we write:
ports:
 - protocol: TCP
   port: 80
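Put together, a Service forwarding its port to the pods' containerPort (all names are examples):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-svc                # example name
spec:
  selector:
    app: web                   # pods carrying this label become endpoints
  ports:
  - protocol: TCP
    port: 80                   # port the Service listens on
    targetPort: 80             # containerPort on the pods
```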


6. Useful commands tips

Context
Create context
k config set-context "any context name" --namespace='namespace name'

context is (1) cluster (2) namespace and (3) user

List context
k config get-contexts

Use specific context
k config use-context 'context name'


k explain 'resource name' (e.g. k explain pod.spec)

Please refer for imperative commands : https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands

We should use the grep command with the "-C number" option: 
-C for number of lines before and after both
-B for before
-A for after

Kubernetes Resource Map