Data on Kubernetes
Storage system attributes:
- availability [ # of replicas; primary and secondary DB ]
the ability to access data during failure conditions; the failures may be in the storage media, transport, controller, or any other component in the system.
Related metrics: RTO (Recovery Time Objective), MTTF (Mean Time To Failure), MTTR (Mean Time To Repair)
- consistency
a. eventually consistent
b. strongly consistent
RPO (Recovery Point Objective): the maximum acceptable data loss, measured in time
- scalability [ sharding = dividing a large dataset into smaller parts ]
a. number of clients
b. throughput
c. capacity
d. ability to increase the number of components to support all of the above.
- durability
a. [ # of replica ]
b. endurance characteristics of the storage media: SSD, spinning disk, tape
c. ability to detect corruption of data and recover corrupted data (bit-rot)
- performance
a. latency
b. operations per second
c. read throughput and write throughput.
Cloud-native storage systems add three more attributes:
- observability
- elasticity [ on demand scale up/down ]
- data locality [ pod affinity ]
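Data locality and elasticity show up concretely in the StorageClass. A minimal sketch, assuming a hypothetical CSI driver name (`example.com/csi-driver`) and class name; `WaitForFirstConsumer` delays volume binding until the pod is scheduled, so the volume is provisioned in the pod's topology domain:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-fast                        # illustrative name
provisioner: example.com/csi-driver       # placeholder CSI driver
volumeBindingMode: WaitForFirstConsumer   # bind after pod scheduling -> data locality
allowVolumeExpansion: true                # opt in to on-demand growth (elasticity)
```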
Storage stacks / layers
1. Data Access Interface [ block device, file system, app API: (object store, k-v store, and DB), PIP ]
2. Storage topology [ centralized, distributed, sharded, and hyper-converged ]
3. data protection layer, which adds redundancy [ RAID, Erasure coding, and Replicas ]
4. additional data services [ replication, snapshots (PiT Point in Time), clones, incremental snapshots for efficient backups ]
5. host, OS, physical non-volatile storage
Common Patterns and Features
1. Operator
2. CSI
2.1 CSI building blocks
2.1.1 identity gRPC service: reports info and capabilities of the plugin
2.1.2 controller gRPC service:
2.1.2.1. create and delete volume,
2.1.2.2. create and delete snapshot,
2.1.2.3. attach and detach volume, and
2.1.2.4. expand volume
2.1.3. node gRPC service
2.1.3.1. mount and unmount volume, and
2.1.3.2. expand volume.
2.2 CSI features
2.2.1 CSI topology and CSI capacity tracking provide input to the K8s scheduler.
2.2.2 raw block mode (instead of a file system)
2.2.3 snapshot and group snapshot for backup / recovery.
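Once the CSI driver supports it, a snapshot request is just another API object. A hedged sketch, assuming a VolumeSnapshotClass named `csi-snapclass` and a PVC named `data-db-0` already exist:

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: db-snapshot
spec:
  volumeSnapshotClassName: csi-snapclass   # assumed to exist
  source:
    persistentVolumeClaimName: data-db-0   # assumed PVC to snapshot
```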
3. K8s workload API
Volume Claim Template
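A volume claim template makes K8s stamp out one PVC per StatefulSet replica (here `data-db-0`, `data-db-1`, `data-db-2`). A minimal sketch; the names, image, and sizes are illustrative:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db
spec:
  serviceName: db
  replicas: 3
  selector:
    matchLabels: { app: db }
  template:
    metadata:
      labels: { app: db }
    spec:
      containers:
      - name: db
        image: postgres:16                 # example image
        volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:                    # one PVC per replica
  - metadata:
      name: data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 10Gi
```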
4. Topology Aware Scheduling
5. Pod Disruption Budget
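A Pod Disruption Budget keeps quorum-based stateful apps safe during voluntary disruptions (node drains, upgrades). A minimal sketch, assuming pods labeled `app: db`:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: db-pdb
spec:
  maxUnavailable: 1        # at most one replica down at a time during voluntary disruptions
  selector:
    matchLabels:
      app: db
```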
6. Resource Management
6.1 pod's [ guaranteed ] QoS,
6.2 pod's [ higher than normal ] priority
6.3 VPA (vertical scaling) is usually a better fit than HPA (horizontal scaling) for StatefulSets.
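The two resource-management points above translate directly into the pod spec: setting requests equal to limits for every resource puts the pod in the Guaranteed QoS class, and `priorityClassName` raises its scheduling/eviction priority. A sketch, assuming a PriorityClass named `high-priority` has been created:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: db-0
spec:
  priorityClassName: high-priority   # assumes this PriorityClass exists
  containers:
  - name: db
    image: postgres:16               # example image
    resources:
      requests:                      # requests == limits -> Guaranteed QoS class
        cpu: "2"
        memory: 4Gi
      limits:
        cpu: "2"
        memory: 4Gi
```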
7. Separation of control plane (CP, managed by the operator) and data plane (DP, east-west traffic)
8. Secure by default
8.1 no ports accessible from outside the cluster
8.2 credentials stored in K8s Secrets.
Day 2 operations
Upgrade
- CRD version
Backup/Restore
- App level
- volume level, with hooks so the app quiesces (stops writing to the volume) during the backup
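Velero, for example, implements volume-level hooks as pod annotations that run a command in a container before and after the snapshot. The container name, commands, and path below are illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: db-0
  annotations:
    # freeze the filesystem so the snapshot is crash-consistent, then unfreeze
    pre.hook.backup.velero.io/container: db
    pre.hook.backup.velero.io/command: '["/sbin/fsfreeze", "--freeze", "/var/lib/postgresql/data"]'
    post.hook.backup.velero.io/command: '["/sbin/fsfreeze", "--unfreeze", "/var/lib/postgresql/data"]'
```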
Increase / Decrease Storage capacity
- StatefulSet volumes can be expanded via PVC expansion (shrinking is not supported)
- HPA (adding replicas also adds aggregate capacity)
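Expansion is a two-part operation: the StorageClass must opt in with `allowVolumeExpansion: true`, and then raising `spec.resources.requests.storage` on an existing PVC triggers the resize. A sketch with placeholder names (note that volumeClaimTemplates themselves are immutable; each replica's PVC is edited individually):

```yaml
# StorageClass must opt in to expansion
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: expandable
provisioner: example.com/csi-driver   # placeholder CSI driver
allowVolumeExpansion: true
---
# grow an existing PVC by raising the request
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-db-0
spec:
  accessModes: [ "ReadWriteOnce" ]
  storageClassName: expandable
  resources:
    requests:
      storage: 20Gi      # was 10Gi; increasing the request triggers expansion
```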
Data Migration
- init containers
- job
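A one-shot migration Job is the simplest of the two patterns: mount the old and new volumes side by side and copy. A sketch; the PVC names, image, and copy command are hypothetical:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: migrate-data
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: migrate
        image: alpine:3.20
        command: ["sh", "-c", "cp -a /old/. /new/"]   # copy data between volumes
        volumeMounts:
        - { name: old-data, mountPath: /old }
        - { name: new-data, mountPath: /new }
      volumes:
      - name: old-data
        persistentVolumeClaim: { claimName: data-old }   # hypothetical PVC names
      - name: new-data
        persistentVolumeClaim: { claimName: data-new }
```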
Reference:
- https://docs.google.com/document/u/0/d/1ro-M0aTBT64irBIy_n6Z-MkVubXc8ltBOiAJtXHw4oo/mobilebasic#h.dsrlcg44gbsz
- CNCF Storage Whitepaper version 2 https://docs.google.com/document/d/1Ag9PxdIe3iaMHcI5joHjyqo-x7fbIQErT02XkZz7Zvw/edit#heading=h.uvwp1nxx8pio
- Data Protection workflows https://github.com/kubernetes/community/blob/master/wg-data-protection/data-protection-workflows-white-paper.md
- Cloud Native Disaster Recovery for Stateful Workloads https://bit.ly/cncf-cloud-native-DR
- Performance & Benchmarking https://bit.ly/cncf-sig-storage-performance-benchmarking