Managing Storage with Containers


https://www.meetup.com/Docker-Bangalore/events/253542738/

Flex Volume Drivers in Kubernetes and CSI, Peeyush Gupta, IBM
=============================================================

Storage in containers

- stateless / stateful
- volumes
- dynamic provisioning
- PVC, PV and Storage Class

* A Storage Class enables dynamic provisioning
* A PVC refers to a Storage Class
* A Pod refers to a PVC for its volume

The kubelet running on each host calls the Flex Driver. The Flex Driver implements vendor-specific APIs for storage volumes: 1. mount 2. unmount 3. attach 4. detach.

The driver binary needs to be copied to a specific path on each node.
For CNI too, a plugin binary such as the Calico driver needs to be copied to a specific path on each node. The better alternative is CSI (Container Storage Interface).
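
The Flex Driver call pattern described above can be sketched as a small dispatcher. This is only an illustrative sketch, not the driver shown in the talk: it assumes the driver is invoked with an operation name followed by its arguments and answers with a JSON status, and the in-memory sets `ATTACHED` and `MOUNTED` are hypothetical stand-ins for real vendor API calls.

```python
import json

# Hypothetical in-memory state standing in for a vendor storage backend.
ATTACHED = set()
MOUNTED = set()


def handle(op, args):
    """Dispatch one Flex Driver style call and return a JSON-serialisable reply."""
    if op not in ("attach", "detach", "mount", "unmount"):
        return {"status": "Not supported", "message": "unknown operation: " + op}
    if not args:
        return {"status": "Failure", "message": op + " needs a target argument"}
    target = args[0]
    if op == "attach":
        ATTACHED.add(target)
    elif op == "detach":
        ATTACHED.discard(target)
    elif op == "mount":
        MOUNTED.add(target)
    else:  # unmount
        MOUNTED.discard(target)
    return {"status": "Success", "message": op + " " + target}


# The caller would read the reply from the driver's standard output.
print(json.dumps(handle("mount", ["/mnt/vol1"])))
```

Using `discard` for detach and unmount means a repeated call does not fail, which matches the retry-friendly behaviour an orchestrator generally expects.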

CO = Container Orchestrator. Examples: Kubernetes (K8s), Mesos, Cloud Foundry, OpenShift (by Red Hat).
CSI defines three services: 1. Node 2. Controller 3. Identity

There is a single binary for both the node and the controller. Based on the identity, it plays either the node role or the controller role.

1. Idempotent APIs
2. Sidecar containers
2.1 Driver registrar (Identity Service)
2.2 External provisioner (watches for create-volume requests)
2.3 External attacher (watches for attach requests)
2.4 Liveness probe
2.5 External snapshotter (started in mid-July 2018)
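
To illustrate the first point, idempotent APIs: repeating a create or delete call with the same arguments should leave the system in the same state, so the orchestrator can safely retry after a timeout. A minimal sketch, where the `VolumeStore` class and its in-memory dictionary are hypothetical and not part of any real CSI driver:

```python
import uuid


class AlreadyExists(Exception):
    """Raised when a name is reused with incompatible parameters."""


class VolumeStore:
    def __init__(self):
        self._by_name = {}

    def create_volume(self, name, size_bytes):
        """Create the volume, or return the existing one if the call is repeated."""
        existing = self._by_name.get(name)
        if existing is not None:
            if existing["size_bytes"] != size_bytes:
                raise AlreadyExists(name + " exists with a different size")
            return existing
        volume = {"id": str(uuid.uuid4()), "name": name, "size_bytes": size_bytes}
        self._by_name[name] = volume
        return volume

    def delete_volume(self, name):
        """Delete the volume; deleting a missing volume is not an error."""
        self._by_name.pop(name, None)


store = VolumeStore()
first = store.create_volume("pvc-demo", 1024)
retry = store.create_volume("pvc-demo", 1024)
print(first["id"] == retry["id"])
```

The retried call returns the same volume instead of creating a second one, and a repeated delete is simply a no-op.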

Containerized Gluster Storage in Kubernetes - Saravanakumar, Red Hat
====================================================================

GlusterFS was born in the oil industry, which needed to process data from many different hosts to detect the presence of oil. It is now more than 10 years old.

Steps (run all steps with sudo)

1. Install and start the glusterd service on all hosts.
2. gluster peer status
3. gluster volume create
4. gluster volume start (this starts Gluster on all hosts)
5. gluster volume status
6. mount -t glusterfs
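
Written out as full commands, steps 2 to 6 above might look like the sketch below. The host names, volume name, brick path and mount point are assumptions for illustration, the volume is a plain distributed volume, and the function only builds the command lines without running them.

```python
def gluster_setup_commands(volume, hosts, brick_path, mount_point):
    """Return the argv lists for steps 2 to 6, in order (nothing is executed)."""
    bricks = [host + ":" + brick_path for host in hosts]
    return [
        ["gluster", "peer", "status"],
        ["gluster", "volume", "create", volume] + bricks,
        ["gluster", "volume", "start", volume],
        ["gluster", "volume", "status"],
        ["mount", "-t", "glusterfs", hosts[0] + ":/" + volume, mount_point],
    ]


# Hypothetical two-host setup; every value here is a placeholder.
for argv in gluster_setup_commands("gv0", ["host1", "host2"], "/data/brick1/gv0", "/mnt/gv0"):
    print("sudo " + " ".join(argv))
```

Before the peer status check, the hosts are normally joined into one trusted pool with `gluster peer probe <host>`.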

PVC access modes
1. ROX (ReadOnlyMany): read-only by many nodes
2. RWO (ReadWriteOnce): read/write by a single node
3. RWX (ReadWriteMany): read/write by many nodes
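
As an illustration of how a PVC ties an access mode to a Storage Class, here is a sketch that builds a PersistentVolumeClaim manifest as a Python dictionary. The claim name, the storage class name `glusterfs-storage` and the size are assumptions made up for the example.

```python
import json

# Short forms used in these notes, mapped to the Kubernetes access mode names.
ACCESS_MODES = {
    "ROX": "ReadOnlyMany",
    "RWO": "ReadWriteOnce",
    "RWX": "ReadWriteMany",
}


def make_pvc(name, mode, storage_class, size):
    """Build a PersistentVolumeClaim manifest for the given short access mode."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": [ACCESS_MODES[mode]],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


# Hypothetical claim: a 1Gi volume shared read/write by many nodes.
print(json.dumps(make_pvc("shared-data", "RWX", "glusterfs-storage", "1Gi"), indent=2))
```

A Pod would then reference the claim by its name, which is how the chain Pod, PVC, Storage Class is wired together.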

Heketi provides a RESTful management interface which can be used to manage the lifecycle of GlusterFS volumes.

Storage requirements for running Spark workloads on Kubernetes, Rachit Arora
============================================================================

The Spark core engine runs over 1. YARN 2. Mesos 3. Standalone Scheduler 4. K8s
1. Spark SQL 2. Spark Streaming 3. Spark MLlib (machine learning library) 4. GraphX run over the Spark core engine

* Data Engineer: 1. Ingest and store data from multiple sources 2. Prepare data
* Data Scientist: 2. Prepare data 3. Analyze data and build models
* Application Developer: 4. Visualize data

The new trend is serverless analytics.

'Spark over K8s' provides a Jupyter Kernel Gateway for data scientists to analyze data.

Distributed FS
1. NFS and BigNFS
2. HDFS
3. DBFS (Databricks File System)
4. S3 / Object Storage
5. Portworx
6. GlusterFS

URLs: 

datascience.ibm.com 
www.ibm.com/analytics/us/en/watson-data-platform/tutorial

Twitter handle: @k8sBLR

MLCC


Let me share the key takeaways from the meetup event "Google Machine Learning Study Jam".

In February/March 2018, Google announced MLCC, the Machine Learning Crash Course. In July 2018, the MLCC Study Jam series came to India. I attended one such event with my friend, organized by the Industry 5.0 meetup.

Here are a few useful links:

TensorFlow Content Bundle, Spring 2018

Gradients and Partial Derivatives: YouTube video.
Later on, I found that the whole maths playlist is good. All videos by Eugene Khutoryansky are excellent.

Another YouTube video is by Christopher Gondek. Here too, the 'Machine Learning Visualization' playlist is good.

Microsoft announced Project Brainwave for FPGA-based edge computing.

AutoML and transfer learning are at a nascent stage at present. Once they have fully evolved, we may no longer need people who know AI/ML; the machines will learn by themselves. I did a little Googling and found a few links: http://www.ml4aad.org/automl/ and https://automl.info/

As far as I knew, for anyone who has completed a Machine Learning course but has not yet fully switched career path, Kaggle was the only platform for hands-on experience. I came to know of one more such platform: Seedbank. I found one seed, 'Piano Transcription', quite interesting. We discussed with Sanjay Chitnis the idea of creating a similar seed to recognize Indian ragas.

There is an interesting book, 'Pattern Recognition and Machine Learning (Information Science and Statistics)', by Christopher M. Bishop.

http://playground.tensorflow.org is an excellent browser-based neural network tool. It is also used as part of MLCC. We discussed L1 regularization, L2 regularization, the confusion matrix, precision, accuracy, recall, F1 score, the receiver operating characteristic (ROC) etc. Precision is about how many of the cases that the algorithm flags as positive are actually positive. Recall is about how many of all the actual positive cases the algorithm is able to detect, i.e. how good a diagnostic test is at catching them.
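
To make these metrics concrete, here is a small sketch that derives them from the four cells of a confusion matrix. The counts in the usage example are made up for illustration.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: of everything predicted positive, how much really is positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of everything that really is positive, how much was detected.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical test results: 8 true positives, 2 false positives,
# 4 false negatives and 86 true negatives.
print(classification_metrics(tp=8, fp=2, fn=4, tn=86))
```

With these counts, accuracy is 94/100, precision is 8/10 and recall is 8/12, which shows how a classifier can look very accurate on imbalanced data while still missing a third of the positive cases.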

A CNN is a combination of filters and dimension reduction. An LSTM is a special case of an RNN. GANs are widely used for generating content. The GAN Zoo has a list of all the GAN variations.

We also discussed semi-supervised learning, topic learning or keyword learning (which go beyond supervised learning), the gold standard etc.

At the end, Sanjay drew our attention to an interesting trend: product costs keep falling, product features keep increasing, and service costs keep increasing! India has lots of data available. There is good scope for data analytics and machine learning in the 2019 General Election in India.

DevOps & Digital Transformation


DevOn (Prowareness) conducted its first meetup with the core theme of "DevOps and Digital Transformation". We have seen many buzzwords in software quality. It started with CMMI, then P-CMMI, then Agile, Scrum and now DevOps. What matters most is how an organisation aligns the latest trend (like DevOps) with its business strategy, culture and customer expectations. Uber became successful with this approach.

Hariharan Ganesan busted a myth: Rolls-Royce no longer makes cars! He himself does not possess a Rolls-Royce car, nor does he get any discount to buy one. :-) He talked more about Rolls-Royce: its business, products, strategy, market, people etc. Even though Rolls-Royce is a company built on metal, nuts and bolts, DevOps is relevant to it.

We were amazed by less-known facts about the aviation industry. The first IoT-enabled business was aircraft. The engine is not sold but offered on a per-hour rental basis. It is an intelligent engine with engine-to-engine communication capability. Rolls-Royce outsources a small robot, 10 mm in size (the size of a bug), which goes inside the aircraft engine and takes photos of its interior. R2 (R squared) Data Labs is a sister concern of Rolls-Royce. The key criteria are (1) availability: avoiding flight delays and cancellations, (2) safety and (3) fuel efficiency. At R2 Data Labs, there is a 'digital twin' of each engine. A digital twin is much more than AutoCAD or CAM. It is a mathematical model (relatively simple) and/or a physical model based on CAD/CAM running on HPC. When an aircraft is flying over the Atlantic Ocean, its engine talks with other engines about yesterday's weather conditions over the Atlantic to compare. At R2 Data Labs, weather data is fed to the digital twin of the engine to fine-tune it and improve engine performance.

Prashant Kumar talked about Docker and Kubernetes. By definition, a Docker container packages code, runtime, system tools, system libraries etc. Docker is the most popular container technology. A Docker image is immutable. Docker Hub is a Docker registry, something like GitHub or Bitbucket for Git. Docker improves the DevOps workflow through (1) environment consistency, (2) isolation, (3) organizing applications and (4) portability. One must specify and use the correct version of Docker Engine for portability. Prashant gave tips about the 'dangling API' for manually cleaning up all unused objects; otherwise the file system will run out of space. Kubernetes maintains containers at scale. A pod can have one or more containers. Sometimes, for stateful services, we need to attach a volume such as EBS.

Vinay Krishna talked about 'DevOps Success Recipe: One Team One Goal'. Processes and tools are just means; they may not solve the actual problem. Vinay explained a few real-life scenarios without naming any organisation.

1. Developers redesigned a web page that was used by the customer support team, without informing them. The team faced lots of issues while answering customer calls, and at midnight the software had to be rolled back to the previous version.

2. The end users were involved from the beginning. Developers completed many modules and then started on the GUI. The end users raised concerns about the GUI, and finally the GUI was redesigned after many e-mail exchanges (to assign blame). Involving end users may not help if they attend meetings just for the sake of attending.

3. All stakeholders were involved. Code coverage was 80 percent and above. The CI/CD pipeline was green. All test cases passed. There was 100% automation. There were no severity-three bugs, yet the customer was not happy. The bugs had simply been moved to other severity levels. Many of the test cases written for code coverage were not up to the mark, and a few did not even have the needed assert statements.

Yes, all three amigos (as per Agile), business, developers and testers, must be involved as one team. Someone once commented that a task belonged to the new DevOps team. There were already three teams: (1) Developers, (2) QA and (3) Operations. The DevOps team members were the same people from the operations team; after Docker and cloud training, the team was re-branded as the DevOps team! Operations people do not like development work. Developers do not like doing the work that is done by the operations team. All need to work as one team.

4. Prashant also shared a positive example of valuable suggestions from the operations team, such as (1) add human-readable logs and (2) add feature switches that can be turned on/off.

Prashant also talked about Netflix tools like Chaos Monkey, Simian Army, Chaos Gorilla etc. They are resiliency tools that help applications tolerate random instance failures in the cloud. He suggested reading "Value Stream Mapping" and "Software Horror Stories".

Now let me share my views. This talk reminded me of similar practical facts that I learnt during my MBA. When SPC (statistical process control) was introduced, the shop-floor people did not understand the two lines on the chart recorder. They drew the chart manually to keep it within the two lines!! It also reminded me of the famous business novel 'The Goal', about ToC (theory of constraints). In the novel, people strongly rejected reporting so many numbers and statistics. Later on, the new system of green tags and red tags itself caused another issue. Now times have changed. I also remember a famous joke: when a US firm ordered something from Japan with 97% quality, the Japanese team separately prepared 3% of the pieces with poor quality.

In the IT industry too, I have observed that (1) sometimes the review process does not apply to the 'system engineers team', even though clarity in requirements is the most crucial part, and (2) sometimes quality people have unrealistic metrics for the number of comments per kLOC; to meet them, managers add dummy comments or remove genuine comments from the record. This defeats the purpose.

So in summary, human tendencies, culture and power politics cannot be ignored when bringing about change. Quality must be everyone's responsibility.

Shynish Meladath talked about Industrial IoT and DevOps. The key takeaways were:

1. Particle Photon is an IoT kit.
2. Eclipse 4diac provides an open source infrastructure for distributed industrial process measurement and control systems based on the IEC 61499 standard.
3. resin.io brings Linux containers to IoT.
4. Kaa is an open source IoT platform.
5. Spanner CI is continuous integration for IoT.
6. Macchina is a versatile platform for cars.

I missed the last session, "Data-Driven DevOps" by Ashwin Shankarananda, due to another priority.

Disclaimer: I did my best to capture notes and key takeaways from the event. However, the content reflects my understanding, and it may or may not reflect the speakers' original content or intention. Any corrections are welcome.

Reference https://www.meetup.com/Digital-DevOps-Bangalore/events/252154106/


OpenStack meetup


On 16th June 2018, I attended the OpenStack Meetup at the Ericsson office. Let me share my notes for the readers of this blog: Express YourSelf !

Shashi Singh from Altiostar Networks discussed EPA (Enhanced Platform Awareness).

EPA is about making the NFVO, VNFM and VIM aware that specialized hardware is available below the virtualization layer, e.g. high I/O throughput, high-performance CPUs, GPUs, crypto accelerators and many more.

Shashi nicely explained the NFV MANO architecture to build the context and introduce the acronyms. Telco NFVI providers include Red Hat, Wind River, VMware, Mirantis etc. VNFMs are categorized as specific VNFMs and generic VNFMs. A VNFM supports three interfaces: Ve-Vnfm-vnf, Vi-Vnfm and Or-Vnfm.

He explained how EPA can eliminate the need for data packets to pass through the virtualization layer if the required VNFs are running on the same CPU socket. I confirmed my understanding that one example of EPA is giving all VNFs for user-plane data affinity to a single CPU. We also discussed SR-IOV (single root input/output virtualization), CPU pinning, threading policy etc. Sometimes, within a storage node, one can leverage SR-IOV, DPDK etc. to support more I/O. The TOSCA standard defines the combination of NSD (Network Service Descriptor) and VNFD (VNF Descriptor). Shashi also mentioned the Queens release, the Cyborg framework, Nova, Ironic etc.
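
As a sketch of how CPU pinning and threading policy are typically requested in OpenStack, Nova reads extra specs set on a flavor. The flavor name `vnf.userplane` below is hypothetical and the chosen values are only one possible combination, not something shown in the talk; the function just builds the `openstack flavor set` command line and does not run it.

```python
# Flavor extra specs commonly used for EPA-style placement.
EPA_EXTRA_SPECS = {
    "hw:cpu_policy": "dedicated",       # pin each vCPU to its own host CPU
    "hw:cpu_thread_policy": "isolate",  # do not share a core with sibling threads
    "hw:numa_nodes": "1",               # keep the guest on a single NUMA node
}


def flavor_set_command(flavor, extra_specs):
    """Build the argv for setting extra specs on a flavor (nothing is executed)."""
    argv = ["openstack", "flavor", "set"]
    for key in sorted(extra_specs):
        argv += ["--property", key + "=" + extra_specs[key]]
    argv.append(flavor)
    return argv


# Hypothetical flavor for user-plane VNFs.
print(" ".join(flavor_set_command("vnf.userplane", EPA_EXTRA_SPECS)))
```

Instances booted from such a flavor are scheduled only onto hosts that can satisfy the pinning and NUMA constraints.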

Here is a list of Intel technologies for EPA:

1. Intel Advanced Encryption Standard New Instructions (Intel AES-NI)
2. Intel Advanced Vector Extensions (Intel AVX) and AVX2
3. Intel Quick Sync Video Technology
4. Intel QuickAssist Technology for encryption/decryption and compression/decompression
5. Intel Trusted Execution Technology (Intel TXT)
6. Intel Node Manager: server management in the data center
7. Data Plane Development Kit (DPDK) on Xeon processors
8. SR-IOV
9. Intel Xeon Phi coprocessor: for PCIe

I came to know about this website: https://www.telecomtv.com/ During the tea break, someone commented that Kubernetes is now open source, but it is very old; Google is working on a new technology/product named Omega that is yet to be open sourced.

Palaniswamy from Tech M explained ManageIQ (with a demonstration) as a multi-cloud management platform. ManageIQ supports public clouds like Amazon Web Services, Microsoft Azure and Google Cloud Platform; OpenStack-based private clouds; and container platforms like Kubernetes, OpenShift Origin etc. ManageIQ internally uses a PostgreSQL DB. Ansible Tower is used for configuration and automation.

Sukant J R and Manoranjan Sahoo from Ericsson presented CI/CD for containerized OpenStack development based on Helm.

We also discussed the four types of people who always remain in the IT industry: (1) developers, (2) support engineers, (3) integrators and (4) testers. New technologies come and go. One needs to work according to one's core strength.

Apart from that, Uday T. Kumar from Ericsson shared some insights about the OpenStack Summit and how to contribute to the OpenStack community. He also acknowledged that the Bangalore OpenStack community is very active in sharing the latest updates. Later on, those updates become known to the entire world at the OpenStack Summit.

Disclaimer: I captured these notes as per my understanding, on a best-effort basis, so they may not accurately reflect the speakers' views. Any corrections are welcome.

Reference: 
https://www.meetup.com/Indian-OpenStack-User-Group/events/249891291/
https://01.org/sites/default/files/page/openstack-epa_wp_fin.pdf
https://networkbuilders.intel.com/network-technologies/enhancedplatformawareness

My Holiday Destination