Epic history of LLM
RNN. Seq to seq NLP tasks.
1. Many to one: Sentiment analysis
2. One to many: Image captioning
3. Many to Many:
- Synchronous many to many: number of inputs = number of outputs. E.g. part-of-speech tagging, named entity recognition
- Asynchronous many to many: translation, text summarization, question answering, chatbot, speech to text
Seq2seq model is used for Many to Many
Stage 1: 2014 Encoder decoder network
Encoder and decoder are LSTMs. A plain RNN or a GRU are other options.
It works well for short sentences, but not for sentences of 30+ words.
BLEU score
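A minimal sketch of how a BLEU-style score can be computed, assuming a single reference translation, whitespace tokenization, and n-grams up to length 2 (the original metric uses up to 4-grams and supports multiple references). The function names and example sentences are illustrative only.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision: a candidate n-gram is credited at most
    as many times as it occurs in the reference."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return clipped / total


def bleu(candidate, reference, max_n=2):
    """Geometric mean of modified precisions times a brevity penalty."""
    precisions = [modified_precision(candidate, reference, n) for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    # Brevity penalty: punish candidates shorter than the reference.
    if len(candidate) > len(reference):
        penalty = 1.0
    else:
        penalty = math.exp(1 - len(reference) / len(candidate))
    return penalty * geo_mean


reference = "the cat sat on the mat".split()
print(bleu("the cat sat on the mat".split(), reference))
print(bleu("the cat is on the mat".split(), reference))
```

A score closer to 1 means the candidate shares more n-grams with the reference.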
Stage 2: 2015 Attention Mechanism
Encoder is the same as in Stage 1.
Attention mechanism: an attention layer at the decoder works out which encoder hidden states are useful at each decoder step and generates a context vector for that step. So, instead of one fixed vector, multiple context vectors built from the encoder's hidden states (the LSTM's c_t, h_t vectors) are available to the decoder.
Training time is higher.
2015 to 2017: Many types of attention mechanisms were introduced.
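A minimal sketch of how one context vector is built for one decoder step. It assumes dot-product scoring for simplicity (the 2015 paper used a small learned network for scoring instead), and the vectors are made-up numbers.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def context_vector(decoder_state, encoder_states):
    """Score every encoder hidden state against the current decoder state,
    then return the attention weights and their weighted sum."""
    scores = [sum(d * h for d, h in zip(decoder_state, state)) for state in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [sum(w * state[i] for w, state in zip(weights, encoder_states)) for i in range(dim)]
    return weights, context


# Three encoder hidden states (one per input word), two dimensions each.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

# A different decoder state at each step gives a different context vector.
for decoder_state in ([4.0, 0.0], [0.0, 4.0]):
    weights, context = context_vector(decoder_state, encoder_states)
    print(weights, context)
```

The extra scoring pass over all encoder states at every decoder step is one reason training takes longer than with the plain encoder decoder network.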
Stage 3: 2017 Transformer
No LSTM
No RNN Cell
Self-attention was introduced
Both encoder and decoder use attention
Transformer can process all words in parallel
1. Attention layer = Multi Head Attention
2. Normalization Layer
3. Dense Layer
4. Input embeddings
It needs hardware, time, and data
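A minimal sketch of single-head scaled dot-product self-attention. It assumes queries, keys, and values are the input embeddings themselves; a real Transformer applies learned projection matrices first and runs several heads side by side (multi-head attention). The token vectors are made-up numbers.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(embeddings):
    """Each token attends to every token of the same sequence.
    The rows of the result do not depend on each other, so all positions
    can be computed in parallel, unlike the step-by-step RNN."""
    dim = len(embeddings[0])
    scale = math.sqrt(dim)
    outputs = []
    for query in embeddings:
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in embeddings]
        weights = softmax(scores)
        outputs.append(
            [sum(w * value[i] for w, value in zip(weights, embeddings)) for i in range(dim)]
        )
    return outputs


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
for row in self_attention(tokens):
    print(row)
```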
Stage 4: 2018 Jan Transfer Learning
Challenges
1 A single model cannot perform all tasks like sentiment analysis, translation, summarization
2 Training from scratch needs lots of labeled data
Universal Language Model Fine-tuning (ULMFiT) proposed using language modelling as the pre-training task. Language modelling is the NLP task of predicting the next word. Advantages
1. Rich feature training
2. unsupervised task
model: AWD-LSTM
data set: Wikipedia
Fine-tuning: the output layer was changed to a classifier and tuned on several data sets
From scratch: 10,000 labeled examples. With fine-tuning: 100 labeled examples still give a better result
- No transformer
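A toy sketch of why language modelling needs no labels: the next word in raw text is the target. It assumes a simple bigram count model, not the AWD-LSTM that ULMFiT used, and the three-sentence corpus is made up.

```python
from collections import Counter, defaultdict


def train_bigram_model(corpus):
    """Count, for every word, which words follow it in the raw text."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.lower().split()
        for current_word, next_word in zip(tokens, tokens[1:]):
            counts[current_word][next_word] += 1
    return counts


def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


corpus = [
    "the cat sat on the mat",
    "the cat ate the fish",
    "a dog sat on the rug",
]
model = train_bigram_model(corpus)
print(predict_next(model, "sat"))
print(predict_next(model, "the"))
```

Pre-training does the same thing at scale with a neural network, and fine-tuning then swaps the next-word output for a task-specific classifier.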
Now in 2018, we have two technologies
1. architecture: transformer
2. training: pre-training and transfer learning
Stage 5: 2018 Oct LLM
Transfer learning on transformer
1. Google : BERT (encoder only model)
2. OpenAI: GPT (decoder only model)
LM to LLM
1. data
2 hardware GPU clusters
3 time : days to weeks
4. cost = h/w + electricity + people + infra
5. energy consumption
---------------
GPT-3 -> ChatGPT
1. RLHF : Reinforcement Learning from Human Feedback
2. incorporate safety and ethical guidelines
3. improved handling of conversational context
4. dialogue specific
5. continuous improvement based on user feedback
Reference https://www.youtube.com/watch?v=8fX3rOjTloc&list=PPSV
Sadhana Mantra
Now let us chant the Sadhana Mantra together as a group.
DSPy
DSPy = Declarative Self-improving Python.
Components
1. language model — LLM that will answer our questions,
2. signature —a declaration of the program’s input and output (what task we want to solve),
- 1. inline
- 2. class
dspy.InputField()
List[Literal['', '', '']] = dspy.OutputField()
3. module — the prompting technique (how we want to solve the task).
- Building blocks
- different prompting strategies,
- 1. dspy.Predict
- 2. dspy.ChainOfThought
- 3. dspy.ReAct (to add tools, i.e. function calling)
4. Optimiser
- 1. Automatic few-shot learning (e.g. BootstrapFewShot or BootstrapFewShotWithRandomSearch)
- 2. Automatic instructions optimisation (e.g. MIPROv2)
- 3. Automatic fine-tuning (e.g. BootstrapFinetune)
Other points
- dspy.inspect_history for logs
- Caching
# 1. updating config
dspy.configure_cache(enable_memory_cache=False, enable_disk_cache=False)
# 2. not using cache for specific module
math = dspy.Predict("question -> answer: float", cache = False)
- dspy.configure(adapter=dspy.JSONAdapter())
- DSPy is integrated with MLflow (an observability tool)
HashiCorp User Group Bangalore Meetup #1 : Powering the Multi-Cloud Era
Alternatives for IDP
(1) https://github.com/JanssenProject/jans https://github.com/JanssenProject/jans/tree/main/jans-keycloak-link https://imshakil.medium.com/janssen-mod-auth-openidc-module-to-test-openid-connect-single-sign-on-s… It is by Gluu
(2) Vault itself supports OIDC https://developer.hashicorp.com/vault/docs/secrets/identity/oidc-provider https://brian-candler.medium.com/using-vault-as-an-openid-connect-identity-provider-ee0aaef2bba2
SQL++ is for JSON data. https://www.couchbase.com/sqlplusplus/
https://techmilap.com/ is a free website for hosting events
Vault can provide dynamic, temporary secrets for each identity a consumer uses to access data, so later on we can audit who has accessed the data. In our case, pods use a ServiceAccount (SA), so we get one dynamic secret per ServiceAccount. That means we cannot audit which pod accessed the data; we can only audit which ServiceAccount accessed it. The dynamic secret is short-lived, so it cannot be reused once it expires. The SA itself can be used as many times as we want.
Vault secures data in transit with TLS, and it offers other encryption methods to applications as "encryption as a service".
In Terraform, the state file is the most confidential artifact, because it can contain secrets in plain text.
Nomad is an alternative to K8s. It can also manage VMs using the QEMU driver. Consul is used for networking and service discovery. Fabio is for ingress and load balancing in Nomad.
Identity Provider
https://github.com/pando85/kaniop Kaniop is a Kubernetes operator for managing Kanidm.
https://kanidm.com/ Kanidm is a modern, secure identity management system that provides authentication and authorization services with support for POSIX accounts, OAuth2, and more. It is simple and written in Rust.
-------------
Why Choose Keycloak?. Understanding the Need for an Identity… | by J3 | Jungletronics | Medium
Ory
GitHub - ory/k8s: Kubernetes Helm Charts for the ORY ecosystem. · GitHub



