LLMOps
For AI applications, we need to automate:
1. Data preparation
2. Model tuning
3. Deployment
4. Maintenance
5. Monitoring
- Managing dependencies adds complexity.
Together, these form the E2E workflow for an LLM-based application.
MLOps framework
1. Data ingestion
2. Data validation
3. Data transformation
4. Model training/tuning
5. Model analysis
6. Model serving
7. Logging
LLM System Design
Broader design of the E2E app, including front end, back end, data engineering, etc.
Chain multiple LLMs together
* Grounding: provide additional information/facts along with the prompt to the LLM.
* Track history: keep a record of past interactions so the app knows how it behaved previously.
LLM App
User input -> Preprocessing -> Grounding -> Prompt goes to the LLM -> LLM response -> Grounding -> Post-processing + Responsible AI checks -> Final output to user.
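A minimal sketch of that flow in Python. Every function below is a hypothetical placeholder (not a real API), just to show where each stage sits:

```python
def preprocess(user_input: str) -> str:
    # Clean up the raw user input
    return user_input.strip()

def ground(prompt: str, facts: list[str]) -> str:
    # Grounding: attach retrieved facts/context to the prompt
    context = "\n".join(facts)
    return f"Context:\n{context}\n\nQuestion: {prompt}"

def call_llm(prompt: str) -> str:
    # Placeholder for the real model call (e.g., a deployed endpoint)
    return f"(model answer to: {prompt!r})"

def postprocess(response: str) -> str:
    # Post-processing + Responsible AI checks before returning to the user
    return response if response else "Sorry, no answer was generated."

facts = ["Fact retrieved from a knowledge base."]
print(postprocess(call_llm(ground(preprocess(" What is LLMOps? "), facts))))
```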
Model Customization
1. Data Prep
2. Model Tuning
3. Evaluate
It is an iterative process: evaluation results feed back into data preparation and tuning.
LLMOps Pipeline (Simplified)
1. Data Preparation and versioning (for training data)
2. Supervised tuning (pipeline)
3. Artifacts are generated: a config and a workflow.
- Config = configuration for the workflow
E.g., which dataset to use
- Workflow = the steps to execute
4. Pipeline execution
5. Deploy LLM
6. Prompting and predictions
7. Responsible AI
Orchestration = steps 1 + 2. Orchestration defines what runs first, what runs next, and so on; it assures the correct sequence of steps.
Automation = steps 4 + 5.
Fine-Tuning Data: Using Instructions (Hints)
Include instructions in the training examples, such as:
1. Rules
2. Step-by-step guidance
3. Procedures
4. Examples
(A JSONL sketch follows the file-format list below.)
File formats
1. JSONL (JSON Lines): human readable; good for small and medium datasets.
2. TFRecord: binary format, efficient for TensorFlow training.
3. Parquet: columnar format for large and complex datasets.
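A sketch of writing instruction-tuning records as JSONL, one JSON object per line. The field names (input_text, output_text) are illustrative, not a required schema:

```python
import json

# Illustrative instruction-tuning examples; field names are hypothetical.
examples = [
    {
        "input_text": "Instruction: answer step by step.\nQuestion: What is LLMOps?",
        "output_text": "LLMOps automates data prep, tuning, deployment, and monitoring...",
    },
]

# JSONL = one JSON object per line
with open("tune_data.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")
```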
MLOps Workflow for LLM
1. Apache Airflow
2. Kubeflow Pipelines (KFP)
DSL = Domain-Specific Language
Decorators (from the Kubeflow Pipelines DSL):
@dsl.component
@dsl.pipeline
Next, the compiler generates a YAML file for the pipeline.
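A minimal sketch of these decorators with KFP v2; the component and parameter names are illustrative:

```python
from kfp import dsl, compiler

@dsl.component
def say_hello(name: str) -> str:
    # Each component runs as its own containerized step
    hello_text = f"Hello, {name}!"
    print(hello_text)
    return hello_text

@dsl.pipeline
def hello_pipeline(recipient: str) -> str:
    # The pipeline wires component outputs to downstream steps
    hello_task = say_hello(name=recipient)
    return hello_task.output

# Compile the pipeline definition into pipeline.yaml
compiler.Compiler().compile(hello_pipeline, package_path="pipeline.yaml")
```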
YAML file has
- components
- deploymentSpec
Pipeline can be run on
- K8s
- Vertex AI Pipelines, which executes the pipeline in a serverless environment
PipelineJob takes inputs (see the sketch below):
1. Template path: pipeline.yaml
2. Display name
3. Parameters: values passed into the pipeline
4. Location: which data center/region runs the job
5. Pipeline root: storage location for temporary/intermediate artifacts
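A sketch of submitting the compiled pipeline with the Vertex AI SDK; the project, bucket, and parameter names are hypothetical placeholders:

```python
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

job = aiplatform.PipelineJob(
    template_path="pipeline.yaml",                 # 1. compiled pipeline
    display_name="hello-pipeline",                 # 2. display name
    parameter_values={"recipient": "World"},       # 3. pipeline parameters
    location="us-central1",                        # 4. region
    pipeline_root="gs://my-bucket/pipeline_root",  # 5. artifact staging location
)
job.submit()  # runs serverlessly on Vertex AI Pipelines
```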
Open Source Pipeline
https://us-kfp.pkg.dev/ml-pipeline/large-language-model-pipelines/tune-large-model/v2.0.0
Deployment
Two modes: batch and REST (both sketched below).
1. Batch: e.g., scoring customer reviews; not real time.
2. REST API: e.g., chat; serves requests in real time.
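A sketch of the two calling patterns with the Vertex AI SDK; the model, endpoint, and bucket names below are placeholders:

```python
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

# Batch: score a whole file of inputs offline, e.g., customer reviews
model = aiplatform.Model("projects/my-project/locations/us-central1/models/123")
model.batch_predict(
    job_display_name="review-scoring",
    gcs_source="gs://my-bucket/reviews.jsonl",
    gcs_destination_prefix="gs://my-bucket/predictions/",
    instances_format="jsonl",
)

# REST/online: one request, one real-time response, e.g., a chat turn
endpoint = aiplatform.Endpoint(
    "projects/my-project/locations/us-central1/endpoints/456"
)
response = endpoint.predict(instances=[{"prompt": "Hello!"}])
print(response.predictions)
```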
* pprint is a Python library for pretty-printing output.
The LLM returns its output along with 'safetyAttributes'
- including a 'blocked' flag
* Citations can also be found in the LLM's output.
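A sketch of inspecting such a response with pprint. The dict below only illustrates the general shape of a text-model prediction; it is not an exact schema:

```python
from pprint import pprint

# Illustrative response shape; the values are made up.
prediction = {
    "content": "LLMOps automates the LLM lifecycle.",
    "safetyAttributes": {"blocked": False, "categories": ["Finance"], "scores": [0.1]},
    "citationMetadata": {"citations": [{"url": "https://example.com/source"}]},
}

pprint(prediction["safetyAttributes"])  # pretty-print nested safety info
if prediction["safetyAttributes"]["blocked"]:
    print("Output was blocked by safety filters.")
pprint(prediction["citationMetadata"]["citations"])  # sources cited by the model
```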
===========
Vertex AI SDK
https://cloud.google.com/vertex-ai
BigQuery
https://cloud.google.com/bigquery
sklearn
Used to split the data 80/20 for training and evaluation (see the sketch below).
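A sketch of the 80/20 split with scikit-learn; the dataframe contents are made up:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical dataframe of tuning examples
df = pd.DataFrame({
    "input_text": ["q1", "q2", "q3", "q4", "q5"],
    "output_text": ["a1", "a2", "a3", "a4", "a5"],
})

# 80% for training, 20% held out for evaluation
train_df, eval_df = train_test_split(df, test_size=0.2, random_state=42)
print(len(train_df), len(eval_df))  # 4 1
```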
Building AI/ML apps in Python with BigQuery DataFrames | Google Cloud Blog
===========