LLMOps


For AI applications, we need automation of:

1. Data preparation
2. Model tuning
3. Deployment
4. Maintenance
5. Monitoring

  • Managing dependencies adds complexity. 

E2E workflow for an LLM-based application. 

MLOps framework

1. Data ingestion

2. Data validation

3. Data transformation

4. Model training

5. Model analysis

6. Model serving

7. Logging
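The seven stages above can be sketched as one linear pipeline. This is a minimal illustration with toy inline data; all function names and bodies are hypothetical placeholders, not a real framework:

```python
import logging

# Minimal sketch of the MLOps stages as a linear pipeline.
# Every function here is a hypothetical stand-in.

def data_ingestion():
    # 1. Pull raw records from a source (toy inline data).
    return [{"text": "Great product ", "label": 1},
            {"text": "terrible", "label": 0}]

def data_validation(records):
    # 2. Drop records missing required fields.
    return [r for r in records if "text" in r and "label" in r]

def data_transformation(records):
    # 3. Normalize the text.
    return [{**r, "text": r["text"].lower().strip()} for r in records]

def model_training(records):
    # 4. Stand-in "model": the majority-class label.
    labels = [r["label"] for r in records]
    return max(set(labels), key=labels.count)

def model_analysis(model, records):
    # 5. Accuracy of the stand-in model on the same records.
    return sum(r["label"] == model for r in records) / len(records)

def serve(model):
    # 6. Return a callable "endpoint" wrapping the model.
    return lambda text: model

def run_pipeline():
    logging.basicConfig(level=logging.INFO)
    records = data_transformation(data_validation(data_ingestion()))
    model = model_training(records)
    accuracy = model_analysis(model, records)
    logging.info("accuracy=%s", accuracy)  # 7. logging
    return serve(model), accuracy
```

In a real framework (Airflow, Kubeflow) each of these functions would become a separate pipeline step with its own inputs and outputs.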

LLM System Design

Broader design of the E2E app, including front end, back end, data engineering, etc. 

Chain multiple LLMs together

* Grounding: provide additional information/facts along with the prompt to the LLM. 

* Track history: keep a record of how the app has behaved in the past. 

LLM App

User input -> Preprocessing -> Grounding -> Prompt goes to the LLM -> LLM response -> Grounding -> Post-processing + Responsible AI -> Final output to the user.
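A minimal sketch of that request flow, with a stubbed-out LLM call; all helper names are hypothetical, and the Responsible AI check is a trivial placeholder:

```python
def preprocess(user_input):
    # Basic cleanup of the raw user input.
    return user_input.strip()

def ground(prompt, facts):
    # Grounding: attach additional facts to the prompt.
    return f"Context: {'; '.join(facts)}\nQuestion: {prompt}"

def call_llm(prompt):
    # Stub standing in for a real LLM endpoint call.
    return f"Answer based on: {prompt!r}"

def postprocess(response):
    # Responsible AI check (trivial placeholder) + formatting.
    blocked = "forbidden" in response.lower()
    return "[blocked]" if blocked else response

def handle_request(user_input, facts):
    prompt = ground(preprocess(user_input), facts)
    response = call_llm(prompt)
    return postprocess(response)
```

Note that grounding appears on the way in (facts added to the prompt) and can also apply on the way out (checking the response against those facts), matching the two "Grounding" boxes in the flow above.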

Model Customization

1. Data Prep

2. Model Tuning

3. Evaluate

It is an iterative process.

LLMOps Pipeline (Simplified)

1. Data Preparation and versioning (for training data)

2. Supervised tuning (pipeline) 

3. Artifacts are generated: config and workflow

- Config = configuration for the workflow, e.g. which dataset to use

- Workflow = the steps to execute

4. Pipeline execution

5. Deploy the LLM 

6. Prompting and predictions

7. Responsible AI

Orchestration = steps 1 + 2. Orchestration decides what comes first, what comes next, and so on: it ensures the steps run in the right sequence. 

Automation = steps 4 + 5

Fine-Tuning the Model Using Instruction Data (Hints)

1. Rules

2. Step-by-step guidance

3. Procedures

4. Examples

File formats

1. JSONL (JSON Lines): human-readable; for small and medium-sized datasets. 

2. TFRecord: binary format, efficient for TensorFlow. 

3. Parquet: for large and complex datasets. 
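For example, a JSONL instruction-tuning dataset holds one JSON object per line; the field names below are hypothetical, so check what your tuning pipeline expects:

```python
import json

# Hypothetical instruction-tuning records.
records = [
    {"input_text": "Summarize: LLMOps automates tuning and deployment.",
     "output_text": "LLMOps automates the LLM lifecycle."},
    {"input_text": "Classify sentiment: great course!",
     "output_text": "positive"},
]

# Write one JSON object per line.
with open("train.jsonl", "w") as f:
    for r in records:
        f.write(json.dumps(r) + "\n")

# Read it back line by line.
with open("train.jsonl") as f:
    loaded = [json.loads(line) for line in f]
```

Because each line is an independent record, JSONL files can be streamed and appended to without parsing the whole file, which is what makes the format convenient for training data.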

MLOps Workflow for LLM

1. Apache Airflow

2. Kubeflow

DSL = Domain Specific Language

Decorators:

@dsl.component

@dsl.pipeline

Next, the compiler generates a YAML file for the pipeline.

YAML file has

- components

- deploymentSpec
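A compiled pipeline YAML roughly has this shape; the fragment below is abridged and the field values are illustrative:

```yaml
# Abridged, illustrative structure of a compiled pipeline YAML.
components:
  comp-say-hello:
    executorLabel: exec-say-hello
deploymentSpec:
  executors:
    exec-say-hello:
      container:
        image: python:3.9
```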

Pipeline can be run on

- K8s

- Vertex AI Pipelines, which executes the pipeline in a serverless environment

PipelineJob takes inputs

1. Template path: pipeline.yaml

2. Display name

3. Parameters

4. Location: data-center region

5. Pipeline root: temporary/staging file location
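The five inputs above map onto keyword arguments like these; all values below are hypothetical:

```python
# Hypothetical values for the five PipelineJob inputs listed above.
pipeline_job_args = {
    "template_path": "pipeline.yaml",                  # 1. compiled pipeline definition
    "display_name": "tune-llm-demo",                   # 2. human-readable run name
    "parameter_values": {                              # 3. pipeline parameters
        "training_data_uri": "gs://my-bucket/train.jsonl",
    },
    "location": "us-central1",                         # 4. data-center region
    "pipeline_root": "gs://my-bucket/pipeline-root",   # 5. staging/temp location
}
```

With the Vertex AI SDK this dict would typically be unpacked into `aiplatform.PipelineJob(**pipeline_job_args)` and submitted; that call needs GCP credentials, so it is not executed here.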

Open Source Pipeline

https://us-kfp.pkg.dev/ml-pipeline/large-language-model-pipelines/tune-large-model/v2.0.0

Deployment

Batch and REST

1. Batch, e.g. processing customer reviews; not real time. 

2. REST API, e.g. chat; more like real time. 

* pprint is a Python library to pretty-print (format) output. 

The LLM provides its output along with 'safetyAttributes'

- blocked (whether the response was blocked)

* We can also find citations in the output of the LLM
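For example, pprint can format such a response payload; the dict below is a hypothetical illustration of the fields mentioned above, not an exact API schema:

```python
from pprint import pprint

# Hypothetical response payload illustrating the fields above.
response = {
    "predictions": [{
        "content": "LLMOps applies MLOps practices to large language models.",
        "safetyAttributes": {"blocked": False, "categories": []},
        "citationMetadata": {"citations": []},
    }]
}

# pprint formats nested structures with indentation and line wrapping.
pprint(response, width=72)

blocked = response["predictions"][0]["safetyAttributes"]["blocked"]
```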

===========

Vertex AI SDK

https://cloud.google.com/vertex-ai

BigQuery 

https://cloud.google.com/bigquery

sklearn

To split the data 80-20% into training and evaluation sets. 
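A minimal sketch of such an 80/20 split with scikit-learn, using toy data in place of a real dataset:

```python
from sklearn.model_selection import train_test_split

# Ten toy examples; in practice these would be rows of the dataset.
examples = [f"example {i}" for i in range(10)]

# Hold out 20% for evaluation; a fixed seed makes the split reproducible.
train, evaluation = train_test_split(examples, test_size=0.2, random_state=42)
```

Fixing `random_state` matters in an MLOps setting: reruns of the pipeline produce the same split, so evaluation numbers stay comparable across runs.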

Building AI/ML apps in Python with BigQuery DataFrames | Google Cloud Blog

===========
