LLM tuning on GCP is a repository of code for training, serving, and MLOps for LLMs on Google Cloud Platform.
LLMs have shown state-of-the-art performance across domains and tasks. Although LLM adoption is growing rapidly, a foundation model is never a perfect fit for a domain-specific task or vertical use case. Building or tuning a first-party model helps customers improve accuracy and efficiency on such tasks.
With this repo, customers can use code and scripts pre-validated by GCP engineers to start their journey of LLM training/tuning, serving, and MLOps. We hope this guide will be a lighthouse on the GCP LLM journey.
- Training: Training/Fine-tuning with DeepSpeed on Vertex AI, Training/Fine-tuning with FSDP on Vertex AI, Training/Fine-tuning with DeepSpeed on GKE, Training/Fine-tuning on TPU
- Serving: Serving with vLLM on GCE, Serving with vLLM on GKE, Serving with vLLM on Vertex AI, Serving with FastChat on GKE
- MLOps: End-to-end MLOps LLM pipeline (from training to serving) on Vertex AI
Detailed documentation can be found in the subfolders (Train, Serve, MLOps).