Welcome to the GitHub repository for the ODSC workshop on LLMOps. This workshop is designed to help you unlock the full potential of LLMs through quantization, distillation, fine-tuning, Kubernetes, and so much more!
Most of these case studies are from my book: Quick Start Guide to LLMs
For more details and to join the workshop, click here.
Dive into the practical side with our comprehensive notebooks. They will guide you step by step through the two case studies covered in the workshop, making for an interactive, hands-on learning experience.
Here are the slides for the workshop.
Quantizing Llama-3 dynamically - Using bitsandbytes to quantize a model in real time as it loads. We will compare the model before and after quantization.
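To give a feel for what happens under the hood, here is a minimal, dependency-free sketch of the absmax int8 idea behind dynamic quantizers like bitsandbytes (the real library quantizes tensors block-wise on the GPU; this toy version works on a plain Python list):

```python
# Absmax int8 quantization sketch: scale each weight by 127 / max(|w|),
# round to an int8 value, and keep the scale so the weights can be
# approximately recovered (dequantized) later.

def quantize_absmax(weights):
    """Quantize a list of floats to int8 values plus a per-tensor scale."""
    absmax = max(abs(w) for w in weights)
    scale = absmax / 127 if absmax else 1.0
    return [round(w / scale) for w in weights], scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 values."""
    return [v * scale for v in q]

weights = [0.1, -0.5, 0.25, 1.0]
q, scale = quantize_absmax(weights)
approx = dequantize(q, scale)  # close to the original weights
```

The gap between `weights` and `approx` is the quantization error we inspect in the notebook when comparing model behavior before and after quantization.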
See how to load a pre-quantized version of Llama to compare speed and memory usage:
Working with GGUF (no GPU)
Working with GGUF (with a GPU)
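For the speed and memory comparison, a generic measurement helper like the one below is one way to do it on the Python side. This is a sketch, not the notebook's code: `tracemalloc` only sees Python-heap allocations (CPU-side, not GPU memory), and `fake_load` is a stand-in for whichever model-loading call you are profiling:

```python
import time
import tracemalloc

def profile(fn, *args, **kwargs):
    """Run fn once and return (result, seconds, peak_heap_bytes).

    Note: tracemalloc tracks Python-heap allocations only, so this
    measures CPU-side memory, not GPU memory.
    """
    tracemalloc.start()
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, elapsed, peak

# Stand-in "loader" so the sketch runs anywhere; swap in your real
# model-loading call (e.g. a GGUF load) to compare the two setups.
def fake_load():
    return [0.0] * 100_000  # allocate something measurable

model, seconds, peak = profile(fake_load)
```

Running `profile` once per loading path (quantized vs. full-precision) gives directly comparable load times and peak allocations.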
Evaluating LLMs with Rubrics - Exploring a rubric prompt to evaluate generative output.
Evaluating Alignment (time permitting) - Seeing how an LLM can judge an agent's responses.
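To illustrate the rubric idea, here is a minimal sketch of a rubric-style evaluation prompt. This is not the workshop's exact rubric; the criteria and template are illustrative, and the resulting prompt would be sent to a judge LLM:

```python
# A rubric prompt asks a judge LLM to score a response on named
# criteria, returning one integer score per criterion.

RUBRIC_TEMPLATE = """You are grading an AI assistant's answer.

Question: {question}
Answer: {answer}

Score the answer from 1 (poor) to 5 (excellent) on each criterion:
- Accuracy: is the answer factually correct?
- Relevance: does it address the question?
- Clarity: is it easy to follow?

Respond with one line per criterion, e.g. "Accuracy: 4"."""

def build_rubric_prompt(question: str, answer: str) -> str:
    """Fill the rubric template for one (question, answer) pair."""
    return RUBRIC_TEMPLATE.format(question=question, answer=answer)

prompt = build_rubric_prompt("What is 2 + 2?", "4")
```

Because the judge is instructed to emit one `Criterion: score` line per criterion, its output is easy to parse into per-criterion numbers for aggregation across a test set.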
Here are some notebooks that I reference during the workshop but won't have time to get into:
If you enjoyed the case studies, please consider giving my book a 5-star rating on Amazon, as it really helps me as an author!