English | MP4 | AVC 1280×720 | AAC 44 kHz 2ch | 14h 25m | 2.57 GB
Learn how to put Large Language Model-based applications into production safely and efficiently.
This practical book offers clear, example-rich explanations of how LLMs work, how you can interact with them, and how to integrate LLMs into your own applications. Find out what makes LLMs so different from traditional software and ML, discover best practices for working with them outside the lab, and dodge common pitfalls with expert advice.
In LLMs in Production you will:
- Grasp the fundamentals of LLMs and the technology behind them
- Evaluate when to use a premade LLM and when to build your own
- Efficiently scale up an ML platform to handle the needs of LLMs
- Train LLM foundation models and finetune an existing LLM with parameter-efficient techniques like PEFT and LoRA
- Deploy LLMs to the cloud and to edge devices
- Build applications leveraging the strengths of LLMs while mitigating their weaknesses
LLMs in Production delivers vital insights into LLMOps so you can guide a model smoothly into production use. Inside, you’ll find practical coverage of everything from acquiring an LLM-suitable training dataset to building a platform and compensating for a model’s immense size, plus tips and tricks for prompt engineering, retraining and load testing, managing costs, and ensuring security.
Most business software is developed and improved iteratively, and can change significantly even after deployment. By contrast, because LLMs are expensive to create and difficult to modify, they require meticulous upfront planning, exacting data standards, and carefully executed technical implementation. Integrating LLMs into production products impacts every aspect of your operations plan, including the application lifecycle, data pipeline, compute cost, security, and more. Get it wrong, and you may have a costly failure on your hands.
LLMs in Production teaches you how to develop an LLMOps plan that can take an AI app smoothly from design to delivery. You’ll learn techniques for preparing an LLM dataset, cost-efficient training hacks like LoRA and RLHF, and industry benchmarks for model evaluation. Along the way, you’ll put your new skills to use in three exciting example projects: creating and training a custom LLM, building a VS Code AI coding extension, and deploying a small model to a Raspberry Pi.
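To give a flavor of one of the "cost-efficient training hacks" mentioned above: LoRA (low-rank adaptation) freezes a model's full weight matrix W and trains only two small low-rank factors A and B, applying W' = W + (alpha / r) · BA at inference. The sketch below is purely illustrative, using plain Python and toy 2×2 matrices (the names, sizes, and values are this sketch's assumptions, not code from the book):

```python
# Illustrative sketch of the LoRA idea: instead of updating a full d x d
# weight matrix W (d^2 trainable numbers), train a rank-r adapter made of
# A (r x d) and B (d x r) -- only 2*r*d numbers -- and compute
#   W' = W + (alpha / r) * B @ A
# All names and dimensions here are toy examples for illustration.

def matmul(X, Y):
    """Plain-Python matrix multiply for lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def lora_effective_weight(W, A, B, alpha):
    """Merge a frozen base weight W with a low-rank adapter (A, B)."""
    r = len(A)               # adapter rank = number of rows of A
    scale = alpha / r
    delta = matmul(B, A)     # full-size update built from two small factors
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

# Frozen 2x2 identity base weight, rank-1 adapter: 4 trainable numbers,
# same count as the full matrix here, but the savings grow as d^2 vs 2rd.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 2.0]]             # r x d = 1 x 2
B = [[0.5], [0.25]]          # d x r = 2 x 1
print(lora_effective_weight(W, A, B, alpha=1.0))
# -> [[1.5, 1.0], [0.25, 1.5]]
```

For a 4096×4096 attention projection, a rank-8 adapter trains about 65K parameters instead of roughly 16.8M, which is why LoRA makes finetuning feasible on modest hardware.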
What’s Inside
- Balancing cost and performance
- Retraining and load testing
- Optimizing models for commodity hardware
- Deploying on a Kubernetes cluster
Table of Contents
Chapter 1. Words’ awakening: Why large language models have captured attention
- Navigating the build-and-buy decision with LLMs
- Debunking myths
- Summary
Chapter 2. Large language models: A deep dive into language modeling
- Language modeling techniques
- Attention is all you need
- Really big transformers
- Summary
Chapter 3. Large language model operations: Building a platform for LLMs
- Operations challenges with large language models
- LLMOps essentials
- LLM operations infrastructure
- Summary
Chapter 4. Data engineering for large language models: Setting up for success
- Evaluating LLMs
- Data for LLMs
- Text processors
- Preparing a Slack dataset
- Summary
Chapter 5. Training large language models: How to generate the generator
- Basic training techniques
- Advanced training techniques
- Training tips and tricks
- Summary
Chapter 6. Large language model services: A practical guide
- Setting up infrastructure
- Production challenges
- Deploying to the edge
- Summary
Chapter 7. Prompt engineering: Becoming an LLM whisperer
- Prompt engineering basics
- Prompt engineering tooling
- Advanced prompt engineering techniques
- Summary
Chapter 8. Large language model applications: Building an interactive experience
- Edge applications
- LLM agents
- Summary
Chapter 9. Creating an LLM project: Reimplementing Llama 3
- Simple Llama
- Making it better
- Deploy to a Hugging Face Hub Space
- Summary
Chapter 10. Creating a coding copilot project: This would have helped you earlier
- Data is king
- Build the VS Code extension
- Lessons learned and next steps
- Summary
Chapter 11. Deploying an LLM on a Raspberry Pi: How low can you go?
- Preparing the model
- Serving the model
- Improvements
- Summary
Chapter 12. Production, an ever-changing landscape: Things are just getting started
- The future of LLMs
- Final thoughts
- Summary
Appendix A. History of linguistics
- Medieval linguistics
- Renaissance and early modern linguistics
- Early 20th-century linguistics
- Mid-20th-century and modern linguistics
Appendix B. Reinforcement learning with human feedback
Appendix C. Multimodal latent spaces