As an AI Engineer I’ve found that Chip Huyen’s book, “AI Engineering: Building Applications with Foundation Models,” serves as a comprehensive and practical guide for professionals seeking to design and deploy generative AI applications in production. It feels to me like it provides timeless foundational knowledge for building scalable AI products, rather than focusing on specific tools or fleeting trends. I use this book as the foundational basis in all my teaching falling under the banner of AI Engineering. Buy this book AND read it. It’s quite a slog but if you can survive it you’ll find yourself deeply knowledgeable in the subject by the time you get through it.
The book addresses the emergence of AI engineering as a distinct discipline, driven by two major factors:
- Increased Demand for AI Applications: The powerful capabilities of foundation models (Large Language Models, LLMs, and Large Multimodal Models, LMMs) have led to a vast increase in the types of tasks AI can perform, from writing and coding to image and video production, education, information aggregation, and workflow automation.
- Lowered Entry Barrier: The “model as a service” approach, popularised by major AI labs, allows developers to leverage powerful models via APIs without needing to train them from scratch. Additionally, AI itself can assist in code generation and enable interaction in plain English, making AI application development accessible even to those without a traditional software engineering background.
Core Themes and Content:
- Framework for Adapting Foundation Models: The central purpose is to offer a framework for adapting these pre-trained models to specific real-world applications. This involves understanding how to get models to generate desired outputs, handle contextual information, and perform reliably.
- Key Techniques Covered: The book delves into essential AI engineering techniques such as:
- Prompt Engineering: Crafting effective instructions to guide model behaviour without changing its weights. It covers best practices, system vs. user prompts, and defence against prompt attacks.
- Retrieval-Augmented Generation (RAG): Constructing relevant context for models by retrieving information from external memory sources. This includes discussion of retrieval algorithms (e.g., semantic search with embeddings), chunking, and evaluating RAG solutions.
- Agents: Building intelligent agents that can perceive their environment, plan actions, and use tools to accomplish complex tasks autonomously. This section is noted as more experimental due to the field’s nascent stage.
- Finetuning: Adapting models by updating their weights, typically requiring more resources but offering significant improvements in quality, latency, and cost for specific tasks.
- Data Engineering: Curating, generating, annotating, and processing data for training and adaptation, including insights into synthetic data generation.
- Inference Optimisation: Techniques to make models run faster and cheaper in production, covering model-level and service-level optimisations, and AI accelerators.
- Evaluation as a Central Challenge: The book highlights evaluation as one of the hardest challenges in AI engineering, dedicating two chapters to exploring various methods and building reliable evaluation pipelines. It covers functional correctness, similarity measurements, and the increasingly popular “AI as a judge” approach.
- Understanding Foundation Models: It explains the fundamental design decisions behind foundation models, including training data, model architecture (e.g., transformer), post-training alignment (e.g., Supervised Finetuning, RLHF), and sampling strategies. It also addresses baffling behaviours like hallucinations and inconsistencies, attributing them to the probabilistic nature of AI.
- AI Engineering Workflow: The book follows the typical process of developing an AI application, from initial project scoping to deployment and continuous improvement through user feedback. It introduces an architectural approach that starts simple and progressively adds components like routing, gateways, caches, and agent patterns.
- Distinction from ML Engineering: Huyen clarifies that AI engineering, while building on ML engineering principles, differs in its focus on model adaptation and evaluation rather than model development from scratch. It deals with larger, more compute-intensive, and open-ended models.
Target Audience:
The book is aimed at a technical audience, including AI engineers, ML engineers, data scientists, engineering managers, and technical product managers who want to leverage foundation models to solve real-world problems. It also benefits tool developers, researchers, and job candidates interested in understanding AI capabilities, limitations, and career skills.
What it is Not:
The book is not a tutorial for specific tools and not an ML theory textbook. While it mentions tools and theoretical concepts, its primary focus is on practical implementation and problem-solving frameworks for building successful AI applications.
Overall, “AI Engineering” by Chip Huyen bridges the gap between the rapid advancements in foundation models and their practical application, providing a structured approach to navigate the complex landscape of developing robust and scalable generative AI systems.