From the course: Fundamentals of AI Engineering: Principles and Practical Applications
Chunking strategies
- [Instructor] Hi everyone. Welcome to our session on document chunking strategies. Today we're going to explore how different ways of breaking down documents can affect the performance of retrieval systems and language models. To start off, open up chapter three and click on the file that corresponds to 03_05.ipynb. As always, in the upper right-hand corner of your notebook, make sure you've selected the .venv virtual environment, as it's pre-configured and required to run this notebook. First, what is chunking and why is it so important? At its core, chunking is the process of breaking down documents into smaller, manageable pieces. This is absolutely critical when working with large language models for a few reasons. First, every LLM you work with has something called a maximum token length, or context window. That is the number of tokens you can give the LLM at any given moment for interpretation. If you've ever had a very long conversation with…
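The idea described above can be sketched in a few lines of Python. This is a minimal, illustrative chunker, not the course's notebook code: it splits a document into fixed-size character chunks with a small overlap so that context at chunk boundaries isn't lost. The function name and parameter values are assumptions for illustration.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping, fixed-size character chunks.

    A hypothetical helper: chunk_size and overlap are illustrative
    defaults, not values from the course notebook.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # step forward, keeping some overlap
    return chunks


# Stand-in for a real document; 500 characters total.
document = "word " * 100
chunks = chunk_text(document, chunk_size=120, overlap=20)
print(len(chunks))  # → 5
```

In practice, production systems often chunk by tokens rather than characters (using the model's tokenizer), since the context window is measured in tokens, but the sliding-window logic is the same.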