From the course: Fundamentals of AI Engineering: Principles and Practical Applications
Chunking strategies
- [Instructor] Hi everyone. Welcome to our session on document chunking strategies. Today we're going to explore how different ways of breaking down documents can affect the performance of retrieval systems and language models. To start off, open up chapter three and click on the file that corresponds to 03_05.ipynb. As always, in the upper right-hand corner of your notebook, make sure you've selected the .venv virtual environment, as it's pre-configured and required to run this notebook. First, what is chunking and why is it so important? At its core, chunking is the process of breaking down documents into smaller, manageable pieces. This is absolutely critical when working with large language models for a few reasons. First, every LLM you work with has something called a maximum token length, or context window. That is the number of tokens you can give the LLM at any given moment for interpretation. If you've ever had a very long conversation with…
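The idea described above can be sketched in a few lines of Python. This is a minimal, illustrative chunker, not the course's notebook code: it splits a document into fixed-size character chunks with a small overlap so that context at chunk boundaries isn't lost. The function name and parameter values are assumptions for illustration.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping, fixed-size character chunks.

    A hypothetical helper: chunk_size and overlap are illustrative
    defaults, not values from the course notebook.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # step forward, keeping some overlap
    return chunks


# Stand-in for a real document; 500 characters total.
document = "word " * 100
chunks = chunk_text(document, chunk_size=120, overlap=20)
print(len(chunks))  # → 5
```

In practice, production systems often chunk by tokens rather than characters (using the model's tokenizer), since the context window is measured in tokens, but the sliding-window logic is the same.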