From the course: Fundamentals of AI Engineering: Principles and Practical Applications

Scaling strategies (caching)

- [Narrator] Hi, everyone. Welcome back and welcome to our session on caching in vector databases. To get started, open chapter_5 and open the file called 05_05.ipynb. As always, in the upper right-hand corner, make sure that the virtual environment you've selected is the .venv virtual environment. We've discussed caching a few times in this course, and that's for good reason. As you build production AI systems, the importance of caching is paramount. Caching is an optimization technique that lets us significantly reduce latency and computational load on our critical systems by storing frequently accessed results. Now, why should we actually do this? There are three concrete, immediate reasons that come to mind. First, again, reduced latency. Cached results can be returned instantly without computing embeddings or searching the vector space. When you're building AI applications where the user experience is paramount, ensuring that we minimize latency is…
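The idea described above can be sketched as a small query-result cache that sits in front of a vector search. This is a minimal illustration, not the notebook's code: the `QueryCache` class, its method names, and the `search_with_cache` helper are all hypothetical, and a real system would also handle cache invalidation and semantic (embedding-similarity) matching.

```python
import hashlib
from collections import OrderedDict


class QueryCache:
    """A minimal LRU cache for vector-search results (illustrative sketch).

    Keys are hashes of the raw query text, so an exact repeat of a query
    skips both the embedding computation and the vector search.
    """

    def __init__(self, max_size: int = 1024):
        self.max_size = max_size
        self._store: OrderedDict[str, list] = OrderedDict()

    def _key(self, query: str) -> str:
        return hashlib.sha256(query.encode("utf-8")).hexdigest()

    def get(self, query: str):
        key = self._key(query)
        if key in self._store:
            self._store.move_to_end(key)  # mark as most recently used
            return self._store[key]
        return None  # cache miss

    def put(self, query: str, results: list) -> None:
        key = self._key(query)
        self._store[key] = results
        self._store.move_to_end(key)
        if len(self._store) > self.max_size:
            self._store.popitem(last=False)  # evict least recently used


def search_with_cache(query: str, cache: QueryCache, search_fn):
    """Return cached results if present; otherwise run the search and cache it."""
    cached = cache.get(query)
    if cached is not None:
        return cached  # hit: no embedding or vector search needed
    results = search_fn(query)  # miss: do the expensive work once
    cache.put(query, results)
    return results
```

In use, the expensive search function runs only on the first occurrence of a query; repeats are served from memory, which is exactly the latency win the transcript describes.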
