From the course: Fundamentals of AI Engineering: Principles and Practical Applications

Scaling strategies (caching)

- [Narrator] Hi, everyone. Welcome back and welcome to our session on caching in vector databases. To get started, open chapter_5 and open the file called 05_05.ipynb. As always, in the upper right-hand corner, make sure that the virtual environment you've selected is the .venv virtual environment. We've discussed caching a few times in this course, and that's for good reason. As you build production AI systems, the importance of caching is paramount. Caching is an optimization technique that lets us significantly reduce latency and computational load on our critical systems by storing frequently accessed results. Now, why should we actually do this? There are three concrete, immediate reasons that come to mind. First, again, reduced latency. Cached results can be returned instantly without computing embeddings or searching the vector space. When you're building AI applications where the user experience is paramount, ensuring that we minimize latency is…
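The idea described above can be sketched as a small query-result cache that sits in front of a vector search. This is a minimal illustration, not the notebook's code: the `QueryCache` class, its method names, and the `search_with_cache` helper are all hypothetical, and a real system would also handle cache invalidation and semantic (embedding-similarity) matching.

```python
import hashlib
from collections import OrderedDict


class QueryCache:
    """A minimal LRU cache for vector-search results (illustrative sketch).

    Keys are hashes of the raw query text, so an exact repeat of a query
    skips both the embedding computation and the vector search.
    """

    def __init__(self, max_size: int = 1024):
        self.max_size = max_size
        self._store: OrderedDict[str, list] = OrderedDict()

    def _key(self, query: str) -> str:
        return hashlib.sha256(query.encode("utf-8")).hexdigest()

    def get(self, query: str):
        key = self._key(query)
        if key in self._store:
            self._store.move_to_end(key)  # mark as most recently used
            return self._store[key]
        return None  # cache miss

    def put(self, query: str, results: list) -> None:
        key = self._key(query)
        self._store[key] = results
        self._store.move_to_end(key)
        if len(self._store) > self.max_size:
            self._store.popitem(last=False)  # evict least recently used


def search_with_cache(query: str, cache: QueryCache, search_fn):
    """Return cached results if present; otherwise run the search and cache it."""
    cached = cache.get(query)
    if cached is not None:
        return cached  # hit: no embedding or vector search needed
    results = search_fn(query)  # miss: do the expensive work once
    cache.put(query, results)
    return results
```

In use, the expensive search function runs only on the first occurrence of a query; repeats are served from memory, which is exactly the latency win the transcript describes.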
