Using Full Context Size
In this chapter, you'll dive into adjusting the context size of your AI models. You'll learn why a larger context size is crucial for handling extensive documents and how to configure it through the Ollama API. We'll walk through creating a custom Modelfile with Docker to raise the token limit from the default 2,048 tokens to 131,072 tokens.
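As a preview, the Modelfile change comes down to a single parameter. The sketch below assumes `llama3` as the base model (a placeholder; substitute whichever model you are running); `num_ctx` is Ollama's parameter for the context window size:

```
# Hypothetical Modelfile; the base model name is a placeholder
FROM llama3

# Raise the context window from the default 2048 to 131,072 tokens
PARAMETER num_ctx 131072
```

You would then build the custom model with something like `ollama create big-context -f Modelfile` and run it in place of the original.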
You'll also explore troubleshooting techniques for when your model gets stuck in a loop or generates repetitive text, and discover how to enable streaming responses for real-time logging and debugging. We'll cover adjusting parameters such as temperature to improve model performance and prevent repetition issues.
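To give a flavor of the streaming workflow covered later, the sketch below reassembles a streamed Ollama response. Ollama's `/api/generate` endpoint streams newline-delimited JSON objects, each carrying an incremental `response` field and a `done` flag; the helper below only parses that format, so it can be exercised without a running server:

```python
import json

def collect_stream(lines):
    """Reassemble the full text from Ollama-style streamed NDJSON chunks.

    Each line is a JSON object; generated text arrives incrementally in
    the "response" field until "done" is true. Printing each chunk as it
    arrives is what enables real-time logging and debugging.
    """
    parts = []
    for raw in lines:
        chunk = json.loads(raw)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break
    return "".join(parts)

# Example with two simulated chunks, as the API would stream them:
sample = [
    '{"response": "Hel", "done": false}',
    '{"response": "lo", "done": true}',
]
print(collect_stream(sample))  # → Hello
```

In a real session you would feed this the lines from an HTTP response opened with streaming enabled, printing each chunk as it arrives instead of buffering.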
By the end of this chapter, you’ll have a solid understanding of how to optimize your AI models for better performance with large datasets.