The Context Window (or simply Context, also referred to as Context Size or Context Length) refers to the number of words or tokens that an LLM considers when predicting the next word. It allows the model to “look back” at previous text to influence the current prediction.
Key Concepts
- Importance: Context is critical for accurate Next Word Prediction. Predicting the next word depends not just on the immediate predecessor but on the whole sequence of words that came before it.
- Mechanism: Mechanisms like the Attention Mechanism give the LLM access to the entire context and let it weigh the importance of different words within that context.
- Scope: Determines how many previous words are taken into account during training and inference.
Implementation Details
In the context of creating Input-Target Pairs, the Context Size determines how many tokens are included in the input for the model.
- If `context_size = 4`, the model looks at a sequence of 4 tokens to predict the 5th one.
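The sliding-window idea above can be sketched as follows. This is a minimal illustration, not a specific library's API: the token IDs are made up, and `create_input_target_pairs` is a hypothetical helper name. Each input is a run of `context_size` tokens, and its target is the same run shifted one position to the right, so the model learns to predict the next token at every position.

```python
def create_input_target_pairs(token_ids, context_size):
    """Slide a window of `context_size` tokens over the sequence.

    Returns parallel lists: each input chunk is paired with a target
    chunk shifted one token to the right (the "next words").
    """
    inputs, targets = [], []
    for i in range(len(token_ids) - context_size):
        inputs.append(token_ids[i : i + context_size])
        targets.append(token_ids[i + 1 : i + context_size + 1])
    return inputs, targets

# Illustrative token IDs (not from a real tokenizer)
token_ids = [290, 4920, 2241, 287, 257, 4489, 64, 1807]

inputs, targets = create_input_target_pairs(token_ids, context_size=4)
print(inputs[0])   # [290, 4920, 2241, 287]
print(targets[0])  # [4920, 2241, 287, 257]
```

With `context_size = 4`, the first input is the first 4 token IDs and its target is tokens 2 through 5, so the 4th input token's "next word" is the 5th token in the sequence.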
