Input-target pairs are the fundamental data structure used for training Large Language Models (LLMs). Training is self-supervised: the supervision signal is derived from the text itself rather than from hand-labeled data. In this setup, the “input” is a sequence of tokens, and the “target” is the token that immediately follows that sequence.
Concept
The goal is to train the model to predict the next word given a context.
- Input: A sequence of text (e.g., “LLMs learn to”).
- Target: The immediate next word (e.g., “predict”).
This process is repeated for every position in the text, creating multiple training examples from a single sentence.
Example
Given the sentence “LLMs learn to predict one word at a time” and a context size of 4, sliding through the text yields:
- Input: “LLMs” Target: “learn”
- Input: “LLMs learn” Target: “to”
- Input: “LLMs learn to” Target: “predict”
- Input: “LLMs learn to predict” Target: “one”
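These pairs can be generated with a short loop. Here is a minimal word-level sketch (real pipelines slide over token IDs rather than words; the sentence and context size are taken from the example above):
sentence = "LLMs learn to predict one word at a time"
words = sentence.split()  # word-level stand-in for real tokenization
context_size = 4

for i in range(1, context_size + 1):
    context = words[:i]  # everything seen so far
    target = words[i]    # the word that follows
    print(" ".join(context), "-->", target)
Each iteration extends the context by one word, so this single sentence already yields four training examples.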
In code, this is often implemented by creating two variables, x (input) and y (target), where y is simply x shifted by one position.
x = [290, 4920, 2241, 287] # Input tokens
y = [4920, 2241, 287, 257] # Target tokens (shifted by 1)
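As a minimal runnable sketch, assuming enc_sample holds tokenizer output (the IDs below are the ones from the snippet above), x and y are just two overlapping windows over the same sequence:
enc_sample = [290, 4920, 2241, 287, 257]  # token IDs, e.g. from a BPE tokenizer
context_size = 4

x = enc_sample[:context_size]       # the first context_size tokens
y = enc_sample[1:context_size + 1]  # the same window, shifted right by one
print(f"x: {x}")  # [290, 4920, 2241, 287]
print(f"y: {y}")  # [4920, 2241, 287, 257]
During training, the model sees x and is optimized so that, at each position, it assigns high probability to the corresponding entry of y.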
