How AI Predicts the Next Word
“The cat sat on the ___”
1. Tokenize
2. Analyze
3. Calculate
4. Choose
1. Tokenize: convert text into processable units
Why tokenization is necessary
Neural networks work with numbers, not raw text. We map each piece of text to a numeric token ID.
Original sentence AI receives:
“The cat sat on the”
How AI actually processes it (Token IDs):
464 (The) → 2415 (cat) → 3332 (sat) → 319 (on) → 262 (the)
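Here each word maps to one token ID from a fixed vocabulary; note that “The” (464) and “the” (262) are distinct tokens. Real tokenizers often split longer or rarer words into several subword tokens. These numbers are the inputs to the model.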
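The lookup described above can be sketched in a few lines of Python. This is a minimal illustration, not a real tokenizer: `TOY_VOCAB`, `UNKNOWN_ID`, `encode`, and `decode` are hypothetical names, the vocabulary holds only the five IDs shown in this demo, and splitting on whitespace is a simplifying assumption.

```python
# Toy vocabulary: the IDs are the ones shown in this demo, not a real tokenizer's output.
TOY_VOCAB = {"The": 464, "cat": 2415, "sat": 3332, "on": 319, "the": 262}
UNKNOWN_ID = 0  # hypothetical placeholder for words outside the toy vocabulary


def encode(text: str) -> list[int]:
    """Split on whitespace and look up each word's token ID."""
    return [TOY_VOCAB.get(word, UNKNOWN_ID) for word in text.split()]


def decode(token_ids: list[int]) -> str:
    """Map token IDs back to words using the inverted vocabulary."""
    id_to_word = {token_id: word for word, token_id in TOY_VOCAB.items()}
    return " ".join(id_to_word.get(token_id, "<unk>") for token_id in token_ids)


token_ids = encode("The cat sat on the")
print(token_ids)          # [464, 2415, 3332, 319, 262]
print(decode(token_ids))  # The cat sat on the
```

Because the lookup is case-sensitive, "The" and "the" resolve to different IDs, which matches the sequence shown in the demo.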
Token IDs flow into the attention mechanism
2. Analyze: calculate contextual importance weights
How attention works
The model scores how much each prior token should influence the next prediction.
Attention weights calculated:
cat 85
sat 72
on 64
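The scores above are illustrative and not normalized (they do not sum to 1 or to 100). Attention applies softmax to raw scores to produce weights that sum to 1; a higher score means more influence on the next token.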
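To make the scoring step concrete, here is a small sketch of scaled dot-product attention for a single query. The 2-dimensional vectors are hypothetical values chosen only for illustration, and `attention_weights` is an assumed helper name; real models use learned, much higher-dimensional vectors and many attention heads.

```python
import math


def softmax(scores: list[float]) -> list[float]:
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(query: list[float], keys: list[list[float]]) -> list[float]:
    """Scaled dot-product score between one query and each key, then softmax."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    return softmax(scores)


# Hypothetical vectors: one key per prior token, one query for the position being predicted.
tokens = ["cat", "sat", "on"]
keys = [[1.0, 0.9], [0.8, 0.6], [0.5, 0.4]]
query = [1.0, 1.0]

weights = attention_weights(query, keys)
for token, weight in zip(tokens, weights):
    print(f"{token}: {weight:.3f}")
```

With these made-up vectors the ordering matches the demo (cat, then sat, then on), but the printed weights sum to 1, unlike the unnormalized scores displayed above.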
Weighted embeddings feed into the output layer
3. Calculate: compute probabilities over the vocabulary
Softmax and logits
The model converts raw scores (logits) to probabilities with softmax, so they sum to 1 across the vocabulary.
Top candidates from vocabulary:
mat 42.3%
chair 28.7%
table 19.4%
floor 9.6%
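Only the top few candidates are shown here; the displayed percentages sum to 100%, so treat them as rounded for illustration. The rest of the vocabulary has very small probabilities.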
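The conversion from logits to probabilities, and the idea of showing only the top few, might look like the sketch below. The logits and the six-token vocabulary are hypothetical (a real vocabulary has tens of thousands of entries), and `top_candidates` is an assumed helper name.

```python
import math

# Hypothetical logits for a six-token toy vocabulary.
logits = {"mat": 3.1, "chair": 2.7, "table": 2.3, "floor": 1.6, "moon": -1.0, "idea": -2.0}


def softmax(values: dict[str, float]) -> dict[str, float]:
    """Exponentiate each logit and divide by the total so the results sum to 1."""
    highest = max(values.values())  # subtract the max for numerical stability
    exps = {token: math.exp(v - highest) for token, v in values.items()}
    total = sum(exps.values())
    return {token: e / total for token, e in exps.items()}


def top_candidates(probabilities: dict[str, float], k: int) -> list[tuple[str, float]]:
    """Return the k most probable tokens, highest first."""
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


probabilities = softmax(logits)
shown = top_candidates(probabilities, k=4)
for token, p in shown:
    print(f"{token}: {p:.1%}")

# Probability mass held by every token that is not displayed.
remaining = 1.0 - sum(p for _, p in shown)
print(f"everything else: {remaining:.1%}")
```

The last line makes the hidden tail explicit: the tokens that are not displayed still hold a small, nonzero share of the probability.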
Probability distribution used for sampling
4. Choose: sample to select the next token
Selection methods available
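Common methods are greedy selection (always pick the most probable token), temperature sampling (rescale the logits before softmax to sharpen or flatten the distribution), and nucleus (top-p) sampling (sample only from the smallest set of tokens whose cumulative probability reaches p). This demo uses weighted random sampling.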
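The selection methods named above can be compared side by side in a short sketch. It uses the four probabilities shown in this demo; the function names, the temperature of 0.5, the top-p of 0.7, and the fixed random seed are all assumptions made for illustration, not the demo's actual implementation.

```python
import random

# Probabilities for the four candidates shown in this demo.
probs = {"mat": 0.423, "chair": 0.287, "table": 0.194, "floor": 0.096}


def greedy(probs):
    """Always pick the single most probable token."""
    return max(probs, key=probs.get)


def apply_temperature(probs, temperature):
    """Raise each p to 1/T and renormalize (same as dividing logits by T before softmax)."""
    scaled = {token: p ** (1.0 / temperature) for token, p in probs.items()}
    total = sum(scaled.values())
    return {token: s / total for token, s in scaled.items()}


def nucleus_filter(probs, top_p):
    """Keep the most probable tokens until their cumulative mass reaches top_p."""
    kept, cumulative = {}, 0.0
    for token, p in sorted(probs.items(), key=lambda item: item[1], reverse=True):
        kept[token] = p
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(kept.values())
    return {token: p / total for token, p in kept.items()}


def weighted_sample(probs, rng):
    """Weighted random sampling: each token is drawn in proportion to its probability."""
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]


rng = random.Random(0)  # fixed seed so repeated runs draw the same tokens
print(greedy(probs))  # mat
print(weighted_sample(probs, rng))
print(weighted_sample(apply_temperature(probs, 0.5), rng))
print(weighted_sample(nucleus_filter(probs, 0.7), rng))
```

A temperature below 1 concentrates probability on "mat", while a top-p of 0.7 restricts sampling to "mat" and "chair"; greedy selection is the limiting case where the top token is always chosen.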