← Back to articles

Deep Dive into LLMs: Notes from Andrej Karpathy’s 3h30 video

It took me several weeks to finish watching this YouTube video. It was not a random one, but Deep Dive into LLMs like ChatGPT by Andrej Karpathy. This guy knows a thing or two about neural networks 😉, and the video came highly recommended. It's a walkthrough of how LLMs are trained, what steps are required to make them act as the assistants we know as end-users, and how inference works.

Here are my notes, closely matching the different chapters of the video.

Pretraining: Data Collection and Processing

The process begins by crawling the internet.

Several filtering steps are applied:
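The video describes FineWeb-style filtering: language detection, quality heuristics, and deduplication. Here is a minimal sketch of that idea; the function names and thresholds are my own, not from the video:

```python
import re

def looks_english(text, common={"the", "and", "of", "to", "a", "in"}):
    """Crude language filter: share of very common English words (a toy heuristic)."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return False
    return sum(1 for w in words if w in common) / len(words) > 0.05

def quality_filter(text, min_words=5):
    """Drop very short fragments; a stand-in for real quality heuristics."""
    return len(text.split()) >= min_words

def deduplicate(docs):
    """Exact-match deduplication; production pipelines use fuzzy hashing (e.g. MinHash)."""
    seen, kept = set(), []
    for d in docs:
        key = d.strip().lower()
        if key not in seen:
            seen.add(key)
            kept.append(d)
    return kept

docs = [
    "The quick brown fox jumps over the lazy dog in the park.",
    "The quick brown fox jumps over the lazy dog in the park.",  # duplicate
    "buy now!!!",                                                # too short
]
clean = [d for d in deduplicate(docs) if quality_filter(d) and looks_english(d)]
print(len(clean))  # → 1
```

Real pipelines are far more elaborate (URL blocklists, PII removal, model-based quality scoring), but the shape is the same: a cascade of filters over raw crawled text.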

Tokenization: Breaking Text into Manageable Pieces
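GPT-style tokenizers use byte-pair encoding: start from raw bytes, then repeatedly merge the most frequent adjacent pair into a new token. A minimal sketch of the merge loop (simplified; real tokenizers also split on regex patterns first):

```python
from collections import Counter

def most_common_pair(ids):
    """Find the most frequent adjacent pair of token ids."""
    return Counter(zip(ids, ids[1:])).most_common(1)[0][0]

def merge(ids, pair, new_id):
    """Replace every occurrence of `pair` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i < len(ids) - 1 and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out

text = "aaabdaaabac"
ids = list(text.encode("utf-8"))
for step in range(3):                   # three merge rounds
    pair = most_common_pair(ids)
    ids = merge(ids, pair, 256 + step)  # new ids start after the 256 byte values
print(len(ids))  # the sequence shrinks as pairs are merged
```

Each merge trades a longer vocabulary for shorter sequences; production vocabularies end up around 50k–100k tokens.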

Neural Network Input/Output
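The interface is simple: the network takes a window of token ids as input and outputs one raw score (logit) per vocabulary entry, which a softmax turns into a probability distribution over the next token. A toy illustration (vocabulary and logit values are made up):

```python
import math

def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)                       # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy vocabulary and the raw scores the network might emit for the next token
vocab = ["the", "cat", "sat", "mat"]
logits = [1.0, 3.0, 0.5, 0.2]
probs = softmax(logits)
print(vocab[probs.index(max(probs))])  # → cat (highest logit wins)
```

Training nudges the weights so that the probability assigned to the actual next token in the data goes up.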

Neural Network Internals

The process involves:

Interactive visualizations like bbycroft.net/llm help in understanding the architecture.
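The core operation inside the Transformer is attention: each token's query is matched against every key, and the resulting weights mix the values. A bare-bones scaled dot-product attention, in plain Python for readability (real implementations are batched tensor ops):

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attention(Q, K, V):
    """Scaled dot-product attention: each query mixes the values,
    weighted by how well it matches each key."""
    d = len(K[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        w = softmax(scores)
        out.append([sum(wi * v[j] for wi, v in zip(w, V)) for j in range(len(V[0]))])
    return out

# One query attending over two key/value pairs (toy numbers)
Q = [[1.0, 0.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[10.0, 0.0], [0.0, 10.0]]
print(attention(Q, K, V))  # the query matches the first key, so the output leans toward [10, 0]
```

Stack this with feed-forward layers, residual connections, and normalization, and you have the architecture the visualization walks through.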

Inference: Generating Text
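Inference is a loop: feed the tokens so far, sample one token from the output distribution, append it, repeat. A sketch with a fake "model" (the hard-coded logit table stands in for a real forward pass):

```python
import math, random

def sample_next(logits, temperature=1.0):
    """Sample a token id from logits; lower temperature → greedier."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    r, acc = random.random() * sum(exps), 0.0
    for i, e in enumerate(exps):
        acc += e
        if acc >= r:
            return i
    return len(exps) - 1

vocab = ["the", "cat", "sat", "."]
logits_for = {  # hypothetical next-token scores, not a real model
    "the": [0.1, 4.0, 0.2, 0.1],
    "cat": [0.1, 0.1, 4.0, 0.5],
    "sat": [0.1, 0.1, 0.1, 4.0],
    ".":   [4.0, 0.1, 0.1, 0.1],
}

random.seed(0)
tok, out = "the", ["the"]
for _ in range(3):
    tok = vocab[sample_next(logits_for[tok], temperature=0.5)]
    out.append(tok)
print(" ".join(out))
```

Because generation samples from a distribution, the same prompt can yield different outputs; temperature controls how much randomness is allowed.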

Base Models

Post-Training: Refining the Model
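Post-training turns the raw document-completer into an assistant by fine-tuning on conversations, which are flattened into a single token stream using special delimiter tokens. A sketch of that rendering; the token names here follow the ChatML convention, and the exact tokens vary by model:

```python
def render_chat(messages, bos="<|im_start|>", eos="<|im_end|>"):
    """Flatten a conversation into one token stream, ChatML-style.
    The special tokens are illustrative; each model family defines its own."""
    parts = [f"{bos}{m['role']}\n{m['content']}{eos}" for m in messages]
    # At inference time we append an open assistant turn and let the model complete it
    parts.append(f"{bos}assistant\n")
    return "\n".join(parts)

convo = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is 2+2?"},
]
print(render_chat(convo))
```

The key insight from the video: "talking to ChatGPT" is still next-token prediction, just over a stream formatted like this, where the model has learned to statistically imitate human labelers' ideal responses.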

Addressing Hallucinations

Models imitate the confident tone of their training data even when they don't know the answer.

Solutions include:

LLM knowledge exists in two places:

Self-Knowledge

Computational Limitations

From Supervised Fine-Tuning to Reinforcement Learning

Training progression:

Reinforcement Learning
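In domains with verifiable answers (like math), RL can work without human labels: sample many attempts, score them with an automatic checker, and reinforce the ones that succeed. A toy sketch of that loop; `attempt_answer` is a made-up stand-in for sampling from the model:

```python
import random

def attempt_answer(question, noise):
    """Stand-in for sampling one solution from the model (hypothetical)."""
    a, b = question
    return a + b + noise  # sometimes wrong, depending on the sampled noise

def reward(question, answer):
    """Verifiable reward: 1 if the arithmetic is correct, else 0."""
    a, b = question
    return 1 if answer == a + b else 0

random.seed(1)
question = (2, 3)
samples = [attempt_answer(question, random.choice([-1, 0, 0, 1])) for _ in range(16)]
# Keep only rewarded samples; in real RL these become the training signal
kept = [s for s in samples if reward(question, s)]
print(f"{len(kept)}/{len(samples)} attempts rewarded")
```

This is why reasoning models can discover their own chains of thought: the training signal rewards whatever token sequences lead to correct answers, rather than imitating a human demonstration.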

RLHF: Reinforcement Learning from Human Feedback

Used to train AI in domains where answers aren't objectively verifiable.

Creates a reward model based on human preferences:
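The reward model learns a scalar score such that the response humans preferred scores higher than the rejected one, typically via a pairwise (Bradley-Terry style) logistic loss. A minimal sketch with a linear reward model on toy features; the features, weights, and learning rate are all made up for illustration:

```python
import math

def reward_score(features, weights):
    """Linear reward model: scalar score for one response."""
    return sum(f * w for f, w in zip(features, weights))

def pairwise_loss(chosen, rejected, weights):
    """Bradley-Terry style loss: push the chosen response's score
    above the rejected one's (-log sigmoid of the margin)."""
    margin = reward_score(chosen, weights) - reward_score(rejected, weights)
    return -math.log(1 / (1 + math.exp(-margin)))

# Toy feature vectors for a preferred and a dispreferred response
chosen, rejected = [1.0, 0.2], [0.1, 0.9]
weights = [0.0, 0.0]

lr = 0.5
for _ in range(50):
    m = reward_score(chosen, weights) - reward_score(rejected, weights)
    g = 1 / (1 + math.exp(-m)) - 1          # d(loss)/d(margin), always negative
    for i in range(len(weights)):
        weights[i] -= lr * g * (chosen[i] - rejected[i])

print(pairwise_loss(chosen, rejected, weights))  # loss shrinks as the preference is learned
```

RL then optimizes the policy against this learned score instead of querying humans directly, which is both the trick and the limitation: the policy can find adversarial responses that game the reward model.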

RLHF works because:

Limitations:

Future Developments

Useful Resources

Conclusion