
LLMs from scratch

rasbt/LLMs-from-scratch

LLMs from scratch is Sebastian Raschka’s educational repository for implementing a GPT-like language model step by step in PyTorch.

Forks 14,821
Author rasbt
Language Jupyter Notebook
License NOASSERTION
Synced 2026-06-09

What it is

The repository is aimed at people who want to understand how LLMs work, not just the vocabulary around them. The idea is to build a GPT-like model step by step: from text and tokens to attention, transformer blocks, pretraining, and fine-tuning.

The project is valuable because it does not start from a finished, very large model. It works at a scale small enough that the connection between code, math, and model behavior stays visible. For developers, it is a bridge between using LLMs and understanding key parts of how they work.

What is inside and how people use it

The repository contains chapters, Jupyter notebooks, Python code, book materials, and exercises. The main stack is Python and PyTorch. The format suits self-study: read a section, run the code, change parameters, and inspect the result.

Learning path

This fragment shows how the repository can be read as a study plan rather than a loose file collection.

1. Prepare text and tokens
2. Build attention mechanisms
3. Assemble a GPT-style model
4. Pretrain on a small dataset
5. Fine-tune for instruction following
6. Evaluate outputs and limitations
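
As a rough illustration of step 1, the sketch below turns text into integer token IDs and then into input/target pairs for next-token prediction. It is not code from the repository: the character-level vocabulary and the function names (`build_vocab`, `encode`, `decode`, `sliding_windows`) are assumptions made for this example, and it uses plain Python rather than PyTorch so that it runs without dependencies.

```python
# Hypothetical illustration of "prepare text and tokens"; not taken from the repository.

def build_vocab(text):
    """Map each distinct character to an integer ID (sorted for determinism)."""
    return {ch: i for i, ch in enumerate(sorted(set(text)))}


def encode(text, vocab):
    """Convert a string into a list of token IDs."""
    return [vocab[ch] for ch in text]


def decode(ids, vocab):
    """Convert a list of token IDs back into a string."""
    inverse = {i: ch for ch, i in vocab.items()}
    return "".join(inverse[i] for i in ids)


def sliding_windows(ids, context_length, stride=1):
    """Build (input, target) pairs where the target is the input shifted by one token."""
    pairs = []
    for start in range(0, len(ids) - context_length, stride):
        inputs = ids[start:start + context_length]
        targets = ids[start + 1:start + context_length + 1]
        pairs.append((inputs, targets))
    return pairs


text = "hello world"
vocab = build_vocab(text)
ids = encode(text, vocab)
pairs = sliding_windows(ids, context_length=4)

for inputs, targets in pairs[:2]:
    print(decode(inputs, vocab), "->", decode(targets, vocab))
```

Each target sequence is the input moved one position to the right, which is the training signal a GPT-style model learns from. A production tokenizer would typically work on subword units rather than single characters.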

A typical user is an engineer who already writes code but wants a deeper understanding of modern language models. The repository is also useful for teaching: chapters can become a curriculum, and notebooks can become exercises.

Strengths and limitations

Its strength is the step-by-step structure and hands-on implementation. The project does not hide everything behind an API, so the reader sees where tokenization, attention, training, and tuning fit into the pipeline.
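
To give a sense of what "not hidden behind an API" can look like for attention, here is a minimal sketch of causal scaled dot-product attention in plain Python. It is an assumption-laden toy, not the repository's implementation: the function names are hypothetical, and the same vectors are used as queries, keys, and values, whereas a real transformer block would first apply learned projection matrices.

```python
import math

# Hypothetical illustration of causal self-attention; not taken from the repository.

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def causal_attention(queries, keys, values):
    """Scaled dot-product attention where position i only attends to positions <= i."""
    d_k = len(keys[0])
    n = len(queries)
    outputs, all_weights = [], []
    for i, q in enumerate(queries):
        # Score the query against the visible prefix only; later positions are masked out.
        scores = [
            sum(a * b for a, b in zip(q, keys[j])) / math.sqrt(d_k)
            for j in range(i + 1)
        ]
        weights = softmax(scores)
        # The context vector is the attention-weighted sum of the visible value vectors.
        context = [
            sum(w * values[j][d] for j, w in enumerate(weights))
            for d in range(len(values[0]))
        ]
        outputs.append(context)
        # Pad with zeros so each row shows the masked (future) positions explicitly.
        all_weights.append(weights + [0.0] * (n - i - 1))
    return outputs, all_weights


# Three toy token embeddings of dimension 2 (assumed values, for illustration only).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = causal_attention(x, x, x)

for row in weights:
    print([round(w, 3) for w in row])
```

The printed weight matrix is lower triangular: the first token can only attend to itself, and no token attends to a later one. This is the property that lets a GPT-style model be trained to predict the next token.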

The limitation is its educational scale. It is no substitute for industrial model training, distributed infrastructure, large datasets, or safety work. But it gives readers the foundation they need to follow more complex papers.