llm-beginner: a progressive hands-on curriculum for mastering LLMs and AI agents through from-scratch implementations
What it solves
This project provides a structured, hands-on learning path for beginners to master Large Language Models (LLMs) and AI agents. It bridges the gap between theoretical knowledge and practical implementation by guiding users through six progressive tasks, moving from basic Transformer architecture to complex autonomous coding agents.
How it works
Users follow a curriculum of six self-contained tasks, each designed to be completed over several weeks. The learning methodology emphasizes "writing from scratch first, then comparing with frameworks" to ensure a deep understanding of the underlying principles. Each task ships with its own dependencies, data download scripts, and self-check scripts to validate the implementation. The six tasks are:
- Transformer Basics: Implementing self-attention and Transformer blocks for text classification.
- mini-GPT: Building a decoder-only model from scratch, including BPE tokenization, RoPE, and KV cache.
- SFT & DPO: Performing Supervised Fine-Tuning and Direct Preference Optimization using LoRA on a base model.
- RAG: Building a retrieval-augmented generation pipeline with embedding models, vector databases (FAISS), and rerankers.
- Tool Agents: Implementing a ReAct loop to allow LLMs to use external tools (calculators, sandboxes, APIs).
- Coding Agents: Creating a sophisticated agent capable of modifying local code and running tests using MCP (Model Context Protocol), Skills, and Subagents.
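As an illustration of the "from scratch first" approach in the first two tasks, here is a minimal sketch of single-head scaled dot-product self-attention in plain Python. It is not taken from the project: the function names (softmax, matmul, self_attention), the nested-list matrix representation, and the toy identity projections are assumptions chosen for readability. The causal flag shows the masking a decoder-only model such as mini-GPT would need.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def self_attention(x, w_q, w_k, w_v, causal=False):
    """Single-head scaled dot-product self-attention (illustrative sketch).

    x: (seq_len x d_model); w_q, w_k, w_v: (d_model x d_head).
    Returns (output, attention_weights).
    """
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_head = len(q[0])
    # scores[i][j] = (q_i . k_j) / sqrt(d_head)
    k_t = [list(col) for col in zip(*k)]
    scores = [[s / math.sqrt(d_head) for s in row] for row in matmul(q, k_t)]
    if causal:
        # Hide future positions (j > i) so token i only attends to tokens 0..i.
        scores = [[s if j <= i else float("-inf") for j, s in enumerate(row)]
                  for i, row in enumerate(scores)]
    weights = [softmax(row) for row in scores]
    return matmul(weights, v), weights

# Toy input (assumed values): 3 tokens, d_model = d_head = 2, identity projections.
identity = [[1.0, 0.0], [0.0, 1.0]]
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out, attn = self_attention(x, identity, identity, identity, causal=True)
```

With the causal mask on, the first token can only attend to itself, so its attention row is [1.0, 0.0, 0.0] and its output equals its own value vector. A framework version would batch this over heads and sequences, which is the kind of comparison the curriculum's methodology calls for.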
Who it’s for
Learners with a foundation in Python and deep learning who want to transition into the LLM and AI agent space through practical, code-first exercises.
Highlights
- Progressive Curriculum: Moves from basic architecture to advanced agentic workflows.
- Hands-on Validation: Includes eval/run.py scripts for each task to provide immediate feedback on implementation correctness.
- Hardware Accessible: Designed to run on consumer-grade GPUs (8GB-16GB VRAM) or Mac M-series chips.
- Comprehensive Tech Stack: Covers modern techniques like RoPE, LoRA, DPO, RAG, ReAct, and MCP.
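Of the techniques listed, the ReAct loop is the easiest to show in a few lines. The sketch below is an assumption-laden illustration, not the project's implementation: the Thought / Action / Observation text format, the tool[argument] action syntax, and the names calculator, scripted_model, and react_loop are all hypothetical. scripted_model stands in for a real LLM call so the example runs offline.

```python
import operator
import re

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}
ACTION_RE = re.compile(r"Action:\s*(\w+)\[(.*?)\]")
FINAL_RE = re.compile(r"Final Answer:\s*(.*)")

def calculator(arg):
    """Hypothetical tool: evaluates '<number> <op> <number>' without using eval."""
    m = re.fullmatch(r"\s*(-?\d+(?:\.\d+)?)\s*([+\-*/])\s*(-?\d+(?:\.\d+)?)\s*", arg)
    if not m:
        return "error: expected '<number> <op> <number>'"
    a, op, b = float(m.group(1)), m.group(2), float(m.group(3))
    try:
        return f"{OPS[op](a, b):g}"
    except ZeroDivisionError:
        return "error: division by zero"

def scripted_model(transcript):
    """Stand-in for an LLM: picks the next step from the transcript so far."""
    if "Observation:" not in transcript:
        return "Thought: I need to multiply.\nAction: calculator[12 * 7]"
    last_observation = transcript.rsplit("Observation:", 1)[1].strip()
    return f"Thought: I have the result.\nFinal Answer: {last_observation}"

def react_loop(question, model, tools, max_steps=5):
    """Alternate model steps and tool calls until a final answer or step limit."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = model(transcript)
        transcript += step + "\n"
        final = FINAL_RE.search(step)
        if final:
            return final.group(1).strip(), transcript
        action = ACTION_RE.search(step)
        if action:
            name, arg = action.groups()
            tool = tools.get(name)
            observation = tool(arg) if tool else f"error: unknown tool '{name}'"
        else:
            observation = "error: no action found"
        # Feed the tool result back so the next model step can reason over it.
        transcript += f"Observation: {observation}\n"
    return None, transcript  # step budget exhausted without a final answer

answer, trace = react_loop("What is 12 * 7?", scripted_model, {"calculator": calculator})
```

The max_steps cap and the error observations are design choices for the sketch: they keep a misbehaving model from looping forever and let it see, and potentially recover from, a malformed action.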
Sources
- nndl/llm-beginner