But How Does GPT Actually Work? A Step-by-Step Notebook


🧠 Train a Small GPT-Style LLM from Scratch

🚀 This repository contains a Jupyter Notebook that trains a small GPT-style, decoder-only language model from scratch using PyTorch.

🔗 Open the Notebook

📌 Overview

This project is an educational walkthrough of building and training a minimal GPT-style, decoder-only Transformer model. The notebook covers:

  • 📖 Tokenization – Converting text into tokens (a minimal example follows this list)
  • 🔄 Positional Encoding – Adding order information to input sequences
  • 📈 Self-Attention Intuition – Building intuition for the self-attention operation
  • 🏗 Transformer Decoder Blocks – Multi-head self-attention & feed-forward layers (sketched after this list)
  • 🎯 Training from Scratch – Using small pretraining and SFT datasets to train a language model
  • 🔥 Inference – Generating text with the trained model
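
To get a feel for the tokenization step, here is a tiny sketch using tiktoken (one of the listed dependencies). The exact encoding the notebook uses is an assumption here; it may pick a different vocabulary:

```python
import tiktoken

# GPT-2's byte-pair-encoding vocabulary from tiktoken
# (an assumption for illustration; the notebook may use a different encoding)
enc = tiktoken.get_encoding("gpt2")

text = "Transformers turn text into token IDs."
token_ids = enc.encode(text)      # a list of integer token IDs
print(token_ids)
print(enc.decode(token_ids))      # round-trips back to the original string
```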
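
For the self-attention and decoder-block bullets, the following is a rough, self-contained PyTorch sketch of a causal self-attention layer and a pre-norm decoder block. Names, dimensions, and the pre-norm layout are illustrative assumptions, not the notebook's exact code:

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
    """Multi-head self-attention with a causal mask, so each position
    can only attend to itself and earlier positions."""
    def __init__(self, d_model: int, n_heads: int, max_len: int):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.proj = nn.Linear(d_model, d_model)
        # lower-triangular mask: True where attention is allowed
        mask = torch.tril(torch.ones(max_len, max_len, dtype=torch.bool))
        self.register_buffer("mask", mask)

    def forward(self, x):
        B, T, C = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # reshape to (B, n_heads, T, d_head)
        q = q.view(B, T, self.n_heads, self.d_head).transpose(1, 2)
        k = k.view(B, T, self.n_heads, self.d_head).transpose(1, 2)
        v = v.view(B, T, self.n_heads, self.d_head).transpose(1, 2)
        # scaled dot-product attention with the causal mask applied
        att = (q @ k.transpose(-2, -1)) / math.sqrt(self.d_head)
        att = att.masked_fill(~self.mask[:T, :T], float("-inf"))
        att = F.softmax(att, dim=-1)
        out = (att @ v).transpose(1, 2).reshape(B, T, C)
        return self.proj(out)

class DecoderBlock(nn.Module):
    """Pre-norm transformer decoder block: attention + feed-forward,
    each wrapped in a residual connection."""
    def __init__(self, d_model: int, n_heads: int, max_len: int):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = CausalSelfAttention(d_model, n_heads, max_len)
        self.ln2 = nn.LayerNorm(d_model)
        self.ff = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x):
        x = x + self.attn(self.ln1(x))
        x = x + self.ff(self.ln2(x))
        return x

# quick shape check
x = torch.randn(2, 16, 64)                 # (batch, sequence, d_model)
block = DecoderBlock(d_model=64, n_heads=4, max_len=128)
print(block(x).shape)                      # torch.Size([2, 16, 64])
```

A full model would stack several such blocks on top of token and positional embeddings and project the final hidden states back to the vocabulary with a linear head.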

📂 Repository Structure

📂 gpt-from-scratch
│── 📄 README.md              # Project documentation (this file)
│── 📒 llm-from-scratch.ipynb  # Jupyter Notebook with the full training pipeline

🚀 Getting Started

1️⃣ Clone the Repository

git clone https://github.com/kevinpdev/gpt-from-scratch.git
cd gpt-from-scratch

2️⃣ Install Dependencies

Make sure you have Python and Jupyter installed, then install the required packages:

pip install torch transformers datasets jupyter tiktoken
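
A quick way to confirm the environment is ready (package names as in the command above; the exact versions are not critical for this project):

```python
import importlib.metadata as md
import torch

# Print the installed versions of the packages the notebook relies on.
for pkg in ["torch", "transformers", "datasets", "tiktoken"]:
    print(pkg, md.version(pkg))

# Check whether a CUDA-capable GPU is visible to PyTorch.
print("CUDA available:", torch.cuda.is_available())
```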

3️⃣ Run the Notebook

Launch Jupyter Notebook:

jupyter notebook

Open llm-from-scratch.ipynb and run the cells from top to bottom.

🎯 Goals & Use Cases

✅ Understand dataset formats and how to work with Hugging Face libraries
✅ Learn how tokenization works
✅ Learn the inner workings of GPT-style models
✅ Train a small-scale Transformer on a custom dataset
✅ Understand self-attention and language modeling
✅ Experiment with fine-tuning & inference (a minimal generation sketch follows)
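
As a taste of the inference step, a minimal greedy decoding loop for a GPT-style model might look like the sketch below. Here `model` is assumed to be any module that maps a (batch, sequence) tensor of token IDs to (batch, sequence, vocab_size) logits; this is not necessarily the interface the notebook defines, and the notebook may sample rather than decode greedily.

```python
import torch

@torch.no_grad()
def generate_greedy(model, token_ids, max_new_tokens=50, context_len=128):
    """Append tokens one at a time, always picking the most likely next token.

    token_ids: LongTensor of shape (1, T) holding the prompt's token IDs.
    Assumes `model(token_ids)` returns logits of shape (1, T, vocab_size).
    """
    model.eval()
    for _ in range(max_new_tokens):
        # Crop the context to the model's maximum sequence length.
        context = token_ids[:, -context_len:]
        logits = model(context)                          # (1, T, vocab_size)
        next_id = logits[:, -1, :].argmax(dim=-1, keepdim=True)
        token_ids = torch.cat([token_ids, next_id], dim=1)
    return token_ids

# Usage, given a trained model and a tiktoken encoder `enc`:
#   ids = torch.tensor([enc.encode("Once upon a time")])
#   print(enc.decode(generate_greedy(model, ids)[0].tolist()))
```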

🔗 Notebook & Resources

📌 Notebook: llm-from-scratch.ipynb
📖 Transformer Paper: "Attention Is All You Need"
📖 GPT Paper: "Improving Language Understanding by Generative Pre-Training"
🛠 PyTorch Documentation: pytorch.org
👍 Hugging Face Documentation: https://huggingface.co/docs