Building TinyGPT from Scratch
April 04, 2026
Tech: PyTorch · Transformers · Deep Learning
I implemented a minimal GPT model from scratch using PyTorch's high-level API. The goal is not just to say it was built from scratch: I want to show the whole process we can use to optimize a model for training and inference under tight GPU resource and memory constraints. Stay tuned!
Architecture
- Embeddings
- Transformer Block
- Normalization
- Linear
- Output
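To make the list above concrete, here is a minimal sketch of how these pieces could fit together in PyTorch. It is not the code from the repo: the class names (`Block`, `TinyGPT`), the default hyperparameters, the learned positional embedding, and the use of `nn.MultiheadAttention` with a pre-norm layout are all assumptions chosen for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Block(nn.Module):
    """Hypothetical pre-norm Transformer block: causal self-attention, then an MLP."""

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ln2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x):
        T = x.size(1)
        # Boolean mask: True marks positions a token may NOT attend to (the future).
        mask = torch.triu(
            torch.ones(T, T, dtype=torch.bool, device=x.device), diagonal=1
        )
        h = self.ln1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask, need_weights=False)
        x = x + attn_out                   # residual around attention
        return x + self.mlp(self.ln2(x))   # residual around the MLP


class TinyGPT(nn.Module):
    """Embeddings -> Transformer blocks -> normalization -> linear -> output logits."""

    def __init__(self, vocab_size=65, block_size=32, d_model=64, n_heads=4, n_layers=2):
        super().__init__()
        self.block_size = block_size
        self.tok_emb = nn.Embedding(vocab_size, d_model)   # token embeddings
        self.pos_emb = nn.Embedding(block_size, d_model)   # learned positions (assumption)
        self.blocks = nn.ModuleList([Block(d_model, n_heads) for _ in range(n_layers)])
        self.ln_f = nn.LayerNorm(d_model)                  # final normalization
        self.head = nn.Linear(d_model, vocab_size, bias=False)  # projection to vocabulary

    def forward(self, idx, targets=None):
        B, T = idx.shape
        pos = torch.arange(T, device=idx.device)
        x = self.tok_emb(idx) + self.pos_emb(pos)
        for block in self.blocks:
            x = block(x)
        logits = self.head(self.ln_f(x))   # shape (B, T, vocab_size)
        loss = None
        if targets is not None:
            # Next-token cross-entropy over every position in the batch.
            loss = F.cross_entropy(logits.reshape(B * T, -1), targets.reshape(B * T))
        return logits, loss
```

Calling `TinyGPT()(idx)` on a `(batch, time)` tensor of token ids returns logits of shape `(batch, time, vocab_size)`; passing `targets` as well also returns the scalar training loss.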
Training Results
Final loss: X.XX
Example generation:
Gutenberg text
Key Challenges
- Data fetching
- GPU memory optimization
- Inference
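As a rough illustration of the three challenges above, the sketch below shows one common way to handle each: random window sampling for data fetching, gradient accumulation to keep per-step GPU memory low, and a sampling loop for inference. The post does not say which techniques the repo actually uses, so treat every function name (`get_batch`, `train_step`, `generate`), every parameter value, and the tiny stand-in model as assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def get_batch(data, block_size, batch_size):
    """Sample random (input, target) windows; targets are inputs shifted by one."""
    starts = torch.randint(0, len(data) - block_size - 1, (batch_size,)).tolist()
    x = torch.stack([data[s : s + block_size] for s in starts])
    y = torch.stack([data[s + 1 : s + block_size + 1] for s in starts])
    return x, y


def train_step(model, optimizer, data, block_size, micro_batch, accum_steps):
    """One optimizer step built from several small batches (gradient accumulation).

    Only one micro-batch of activations is held in memory at a time, while the
    accumulated gradient approximates a batch of micro_batch * accum_steps.
    """
    model.train()
    optimizer.zero_grad(set_to_none=True)
    total = 0.0
    for _ in range(accum_steps):
        x, y = get_batch(data, block_size, micro_batch)
        logits = model(x)                                  # (B, T, vocab)
        loss = F.cross_entropy(logits.flatten(0, 1), y.flatten())
        (loss / accum_steps).backward()                    # scale so gradients average
        total += loss.item() / accum_steps
    optimizer.step()
    return total


@torch.no_grad()
def generate(model, idx, max_new_tokens, block_size):
    """Autoregressive sampling: append one sampled token per iteration."""
    model.eval()
    for _ in range(max_new_tokens):
        logits = model(idx[:, -block_size:])               # crop to the context window
        probs = F.softmax(logits[:, -1, :], dim=-1)        # distribution for next token
        next_id = torch.multinomial(probs, num_samples=1)
        idx = torch.cat([idx, next_id], dim=1)
    return idx


# Stand-in model and random token ids, only to make the sketch runnable.
vocab_size = 50
model = nn.Sequential(nn.Embedding(vocab_size, 32), nn.Linear(32, vocab_size))
data = torch.randint(0, vocab_size, (1000,))
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

step_loss = train_step(model, optimizer, data, block_size=16, micro_batch=4, accum_steps=4)
sample = generate(model, torch.zeros((1, 1), dtype=torch.long), max_new_tokens=10, block_size=16)
```

Gradient accumulation trades wall-clock time for memory: the effective batch size stays the same, but peak activation memory is set by `micro_batch` alone.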
Links
- GitHub repo: https://github.com/Kazeo57/tinyGPT.git
Resources
- https://medium.com/@kelly.nguyen01/implementing-nanogpt-from-scratch-using-pytorch-a-guide-with-pride-and-prejudice-58f3b51f9a84