
This experiment trains a 67-million-parameter transformer from scratch on an M4 Mac Mini and reaches 93.94 percent exact-match accuracy on CLI command generation, despite the machine's strict memory and processing constraints.
In a recent experiment, I trained a 67-million-parameter transformer model from scratch on an M4 Mac Mini using Apple Silicon’s Metal Performance Shaders (MPS) backend. The goal was to see how far a carefully designed small model could go when constrained by consumer hardware limits. Despite the limitations (24GB of unified memory and no discrete GPU), the model achieved 93.94 percent exact-match accuracy on CLI command generation, a task where even a single missing character results in failure.
The defining constraint was the hardware: 24GB of unified memory and no discrete GPU. Training ran on the MPS backend, which is optimized for Apple’s hardware but lacks the parallel processing power of dedicated GPUs, so every design decision had to balance memory pressure against computational efficiency.
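To put the memory pressure in rough numbers, here is a back-of-the-envelope sketch. It assumes 32-bit values and an Adam-style optimizer that keeps two extra values per parameter; the article states neither, and the function name is hypothetical. Activations are excluded because they depend on batch size and sequence length, which are not given.

```python
def training_state_bytes(n_params: int, bytes_per_value: int = 4,
                         optimizer_slots: int = 2) -> int:
    """Bytes held by weights, gradients, and optimizer state.

    Assumptions (not from the article): fp32 storage and an Adam-style
    optimizer with two moment buffers per parameter.
    """
    copies = 1 + 1 + optimizer_slots  # weights + gradients + optimizer buffers
    return n_params * bytes_per_value * copies


N_PARAMS = 67_000_000  # parameter count reported in the article

total = training_state_bytes(N_PARAMS)
print(f"{total / 1e9:.2f} GB")  # prints "1.07 GB"
```

Under these assumptions the fixed training state is about 1 GB, so most of the memory budget would go to activations, which is where batch size and sequence length matter.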
CLI command generation is a stringent task. Commands are short, compositional, and highly sensitive to errors. A missing flag or an incomplete pipe can render the entire command invalid. This made exact-match accuracy the only relevant metric, as there was no room for partial correctness.
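Exact-match scoring can be sketched in a few lines. This is a minimal illustration, not the author's evaluation code: the function name and the sample commands are made up, and stripping surrounding whitespace before comparing is an assumption.

```python
def exact_match_accuracy(predictions, references):
    """Fraction of predictions identical to their reference command.

    Assumption: leading/trailing whitespace is ignored; everything else,
    including flags and pipes, must match character for character.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        return 0.0
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return hits / len(references)


# Hypothetical examples: the second prediction is missing one character ("a").
preds = ["ls -la | grep txt", "git commit -m 'fix'", "tar -xzf a.tar.gz"]
refs = ["ls -la | grep txt", "git commit -am 'fix'", "tar -xzf a.tar.gz"]

print(f"{exact_match_accuracy(preds, refs):.2%}")  # prints "66.67%"
```

The second pair shows why the metric is unforgiving: a single dropped flag character scores the same as a completely wrong command.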
The model leveraged modern architectural components like Rotary Position Embedding (RoPE), Root Mean Square Normalization (RMSNorm), and the SwiGLU activation function. These choices were driven by the need for efficiency and performance on limited hardware.
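For readers unfamiliar with these components, the sketch below shows each one on plain Python lists. It is didactic only: a real training run would use a tensor library, and the epsilon, the RoPE base of 10000, and all function names are assumptions rather than details from the article.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: divide by the root mean square of x, then apply a learned gain."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [w * v / rms for v, w in zip(x, weight)]


def silu(v):
    """SiLU (swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def swiglu_gate(gate, up):
    """Elementwise SwiGLU gating: SiLU(gate) * up.

    In a full feed-forward block, gate and up are two linear projections
    of the same input, and a third projection maps the result back.
    """
    return [silu(g) * u for g, u in zip(gate, up)]


def rope(x, position, base=10000.0):
    """RoPE: rotate each pair (x[i], x[i+1]) by position * base**(-i/d)."""
    d = len(x)  # must be even
    out = []
    for i in range(0, d, 2):
        theta = position * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        out += [x[i] * c - x[i + 1] * s, x[i] * s + x[i + 1] * c]
    return out


def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


q, k = [1.0, 2.0, 3.0, 4.0], [0.5, -1.0, 2.0, 0.0]
# Attention scores depend only on the relative offset (2 in both cases).
print(abs(dot(rope(q, 3), rope(k, 1)) - dot(rope(q, 5), rope(k, 3))) < 1e-9)
```

RMSNorm skips the mean subtraction and bias of LayerNorm, and RoPE encodes position without a learned embedding table, which is part of why these components are a common choice for small models.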

The final result was strong given the constraints: 93.94 percent exact-match accuracy on CLI command generation.
This project highlighted several key insights:
This experiment demonstrates that with careful design and modern architectural components, it is possible to train effective small language models on consumer hardware. While the accuracy of 93.94 percent is impressive, the real takeaway is the feasibility of training from scratch on limited resources. This opens up new possibilities for researchers and practitioners who may not have access to high-end GPUs or large datasets.
Original Sources
↗ https://geddydukes.com/blog/tiny-llm?utm_source=tldrai
About the author
Kai built ML infrastructure at a Bay Area startup before developing an obsession with transformer architectures and inference optimisation that eventually pulled him out of product work entirely. A stint at a compute research lab sharpened his instinct for what actually matters in a model release versus what is marketing. He writes from the inside — from the perspective of someone who has debugged the systems he is describing at three in the morning. He is allergic to hype and instinctively drawn to the unglamorous plumbing questions that everyone else skips over.
29 January 2026