In a remarkable feat, software developer and self-proclaimed spreadsheet enthusiast Ishan Anand has implemented the GPT-2 language model entirely within Microsoft Excel. This achievement not only demonstrates the versatility of spreadsheets but also offers a unique window into how large language models (LLMs) operate, particularly the underlying Transformer architecture responsible for next-token prediction.
Anand’s pioneering approach
Recognizing the inherent complexity of AI systems, Anand believes that understanding a spreadsheet can unlock the secrets of artificial intelligence. “If you can understand a spreadsheet, then you can understand AI,” he confidently states. The developer’s innovative approach has resulted in a 1.25GB spreadsheet, which he has generously made available on GitHub for anyone to download and explore.
While Anand’s spreadsheet implementation of GPT-2 may not match the cutting-edge capabilities of contemporary LLMs, it offers a valuable glimpse into the groundbreaking GPT-2 model, which garnered significant attention in 2019 for its state-of-the-art performance. It’s important to note that GPT-2 predates the era of conversational AI, with ChatGPT emerging from efforts to prompt GPT-3 conversationally in 2022.
Exploring the transformer architecture
At the core of Anand’s Excel implementation lies the GPT-2 Small model, which has 124 million parameters. In contrast, the full version of GPT-2 employed a staggering 1.5 billion parameters, while its successor, GPT-3, raised the bar even higher with up to 175 billion parameters. Despite its relatively modest size, Anand’s implementation showcases the Transformer architecture’s ability to perform “next-token prediction,” in which the language model completes an input sequence with its most likely continuation.
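The idea of next-token prediction can be sketched in a few lines. In this simplified view (the vocabulary and logit values below are invented for illustration, not taken from Anand’s spreadsheet), the model produces a raw score for every word in its vocabulary, a softmax turns those scores into probabilities, and greedy decoding simply picks the most probable token:

```python
import math

def softmax(logits):
    """Convert raw model scores (logits) into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores over a tiny 4-word vocabulary after a prompt
# like "the cat sat on the". Real GPT-2 scores 50,257 tokens.
vocab = ["mat", "dog", "moon", "carpet"]
logits = [3.2, 0.1, -1.0, 1.5]

probs = softmax(logits)
next_token = vocab[probs.index(max(probs))]  # greedy decoding: take the argmax
print(next_token)  # → "mat"
```

Production systems often sample from the distribution instead of always taking the argmax, which is why LLM outputs can vary between runs.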
While the spreadsheet can handle only 10 tokens of input, a minuscule fraction of GPT-4 Turbo’s 128,000-token capacity, Anand’s work serves as a valuable educational resource. He believes his “low-code introduction” is ideal for tech executives, marketers, product managers, AI policymakers, ethicists, developers, and scientists seeking to better understand the foundations of LLMs.
A foundation for modern LLMs
Anand asserts that the Transformer architecture employed in his GPT-2 implementation remains “the foundation for OpenAI’s ChatGPT, Anthropic’s Claude, Google’s Bard/Gemini, Meta’s Llama, and many other LLMs.” His multi-sheet work guides users through word tokenization, text positions and weightings, iterative refinement of next-word prediction, and ultimately, selecting the output token: the predicted next word of the sequence.
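The stages Anand walks through correspond to the standard GPT-2 forward pass. The toy model below sketches that pipeline end to end: embeddings plus positions, a single causal self-attention block (GPT-2 Small stacks 12 of them), and a final scoring step over the vocabulary. The weights are randomly initialized stand-ins for the trained parameters, and the tiny dimensions are chosen only for readability, so the prediction itself is meaningless:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions; GPT-2 Small uses a 50,257-token vocabulary and d_model = 768.
VOCAB, D_MODEL, SEQ = 16, 8, 4

# Random weights stand in for trained parameters.
W_embed = rng.normal(size=(VOCAB, D_MODEL))     # token embeddings
W_pos   = rng.normal(size=(SEQ, D_MODEL))       # positional embeddings
W_qkv   = rng.normal(size=(D_MODEL, 3 * D_MODEL))  # query/key/value projection
W_out   = rng.normal(size=(D_MODEL, VOCAB))     # unembedding (vocabulary scores)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def predict_next_token(token_ids):
    n = len(token_ids)
    # 1. Token embeddings plus positional information.
    x = W_embed[token_ids] + W_pos[:n]
    # 2. One causal self-attention block with a residual connection.
    q, k, v = np.split(x @ W_qkv, 3, axis=-1)
    scores = q @ k.T / np.sqrt(D_MODEL)
    mask = np.triu(np.full((n, n), -np.inf), k=1)  # each token sees only the past
    x = x + softmax(scores + mask) @ v
    # 3. Score every vocabulary entry from the final position.
    logits = x[-1] @ W_out
    # 4. Greedily select the output token.
    return int(np.argmax(logits))

next_id = predict_next_token([3, 1, 4, 1])
```

A real implementation adds layer normalization, multiple attention heads, and feed-forward sublayers per block, but the data flow, which Anand’s sheets trace cell by cell, is the same.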
One of the noteworthy benefits of Anand’s Excel-based implementation is the ability to run the LLM entirely locally on a PC, without relying on cloud services or API calls. However, he cautions against attempting to use the Excel file on a Mac or in cloud-based spreadsheet applications, as it may lead to crashes and performance issues. Additionally, Anand recommends using the latest version of Excel for optimal performance.
While Anand’s GPT-2 implementation may not match the capabilities of contemporary LLMs, it serves as a remarkable educational tool and a testament to the versatility of spreadsheets. By demystifying the inner workings of language models, Anand’s work empowers individuals from diverse backgrounds to gain a deeper understanding of artificial intelligence and its underlying architectural principles.