Percepta Embedded a Computer Right Inside the Transformer

LLMs can solve complex mathematical problems, yet they stumble on simple arithmetic. A team led by Christos Tzamos at Percepta found a way to fix this: they embedded a virtual machine directly in the model's weights.

Here's how it works: a program is fed in as tokens, and the model executes it step by step through its weights, emitting the result token by token. There are no external tools; all computation happens autoregressively inside the transformer itself.
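To make "executes it step by step, token by token" concrete, here is a toy sketch in Python. It is not Percepta's virtual machine: the instruction set, the `step` and `run` functions, and the trace format are all hypothetical. It only shows the shape of the idea, where each new token is computed from the program plus the tokens emitted so far, and the result is appended to the trace.

```python
# Toy illustration only: a tiny stack machine whose execution is an
# append-only trace, the way an autoregressive model emits tokens.
# The instruction set and all names here are hypothetical.

def step(program, trace):
    """Return the next trace token from the program and the trace so far."""
    # The machine state (program counter, stack) lives in the last token.
    pc, stack = trace[-1] if trace else (0, ())
    op, *args = program[pc]
    if op == "push":
        return (pc + 1, stack + (args[0],))
    if op == "add":
        return (pc + 1, stack[:-2] + (stack[-2] + stack[-1],))
    if op == "mul":
        return (pc + 1, stack[:-2] + (stack[-2] * stack[-1],))
    if op == "halt":
        return ("halt", stack)
    raise ValueError(f"unknown instruction: {op}")

def run(program, max_steps=1000):
    """Generate the trace one token at a time until the machine halts."""
    trace = []
    for _ in range(max_steps):
        token = step(program, trace)
        trace.append(token)
        if token[0] == "halt":
            break
    return trace

# (2 + 3) * 4
PROGRAM = [("push", 2), ("push", 3), ("add",), ("push", 4), ("mul",), ("halt",)]
trace = run(PROGRAM)
print(trace[-1])  # ('halt', (20,))
```

In the real system the role of `step` is played by the transformer's forward pass rather than by hand-written Python, which is what "inside the weights" means here.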

The main obstacle is that standard attention is too slow for real computation: the cost of each new token grows with the length of everything generated so far. Percepta gets around this with a new decoding path that makes attention exponentially faster, with almost constant work per token. The result is over 30,000 tokens per second on a regular CPU.
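As an analogy for why per-token cost matters over millions of steps, compare two ways of answering the same query over a growing trace. This sketch assumes the query is "what was the latest value written to this address", and every name in it is hypothetical. It is not Percepta's decoding path, which the linked blog post describes; it only contrasts scanning the whole history with keeping an index.

```python
# Analogy only: two ways to find the latest value written to an address
# in a growing execution trace. All names are hypothetical.

def latest_write_scan(trace, addr):
    """Scan the whole history, newest first: work grows with trace length."""
    for a, value in reversed(trace):
        if a == addr:
            return value
    return None

class IndexedTrace:
    """Updates an index on every append, so a lookup is O(1) on average."""

    def __init__(self):
        self.trace = []
        self.latest = {}

    def append(self, addr, value):
        self.trace.append((addr, value))
        self.latest[addr] = value  # overwrite with the most recent write

    def latest_write(self, addr):
        return self.latest.get(addr)

indexed = IndexedTrace()
for addr, value in [(0, 7), (1, 9), (0, 8)]:
    indexed.append(addr, value)
print(latest_write_scan(indexed.trace, 0), indexed.latest_write(0))  # 8 8
```

Both functions return the same answer; the difference is that the scan gets slower as the trace grows, while the indexed lookup does not.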

In practice, the model runs programs written in C (compiled to WebAssembly) for millions of steps and solves the hardest Sudoku puzzles with 100% accuracy.

https://www.percepta.ai/blog/can-llms-be-computers

#ai #llm #research #percepta

————————— Мысли Рвачева (Rvachev's Thoughts) —————————