Boost Your AI’s Math Skills Instantly with One Simple Trick

Explore how DeepMind's OPRO sharply improves AI's math performance using human-like encouragement phrases, pointing to a new way of writing prompts for AI models.

Wednesday, September 20, 2023, 2 min read

In a study recently published on arXiv, Google DeepMind introduces a technique named Optimisation by PROmpting (OPRO) that significantly improves the mathematical performance of large language models (LLMs) such as ChatGPT and Google's PaLM 2. The approach departs from traditional mathematical optimisers: the optimisation problem is described in natural language, and an LLM is asked to propose progressively better solutions to it.

OPRO works by using a "meta-prompt," written in everyday language, as the guide for the optimisation procedure. The meta-prompt describes the problem and lists previously generated solutions together with their scores. Two LLMs interact around it: an optimiser LLM reads the meta-prompt and devises new candidate solutions by combining past outcomes with the problem's verbal description, and a scorer LLM is run with each candidate so that its quality can be measured on criteria such as accuracy. The newly scored candidates are added to the meta-prompt, and over repeated iterations the optimiser proposes candidates that push the scorer LLM's accuracy progressively higher.
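The loop described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not DeepMind's implementation: both models are replaced by simple stand-in functions, and every name here (build_meta_prompt, toy_optimizer, toy_scorer, opro_loop, FRAGMENTS) is hypothetical. In a real setup, toy_optimizer would be a call to the optimiser LLM and toy_scorer would measure the scorer LLM's accuracy on a set of problems.

```python
import random

# Hypothetical phrases the toy optimiser can add; a real optimiser LLM writes free text.
FRAGMENTS = [
    "Take a deep breath.",
    "Work step by step.",
    "Check the answer carefully.",
    "Be brief.",
]


def build_meta_prompt(task_description, scored_prompts, top_k=3):
    """Describe the task and list the best past instructions, lowest score first."""
    best = sorted(scored_prompts, key=lambda pair: pair[1])[-top_k:]
    lines = [task_description, "", "Previous instructions and their scores:"]
    for text, score in best:
        lines.append(f"text: {text}")
        lines.append(f"score: {score:.2f}")
    lines.append("")
    lines.append("Write a new instruction that scores higher.")
    return "\n".join(lines)


def toy_optimizer(meta_prompt, rng):
    """Stand-in for the optimiser LLM: extend the best instruction in the meta-prompt."""
    past = [line[len("text: "):] for line in meta_prompt.splitlines()
            if line.startswith("text: ")]
    base = past[-1] if past else ""
    addition = rng.choice(FRAGMENTS)
    if addition in base:
        return base
    return f"{base} {addition}".strip()


def toy_scorer(instruction):
    """Stand-in for the scorer LLM's accuracy: fraction of three keywords present."""
    keywords = ("breath", "step", "carefully")
    text = instruction.lower()
    return sum(word in text for word in keywords) / len(keywords)


def opro_loop(task_description, seed_prompt, steps=5, candidates_per_step=4, seed=0):
    """Alternate between proposing instructions and scoring them; return the best pair."""
    rng = random.Random(seed)
    scored = [(seed_prompt, toy_scorer(seed_prompt))]
    for _ in range(steps):
        meta_prompt = build_meta_prompt(task_description, scored)
        for _ in range(candidates_per_step):
            candidate = toy_optimizer(meta_prompt, rng)
            scored.append((candidate, toy_scorer(candidate)))
    return max(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    best_text, best_score = opro_loop(
        "Find an instruction that improves maths accuracy.", "Solve the problem."
    )
    print(best_text, best_score)
```

The point of the sketch is the shape of the loop: the history of scored attempts is fed back as plain text, so the optimiser needs no gradients or formal problem definition, only a description and examples of what has worked so far.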

Remarkably, the study revealed how strongly human-like encouragement can influence the performance of AI models. On math problems, simply adding a phrase such as "Take a deep breath and work on this problem step-by-step" to the prompt led to a dramatic rise in accuracy. Specifically, with Google's PaLM 2, accuracy on GSM8K, a dataset of grade-school math word problems, rose from 34% with no special prompt to 80.2% with this phrase.
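To make the numbers concrete, here is a small sketch of how such a phrase might be attached to a question and how large the reported jump is. The helper names (with_instruction, accuracy_gain) and the sample question are hypothetical, and placing the phrase on its own line ahead of the question is an assumption based on the description above rather than a confirmed detail of the study's setup.

```python
def with_instruction(question, instruction=""):
    """Place an optional instruction on its own line ahead of the question."""
    return f"{instruction}\n{question}" if instruction else question


def accuracy_gain(baseline, improved):
    """Return (percentage-point gain, ratio of improved to baseline accuracy)."""
    return improved - baseline, improved / baseline


# Hypothetical GSM8K-style question, written for this example.
question = "A pencil costs 3 dollars. How much do 4 pencils cost?"
prompt = with_instruction(
    question, "Take a deep breath and work on this problem step-by-step."
)

# Accuracy figures as reported in the article: 34% baseline, 80.2% with the phrase.
points, ratio = accuracy_gain(34.0, 80.2)
print(prompt)
print(f"{points:.1f} percentage points, {ratio:.2f}x the baseline")
```

In other words, the reported improvement is 46.2 percentage points, or roughly 2.4 times the baseline accuracy.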

While it may seem puzzling, the effectiveness of these phrases is likely tied to the vast body of language that LLMs have absorbed from sources such as books and the web. Including such phrases possibly enables the model to draw on better solutions encoded in its neural network weights, mimicking a more thoughtful and systematic approach to problem-solving.

According to the researchers, OPRO's real strength lies in its capacity to sift through a large number of candidate prompts and identify the one that gives the best results for a given task. The technique could change the way we interact with LLMs, potentially leading to models that produce results that are not just accurate but also more insightful. Using human-style encouragement to raise AI performance is a promising step towards more effective AI models.