AI Dream Team: How Multiple LLMs are Cooperating to Solve Complex Problems
The world of Artificial Intelligence is constantly evolving, with Large Language Models (LLMs) at the forefront of this revolution. But what if, instead of relying on a single, monolithic model, we could harness the power of multiple LLMs working together? That’s precisely what Sakana AI, a Japanese AI lab, has achieved with its innovative new technique.

This blog post explores Sakana AI’s approach, called Multi-LLM AB-MCTS, which allows multiple LLMs to cooperate on a single task. We’ll delve into how this “dream team” of AI agents can solve complex problems that are beyond the capabilities of individual models, and what this means for the future of AI in enterprise applications.
What is Multi-LLM AB-MCTS?
Multi-LLM AB-MCTS (Adaptive Branching Monte Carlo Tree Search) is a novel technique developed by Sakana AI that enables multiple Large Language Models (LLMs) to collaborate on a single task. Think of it as assembling a team of experts, each with their unique skills, to tackle a complex project.
Why Does It Matter?
In the rapidly evolving landscape of AI, frontier models possess distinct strengths and weaknesses based on their training data and architecture. One model might excel at coding, while another shines in creative writing. Recognizing these differences as valuable assets, Sakana AI’s approach allows businesses to dynamically leverage the best aspects of different frontier models. This unlocks potential to achieve superior results by assigning the right AI to the right task.
How Does It Work?
The core of the Multi-LLM AB-MCTS method lies in an algorithm that intelligently balances two search strategies:
- Searching Deeper: Refining a promising answer repeatedly.
- Searching Wider: Generating completely new solutions from scratch.
This balance is achieved using Monte Carlo Tree Search (MCTS), a decision-making algorithm. The system uses probability models to determine whether to refine an existing solution or generate a new one.
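To make that trade-off concrete, here is a minimal Thompson-sampling sketch in Python. The Arm class, toy scorer, and numeric “candidates” are illustrative stand-ins for real LLM calls and answer evaluation (this is not Sakana AI’s implementation); the sketch only shows how sampling from per-action probability models can arbitrate between refining and regenerating.

```python
import random

class Arm:
    """Beta posterior over how often one action (refine or generate) pays off."""
    def __init__(self):
        self.wins, self.losses = 1.0, 1.0  # uniform Beta(1, 1) prior

    def sample(self):
        return random.betavariate(self.wins, self.losses)

    def update(self, reward):
        self.wins += reward           # reward in [0, 1] from a scorer
        self.losses += 1.0 - reward

def score(candidate):
    """Toy scorer standing in for checking an LLM answer (e.g., unit tests)."""
    return max(0.0, min(1.0, candidate))

refine_arm, generate_arm = Arm(), Arm()
best = 0.0

for _ in range(100):
    # Draw one sample from each posterior; the larger draw decides the action.
    if refine_arm.sample() >= generate_arm.sample():
        candidate = best + random.gauss(0.05, 0.1)  # small tweak: search deeper
        refine_arm.update(score(candidate))
    else:
        candidate = random.random()                 # fresh attempt: search wider
        generate_arm.update(score(candidate))
    best = max(best, score(candidate))

print(f"best candidate score: {best:.2f}")
```

Because each action’s posterior is updated with observed rewards, the loop gradually favors whichever strategy has been paying off, while still occasionally exploring the other.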
The Multi-LLM Advantage
The innovation extends further with Multi-LLM AB-MCTS, which decides not only what to do (refine vs. generate) but also which LLM should do it. The system starts with a balanced mix of available LLMs and learns over time which models are more effective, allocating more of the workload to them.
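Here is a rough sketch of how that allocation could work, extending the same bandit-style idea from actions to models. The model names, “skill” numbers, and ask() stub below are hypothetical placeholders; in practice each call would query a real LLM and score its answer against the task.

```python
import random

# One Beta posterior [successes, failures] per available model.
posteriors = {name: [1.0, 1.0] for name in ("model_a", "model_b", "model_c")}

def ask(model_name):
    """Hypothetical stand-in for calling an LLM and scoring its answer."""
    skill = {"model_a": 0.3, "model_b": 0.6, "model_c": 0.5}[model_name]
    return 1.0 if random.random() < skill else 0.0

for _ in range(200):
    # Sample each model's posterior; route the next call to the winner.
    draws = {m: random.betavariate(a, b) for m, (a, b) in posteriors.items()}
    chosen = max(draws, key=draws.get)
    reward = ask(chosen)
    posteriors[chosen][0] += reward          # successes
    posteriors[chosen][1] += 1.0 - reward    # failures

# Over time, stronger models accumulate more of the workload.
for m, (a, b) in posteriors.items():
    print(f"{m}: calls={a + b - 2:.0f}, estimated win rate={a / (a + b):.2f}")
```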
Benefits and Comparisons
Sakana AI’s technique falls under “inference-time scaling,” improving performance by allocating computational resources after a model is trained. This contrasts with “training-time scaling,” which focuses on making models bigger and training them on larger datasets. Established inference-time scaling methods include:
- Reinforcement Learning: training models to generate longer, more detailed chain-of-thought (CoT) sequences.
- Repeated Sampling: Giving the model the same prompt multiple times to generate a variety of potential solutions.
Multi-LLM AB-MCTS offers a smarter, more strategic version of repeated sampling, complementing reasoning techniques like long CoT through RL.
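For contrast, repeated sampling in its simplest form is just best-of-n: send the same prompt n times and keep the highest-scoring answer. In the sketch below, call_llm and score_answer are hypothetical stubs for a real API client and verifier; note that, unlike AB-MCTS, nothing adapts between samples.

```python
import random

def call_llm(prompt: str) -> str:
    """Hypothetical stub for a real LLM API call."""
    return f"candidate-{random.randint(0, 999)}"

def score_answer(answer: str) -> float:
    """Hypothetical stub for a verifier, test suite, or reward model."""
    return random.random()

def best_of_n(prompt: str, n: int = 8) -> str:
    # All n samples are independent: no refinement, no model selection.
    candidates = [call_llm(prompt) for _ in range(n)]
    return max(candidates, key=score_answer)

print(best_of_n("Solve the puzzle..."))
```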
Real-World Examples and Testing
The researchers tested their system on the ARC-AGI-2 benchmark, which is designed to test a human-like ability to solve novel visual reasoning problems. They combined models, including o4-mini, Gemini 2.5 Pro, and DeepSeek-R1. The results were impressive:
- The collective of models found correct solutions for over 30% of the test problems, significantly outperforming any of the models working alone.
- The system dynamically assigned the best model for a given problem.
- In some cases, models solved problems that were previously impossible for any single one of them. For example, one model corrected the flawed solution generated by another to produce the right answer.
Common Mistakes to Avoid
When implementing multi-LLM systems, consider these potential pitfalls:
- Hallucinations: Different models have varying tendencies to “hallucinate.” Combining models can mitigate this issue.
- Inefficient Task Allocation: Failing to dynamically allocate tasks to the most suitable model.
- Lack of Strategic Balancing: Not properly balancing searching deeper versus searching wider.
TreeQuest: An Open-Source Framework
To facilitate the adoption of this technique, Sakana AI has released the underlying algorithm as an open-source framework called TreeQuest, available under an Apache 2.0 license. TreeQuest offers a flexible API, enabling users to implement Multi-LLM AB-MCTS for their own tasks with custom scoring and logic. This will allow for further innovation and experimentation within the AI community.
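As a starting point, here is a minimal sketch of what a TreeQuest run can look like, based on the usage pattern in the project’s README; exact class and function names may differ between versions, so treat it as illustrative rather than definitive. You supply a generate function that takes a parent state (None at the root) and returns a (state, score) pair, and the framework drives the adaptive branching search.

```python
import treequest as tq

def generate(parent_state):
    # In a real setup this would call an LLM: answer from scratch when
    # parent_state is None, otherwise refine the parent's answer.
    if parent_state is None:
        new_state = "first attempt"
    else:
        new_state = parent_state + " (refined)"
    score = min(1.0, len(new_state) / 40.0)  # toy score in [0, 1]
    return new_state, score

algo = tq.ABMCTSA()             # an adaptive-branching MCTS variant
tree = algo.init_tree()
for _ in range(50):
    tree = algo.step(tree, {"attempt": generate})

best_state, best_score = tq.top_k(tree, algo, k=1)[0]
print(best_state, best_score)
```

In a multi-LLM setup, the dict passed to step() would map several action names (one per model or per refine/generate strategy) to their own generate functions, letting the search allocate calls among them.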
Business Applications
The potential business applications of Multi-LLM AB-MCTS are vast. According to Takuya Akiba, research scientist at Sakana AI, this method could be highly effective for:
- Complex algorithmic coding
- Improving the accuracy of machine learning models
- Optimizing performance metrics of existing software (e.g., improving the response latency of a web service)
Conclusion
Sakana AI’s Multi-LLM AB-MCTS represents a significant step forward in AI development. By enabling multiple LLMs to collaborate, this technique unlocks new possibilities for solving complex problems and creating more robust and reliable AI applications.

Key takeaways:
- Multi-LLM AB-MCTS allows multiple LLMs to cooperate on a single task.
- This approach can solve problems that are insurmountable for any single model.
- Sakana AI has released an open-source framework called TreeQuest to help developers implement this technique.
Friendly Tip: Explore the TreeQuest framework and consider how you can apply Multi-LLM AB-MCTS to your own projects! Share this post and let us know what you think in the comments below.