News

The comparison involved 100 games of Mastermind, a reasoning task requiring the models to deduce a hidden code through logical guesses informed by feedback hints. Key metrics included success rate, ...
In the exercise, VERSES compared OpenAI's advanced reasoning model, o1-preview, to Genius. Each model attempted to crack the Mastermind code in 100 games, with up to ten guesses per game.
In this latest demonstration, VERSES shows Genius winning the code-breaking game Mastermind in a side-by-side comparison with China’s leading AI model, DeepSeek’s R1, which has been positioned ...
In the challenge, VERSES compared the DeepSeek-R1 model to Genius. Each model attempted to crack the Mastermind code in 100 games, with up to ten guesses per game.
VERSES® Genius™ Outperforms DeepSeek R1 Model in Code-Breaking “Mastermind” Challenge
Demonstration of Multi-Step Reasoning by VERSES’ Genius Agent Beats China’s Top AI Model in ...