Leading AI Models Struggle with 90s Video Games
Even the most advanced AI models cannot play the classic first-person shooter Doom effectively. Researchers reached this conclusion after testing the models on the new VideoGameBench benchmark. Claude can play Pokemon, but can it play DOOM? With a simple agent, we let VLMs play it, and found Sonnet 3.7 to get […]