'Humanity's Last Exam' benchmark is stumping top AI models - can you do any better?
A new academic benchmark aims to 'test the limits of AI knowledge at the frontiers of human expertise.' So far, these LLMs are stumped.
A new academic benchmark aims to 'test the limits of AI knowledge at the frontiers of human expertise.' So far, these LLMs are stumped.