UC Berkeley's RDI centre earlier this month introduced Agents' Last Exam, a new benchmark that tests how well AI agents ...