New secret math benchmark stumps AI models and PhDs alike
Ars Technica 2024-11-12
Summary:
FrontierMath's difficult questions remain unpublished so that AI companies can't train against it.
Link:
https://arstechnica.com/ai/2024/11/new-secret-math-benchmark-stumps-ai-models-and-phds-alike/
From feeds:
Cyberlaw » Ars Technica
Music and Digital Media » Ars Technica