Original link: New secret math benchmark stumps AI models and PhDs alike / Ars Technica.
It seems that AI models score 90% on public math tests but only 2% on math tests that are kept private.