There is a common problem for all AI companies for overfitting to benchmarks. XAI Grok 4 has some problems with prompt adherence. XAI could have had overfitting resulted from the reinforcement ...