Recently I’ve been evaluating the ability of LLMs to perform simple reasoning, using the Miss Manners benchmark. This article ranks the LLMs on this benchmark and summarises the results.
