LLM Connections Solver

Evaluating LLM abstract reasoning through the New York Times Connections word game

About

An evaluation framework and automated solver designed to test the abstract reasoning capabilities of Large Language Models using the NYT Connections game. The repository includes a dataset of 442 games, LLM performance results, and a scoring system based on a knowledge taxonomy.
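
As a rough illustration of the task being evaluated, here is a minimal sketch of how a proposed Connections solution could be checked against the answer key. It does not reproduce the repository's taxonomy-based scoring system; the function name `count_correct_groups` and the sample puzzle are hypothetical and are not taken from the project or its dataset.

```python
def count_correct_groups(proposed, solution):
    """Count proposed groups that exactly match a solution group.

    Word order inside a group and letter case are ignored.
    """
    solution_sets = {frozenset(w.lower() for w in group) for group in solution}
    proposed_sets = {frozenset(w.lower() for w in group) for group in proposed}
    return len(solution_sets & proposed_sets)


# Hypothetical 16-word puzzle (illustrative only).
solution = [
    ["bass", "salmon", "trout", "perch"],
    ["drum", "flute", "harp", "oboe"],
    ["red", "green", "blue", "pink"],
    ["oak", "elm", "pine", "fir"],
]

# A model's answer: two groups right, two with one word swapped.
proposed = [
    ["salmon", "bass", "perch", "trout"],
    ["drum", "flute", "harp", "pink"],
    ["red", "green", "blue", "oboe"],
    ["fir", "pine", "elm", "oak"],
]

print(count_correct_groups(proposed, solution))  # prints 2
```

Comparing groups as sets keeps the check independent of the order in which a model lists the words, which is one plausible design choice for grading free-form model output.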

#benchmarking #reasoning #nlp #word-game

Details

Built with: Unknown
Creator: Mariam Mustafa @mustafamariam
Source date: Published on X Jun 26, 2024
Listed: Added to Dropday just now
Evidence: Strong
The page is a verified GitHub repository containing extensive code, datasets, and a link to a formal research paper on arXiv.

Media & coverage

sourced from 1 post

Similar

VibeThinker (Apps & Tools)

VulcanBench (Apps & Tools)

Ornith 1.0 9B (Apps & Tools)