FAB - Factory of Abstract-style Benchmark

Fri, 01 Nov 2024 00:00:00 +0000

Developed the first fully automated, low-cost benchmark generation framework for abstract-style evaluation across general-purpose domains. Enables scalable testing of large language models using structured abstraction errors, covering semantic, structural, and factual variants. Repository: https://github.com/spidermonk7/FAB-Benchmark

Python | Shaoyang Cui
