On June 3, leading blockchain and AI organizations, including EigenLayer, Cyber, and Sentient, announced the establishment of the Crypto AI Benchmark Alliance (CAIBA). The community-led initiative aims to set transparent standards for evaluating AI models and crypto agents.
The Alliance has 14 member organizations: Alchemy, Cyber, EigenLayer, Goldsky, IOSG, LazAI, Magic Newton, Metis, MyShell, OpenGradient, RootData, Sentient, Surf, and Thirdweb.
The member organizations are contributing datasets, tools, and specialized knowledge to develop a benchmarking framework. Each benchmark will be published openly on developer platforms and will consist of defined tasks, reference solutions, and grading scripts.
The Alliance's first release, the Crypto AI Agents (CAIA) benchmark, is available now. It evaluates how accurately agents answer protocol- and token-related questions, how they plan multi-step tasks, and how they use tools such as block explorers and APIs to complete those tasks. CAIA includes tasks covering tokenomics, on-chain analysis, project research, and transaction workflows.
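For readers who want a concrete picture of a benchmark made up of defined tasks, reference solutions, and a grading script, here is a minimal Python sketch. The names in it (`BenchmarkTask`, `grade`, `score`) and the exact-match grading rule are assumptions made for illustration only. The article does not describe the Alliance's actual schema or scoring method.

```python
from dataclasses import dataclass


# Hypothetical structure for illustration; not the Alliance's published schema.
@dataclass
class BenchmarkTask:
    task_id: str
    category: str          # e.g. "tokenomics" or "on-chain analysis"
    prompt: str            # the defined task given to the agent
    reference_answer: str  # the reference solution


def normalize(text: str) -> str:
    """Lowercase the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def grade(task: BenchmarkTask, answer: str) -> bool:
    """Assumed grading rule: exact match after normalization."""
    return normalize(answer) == normalize(task.reference_answer)


def score(tasks: list[BenchmarkTask], answers: dict[str, str]) -> float:
    """Return the fraction of tasks answered correctly.

    A task with no submitted answer counts as incorrect.
    """
    if not tasks:
        return 0.0
    correct = sum(grade(t, answers.get(t.task_id, "")) for t in tasks)
    return correct / len(tasks)
```

A real grading script would likely need more than string matching, for example numeric tolerances for on-chain figures or checks on the tool calls an agent made. The sketch only shows how the three components fit together.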
Additional benchmarks are in development, and the Alliance welcomes new contributors. Developers, researchers, and protocols are encouraged to participate by submitting models for evaluation or proposing new benchmark tasks.
Image: CAIBA