A Leap Forward in AI Performance
An early version of Cotton-2, tested under the alias "sus-column-r" on the LMSYS leaderboard, already outperforms leading models, including Claude 3.5 Sonnet and GPT-4 Turbo. This marks a significant milestone in our pursuit of decentralized, high-performance AI.
Benchmarking Cotton-2
Cotton-2 was evaluated in the LMSYS Chatbot Arena, a crowdsourced benchmark that ranks
language models by head-to-head human votes. Based on overall Elo score, it has surpassed
both Claude 3.5 Sonnet and GPT-4 Turbo, placing it among the most advanced AI models
available today.
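As a rough illustration of how an Elo-style rating responds to head-to-head votes, the sketch below applies the classic Elo update to a single comparison. The K-factor of 32, the 400-point scale, and the function names are assumptions chosen for illustration; the leaderboard's actual rating procedure may differ.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the classic Elo model (400-point scale)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return the new (rating_a, rating_b) after one comparison.

    score_a is 1.0 if A was preferred, 0.0 if B was preferred, 0.5 for a tie.
    k is an assumed K-factor, not a value taken from the leaderboard.
    """
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - exp_a)
    # B's change mirrors A's, so the total rating is conserved.
    new_b = rating_b - k * (score_a - exp_a)
    return new_a, new_b


if __name__ == "__main__":
    # Two equally rated models; A wins one vote.
    print(elo_update(1000.0, 1000.0, 1.0))
```

A win against an equally rated opponent moves each rating by half the K-factor, while a win against a much weaker opponent moves them only slightly, which is why a high rating has to be earned against strong competitors.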
With this beta release, we continue to push the boundaries of decentralized AI, ensuring
that cutting-edge technology remains accessible, transparent, and beneficial to all.



Rigorous Evaluation of Cotton-2: Advancing AI Capabilities
At D-AI, we employ a systematic and rigorous approach to evaluating our models, ensuring they
meet the highest standards of accuracy, reliability, and reasoning. Internally, our AI
Tutors assess model performance across a variety of real-world tasks, simulating
practical use cases for Cotton. During these evaluations, AI Tutors compare multiple
responses generated by Cotton and select the most effective one based on predefined
assessment criteria.
Our evaluation focuses on two fundamental areas:
Cotton-2 exhibits substantial advancements in reasoning, retrieval-based content generation, and tool-use capabilities. Specifically, it demonstrates enhanced proficiency in:
Benchmarking Excellence: Cotton-2’s Performance Across Key Metrics
To ensure comprehensive validation, Cotton-2 was rigorously tested across industry-standard academic benchmarks, evaluating its competencies in:
Both Cotton-2 and Cotton-2 Mini exhibit marked improvements over their predecessor, Cotton-1.5, demonstrating competitive performance against leading foundation models. Key benchmark highlights include:
MathVista – Complex visual mathematical reasoning.
DocVQA – Document-based question answering and interpretation.
With these advancements, Cotton-2 represents a new frontier in AI-driven reasoning, bridging the gap between deep knowledge comprehension, real-world applicability, and robust decision-making.
Benchmark | Cotton-1.5 | Cotton-2 mini‡ | Cotton-2‡ | GPT-4 Turbo* | Claude 3 Opus† | Gemini Pro 1.5 | Llama 3 405B | GPT-4o* | Claude 3.5 Sonnet† |
---|---|---|---|---|---|---|---|---|---|
GPQA | 35.9% | 51.0% | 56.0% | 48.0% | 50.4% | 46.2% | 51.1% | 53.6% | 59.6% |
MMLU | 81.3% | 86.2% | 87.5% | 86.5% | 85.7% | 85.9% | 88.6% | 88.7% | 88.3% |
MMLU-Pro | 51.0% | 72.0% | 75.5% | 63.7% | 68.5% | 69.0% | 73.3% | 72.6% | 76.1% |
MATH§ | 50.6% | 73.0% | 76.1% | 72.6% | 60.1% | 67.7% | 73.8% | 76.6% | 71.1% |
HumanEval¶ | 74.1% | 85.7% | 88.4% | 87.1% | 84.9% | 71.9% | 89.0% | 90.2% | 92.0% |
MMMU | 53.6% | 63.2% | 66.1% | 63.1% | 59.4% | 62.2% | 64.5% | 69.1% | 68.3% |
MathVista | 52.8% | 68.1% | 69.0% | 58.1% | 50.5% | 63.9% | — | 63.8% | 67.7% |
DocVQA | 85.6% | 93.2% | 93.6% | 87.2% | 89.3% | 93.1% | 92.2% | 92.8% | 95.2% |
* GPT-4 Turbo and GPT-4o scores are from the May 2024 release.
† Claude 3 Opus and Claude 3.5 Sonnet scores are from the June 2024 release.
‡ Cotton-2 MMLU, MMLU-Pro, MMMU and MathVista were evaluated using 0-shot CoT.
§ For MATH, we present maj@1 results.
¶ For HumanEval, we report pass@1 benchmark scores.
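For readers unfamiliar with the pass@1 and maj@1 notation in the footnotes, the sketch below shows one common way such metrics are computed: the combinatorial pass@k estimator widely used for code benchmarks, and majority voting over sampled answers for maj@k. The function names and sample data are hypothetical, and this is not necessarily the exact procedure behind the scores reported above.

```python
from collections import Counter
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem from n sampled solutions, c of them correct.

    This is the probability that at least one of k samples drawn without
    replacement from the n is correct. With n = k = 1 it is simply 1.0 or 0.0.
    """
    if n - c < k:
        return 1.0  # fewer than k incorrect samples: every draw of k has a correct one
    return 1.0 - comb(n - c, k) / comb(n, k)


def majority_answer(answers):
    """Most frequent answer among the samples (ties go to the earliest seen)."""
    return Counter(answers).most_common(1)[0][0]


def maj_at_k(samples_per_problem, references):
    """Fraction of problems whose majority-voted answer matches the reference.

    With one sample per problem (maj@1) this reduces to plain accuracy.
    """
    hits = sum(
        majority_answer(samples) == ref
        for samples, ref in zip(samples_per_problem, references)
    )
    return hits / len(references)


if __name__ == "__main__":
    # Hypothetical data: two math problems, one sampled answer each (maj@1).
    print(maj_at_k([["42"], ["7"]], ["42", "8"]))
    # Hypothetical data: 10 code samples for one problem, 3 pass the unit tests.
    print(pass_at_k(n=10, c=3, k=1))
```

In both cases the "@1" setting means the model gets a single attempt per problem, so the reported percentage is the share of problems solved on the first try.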
Experience Cotton with Real-Time Updates
At D-AI, we are committed to continuously refining and enhancing Cotton to deliver a seamless and intelligent AI experience. Over the past few months, we have made significant improvements, and today, we are excited to introduce the next evolution of Cotton.
This latest update features a redesigned interface for improved usability, along with powerful new capabilities designed to enhance user interaction, efficiency, and overall performance.
Stay ahead with real-time updates and experience the future of AI-driven engagement.



Introducing Cotton-2 and Cotton-2 Mini: Advancing AI Through Decentralization
At D-AI, our mission is to ensure that artificial intelligence serves all of humanity by integrating AI with blockchain technology, fostering transparency, security, and equitable access. As part of this commitment, we are introducing two new models that represent the next evolution of decentralized AI:
With significant improvements in steerability, contextual comprehension, and adaptability,
Cotton-2 is designed to excel in complex reasoning tasks, creative collaboration, and
software development support.
In collaboration with Lattice Inc, we are also exploring integrations with newly trained
models to further enhance Cotton’s capabilities, expanding its reasoning, retrieval, and
interpretative functions.
Enterprise API: Deploying Cotton at Scale
Later this month, Cotton-2 and Cotton-2 Mini will be available through our Enterprise API platform, enabling businesses and developers to harness decentralized AI within their applications. Our infrastructure is designed for global-scale AI deployment, offering multi-region inference with low-latency access worldwide.
Key Features of the Enterprise API
To stay informed about the official launch, subscribe to our newsletter and be among the first to integrate Cotton’s AI capabilities into your enterprise applications.
Future Developments: Expanding Cotton’s Capabilities
With the introduction of Cotton-2 and Cotton-2 Mini, we are advancing toward a future where decentralized AI is more capable, intuitive, and accessible. Upcoming enhancements include:
Since the launch of Cotton-1 in November 2023, D-AI has made rapid advancements, driven by a select team of experts dedicated to pioneering AI within a decentralized framework. With Cotton-2, we are reinforcing our position at the forefront of AI research, leveraging our new compute cluster to enhance complex reasoning and knowledge synthesis.
As we continue to expand, we are seeking exceptional talent to join our team and contribute to the future of blockchain-integrated AI. If you are passionate about shaping the future of artificial intelligence, we invite you to explore our career opportunities and be part of this transformative journey.