August 13, 2024

Cotton-2 Beta Release

Pushing the Boundaries of AI with Cutting-Edge Reasoning

We are excited to introduce Cotton-2, our most advanced language model yet, designed to set new standards in chat, coding, and reasoning. This release also brings Cotton-2 Mini, a streamlined yet powerful counterpart, ensuring AI accessibility for a wider range of users.

Both models are now available to Cotton users, with enterprise API access launching later this month.

A Leap Forward in AI Performance

An early version of Cotton-2, tested under the alias "sus-column-r" on the LMSYS leaderboard, is already outperforming leading models, including Claude 3.5 Sonnet and GPT-4-Turbo. This marks a significant milestone in our pursuit of decentralized, high-performance AI.

Benchmarking Cotton-2

Cotton-2 was evaluated in the LMSYS Chatbot Arena, a widely followed benchmark in which models are ranked by head-to-head human preference votes. Based on overall Elo score, the early version surpassed both Claude 3.5 Sonnet and GPT-4-Turbo, placing it among the most capable AI models available today.
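
To make the Elo comparison concrete, here is a minimal sketch of how a rating moves after a single head-to-head vote. How the leaderboard actually derives its published ratings is not described in this post, so this is only the classic online Elo update. The K-factor of 32, the 400-point scale, and the function names are assumptions chosen for illustration.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that model A is preferred over model B under the Elo model."""
    # Assumed convention: a 400-point gap corresponds to 10:1 odds.
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return both ratings after one vote.

    score_a is 1.0 if A was preferred, 0.5 for a tie, 0.0 if B was preferred.
    k (assumed value) controls how far a single vote can move a rating.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # Whatever A gains, B loses, so the total rating is conserved.
    return rating_a + delta, rating_b - delta


# Hypothetical usage: two models start level, then A wins three votes and ties one.
model_a, model_b = 1000.0, 1000.0
for outcome in (1.0, 1.0, 0.5, 1.0):
    model_a, model_b = update(model_a, model_b, outcome)
```

A higher Elo score therefore means a model wins more of its pairwise comparisons than its opponents' ratings would predict. It does not by itself say anything about performance on a particular benchmark.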

With this beta release, we continue to push the boundaries of decentralized AI, ensuring that cutting-edge technology remains accessible, transparent, and beneficial to all.

Rigorous Evaluation of Cotton-2: Advancing AI Capabilities

At D-AI, we employ a systematic and rigorous approach to evaluating our models, ensuring they meet the highest standards of accuracy, reliability, and reasoning. Internally, our AI Tutors assess model performance across a variety of real-world tasks, simulating practical use cases for Cotton. During these evaluations, AI Tutors compare multiple responses generated by Cotton and select the most effective one based on predefined assessment criteria.

Our evaluation focuses on two fundamental areas:

  • Instruction Adherence – The ability to interpret and execute complex instructions with precision.
  • Factual Accuracy – The capability to generate responses that are verifiable, contextually accurate, and logically sound.

Cotton-2 exhibits substantial advancements in reasoning, retrieval-based content generation, and tool-use capabilities. Specifically, it demonstrates enhanced proficiency in:

  • Identifying missing information with greater accuracy.
  • Reasoning through sequential events in complex queries.
  • Filtering out irrelevant inputs, leading to more coherent and focused responses.

    Benchmarking Excellence: Cotton-2’s Performance Across Key Metrics

    To ensure comprehensive validation, Cotton-2 was rigorously tested across industry-standard academic benchmarks, evaluating its competencies in:

  • Reading comprehension and logical reasoning
  • Advanced mathematics and scientific inquiry
  • Code generation and problem-solving

    Both Cotton-2 and Cotton-2 Mini exhibit marked improvements over their predecessor, Cotton-1.5, demonstrating competitive performance against leading foundation models. Key benchmark highlights include:

  • Graduate-Level Science (GPQA): Significant advancements in domain-specific reasoning and knowledge synthesis.
  • General Knowledge (MMLU, MMLU-Pro): Strong performance in broad-spectrum knowledge assessments.
  • Mathematical Problem-Solving (MATH): Superior handling of competition-level mathematical reasoning.
  • Vision-Based AI Tasks: State-of-the-art performance on:
      • MathVista – Complex visual mathematical reasoning.
      • DocVQA – Document-based question answering and interpretation.

    With these advancements, Cotton-2 represents a new frontier in AI-driven reasoning, bridging the gap between deep knowledge comprehension, real-world applicability, and robust decision-making.

    Benchmark  Cotton-1.5  Cotton-2 Mini  Cotton-2  GPT-4 Turbo*  Claude 3 Opus  Gemini Pro 1.5  Llama 3 405B  GPT-4o*  Claude 3.5 Sonnet
    ---------  ----------  -------------  --------  ------------  -------------  --------------  ------------  -------  -----------------
    GPQA       35.9%       51.0%          56.0%     48.0%         50.4%          46.2%           51.1%         53.6%    59.6%
    MMLU       81.3%       86.2%          87.5%     86.5%         85.7%          85.9%           88.6%         88.7%    88.3%
    MMLU-Pro   51.0%       72.0%          75.5%     63.7%         68.5%          69.0%           73.3%         72.6%    76.1%
    MATH§      50.6%       73.0%          76.1%     72.6%         60.1%          67.7%           73.8%         76.6%    71.1%
    HumanEval  74.1%       85.7%          88.4%     87.1%         84.9%          71.9%           89.0%         90.2%    92.0%
    MMMU       53.6%       63.2%          66.1%     63.1%         59.4%          62.2%           64.5%         69.1%    68.3%
    MathVista  52.8%       68.1%          69.0%     58.1%         50.5%          63.9%           n/a           63.8%    67.7%
    DocVQA     85.6%       93.2%          93.6%     87.2%         89.3%          93.1%           92.2%         92.8%    95.2%

    * GPT-4-Turbo and GPT-4o scores are from the May 2024 release.
    Claude 3 Opus and Claude 3.5 Sonnet scores are from the June 2024 release.
    Cotton-2 MMLU, MMLU-Pro, MMMU and MathVista were evaluated using 0-shot CoT.
    § For MATH, we present maj@1 results.
    For HumanEval, we report pass@1 benchmark scores.
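
For readers unfamiliar with the two metrics named in the notes above, the following sketch shows their standard definitions. The post does not say how its own scores were computed, so the function names and sample inputs here are illustrative assumptions rather than the evaluation code behind the table.

```python
from collections import Counter
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Standard unbiased pass@k estimate for one coding problem.

    n: number of samples generated, c: how many passed the unit tests,
    k: how many samples a user is allowed to draw.
    """
    if n - c < k:
        return 1.0  # every size-k subset contains at least one passing sample
    return 1.0 - comb(n - c, k) / comb(n, k)


def maj_at_k(answers: list[str], reference: str) -> bool:
    """maj@k: correct if the most frequent of the k sampled answers matches the reference."""
    most_common_answer, _ = Counter(answers).most_common(1)[0]
    return most_common_answer == reference


# Hypothetical inputs: with k = 1 both metrics reduce to single-sample accuracy.
single_sample_pass_rate = pass_at_k(n=10, c=3, k=1)   # fraction of samples that pass
single_answer_correct = maj_at_k(["42"], reference="42")
```

A benchmark score is then the average of the per-problem values across the whole test set.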

    Experience Cotton with Real-Time Updates

    At D-AI, we are committed to continuously refining and enhancing Cotton to deliver a seamless and intelligent AI experience. Over the past few months, we have made significant improvements, and today, we are excited to introduce the next evolution of Cotton.

    This latest update features a redesigned interface for improved usability, along with powerful new capabilities designed to enhance user interaction, efficiency, and overall performance.

    Stay ahead with real-time updates and experience the future of AI-driven engagement.

    Introducing Cotton-2 and Cotton-2 Mini: Advancing AI Through Decentralization

    At D-AI, our mission is to ensure that artificial intelligence serves all of humanity by integrating AI with blockchain technology, fostering transparency, security, and equitable access. As part of this commitment, we are introducing two new models that represent the next evolution of decentralized AI:

  • Cotton-2 – A state-of-the-art AI assistant with advanced natural language understanding and vision capabilities, seamlessly integrating real-time information retrieval for enhanced contextual accuracy.
  • Cotton-2 Mini – A lightweight yet high-performance model optimized for efficiency, speed, and balanced response quality.

    With significant improvements in steerability, contextual comprehension, and adaptability, Cotton-2 is designed to excel in complex reasoning tasks, creative collaboration, and software development support.

    In collaboration with Lattice Inc, we are also exploring integrations with newly trained models to further enhance Cotton’s capabilities, expanding its reasoning, retrieval, and interpretative functions.

    Enterprise API: Deploying Cotton at Scale

    Later this month, Cotton-2 and Cotton-2 Mini will be available through our Enterprise API platform, enabling businesses and developers to harness decentralized AI within their applications. Our infrastructure is designed for global-scale AI deployment, offering multi-region inference with low-latency access worldwide.

    Key Features of the Enterprise API

  • Advanced Security – Mandatory multi-factor authentication (YubiKey, Apple Touch ID, TOTP) ensures enterprise-grade protection.
  • Comprehensive Analytics – Detailed traffic metrics, usage insights, and billing analytics, including data export capabilities.
  • Seamless Integration – A management API for user, team, and billing administration within enterprise environments.
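
Of the authentication factors listed above, TOTP is the one that can be illustrated in a few lines. The sketch below follows the public TOTP standard (RFC 6238, built on HOTP from RFC 4226) using only the Python standard library. It is not the Enterprise API's implementation, and the function name, the default 30-second step, and the 6-digit length are assumptions taken from common practice.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret: bytes, at_time: Optional[float] = None,
         step: int = 30, digits: int = 6) -> str:
    """Return the time-based one-time password for a shared secret (HMAC-SHA1)."""
    now = time.time() if at_time is None else at_time
    counter = int(now // step)  # number of elapsed time steps since the Unix epoch
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte select a 4-byte window.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


# Hypothetical usage: the server and the authenticator app share `secret`
# and both compute the same code for the current time window.
secret = b"12345678901234567890"  # demo value only; real secrets are random
current_code = totp(secret)
```

Because both sides derive the code from the shared secret and the current time, a code is only valid for one time step. That short validity is what makes TOTP useful as a second factor.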

    To stay informed about the official launch, subscribe to our newsletter and be among the first to integrate Cotton’s AI capabilities into your enterprise applications.

    Future Developments: Expanding Cotton’s Capabilities

    With the introduction of Cotton-2 and Cotton-2 Mini, we are advancing toward a future where decentralized AI is more capable, intuitive, and accessible. Upcoming enhancements include:

  • Enhanced Retrieval and Search – Leveraging AI-driven reasoning to generate deeper, more insightful responses.
  • Refined Conversational Dynamics – Optimized response generation for increased contextual awareness and adaptability.
  • Multimodal AI Expansion – The upcoming multimodal feature set will enable native image, text, and data interpretation within Cotton and the API.

    Since the launch of Cotton-1 in November 2023, D-AI has made rapid advancements, driven by a select team of experts dedicated to pioneering AI within a decentralized framework. With Cotton-2, we are reinforcing our position at the forefront of AI research, leveraging our new compute cluster to enhance complex reasoning and knowledge synthesis.

    As we continue to expand, we are seeking exceptional talent to join our team and contribute to the future of blockchain-integrated AI. If you are passionate about shaping the future of artificial intelligence, we invite you to explore our career opportunities and be part of this transformative journey.