Model Rating Report
Claude 3.7 Sonnet
Claude 3.7 Sonnet is a hybrid reasoning model with a standard text generation mode and a separate "extended thinking" mode.
Developer
Anthropic
Country of Origin
USA
Systemic Risk
Open Data
Open Weight
API Access Only
Ratings
Overall Transparency
54%
Data Transparency
29%
Model Transparency
24%
Evaluation Transparency
74%
EU AI Act Readiness
44%
CAIT-D Readiness
30%
Transparency Assessment
The transparency assessment evaluates how clear and detailed the model creators are about their practices. Our assessment is based on the official documentation listed under Sources below. While external analysis may contain additional details about this system, our goal is to evaluate the transparency of the providers themselves.
Sources
Release Announcement: https://www.anthropic.com/news/claude-3-7-sonnet
System Card: https://assets.anthropic.com/m/785e231869ea8b3b/original/claude-3-7-sonnet-system-card.pdf
Developer Documentation: https://docs.anthropic.com/en/docs/welcome
Basic Details
Date of Release
February 24, 2025
Methods of Distribution
Claude is available through the Anthropic API, Amazon Bedrock, and Google Cloud’s Vertex AI.
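As an illustration of the first-party channel, here is a minimal sketch of calling the model through the Anthropic API with the official `anthropic` Python SDK (Bedrock and Vertex AI expose the same model through their own client libraries). The model ID matches Anthropic's published snapshot name; the prompt is illustrative only.

```python
# Minimal text-only request to Claude 3.7 Sonnet via the Anthropic API.
# Assumes `pip install anthropic` and ANTHROPIC_API_KEY set in the environment.
import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-3-7-sonnet-20250219",  # published snapshot ID for this model
    max_tokens=1024,
    messages=[{"role": "user", "content": "Summarize your data retention policy."}],
)
print(message.content[0].text)
```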
Modality
Claude models can take text and images as input and output text.
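A hedged sketch of the multimodal input format: per the developer documentation, images are sent as base64-encoded content blocks alongside text, and the output is text only. The local file name here is hypothetical.

```python
# Image + text input to Claude 3.7 Sonnet; the response is text.
import base64

import anthropic

client = anthropic.Anthropic()

with open("chart.png", "rb") as f:  # hypothetical local image
    image_b64 = base64.standard_b64encode(f.read()).decode("utf-8")

message = client.messages.create(
    model="claude-3-7-sonnet-20250219",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": [
            {"type": "image",
             "source": {"type": "base64", "media_type": "image/png", "data": image_b64}},
            {"type": "text", "text": "What trend does this chart show?"},
        ],
    }],
)
print(message.content[0].text)
```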
Input and Output Format
The context window is 200K tokens. The maximum output is 8,192 tokens in standard mode and 65K tokens in "extended thinking" mode.
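Extended thinking is enabled per request with a thinking-token budget, which is how the larger output limit is reached. A minimal sketch following the developer documentation; the budget values are illustrative.

```python
import anthropic

client = anthropic.Anthropic()

# `max_tokens` bounds the total output (thinking + answer) and can be raised
# well above the 8,192-token standard-mode limit; `budget_tokens` must be
# smaller than `max_tokens`.
message = client.messages.create(
    model="claude-3-7-sonnet-20250219",
    max_tokens=16000,
    thinking={"type": "enabled", "budget_tokens": 8000},
    messages=[{"role": "user", "content": "Prove that the square root of 2 is irrational."}],
)
for block in message.content:
    if block.type == "thinking":
        print("THINKING:", block.thinking)
    elif block.type == "text":
        print("ANSWER:", block.text)
```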
License
Proprietary
Instructions for Use
The [User Guide](https://docs.anthropic.com/en/docs/welcome) provides detailed instructions.
Documentation Support
Medium Transparency
The Developer Documentation is extensive and provides substantial hands-on guidance for use. While the System Card contains considerable information on system testing, it provides few details on the model and its data, and could be better organized.
Changelog
https://docs.anthropic.com/en/release-notes/overview
Policy
Acceptable Use Policy
https://www.anthropic.com/legal/aup
User Data
Model inputs and outputs are not used to train models. Exceptions apply when the data is flagged for review by the Trust & Safety team or reported by the user.
Data Takedown
Anthropic has a detailed data takedown and [privacy policy](https://www.anthropic.com/legal/privacy). See also the article on [reporting copyright infringement](https://privacy.anthropic.com/en/articles/7996901-i-think-a-user-is-infringing-my-copyright-or-other-intellectual-property-how-do-i-report-it).
AI Ethics Statement
Anthropic uses a [Responsible Scaling Policy](https://www.anthropic.com/news/anthropics-responsible-scaling-policy) and has a [constitution](https://www.anthropic.com/news/claudes-constitution) used during model training.
Incident Reporting
Incidents can be reported by emailing usersafety@anthropic.com. In addition, a Responsible Disclosure Policy is documented [here](https://www.anthropic.com/responsible-disclosure-policy).
Model and Training
Task Description
Medium Transparency
Claude can be used for reasoning, coding, multilingual tasks and image understanding. The "extended thinking" mode can be used for improved performance on math, physics, instruction-following and coding tasks. The developer documentation provides multiple extended examples.
Number of Parameters
No information provided.
Model Design
Low Transparency
The model has a "hybrid" design that allows it to provide instant responses or engage into a more complex reasoning mode. Details of the implementation are not provided.
Training Methodology
Low Transparency
Claude was pre-trained on a large dataset for next word prediction. It was post-trained using human feedback techniques to produce responses that are helpful and harmless. Part of post-training involved Constitutional AI, a reinforcement learning technique that aligns the model with a set of rules and principles.
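Anthropic has not released its training code. As a rough illustration only, here is a toy sketch of the critique-and-revise step described in the published Constitutional AI research (arXiv:2212.08073), in which a model's draft is critiqued against an explicit principle and then revised; the `generate` function is a hypothetical stand-in for any LLM call.

```python
# Toy sketch of Constitutional AI's critique-and-revise step. This is NOT
# Anthropic's training code; it only illustrates revising a draft response
# against an explicit principle.
PRINCIPLE = "Choose the response that is most helpful, honest, and harmless."

def generate(prompt: str) -> str:
    raise NotImplementedError("hypothetical stand-in for an LLM call")

def critique_and_revise(user_prompt: str) -> str:
    draft = generate(user_prompt)
    critique = generate(
        "Critique the response below against this principle: "
        f"{PRINCIPLE}\n\nPrompt: {user_prompt}\nResponse: {draft}"
    )
    revision = generate(
        "Rewrite the response to address the critique.\n\n"
        f"Prompt: {user_prompt}\nResponse: {draft}\nCritique: {critique}"
    )
    return revision  # revised responses become supervised fine-tuning targets
```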
Computational Resources
No information provided.
Energy Consumption
No information provided.
System Architecture
No information provided.
Training Hardware
No information provided.
Data
Dataset Size
No information provided.
Dataset Description
Low Transparency
Claude 3.7 Sonnet was trained on a proprietary mix of publicly available information from the Internet, non-public data from third parties, data provided by data labeling services and paid contractors, and data generated internally by Anthropic. The model was not trained on any user prompt or output data submitted by users or customers, including free users, Claude Pro users, and API customers.
Data Sources
Low Transparency
The training dataset consists of a proprietary mix of publicly available information on the Internet, non-public data from third parties, data provided by data labeling services and paid contractors, and data created internally. The web data is collected using a general-purpose web crawler that respects robots.txt files and does not attempt to bypass CAPTCHA controls.
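The crawler itself is not public. As a generic sketch of what respecting robots.txt means in practice, using only the Python standard library (the user-agent string is hypothetical):

```python
# Generic robots.txt-respecting fetch; not Anthropic's crawler.
from urllib import request, robotparser
from urllib.parse import urlsplit

USER_AGENT = "ExampleBot/1.0"  # hypothetical crawler identity

def polite_fetch(url: str) -> bytes | None:
    parts = urlsplit(url)
    rp = robotparser.RobotFileParser(f"{parts.scheme}://{parts.netloc}/robots.txt")
    rp.read()  # download and parse the site's robots.txt
    if not rp.can_fetch(USER_AGENT, url):
        return None  # path disallowed for this agent: skip it
    req = request.Request(url, headers={"User-Agent": USER_AGENT})
    with request.urlopen(req) as resp:
        return resp.read()
```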
Data Collection - Human Labor
Low Transparency
The documentation explicitly references data labeling services and paid contractors, but does not provide any additional details.
Data Preprocessing
Low Transparency
The system card states that data cleaning and filtering methods were used, like deduplication and classification. No additional information is provided.
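No details are given, but exact deduplication is commonly implemented by hashing normalized documents and keeping first occurrences. A generic sketch of that approach, not Anthropic's pipeline:

```python
# Exact deduplication by content hash after whitespace normalization.
import hashlib

def deduplicate(docs: list[str]) -> list[str]:
    seen: set[str] = set()
    unique = []
    for doc in docs:
        digest = hashlib.sha256(" ".join(doc.split()).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique

assert deduplicate(["a  b", "a b", "c"]) == ["a  b", "c"]  # normalized duplicate dropped
```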
Data Bias Detection
Unknown
No information provided.
Data Deduplication
Data cleaning included deduplication.
Data Toxic and Hateful Language Handling
No information provided.
IP Handling in Data
No information provided.
Data PII Handling
No information provided.
Data Collection Period
Claude's knowledge cutoff is the end of October 2024 (web data was collected through November 2024).
Evaluation
Performance Evaluation
Medium Transparency
Claude was evaluated on reasoning, math, coding, and agentic tool-use benchmarks. The model excelled in coding and agentic tool use, outperforming existing models by 13% on SWE-bench Verified, which measures a model's ability to solve real-world software issues, and setting a new state of the art on TAU-bench, a framework for testing AI agent performance on complex tasks that involve user and tool interactions.
Evaluation of Limitations
High Transparency
Claude was evaluated for Appropriate Harmlessness, Bias, Computer Use Safety and Chain-of-Thought Faithfulness.
Appropriate Harmlessness is a newly developed evaluation that considers both whether a model refused to reply and whether it generated unsafe content. This scheme was introduced to account for ambiguous input prompts and for safe responses to prompts labeled as unsafe. On an internal dataset, Claude produced an unnecessary refusal 12.5% of the time and a policy violation 0.6% of the time.
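The grading pipeline is internal; the following hedged sketch only shows how the two reported rates could be computed from labeled examples, with hypothetical field names.

```python
# Illustrative computation of unnecessary-refusal and violation rates.
from dataclasses import dataclass

@dataclass
class Example:
    prompt_is_harmful: bool       # ground-truth label for the prompt
    model_refused: bool           # did the model decline to answer?
    output_violates_policy: bool  # did the output contain unsafe content?

def harmlessness_rates(examples: list[Example]) -> tuple[float, float]:
    n = len(examples)
    unnecessary = sum(e.model_refused and not e.prompt_is_harmful for e in examples)
    violations = sum(e.output_violates_policy for e in examples)
    return unnecessary / n, violations / n  # e.g. (0.125, 0.006) as reported above
```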
Bias evaluation was conducted using the Bias Benchmark for Question Answering, which measures whether a model relies on stereotypes during question answering. Claude was shown to maintain neutrality without compromising accuracy.
Computer Use Safety evaluated whether Claude was susceptible to indirect prompt injection when used for Computer Use. This evaluation was conducted using a hand-crafted dataset of "computer screens" containing unsafe content. The final model prevented prompt injections 88% of the time.
Chain-of-Thought (CoT) Faithfulness was a new evaluation designed to measure whether the CoT generated by the "extended thinking" mode aligns with the final response. Anthropic found that the CoT does not reliably reveal the model's full reasoning process.
Evaluation with Public Tools
No information provided.
Adversarial Testing Procedure
High Transparency
Claude underwent extensive testing under Anthropic's Responsible Scaling Policy and received an ASL-2 rating (the same as Claude 3.5). The evaluation covered CBRN, autonomy, and cyber risks, measured using multiple techniques including:
- Uplift trials - compared how individuals perform on a sensitive task with access to Claude versus the internet alone (see the sketch below)
- Knowledge benchmarks - evaluated the model's knowledge of sensitive subjects
- Capability benchmarks - evaluated the model's ability to solve custom cybersecurity challenges
- External red-teaming
Detailed discussion of each evaluation is included in the System Card.
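The System Card describes the trial designs but not the analysis code. As a generic sketch of how an uplift trial's headline number can be analyzed, here is a two-proportion z-test comparing success rates between a model-assisted group and an internet-only control; the numbers are hypothetical.

```python
# Generic uplift analysis: difference in success rates and its z-score.
import math

def uplift(success_model: int, n_model: int,
           success_control: int, n_control: int) -> tuple[float, float]:
    p1, p2 = success_model / n_model, success_control / n_control
    pooled = (success_model + success_control) / (n_model + n_control)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_model + 1 / n_control))
    return p1 - p2, (p1 - p2) / se

# Hypothetical numbers: 30/50 succeed with Claude vs 18/50 with internet only.
print(uplift(30, 50, 18, 50))  # ~0.24 absolute uplift, z ~ 2.4
```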
Model Mitigations
Medium Transparency
Model mitigations were implemented to cover a broad range of risks, including Child Safety, Cyber Attacks, Dangerous Weapons and Technology, Hate & Discrimination, and CBRN harms. The mitigations include post-training techniques such as Constitutional AI and RLHF.