Model Rating Report
Claude 4 Family
Claude Opus 4 is a state-of-the-art text and code generation model with sustained performance on complex, long-running tasks and agent workflows. Claude 4 models are hybrid reasoning models offering two modes: near-instant responses and extended thinking for deeper reasoning.
Developer
Anthropic
Country of Origin
USA
Systemic Risk
Open Data
Open Weight
API Access Only
Ratings
Overall Transparency
58%
Data Transparency
37%
Model Transparency
30%
Evaluation Transparency
74%
EU AI Act Readiness
50%
CAIT-D Readiness
36%
Transparency Assessment
The transparency assessment evaluates how clear and detailed the model creators are about their practices. Our assessment is based on the official documentation listed in the Sources section. While external analysis may contain additional details about this system, our goal is to evaluate the transparency of the providers themselves.
Sources
System Card: https://www-cdn.anthropic.com/6be99a52cb68eb70eb9572b4cafad13df32ed995.pdf
Announcements:
https://www.anthropic.com/claude/opus
Basic Details
Date of Release
Claude Opus 4 and Claude Sonnet 4 were announced and released on May 22, 2025, becoming available to users the same day.
Methods of Distribution
Claude Opus 4 and Claude Sonnet 4 are both available through multiple distribution channels, including a web browser interface, mobile apps (iOS and Android), and APIs: Anthropic's own API, Amazon Bedrock, and Google Cloud's Vertex AI. Opus 4 is available to Pro, Max, Team, and Enterprise users, while Sonnet 4 is also available to free users.
Modality
Inputs can include text and images; outputs are text only. Both models feature hybrid reasoning with standard and extended thinking modes.
Input and Output Format
Both models support a 200K-token context window for input. Sonnet 4 generates up to 64K output tokens, and Opus 4 up to 32K.
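As a rough illustration of how these limits surface in practice, the sketch below caps output at the documented Opus 4 ceiling when calling Anthropic's Messages API. The model identifier is an assumption based on Anthropic's naming convention; consult the API documentation for current values.

```python
# Hedged sketch: requesting the documented Opus 4 output cap.
# The model ID below is an assumption; check Anthropic's model list.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-opus-4-20250514",  # assumed identifier for Claude Opus 4
    max_tokens=32_000,               # Opus 4's documented output-token ceiling
    messages=[
        {"role": "user", "content": "Summarize this report in three sentences."}
    ],
)
print(response.content[0].text)
```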
License
Proprietary.
Instructions for Use
The formal documentation provides general guidance on use cases for each model, and the [API documentation](https://docs.anthropic.com/en/docs/build-with-claude/overview) provides both high-level and low-level instructions for use.
Documentation Support
Medium Transparency
The documentation extensively covers model capabilities, applications, and some technical specifications, and explains how to understand and properly use the models, but lacks detail about data and model design.
Changelog
The developers provide a changelog for the apps, the API, and versioned system prompts [here](https://docs.anthropic.com/en/release-notes/overview).
Policy
Acceptable Use Policy
https://www.anthropic.com/legal/aup
User Data
The materials mention that Anthropic trains on data from Claude users who have opted in, indicating that user data may be collected and used for model training with user consent.
Data Takedown
Anthropic has a detailed data takedown process and [privacy policy](https://www.anthropic.com/legal/privacy); see also the article on reporting [copyright infringement](https://privacy.anthropic.com/en/articles/7996901-i-think-a-user-is-infringing-my-copyright-or-other-intellectual-property-how-do-i-report-it).
AI Ethics Statement
Anthropic uses a [Responsible Scaling Policy](https://www.anthropic.com/news/anthropics-responsible-scaling-policy) and has a [constitution](https://www.anthropic.com/news/claudes-constitution) used during model training.
Incident Reporting
Incidents can be reported by emailing usersafety@anthropic.com. In addition, a Responsible Disclosure Policy is documented [here](https://www.anthropic.com/responsible-disclosure-policy).
Model and Training
Task Description
High Transparency
The documents detail numerous tasks that Claude 4 models excel at, including coding (both models leading on SWE-bench), agentic search, AI agent applications, content creation, customer-facing AI assistants, visual data extraction, robotic process automation, and knowledge Q&A. Opus 4 is particularly noted for its ability to handle complex, long-running tasks.
In terms of limitations, the models can hallucinate, reinforce disparate treatment (e.g., produce responses that favor certain populations), and be susceptible to prompt injections and jailbreaks (at rates lower than Claude Sonnet 3.7).
Number of Parameters
No information is provided in any of the available documentation.
Model Design
Low Transparency
Claude 4 models are described as "hybrid reasoning models" that offer two modes: near-instant responses and extended thinking for deeper reasoning. They feature extended thinking with tool use, allowing them to alternate between reasoning and tool use to improve responses. In addition, the system shows summaries of long thought processes, generated by an additional smaller model, instead of the whole trace (developers can opt out of this behavior).
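As a hedged sketch of what the two modes look like at the API level, the snippet below enables extended thinking through the Messages API's `thinking` parameter; omitting it yields the near-instant mode. The model identifier and token budgets are illustrative assumptions.

```python
# Hedged sketch: enabling extended thinking on a Claude 4 model.
# Parameter names follow Anthropic's extended-thinking documentation;
# the model ID and budgets here are illustrative assumptions.
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-20250514",  # assumed identifier for Claude Sonnet 4
    max_tokens=16_000,                 # must exceed the thinking budget
    thinking={"type": "enabled", "budget_tokens": 8_000},
    messages=[{"role": "user", "content": "Outline a test plan for a payments API."}],
)

# Responses interleave (summarized) thinking blocks with ordinary text blocks.
for block in response.content:
    if block.type == "thinking":
        print("[thinking]", block.thinking)
    elif block.type == "text":
        print(block.text)
```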
Training Methodology
Low Transparency
Claude Opus 4 and Claude Sonnet 4 were trained with a focus on being helpful, honest, and harmless. They were pre-trained on large, diverse datasets to acquire language capabilities and used human feedback, Constitutional AI (based on principles such as the UN's Universal Declaration of Human Rights), and training of selected character traits.
Computational Resources
The materials provided do not disclose the computational resources used to train Claude 4 models.
Energy Consumption
No information is provided on the carbon footprint or specific mitigations for energy consumption beyond general claims of "model efficiency".
System Architecture
Claude 4 models are hybrid reasoning models with an "extended thinking mode", but no specific architecture details are provided.
Training Hardware
The materials provided do not specify the training hardware used for Claude 4 models.
Data
Dataset Size
The materials provided do not disclose the size of the datasets used to train Claude 4 models.
Dataset Description
Low Transparency
Claude Opus 4 and Claude Sonnet 4 were trained on a proprietary mix of publicly available information on the Internet, non-public data from third parties, data provided by data-labeling services and paid contractors, data from Claude users who opted in to have their data used for training, and internally generated data at Anthropic. No quantitative analysis of the dataset's characteristics is available.
Data Sources
Medium Transparency
Training data sources include publicly available information on the Internet, non-public data from third parties, data from data-labeling services and paid contractors, data from Claude users who have opted in to have their data used for training, and data generated internally at Anthropic. For web data, Anthropic's crawler followed industry-standard practices with respect to "robots.txt" instructions and did not access password-protected pages or those requiring sign-in or CAPTCHA verification.
Data Collection - Human Labor
Medium Transparency
Anthropic partners with data work platforms to engage workers who help improve their models through preference selection, safety evaluation, and adversarial testing. They state they only work with platforms that align with their belief in providing fair and ethical compensation to workers and are committed to safe workplace practices regardless of location. Anthropic has published the [Inbound Services Agreement](https://www.anthropic.com/legal/inbound-services-agreement) that crowd workers agree to.
Data Preprocessing
Low Transparency
Anthropic employed several data cleaning and filtering methods during the training process, including deduplication and classification.
Data Bias Detection
Unknown
The materials provided do not specifically address how Anthropic detected or addressed biases in their training data, beyond broad data collection and processing for alignment.
Data Deduplication
The materials note that deduplication was among the data cleaning methods employed (see Data Preprocessing above) but give no further detail.
Data Toxic and Hateful Language Handling
The materials provided do not specifically address how toxic and hateful language was handled in the training data.
IP Handling in Data
No information is provided about the handling of intellectual property in the training data.
Data PII Handling
The materials provided do not specifically address how personally identifiable information (PII) was handled in the training data. They do note that usage data from Claude users who opted in has been incorporated into model training in some way, but do not discuss how or whether these data are fully de-identified.
Data Collection Period
The materials mention that the models were trained on publicly available information on the Internet as of March 2025, indicating that this was the cutoff date for the training data, though no start date is provided.
Evaluation
Performance Evaluation
Medium Transparency
The materials provide extensive benchmark results showing Claude 4 models' performance on coding (SWE-bench, Terminal-bench), reasoning (GPQA Diamond), multilingual Q&A (MMMLU), visual reasoning (MMMU), and high school competition math (AIME 2020). Both models show significant improvements over previous versions and competitive performance against other leading models.
Evaluation of Limitations
High Transparency
The System Card includes an extensive section on bias evaluations that assess the models' treatment of political topics and potential discriminatory bias, among other topics. Claude Opus 4 and Claude Sonnet 4 demonstrated bias levels similar to or less than Claude Sonnet 3.7.
Evaluation with Public Tools
The reported evaluations rely in part on publicly available benchmarks, including SWE-bench, Terminal-bench, GPQA Diamond, MMMLU, MMMU, AIME, and StrongREJECT (see Performance Evaluation and Adversarial Testing Procedure).
Adversarial Testing Procedure
High Transparency
The System Card details single-turn violative request evaluations, ambiguous context evaluations, multi-turn testing, and jailbreak resistance testing using the StrongREJECT benchmark. The alignment assessment section also describes various adversarial testing procedures, including alignment faking assessment and reward hacking evaluations.
Model Mitigations
Medium Transparency
The System Card describes iterative model evaluations throughout training to understand how catastrophic risk-related capabilities evolved over time. Multiple model snapshots were tested and appropriate safeguards implemented, with Claude Opus 4 deployed under ASL-3 safeguards and Claude Sonnet 4 under ASL-2. The actual mitigation techniques are described in broad terms; they included post-training using Constitutional AI and "training for specific characteristics".