Trustible AI

Model Rating Report

GPT-4.5

GPT-4.5 is a text generation model, scaled from GPT-4o.

Developer

OpenAI

Country of Origin

USA

Systemic Risk

Open Data

Open Weight

API Access Only

Ratings

Overall Transparency

40%

Data Transparency

Model Transparency

24%

Evaluation Transparency

51%

EU AI Act Readiness

39%

CAIT-D Readiness

23%

Transparency Assessment

The transparency assessment evaluates how clear and detailed the model creators are about their practices. Our assessment is based on the official documentation lists in Sources above. While external analysis may contain additional details about this system, our goal is to evaluate transparency of the providers themselves.

Sources

- https://openai.com/index/introducing-gpt-4-5/
- https://help.openai.com/en/articles/10658365-gpt-4-5-in-chatgpt
- https://cdn.openai.com/gpt-4-5-system-card-2272025.pdf
- https://www.youtube.com/watch?v=LQEhOObUhQg

Basic Details

Date of Release

February 27, 2025

Methods of Distribution

The model is currently only available as a research preview through an OpenAI API account (Chat Completions API, Assistants API, and Batch API) or as a chatbot with a ChatGPT Pro Users account on web, mobile and desktop.

Modality

Only text+image inputs and text output is supported at this time.

Input and Output Format

The model accepts text and image inputs and outputs text responses, but doesn't specify technical details like maximum context window length, token limits, or specific formatting requirements.

License

Proprietary

Instructions for Use

The system card lacks comprehensive usage instructions. It doesn't provide specific examples, recommendations, hardware/software dependencies, or detailed interaction guidelines. It mentions that GPT-4.5 follows an "Instruction Hierarchy" that prioritizes system messages over user messages, but doesn't offer practical guidance for effective model usage.

Documentation Support

Medium Transparency

The OpenAI website has easily-accessible help articles that guide hands-on use. The system card covers performance and safety evaluations in detail.

Changelog

You can find the changelog [here](https://platform.openai.com/docs/changelog); however, it may not contain all the details related to minor changes.

Policy

Acceptable Use Policy

The OpenAI Usage policies can be found [here](https://openai.com/policies/usage-policies/).

User Data

User Data is used to train GPT models, unless users explicitly opt-out. (https://help.openai.com/en/articles/5722486-how-your-data-is-used-to-improve-model-performance)

Data Takedown

You can find how to opt out of model training and remove your data here: https://help.openai.com/en/articles/7730893-data-controls-faq. Copyright-related disputes can be submitted here: https://openai.com/form/copyright-disputes/.

AI Ethics Statement

OpenAI describe their principles in their [OpenAI Charter](https://openai.com/charter/).

Incident Reporting

OpenAI has a reporting feature that you can use to give feedback and report incidents. You can find more information [here](https://chatgpt.com/g/g-Jjm1uZYHz-incident-reporting).

Model and Training

Task Description

Medium Transparency

GPT-4.5 is a general-purpose text generation model that can be used for creative and nuanced tasks like writing and solving practical problems. It has an extensive knowledge base, improved ability to follow user intent, and greater “EQ” that previous generations. However, the model lacks chain-of-thought reasoning abilities and may be slower due to its size. It, also, performs significantly worse than reasoning models (i.e. o1) on complex coding tasks or tasks require detailed logic or multi-step reasoning.

Number of Parameters

Model Design

Low Transparency

The exact architecture details like layer counts or attention mechanisms are not available. This model prioritizes scaling over chain of thought reasoning. This results in a pre-trained model that claims "broader knowledge and deeper a understanding of the world, leading to reduced hallucinations and more reliability". Almost no details are available at this time.

Training Methodology

Low Transparency

This model uses the typical approach for state of the art large foundation models: pre-training from large-scale unsupervised learning on a large corpora and post-training for fine tuning with supervision (SFT), and reinforcement learning (RLHF). Benchmark performance metrics are shared, but no detailed information about the training methodology is provided.

Computational Resources

They claim that this model improves on GPT-4's computational efficiency by "more than 10x" without any further details.

Energy Consumption

No specific information about carbon emissions or total energy consumption associated with training the model

System Architecture

Training Hardware

Data

Dataset Size

No information is provided about the total size of the training dataset.

Dataset Description

Low Transparency

The system card says the model was trained using publicly available data, proprietary data from data partnerships and custom datasets developed at OpenAI. No details about dataset composition or origin are provided.

Data Sources

Unknown

Data Collection - Human Labor

Unknown

No information is provided.

Data Preprocessing

Low Transparency

The data processing pipeline claims rigorous filtering to maintain data quality and mitigate potential risks as well as advanced data filtering processes to reduce processing of personal information. Beyond these considerations, no details are available.

Data Bias Detection

Unknown

Data Deduplication

Data Toxic and Hateful Language Handling

A moderation API and safety classifiers are used to prevent the use of harmful or sensitive content, including explicit content involving minors, but doesn't detail specific handling of toxic or hateful language in the training data. No details are provided beyond mention that this was addressed.

IP Handling in Data

Data PII Handling

Data Collection Period

Evaluation

Performance Evaluation

Medium Transparency

GPT-4.5 was evaluated on knowledge-related and coding benchmarks. It showed improved performance over GPT-4o and o1 on several benchmarks, like MMMLU (multilingual) and Simple-QA; however, it is significantly worse than o1 at coding (SWE-Bench Verified) and math (AIME ‘24 (math)) assessments.
On human preference testing, GPT-4.5 outperformed GPT-4o by 7 to 13% depending on task type. Additional qualitative examples suggest that GPT-4.5 may have a better understanding of subtle cues or implicit expectations in prompts, potentially, making it a more useful tool for human collaboration.

Evaluation of Limitations

Medium Transparency

GPT-4.5 was evaluated for hallucinations, biases and safety.

Bias: GPT-4.5 showed low bias on BBQ, a benchmark that assess whether known social biases override the ability for the model to produce the correct answer.

Hallucinations: Metrics for the SimpleQA, PersonQA, and some domain-specific benchmarks which indicate this model is a substantial improvement over previous OpenAI models, though the system card does acknowledge that hallucinations persist and remain a significant limitation. The remainder of the discussion of limitations is primarily reporting benchmark performance, noting there are some improvements, but adding no further discussion about model limitations.

Safety was evaluated by measuring correct refusals to unsafe requests (using existing benchmarks like WIldChat and custom multi-modal benchmarks) and robustness against jailbreaks (using a custom dataset of human-sourced jailbreak prompts).

Evaluation with Public Tools

Adversarial Testing Procedure

Medium Transparency

Multiple adversarial tests were performed, including jailbreak evaluations against human-sourced jailbreaks, red teaming, third party assessment from Apollo Research, and generic stress-testing according to OpenAI's Preparedness Framework. The Safety Advisory Group classified GPT-4.5 as overall medium risk, including medium risk for CBRN and persuasion and low for cybersecurity and model autonomy.

Model Mitigations

Low Transparency

Some safety mitigations are discussed in general terms. The mitigations include SFT and RHLF during post-training to align with human preferences, safety training for political persuasion tasks, monitoring and detection systems, and enhanced content moderation. In addition, the model was taught to adhere to an [Instruction Hierarchy](https://arxiv.org/abs/2404.13208) that explicitly defined how the model should behave given conflicting instructions. These are claims made in the system card with no further information for public assessment.