Trustible AI

Model Rating Report

GPT-o3 and o4

GPT-o3 and -GPT-o4 mini are designed to reason for longer before responding. The models are able to agentically interact with all tools currently available to ChatGPT. This includes searching the internet, analysing uploaded files and visual inputs, and generating images.

Developer

OpenAI

Country of Origin

USA

Systemic Risk

Open Data

Open Weight

API Access Only

Ratings

Overall Transparency

47%

Data Transparency

22%

Model Transparency

18%

Evaluation Transparency

59%

EU AI Act Readiness

46%

CAIT-D Readiness

40%

Transparency Assessment

The transparency assessment evaluates how clear and detailed the model creators are about their practices. Our assessment is based on the official documentation lists in Sources above. While external analysis may contain additional details about this system, our goal is to evaluate transparency of the providers themselves.

Sources

Press Release: https://openai.com/index/o3-o4-mini-system-card/
System Card: https://cdn.openai.com/pdf/2221c875-02dc-4789-800b-e7758f3722c1/o3-and-o4-mini-system-card.pdf
Introduction: https://openai.com/index/introducing-o3-and-o4-mini/

Basic Details

Date of Release

16 April 2025

Methods of Distribution

The models can be accessed through the ChatGPT API

Modality

The models can take text, files, code, and images as input. They are able to access tools within the ChatGPT catalogue (web browsing, Python, image analysis and generation etc) and produce text.

Input and Output Format

The context window for this model is 200,000 tokens and it can output a maximum of 100,000 tokens.

License

Proprietary

Instructions for Use

Instructions for using this model can be found both on the OpenAI website and on the ChatGPT page.

Documentation Support

Low Transparency

The documentation is clear and easy to read. Most key information is available but too general. Some information is missing entirely, such as details on the system architecture or training processes.

Changelog

You can find the changelog [here](https://platform.openai.com/docs/changelog); however, it may not contain all the details related to minor changes.

Policy

Acceptable Use Policy

Usage Policies are available on Open AI's website.

User Data

User data is used to train ChatGPT models. This includes log data (e.g. your IP address), usage data, device information, location information, and cookies.

Data Takedown

You can find how to opt out of model training and remove your data [here](https://help.openai.com/en/articles/7730893-data-controls-faq)

AI Ethics Statement

OpenAI describe their principles in their [OpenAI Charter](https://openai.com/charter/).

Incident Reporting

ChatGPT has a reporting feature that you can use to give feedback and report incidents. You can find more information [here](https://chatgpt.com/g/g-Jjm1uZYHz-incident-reporting).

Model and Training

Task Description

Medium Transparency

The models are capable of responding to text, image, code, and file input. They have access to the full range of ChatGPT tools, including searching the internet, image analysis and generation, and Python.

Number of Parameters

Model Design

Unknown

Not explicitly stated in the provided documents.

Training Methodology

Low Transparency

OpenAI reasoning models are trained using reinforcement learning on chains of thought to encourage reasoning.

Computational Resources

Energy Consumption

System Architecture

Training Hardware

Data

Dataset Size

Dataset Description

Low Transparency

The two models were trained on diverse datasets. This included information that is available publicly online, information from third parties, and information from users, human trainers and researchers. Data is pre-processed to maintain quality and mitigate potential risks.

Data Sources

Low Transparency

The data used to train these models was sourced from third parties, publicly available information on the internet, and from ChatGPT users, researchers, and human trainers.

Data Collection - Human Labor

Low Transparency

Human labour is used in the production of this data. This includes data produced and generated by researchers and human trainers.

Data Preprocessing

Low Transparency

Data was filtered to maintain quality and mitigate a series of identified risks. Personal information was reduced from the training data. Moderation API and safety classifiers were also used to help prevent the use of harmful or sensitive content, including explicit material.

Data Bias Detection

Unknown

Data Deduplication

Data Toxic and Hateful Language Handling

IP Handling in Data

Data PII Handling

Personal information is removed from the dataset through pre-training filtering.

Data Collection Period

Evaluation

Performance Evaluation

Medium Transparency

The models are tested against a variety of safety and performance evaluations. These include in-house evaluations, third-party evaluations by groups including Apollo Research and METR, PersonQA, PaperBench, SWE-Lancer, and MMLU. The results of each evaluation are listed in the system card and compared to other OpenAI models. Many of these benchmarks are reported with clear explanations for how and why the evaluation was conducted, but this is not the case for all of the evaluations.

Evaluation of Limitations

Medium Transparency

The models can hallucinate, be jailbroken (i.e. prompted to produce inappropriate content) and produce incorrect refusals. Hallucinations are a significant concern with the models hallucinating 33% and 48% percent of the time on PersonQA (a benchmark that asks questions about public figures). Detailed results related to these limitations are reported in the System Card.

Evaluation with Public Tools

Adversarial Testing Procedure

Medium Transparency

The model is tested extensively for safety risks. This includes through jailbreak testing to evaluate the robustness of the model using adversarial prompts. These jailbreaks are either human sourced or they are from the StrongReject database. Other safety risks are evaluated, including harmful image generation, production of disallowed content, hallucinations, and bias, among others. The measures taken to prevent each risk, the evaluations used to test them, and the results of each evaluation are included in the system card.

Model Mitigations

Medium Transparency

The model mitigations included post-training to teach the model about refusal behavior for harmful requests and using moderation models for the most egregious content. The final models are tested for a variety of safety risks including fairness and bias, personal identification, and deception by both OpenAI and third parties.