Model Rating Report

Last Updated: April 21, 2025

Overview

GPT-4o

GPT-4o is a multilingual, multimodal transformer model for processing text, images, audio, and code.

Developer: OpenAI
Country of Origin: USA
Systemic Risk: Yes
Open Data: No
Open Weight: No
API Access Only: Yes

Ratings

Overall Transparency: 55%
Data Transparency: 41%
Model Transparency: 18%
Evaluation Transparency: 70%
EU AI Act Readiness: 51%
CAIT-D Readiness: 56%

Transparency Assessment

The transparency assessment evaluates how clear and detailed the model creators are about their practices. Our assessment is based on the official documentation listed in Sources above. While external analysis may contain additional details about this system, our goal is to evaluate the transparency of the providers themselves.

Basic Details

GPT-4o was released on May 13th, 2024. A new native image generation approach was added on March 25th, 2025.

GPT-4o and GPT-4o mini are available via the Chat Completions API, Assistants API, and Batch API so long as you have an OpenAI API account. You can also access GPT-4o via the OpenAI Playground.
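As a minimal sketch of such an API call (assuming the official openai Python SDK and an OPENAI_API_KEY environment variable; the prompt is illustrative, not taken from OpenAI's documentation):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# A simple text-only request to GPT-4o via the Chat Completions API.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarise the EU AI Act in two sentences."},
    ],
)
print(response.choices[0].message.content)
```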

GPT-4o is a multimodal model that can take in and output any combination of text, audio, and image. It can also take video as an input, but it is currently unable to produce video output.

The input context window for GPT-4o is 128K tokens, with an output limit of 2,048 tokens. The input and output formats can be found on the model release page.
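To illustrate the multimodal input format, a hedged sketch of an image-plus-text request follows; the image URL is a placeholder, and max_tokens simply caps the response length below the output limit:

```python
from openai import OpenAI

client = OpenAI()

# Image inputs are passed as content parts alongside text.
response = client.chat.completions.create(
    model="gpt-4o",
    max_tokens=1024,  # cap the response below the documented output limit
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image in one sentence."},
                {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
            ],
        }
    ],
)
print(response.choices[0].message.content)
```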

You can find the terms of use for your country here, as the model is proprietary.

You can find instructions on how to use any of the ChatGPT models here.

The documentation is easy to access and navigate. It covers the capabilities, evaluations, and mitigations in considerable detail. Some information about the training data is available. However, there is no information on the architecture or the hardware used to train the model.

You can find the changelog for GPT-4o here.

Policy

The OpenAI Usage policies can be found here.

User data is used to train ChatGPT models. This includes log data (e.g. your IP address), usage data, device information, location information, and cookies.

You can find how to opt out of model training and remove your data here.

OpenAI describe their principles in the OpenAI Charter.

ChatGPT has a reporting feature that you can use to give feedback and report incidents. You can find more information here.

Model and Training

GPT-4o is capable of producing a combination of text, audio, and images in response to multimodal inputs (text, audio, image, and video). OpenAI provide a series of capability "explorations" here. These are samples of the model's responses to a series of prompt scenarios, including "Poster creation for the movie 'Detective'" and "Lecture summarisation". Some limitations are shown in a video highlighting issues with the model switching languages mid-sentence, but there is no detailed explanation beyond this, and no exhaustive list of limitations is identified.

Training involved a pre-training phase using partially filtered data and a post-training phase in which the model was aligned to human preferences, informed by red teaming.
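OpenAI do not publish the details of this post-training phase. Purely as an illustrative sketch, preference alignment is commonly built around a pairwise reward-model loss of the following form (the function and data are assumptions, not OpenAI's implementation):

```python
import numpy as np

def pairwise_preference_loss(chosen: np.ndarray, rejected: np.ndarray) -> float:
    """Bradley-Terry style loss: reward preferred responses over rejected ones.

    chosen / rejected hold reward-model scores for the preferred and
    dispreferred response in each human-labelled comparison pair.
    """
    margins = chosen - rejected
    # Equivalent to -log(sigmoid(margin)), averaged over pairs.
    return float(np.mean(np.log1p(np.exp(-margins))))

# Three illustrative comparison pairs: a lower loss means the reward model
# more consistently scores the human-preferred response higher.
print(pairwise_preference_loss(np.array([2.1, 0.3, 1.7]), np.array([1.4, -0.2, 1.9])))
```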

Data

Key dataset components include data from public web pages, code and math data, and multimodal data. There is no publicly available information on exactly what data was used to train the model.

The pre-training data for GPT-4o included a curated set of publicly available data, mostly collected from machine learning datasets and web crawls, and proprietary data from data partnerships. These partnerships allow OpenAI to access non-publicly available data, such as data usually behind a paywall, archives, and metadata.

The data used to train the model was pre-filtered to remove any unwanted and harmful information. The use of the Moderation API and safety classifiers means they can filter out data that may contribute to harmful content or information hazards. This includes CSAM, hateful content, violence, and CBRN. Image datasets are filtered for explicit content. Personal information is also removed from the training data using advanced data filtering processes.
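OpenAI do not publish the exact pipeline, but as a hedged sketch, a document-level filter built on the public Moderation API could look like this (the corpus and the keep/drop logic are illustrative assumptions):

```python
from openai import OpenAI

client = OpenAI()

documents = [
    "A recipe for vegetable soup.",
    "Some text that might contain harmful content.",
]

# Keep only documents that the moderation endpoint does not flag.
filtered = [
    doc for doc in documents
    if not client.moderations.create(input=doc).results[0].flagged
]
print(f"Kept {len(filtered)} of {len(documents)} documents")
```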

Data that could contribute to harmful content or information hazards is filtered out. The documentation lists "hateful content" as a category of data that would be removed from the set before training, but does not give a clear description of what this covers.

IP data is stored and used to train ChatGPT models; it is not removed from the dataset.

Personal information is reduced in the training data using advanced data filtering processes.
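The "advanced data filtering processes" are not described further. Purely for illustration, a naive pattern-based pass to reduce obvious personal information might look like the sketch below; real pipelines would rely on trained classifiers rather than regular expressions:

```python
import re

# Naive illustrative patterns; not OpenAI's actual filtering process.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def scrub_pii(text: str) -> str:
    """Replace obvious emails and phone numbers with placeholder tokens."""
    text = EMAIL.sub("[EMAIL]", text)
    text = PHONE.sub("[PHONE]", text)
    return text

print(scrub_pii("Contact jane.doe@example.com or +1 (555) 010-2345."))
# -> Contact [EMAIL] or [PHONE].
```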

The data cut-off is October 2023.

Evaluation

GPT-4o has been evaluated on a number of benchmarks relating to text evaluation, audio ASR performance, audio translation performance, multimodal performance (M3Exam Zero-Shot), and vision understanding. The results of these capability evaluations can be found here.

Red teaming was used to evaluate the model to determine any possible dangerous capabilities, assess risks, and stress test mitigations. Red teamers covered categories that spanned (amongst others) violative and disallowed content, misinformation, bias, ungrounded inferences, emotional perception and anthropomorphisation risks, and copyright. The data generated by this led to the creation of a series of quantitative evaluations.

Alongside the new evaluation methods, a range of pre-existing text-based evaluation datasets were converted for use with speech-to-speech models, which creates a heavy reliance on the text-to-speech model used for the conversion.

One type of evaluation dataset used to assess GPT-4o is capture-the-flag (CTF) challenges designed to measure cybersecurity risk. These evaluations covered web application exploitation, reverse engineering, remote exploitation, and cryptography. The model's biological risk and persuasiveness were also assessed.

Finally, third-party evaluations were also run by METR and Apollo Research to assess the risks posed by GPT-4o.

GPT-4o is evaluated using public tools including MMLU, HumanEval, and M3Exam. You can find the full list of performance evaluations here.
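As an illustrative sketch of how a multiple-choice benchmark such as MMLU is typically scored against the API (the item, prompt format, and answer parsing are assumptions, not OpenAI's evaluation harness):

```python
from openai import OpenAI

client = OpenAI()

# One illustrative MMLU-style item; real benchmarks iterate over thousands.
question = "Which planet is known as the Red Planet?"
choices = {"A": "Venus", "B": "Mars", "C": "Jupiter", "D": "Mercury"}
answer = "B"

prompt = question + "\n" + "\n".join(f"{k}. {v}" for k, v in choices.items())
prompt += "\nAnswer with the letter only."

response = client.chat.completions.create(
    model="gpt-4o",
    max_tokens=1,  # force a single-letter answer
    messages=[{"role": "user", "content": prompt}],
)
predicted = response.choices[0].message.content.strip().upper()
print("correct" if predicted == answer else "incorrect")
```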

Red teaming was used to test for potentially dangerous capabilities in GPT-4o. The red teamers were external to OpenAI and were given access to snapshots of the model at various stages of training. Red teaming occurred in four stages: the first three tested the model via an internal tool, and the fourth used the full iOS experience.

Red teamers were asked to assess novel risks and stress test mitigations covering a wide range of categories. This included disallowed content, bias, ungrounded inference, and fraudulent behaviour.

Automated red teaming was also used to test the native image generation capabilities covered by the addendum.
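The automated pipeline is not described in detail. As a hedged sketch, automated red teaming commonly pairs an attacker model with a safety classifier, along these lines (the model roles, the prompts, and the use of the Moderation API as the scorer are all assumptions):

```python
from openai import OpenAI

client = OpenAI()

# One attack-and-score round; real pipelines run many seeded variants in parallel.
attack_prompt = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": (
        "Write one adversarial prompt that tries to elicit a policy-violating "
        "image description from an image generation model."
    )}],
).choices[0].message.content

target_reply = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": attack_prompt}],
).choices[0].message.content

# Score the target's reply with the Moderation API as a stand-in safety classifier.
flagged = client.moderations.create(input=target_reply).results[0].flagged
print("unsafe output found" if flagged else "attack did not succeed")
```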

The creation of GPT-4o involved a multi-stage mitigation process including pre-processing of the training data, alignment with human values, multi-stage red teaming, and a series of both internal and external evaluations. Based on these evaluations, specific measures were taken to mitigate the risks posed by the model. There are also clear feedback channels for users and ways to report incidents, which allow the data provided to further improve the model.

For example, the model was observed to make potentially biased inferences about speakers. To mitigate this issue, the model was post-trained to refuse questions asking it to make ungrounded inferences and to hedge answers to questions requiring sensitive trait attribution. Compared to the initial model, the post-trained model was 24 points more likely to respond correctly to sensitive trait identification requests.

The addition of new native image generation techniques in March 2025 also introduced new safety risks due to increased image generation and modification capabilities. Mitigation strategies against this risk include chat model refusals, prompt blocking, output blocking, and increased safeguards for minors. These are detailed in the addendum to the GPT-4o system card.

The details and results of the remaining mitigations are published in the GPT-4o System Card.
