Model Rating Report
Amazon Nova
Nova is a family of text and code generation models that comes in three sizes (Pro, Lite, and Micro). The two larger sizes also support image and video understanding.
Developer
Amazon
Country of Origin
USA
Systemic Risk
Open Data
Open Weight
API Access Only
Ratings
Overall Transparency
52%
Data Transparency
14%
Model Transparency
48%
Evaluation Transparency
59%
EU AI Act Readiness
51%
CAIT-D Readiness
33%
Transparency Assessment
The transparency assessment evaluates how clear and detailed the model creators are about their practices. Our assessment is based on the official documentation listed in Sources above. While external analysis may contain additional details about this system, our goal is to evaluate the transparency of the providers themselves.
Sources
Paper: https://assets.amazon.science/4d/96/c72e5f634ee49bf341c89e50249a/the-amazon-nova-family-of-models-technical-report-and-model-card-2-11.pdf
User Guide: https://docs.aws.amazon.com/nova/latest/userguide/what-is-nova.html
Basic Details
Date of Release
December 2024
Methods of Distribution
The Nova models are available through the Amazon Bedrock API.
Modality
All three models can ingest and output text and code. Lite and Pro can also ingest images, video, and various document formats.
Input and Output Format
Pro and Lite have an input context window of 300k tokens; Micro has one of 128k tokens. All three models have a maximum output of 5k tokens. Additional guidance is outlined in the User Guide.
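As an illustration of this access path, the sketch below calls a Nova model through the Bedrock Runtime Converse API using boto3. The region and model ID are assumptions (availability and IDs vary by account and region), not values taken from the Nova documentation.

```python
# Minimal sketch of invoking a Nova model via the Amazon Bedrock Runtime Converse API.
# Assumes boto3 is installed, AWS credentials are configured, and access to Nova Lite
# has been granted in the chosen region; the model ID below is an assumption.
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.converse(
    modelId="amazon.nova-lite-v1:0",  # assumed Nova Lite model ID
    messages=[
        {"role": "user", "content": [{"text": "Summarize what Amazon Nova is in one sentence."}]}
    ],
    inferenceConfig={"maxTokens": 512, "temperature": 0.3},  # well under the 5k output cap
)

print(response["output"]["message"]["content"][0]["text"])
```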
License
Proprietary. Users must follow AWS [Terms of Service](https://aws.amazon.com/service-terms/).
Instructions for Use
The User Guide has detailed instructions for use with examples.
Documentation Support
High Transparency
The Nova models are clearly and accessibly documented: the User Guide provides detailed guidance and examples for each of the Nova models, and the Technical Report contains detailed explanations of the range of evaluations conducted.
Changelog
Policy
Acceptable Use Policy
Guidelines for Responsible Use: https://docs.aws.amazon.com/nova/latest/userguide/responsible-use.html
User Data
AWS does not use inputs to or outputs generated by the Nova models for training.
Data Takedown
Data takedown requests can be submitted here: https://titan.aws.com/privacy
AI Ethics Statement
The Nova models are developed and tested in accordance with the following Responsible AI dimensions: Fairness, Explainability, Privacy and Security, Safety, Controllability, Veracity and Robustness, Governance, and Transparency.
Incident Reporting
Issues can be reported within Amazon Bedrock in the AWS console.
Model and Training
Task Description
High Transparency
The Nova models can be used for image, video, document, and text understanding and for text/code generation. They also support [tool use](https://docs.aws.amazon.com/nova/latest/userguide/tool-use.html) for calling specific APIs or code functions. Key limitations are outlined in the User Guide: for example, the Nova models do not support people identification in images and may struggle with multilingual image understanding, and for video inputs, audio and timestamps are not supported.
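To illustrate the tool-use capability mentioned above, the sketch below registers a single tool with the Bedrock Converse API's toolConfig parameter. The tool name, schema, and model ID are illustrative assumptions, not examples taken from the User Guide.

```python
# Sketch of Nova tool use via the Bedrock Converse API.
# The "get_weather" tool, its schema, and the model ID are illustrative assumptions.
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

tool_config = {
    "tools": [
        {
            "toolSpec": {
                "name": "get_weather",
                "description": "Return the current weather for a city.",
                "inputSchema": {
                    "json": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    }
                },
            }
        }
    ]
}

response = client.converse(
    modelId="amazon.nova-pro-v1:0",  # assumed Nova Pro model ID
    messages=[{"role": "user", "content": [{"text": "What is the weather in Seattle?"}]}],
    toolConfig=tool_config,
)

# When the model decides to call the tool, the stop reason is "tool_use" and the
# requested tool name and input appear in the response content blocks.
if response.get("stopReason") == "tool_use":
    for block in response["output"]["message"]["content"]:
        if "toolUse" in block:
            print(block["toolUse"]["name"], block["toolUse"]["input"])
```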
Number of Parameters
Model Design
Low Transparency
Nova Pro, Lite, and Micro are Transformer-based models.
Training Methodology
Low Transparency
The models were pretrained on a mixture of multilingual and multimodal data. Next, the models were post-trained using Supervised Fine-Tuning and reward model training on human preference data. Finally, methods such as Direct Preference Optimization and Proximal Policy Optimization were used to align the models with human preferences.
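As a rough illustration of one of the alignment methods named above, the snippet below computes the standard Direct Preference Optimization loss for a single preference pair from policy and reference log-probabilities. This is the generic textbook formulation, not code or hyperparameters from the Nova report.

```python
import math

def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one preference pair (generic formulation, not Nova-specific).

    Each argument is the summed log-probability of the chosen or rejected response
    under the policy being trained or under the frozen reference model.
    """
    # Implicit reward margin: how much more the policy (relative to the reference)
    # prefers the chosen response over the rejected one.
    margin = (policy_logp_chosen - ref_logp_chosen) - (policy_logp_rejected - ref_logp_rejected)
    # Negative log-sigmoid of the scaled margin; minimized when the margin is large.
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# Example: the policy slightly prefers the chosen response relative to the reference.
print(dpo_loss(-12.0, -15.0, -13.0, -14.5, beta=0.1))
```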
Computational Resources
Energy Consumption
System Architecture
Input and output moderation models are used to detect unsafe prompts and generated content.
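The moderation layer described above can be pictured as a wrapper around the model call. The sketch below is a hypothetical illustration of that pattern only; the classifier functions and the model call are placeholders, not Amazon's actual moderation models or APIs.

```python
# Hypothetical illustration of input/output moderation around a model call.
# `is_unsafe_prompt`, `is_unsafe_output`, and `call_nova` are placeholders standing
# in for Amazon's moderation models and the Bedrock invocation; none are real APIs.

def is_unsafe_prompt(prompt: str) -> bool:
    """Placeholder input-moderation classifier."""
    return "forbidden" in prompt.lower()

def is_unsafe_output(text: str) -> bool:
    """Placeholder output-moderation classifier."""
    return "forbidden" in text.lower()

def call_nova(prompt: str) -> str:
    """Placeholder for the actual model invocation."""
    return f"Model response to: {prompt}"

def moderated_generate(prompt: str) -> str:
    # Input moderation: block unsafe prompts before they reach the model.
    if is_unsafe_prompt(prompt):
        return "Request declined by input moderation."
    output = call_nova(prompt)
    # Output moderation: filter unsafe generations before returning them.
    if is_unsafe_output(output):
        return "Response withheld by output moderation."
    return output

print(moderated_generate("Explain tool use in Nova."))
```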
Training Hardware
The Nova family of models was trained on Amazon’s custom Trainium1 (Trn1) chips, NVIDIA A100 (P4d instances), and NVIDIA H100 (P5 instances) accelerators.
Data
Dataset Size
Dataset Description
Unknown
No information provided.
Data Sources
Low Transparency
The Nova models were trained on a variety of sources, including licensed data, proprietary data, open source datasets, and publicly available data. This included text data in over 200 languages, including Arabic, Dutch, English, French, German, Hebrew, Hindi, Italian, Japanese, Korean, Portuguese, Russian, Simplified Chinese, Spanish, and Turkish. For post-training, new data were created to demonstrate safe behavior based on the Amazon RAI objectives.
Data Collection - Human Labor
Unknown
The curation of RLHF datasets is discussed, but no details on the labor used were identified.
Data Preprocessing
Low Transparency
The pre-training data was curated for alignment with RAI objectives. This included de-identifying or removing certain types of personal data.
Data Bias Detection
Unknown
Data Deduplication
Data Toxic and Hateful Language Handling
IP Handling in Data
Data PII Handling
Certain types of personal data were removed or de-identified.
Data Collection Period
Evaluation
Performance Evaluation
High Transparency
The Nova Technical Report includes results from multiple benchmarks covering core capabilities, including text-only and multi-modal reasoning, agentic workflows, and long-context text retrieval. While the exact evaluation code is not published, the prompts used are included in the appendix. Qualitative examples of model performance on multi-modal tasks are also included.
Evaluation of Limitations
Low Transparency
The Nova models are evaluated for hallucinations and for alignment with RAI objectives. The Report explains the process for measuring alignment with RAI objectives but does not include specific details.
Evaluation with Public Tools
Adversarial Testing Procedure
Medium Transparency
The model developers framed the Adversarial Testing process around their 8 core Responsible AI dimensions: Fairness, Explainability, Privacy and Security, Safety, Controllability, Veracity and Robustness, Governance, and Transparency. Nova’s adherence to these principles was evaluated using automated benchmarks (public and new proprietary ones) and red-teaming (internal and external). The internal red-teaming exercise used a team of data analysts and subject-matter experts who tested the model’s robustness against adversarial prompts across all the RAI dimensions. The process resulted in a collection of over 300 adversarial prompting techniques that covered different languages and modalities. In addition, four external red-teaming groups (ActiveFence, Deloitte, Gomes Group and Nemysis) were employed to test the model in areas including hate speech, political misinformation and CBRN capabilities.
Model Mitigations
Medium Transparency
The developers used Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF) methods to align the Nova models with Amazon's RAI dimensions. The post-training datasets included single- and multi-turn examples of expected safe behavior for each RAI dimension in multiple languages. In addition to alignment, the deployed Nova system uses input and output moderation models that serve as an additional defense and allow the developers to respond more quickly to new threats.