Model Rating Report
Amazon Nova
Nova is a family of text and code generation models that comes in three sizes (Pro, Lite, and Micro). The two larger sizes also support image and video understanding.
Developer
Amazon
Country of Origin
USA
Systemic Risk
Open Data
Open Weight
API Access Only
Ratings
Overall Transparency
52%
Data Transparency
14%
Model Transparency
48%
Evaluation Transparency
59%
EU AI Act Readiness
51%
CAIT-D Readiness
33%
Transparency Assessment
The transparency assessment evaluates how clear and detailed the model creators are about their practices. Our assessment is based on the official documentation listed in Sources above. While external analysis may contain additional details about this system, our goal is to evaluate the transparency of the providers themselves.
Sources
Paper: https://assets.amazon.science/4d/96/c72e5f634ee49bf341c89e50249a/the-amazon-nova-family-of-models-technical-report-and-model-card-2-11.pdf
User Guide: https://docs.aws.amazon.com/nova/latest/userguide/what-is-nova.html
Basic Details
Date of Release
December 2024
Methods of Distribution
The Nova models are available through the Amazon Bedrock API.
Modality
All three models can ingest and output text and code. Lite and Pro can also ingest images, video, and various document formats.
Input and Output Format
Pro and Lite have an input context window of 300k tokens; Micro has one of 128k tokens. All three models have a maximum output of 5k tokens. Additional guidance is outlined in the User Guide.
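As an illustration of this access path, the sketch below calls a Nova model through the Bedrock Runtime Converse API using boto3. The region and model ID are assumptions (availability and IDs vary by account and region), not values taken from the Nova documentation.

```python
# Minimal sketch of invoking a Nova model via the Amazon Bedrock Runtime Converse API.
# Assumes boto3 is installed, AWS credentials are configured, and access to Nova Lite
# has been granted in the chosen region; the model ID below is an assumption.
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.converse(
    modelId="amazon.nova-lite-v1:0",  # assumed Nova Lite model ID
    messages=[
        {"role": "user", "content": [{"text": "Summarize what Amazon Nova is in one sentence."}]}
    ],
    inferenceConfig={"maxTokens": 512, "temperature": 0.3},  # well under the 5k output cap
)

print(response["output"]["message"]["content"][0]["text"])
```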
License
Proprietary. Users must follow AWS [Terms of Service](https://aws.amazon.com/service-terms/).
Instructions for Use
The User Guide has detailed instructions for use with examples.
Documentation Support
High Transparency
The Nova models are clearly and accessibly documented: the User Guide provides detailed guidance and examples for each of the Nova models, and the Technical Report contains detailed explanations of the range of evaluations conducted.
Changelog
Policy
Acceptable Use Policy
Guidelines for Responsible Use: https://docs.aws.amazon.com/nova/latest/userguide/responsible-use.html
User Data
AWS does not use inputs to or outputs generated by the Nova models for training.
Data Takedown
Data takedown requests can be submitted here: https://titan.aws.com/privacy
AI Ethics Statement
The Nova models are developed and tested in accordance with the following Responsible AI dimensions: Fairness, Explainability, Privacy and Security, Safety, Controllability, Veracity and Robustness, Governance, and Transparency.
Incident Reporting
Issues can be reported within Amazon Bedrock in the AWS console.
Model and Training
Task Description
High Transparency
The Nova models can be used for image, video, document, and text understanding and for text/code generation. They also support [tool use](https://docs.aws.amazon.com/nova/latest/userguide/tool-use.html) for calling specific APIs or code functions. Key limitations are outlined in the User Guide: for example, the Nova models do not support people identification in images and may struggle with multilingual image understanding, and for video inputs, audio and timestamps are not supported.
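To illustrate the tool-use capability mentioned above, the sketch below registers a single tool with the Bedrock Converse API's toolConfig parameter. The tool name, schema, and model ID are illustrative assumptions, not examples taken from the User Guide.

```python
# Sketch of Nova tool use via the Bedrock Converse API.
# The "get_weather" tool, its schema, and the model ID are illustrative assumptions.
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

tool_config = {
    "tools": [
        {
            "toolSpec": {
                "name": "get_weather",
                "description": "Return the current weather for a city.",
                "inputSchema": {
                    "json": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    }
                },
            }
        }
    ]
}

response = client.converse(
    modelId="amazon.nova-pro-v1:0",  # assumed Nova Pro model ID
    messages=[{"role": "user", "content": [{"text": "What is the weather in Seattle?"}]}],
    toolConfig=tool_config,
)

# When the model decides to call the tool, the stop reason is "tool_use" and the
# requested tool name and input appear in the response content blocks.
if response.get("stopReason") == "tool_use":
    for block in response["output"]["message"]["content"]:
        if "toolUse" in block:
            print(block["toolUse"]["name"], block["toolUse"]["input"])
```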
Number of Parameters
Model Design
Low Transparency
Nova Pro, Lite, and Micro are Transformer-based models.
Training Methodology
Low Transparency
The models were pretrained on a mixture of multilingual and multimodal data. Next, the models were post-trained using Supervised Fine-Tuning and reward model training on human preference data. Finally, methods such as Direct Preference Optimization and Proximal Policy Optimization were used to align the models with human preferences.
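As a rough illustration of one of the alignment methods named above, the snippet below computes the standard Direct Preference Optimization loss for a single preference pair from policy and reference log-probabilities. This is the generic textbook formulation, not code or hyperparameters from the Nova report.

```python
import math

def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one preference pair (generic formulation, not Nova-specific).

    Each argument is the summed log-probability of the chosen or rejected response
    under the policy being trained or under the frozen reference model.
    """
    # Implicit reward margin: how much more the policy (relative to the reference)
    # prefers the chosen response over the rejected one.
    margin = (policy_logp_chosen - ref_logp_chosen) - (policy_logp_rejected - ref_logp_rejected)
    # Negative log-sigmoid of the scaled margin; minimized when the margin is large.
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# Example: the policy slightly prefers the chosen response relative to the reference.
print(dpo_loss(-12.0, -15.0, -13.0, -14.5, beta=0.1))
```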
Computational Resources
Energy Consumption
System Architecture
Input and output moderation models are used to detect unsafe prompts and generated content.
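The moderation layer described above can be pictured as a wrapper around the model call. The sketch below is a hypothetical illustration of that pattern only; the classifier functions and the model call are placeholders, not Amazon's actual moderation models or APIs.

```python
# Hypothetical illustration of input/output moderation around a model call.
# `is_unsafe_prompt`, `is_unsafe_output`, and `call_nova` are placeholders standing
# in for Amazon's moderation models and the Bedrock invocation; none are real APIs.

def is_unsafe_prompt(prompt: str) -> bool:
    """Placeholder input-moderation classifier."""
    return "forbidden" in prompt.lower()

def is_unsafe_output(text: str) -> bool:
    """Placeholder output-moderation classifier."""
    return "forbidden" in text.lower()

def call_nova(prompt: str) -> str:
    """Placeholder for the actual model invocation."""
    return f"Model response to: {prompt}"

def moderated_generate(prompt: str) -> str:
    # Input moderation: block unsafe prompts before they reach the model.
    if is_unsafe_prompt(prompt):
        return "Request declined by input moderation."
    output = call_nova(prompt)
    # Output moderation: filter unsafe generations before returning them.
    if is_unsafe_output(output):
        return "Response withheld by output moderation."
    return output

print(moderated_generate("Explain tool use in Nova."))
```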
Training Hardware
The Nova family of models was trained on Amazon’s custom Trainium1 (Trn1) chips, NVIDIA A100 (P4d instances), and NVIDIA H100 (P5 instances) accelerators.
Data
Dataset Size
Dataset Description
Unknown
No information provided.
Data Sources
Low Transparency
The Nova models were trained on a variety of sources, including licensed data, proprietary data, open source datasets, and publicly available data. This included text data in over 200 languages, including Arabic, Dutch, English, French, German, Hebrew, Hindi, Italian, Japanese, Korean, Portuguese, Russian, Simplified Chinese, Spanish, and Turkish. For post-training, new data were created to demonstrate safe behavior based on the Amazon RAI objectives.
Data Collection - Human Labor
Unknown
The curation of RLHF datasets is discussed, but no details on the labor used were identified.
Data Preprocessing
Low Transparency
The pre-training data was curated for alignment with RAI objectives. This included de-identifying or removing certain types of personal data.
Data Bias Detection
Unknown
Data Deduplication
Data Toxic and Hateful Language Handling
IP Handling in Data
Data PII Handling
Certain types of personal data were removed or de-identified.
Data Collection Period
Evaluation
Performance Evaluation
High Transparency
The Nova Technical Report includes results from multiple benchmarks covering core capabilities, including text-only and multi-modal reasoning, agentic workflows, and long-context text retrieval. While the exact evaluation code is not published, the prompts used are included in the appendix. Qualitative examples of model performance on multi-modal tasks are also included.
Evaluation of Limitations
Low Transparency
The Nova models are evaluated for hallucinations and for alignment with RAI objectives. The Report explains the process for measuring alignment with RAI objectives but does not include specific details.
Evaluation with Public Tools
Adversarial Testing Procedure
Medium Transparency
The model developers framed the Adversarial Testing process around their 8 core Responsible AI dimensions: Fairness, Explainability, Privacy and Security, Safety, Controllability, Veracity and Robustness, Governance, and Transparency. Nova’s adherence to these principles was evaluated using automated benchmarks (public and new proprietary ones) and red-teaming (internal and external). The internal red-teaming exercise used a team of data analysts and subject-matter experts who tested the model’s robustness against adversarial prompts across all the RAI dimensions. The process resulted in a collection of over 300 adversarial prompting techniques that covered different languages and modalities. In addition, four external red-teaming groups (ActiveFence, Deloitte, Gomes Group and Nemysis) were employed to test the model in areas including hate speech, political misinformation and CBRN capabilities.
Model Mitigations
Medium Transparency
The developers used Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF) methods to align the Nova models with Amazon's RAI dimensions. The post-training datasets included single- and multi-turn examples of expected safe behavior for each RAI dimension in multiple languages. In addition to alignment, the deployed Nova system uses input and output moderation models that serve as an additional defense and allow the developers to respond more quickly to new threats.