-
The Amazon Nova Family of Models: Technical Report and Model Card
Authors:
Amazon AGI,
Aaron Langford,
Aayush Shah,
Abhanshu Gupta,
Abhimanyu Bhatter,
Abhinav Goyal,
Abhinav Mathur,
Abhinav Mohanty,
Abhishek Kumar,
Abhishek Sethi,
Abi Komma,
Abner Pena,
Achin Jain,
Adam Kunysz,
Adam Opyrchal,
Adarsh Singh,
Aditya Rawal,
Adok Achar Budihal Prasad,
Adrià de Gispert,
Agnika Kumar,
Aishwarya Aryamane,
Ajay Nair,
Akilan M,
Akshaya Iyengar,
Akshaya Vishnu Kudlu Shanbhogue,
et al. (761 additional authors not shown)
Abstract:
We present Amazon Nova, a new generation of state-of-the-art foundation models that deliver frontier intelligence and industry-leading price performance. Amazon Nova Pro is a highly capable multimodal model with the best combination of accuracy, speed, and cost for a wide range of tasks. Amazon Nova Lite is a low-cost multimodal model that is lightning fast for processing images, video, documents, and text. Amazon Nova Micro is a text-only model that delivers our lowest-latency responses at very low cost. Amazon Nova Canvas is an image generation model that creates professional-grade images with rich customization controls. Amazon Nova Reel is a video generation model offering high-quality outputs, customization, and motion control. Our models were built responsibly and with a commitment to customer trust, security, and reliability. We report benchmarking results for core capabilities, agentic performance, long context, functional adaptation, runtime performance, and human evaluation.
Submitted 17 March, 2025;
originally announced June 2025.
-
ReXCL: A Tool for Requirement Document Extraction and Classification
Authors:
Paheli Bhattacharya,
Manojit Chakraborty,
Santhosh Kumar Arumugam,
Rishabh Gupta
Abstract:
This paper presents the ReXCL tool, which automates the extraction and classification processes in requirement engineering, enhancing the software development lifecycle. The tool features two main modules: Extraction, which processes raw requirement documents into a predefined schema using heuristics and predictive modeling, and Classification, which assigns class labels to requirements using adaptive fine-tuning of encoder-based models. The final output can be exported to external requirement engineering tools. Performance evaluations indicate that ReXCL significantly improves efficiency and accuracy in managing requirements, marking a novel approach to automating the schematization of semi-structured requirement documents.
Submitted 10 April, 2025;
originally announced April 2025.
-
Bounds on Covert Capacity with Sub-Exponential Random Slot Selection
Authors:
Shi-Yuan Wang,
Keerthi S. K. Arumugam,
Matthieu R. Bloch
Abstract:
We consider the problem of covert communication with random slot selection over binary-input Discrete Memoryless Channels and Additive White Gaussian Noise channels, in which a transmitter attempts to reliably communicate with a legitimate receiver while simultaneously maintaining covertness with respect to an eavesdropper. Covertness refers to the inability of the eavesdropper to distinguish the transmission of a message from the absence of communication, modeled by the transmission of a fixed channel input. Random slot selection refers to the transmitter's ability to send a codeword in a time slot with known boundaries selected uniformly at random among a predetermined number of slots. Our main contribution is to develop bounds for the information-theoretic limit of communication in this model, called the covert capacity, when the number of time slots scales sub-exponentially with the codeword length. Our upper and lower bounds for the covert capacity are within a multiplicative factor of $\sqrt{2}$ independent of the channel. This result partially fills a characterization gap between the covert capacity without random slot selection and the covert capacity with random selection among an exponential number of slots in the codeword length. Our key technical contributions consist of i) a tight upper bound for the relative entropy characterizing the effect of random slot selection on the covertness constraint in our achievability proof; ii) a careful converse analysis to characterize the maximum allowable weight or power of codewords to meet the covertness constraint. Our results suggest that, unlike the case without random slot selection, the choice of covertness metric does not change the covert capacity in the presence of random slot selection.
Submitted 18 July, 2025; v1 submitted 12 September, 2024;
originally announced September 2024.
-
Revisiting the Performance of Deep Learning-Based Vulnerability Detection on Realistic Datasets
Authors:
Partha Chakraborty,
Krishna Kanth Arumugam,
Mahmoud Alfadel,
Meiyappan Nagappan,
Shane McIntosh
Abstract:
The impact of software vulnerabilities on everyday software systems is significant. Despite deep learning models being proposed for vulnerability detection, their reliability is questionable. Prior evaluations show high recall/F1 scores of up to 99%, but these models underperform in practical scenarios, particularly when assessed on entire codebases rather than just the fixing commit. This paper introduces Real-Vul, a comprehensive dataset representing real-world scenarios for evaluating vulnerability detection models. Evaluating DeepWukong, LineVul, ReVeal, and IVDetect shows a significant drop in performance, with precision decreasing by up to 95 percentage points and F1 scores by up to 91 points. Furthermore, model performance fluctuates based on vulnerability characteristics, with better F1 scores for information leaks or code injection than for path resolution or predictable return values. The results highlight a significant performance gap that needs addressing before deploying deep learning-based vulnerability detection in practical settings. Overfitting is identified as a key issue, and an augmentation technique is proposed, potentially improving performance by up to 30%. Contributions include a dataset creation approach for better model evaluation, the Real-Vul dataset, and empirical evidence of deep learning models struggling in real-world settings.
Submitted 3 July, 2024;
originally announced July 2024.
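One reason precision can fall so sharply on realistic data is class imbalance: a detector with fixed per-class error rates produces far more false positives when clean code vastly outnumbers vulnerable code. The sketch below illustrates this effect with made-up numbers. The function names, the 90% true-positive rate, the 10% false-positive rate, and the sample sizes are all illustrative assumptions, not figures or code from the paper.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def evaluate(n_vulnerable, n_clean, tpr=0.9, fpr=0.1):
    """Expected metrics for a hypothetical detector with fixed error rates.

    tpr and fpr are assumed values chosen only for illustration.
    """
    tp = tpr * n_vulnerable      # vulnerable samples correctly flagged
    fn = n_vulnerable - tp       # vulnerable samples missed
    fp = fpr * n_clean           # clean samples wrongly flagged
    return precision_recall_f1(tp, fp, fn)


# Balanced benchmark: as many vulnerable as clean samples.
balanced = evaluate(n_vulnerable=1000, n_clean=1000)

# Imbalanced, codebase-like setting: 1% of samples are vulnerable.
imbalanced = evaluate(n_vulnerable=1000, n_clean=99000)

print("balanced   (P, R, F1):", balanced)
print("imbalanced (P, R, F1):", imbalanced)
```

Recall is identical in both settings because it depends only on the vulnerable samples, while precision and F1 drop once the clean class dominates.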
-
PwR: Exploring the Role of Representations in Conversational Programming
Authors:
Pradyumna YM,
Vinod Ganesan,
Dinesh Kumar Arumugam,
Meghna Gupta,
Nischith Shadagopan,
Tanay Dixit,
Sameer Segal,
Pratyush Kumar,
Mohit Jain,
Sriram Rajamani
Abstract:
Large Language Models (LLMs) have revolutionized programming and software engineering. AI programming assistants such as GitHub Copilot X enable conversational programming, narrowing the gap between human intent and code generation. However, prior literature has identified a key challenge: there is a gap between the user's mental model of the system's understanding after a sequence of natural language utterances and the AI system's actual understanding. To address this, we introduce Programming with Representations (PwR), an approach that uses representations to convey the system's understanding back to the user in natural language. We conducted an in-lab task-centered study with 14 users of varying programming proficiency and found that representations significantly improved understandability and instilled a sense of agency among our participants. Expert programmers use them for verification, while intermediate programmers benefit from confirmation. Natural language-based development with LLMs, coupled with representations, promises to transform software development, making it more accessible and efficient.
Submitted 18 September, 2023;
originally announced September 2023.
-
Alexa, play with robot: Introducing the First Alexa Prize SimBot Challenge on Embodied AI
Authors:
Hangjie Shi,
Leslie Ball,
Govind Thattai,
Desheng Zhang,
Lucy Hu,
Qiaozi Gao,
Suhaila Shakiah,
Xiaofeng Gao,
Aishwarya Padmakumar,
Bofei Yang,
Cadence Chung,
Dinakar Guthy,
Gaurav Sukhatme,
Karthika Arumugam,
Matthew Wen,
Osman Ipek,
Patrick Lange,
Rohan Khanna,
Shreyas Pansare,
Vasu Sharma,
Chao Zhang,
Cris Flagg,
Daniel Pressel,
Lavina Vaz,
Luke Dai,
et al. (17 additional authors not shown)
Abstract:
The Alexa Prize program has empowered numerous university students to explore, experiment, and showcase their talents in building conversational agents through challenges like the SocialBot Grand Challenge and the TaskBot Challenge. As conversational agents increasingly appear in multimodal and embodied contexts, it is important to explore the affordances of conversational interaction augmented with computer vision and physical embodiment. This paper describes the SimBot Challenge, a new challenge in which university teams compete to build robot assistants that complete tasks in a simulated physical environment. We provide an overview of the challenge, which included both online and offline phases. We describe the infrastructure and support provided to the teams, including Alexa Arena (the simulated environment) and the ML toolkit provided to accelerate their building of vision and language models. We summarize the approaches the participating teams took to overcome research challenges and extract key lessons learned. Finally, we provide an analysis of the performance of the competing SimBots during the competition.
Submitted 9 August, 2023;
originally announced August 2023.
-
Alexa Arena: A User-Centric Interactive Platform for Embodied AI
Authors:
Qiaozi Gao,
Govind Thattai,
Suhaila Shakiah,
Xiaofeng Gao,
Shreyas Pansare,
Vasu Sharma,
Gaurav Sukhatme,
Hangjie Shi,
Bofei Yang,
Desheng Zheng,
Lucy Hu,
Karthika Arumugam,
Shui Hu,
Matthew Wen,
Dinakar Guthy,
Cadence Chung,
Rohan Khanna,
Osman Ipek,
Leslie Ball,
Kate Bland,
Heather Rocker,
Yadunandana Rao,
Michael Johnston,
Reza Ghanadan,
Arindam Mandal,
et al. (2 additional authors not shown)
Abstract:
We introduce Alexa Arena, a user-centric simulation platform for Embodied AI (EAI) research. Alexa Arena provides a variety of multi-room layouts and interactable objects for the creation of human-robot interaction (HRI) missions. With user-friendly graphics and control mechanisms, Alexa Arena supports the development of gamified robotic tasks readily accessible to general human users, thus opening a new venue for high-efficiency HRI data collection and EAI system evaluation. Along with the platform, we introduce a dialog-enabled instruction-following benchmark and provide baseline results for it. We make Alexa Arena publicly available to facilitate research in building generalizable and assistive embodied agents.
Submitted 7 June, 2023; v1 submitted 2 March, 2023;
originally announced March 2023.
-
Porting numerical integration codes from CUDA to oneAPI: a case study
Authors:
Ioannis Sakiotis,
Kamesh Arumugam,
Marc Paterno,
Desh Ranjan,
Balsa Terzic,
Mohammad Zubair
Abstract:
We present our experience in porting optimized CUDA implementations to oneAPI. We focus on the use case of numerical integration, particularly the CUDA implementations of PAGANI and $m$-Cubes. We faced several challenges that caused performance degradation in the oneAPI ports. These include differences in utilized registers per thread, compiler optimizations, and mappings of CUDA library calls to oneAPI equivalents. After addressing those challenges, we tested both the PAGANI and $m$-Cubes integrators on numerous integrands of various characteristics. To evaluate the quality of the ports, we collected performance metrics of the CUDA and oneAPI implementations on the Nvidia V100 GPU. We found that the oneAPI ports often achieve comparable performance to the CUDA versions, and that they are at most 10% slower.
Submitted 17 February, 2023; v1 submitted 11 February, 2023;
originally announced February 2023.
-
m-CUBES: An efficient and portable implementation of multi-dimensional integration for gpus
Authors:
Ioannis Sakiotis,
Kamesh Arumugam,
Marc Paterno,
Desh Ranjan,
Balsa Terzic,
Mohammad Zubair
Abstract:
The task of multi-dimensional numerical integration is frequently encountered in physics and other scientific fields, e.g., in modeling the effects of systematic uncertainties in physical systems and in Bayesian parameter estimation. Multi-dimensional integration is often time-prohibitive on CPUs. Efficient implementation on many-core architectures is challenging as the workload across the integration space cannot be predicted a priori. We propose m-Cubes, a novel implementation of the well-known Vegas algorithm for execution on GPUs. Vegas transforms integration variables followed by calculation of a Monte Carlo integral estimate using adaptive partitioning of the resulting space. m-Cubes improves performance on GPUs by maintaining relatively uniform workload across the processors. As a result, our optimized CUDA implementation for Nvidia GPUs outperforms parallelization approaches proposed in past literature. We further demonstrate the efficiency of m-Cubes by evaluating a six-dimensional integral from a cosmology application, achieving significant speedup and greater precision than the Cuba library's CPU implementation of Vegas. We also evaluate m-Cubes on a standard integrand test suite. m-Cubes outperforms the serial implementations of the Cuba and GSL libraries by orders of magnitude while maintaining comparable accuracy. Our approach yields a speedup of at least 10x when compared against publicly available Monte Carlo-based GPU implementations. In summary, m-Cubes can solve integrals that are prohibitively expensive using standard libraries and custom implementations. A modern C++ header-only implementation makes m-Cubes portable, allowing its use in complicated pipelines with easy-to-define stateful integrals. Compatibility with non-Nvidia GPUs is achieved with our initial implementation of m-Cubes using the Kokkos framework.
Submitted 21 June, 2022; v1 submitted 3 February, 2022;
originally announced February 2022.
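To make the idea of a Monte Carlo estimate over a partitioned integration space concrete, here is a minimal CPU sketch of stratified sampling on a fixed regular grid over the unit hypercube. It is not m-Cubes and not Vegas: it omits the variable transformation and the adaptive refinement of the grid, and the function name and parameters are illustrative assumptions. It shows only why giving every cell the same number of samples yields uniform, independent work items.

```python
import math
import random


def stratified_mc(f, dim, bins_per_axis, samples_per_cell, seed=0):
    """Estimate the integral of f over [0, 1]^dim with a fixed regular grid.

    Every grid cell receives the same number of uniform samples, so the
    work per cell is identical (the property that makes such a scheme
    easy to parallelize). Returns (estimate, standard_error).
    """
    rng = random.Random(seed)
    width = 1.0 / bins_per_axis
    cell_volume = width ** dim
    estimate = 0.0
    variance = 0.0
    for cell in range(bins_per_axis ** dim):
        # Decode the flat cell index into one bin index per axis.
        indices = []
        rest = cell
        for _ in range(dim):
            indices.append(rest % bins_per_axis)
            rest //= bins_per_axis
        # Sample uniformly inside this cell and evaluate the integrand.
        values = []
        for _ in range(samples_per_cell):
            point = [(i + rng.random()) * width for i in indices]
            values.append(f(point))
        mean = sum(values) / samples_per_cell
        estimate += cell_volume * mean
        if samples_per_cell > 1:
            sample_var = sum((v - mean) ** 2 for v in values) / (samples_per_cell - 1)
            variance += cell_volume ** 2 * sample_var / samples_per_cell
    return estimate, math.sqrt(variance)


# Usage: integrate f(x, y) = x * y over the unit square (exact value 0.25).
estimate, std_error = stratified_mc(
    lambda x: x[0] * x[1], dim=2, bins_per_axis=8, samples_per_cell=16
)
print(f"estimate = {estimate:.5f} +/- {std_error:.5f}")
```

An adaptive method such as Vegas would additionally reshape the grid between iterations so that cells concentrate where the integrand contributes most. That step is deliberately left out here.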
-
PAGANI: A Parallel Adaptive GPU Algorithm for Numerical Integration
Authors:
Ioannis Sakiotis,
Kamesh Arumugam,
Marc Paterno,
Desh Ranjan,
Balša Terzić,
Mohammad Zubair
Abstract:
We present a new adaptive parallel algorithm for the challenging problem of multi-dimensional numerical integration on massively parallel architectures. Adaptive algorithms have demonstrated the best performance, but efficient many-core utilization is difficult to achieve because the adaptive workload can vary greatly across the integration space and is impossible to predict a priori. Existing parallel algorithms utilize sequential computations on independent processors, which results in bottlenecks due to the need for data redistribution and processor synchronization. Our algorithm employs a high-throughput approach in which all existing sub-regions are processed and sub-divided in parallel. Repeated sub-region classification and filtering improves upon a brute-force approach and allows the algorithm to make efficient use of computation and memory resources. A CUDA implementation shows orders of magnitude speedup over the fastest open-source CPU method and extends the achievable accuracy for difficult integrands. Our algorithm typically outperforms other existing deterministic parallel methods.
Submitted 23 June, 2021; v1 submitted 13 April, 2021;
originally announced April 2021.
-
Embedding Covert Information in Broadcast Communications
Authors:
Keerthi Suria Kumar Arumugam,
Matthieu R. Bloch
Abstract:
We analyze a two-receiver binary-input discrete memoryless broadcast channel, in which the transmitter communicates a common message simultaneously to both receivers and a covert message to only one of them. The unintended recipient of the covert message is treated as an adversary who attempts to detect the covert transmission. This model captures the problem of embedding covert messages in an innocent codebook and generalizes previous covert communication models in which the innocent behavior corresponds to the absence of communication between legitimate users. We identify the exact asymptotic behavior of the number of covert bits that can be transmitted when the rate of the innocent codebook is close to the capacity of the channel to the adversary. Our results also identify the dependence of the number of covert bits on the channel parameters and the characteristics of the innocent codebook.
Submitted 28 August, 2018;
originally announced August 2018.
-
Covert Communication over a K-User Multiple Access Channel
Authors:
Keerthi Suria Kumar Arumugam,
Matthieu R. Bloch
Abstract:
We consider a scenario in which $K$ transmitters attempt to communicate covert messages reliably to a legitimate receiver over a discrete memoryless MAC while simultaneously escaping detection from an adversary who observes their communication through another discrete memoryless MAC. We assume that each transmitter may use a secret key that is shared only between itself and the legitimate receiver. We show that each of the $K$ transmitters can transmit on the order of $\sqrt{n}$ reliable and covert bits per $n$ channel uses, beyond which the adversary will be able to detect the communication. We identify the optimal pre-constants of the scaling, which leads to a complete characterization of the covert capacity region of the $K$-user binary-input MAC. We show that, asymptotically, all sum-rate constraints are inactive, unlike in the traditional MAC capacity region. We also characterize the channel conditions that have to be satisfied for the transmitters to operate without a secret key.
Submitted 7 June, 2019; v1 submitted 15 March, 2018;
originally announced March 2018.
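The square-root scaling stated in this abstract can be written out as a short worked relation. The symbol $M_k$ (the number of messages available to transmitter $k$) is notation assumed here for illustration and is not taken from the listing.

```latex
% Assumed notation: M_k = number of covert messages of transmitter k,
% n = number of channel uses.
\[
  \log M_k = \Theta\!\left(\sqrt{n}\right)
  \quad\Longrightarrow\quad
  \frac{\log M_k}{n} = \Theta\!\left(\frac{1}{\sqrt{n}}\right)
  \xrightarrow[n \to \infty]{} 0 .
\]
```

Because the conventional rate in bits per channel use vanishes, the quantity of interest is the constant in front of $\sqrt{n}$, which is consistent with the abstract's reference to the optimal pre-constants of the scaling.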