-
The Amazon Nova Family of Models: Technical Report and Model Card
Authors:
Amazon AGI,
Aaron Langford,
Aayush Shah,
Abhanshu Gupta,
Abhimanyu Bhatter,
Abhinav Goyal,
Abhinav Mathur,
Abhinav Mohanty,
Abhishek Kumar,
Abhishek Sethi,
Abi Komma,
Abner Pena,
Achin Jain,
Adam Kunysz,
Adam Opyrchal,
Adarsh Singh,
Aditya Rawal,
Adok Achar Budihal Prasad,
Adrià de Gispert,
Agnika Kumar,
Aishwarya Aryamane,
Ajay Nair,
Akilan M,
Akshaya Iyengar,
Akshaya Vishnu Kudlu Shanbhogue
, et al. (761 additional authors not shown)
Abstract:
We present Amazon Nova, a new generation of state-of-the-art foundation models that deliver frontier intelligence and industry-leading price performance. Amazon Nova Pro is a highly-capable multimodal model with the best combination of accuracy, speed, and cost for a wide range of tasks. Amazon Nova Lite is a low-cost multimodal model that is lightning fast for processing images, video, documents and text. Amazon Nova Micro is a text-only model that delivers our lowest-latency responses at very low cost. Amazon Nova Canvas is an image generation model that creates professional grade images with rich customization controls. Amazon Nova Reel is a video generation model offering high-quality outputs, customization, and motion control. Our models were built responsibly and with a commitment to customer trust, security, and reliability. We report benchmarking results for core capabilities, agentic performance, long context, functional adaptation, runtime performance, and human evaluation.
Submitted 17 March, 2025;
originally announced June 2025.
-
Efficient Noise Calculation in Deep Learning-based MRI Reconstructions
Authors:
Onat Dalmaz,
Arjun D. Desai,
Reinhard Heckel,
Tolga Çukur,
Akshay S. Chaudhari,
Brian A. Hargreaves
Abstract:
Accelerated MRI reconstruction involves solving an ill-posed inverse problem where noise in acquired data propagates to the reconstructed images. Noise analyses are central to MRI reconstruction for providing an explicit measure of solution fidelity and for guiding the design and deployment of novel reconstruction methods. However, deep learning (DL)-based reconstruction methods have often overlooked noise propagation due to inherent analytical and computational challenges, despite its critical importance. This work proposes a theoretically grounded, memory-efficient technique to calculate voxel-wise variance for quantifying uncertainty due to acquisition noise in accelerated MRI reconstructions. Our approach approximates noise covariance using the DL network's Jacobian, which is intractable to calculate. To circumvent this, we derive an unbiased estimator for the diagonal of this covariance matrix (voxel-wise variance) and introduce a Jacobian sketching technique to efficiently implement it. We evaluate our method on knee and brain MRI datasets for both data- and physics-driven networks trained in supervised and unsupervised manners. Compared to empirical references obtained via Monte Carlo simulations, our technique achieves near-equivalent performance while reducing computational and memory demands by an order of magnitude or more. Furthermore, our method is robust across varying input noise levels, acceleration factors, and diverse undersampling schemes, highlighting its broad applicability. Our work reintroduces accurate and efficient noise analysis as a central tenet of reconstruction algorithms, holding promise to reshape how we evaluate and deploy DL-based MRI. Our code will be made publicly available upon acceptance.
Submitted 4 May, 2025;
originally announced May 2025.
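As an illustration of the kind of estimator the abstract above describes, the following is a minimal Python sketch of a probe-based estimate of the diagonal of J Jᵀ that uses only Jacobian-vector products. It assumes white, unit-variance input noise, uses random sign probes, and stands in a small hypothetical matrix `J` for the network Jacobian. The names `estimate_output_variance` and `apply_J` are invented here, and this is not claimed to be the authors' estimator or sketching scheme.

```python
import random

def estimate_output_variance(apply_jacobian, n_inputs, n_probes, seed=0):
    """Estimate diag(J J^T) from Jacobian-vector products with random sign probes.

    For a probe v with independent +/-1 entries, E[(J v)_i ** 2] = sum_j J_ij ** 2,
    which is the i-th diagonal entry of J J^T.
    """
    rng = random.Random(seed)
    total = None
    for _ in range(n_probes):
        probe = [rng.choice((-1.0, 1.0)) for _ in range(n_inputs)]
        jv = apply_jacobian(probe)           # one Jacobian-vector product
        squared = [x * x for x in jv]
        total = squared if total is None else [t + s for t, s in zip(total, squared)]
    return [t / n_probes for t in total]

# Toy stand-in for a network Jacobian: a fixed 3x4 matrix (hypothetical values).
J = [[1.0, 2.0, 0.0, -1.0],
     [0.0, 3.0, 0.0, 0.0],
     [0.5, -0.5, 2.0, 1.0]]

def apply_J(v):
    # Matrix-vector product; a real network would use automatic differentiation here.
    return [sum(a * b for a, b in zip(row, v)) for row in J]

exact = [sum(a * a for a in row) for row in J]   # true diagonal of J J^T
approx = estimate_output_variance(apply_J, n_inputs=4, n_probes=5000)
```

Each probe costs one Jacobian-vector product, so the full Jacobian is never formed; the estimate in `approx` tightens around `exact` as the number of probes grows.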
-
Explainable Unsupervised Anomaly Detection with Random Forest
Authors:
Joshua S. Harvey,
Joshua Rosaler,
Mingshu Li,
Dhruv Desai,
Dhagash Mehta
Abstract:
We describe the use of an unsupervised Random Forest for similarity learning and improved unsupervised anomaly detection. By training a Random Forest to discriminate between real data and synthetic data sampled from a uniform distribution over the real data bounds, a distance measure is obtained that anisometrically transforms the data, expanding distances at the boundary of the data manifold. We show that using distances recovered from this transformation improves the accuracy of unsupervised anomaly detection, compared to other commonly used detectors, demonstrated over a large number of benchmark datasets. As well as improved performance, this method has advantages over other unsupervised anomaly detection methods, including minimal requirements for data preprocessing, native handling of missing data, and potential for visualizations. By relating outlier scores to partitions of the Random Forest, we develop a method for locally explainable anomaly predictions in terms of feature importance.
Submitted 22 April, 2025;
originally announced April 2025.
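The data-construction step mentioned in the abstract above (real points versus points drawn uniformly over the real data bounds) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `make_real_vs_synthetic`, the one-to-one ratio of synthetic to real rows, and the toy data are all hypothetical, and the Random Forest training and distance extraction are not shown.

```python
import random

def make_real_vs_synthetic(real_rows, seed=0):
    """Build a labeled dataset: real rows (label 1) plus uniform synthetic rows (label 0)."""
    rng = random.Random(seed)
    columns = list(zip(*real_rows))
    # Per-feature bounds of the real data.
    bounds = [(min(col), max(col)) for col in columns]
    # One synthetic row per real row, each feature sampled uniformly within its bounds.
    synthetic = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in real_rows]
    rows = [list(r) for r in real_rows] + synthetic
    labels = [1] * len(real_rows) + [0] * len(synthetic)
    return rows, labels, bounds

# Toy two-feature dataset (hypothetical values).
real = [[0.0, 10.0], [1.0, 12.0], [0.5, 11.0], [0.2, 10.5]]
rows, labels, bounds = make_real_vs_synthetic(real)
```

A classifier trained on `rows` and `labels` then has to learn where the real data lie inside their bounding box, which is the information the abstract's distance measure draws on.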
-
Responsible AI Agents
Authors:
Deven R. Desai,
Mark O. Riedl
Abstract:
Thanks to advances in large language models, a new type of software agent, the artificial intelligence (AI) agent, has entered the marketplace. Companies such as OpenAI, Google, Microsoft, and Salesforce promise their AI Agents will go from generating passive text to executing tasks. Instead of a travel itinerary, an AI Agent would book all aspects of your trip. Instead of generating text or images for a social media post, an AI Agent would post the content across a host of social media outlets. The potential power of AI Agents has fueled legal scholars' fears that AI Agents will enable rogue commerce, human manipulation, rampant defamation, and intellectual property harms. These scholars are calling for regulation before AI Agents cause havoc.
This Article addresses the concerns around AI Agents head on. It shows that core aspects of how one piece of software interacts with another create ways to discipline AI Agents so that rogue, undesired actions are unlikely, perhaps more so than rules designed to govern human agents. It also develops a way to leverage the computer-science approach to value-alignment to improve a user's ability to take action to prevent or correct AI Agent operations. That approach offers an added benefit of helping AI Agents align with norms around user-AI Agent interactions. These practices will enable desired economic outcomes and mitigate perceived risks. The Article also argues that no matter how much AI Agents seem like human agents, they need not, and should not, be given legal personhood status. In short, humans are responsible for AI Agents' actions, and this Article provides a guide for how humans can build and maintain responsible AI Agents.
Submitted 25 February, 2025;
originally announced February 2025.
-
Geofeed Adoption and Authentication
Authors:
Dipsy Desai,
Kicho Yu,
Sulyab Thottungal Valapu
Abstract:
IP Geofeed is a recently proposed informational standard that allows network operators to publish the geographical location of deployed IPv4 and IPv6 prefixes. In this work we study the adoption of IP geofeed, assess deployment of geofeed at Regional Internet Registry and Autonomous System levels, and analyze adherence to RFC 8805 and RFC 9092 in deployed geofeeds. We evaluate the authentication mechanism proposed in RFC 9092 and find that it lacks key features from a security perspective. We propose a novel approach to simplify the authentication of geofeeds and assess its efficiency using different benchmarks. Our findings highlight the challenges in current geofeed adoption and the potential for improving both security and scalability in geofeed validation processes.
Submitted 12 February, 2025;
originally announced February 2025.
-
Can an unsupervised clustering algorithm reproduce a categorization system?
Authors:
Nathalia Castellanos,
Dhruv Desai,
Sebastian Frank,
Stefano Pasquali,
Dhagash Mehta
Abstract:
Peer analysis is a critical component of investment management, often relying on expert-provided categorization systems. These systems' consistency is questioned when they do not align with cohorts from unsupervised clustering algorithms optimized for various metrics. We investigate whether unsupervised clustering can reproduce ground truth classes in a labeled dataset, showing that success depends on feature selection and the chosen distance metric. Using toy datasets and fund categorization as real-world examples we demonstrate that accurately reproducing ground truth classes is challenging. We also highlight the limitations of standard clustering evaluation metrics in identifying the optimal number of clusters relative to the ground truth classes. We then show that if appropriate features are available in the dataset, and a proper distance metric is known (e.g., using a supervised Random Forest-based distance metric learning method), then an unsupervised clustering can indeed reproduce the ground truth classes as distinct clusters.
Submitted 19 August, 2024;
originally announced August 2024.
-
Case-based Explainability for Random Forest: Prototypes, Critics, Counter-factuals and Semi-factuals
Authors:
Gregory Yampolsky,
Dhruv Desai,
Mingshu Li,
Stefano Pasquali,
Dhagash Mehta
Abstract:
The explainability of black-box machine learning algorithms, commonly known as Explainable Artificial Intelligence (XAI), has become crucial for financial and other regulated industrial applications due to regulatory requirements and the need for transparency in business practices. Among the various paradigms of XAI, Explainable Case-Based Reasoning (XCBR) stands out as a pragmatic approach that elucidates the output of a model by referencing actual examples from the data used to train or test the model. Despite its potential, XCBR has been relatively underexplored for many algorithms such as tree-based models until recently. We start by observing that most XCBR methods are defined based on the distance metric learned by the algorithm. By utilizing a recently proposed technique to extract the distance metric learned by Random Forests (RFs), which is both geometry- and accuracy-preserving, we investigate various XCBR methods. These methods amount to identifying special points from the training datasets, such as prototypes, critics, counter-factuals, and semi-factuals, to explain the predictions for a given query of the RF. We evaluate these special points using various evaluation metrics to assess their explanatory power and effectiveness.
Submitted 13 August, 2024;
originally announced August 2024.
-
Open Set Recognition for Random Forest
Authors:
Guanchao Feng,
Dhruv Desai,
Stefano Pasquali,
Dhagash Mehta
Abstract:
In many real-world classification or recognition tasks, it is often difficult to collect training examples that exhaust all possible classes due to, for example, incomplete knowledge during training or ever changing regimes. Therefore, samples from unknown/novel classes may be encountered in testing/deployment. In such scenarios, the classifiers should be able to i) perform classification on known classes, and at the same time, ii) identify samples from unknown classes. This is known as open-set recognition. Although random forest has been an extremely successful framework as a general-purpose classification (and regression) method, in practice, it usually operates under the closed-set assumption and is not able to identify samples from new classes when run out of the box. In this work, we propose a novel approach to enabling open-set recognition capability for random forest classifiers by incorporating distance metric learning and distance-based open-set recognition. The proposed method is validated on both synthetic and real-world datasets. The experimental results indicate that the proposed approach outperforms state-of-the-art distance-based open-set recognition methods.
Submitted 1 August, 2024;
originally announced August 2024.
-
Quantile Regression using Random Forest Proximities
Authors:
Mingshu Li,
Bhaskarjit Sarmah,
Dhruv Desai,
Joshua Rosaler,
Snigdha Bhagat,
Philip Sommer,
Dhagash Mehta
Abstract:
Due to the dynamic nature of financial markets, maintaining models that produce precise predictions over time is difficult. Often the goal isn't just point prediction but determining uncertainty. Quantifying uncertainty, especially the aleatoric uncertainty due to the unpredictable nature of market drivers, helps investors understand varying risk levels. Recently, quantile regression forests (QRF) have emerged as a promising solution: Unlike most basic quantile regression methods that need separate models for each quantile, quantile regression forests estimate the entire conditional distribution of the target variable with a single model, while retaining all the salient features of a typical random forest. We introduce a novel approach to compute quantile regressions from random forests that leverages the proximity (i.e., distance metric) learned by the model and infers the conditional distribution of the target variable. We evaluate the proposed methodology using publicly available datasets and then apply it towards the problem of forecasting the average daily volume of corporate bonds. We show that quantile regression using Random Forest proximities outperforms the original version of QRF in approximating conditional target distributions and prediction intervals. We also demonstrate that the proposed framework is significantly more computationally efficient than traditional approaches to quantile regressions.
Submitted 5 August, 2024;
originally announced August 2024.
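One natural way to turn per-sample weights into a conditional quantile, as the abstract above suggests doing with Random Forest proximities, is a weighted empirical quantile over the training targets. The sketch below is an illustration only: the function name `weighted_quantile`, the toy targets, and the toy weights (standing in for proximities between a query and four training points) are hypothetical, and the paper's exact procedure may differ.

```python
def weighted_quantile(targets, weights, q):
    """Return the smallest target whose cumulative weight share reaches q (0 < q <= 1)."""
    pairs = sorted(zip(targets, weights))        # sort by target value
    total = sum(w for _, w in pairs)
    cumulative = 0.0
    for y, w in pairs:
        cumulative += w
        if cumulative >= q * total:
            return y
    return pairs[-1][0]                          # guard against rounding at q = 1

# Toy training targets and hypothetical proximity weights for one query point.
targets = [10.0, 20.0, 30.0, 40.0]
weights = [0.125, 0.375, 0.375, 0.125]

median = weighted_quantile(targets, weights, 0.5)
lower = weighted_quantile(targets, weights, 0.1)
upper = weighted_quantile(targets, weights, 0.9)
```

A single set of weights yields every quantile for the query, so `lower` and `upper` give a prediction interval without fitting a separate model per quantile.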
-
Evaluating Deep Clustering Algorithms on Non-Categorical 3D CAD Models
Authors:
Siyuan Xiang,
Chin Tseng,
Congcong Wen,
Deshana Desai,
Yifeng Kou,
Binil Starly,
Daniele Panozzo,
Chen Feng
Abstract:
We introduce the first work on benchmarking and evaluating deep clustering algorithms on large-scale non-categorical 3D CAD models. We first propose a workflow to allow expert mechanical engineers to efficiently annotate 252,648 carefully sampled pairwise CAD model similarities, from a subset of the ABC dataset with 22,968 shapes. Using seven baseline deep clustering methods, we then investigate the fundamental challenges of evaluating clustering methods for non-categorical data. Based on these challenges, we propose a novel and viable ensemble-based clustering comparison approach. This work is the first to directly target the underexplored area of deep clustering algorithms for 3D shapes, and we believe it will be an important building block to analyze and utilize the massive 3D shape collections that are starting to appear in deep geometric computing.
Submitted 29 April, 2024;
originally announced April 2024.
-
Between Copyright and Computer Science: The Law and Ethics of Generative AI
Authors:
Deven R. Desai,
Mark Riedl
Abstract:
Copyright and computer science continue to intersect and clash, but they can coexist. The advent of new technologies such as digitization of visual and aural creations, sharing technologies, search engines, social media offerings, and more challenges copyright-based industries and reopens questions about the reach of copyright law. Breakthroughs in artificial intelligence research, especially Large Language Models that leverage copyrighted material as part of training models, are the latest examples of the ongoing tension between copyright and computer science. The exuberance, rush-to-market, and edge problem cases created by a few misguided companies now raise challenges to core legal doctrines and may shift Open Internet practices for the worse. That result does not have to be, and should not be, the outcome.
This Article shows that, contrary to some scholars' views, fair use law does not bless all ways that someone can gain access to copyrighted material even when the purpose is fair use. Nonetheless, the scientific need for more data to advance AI research means access to large book corpora and the Open Internet is vital for the future of that research. The copyright industry claims, however, that almost all uses of copyrighted material must be compensated, even for non-expressive uses. The Article's solution accepts that both sides need to change. It is one that forces the computer science world to discipline its behaviors and, in some cases, pay for copyrighted material. It also requires the copyright industry to abandon its belief that all uses must be compensated or restricted to uses sanctioned by the copyright industry. As part of this re-balancing, the Article addresses a problem that has grown out of this clash and is under-theorized.
Submitted 5 September, 2024; v1 submitted 24 February, 2024;
originally announced March 2024.
-
Enhanced Local Explainability and Trust Scores with Random Forest Proximities
Authors:
Joshua Rosaler,
Dhruv Desai,
Bhaskarjit Sarmah,
Dimitrios Vamvourellis,
Deran Onay,
Dhagash Mehta,
Stefano Pasquali
Abstract:
We initiate a novel approach to explain the predictions and out of sample performance of random forest (RF) regression and classification models by exploiting the fact that any RF can be mathematically formulated as an adaptive weighted K nearest-neighbors model. Specifically, we employ a recent result that, for both regression and classification tasks, any RF prediction can be rewritten exactly as a weighted sum of the training targets, where the weights are RF proximities between the corresponding pairs of data points. We show that this linearity facilitates a local notion of explainability of RF predictions that generates attributions for any model prediction across observations in the training set, and thereby complements established feature-based methods like SHAP, which generate attributions for a model prediction across input features. We show how this proximity-based approach to explainability can be used in conjunction with SHAP to explain not just the model predictions, but also out-of-sample performance, in the sense that proximities furnish a novel means of assessing when a given model prediction is more or less likely to be correct. We demonstrate this approach in the modeling of US corporate bond prices and returns in both regression and classification cases.
Submitted 5 August, 2024; v1 submitted 18 October, 2023;
originally announced October 2023.
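The identity the abstract above relies on, that a forest's regression prediction equals a proximity-weighted sum of training targets, can be checked on a toy example. The sketch below makes simplifying assumptions: trees are represented only by hypothetical leaf assignments, every training point is used in every tree (no bootstrap), and the helper names `proximity_weights` and `forest_prediction` are invented for illustration.

```python
def proximity_weights(train_leaves, query_leaves):
    """Weight of each training point for one query.

    train_leaves[t][i] is the leaf of training point i in tree t;
    query_leaves[t] is the leaf the query falls into in tree t.
    """
    n_trees = len(train_leaves)
    weights = [0.0] * len(train_leaves[0])
    for leaves, q_leaf in zip(train_leaves, query_leaves):
        members = [i for i, leaf in enumerate(leaves) if leaf == q_leaf]
        for i in members:
            # Each tree spreads a total weight of 1 / n_trees over the query's leaf.
            weights[i] += 1.0 / (len(members) * n_trees)
    return weights

def forest_prediction(train_leaves, query_leaves, targets):
    """Usual forest output: average over trees of the mean target in the query's leaf."""
    per_tree = []
    for leaves, q_leaf in zip(train_leaves, query_leaves):
        in_leaf = [targets[i] for i, leaf in enumerate(leaves) if leaf == q_leaf]
        per_tree.append(sum(in_leaf) / len(in_leaf))
    return sum(per_tree) / len(per_tree)

# Two toy trees over four training points (hypothetical leaf assignments).
targets = [1.0, 2.0, 4.0, 8.0]
train_leaves = [[0, 0, 1, 1], [0, 1, 1, 1]]
query_leaves = [1, 1]

weights = proximity_weights(train_leaves, query_leaves)
weighted = sum(w * y for w, y in zip(weights, targets))
direct = forest_prediction(train_leaves, query_leaves, targets)
```

Here `weighted` and `direct` agree, and the entries of `weights` attribute the prediction to individual training observations, which is the observation-level view of explainability the abstract contrasts with feature-level methods such as SHAP.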
-
Quantifying Outlierness of Funds from their Categories using Supervised Similarity
Authors:
Dhruv Desai,
Ashmita Dhiman,
Tushar Sharma,
Deepika Sharma,
Dhagash Mehta,
Stefano Pasquali
Abstract:
Mutual fund categorization has become a standard tool for the investment management industry and is extensively used by allocators for portfolio construction and manager selection, as well as by fund managers for peer analysis and competitive positioning. As a result, an (unintended) miscategorization or lack of precision can significantly impact allocation decisions and investment fund managers. Here, we aim to quantify the effect of miscategorization of funds utilizing a machine learning based approach. We formulate the problem of miscategorization of funds as a distance-based outlier detection problem, where the outliers are the data-points that are far from the rest of the data-points in the given feature space. We implement and employ a Random Forest (RF) based method of distance metric learning, and compute the so-called class-wise outlier measures for each data-point to identify outliers in the data. We test our implementation on various publicly available data sets, and then apply it to mutual fund data. We show that there is a strong relationship between the outlier measures of the funds and their future returns and discuss the implications of our findings.
Submitted 13 August, 2023;
originally announced August 2023.
-
Data-Limited Tissue Segmentation using Inpainting-Based Self-Supervised Learning
Authors:
Jeffrey Dominic,
Nandita Bhaskhar,
Arjun D. Desai,
Andrew Schmidt,
Elka Rubin,
Beliz Gunel,
Garry E. Gold,
Brian A. Hargreaves,
Leon Lenchik,
Robert Boutin,
Akshay S. Chaudhari
Abstract:
Although supervised learning has enabled high performance for image segmentation, it requires a large amount of labeled training data, which can be difficult to obtain in the medical imaging field. Self-supervised learning (SSL) methods involving pretext tasks have shown promise in overcoming this requirement by first pretraining models using unlabeled data. In this work, we evaluate the efficacy of two SSL methods (inpainting-based pretext tasks of context prediction and context restoration) for CT and MRI image segmentation in label-limited scenarios, and investigate the effect of implementation design choices for SSL on downstream segmentation performance. We demonstrate that optimally trained and easy-to-implement inpainting-based SSL segmentation models can outperform classically supervised methods for MRI and CT tissue segmentation in label-limited scenarios, for both clinically-relevant metrics and the traditional Dice score.
Submitted 14 October, 2022;
originally announced October 2022.
-
GLEAM: Greedy Learning for Large-Scale Accelerated MRI Reconstruction
Authors:
Batu Ozturkler,
Arda Sahiner,
Tolga Ergen,
Arjun D Desai,
Christopher M Sandino,
Shreyas Vasanawala,
John M Pauly,
Morteza Mardani,
Mert Pilanci
Abstract:
Unrolled neural networks have recently achieved state-of-the-art accelerated MRI reconstruction. These networks unroll iterative optimization algorithms by alternating between physics-based consistency and neural-network based regularization. However, they require several iterations of a large neural network to handle high-dimensional imaging tasks such as 3D MRI. This limits traditional training algorithms based on backpropagation due to prohibitively large memory and compute requirements for calculating gradients and storing intermediate activations. To address this challenge, we propose Greedy LEarning for Accelerated MRI (GLEAM) reconstruction, an efficient training strategy for high-dimensional imaging settings. GLEAM splits the end-to-end network into decoupled network modules. Each module is optimized in a greedy manner with decoupled gradient updates, reducing the memory footprint during training. We show that the decoupled gradient updates can be performed in parallel on multiple graphics processing units (GPUs) to further reduce training time. We present experiments with 2D and 3D datasets including multi-coil knee, brain, and dynamic cardiac cine MRI. We observe that: i) GLEAM generalizes as well as state-of-the-art memory-efficient baselines such as gradient checkpointing and invertible networks with the same memory footprint, but with 1.3x faster training; ii) for the same memory footprint, GLEAM yields 1.1 dB PSNR gain in 2D and 1.8 dB in 3D over end-to-end baselines.
Submitted 18 July, 2022;
originally announced July 2022.
-
Scale-Equivariant Unrolled Neural Networks for Data-Efficient Accelerated MRI Reconstruction
Authors:
Beliz Gunel,
Arda Sahiner,
Arjun D. Desai,
Akshay S. Chaudhari,
Shreyas Vasanawala,
Mert Pilanci,
John Pauly
Abstract:
Unrolled neural networks have enabled state-of-the-art reconstruction performance and fast inference times for the accelerated magnetic resonance imaging (MRI) reconstruction task. However, these approaches depend on fully-sampled scans as ground truth data which is either costly or not possible to acquire in many clinical medical imaging applications; hence, reducing dependence on data is desirable. In this work, we propose modeling the proximal operators of unrolled neural networks with scale-equivariant convolutional neural networks in order to improve the data-efficiency and robustness to drifts in scale of the images that might stem from the variability of patient anatomies or change in field-of-view across different MRI scanners. Our approach demonstrates strong improvements over the state-of-the-art unrolled neural networks under the same memory constraints both with and without data augmentations on both in-distribution and out-of-distribution scaled images without significantly increasing the train or inference time.
Submitted 21 April, 2022;
originally announced April 2022.
-
SKM-TEA: A Dataset for Accelerated MRI Reconstruction with Dense Image Labels for Quantitative Clinical Evaluation
Authors:
Arjun D Desai,
Andrew M Schmidt,
Elka B Rubin,
Christopher M Sandino,
Marianne S Black,
Valentina Mazzoli,
Kathryn J Stevens,
Robert Boutin,
Christopher Ré,
Garry E Gold,
Brian A Hargreaves,
Akshay S Chaudhari
Abstract:
Magnetic resonance imaging (MRI) is a cornerstone of modern medical imaging. However, long image acquisition times, the need for qualitative expert analysis, and the lack of (and difficulty extracting) quantitative indicators that are sensitive to tissue health have curtailed widespread clinical and research studies. While recent machine learning methods for MRI reconstruction and analysis have shown promise for reducing this burden, these techniques are primarily validated with imperfect image quality metrics, which are discordant with clinically-relevant measures that ultimately hamper clinical deployment and clinician trust. To mitigate this challenge, we present the Stanford Knee MRI with Multi-Task Evaluation (SKM-TEA) dataset, a collection of quantitative knee MRI (qMRI) scans that enables end-to-end, clinically-relevant evaluation of MRI reconstruction and analysis tools. This 1.6TB dataset consists of raw-data measurements of ~25,000 slices (155 patients) of anonymized patient MRI scans, the corresponding scanner-generated DICOM images, manual segmentations of four tissues, and bounding box annotations for sixteen clinically relevant pathologies. We provide a framework for using qMRI parameter maps, along with image reconstructions and dense image labels, for measuring the quality of qMRI biomarker estimates extracted from MRI reconstruction, segmentation, and detection techniques. Finally, we use this framework to benchmark state-of-the-art baselines on this dataset. We hope our SKM-TEA dataset and code can enable a broad spectrum of research for modular image reconstruction and image analysis in a clinically informed manner. Dataset access, code, and benchmarks are available at https://github.com/StanfordMIMI/skm-tea.
Submitted 13 March, 2022;
originally announced March 2022.
-
Don't let Ricci v. DeStefano Hold You Back: A Bias-Aware Legal Solution to the Hiring Paradox
Authors:
Jad Salem,
Deven R. Desai,
Swati Gupta
Abstract:
Companies that try to address inequality in employment face a hiring paradox. Failing to address workforce imbalance can result in legal sanctions and scrutiny, but proactive measures to address these issues might result in the same legal conflict. Recent run-ins of Microsoft and Wells Fargo with the Labor Department's Office of Federal Contract Compliance Programs (OFCCP) are not isolated and are likely to persist. To add to the confusion, existing scholarship on Ricci v. DeStefano often deems solutions to this paradox impossible. Circumventive practices such as the 4/5ths rule further illustrate tensions between too little action and too much action.
In this work, we give a powerful way to solve this hiring paradox that tracks both legal and algorithmic challenges. We unpack the nuances of Ricci v. DeStefano and extend the legal literature arguing that certain algorithmic approaches to employment are allowed by introducing the legal practice of banding to evaluate candidates. We thus show that a bias-aware technique can be used to diagnose and mitigate "built-in" headwinds in the employment pipeline. We use the machinery of partially ordered sets to handle the presence of uncertainty in evaluations data. This approach allows us to move away from treating "people as numbers" to treating people as individuals -- a property that is sought after by Title VII in the context of employment.
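One way to picture how a partial order can absorb uncertainty in evaluations is an interval order: each candidate's score is a range, and one candidate outranks another only when the ranges do not overlap. The sketch below is illustrative only. The interval representation, the `Candidate` type, and the function names are assumptions; the abstract does not state the paper's actual construction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    low: float   # lower end of the plausible score range (hypothetical)
    high: float  # upper end of the plausible score range (hypothetical)


def strictly_better(a: Candidate, b: Candidate) -> bool:
    """a outranks b only if a's whole interval lies above b's."""
    return a.low > b.high


def top_band(candidates: list[Candidate]) -> list[Candidate]:
    """Return the maximal elements: candidates that nobody strictly outranks."""
    return [
        c for c in candidates
        if not any(strictly_better(other, c) for other in candidates)
    ]


pool = [
    Candidate("A", 82.0, 90.0),
    Candidate("B", 85.0, 93.0),
    Candidate("C", 60.0, 70.0),
]
band = top_band(pool)
print([c.name for c in band])  # A and B overlap, so neither outranks the other
```

Candidates whose intervals overlap stay incomparable, so they land in the same band rather than being separated by a small, possibly meaningless score gap.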
Submitted 31 January, 2022;
originally announced January 2022.
-
Noise2Recon: Enabling Joint MRI Reconstruction and Denoising with Semi-Supervised and Self-Supervised Learning
Authors:
Arjun D Desai,
Batu M Ozturkler,
Christopher M Sandino,
Robert Boutin,
Marc Willis,
Shreyas Vasanawala,
Brian A Hargreaves,
Christopher M Ré,
John M Pauly,
Akshay S Chaudhari
Abstract:
Deep learning (DL) has shown promise for faster, high quality accelerated MRI reconstruction. However, supervised DL methods depend on extensive amounts of fully-sampled (labeled) data and are sensitive to out-of-distribution (OOD) shifts, particularly low signal-to-noise ratio (SNR) acquisitions. To alleviate this challenge, we propose Noise2Recon, a model-agnostic, consistency training method for joint MRI reconstruction and denoising that can use both fully-sampled (labeled) and undersampled (unlabeled) scans in semi-supervised and self-supervised settings. With limited or no labeled training data, Noise2Recon outperforms compressed sensing and deep learning baselines, including supervised networks, augmentation-based training, fine-tuned denoisers, and self-supervised methods, and matches performance of supervised models, which were trained with 14x more fully-sampled scans. Noise2Recon also outperforms all baselines, including state-of-the-art fine-tuning and augmentation techniques, among low-SNR scans and when generalizing to other OOD factors, such as changes in acceleration factors and different datasets. Augmentation extent and loss weighting hyperparameters had negligible impact on Noise2Recon compared to supervised methods, which may indicate increased training stability. Our code is available at https://github.com/ad12/meddlr.
Submitted 7 October, 2022; v1 submitted 30 September, 2021;
originally announced October 2021.
-
Fund2Vec: Mutual Funds Similarity using Graph Learning
Authors:
Vipul Satone,
Dhruv Desai,
Dhagash Mehta
Abstract:
Identifying similar mutual funds with respect to the underlying portfolios has found many applications in financial services ranging from fund recommender systems, competitors analysis, portfolio analytics, marketing and sales, etc. The traditional methods are either qualitative, and hence prone to biases and often not reproducible, or, are known not to capture all the nuances (non-linearities) among the portfolios from the raw data. We propose a radically new approach to identify similar funds based on the weighted bipartite network representation of funds and their underlying assets data using a sophisticated machine learning method called Node2Vec which learns an embedded low-dimensional representation of the network. We call the embedding Fund2Vec. Ours is the first ever study of the weighted bipartite network representation of the funds-assets network in its original form that identifies structural similarity among portfolios as opposed to merely portfolio overlaps.
Submitted 24 June, 2021;
originally announced June 2021.
-
Early Bird: Loop Closures from Opposing Viewpoints for Perceptually-Aliased Indoor Environments
Authors:
Satyajit Tourani,
Dhagash Desai,
Udit Singh Parihar,
Sourav Garg,
Ravi Kiran Sarvadevabhatla,
Michael Milford,
K. Madhava Krishna
Abstract:
Significant advances have been made recently in Visual Place Recognition (VPR), feature correspondence, and localization due to the proliferation of deep-learning-based methods. However, existing approaches tend to address, partially or fully, only one of two key challenges: viewpoint change and perceptual aliasing. In this paper, we present novel research that simultaneously addresses both challenges by combining deep-learned features with geometric transformations based on reasonable domain assumptions about navigation on a ground-plane, whilst also removing the requirement for specialized hardware setup (e.g. lighting, downwards facing cameras). In particular, our integration of VPR with SLAM by leveraging the robustness of deep-learned features and our homography-based extreme viewpoint invariance significantly boosts the performance of VPR, feature correspondence, and pose graph submodules of the SLAM pipeline. For the first time, we demonstrate a localization system capable of state-of-the-art performance despite perceptual aliasing and extreme 180-degree-rotated viewpoint change in a range of real-world and simulated experiments. Our system is able to achieve early loop closures that prevent significant drifts in SLAM trajectories. We also compare extensively several deep architectures for VPR and descriptor matching. We also show that superior place recognition and descriptor matching across opposite views results in a similar performance gain in back-end pose graph optimization.
Submitted 20 December, 2020; v1 submitted 3 October, 2020;
originally announced October 2020.
-
ACORNS: An Easy-To-Use Code Generator for Gradients and Hessians
Authors:
Deshana Desai,
Etai Shuchatowitz,
Zhongshi Jiang,
Teseo Schneider,
Daniele Panozzo
Abstract:
The computation of first and second-order derivatives is a staple in many computing applications, ranging from machine learning to scientific computing. We propose an algorithm to automatically differentiate algorithms written in a subset of C99 code and its efficient implementation as a Python script. We demonstrate that our algorithm enables automatic, reliable, and efficient differentiation of common algorithms used in physical simulation and geometry processing.
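For readers unfamiliar with automatic differentiation, the sketch below shows the general idea on a toy polynomial using forward-mode dual numbers in Python. This is a swapped-in technique chosen for brevity: it is not the ACORNS algorithm, which generates derivative code from a subset of C99, and the `Dual` class and `derivative` helper are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative with respect to the input."""
    val: float
    dot: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.dot + other.dot)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants (derivative zero)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def derivative(f, x):
    """Evaluate f'(x) by seeding the input with derivative 1."""
    return f(Dual(x, 1.0)).dot


def poly(x):
    return 3 * x * x + 2 * x + 1


print(derivative(poly, 2.0))  # analytic derivative is 6x + 2, i.e. 14 at x = 2
```

The derivative is exact up to floating-point rounding because it is propagated through each operation rather than estimated by finite differences.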
Submitted 9 July, 2020;
originally announced July 2020.
-
Machine Learning Fund Categorizations
Authors:
Dhagash Mehta,
Dhruv Desai,
Jithin Pradeep
Abstract:
Given the surge in popularity of mutual funds (including exchange-traded funds (ETFs)) as a diversified financial investment, a vast variety of mutual funds from various investment management firms and diversification strategies have become available in the market. Identifying similar mutual funds among such a wide landscape of mutual funds has become more important than ever because of many applications ranging from sales and marketing to portfolio replication, portfolio diversification and tax loss harvesting. The current best method is data-vendor provided categorization, which usually relies on curation by human experts with the help of available data. In this work, we establish that an industry-wide, well-regarded categorization system is learnable using machine learning and largely reproducible, which in turn allows a truly data-driven categorization to be constructed. We discuss the intellectual challenges in learning this man-made system, our results and their implications.
Submitted 29 May, 2020;
originally announced June 2020.
-
The International Workshop on Osteoarthritis Imaging Knee MRI Segmentation Challenge: A Multi-Institute Evaluation and Analysis Framework on a Standardized Dataset
Authors:
Arjun D. Desai,
Francesco Caliva,
Claudia Iriondo,
Naji Khosravan,
Aliasghar Mortazi,
Sachin Jambawalikar,
Drew Torigian,
Jutta Ellermann,
Mehmet Akcakaya,
Ulas Bagci,
Radhika Tibrewala,
Io Flament,
Matthew O'Brien,
Sharmila Majumdar,
Mathias Perslev,
Akshay Pai,
Christian Igel,
Erik B. Dam,
Sibaji Gaj,
Mingrui Yang,
Kunio Nakamura,
Xiaojuan Li,
Cem M. Deniz,
Vladimir Juras,
Ravinder Regatte
, et al. (4 additional authors not shown)
Abstract:
Purpose: To organize a knee MRI segmentation challenge for characterizing the semantic and clinical efficacy of automatic segmentation methods relevant for monitoring osteoarthritis progression.
Methods: A dataset partition consisting of 3D knee MRI from 88 subjects at two timepoints with ground-truth articular (femoral, tibial, patellar) cartilage and meniscus segmentations was standardized. Challenge submissions and a majority-vote ensemble were evaluated using Dice score, average symmetric surface distance, volumetric overlap error, and coefficient of variation on a hold-out test set. Similarities in network segmentations were evaluated using pairwise Dice correlations. Articular cartilage thickness was computed per-scan and longitudinally. Correlation between thickness error and segmentation metrics was measured using Pearson's coefficient. Two empirical upper bounds for ensemble performance were computed using combinations of model outputs that consolidated true positives and true negatives.
Results: Six teams (T1-T6) submitted entries for the challenge. No significant differences were observed across all segmentation metrics for all tissues (p=1.0) among the four top-performing networks (T2, T3, T4, T6). Dice correlations between network pairs were high (>0.85). Per-scan thickness errors were negligible among T1-T4 (p=0.99) and longitudinal changes showed minimal bias (<0.03mm). Low correlations (<0.41) were observed between segmentation metrics and thickness error. The majority-vote ensemble was comparable to top performing networks (p=1.0). Empirical upper bound performances were similar for both combinations (p=1.0).
Conclusion: Diverse networks learned to segment the knee similarly where high segmentation accuracy did not correlate to cartilage thickness accuracy. Voting ensembles did not outperform individual networks but may help regularize individual models.
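Two of the overlap metrics named in the Methods can be written down compactly. The sketch below computes the Dice score and volumetric overlap error for binary masks given as flat 0/1 sequences. It uses the standard textbook definitions rather than the challenge's evaluation code, and the handling of empty masks is an assumption.

```python
def dice_score(pred, truth):
    """Dice = 2|P ∩ T| / (|P| + |T|) for flat binary masks."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumption: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * inter / total


def volumetric_overlap_error(pred, truth):
    """VOE = 1 - |P ∩ T| / |P ∪ T| for flat binary masks."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    # Assumption: two empty masks have zero error.
    return 0.0 if union == 0 else 1.0 - inter / union


pred = [1, 1, 1, 0, 0, 0]
truth = [0, 1, 1, 1, 0, 0]
dice = dice_score(pred, truth)                # 2 * 2 / (3 + 3)
voe = volumetric_overlap_error(pred, truth)   # 1 - 2 / 4
print(dice, voe)
```

Both metrics depend only on voxel overlap, which is consistent with the reported finding that high segmentation scores need not translate into accurate cartilage thickness estimates.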
Submitted 26 May, 2020; v1 submitted 29 April, 2020;
originally announced April 2020.
-
Technical Considerations for Semantic Segmentation in MRI using Convolutional Neural Networks
Authors:
Arjun D. Desai,
Garry E. Gold,
Brian A. Hargreaves,
Akshay S. Chaudhari
Abstract:
High-fidelity semantic segmentation of magnetic resonance volumes is critical for estimating tissue morphometry and relaxation parameters in both clinical and research applications. While manual segmentation is accepted as the gold-standard, recent advances in deep learning and convolutional neural networks (CNNs) have shown promise for efficient automatic segmentation of soft tissues. However, due to the stochastic nature of deep learning and the multitude of hyperparameters in training networks, predicting network behavior is challenging. In this paper, we quantify the impact of three factors associated with CNN segmentation performance: network architecture, training loss functions, and training data characteristics. We evaluate the impact of these variations on the segmentation of femoral cartilage and propose potential modifications to CNN architectures and training protocols to train these models with confidence.
Submitted 5 February, 2019;
originally announced February 2019.
-
Role of Temporal Diversity in Inferring Social Ties Based on Spatio-Temporal Data
Authors:
Deshana Desai,
Harsh Nisar,
Rishab Bhardawaj
Abstract:
The last two decades have seen a tremendous surge in research on social networks and their implications. These studies include inferring social relationships, which in turn have been used for targeted advertising, recommendations, search customization, etc. However, the offline experiences of humans, the conversations and face-to-face interactions that govern our lives, have received less attention. We introduce DAIICT Spatio-Temporal Network (DSSN), a spatiotemporal dataset of 0.7 million data points of continuous location data logged every 2 minutes by the mobile phones of 46 subjects. Our research is focused on inferring relationship strength between students based on the spatiotemporal data and comparing the results with the self-reported data. In that pursuit we introduce Temporal Diversity, which we show contributes more to predicting relationship strength than its counterparts. We also explore how Temporal Diversity evolves with time. Our rich dataset opens various other avenues of research that require fine-grained location data with bounded movement of participants within a limited geographical area. The advantage of having a bounded geographical area such as a university campus is that it provides us with a microcosm of the real world, where each such geographic zone has an internal context and function and a high percentage of mobility is governed by schedules and timetables. The bounded geographical region, together with the age-homogeneous population, gives us a detailed look into the active internal socialization of students in a university.
Submitted 10 November, 2016;
originally announced November 2016.
-
Optimal Hitting Sets for Combinatorial Shapes
Authors:
Aditya Bhaskara,
Devendra Desai,
Srikanth Srinivasan
Abstract:
We consider the problem of constructing explicit Hitting sets for Combinatorial Shapes, a class of statistical tests first studied by Gopalan, Meka, Reingold, and Zuckerman (STOC 2011). These generalize many well-studied classes of tests, including symmetric functions and combinatorial rectangles. Generalizing results of Linial, Luby, Saks, and Zuckerman (Combinatorica 1997) and Rabani and Shpilka (SICOMP 2010), we construct hitting sets for Combinatorial Shapes of size polynomial in the alphabet, dimension, and the inverse of the error parameter. This is optimal up to polynomial factors. The best previous hitting sets came from the Pseudorandom Generator construction of Gopalan et al., and in particular had size that was quasipolynomial in the inverse of the error parameter.
Our construction builds on natural variants of the constructions of Linial et al. and Rabani and Shpilka. In the process, we construct fractional perfect hash families and hitting sets for combinatorial rectangles with stronger guarantees. These might be of independent interest.
Submitted 14 November, 2012;
originally announced November 2012.
-
On a Connection Between Small Set Expansions and Modularity Clustering in Social Networks
Authors:
Bhaskar DasGupta,
Devendra Desai
Abstract:
In this paper we explore a connection between two seemingly different problems from two different domains: the small-set expansion problem studied in the context of the unique games conjecture, and a popular community finding approach for social networks known as the modularity clustering approach. We show that a sub-exponential time algorithm for the small-set expansion problem leads to a sub-exponential time constant factor approximation for some hard input instances of the modularity clustering problem.
Submitted 11 February, 2014; v1 submitted 13 November, 2011;
originally announced November 2011.
-
On the Complexity of Newman's Community Finding Approach for Biological and Social Networks
Authors:
Bhaskar DasGupta,
Devendra Desai
Abstract:
Given a graph of interactions, a module (also called a community or cluster) is a subset of nodes whose fitness is a function of the statistical significance of the pairwise interactions of nodes in the module. The topic of this paper is a model-based community finding approach, commonly referred to as modularity clustering, that was originally proposed by Newman and has subsequently been extremely popular in practice. Various heuristic methods are currently employed for finding the optimal solution. However, the exact computational complexity of this approach is still largely unknown.
To this end, we initiate a systematic study of the computational complexity of modularity clustering. Due to the specific quadratic nature of the modularity function, it is necessary to study its value on sparse graphs and dense graphs separately. Our main results include a (1+ε)-inapproximability for dense graphs and a logarithmic approximation for sparse graphs. We make use of several combinatorial properties of modularity to get these results. These are the first non-trivial approximability results beyond the previously known NP-hardness results.
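For reference, the modularity of a partition of an undirected, unweighted graph with m edges is commonly written as the sum over communities c of e_c/m - (d_c/2m)^2, where e_c counts edges inside c and d_c is the total degree of its nodes. The Python sketch below evaluates that standard formula on a toy graph. The function name and input format are assumptions for illustration, and it only scores a given partition; it does not attempt the optimization problem the paper studies.

```python
from collections import defaultdict


def modularity(edges, community):
    """Modularity of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs; community: dict mapping node -> label.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("modularity is undefined for a graph with no edges")
    internal = defaultdict(int)  # e_c: edges with both endpoints in c
    degree = defaultdict(int)    # d_c: sum of node degrees in c
    for u, v in edges:
        degree[community[u]] += 1
        degree[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    # The squared degree term is the source of the quadratic behaviour.
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
merged = {node: "a" for node in range(6)}
q_split = modularity(edges, split)    # 2 * (3/7 - 1/4) = 5/14
q_merged = modularity(edges, merged)  # 7/7 - 1 = 0
print(q_split, q_merged)
```

Placing every node in one community always scores zero, so any partition with positive modularity, such as the two-triangle split here, is preferred under this objective.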
Submitted 10 April, 2012; v1 submitted 4 February, 2011;
originally announced February 2011.
-
Limits of Approximation Algorithms: PCPs and Unique Games (DIMACS Tutorial Lecture Notes)
Authors:
Prahladh Harsha,
Moses Charikar,
Matthew Andrews,
Sanjeev Arora,
Subhash Khot,
Dana Moshkovitz,
Lisa Zhang,
Ashkan Aazami,
Dev Desai,
Igor Gorodezky,
Geetha Jagannathan,
Alexander S. Kulikov,
Darakhshan J. Mir,
Alantha Newman,
Aleksandar Nikolov,
David Pritchard,
Gwen Spencer
Abstract:
These are the lecture notes for the DIMACS Tutorial "Limits of Approximation Algorithms: PCPs and Unique Games" held at the DIMACS Center, CoRE Building, Rutgers University on 20-21 July, 2009. This tutorial was jointly sponsored by the DIMACS Special Focus on Hardness of Approximation, the DIMACS Special Focus on Algorithmic Foundations of the Internet, and the Center for Computational Intractability with support from the National Security Agency and the National Science Foundation.
The speakers at the tutorial were Matthew Andrews, Sanjeev Arora, Moses Charikar, Prahladh Harsha, Subhash Khot, Dana Moshkovitz and Lisa Zhang. The scribes were Ashkan Aazami, Dev Desai, Igor Gorodezky, Geetha Jagannathan, Alexander S. Kulikov, Darakhshan J. Mir, Alantha Newman, Aleksandar Nikolov, David Pritchard and Gwen Spencer.
Submitted 20 February, 2010;
originally announced February 2010.