-
MOPI-HFRS: A Multi-objective Personalized Health-aware Food Recommendation System with LLM-enhanced Interpretation
Authors:
Zheyuan Zhang,
Zehong Wang,
Tianyi Ma,
Varun Sameer Taneja,
Sofia Nelson,
Nhi Ha Lan Le,
Keerthiram Murugesan,
Mingxuan Ju,
Nitesh V Chawla,
Chuxu Zhang,
Yanfang Ye
Abstract:
The prevalence of unhealthy eating habits has become an increasingly concerning issue in the United States. However, major food recommendation platforms (e.g., Yelp) continue to prioritize users' dietary preferences over the healthiness of their choices. Although efforts have been made to develop health-aware food recommendation systems, the personalization of such systems based on users' specific health conditions remains under-explored. In addition, few studies focus on the interpretability of these systems, which hinders users from assessing the reliability of recommendations and impedes the practical deployment of these systems. In response to this gap, we first establish two large-scale personalized health-aware food recommendation benchmarks as a first attempt. We then develop a novel framework, Multi-Objective Personalized Interpretable Health-aware Food Recommendation System (MOPI-HFRS), which provides food recommendations by jointly optimizing three objectives: user preference, personalized healthiness, and nutritional diversity, along with a large language model (LLM)-enhanced reasoning module to promote healthy dietary knowledge through the interpretation of recommended results. Specifically, this holistic graph learning framework first utilizes two structure learning modules and a structure pooling module to leverage both descriptive features and health data. Then it employs Pareto optimization to achieve the designed multi-facet objectives. Finally, to further promote healthy dietary knowledge and awareness, we exploit an LLM through knowledge infusion, prompting the LLM with knowledge obtained from the recommendation model for interpretation.
Submitted 11 December, 2024;
originally announced December 2024.
-
Learning to Compose SuperWeights for Neural Parameter Allocation Search
Authors:
Piotr Teterwak,
Soren Nelson,
Nikoli Dryden,
Dina Bashkirova,
Kate Saenko,
Bryan A. Plummer
Abstract:
Neural parameter allocation search (NPAS) automates parameter sharing by obtaining weights for a network given an arbitrary, fixed parameter budget. Prior work has two major drawbacks we aim to address. First, there is a disconnect in the sharing pattern between the search and training steps, where weights are warped for layers of different sizes during the search to measure similarity, but not during training, resulting in reduced performance. To address this, we generate layer weights by learning to compose sets of SuperWeights, which represent a group of trainable parameters. These SuperWeights are created to be large enough so they can be used to represent any layer in the network, but small enough that they are computationally efficient. The second drawback we address is the method of measuring similarity between shared parameters. Whereas prior work compared the weights themselves, we argue this does not take into account the amount of conflict between the shared weights. Instead, we use gradient information to identify layers with shared weights that wish to diverge from each other. We demonstrate that our SuperWeight Networks consistently boost performance over the state-of-the-art on the ImageNet and CIFAR datasets in the NPAS setting. We further show that our approach can generate parameters for many network architectures using the same set of weights. This enables us to support tasks like efficient ensembling and anytime prediction, outperforming fully-parameterized ensembles with 17% fewer parameters.
Submitted 2 December, 2023;
originally announced December 2023.
-
MixtureGrowth: Growing Neural Networks by Recombining Learned Parameters
Authors:
Chau Pham,
Piotr Teterwak,
Soren Nelson,
Bryan A. Plummer
Abstract:
Most deep neural networks are trained under fixed network architectures and require retraining when the architecture changes. If expanding the network's size is needed, it is necessary to retrain from scratch, which is expensive. To avoid this, one can grow from a small network by adding random weights over time to gradually achieve the target network size. However, this naive approach falls short in practice as it brings too much noise to the growing process. Prior work tackled this issue by leveraging the already learned weights and training data for generating new weights through conducting a computationally expensive analysis step. In this paper, we introduce MixtureGrowth, a new approach to growing networks that circumvents the initialization overhead in prior work. Before growing, each layer in our model is generated with a linear combination of parameter templates. Newly grown layer weights are generated by using a new linear combination of existing templates for a layer. On one hand, these templates are already trained for the task, providing a strong initialization. On the other, the new coefficients provide flexibility for the added layer weights to learn something new. We show that our approach boosts top-1 accuracy over the state-of-the-art by 2-2.5% on CIFAR-100 and ImageNet datasets, while achieving comparable performance with fewer FLOPs to a larger network trained from scratch. Code is available at https://github.com/chaudatascience/mixturegrowth.
Submitted 7 November, 2023;
originally announced November 2023.
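The MixtureGrowth abstract above describes generating each layer as a linear combination of parameter templates, with newly grown layers using new coefficients over the same templates. A minimal sketch of that one idea follows, under loud assumptions: the template values, the coefficients, and the helper name `combine` are all hypothetical, nothing here is taken from the authors' code, and in a real model both templates and coefficients would be learned.

```python
def combine(templates, coeffs):
    """Return the element-wise linear combination sum_k coeffs[k] * templates[k].

    `templates` is a list of equally shaped 2-D lists (hypothetical stand-ins
    for shared parameter templates); `coeffs` holds one scalar per template.
    """
    if len(templates) != len(coeffs):
        raise ValueError("need exactly one coefficient per template")
    rows, cols = len(templates[0]), len(templates[0][0])
    return [
        [sum(c * t[i][j] for c, t in zip(coeffs, templates)) for j in range(cols)]
        for i in range(rows)
    ]


# Two hypothetical 2x2 templates shared across layers.
templates = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 2.0], [2.0, 0.0]],
]

# Coefficients of a layer that exists before growing (illustrative values).
existing_coeffs = [0.5, 0.5]
# New coefficients for a grown layer (illustrative values, not learned here).
new_coeffs = [1.0, -0.5]

existing_weights = combine(templates, existing_coeffs)
grown_weights = combine(templates, new_coeffs)
```

Because the grown layer reuses the same templates, it starts from parameters that were already trained, while its separate coefficients leave room to diverge from the existing layer.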
-
RACR-MIL: Weakly Supervised Skin Cancer Grading using Rank-Aware Contextual Reasoning on Whole Slide Images
Authors:
Anirudh Choudhary,
Angelina Hwang,
Jacob Kechter,
Krishnakant Saboo,
Blake Bordeaux,
Puneet Bhullar,
Nneka Comfere,
David DiCaudo,
Steven Nelson,
Emma Johnson,
Leah Swanson,
Dennis Murphree,
Aaron Mangold,
Ravishankar K. Iyer
Abstract:
Cutaneous squamous cell cancer (cSCC) is the second most common skin cancer in the US. It is diagnosed by manual multi-class tumor grading using a tissue whole slide image (WSI), which is subjective and suffers from inter-pathologist variability. We propose an automated weakly-supervised grading approach for cSCC WSIs that is trained using WSI-level grade and does not require fine-grained tumor annotations. The proposed model, RACR-MIL, transforms each WSI into a bag of tiled patches and leverages attention-based multiple-instance learning to assign a WSI-level grade. We propose three key innovations to address general as well as cSCC-specific challenges in tumor grading. First, we leverage spatial and semantic proximity to define a WSI graph that encodes both local and non-local dependencies between tumor regions and leverage graph attention convolution to derive contextual patch features. Second, we introduce a novel ordinal ranking constraint on the patch attention network to ensure that higher-grade tumor regions are assigned higher attention. Third, we use tumor depth as an auxiliary task to improve grade classification in a multitask learning framework. RACR-MIL achieves 2-9% improvement in grade classification over existing weakly-supervised approaches on a dataset of 718 cSCC tissue images and localizes the tumor better. The model achieves 5-20% higher accuracy in difficult-to-classify high-risk grade classes and is robust to class imbalance.
Submitted 29 August, 2023;
originally announced August 2023.
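The RACR-MIL abstract above builds on attention-based multiple-instance learning, where a slide is treated as a bag of patches and patch features are pooled with attention weights. A minimal sketch of generic attention pooling follows. It is not the RACR-MIL model: the attention scores are assumed to be given rather than produced by a trained network, and the graph attention, ordinal ranking constraint, and tumor-depth auxiliary task are omitted. The function names and toy values are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(patch_features, attention_scores):
    """Pool a bag of patch feature vectors into one bag-level vector.

    Each patch is weighted by the softmax of its (assumed precomputed)
    attention score; returns the pooled vector and the weights.
    """
    weights = softmax(attention_scores)
    dim = len(patch_features[0])
    pooled = [
        sum(w * feat[d] for w, feat in zip(weights, patch_features))
        for d in range(dim)
    ]
    return pooled, weights


# Toy bag: three patches with 2-D features and hypothetical attention scores.
patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scores = [0.0, 0.0, 2.0]
bag_vector, patch_weights = attention_pool(patches, scores)
```

The returned weights sum to one, so a patch with a higher score contributes more to the bag-level representation that a downstream classifier would grade.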
-
A Quasi-Conforming Embedded Reproducing Kernel Particle Method for Heterogeneous Materials
Authors:
Ryan T. Schlinkman,
Jonghyuk Baek,
Frank N. Beckwith,
Stacy M. Nelson,
J. S. Chen
Abstract:
We present a quasi-conforming embedded reproducing kernel particle method (QCE-RKPM) for modeling heterogeneous materials that makes use of techniques not available to mesh-based methods such as the finite element method (FEM) and avoids many of the drawbacks in current embedded and immersed formulations which are based on meshed methods. The different material domains are discretized independently, thus avoiding time-consuming, conformal meshing. In this approach, the superposition of foreground (inclusion) and background (matrix) domain integration smoothing cells is corrected by a quasi-conforming quadtree subdivision on the background integration smoothing cells. Due to the non-conforming nature of the background integration smoothing cells near the material interfaces, a variationally consistent (VC) correction for domain integration is introduced to restore integration constraints and thus optimal convergence rates at a minor computational cost. Additional interface integration smoothing cells with area (volume) correction, while non-conforming, can be easily introduced to further enhance the accuracy and stability of the Galerkin solution using VC integration on non-conforming cells. To properly approximate the weak discontinuity across the material interface by a penalty-free Nitsche's method with enhanced coercivity, the interface nodes on the surface of the foreground discretization are also shared with the background discretization. As such, there are no tunable parameters, such as those involved in penalty-type methods, to enforce interface compatibility in this approach. The advantage of this meshfree formulation is that it avoids many of the instabilities in mesh-based immersed and embedded methods. The effectiveness of QCE-RKPM is illustrated with several examples.
Submitted 12 April, 2023;
originally announced April 2023.
-
Unpacking the Black Box: Regulating Algorithmic Decisions
Authors:
Laura Blattner,
Scott Nelson,
Jann Spiess
Abstract:
What should regulators of complex algorithms regulate? We propose a model of oversight over 'black-box' algorithms used in high-stakes applications such as lending, medical testing, or hiring. In our model, a regulator is limited in how much she can learn about a black-box model deployed by an agent with misaligned preferences. The regulator faces two choices: first, whether to allow for the use of complex algorithms; and second, which key properties of algorithms to regulate. We show that limiting agents to algorithms that are simple enough to be fully transparent is inefficient as long as the misalignment is limited and complex algorithms have sufficiently better performance than simple ones. Allowing for complex algorithms can improve welfare, but the gains depend on how the regulator regulates them. Regulation that focuses on the overall average behavior of algorithms, for example based on standard explainer tools, will generally be inefficient. Targeted regulation that focuses on the source of incentive misalignment, e.g., excess false positives or racial disparities, can provide second-best solutions. We provide empirical support for our theoretical findings using an application in consumer lending, where we document that complex models regulated based on context-specific explanation tools outperform simple, fully transparent models. This gain from complex models represents a Pareto improvement across our empirical applications that is preferred both by the lender and from the perspective of the financial regulator.
Submitted 31 May, 2024; v1 submitted 5 October, 2021;
originally announced October 2021.
-
How Costly is Noise? Data and Disparities in Consumer Credit
Authors:
Laura Blattner,
Scott Nelson
Abstract:
We show that lenders face more uncertainty when assessing default risk of historically under-served groups in US credit markets and that this information disparity is a quantitatively important driver of inefficient and unequal credit market outcomes. We first document that widely used credit scores are statistically noisier indicators of default risk for historically under-served groups. This noise emerges primarily through the explanatory power of the underlying credit report data (e.g., thin credit files), not through issues with model fit (e.g., the inability to include protected class in the scoring model). Estimating a structural model of lending with heterogeneity in information, we quantify the gains from addressing these information disparities for the US mortgage market. We find that equalizing the precision of credit scores can reduce disparities in approval rates and in credit misallocation for disadvantaged groups by approximately half.
Submitted 16 May, 2021;
originally announced May 2021.
-
Collaborative Experience between Scientific Software Projects using Agile Scrum Development
Authors:
A. L. Baxter,
S. Y. BenZvi,
W. Bonivento,
A. Brazier,
M. Clark,
A. Coleiro,
D. Collom,
M. Colomer-Molla,
B. Cousins,
A. Delgado Orellana,
D. Dornic,
V. Ekimtcov,
S. ElSayed,
A. Gallo Rosso,
P. Godwin,
S. Griswold,
A. Habig,
S. Horiuchi,
D. A. Howell,
M. W. G. Johnson,
M. Juric,
J. P. Kneller,
A. Kopec,
C. Kopper,
V. Kulikovskiy
, et al. (27 additional authors not shown)
Abstract:
Developing sustainable software for the scientific community requires expertise in software engineering and domain science. This can be challenging due to the unique needs of scientific software, the insufficient resources for software engineering practices in the scientific community, and the complexity of developing for evolving scientific contexts. While open-source software can partially address these concerns, it can introduce complicating dependencies and delay development. These issues can be reduced if scientists and software developers collaborate. We present a case study wherein scientists from the SuperNova Early Warning System collaborated with software developers from the Scalable Cyberinfrastructure for Multi-Messenger Astrophysics project. The collaboration addressed the difficulties of open-source software development, but presented additional risks to each team. For the scientists, there was a concern of relying on external systems and lacking control in the development process. For the developers, there was a risk in supporting a user-group while maintaining core development. These issues were mitigated by creating a second Agile Scrum framework in parallel with the developers' ongoing Agile Scrum process. This Agile collaboration promoted communication, ensured that the scientists had an active role in development, and allowed the developers to evaluate and implement the scientists' software requirements. The collaboration provided benefits for each group: the scientists actuated their development by using an existing platform, and the developers utilized the scientists' use-case to improve their systems. This case study suggests that scientists and software developers can avoid scientific computing issues by collaborating and that Agile Scrum methods can address emergent concerns.
Submitted 2 August, 2022; v1 submitted 19 January, 2021;
originally announced January 2021.
-
A needle-based deep-neural-network camera
Authors:
Ruipeng Guo,
Soren Nelson,
Rajesh Menon
Abstract:
We experimentally demonstrate a camera whose primary optic is a cannula (diameter = 0.22 mm and length = 12.5 mm) that acts as a lightpipe transporting light intensity from an object plane (35 cm away) to its opposite end. Deep neural networks (DNNs) are used to reconstruct color and grayscale images with a field of view of 18° and an angular resolution of ~0.4°. When trained on images with depth information, the DNN can create depth maps. Finally, we show DNN-based classification of the EMNIST dataset without and with image reconstructions. The former could be useful for imaging with enhanced privacy.
Submitted 13 November, 2020;
originally announced November 2020.
-
Classification of optics-free images with deep neural networks
Authors:
Soren Nelson,
Rajesh Menon
Abstract:
The thinnest possible camera is achieved by removing all optics, leaving only the image sensor. We train deep neural networks to perform multi-class detection and binary classification (with accuracy of 92%) on optics-free images without the need for anthropocentric image reconstructions. Inferencing from optics-free images has the potential for enhanced privacy and power efficiency.
Submitted 10 November, 2020;
originally announced November 2020.