-
The Transformation Risk-Benefit Model of Artificial Intelligence: Balancing Risks and Benefits Through Practical Solutions and Use Cases
Authors:
Richard Fulton,
Diane Fulton,
Nate Hayes,
Susan Kaplan
Abstract:
This paper summarizes the most cogent advantages and risks associated with Artificial Intelligence from an in-depth review of the literature. Then the authors synthesize the salient risk-related models currently being used in AI, technology and business-related scenarios. Next, in view of an updated context of AI along with theories and models reviewed and expanded constructs, the writers propose a new framework called "The Transformation Risk-Benefit Model of Artificial Intelligence" to address the increasing fears and levels of AI risk. Using the model characteristics, the article emphasizes practical and innovative solutions where benefits outweigh risks, and presents three use cases in healthcare, climate change/environment and cyber security to illustrate the unique interplay of principles, dimensions and processes of this powerful AI transformational model.
Submitted 11 April, 2024;
originally announced June 2024.
-
Graph-Based Bidirectional Transformer Decision Threshold Adjustment Algorithm for Class-Imbalanced Molecular Data
Authors:
Nicole Hayes,
Ekaterina Merkurjev,
Guo-Wei Wei
Abstract:
Data sets with imbalanced class sizes, where one class size is much smaller than that of others, occur exceedingly often in many applications, including those with biological foundations, such as disease diagnosis and drug discovery. Therefore, it is extremely important to be able to identify data elements of classes of various sizes, as a failure to do so can result in heavy costs. Nonetheless, many data classification procedures do not perform well on imbalanced data sets as they often fail to detect elements belonging to underrepresented classes. In this work, we propose the BTDT-MBO algorithm, incorporating Merriman-Bence-Osher (MBO) approaches and a bidirectional transformer, as well as distance correlation and decision threshold adjustments, for data classification tasks on highly imbalanced molecular data sets, where the sizes of the classes vary greatly. The proposed technique not only integrates adjustments in the classification threshold for the MBO algorithm in order to help deal with the class imbalance, but also uses a bidirectional transformer procedure based on an attention mechanism for self-supervised learning. In addition, the model implements distance correlation as a weight function for the similarity graph-based framework on which the adjusted MBO algorithm operates. The proposed method is validated using six molecular data sets and compared to other related techniques. The computational experiments show that the proposed technique is superior to competing approaches even in the case of a high class imbalance ratio.
Submitted 3 September, 2024; v1 submitted 10 June, 2024;
originally announced June 2024.
-
Integrating Transformer and Autoencoder Techniques with Spectral Graph Algorithms for the Prediction of Scarcely Labeled Molecular Data
Authors:
Nicole Hayes,
Ekaterina Merkurjev,
Guo-Wei Wei
Abstract:
In molecular and biological sciences, experiments are expensive, time-consuming, and often subject to ethical constraints. Consequently, one often faces the challenging task of predicting desirable properties from small data sets or scarcely-labeled data sets. Although transfer learning can be advantageous, it requires the existence of a related large data set. This work introduces three graph-based models incorporating Merriman-Bence-Osher (MBO) techniques to tackle this challenge. Specifically, graph-based modifications of the MBO scheme are integrated with state-of-the-art techniques, including a home-made transformer and an autoencoder, in order to deal with scarcely-labeled data sets. In addition, a consensus technique is detailed. The proposed models are validated using five benchmark data sets. We also provide a thorough comparison to other competing methods, such as support vector machines, random forests, and gradient boosting decision trees, which are known for their good performance on small data sets. The performances of various methods are analyzed using residue-similarity (R-S) scores and R-S indices. Extensive computational experiments and theoretical analysis show that the new models perform very well even when as little as 1% of the data set is used as labeled data.
Submitted 5 January, 2023; v1 submitted 12 November, 2022;
originally announced November 2022.
-
Alternative Paradigms of Computation
Authors:
William Gasarch,
Nathan Hayes,
Emily Kaplitz,
William Regli
Abstract:
With Moore's law coming to a close, it is useful to look at other forms of computer hardware. In this paper we survey what is known about several modes of computation: Neuromorphic, Custom Logic, Quantum, Optical, Spintronics, Reversible, Many-Valued Logic, Chemical, DNA, Neurological, Fluidic, Amorphous, Thermodynamic, Peptide, and Membrane. For each of these modes of computing we discuss pros, cons, current work, and metrics. After surveying these alternative modes of computation, we discuss two areas where they may be useful: data analytics and graph processing.
Submitted 17 November, 2021;
originally announced November 2021.
-
Large-Margin Classification with Multiple Decision Rules
Authors:
Patrick K. Kimes,
D. Neil Hayes,
J. S. Marron,
Yufeng Liu
Abstract:
Binary classification is a common statistical learning problem in which a model is estimated on a set of covariates for some outcome indicating the membership of one of two classes. In the literature, there exists a distinction between hard and soft classification. In soft classification, the conditional class probability is modeled as a function of the covariates. In contrast, hard classification methods only target the optimal prediction boundary. While hard and soft classification methods have been studied extensively, not much work has been done to compare the actual tasks of hard and soft classification. In this paper we propose a spectrum of statistical learning problems which span the hard and soft classification tasks based on fitting multiple decision rules to the data. By doing so, we reveal a novel collection of learning tasks of increasing complexity. We study the problems using the framework of large-margin classifiers and a class of piecewise linear convex surrogates, for which we derive statistical properties and a corresponding sub-gradient descent algorithm. We conclude by applying our approach to simulation settings and a magnetic resonance imaging (MRI) dataset from the Alzheimer's Disease Neuroimaging Initiative (ADNI) study.
Submitted 19 November, 2014;
originally announced November 2014.