-
Domain Specific Benchmarks for Evaluating Multimodal Large Language Models
Authors:
Khizar Anjum,
Muhammad Arbab Arshad,
Kadhim Hayawi,
Efstathios Polyzos,
Asadullah Tariq,
Mohamed Adel Serhani,
Laiba Batool,
Brady Lund,
Nishith Reddy Mannuru,
Ravi Varma Kumar Bevara,
Taslim Mahbub,
Muhammad Zeeshan Akram,
Sakib Shahriar
Abstract:
Large language models (LLMs) are increasingly being deployed across disciplines due to their advanced reasoning and problem-solving capabilities. To measure their effectiveness, various benchmarks have been developed that assess aspects of LLM reasoning, comprehension, and problem-solving. While several surveys address LLM evaluation and benchmarks, a domain-specific analysis remains underexplored in the literature. This paper introduces a taxonomy of seven key disciplines, encompassing various domains and application areas where LLMs are extensively utilized. Additionally, we provide a comprehensive review of LLM benchmarks and survey papers within each domain, highlighting the unique capabilities of LLMs and the challenges faced in their application. Finally, we compile and categorize these benchmarks by domain to create an accessible resource for researchers, aiming to pave the way for advancements toward artificial general intelligence (AGI).
Submitted 20 June, 2025; v1 submitted 15 June, 2025;
originally announced June 2025.
-
What Does Information Science Offer for Data Science Research?: A Review of Data and Information Ethics Literature
Authors:
Brady D. Lund,
Ting Wang
Abstract:
This paper reviews literature pertaining to the development of data science as a discipline, current issues with data bias and ethics, and the role that the discipline of information science may play in addressing these concerns. Information science research and researchers have much to offer for data science, owing to their background as transdisciplinary scholars who apply human-centered and social-behavioral perspectives to issues within natural science disciplines. Information science researchers have already contributed to a humanistic approach to data ethics within the literature and an emphasis on data science within information schools all but ensures that this literature will continue to grow in coming decades. This review article serves as a reference for the history, current progress, and potential future directions of data ethics research within the corpus of information science literature.
Submitted 26 May, 2025;
originally announced June 2025.
-
Zero Trust Cybersecurity: Procedures and Considerations in Context
Authors:
Brady D. Lund,
Tae Hee Lee,
Ziang Wang,
Ting Wang,
Nishith Reddy Mannuru
Abstract:
In response to the increasing complexity and sophistication of cyber threats, particularly those enhanced by advancements in artificial intelligence, traditional security methods are proving insufficient. This paper explores the Zero Trust cybersecurity framework, which operates on the principle of "never trust, always verify" to mitigate vulnerabilities within organizations. Specifically, it examines the applicability of Zero Trust principles in environments where large volumes of information are exchanged, such as schools and libraries. The discussion highlights the importance of continuous authentication, least-privilege access, and breach assumption. The findings underscore avenues for future research that may help preserve the security of these vulnerable organizations.
Submitted 24 May, 2025;
originally announced May 2025.
-
Understanding the Relationship Between Personal Data Privacy Literacy and Data Privacy Information Sharing by University Students
Authors:
Brady D. Lund,
Bryan Anderson,
Ana Roeschley,
Gahangir Hossain
Abstract:
With constant threats to the safety of personal data in the United States, privacy literacy has become an increasingly important competency among university students, one that ties intimately to the information sharing behavior of these students. This survey-based study examines how university students in the United States perceive personal data privacy and how their privacy literacy influences their understanding and behaviors. Students' responses to a privacy literacy scale were categorized into high and low privacy literacy groups, revealing that high-literacy individuals demonstrate a broader range of privacy practices, including multi-factor authentication, VPN usage, and phishing awareness, whereas low-literacy individuals rely on more basic security measures. Statistical analyses suggest that high-literacy respondents display greater diversity in recommendations and engagement in privacy discussions. These findings suggest the need for enhanced educational initiatives to improve data privacy awareness at the university level to create a more cyber-safe population.
Submitted 24 May, 2025;
originally announced May 2025.
-
Putting GPT-4o to the Sword: A Comprehensive Evaluation of Language, Vision, Speech, and Multimodal Proficiency
Authors:
Sakib Shahriar,
Brady Lund,
Nishith Reddy Mannuru,
Muhammad Arbab Arshad,
Kadhim Hayawi,
Ravi Varma Kumar Bevara,
Aashrith Mannuru,
Laiba Batool
Abstract:
As large language models (LLMs) continue to advance, evaluating their comprehensive capabilities becomes significant for their application in various fields. This research study comprehensively evaluates the language, vision, speech, and multimodal capabilities of GPT-4o. The study employs standardized exam questions, reasoning tasks, and translation assessments to assess the model's language capability. Additionally, GPT-4o's vision and speech capabilities are tested through image classification and object recognition tasks, as well as accent classification. The multimodal evaluation assesses the model's performance in integrating visual and linguistic data. Our findings reveal that GPT-4o demonstrates high accuracy and efficiency across multiple domains in language and reasoning capabilities, excelling in tasks that require few-shot learning. GPT-4o also provides notable improvements in multimodal tasks compared to its predecessors. However, the model shows variability and faces limitations in handling complex and ambiguous inputs, particularly in audio and vision capabilities. This paper highlights the need for more comprehensive benchmarks and robust evaluation frameworks, encompassing qualitative assessments involving human judgment as well as error analysis. Future work should focus on expanding datasets, investigating prompt-based assessment, and enhancing few-shot learning techniques to test the model's practical applicability and performance in real-world scenarios.
Submitted 19 June, 2024;
originally announced July 2024.
-
The Impact of AI on Academic Research and Publishing
Authors:
Brady Lund,
Manika Lamba,
Sang Hoo Oh
Abstract:
Generative artificial intelligence (AI) technologies like ChatGPT have significantly impacted academic writing and publishing through their ability to generate content at levels comparable to or surpassing human writers. Through a review of recent interdisciplinary literature, this paper examines ethical considerations surrounding the integration of AI into academia, focusing on the potential for this technology to be used for scholarly misconduct and the oversight necessary when using it for writing, editing, and reviewing scholarly papers. The findings highlight the need for collaborative approaches to AI usage among publishers, editors, reviewers, and authors to ensure that this technology is used ethically and productively.
Submitted 10 June, 2024;
originally announced June 2024.
-
Perceptions of the Fourth Industrial Revolution and Artificial Intelligence Impact on Society
Authors:
Daniel Agbaji,
Brady Lund,
Nishith Reddy Mannuru
Abstract:
The Fourth Industrial Revolution, particularly Artificial Intelligence (AI), has had a profound impact on society, raising concerns about its implications and ethical considerations. The emergence of text generative AI tools like ChatGPT has further intensified concerns regarding ethics, security, privacy, and copyright. This study aims to examine the perceptions of individuals in different information flow categorizations toward AI. The results reveal key themes in participant-supplied definitions of AI and the fourth industrial revolution, emphasizing the replication of human intelligence, machine learning, automation, and the integration of digital technologies. Participants expressed concerns about job replacement, privacy invasion, and inaccurate information provided by AI. However, they also recognized the benefits of AI, such as solving complex problems and increasing convenience. Views on government involvement in shaping the fourth industrial revolution varied, with some advocating for strict regulations and others favoring support and development. The anticipated changes brought by the fourth industrial revolution include automation, potential job impacts, increased social disconnect, and reliance on technology. Understanding these perceptions is crucial for effectively managing the challenges and opportunities associated with AI in the evolving digital landscape.
Submitted 31 July, 2023;
originally announced August 2023.
-
Experimental Demonstration of Secure Frequency Hopping Communication Enabled by Quantum Key Distribution
Authors:
Bernardo A. Huberman,
Bob Lund,
Jing Wang,
Lin Cheng
Abstract:
We propose and experimentally demonstrate a method of frequency hopping spread spectrum communication using a quantum key distribution network to deliver the frequency hopping pattern for secure wireless communications. Results show low interception and jamming probabilities.
Submitted 20 June, 2023;
originally announced July 2023.
-
ChatGPT and a New Academic Reality: Artificial Intelligence-Written Research Papers and the Ethics of the Large Language Models in Scholarly Publishing
Authors:
Brady Lund,
Ting Wang,
Nishith Reddy Mannuru,
Bing Nie,
Somipam Shimray,
Ziang Wang
Abstract:
This paper discusses OpenAI's ChatGPT, a generative pre-trained transformer, which uses natural language processing to fulfill text-based user requests (i.e., a chatbot). The history and principles behind ChatGPT and similar models are discussed. This technology is then discussed in relation to its potential impact on academia and scholarly research and publishing. ChatGPT is seen as a potential model for the automated preparation of essays and other types of scholarly manuscripts. Potential ethical issues that could arise with the emergence of large language models like GPT-3, the underlying technology behind ChatGPT, and its usage by academics and researchers, are discussed and situated within the context of broader advancements in artificial intelligence, machine learning, and natural language processing for research and scholarly publishing.
Submitted 31 March, 2023; v1 submitted 21 March, 2023;
originally announced March 2023.
-
Biomedical image analysis competitions: The state of current participation practice
Authors:
Matthias Eisenmann,
Annika Reinke,
Vivienn Weru,
Minu Dietlinde Tizabi,
Fabian Isensee,
Tim J. Adler,
Patrick Godau,
Veronika Cheplygina,
Michal Kozubek,
Sharib Ali,
Anubha Gupta,
Jan Kybic,
Alison Noble,
Carlos Ortiz de Solórzano,
Samiksha Pachade,
Caroline Petitjean,
Daniel Sage,
Donglai Wei,
Elizabeth Wilden,
Deepak Alapatt,
Vincent Andrearczyk,
Ujjwal Baid,
Spyridon Bakas,
Niranjan Balu,
Sophia Bano
, et al. (331 additional authors not shown)
Abstract:
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
Submitted 12 September, 2023; v1 submitted 16 December, 2022;
originally announced December 2022.
-
Leveraging Clinical Characteristics for Improved Deep Learning-Based Kidney Tumor Segmentation on CT
Authors:
Christina B. Lund,
Bas H. M. van der Velden
Abstract:
This paper assesses whether using clinical characteristics in addition to imaging can improve automated segmentation of kidney cancer on contrast-enhanced computed tomography (CT). A total of 300 kidney cancer patients with contrast-enhanced CT scans and clinical characteristics were included. A baseline segmentation of the kidney cancer was performed using a 3D U-Net. The inputs to the U-Net were the contrast-enhanced CT images; the outputs were segmentations of kidney, kidney tumors, and kidney cysts. A cognizant sampling strategy was used to leverage clinical characteristics for improved segmentation. To this end, a Least Absolute Shrinkage and Selection Operator (LASSO) was used. Segmentations were evaluated using Dice and Surface Dice. Improvement in segmentation was assessed using the Wilcoxon signed-rank test. The baseline 3D U-Net showed a segmentation performance of 0.90 for kidney and kidney masses, i.e., kidney, tumor, and cyst, 0.29 for kidney masses, and 0.28 for kidney tumor, while the 3D U-Net trained with cognizant sampling enhanced the segmentation performance and reached Dice scores of 0.90, 0.39, and 0.38 respectively. To conclude, the cognizant sampling strategy leveraging the clinical characteristics significantly improved kidney cancer segmentation.
Submitted 13 September, 2021;
originally announced September 2021.
-
Computational Complexity of Computing a Quasi-Proper Equilibrium
Authors:
Kristoffer Arnsfelt Hansen,
Troels Bjerre Lund
Abstract:
We study the computational complexity of computing or approximating a quasi-proper equilibrium for a given finite extensive form game of perfect recall. We show that the task of computing a symbolic quasi-proper equilibrium is $\mathrm{PPAD}$-complete for two-player games. For the case of zero-sum games we obtain a polynomial time algorithm based on Linear Programming. For general $n$-player games we show that computing an approximation of a quasi-proper equilibrium is $\mathrm{FIXP}_a$-complete.
Submitted 9 July, 2021;
originally announced July 2021.
-
Quantum Secured Internet Transport
Authors:
Bernardo Huberman,
Bob Lund,
Jing Wang
Abstract:
Quantum computing represents an emerging threat to the public key infrastructure underlying transport layer security (TLS) widely used in the Internet. This paper describes how QKD symmetric keys can be used with TLS to provide quantum-computing-resistant security for existing Internet applications. We also implement and test a general hybrid key delivery architecture with QKD over long-distance fibers between secure sites, and wireless key distribution over short distances within each site. Finally, we show how this same capability can be extended to a TLS cipher scheme with perfect security.
Submitted 10 July, 2020;
originally announced July 2020.
-
On the list recoverability of randomly punctured codes
Authors:
Ben Lund,
Aditya Potukuchi
Abstract:
We show that a random puncturing of a code with good distance is list recoverable beyond the Johnson bound. In particular, this implies that there are Reed-Solomon codes that are list recoverable beyond the Johnson bound. It was previously known that there are Reed-Solomon codes that do not have this property. As an immediate corollary to our main theorem, we obtain better degree bounds on unbalanced expanders that come from Reed-Solomon codes.
Submitted 3 July, 2020; v1 submitted 4 May, 2020;
originally announced May 2020.
-
A Quantum Router for the Entangled Web
Authors:
Bernardo A. Huberman,
Bob Lund
Abstract:
Qubit transmission protocols are presently point-to-point, and thus restrictive in their functionality. A quantum router is necessary for the quantum Internet to become a reality. We present a quantum router design based on teleportation, as well as mechanisms for entangled pair management. The prototype was validated using a quantum simulator.
Submitted 11 March, 2019;
originally announced March 2019.
-
A Pseudoline Counterexample to the Strong Dirac Conjecture
Authors:
Ben D. Lund,
George B. Purdy,
Justin W. Smith
Abstract:
We demonstrate an infinite family of pseudoline arrangements, in which an arrangement of n pseudolines has no member incident to more than 4n/9 points of intersection. This shows the "Strong Dirac" conjecture to be false for pseudolines.
We also raise a number of open problems relating to possible differences between the structure of incidences between points and lines versus the structure of incidences between points and pseudolines.
Submitted 11 January, 2014; v1 submitted 14 February, 2012;
originally announced February 2012.
-
Collinearities in Kinetic Point Sets
Authors:
Ben D. Lund,
George B. Purdy,
Justin W. Smith,
Csaba D. Tóth
Abstract:
Let $P$ be a set of $n$ points in the plane, each point moving along a given trajectory. A {\em $k$-collinearity} is a pair $(L,t)$ of a line $L$ and a time $t$ such that $L$ contains at least $k$ points at time $t$, the points along $L$ do not all coincide, and not all of them are collinear at all times. We show that, if the points move with constant velocity, then the number of 3-collinearities is at most $2\binom{n}{3}$, and this bound is tight. There are $n$ points having $Ω(n^3/k^4 + n^2/k^2)$ distinct $k$-collinearities. Thus, the number of $k$-collinearities among $n$ points, for constant $k$, is $O(n^3)$, and this bound is asymptotically tight. In addition, there are $n$ points, moving in pairwise distinct directions with different speeds, such that no three points are ever collinear.
Submitted 16 May, 2011;
originally announced May 2011.
-
A Bichromatic Incidence Bound and an Application
Authors:
Ben D. Lund,
George B. Purdy,
Justin W. Smith
Abstract:
We prove a new, tight upper bound on the number of incidences between points and hyperplanes in Euclidean d-space. Given n points, of which k are colored red, there are O_d(m^{2/3}k^{2/3}n^{(d-2)/3} + kn^{d-2} + m) incidences between the k red points and m hyperplanes spanned by all n points provided that m = Ω(n^{d-2}). For the monochromatic case k = n, this was proved by Agarwal and Aronov.
We use this incidence bound to prove that a set of n points, no more than n-k of which lie on any plane or two lines, spans Ω(nk^2) planes. We also provide an infinite family of counterexamples to a conjecture of Purdy's on the number of hyperplanes spanned by a set of points in dimensions higher than 3, and present new conjectures not subject to the counterexample.
Submitted 26 April, 2011; v1 submitted 19 June, 2010;
originally announced June 2010.
-
A Multi-Stage CUDA Kernel for Floyd-Warshall
Authors:
Ben Lund,
Justin W Smith
Abstract:
We present a new implementation of the Floyd-Warshall All-Pairs Shortest Paths algorithm on CUDA. Our algorithm runs approximately 5 times faster than the previously best reported algorithm. In order to achieve this speedup, we applied a new technique to reduce usage of on-chip shared memory and allow the CUDA scheduler to more effectively hide instruction latency.
Submitted 24 February, 2010; v1 submitted 22 January, 2010;
originally announced January 2010.