Search | arXiv e-print repository

eXpLogic: Explaining Logic Types and Patterns in DiffLogic Networks

Authors: Stephen Wormald, David Koblah, Matheus Kunzler Maldaner, Domenic Forte, Damon L. Woodard

Abstract: Constraining deep neural networks (DNNs) to learn individual logic types per node, as performed using the DiffLogic network architecture, opens the door to model-specific explanation techniques that quell the complexity inherent to DNNs. Inspired by principles of circuit analysis from computer engineering, this work presents an algorithm (eXpLogic) for producing saliency maps which explain input patterns that activate certain functions. The eXpLogic explanations: (1) show the exact set of inputs responsible for a decision, which helps interpret false negative and false positive predictions, (2) highlight common input patterns that activate certain outputs, and (3) help reduce the network size to improve class-specific inference. To evaluate the eXpLogic saliency map, we introduce a metric that quantifies how much an input changes before switching a model's class prediction (the SwitchDist) and use this metric to compare eXpLogic against the Vanilla Gradients (VG) and Integrated Gradient (IG) methods. Generally, we show that eXpLogic saliency maps are better at predicting which inputs will change the class score. These maps help reduce the network size and inference times by 87% and 8%, respectively, while having a limited impact (-3.8%) on class-specific predictions. The broader value of this work to machine learning is in demonstrating how certain DNN architectures promote explainability, which is relevant to healthcare, defense, and law.

Submitted 12 March, 2025; originally announced March 2025.

Comments: Conference submission, 6 pages, 2 figures

arXiv:2409.04934 [pdf, other]

doi 10.1109/ACCESS.2024.3494737

Maximizing Relation Extraction Potential: A Data-Centric Study to Unveil Challenges and Opportunities

Authors: Anushka Swarup, Avanti Bhandarkar, Olivia P. Dizon-Paradis, Ronald Wilson, Damon L. Woodard

Abstract: Relation extraction is a Natural Language Processing task that aims to extract relationships from textual data. It is a critical step for information extraction. Due to its wide-scale applicability, research in relation extraction has rapidly scaled to using highly advanced neural networks. Despite their computational superiority, modern relation extractors fail to handle complicated extraction scenarios. However, a comprehensive performance analysis of the state-of-the-art extractors that compiles these challenges has been missing from the literature, and this paper aims to bridge this gap. The goal has been to investigate the possible data-centric characteristics that impede neural relation extraction. Based on extensive experiments conducted using 15 state-of-the-art relation extraction algorithms ranging from recurrent architectures to large language models and seven large-scale datasets, this research suggests that modern relation extractors are not robust to complex data and relation characteristics. It emphasizes pivotal issues, such as contextual ambiguity, correlating relations, long-tail data, and fine-grained relation distributions. In addition, it sets a marker for future directions to alleviate these issues, thereby proving to be a critical resource for novice and advanced researchers. Efficient handling of the challenges described can have significant implications for the field of information extraction, which is a critical part of popular systems such as search engines and chatbots. Data and relevant code can be found at https://aaig.ece.ufl.edu/projects/relation-extraction.

Submitted 25 November, 2024; v1 submitted 7 September, 2024; originally announced September 2024.

Comments: This work has been published to the IEEE Access (2024)

arXiv:2407.17870 [pdf, other]

Is the Digital Forensics and Incident Response Pipeline Ready for Text-Based Threats in LLM Era?

Authors: Avanti Bhandarkar, Ronald Wilson, Anushka Swarup, Mengdi Zhu, Damon Woodard

Abstract: In the era of generative AI, the widespread adoption of Neural Text Generators (NTGs) presents new cybersecurity challenges, particularly within the realms of Digital Forensics and Incident Response (DFIR). These challenges primarily involve the detection and attribution of sources behind advanced attacks like spearphishing and disinformation campaigns. As NTGs evolve, the task of distinguishing between human and NTG-authored texts becomes critically complex. This paper rigorously evaluates the DFIR pipeline tailored for text-based security systems, specifically focusing on the challenges of detecting and attributing authorship of NTG-authored texts. By introducing a novel human-NTG co-authorship text attack, termed CS-ACT, our study uncovers significant vulnerabilities in traditional DFIR methodologies, highlighting discrepancies between ideal scenarios and real-world conditions. Utilizing 14 diverse datasets and 43 unique NTGs, up to the latest GPT-4, our research identifies substantial vulnerabilities in the forensic profiling phase, particularly in attributing authorship to NTGs. Our comprehensive evaluation points to factors such as model sophistication and the lack of distinctive style within NTGs as significant contributors to these vulnerabilities. Our findings underscore the necessity for more sophisticated and adaptable strategies, such as incorporating adversarial learning, stylizing NTGs, and implementing hierarchical attribution through the mapping of NTG lineages to enhance source attribution. This sets the stage for future research and the development of more resilient text-based security systems.

Submitted 25 July, 2024; originally announced July 2024.

Comments: This work has been submitted to the IEEE for possible publication

arXiv:2402.13244 [pdf, other]

Are Fact-Checking Tools Helpful? An Exploration of the Usability of Google Fact Check

Authors: Qiangeng Yang, Tess Christensen, Shlok Gilda, Juliana Fernandes, Daniela Oliveira, Ronald Wilson, Damon Woodard

Abstract: Fact-checking-specific search tools such as Google Fact Check are a promising way to combat misinformation on social media, especially during events bringing significant social influence, such as the COVID-19 pandemic and the U.S. presidential elections. However, the usability of such an approach has not been thoroughly studied. We evaluated the performance of Google Fact Check by analyzing the retrieved fact-checking results regarding 1,000 COVID-19-related false claims and found it able to retrieve the fact-checking results for 15.8% of the input claims, and the rendered results are relatively reliable. We also found that the false claims receiving different fact-checking verdicts (i.e., "False," "Partly False," "True," and "Unratable") tend to reflect diverse emotional tones, and fact-checking sources tend to check the claims in different lengths and using dictionary words to various extents. Claim variations addressing the same issue yet described differently are likely to retrieve distinct fact-checking results. We suggest that the quantities of the retrieved fact-checking results could be optimized and that slightly adjusting input wording may be the best practice for users to retrieve more useful information. This study aims to contribute to the understanding of state-of-the-art fact-checking tools and information integrity.

Submitted 24 May, 2025; v1 submitted 20 February, 2024; originally announced February 2024.

Comments: Accepted and presented at the 5th EAI International Conference on Data and Information in Online Environments (EAI DIONE 2024)

arXiv:2401.06293 [pdf, other]

MultiSlot ReRanker: A Generic Model-based Re-Ranking Framework in Recommendation Systems

Authors: Qiang Charles Xiao, Ajith Muralidharan, Birjodh Tiwana, Johnson Jia, Fedor Borisyuk, Aman Gupta, Dawn Woodard

Abstract: In this paper, we propose a generic model-based re-ranking framework, MultiSlot ReRanker, which simultaneously optimizes relevance, diversity, and freshness. Specifically, our Sequential Greedy Algorithm (SGA) is efficient enough (linear time complexity) for large-scale production recommendation engines. It achieved a lift of +6% to +10% offline Area Under the receiver operating characteristic Curve (AUC), which is mainly due to explicitly modeling mutual influences among items of a list and leveraging the second-pass ranking scores of multiple objectives. In addition, we have generalized the offline replay theory to multi-slot re-ranking scenarios, with trade-offs among multiple objectives. The offline replay results can be further improved by Pareto Optimality. Moreover, we've built a multi-slot re-ranking simulator based on OpenAI Gym integrated with the Ray framework. It can be easily configured for different assumptions to quickly benchmark both reinforcement learning and supervised learning algorithms.

Submitted 11 January, 2024; originally announced January 2024.

Comments: 10 pages

arXiv:2305.03699 [pdf, ps, other]

Multimodal User Authentication in Smart Environments: Survey of User Attitudes

Authors: Aishat Aloba, Sarah Morrison-Smith, Aaliyah Richlen, Kimberly Suarez, Yu-Peng Chen, Shaghayegh Esmaeili, Damon L. Woodard, Jaime Ruiz, Lisa Anthony

Abstract: As users shift from interacting actively with devices with screens to interacting seamlessly with smart environments, novel models of user authentication will be needed to maintain the security and privacy of user data. To understand users' attitudes toward new models of authentication (e.g., voice recognition), we surveyed 117 Amazon Turk workers and 43 computer science students about their authentication preferences, in contexts when others are present and different usability metrics. Our users placed less trust in natural authentication modalities (e.g., body gestures) than traditional modalities (e.g., passwords) due to concerns about accuracy or security. Users were also not as willing to use natural authentication modalities except in the presence of people they trust due to risk of exposure and feelings of awkwardness. We discuss the implications for designing natural multimodal authentication and explore the design space around users' current mental models for the future of secure and usable smart technology.

Submitted 23 May, 2023; v1 submitted 5 May, 2023; originally announced May 2023.

Comments: 23 pages, 4 figures

arXiv:2206.10706

TraSE: Towards Tackling Authorial Style from a Cognitive Science Perspective

Authors: Ronald Wilson, Avanti Bhandarkar, Damon Woodard

Abstract: Stylistic analysis of text is a key task in research areas ranging from authorship attribution to forensic analysis and personality profiling. The existing approaches for stylistic analysis are plagued by issues like topic influence, lack of discriminability for a large number of authors, and the requirement for large amounts of diverse data. In this paper, the sources of these issues are identified along with the necessity for a cognitive perspective on authorial style in addressing them. A novel feature representation, called Trajectory-based Style Estimation (TraSE), is introduced to support this purpose. Authorship attribution experiments with over 27,000 authors and 1.4 million samples in a cross-domain scenario resulted in 90% attribution accuracy, suggesting that the feature representation is immune to such negative influences and an excellent candidate for stylistic analysis. Finally, a qualitative analysis is performed on TraSE using physical human characteristics, like age, to validate its claim on capturing cognitive traits.

Submitted 5 December, 2023; v1 submitted 21 June, 2022; originally announced June 2022.

Comments: Experimental results in the paper are incorrectly reported due to an unforeseen glitch in the software prototype. The paper and its findings are withdrawn

arXiv:2204.09579 [pdf, other]

A Survey and Perspective on Artificial Intelligence for Security-Aware Electronic Design Automation

Authors: David Selasi Koblah, Rabin Yu Acharya, Daniel Capecci, Olivia P. Dizon-Paradis, Shahin Tajik, Fatemeh Ganji, Damon L. Woodard, Domenic Forte

Abstract: Artificial intelligence (AI) and machine learning (ML) techniques have been increasingly used in several fields to improve performance and the level of automation. In recent years, this use has exponentially increased due to the advancement of high-performance computing and the ever-increasing size of data. One such field is hardware design, specifically the design of digital and analog integrated circuits (ICs), where AI/ML techniques have been extensively used to address ever-increasing design complexity, aggressive time-to-market, and the growing number of ubiquitous interconnected devices (IoT). However, the security concerns and issues related to IC design have been highly overlooked. In this paper, we summarize the state-of-the-art in AI/ML for circuit design/optimization, security and engineering challenges, research in security-aware CAD/EDA, and future research directions and needs for using AI/ML for security-aware circuit design.

Submitted 20 April, 2022; v1 submitted 19 April, 2022; originally announced April 2022.

arXiv:2202.08414 [pdf, other]

doi 10.1145/3588032

FPIC: A Novel Semantic Dataset for Optical PCB Assurance

Authors: Nathan Jessurun, Olivia P. Dizon-Paradis, Jacob Harrison, Shajib Ghosh, Mark M. Tehranipoor, Damon L. Woodard, Navid Asadizanjani

Abstract: Outsourced printed circuit board (PCB) fabrication necessitates increased hardware assurance capabilities. Several assurance techniques based on automated optical inspection (AOI) have been proposed that leverage PCB images acquired using digital cameras. We review state-of-the-art AOI techniques and observe a strong, rapid trend toward machine learning (ML) solutions. These require significant amounts of labeled ground truth data, which is lacking in the publicly available PCB data space. We contribute the FICS PCB Image Collection (FPIC) dataset to address this need. Additionally, we outline new hardware security methodologies enabled by our data set.

Submitted 14 March, 2023; v1 submitted 16 February, 2022; originally announced February 2022.

Comments: Dataset is available at https://www.trust-hub.org/#/data/pcb-images ; Submitted to ACM JETC in Feb 2022; Accepted February 2023

arXiv:2006.04029 [pdf]

Ethics, Data Science, and Health and Human Services: Embedded Bias in Policy Approaches to Teen Pregnancy Prevention

Authors: Davon Woodard, Huthaifa I. Ashqar, Taoran Ji

Abstract: Background: This study aims to evaluate the Chicago Teen Pregnancy Prevention Initiative delivery optimization outcomes given policy-neutral and policy-focused approaches to deliver this program to at-risk teens across the City of Chicago. Methods: We collect and compile several datasets from public sources including: Chicago Department of Public Health clinic locations, two public health statistics datasets, census data of Chicago, and a list of Chicago public high schools and their locations. Our policy-neutral approach will consist of an equal distribution of funds and resources to schools and centers, regardless of past trends and outcomes. The policy-focused approaches will evaluate two models: first, a funding model based on prediction models from historical data; and second, a funding model based on economic and social outcomes for communities. Results: Results of this study confirm our initial hypothesis that, even though the models are optimized from a machine learning perspective, it is still possible that the models will produce wildly different results in real-world application. Conclusions: When ethics and ethical considerations are extended beyond algorithmic optimization to encompass output and societal optimization, the foundation and philosophical grounding of the decision-making process become even more critical in the knowledge discovery process.

Submitted 6 June, 2020; originally announced June 2020.

Comments: Submitted to the Health Policy Open journal

arXiv:2004.13874 [pdf, other]

Histogram-based Auto Segmentation: A Novel Approach to Segmenting Integrated Circuit Structures from SEM Images

Authors: Ronald Wilson, Navid Asadizanjani, Domenic Forte, Damon L. Woodard

Abstract: In the Reverse Engineering and Hardware Assurance domain, a majority of the data acquisition is done through electron microscopy techniques such as Scanning Electron Microscopy (SEM). However, unlike its counterparts in optical imaging, only a limited number of techniques are available to enhance and extract information from the raw SEM images. In this paper, we introduce an algorithm to segment out Integrated Circuit (IC) structures from the SEM image. Unlike existing algorithms discussed in this paper, this algorithm is unsupervised, parameter-free and does not require prior information on the noise model or features in the target image making it effective in low quality image acquisition scenarios as well. Furthermore, the results from the application of the algorithm on various structures and layers in the IC are reported and discussed.

Submitted 28 April, 2020; originally announced April 2020.

arXiv:2002.04210 [pdf, other]

Hardware Trust and Assurance through Reverse Engineering: A Survey and Outlook from Image Analysis and Machine Learning Perspectives

Authors: Ulbert J. Botero, Ronald Wilson, Hangwei Lu, Mir Tanjidur Rahman, Mukhil A. Mallaiyan, Fatemeh Ganji, Navid Asadizanjani, Mark M. Tehranipoor, Damon L. Woodard, Domenic Forte

Abstract: In the context of hardware trust and assurance, reverse engineering has been often considered as an illegal action. Generally speaking, reverse engineering aims to retrieve information from a product, i.e., integrated circuits (ICs) and printed circuit boards (PCBs) in hardware security-related scenarios, in the hope of understanding the functionality of the device and determining its constituent components. Hence, it can raise serious issues concerning Intellectual Property (IP) infringement, the (in)effectiveness of security-related measures, and even new opportunities for injecting hardware Trojans. Ironically, reverse engineering can enable IP owners to verify and validate the design. Nevertheless, this cannot be achieved without overcoming numerous obstacles that limit successful outcomes of the reverse engineering process. This paper surveys these challenges from two complementary perspectives: image processing and machine learning. These two fields of study form a firm basis for the enhancement of efficiency and accuracy of reverse engineering processes for both PCBs and ICs. In summary, therefore, this paper presents a roadmap indicating clearly the actions to be taken to fulfill hardware trust and assurance objectives.

Submitted 7 April, 2021; v1 submitted 11 February, 2020; originally announced February 2020.

Comments: It is essential not to reduce the size of the figures as high quality ones are required to discuss the image processing algorithms and methods

arXiv:1804.07651 [pdf]

Approaches to Enhancing Cyber Resilience: Report of the North Atlantic Treaty Organization (NATO) Workshop IST-153

Authors: Alexander Kott, Benjamin Blakely, Diane Henshel, Gregory Wehner, James Rowell, Nathaniel Evans, Luis Muñoz-González, Nandi Leslie, Donald W French, Donald Woodard, Kerry Krutilla, Amanda Joyce, Igor Linkov, Carmen Mas-Machuca, Janos Sztipanovits, Hugh Harney, Dennis Kergl, Perri Nejib, Edward Yakabovicz, Steven Noel, Tim Dudman, Pierre Trepagnier, Sowdagar Badesha, Alfred Møller

Abstract: This report summarizes the discussions and findings of the 2017 North Atlantic Treaty Organization (NATO) Workshop, IST-153, on Cyber Resilience, held in Munich, Germany, on 23-25 October 2017, at the University of Bundeswehr. Despite continual progress in managing risks in the cyber domain, anticipation and prevention of all possible attacks and malfunctions are not feasible for the current or future systems comprising the cyber infrastructure. Therefore, interest in cyber resilience (as opposed to merely risk-based approaches) is increasing rapidly, in literature and in practice. Unlike concepts of risk or robustness - which are often and incorrectly conflated with resilience - resiliency refers to the system's ability to recover or regenerate its performance to a sufficient level after an unexpected impact produces a degradation of its performance. The exact relation among resilience, risk, and robustness has not been well articulated technically. The presentations and discussions at the workshop yielded this report. It focuses on the following topics that the participants of the workshop saw as particularly important: fundamental properties of cyber resilience; approaches to measuring and modeling cyber resilience; mission modeling for cyber resilience; systems engineering for cyber resilience, and dynamic defense as a path toward cyber resilience.

Submitted 20 April, 2018; originally announced April 2018.

Report number: ARL-SR-0396

arXiv:1803.09710 [pdf, other]

Secure and Reliable Biometric Access Control for Resource-Constrained Systems and IoT

Authors: Nima Karimian, Zimu Guo, Fatemeh Tehranipoor, Damon Woodard, Mark Tehranipoor, Domenic Forte

Abstract: With the emergence of the Internet-of-Things (IoT), there is a growing need for access control and data protection on low-power, pervasive devices. Biometric-based authentication is promising for IoT due to its convenient nature and lower susceptibility to attacks. However, the costs associated with biometric processing and template protection are nontrivial for smart cards, key fobs, and so forth. In this paper, we discuss the security, cost, and utility of biometric systems and develop two major frameworks for improving them. First, we introduce a new framework for implementing biometric systems based on physical unclonable functions (PUFs) and hardware obfuscation that, unlike traditional software approaches, does not require nonvolatile storage of a biometric template/key. Aside from reducing the risk of compromising the biometric, the nature of obfuscation also provides protection against access control circumvention via malware and fault injection. The PUF provides non-invertibility and non-linkability. Second, a major requirement of the proposed PUF/obfuscation approach is that a reliable (robust) key be generated from the user's input biometric. We propose a noise-aware biometric quantization framework capable of generating unique, reliable keys with reduced enrollment time and denoising costs. Finally, we conduct several case studies. In the first, the proposed noise-aware approach is compared to our previous approach for multiple biometric modalities, including popular ones (fingerprint and iris) and emerging cardiovascular ones (ECG and PPG). The results show that ECG provides the best tradeoff between reliability, key length, entropy, and cost. In the second and third case studies, we demonstrate how reliability, denoising costs, and enrollment times can be simultaneously improved by modeling subject intra-variations for ECG.

Submitted 26 March, 2018; originally announced March 2018.

Comments: 11 pages, 9 figures

Showing 1–14 of 14 results for author: Woodard, D