Search | arXiv e-print repository

Enhancing tutoring systems by leveraging tailored promptings and domain knowledge with Large Language Models

Authors: Mohsen Balavar, Wenli Yang, David Herbert, Soonja Yeom

Abstract: Recent advancements in artificial intelligence (AI) and machine learning have reignited interest in their impact on Computer-based Learning (CBL). AI-driven tools like ChatGPT and Intelligent Tutoring Systems (ITS) have enhanced learning experiences through personalisation and flexibility. ITSs can adapt to individual learning needs and provide customised feedback based on a student's performance, cognitive state, and learning path. Despite these advances, challenges remain in accommodating diverse learning styles and delivering real-time, context-aware feedback. Our research aims to address these gaps by integrating skill-aligned feedback via Retrieval Augmented Generation (RAG) into prompt engineering for Large Language Models (LLMs) and developing an application to enhance learning through personalised tutoring in a computer science programming context. The pilot study evaluated a proposed system using three quantitative metrics: readability score, response time, and feedback depth, across three programming tasks of varying complexity. The system successfully sorted simulated students into three skill-level categories and provided context-aware feedback. This targeted approach demonstrated better effectiveness and adaptability compared to general methods.

Submitted 1 May, 2025; originally announced May 2025.

arXiv:2412.02344 [pdf, other]

UniForm: A Reuse Attention Mechanism Optimized for Efficient Vision Transformers on Edge Devices

Authors: Seul-Ki Yeom, Tae-Ho Kim

Abstract: Transformer-based architectures have demonstrated remarkable success across various domains, but their deployment on edge devices remains challenging due to high memory and computational demands. In this paper, we introduce a novel Reuse Attention mechanism, tailored for efficient memory access and computational optimization, enabling seamless operation on resource-constrained platforms without compromising performance. Unlike traditional multi-head attention (MHA), which redundantly computes separate attention matrices for each head, Reuse Attention consolidates these computations into a shared attention matrix, significantly reducing memory overhead and computational complexity. Comprehensive experiments on ImageNet-1K and downstream tasks show that the proposed UniForm models leveraging Reuse Attention achieve state-of-the-art ImageNet classification accuracy while outperforming existing attention mechanisms, such as Linear Attention and Flash Attention, in inference speed and memory scalability. Notably, UniForm-l achieves a 76.7% Top-1 accuracy on ImageNet-1K with 21.8ms inference time on edge devices like the Jetson AGX Orin, representing up to a 5x speedup over competing benchmark methods. These results demonstrate the versatility of Reuse Attention across high-performance GPUs and edge platforms, paving the way for broader real-time applications.

Submitted 3 December, 2024; originally announced December 2024.

Comments: 13 Pages, 8 Tables, 7 Figures
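
The contrast the UniForm abstract draws, one attention matrix shared by all heads instead of one matrix per head, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names (`shared_attention`, `reuse_attention`), the scaled dot-product form, and the choice to share a single query/key pair across heads are assumptions made only for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply matrix a (n x d) by matrix b (d x m), both as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def shared_attention(q, k):
    """Compute a single (n x n) scaled dot-product attention matrix."""
    d = len(q[0])
    k_t = [list(col) for col in zip(*k)]  # transpose k to (d x n)
    scores = matmul(q, k_t)
    return [softmax([s / math.sqrt(d) for s in row]) for row in scores]


def reuse_attention(q, k, values_per_head):
    """Hypothetical sketch: compute the attention matrix once, reuse it per head."""
    attn = shared_attention(q, k)  # computed a single time
    return [matmul(attn, v) for v in values_per_head]  # applied to each head's values


# Example: two tokens, two heads with one value channel each.
q = [[1.0, 0.0], [0.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
heads = [[[1.0], [2.0]], [[3.0], [4.0]]]
outputs = reuse_attention(q, k, heads)
```

In standard multi-head attention the softmax step would run once per head; here it runs once in total, which is the source of the memory and compute savings the abstract describes.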

arXiv:2407.17609 [pdf]

Controlling structural phases of Sn through lattice engineering

Authors: Chandima Kasun Edirisinghe, Anjali Rathore, Taegeon Lee, Daekwon Lee, An-Hsi Chen, Garrett Baucom, Eitan Hershkovitz, Anuradha Wijesinghe, Pradip Adhikari, Sinchul Yeom, Hong Seok Lee, Hyung-Kook Choi, Hyunsoo Kim, Mina Yoon, Honggyu Kim, Matthew Brahlek, Heesuk Rho, Joon Sue Lee

Abstract: Topology and superconductivity, two distinct phenomena, offer unique insight into quantum properties and their applications in quantum technologies, spintronics, and sustainable energy technologies if a system can be found where they coexist. Tin (Sn) plays a pivotal role here as an element due to its two structural phases, $α$-Sn and $β$-Sn, exhibiting topological characteristics ($α$-Sn) and superconductivity ($β$-Sn). In this study we show how precise control of $α$ and $β$ phases of Sn thin films can be achieved by using molecular beam epitaxy grown buffer layers with systematic control over the lattice parameter. The resulting Sn films showed either $β$-Sn or $α$-Sn phases as the lattice constant of the buffer layer was varied from 6.10 Å to 6.48 Å, covering the range between GaSb (closely matched to InAs) and InSb. The crystal structures of the $α$- and $β$-Sn films were characterized by x-ray diffraction and confirmed by Raman spectroscopy and scanning transmission electron microscopy. The smooth and continuous surface morphology of the Sn films was validated using atomic force microscopy. The characteristics of $α$- and $β$-Sn phases were further verified using electrical transport measurements by observing a resistance drop near 3.7 K for superconductivity of the $β$-Sn phase and Shubnikov-de Haas oscillations for the $α$-Sn phase. Density functional theory calculations showed that the stability of the Sn phases is highly dependent on lattice strain, with $α$-Sn being more stable under tensile strain and $β$-Sn becoming favorable under compressive strain, which is in good agreement with experimental observations. Hence, this study sheds light on controlling Sn phases through lattice engineering, enabling innovative applications in quantum technologies and beyond.

Submitted 21 September, 2024; v1 submitted 24 July, 2024; originally announced July 2024.

arXiv:2312.06272 [pdf, other]

U-MixFormer: UNet-like Transformer with Mix-Attention for Efficient Semantic Segmentation

Authors: Seul-Ki Yeom, Julian von Klitzing

Abstract: Semantic segmentation has witnessed remarkable advancements with the adaptation of the Transformer architecture. Parallel to the strides made by the Transformer, CNN-based U-Net has seen significant progress, especially in high-resolution medical imaging and remote sensing. This dual success inspired us to merge the strengths of both, leading to the inception of a U-Net-based vision transformer decoder tailored for efficient contextual encoding. Here, we propose a novel transformer decoder, U-MixFormer, built upon the U-Net structure, designed for efficient semantic segmentation. Our approach distinguishes itself from the previous transformer methods by leveraging lateral connections between the encoder and decoder stages as feature queries for the attention modules, apart from the traditional reliance on skip connections. Moreover, we innovatively mix hierarchical feature maps from various encoder and decoder stages to form a unified representation for keys and values, giving rise to our unique mix-attention module. Our approach demonstrates state-of-the-art performance across various configurations. Extensive experiments show that U-MixFormer outperforms SegFormer, FeedFormer, and SegNeXt by a large margin. For example, U-MixFormer-B0 surpasses SegFormer-B0 and FeedFormer-B0 with 3.8% and 2.0% higher mIoU and 27.3% and 21.8% less computation, and outperforms SegNeXt with 3.3% higher mIoU with the MSCAN-T encoder on ADE20K. Code available at https://github.com/julian-klitzing/u-mixformer.

Submitted 11 December, 2023; originally announced December 2023.

Comments: 8 Pages, 6 Tables, 6 Figures

arXiv:2310.11424 [pdf]

Theoretical investigation of delafossite-Cu2ZnSnO4 as a promising photovoltaic absorber

Authors: Seoung-Hun Kang, Myeongjun Kang, Sang Woon Hwang, Sinchul Yeom, Mina Yoon, Jong Mok Ok, Sangmoon Yoon

Abstract: In the quest for efficient and cost-effective photovoltaic absorber materials beyond silicon, considerable attention has been directed toward exploring alternatives. One such material, zincblende-derived Cu2ZnSnS4 (CZTS), has shown promise due to its ideal band-gap size and high absorption coefficient. However, challenges such as structural defects and secondary phase formation have hindered its development. In this study, we examine the potential of another compound, Cu2ZnSnO4 (CZTO), with a similar composition to CZTS, as a promising alternative. Employing ab initio density functional theory (DFT) calculations in combination with an evolutionary structure prediction algorithm, we identify that the crystalline phase of the delafossite structure is the most stable among the 900 (meta)stable CZTO. Its thermodynamic stability at room temperature is also confirmed by the molecular dynamics study. Excitingly, this new phase of CZTO displays a direct band gap where the dipole-allowed transition occurs, making it a strong candidate for efficient light absorption. Furthermore, the estimation of spectroscopic limited maximum efficiency (SLME) directly demonstrates the high potential of delafossite-CZTO as a photovoltaic absorber. Our numerical results suggest that delafossite-CZTO holds another promise for future photovoltaic applications.

Submitted 17 October, 2023; originally announced October 2023.

arXiv:2307.05764 [pdf]

doi 10.1016/j.carbon.2022.09.006

The role of temperature on defect diffusion and nanoscale patterning in graphene

Authors: Ondrej Dyck, Sinchul Yeom, Sarah Dillender, Andrew R. Lupini, Mina Yoon, Stephen Jesse

Abstract: Graphene is of great scientific interest due to a variety of unique properties such as ballistic transport, spin selectivity, the quantum Hall effect, and other quantum properties. Nanopatterning and atomic scale modifications of graphene are expected to enable further control over its intrinsic properties, providing ways to tune the electronic properties through geometric and strain effects, introduce edge states and other local or extended topological defects, and sculpt circuit paths. The focused beam of a scanning transmission electron microscope (STEM) can be used to remove atoms, enabling milling, doping, and deposition. Utilization of a STEM as an atomic scale fabrication platform is increasing; however, a detailed understanding of beam-induced processes and the subsequent cascade of aftereffects is lacking. Here, we examine the electron beam effects on atomically clean graphene at a variety of temperatures ranging from 400 to 1000 °C. We find that temperature plays a significant role in the milling rate and moderates competing processes of carbon adatom coalescence, graphene healing, and the diffusion (and recombination) of defects. The results of this work can be applied to a wider range of 2D materials and introduce better understanding of defect evolution in graphite and other bulk layered materials.

Submitted 11 July, 2023; originally announced July 2023.

arXiv:2305.07025 [pdf, other]

doi 10.1063/5.0159670

Structural Anisotropy in Sb Thin Films

Authors: Pradip Adhikari, Anuradha Wijesinghe, Anjali Rathore, Timothy Jinsoo Yoo, Gyehyeon Kim, Hyoungtaek Lee, Sinchul Yeom, Alessandro R. Mazza, Changhee Sohn, Hyeong-Ryeol Park, Mina Yoon, Matthew Brahlek, Honggyu Kim, Joon Sue Lee

Abstract: Sb thin films have attracted wide interest due to their tunable band structure, topological phases, and remarkable electronic properties. We successfully grow epitaxial Sb thin films on a closely lattice-matched GaSb(001) surface by molecular beam epitaxy. We find a novel anisotropic directional dependence of their structural, morphological, and electronic properties. The origin of the anisotropic features is elucidated using first-principles density functional theory (DFT) calculations. The growth regime of crystalline and amorphous Sb thin films was determined by mapping the surface reconstruction phase diagram of the GaSb(001) surface under Sb$_2$ flux, with confirmation of structural characterizations. Crystalline Sb thin films show a rhombohedral crystal structure along the rhombohedral (104) surface orientation parallel to the cubic (001) surface orientation of the GaSb substrate. At this coherent interface, Sb atoms are aligned with the GaSb lattice along the [1-10] crystallographic direction but are not aligned well along the [110] crystallographic direction, which results in anisotropic features in reflection high-energy electron diffraction patterns, surface morphology, and transport properties. Our DFT calculations show that the anisotropic features originate from the GaSb surface, where Sb atoms align with the Ga and Sb atoms on the reconstructed surface. The formation energy calculations confirm the stability of the experimentally observed structures. Our results provide optimal film growth conditions for further studies of novel properties of Bi$_{1-x}$Sb$_x$ thin films with similar lattice parameters and an identical crystal structure as well as functional heterostructures of them with III-V semiconductor layers along the (001) surface orientation, supported by a theoretical understanding of the anisotropic film orientation.

Submitted 11 May, 2023; originally announced May 2023.

Journal ref: APL Mater. 12, 011116 (2024)

arXiv:2303.01627 [pdf, other]

A New Two-Dimensional Dirac Semimetal Based on the Alkaline Earth Metal, CaP$_3$

Authors: Seoung-Hun Kang, Wei Luo, Sinchul Yeom, Yaling Zheng, Mina Yoon

Abstract: Using an evolutionary algorithm in combination with first-principles density functional theory calculations, we identify two-dimensional (2D) CaP$_3$ monolayer as a new Dirac semimetal due to inversion and nonsymmorphic spatial symmetries of the structure. This new topological material, composed of light elements, exhibits high structural stability (higher than the phase known in the literature), which is confirmed by thermodynamic and kinetic stability analysis. Moreover, it satisfies the electron filling criteria, so that its Dirac state is located near the Fermi level. The existence of the Dirac state predicted by the theoretical symmetry analysis is also confirmed by first-principles electronic band structure calculations. We find that the energy position of the Dirac state can be tuned by strain, while the Dirac state is unstable against an external electric field since it breaks the spatial inversion symmetry. Our findings should be instrumental in the development of 2D Dirac fermions based on light elements for their application in nanoelectronic devices and topological electronics.

Submitted 2 March, 2023; originally announced March 2023.

arXiv:2301.01674 [pdf]

doi 10.1002/adma.202302906

Top-down fabrication of atomic patterns in twisted bilayer graphene

Authors: Ondrej Dyck, Sinchul Yeom, Andrew R. Lupini, Jacob L. Swett, Dale Hensley, Mina Yoon, Stephen Jesse

Abstract: Atomic-scale engineering typically involves bottom-up approaches, leveraging parameters such as temperature, partial pressures, and chemical affinity to promote spontaneous arrangement of atoms. These parameters are applied globally, resulting in atomic scale features scattered probabilistically throughout the material. In a top-down approach, different regions of the material are exposed to different parameters resulting in structural changes varying on the scale of the resolution. In this work, we combine the application of global and local parameters in an aberration corrected scanning transmission electron microscope (STEM) to demonstrate atomic scale precision patterning of atoms in twisted bilayer graphene. The focused electron beam is used to define attachment points for foreign atoms through the controlled ejection of carbon atoms from the graphene lattice. The sample environment is staged with nearby source materials, such that the sample temperature can induce migration of the source atoms across the sample surface. Under these conditions, the electron-beam (top-down) enables carbon atoms in the graphene to be replaced spontaneously by diffusing adatoms (bottom-up). Using image-based feedback-control, arbitrary patterns of atoms and atom clusters are attached to the twisted bilayer graphene with limited human interaction. The role of substrate temperature on adatom and vacancy diffusion is explored by first-principles simulations.

Submitted 4 January, 2023; originally announced January 2023.

arXiv:2209.03620 [pdf, other]

Black-Box Audits for Group Distribution Shifts

Authors: Marc Juarez, Samuel Yeom, Matt Fredrikson

Abstract: When a model informs decisions about people, distribution shifts can create undue disparities. However, it is hard for external entities to check for distribution shift, as the model and its training set are often proprietary. In this paper, we introduce and study a black-box auditing method to detect cases of distribution shift that lead to a performance disparity of the model across demographic groups. By extending techniques used in membership and property inference attacks, which are designed to expose private information from learned models, we demonstrate that an external auditor can gain the information needed to identify these distribution shifts solely by querying the model. Our experimental results on real-world datasets show that this approach is effective, achieving 80–100% AUC-ROC in detecting shifts involving the underrepresentation of a demographic group in the training set. Researchers and investigative journalists can use our tools to perform non-collaborative audits of proprietary models and expose cases of underrepresentation in the training datasets.

Submitted 8 September, 2022; originally announced September 2022.

arXiv:2202.01891 [pdf, other]

Weighted Isolation and Random Cut Forest Algorithms for Anomaly Detection

Authors: Sijin Yeom, Jae-Hun Jung

Abstract: Random cut forest (RCF) algorithms have been developed for anomaly detection, particularly in time series data. The RCF algorithm is an improved version of the isolation forest (IF) algorithm. Unlike the IF algorithm, the RCF algorithm can determine whether real-time input contains an anomaly by inserting the input into the constructed tree network. Various RCF algorithms, including Robust RCF (RRCF), have been developed, where the cutting procedure is adaptively chosen probabilistically. The RRCF algorithm demonstrates better performance than the IF algorithm, as dimension cuts are decided based on the geometric range of the data, whereas the IF algorithm randomly chooses dimension cuts. However, the overall data structure is not considered in either IF or RRCF, given that split values are chosen randomly. In this paper, we propose new IF and RCF algorithms, referred to as the weighted IF (WIF) and weighted RCF (WRCF) algorithms, respectively. Their split values are determined by considering the density of the given data. To introduce the WIF and WRCF, we first present a new geometric measure, a density measure, which is crucial for constructing the WIF and WRCF. We provide various mathematical properties of the density measure, accompanied by theorems that support and validate our claims through numerical examples.

Submitted 8 January, 2024; v1 submitted 1 February, 2022; originally announced February 2022.

Comments: 45 pages, 28 figures
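
The difference the abstract describes between IF and RRCF cut selection can be sketched as follows. In this minimal illustration, IF picks the cut dimension uniformly at random while RRCF picks it with probability proportional to each dimension's range; in both, the split value is drawn uniformly within that range. The function names are hypothetical, and the sketch does not implement the paper's density-weighted WIF/WRCF split, whose density measure is not specified in the abstract.

```python
import random


def feature_ranges(points):
    """Range (max - min) of the data along each dimension."""
    dims = range(len(points[0]))
    return [max(p[i] for p in points) - min(p[i] for p in points) for i in dims]


def cut_dimension_probabilities(points, range_weighted):
    """Probability of cutting along each dimension.

    range_weighted=False: uniform over dimensions (IF-style).
    range_weighted=True: proportional to each dimension's range (RRCF-style).
    """
    ranges = feature_ranges(points)
    n = len(ranges)
    total = sum(ranges)
    if not range_weighted or total == 0:
        return [1.0 / n] * n
    return [r / total for r in ranges]


def random_cut(points, range_weighted, rng):
    """Draw a (dimension, split value) pair; the split value is uniform in range."""
    probs = cut_dimension_probabilities(points, range_weighted)
    dim = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    return dim, rng.uniform(lo, hi)


# Example: dimension 0 spans 10 units, dimension 1 spans 1 unit.
points = [(0.0, 0.0), (10.0, 1.0)]
if_probs = cut_dimension_probabilities(points, range_weighted=False)
rrcf_probs = cut_dimension_probabilities(points, range_weighted=True)
dim, value = random_cut(points, range_weighted=True, rng=random.Random(0))
```

Because the split value is uniform in both variants, neither looks at where the data is dense within the range, which is the gap the weighted variants in the abstract aim to address.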

arXiv:2111.09635 [pdf, other]

Automatic Neural Network Pruning that Efficiently Preserves the Model Accuracy

Authors: Thibault Castells, Seul-Ki Yeom

Abstract: Neural network performance has been significantly improved in the last few years, at the cost of an increasing number of floating point operations (FLOPs). However, more FLOPs can be an issue when computational resources are limited. As an attempt to solve this problem, pruning filters is a common solution, but most existing pruning methods do not preserve the model accuracy efficiently and therefore require a large number of finetuning epochs. In this paper, we propose an automatic pruning method that learns which neurons to preserve in order to maintain the model accuracy while reducing the FLOPs to a predefined target. To accomplish this task, we introduce a trainable bottleneck that only requires one single epoch with 25.6% (CIFAR-10) or 7.49% (ILSVRC2012) of the dataset to learn which filters to prune. Experiments on various architectures and datasets show that the proposed method can not only preserve the accuracy after pruning but also outperform existing methods after finetuning. We achieve a 52.00% FLOPs reduction on ResNet-50, with a Top-1 accuracy of 47.51% after pruning and a state-of-the-art (SOTA) accuracy of 76.63% after finetuning on ILSVRC2012. Code available at https://github.com/nota-github/autobot_AAAI23.

Submitted 7 December, 2022; v1 submitted 18 November, 2021; originally announced November 2021.

Comments: 11 pages, 6 figures, 5 tables, accepted in AAAI2023 Workshop (Practical AI)

arXiv:2103.10858 [pdf, other]

Toward Compact Deep Neural Networks via Energy-Aware Pruning

Authors: Seul-Ki Yeom, Kyung-Hwan Shim, Jee-Hyun Hwang

Abstract: Despite the remarkable performance, modern deep neural networks are inevitably accompanied by a significant amount of computational cost for learning and deployment, which may be incompatible with their usage on edge devices. Recent efforts to reduce these overheads involve pruning and decomposing the parameters of various layers without performance deterioration. Inspired by several decomposition studies, in this paper, we propose a novel energy-aware pruning method that quantifies the importance of each filter in the network using nuclear-norm (NN). The proposed energy-aware pruning leads to state-of-the-art performance for Top-1 accuracy, FLOPs, and parameter reduction across a wide range of scenarios with multiple network architectures on CIFAR-10 and ImageNet after fine-grained classification tasks. In a toy experiment, without fine-tuning, we can visually observe that NN has a minute change in decision boundaries across classes and outperforms the previous popular criteria. We achieve competitive results with 40.4/49.8% of FLOPs and 45.9/52.9% of parameter reduction with 94.13/94.61% in the Top-1 accuracy with ResNet-56/110 on CIFAR-10, respectively. In addition, our observations are consistent for a variety of different pruning settings in terms of data size as well as data quality, which can be emphasized in the stability of the acceleration and compression with negligible accuracy loss.

Submitted 10 March, 2022; v1 submitted 19 March, 2021; originally announced March 2021.

Comments: 10 pages, 5 figures, 3 tables

arXiv:2102.06884 [pdf]

GPSPiChain-Blockchain based Self-Contained Family Security System in Smart Home

Authors: Ali Raza, Lachlan Hardy, Erin Roehrer, Soonja Yeom, Byeong ho Kang

Abstract: With advancements in technology, personal computing devices are better adapted for and further integrated into people's lives and homes. The integration of technology into society also results in an increasing desire to control who and what has access to sensitive information, especially for vulnerable people including children and the elderly. With blockchain coming into the picture as a technology that can revolutionise the world, it is now possible to have an immutable audit trail of locational data over time. By controlling the process through inexpensive equipment in the home, it is possible to control who has access to such personal data. This paper presents a blockchain based family security system for tracking the location of consenting family members' smart phones. The locations of the family members' smart phones are logged and stored in a private blockchain which can be accessed through a node installed in the family home on a computer. The data for the whereabouts of family members stays within the family unit and does not go to any third party. The system is implemented in a small scale (one miner and two other nodes) and the technical feasibility is discussed along with the limitations of the system. Further research will cover the integration of the system into a smart home environment, and ethical implementations of tracking, especially of vulnerable people, using the immutability of blockchain.

Submitted 13 February, 2021; originally announced February 2021.

Comments: 15 pages, 6 figures, accepted in The 4th International Workshop on Smart Simulation and Modelling for Complex Systems, IJCAI2019

Report number: SSMCS2019-13

arXiv:2002.07738 [pdf, other]

Individual Fairness Revisited: Transferring Techniques from Adversarial Robustness

Authors: Samuel Yeom, Matt Fredrikson

Abstract: We turn the definition of individual fairness on its head: rather than ascertaining the fairness of a model given a predetermined metric, we find a metric for a given model that satisfies individual fairness. This can facilitate the discussion on the fairness of a model, addressing the issue that it may be difficult to specify a priori a suitable metric. Our contributions are twofold: First, we introduce the definition of a minimal metric and characterize the behavior of models in terms of minimal metrics. Second, for more complicated models, we apply the mechanism of randomized smoothing from adversarial robustness to make them individually fair under a given weighted $L^p$ metric. Our experiments show that adapting the minimal metrics of linear models to more complicated neural networks can lead to meaningful and interpretable fairness guarantees at little cost to utility.

Submitted 13 October, 2020; v1 submitted 18 February, 2020; originally announced February 2020.

Comments: Published at IJCAI 2020 (at https://www.ijcai.org/Proceedings/2020/61 ); the conference version has a minor error in the proof of Theorem 3, which is fixed here
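
Randomized smoothing, the mechanism this abstract borrows from adversarial robustness, can be sketched in its generic form: the smoothed model returns the majority vote of a base classifier over noisy copies of the input. The sketch below assumes isotropic Gaussian noise and a toy threshold classifier; it does not reproduce the paper's weighted $L^p$ construction or its fairness guarantees, and the names `smoothed_predict` and `threshold_classifier` are hypothetical.

```python
import random
from collections import Counter


def smoothed_predict(base_classifier, x, sigma, num_samples, rng):
    """Majority vote of base_classifier over Gaussian-perturbed copies of x.

    Assumption: isotropic noise with standard deviation sigma on every feature.
    """
    votes = Counter()
    for _ in range(num_samples):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        votes[base_classifier(noisy)] += 1
    return votes.most_common(1)[0][0]


def threshold_classifier(x):
    """Toy base model: predicts 1 when the first feature is positive."""
    return int(x[0] > 0.0)


# Example: inputs far from the decision boundary keep their label under smoothing.
rng = random.Random(0)
label_pos = smoothed_predict(threshold_classifier, [5.0], 0.5, 200, rng)
label_neg = smoothed_predict(threshold_classifier, [-5.0], 0.5, 200, rng)
```

Nearby inputs see heavily overlapping noise distributions, so their vote shares, and hence their smoothed predictions, cannot differ by much; this is the intuition behind using smoothing to bound how differently similar individuals are treated.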

arXiv:2001.07546 [pdf]

Exploring an Application of Virtual Reality for Early Detection of Dementia

Authors: Yiming Zhong, Yuan Tian, Mira Park, Soonja Yeom

Abstract: Facing the severe global dementia problem, an exploration was conducted adopting the technology of virtual reality (VR). This report lays a technical foundation for the further research project "Early Detection of Dementia Using Testing Tools in VR Environment", and illustrates the process of developing a VR application using Unity 3D software on Oculus Go. This preliminary exploration is composed of three steps: 3D virtual scene construction, VR interaction design, and monitoring. The exploration was recorded to provide basic technical guidance and a detailed method for subsequent research.

Submitted 14 January, 2020; originally announced January 2020.

Comments: 11 pages, 4 tables, 11 figures

arXiv:1912.08881 [pdf, other]

doi 10.1016/j.patcog.2021.107899

Pruning by Explaining: A Novel Criterion for Deep Neural Network Pruning

Authors: Seul-Ki Yeom, Philipp Seegerer, Sebastian Lapuschkin, Alexander Binder, Simon Wiedemann, Klaus-Robert Müller, Wojciech Samek

Abstract: The success of convolutional neural networks (CNNs) in various applications is accompanied by a significant increase in computation and parameter storage costs. Recent efforts to reduce these overheads involve pruning and compressing the weights of various layers while at the same time aiming to not sacrifice performance. In this paper, we propose a novel criterion for CNN pruning inspired by neural network interpretability: The most relevant units, i.e. weights or filters, are automatically found using their relevance scores obtained from concepts of explainable AI (XAI). By exploring this idea, we connect the lines of interpretability and model compression research. We show that our proposed method can efficiently prune CNN models in transfer-learning setups in which networks pre-trained on large corpora are adapted to specialized tasks. The method is evaluated on a broad range of computer vision datasets. Notably, our novel criterion is not only competitive or better compared to state-of-the-art pruning criteria when successive retraining is performed, but clearly outperforms these previous criteria in the resource-constrained application scenario in which the data of the task to be transferred to is very scarce and one chooses to refrain from fine-tuning. Our method is able to compress the model iteratively while maintaining or even improving accuracy. At the same time, it has a computational cost in the order of gradient computation and is comparatively simple to apply without the need for tuning hyperparameters for pruning.

Submitted 12 March, 2021; v1 submitted 18 December, 2019; originally announced December 2019.

Comments: 25 pages + 5 supplementary pages, 13 figures, 6 tables

Journal ref: Pattern Recognition, Volume 115, pp.107899, 2021

arXiv:1906.11813 [pdf, ps, other]

Learning Fair Representations for Kernel Models

Authors: Zilong Tan, Samuel Yeom, Matt Fredrikson, Ameet Talwalkar

Abstract: Fair representations are a powerful tool for establishing criteria like statistical parity, proxy non-discrimination, and equality of opportunity in learned models. Existing techniques for learning these representations are typically model-agnostic, as they preprocess the original data such that the output satisfies some fairness criterion, and can be used with arbitrary learning methods. In contrast, we demonstrate the promise of learning a model-aware fair representation, focusing on kernel-based models. We leverage the classical Sufficient Dimension Reduction (SDR) framework to construct representations as subspaces of the reproducing kernel Hilbert space (RKHS), whose member functions are guaranteed to satisfy fairness. Our method supports several fairness criteria, continuous and discrete data, and multiple protected attributes. We further show how to calibrate the accuracy tradeoff by characterizing it in terms of the principal angles between subspaces of the RKHS. Finally, we apply our approach to obtain the first Fair Gaussian Process (FGP) prior for fair Bayesian learning, and show that it is competitive with, and in some cases outperforms, state-of-the-art methods on real data.

Submitted 20 January, 2020; v1 submitted 27 June, 2019; originally announced June 2019.

Comments: The 23rd International Conference on Artificial Intelligence and Statistics (AISTATS 2020)

arXiv:1906.09218 [pdf, other]

doi 10.1145/3351095.3372845

FlipTest: Fairness Testing via Optimal Transport

Authors: Emily Black, Samuel Yeom, Matt Fredrikson

Abstract: We present FlipTest, a black-box technique for uncovering discrimination in classifiers. FlipTest is motivated by the intuitive question: had an individual been of a different protected status, would the model have treated them differently? Rather than relying on causal information to answer this question, FlipTest leverages optimal transport to match individuals in different protected groups, creating similar pairs of in-distribution samples. We show how to use these instances to detect discrimination by constructing a "flipset": the set of individuals whose classifier output changes post-translation, which corresponds to the set of people who may be harmed because of their group membership. To shed light on why the model treats a given subgroup differently, FlipTest produces a "transparency report": a ranking of features that are most associated with the model's behavior on the flipset. Evaluating the approach on three case studies, we show that this provides a computationally inexpensive way to identify subgroups that may be harmed by model discrimination, including in cases where the model satisfies group fairness criteria.

Submitted 6 December, 2019; v1 submitted 21 June, 2019; originally announced June 2019.

Comments: Accepted to ACM FAT* 2020; The first two authors contributed equally

arXiv:1810.07155 [pdf, other]

Hunting for Discriminatory Proxies in Linear Regression Models

Authors: Samuel Yeom, Anupam Datta, Matt Fredrikson

Abstract: A machine learning model may exhibit discrimination when used to make decisions involving people. One potential cause for such outcomes is that the model uses a statistical proxy for a protected demographic attribute. In this paper we formulate a definition of proxy use for the setting of linear regression and present algorithms for detecting proxies. Our definition follows recent work on proxies in classification models, and characterizes a model's constituent behavior that: 1) correlates closely with a protected random variable, and 2) is causally influential in the overall behavior of the model. We show that proxies in linear regression models can be efficiently identified by solving a second-order cone program, and further extend this result to account for situations where the use of a certain input variable is justified as a 'business necessity'. Finally, we present empirical results on two law enforcement datasets that exhibit varying degrees of racial disparity in prediction outcomes, demonstrating that proxies shed useful light on the causes of discriminatory behavior in models.

Submitted 27 November, 2018; v1 submitted 16 October, 2018; originally announced October 2018.

arXiv:1808.08619 [pdf, other]

doi 10.1145/3442188.3445892

Avoiding Disparity Amplification under Different Worldviews

Authors: Samuel Yeom, Michael Carl Tschantz

Abstract: We mathematically compare four competing definitions of group-level nondiscrimination: demographic parity, equalized odds, predictive parity, and calibration. Using the theoretical framework of Friedler et al., we study the properties of each definition under various worldviews, which are assumptions about how, if at all, the observed data is biased. We argue that different worldviews call for different definitions of fairness, and we specify the worldviews that, when combined with the desire to avoid a criterion for discrimination that we call disparity amplification, motivate demographic parity and equalized odds. We also argue that predictive parity and calibration are insufficient for avoiding disparity amplification because predictive parity allows an arbitrarily large inter-group disparity and calibration is not robust to post-processing. Finally, we define a worldview that is more realistic than the previously considered ones, and we introduce a new notion of fairness that corresponds to this worldview.

Submitted 9 March, 2021; v1 submitted 26 August, 2018; originally announced August 2018.

Comments: This is a draft version. For the published version, please go to https://dl.acm.org/doi/10.1145/3442188.3445892

arXiv:1709.01604 [pdf, other]

Privacy Risk in Machine Learning: Analyzing the Connection to Overfitting

Authors: Samuel Yeom, Irene Giacomelli, Matt Fredrikson, Somesh Jha

Abstract: Machine learning algorithms, when applied to sensitive data, pose a distinct threat to privacy. A growing body of prior work demonstrates that models produced by these algorithms may leak specific private information in the training data to an attacker, either through the models' structure or their observable behavior. However, the underlying cause of this privacy risk is not well understood beyond a handful of anecdotal accounts that suggest overfitting and influence might play a role. This paper examines the effect that overfitting and influence have on the ability of an attacker to learn information about the training data from machine learning models, either through training set membership inference or attribute inference attacks. Using both formal and empirical analyses, we illustrate a clear relationship between these factors and the privacy risk that arises in several popular machine learning algorithms. We find that overfitting is sufficient to allow an attacker to perform membership inference and, when the target attribute meets certain conditions about its influence, attribute inference attacks. Interestingly, our formal analysis also shows that overfitting is not necessary for these attacks and begins to shed light on what other factors may be in play. Finally, we explore the connection between membership inference and attribute inference, showing that there are deep connections between the two that lead to effective new attacks.

Submitted 4 May, 2018; v1 submitted 5 September, 2017; originally announced September 2017.

arXiv:1702.07626 [pdf, ps, other]

Restricted averaging operators to cones over finite fields

Authors: Doowon Koh, Chun-Yen Shen, Seongjun Yeom

Abstract: We investigate the sharp $L^p\to L^r$ estimates for the restricted averaging operator $A_C$ over the cone $C$ of the $d$-dimensional vector space $\mathbb F_q^d$ over the finite field $\mathbb F_q$ with $q$ elements. The restricted averaging operator $A_C$ for the cone $C$ is defined by the relation that $A_C f = f\ast \sigma|_C$, where $\sigma$ denotes the normalized surface measure on the cone $C$, and $f$ is a complex valued function on the space $\mathbb F_q^d$ with the normalized counting measure $dx$. In the previous work, the sharp boundedness of $A_C$ was obtained in odd dimensions $d\ge 3$ but partial results were only given in even dimensions $d\ge 4$. In this paper we prove the optimal estimates in even dimensions $d\ge 6$ in the case when the cone $C\subset \mathbb F_q^d$ contains a $d/2$ dimensional subspace.

Submitted 24 February, 2017; originally announced February 2017.

Comments: 20 pages, No figures

MSC Class: 42B05

arXiv:1702.02724 [pdf, other]

doi 10.1038/s41598-017-17160-0

Magnetic Frustration Driven by Itinerancy in Spinel CoV2O4

Authors: J. H. Lee, J. Ma, S. E. Hahn, H. B. Cao, Tao Hong, M. S. Yeom, S. Okamoto, H. D. Zhou, M. Matsuda, R. S. Fishman

Abstract: Localized spins and itinerant electrons rarely coexist in geometrically-frustrated spinel lattices. We show that the spinel CoV2O4 stands at the crossover from insulating to itinerant behavior and exhibits a complex interplay between localized spins and itinerant electrons. In contrast to the expected paramagnetism, localized spins supported by enhanced exchange couplings are frustrated by the effects of delocalized electrons. This frustration produces a non-collinear spin state and may be responsible for macroscopic spin-glass behavior. Competing phases can be uncovered by external perturbations such as pressure or magnetic field, which enhance the frustration.

Submitted 9 February, 2017; originally announced February 2017.

Comments: 15 pages, 5 figures

Journal ref: Sci Rep. 7(1), 17129 (2017)

arXiv:1601.07677 [pdf, other]

Restriction of averaging operators to algebraic varieties over finite fields

Authors: Doowon Koh, Seongjun Yeom

Abstract: We study $L^p\to L^r$ estimates for restricted averaging operators related to algebraic varieties $V$ of $d$-dimensional vector spaces over finite fields $\mathbb F_q$ with $q$ elements. We observe properties of both the Fourier restriction operator and the averaging operator over $V\subset \mathbb F_q^d.$ As a consequence, we obtain optimal results on the restricted averaging problems for spheres and paraboloids in dimensions $d\ge2,$ and cones in odd dimensions $d\ge 3.$ In addition, when the variety $V$ is a cone lying in an even dimensional vector space over $\mathbb F_q$ and $-1$ is a square number in $\mathbb F_q$, we also obtain sharp estimates except for two endpoints.

Submitted 1 September, 2016; v1 submitted 28 January, 2016; originally announced January 2016.

Comments: 15 pages, 1 figure, introduction and abstract were changed

MSC Class: 42B05 (Primary); 11T23 (Secondary)

arXiv:1009.0081 [pdf, ps, other]

doi 10.1016/j.susc.2011.03.025

Strain-induced pseudo-magnetic fields and charging effects on CVD-grown graphene

Authors: N. -C. Yeh, M. L. Teague, S. Yeom, B. L. Standley, D. A. Boyd, M. W. Bockrath

Abstract: Atomically resolved imaging and spectroscopic characteristics of graphene grown by chemical vapor deposition (CVD) on copper are investigated by means of scanning tunneling microscopy and spectroscopy (STM/STS). For CVD-grown graphene remaining on the copper substrate, the monolayer carbon structures exhibit ripples and appear strongly strained, with different regions exhibiting different lattice structures and electronic density of states (DOS). In particular, ridges appear along the boundaries of different lattice structures, which exhibit excess charging effects. Additionally, the large and non-uniform strain induces pseudo-magnetic field up to ~ 50 Tesla, as manifested by the integer and fractional pseudo-magnetic field quantum Hall effect (IQHE and FQHE) in the DOS of graphene. In contrast, for graphene transferred from copper to SiO2 substrates after the CVD growth, the average strain on the whole is reduced, so are the corresponding charging effects and pseudo-magnetic fields except for sample areas near topographical ridges. These findings suggest feasible "strain engineering" of the electronic states of graphene by proper design of the substrates and growth conditions.

Submitted 26 March, 2011; v1 submitted 31 August, 2010; originally announced September 2010.

Comments: 10 pages, 9 figures. Accepted for publication in a special issue of Surface Science: "Graphene Surfaces and Interfaces". Contact author: Nai-Chang Yeh

Journal ref: Surface Science 605, 1649-1656 (2011)

arXiv:cond-mat/0211494 [pdf]

doi 10.1140/epje/e2004-00028-1

Structure and thermodynamics of associating rods solutions

Authors: Min Sun Yeom, Alexander V. Ermoshkin, Monica Olvera de la Cruz

Abstract: Thermoreversible sol-gel transitions in solutions of rod-like associating polymers are analyzed by computer simulations and by mean field models. The sol-gel transition is determined by the divergence of the cluster weight average. The analytically determined sol-gel transition is in good agreement with the simulation results. At low temperatures we observe a peak in the heat capacity, whose maximum is associated with the precipitation transition. The gelation transition is sensitive to the number of associating groups per rod but nearly insensitive to the spatial distribution of associating groups around the rod. The precipitation is strongly dependent on both the number and distribution of associating groups per rod. We find negligible nematic orientational order at the gelation and precipitation transitions.

Submitted 21 November, 2002; originally announced November 2002.

Showing 1–27 of 27 results for author: Yeom, S