-
Locating Risk: Task Designers and the Challenge of Risk Disclosure in RAI Content Work
Authors:
Alice Qian Zhang,
Ryland Shaw,
Laura Dabbish,
Jina Suh,
Hong Shen
Abstract:
As AI systems are increasingly tested and deployed in open-ended and high-stakes domains, crowd workers are often tasked with responsible AI (RAI) content work. These tasks include labeling violent content, moderating disturbing text, or simulating harmful behavior for red teaming exercises to shape AI system behaviors. While prior efforts have highlighted the risks to worker well-being associated with RAI content work, far less attention has been paid to how these risks are communicated to workers. Existing transparency frameworks and guidelines such as model cards, datasheets, and crowdworksheets focus on documenting model information and dataset collection processes, but they overlook an important aspect of disclosing well-being risks to workers. In the absence of standard workflows or clear guidance, the consistent application of content warnings, consent flows, or other forms of well-being risk disclosure remains unclear. This study investigates how task designers approach risk disclosure in crowdsourced RAI tasks. Drawing on interviews with 23 task designers across academic and industry sectors, we examine how well-being risk is recognized, interpreted, and communicated in practice. Our findings surface a need to support task designers in identifying and communicating well-being risk not only to support crowdworker well-being but also to strengthen the ethical integrity and technical efficacy of AI development pipelines.
Submitted 30 May, 2025;
originally announced May 2025.
-
An Adaptive and Parameter-Free Nesterov's Accelerated Gradient Method for Convex Optimization
Authors:
Jaewook J. Suh,
Shiqian Ma
Abstract:
We propose AdaNAG, an adaptive accelerated gradient method based on Nesterov's accelerated gradient method. AdaNAG is line-search-free, parameter-free, and achieves the accelerated convergence rates $f(x_k) - f_\star = \mathcal{O}\left(1/k^2\right)$ and $\min_{i\in\left\{1,\dots, k\right\}} \|\nabla f(x_i)\|^2 = \mathcal{O}\left(1/k^3\right)$ for $L$-smooth convex function $f$. We provide a Lyapunov analysis for the convergence proof of AdaNAG, which additionally enables us to propose a novel adaptive gradient descent (GD) method, AdaGD. AdaGD achieves the non-ergodic convergence rate $f(x_k) - f_\star = \mathcal{O}\left(1/k\right)$, like the original GD. The analysis of AdaGD also motivated us to propose a generalized AdaNAG that includes practically useful variants of AdaNAG. Numerical results demonstrate that our methods outperform some other recent adaptive methods for representative applications.
Submitted 16 May, 2025;
originally announced May 2025.
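For context on the $\mathcal{O}\left(1/k^2\right)$ rate quoted in the abstract above, the following is a minimal sketch of the classical accelerated gradient iteration (in its FISTA-style form) with a known smoothness constant $L$. It is not AdaNAG: the abstract does not specify the adaptive step-size rule, so this sketch assumes $L$ is given. The function name `nesterov_agd`, the diagonal quadratic test problem, and the constants are illustrative assumptions.

```python
import math


def nesterov_agd(grad, x0, smoothness, num_iters):
    """Classical accelerated gradient descent with fixed step size 1/L.

    Assumes the smoothness constant L is known in advance, which is
    exactly what an adaptive, parameter-free method would avoid.
    Returns the list [x_1, ..., x_num_iters].
    """
    x_prev = list(x0)
    y = list(x0)
    t = 1.0
    iterates = []
    for _ in range(num_iters):
        g = grad(y)
        # Gradient step taken from the extrapolated point y.
        x = [yi - gi / smoothness for yi, gi in zip(y, g)]
        # Momentum schedule t_{k+1} = (1 + sqrt(1 + 4 t_k^2)) / 2.
        t_next = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        beta = (t - 1.0) / t_next
        # Extrapolate along the direction of the last move.
        y = [xi + beta * (xi - xpi) for xi, xpi in zip(x, x_prev)]
        x_prev, t = x, t_next
        iterates.append(x)
    return iterates


# Hypothetical test problem: f(x) = 0.5 * sum(a_i * x_i^2), minimized at 0.
CURVATURES = [1.0, 0.1, 0.01, 0.001]
L_SMOOTH = max(CURVATURES)


def f(x):
    return 0.5 * sum(a * xi * xi for a, xi in zip(CURVATURES, x))


def grad_f(x):
    return [a * xi for a, xi in zip(CURVATURES, x)]


X0 = [1.0, 1.0, 1.0, 1.0]
ITERATES = nesterov_agd(grad_f, X0, L_SMOOTH, 200)
GAPS = [f(x) for x in ITERATES]  # f(x_k) - f_star, since f_star = 0
```

On this problem the gaps stay below the standard bound $2L\|x_0 - x_\star\|^2/(k+1)^2$, which is the accelerated rate the abstract refers to.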
-
Dexterous Contact-Rich Manipulation via the Contact Trust Region
Authors:
H. J. Terry Suh,
Tao Pang,
Tong Zhao,
Russ Tedrake
Abstract:
What is a good local description of contact dynamics for contact-rich manipulation, and where can we trust this local description? While many approaches often rely on the Taylor approximation of dynamics with an ellipsoidal trust region, we argue that such approaches are fundamentally inconsistent with the unilateral nature of contact. As a remedy, we present the Contact Trust Region (CTR), which captures the unilateral nature of contact while remaining efficient for computation. With CTR, we first develop a Model-Predictive Control (MPC) algorithm capable of synthesizing local contact-rich plans. Then, we extend this capability to plan globally by stitching together local MPC plans, enabling efficient and dexterous contact-rich manipulation. To verify the performance of our method, we perform comprehensive evaluations, both in high-fidelity simulation and on hardware, on two contact-rich systems: a planar IiwaBimanual system and a 3D AllegroHand system. On both systems, our method offers a significantly lower-compute alternative to existing RL-based approaches to contact-rich manipulation. In particular, our Allegro in-hand manipulation policy, in the form of a roadmap, takes fewer than 10 minutes to build offline on a standard laptop using just its CPU, with online inference taking just a few seconds. Experiment data, video and code are available at ctr.theaiinstitute.com.
Submitted 14 May, 2025; v1 submitted 4 May, 2025;
originally announced May 2025.
-
When Testing AI Tests Us: Safeguarding Mental Health on the Digital Frontlines
Authors:
Sachin R. Pendse,
Darren Gergle,
Rachel Kornfield,
Jonah Meyerhoff,
David Mohr,
Jina Suh,
Annie Wescott,
Casey Williams,
Jessica Schleider
Abstract:
Red-teaming is a core part of the infrastructure that ensures that AI models do not produce harmful content. Unlike past technologies, the black box nature of generative AI systems necessitates a uniquely interactional mode of testing, one in which individuals on red teams actively interact with the system, leveraging natural language to simulate malicious actors and solicit harmful outputs. This interactional labor done by red teams can result in mental health harms that are uniquely tied to the adversarial engagement strategies necessary to effectively red team. The importance of ensuring that generative AI models do not propagate societal or individual harm is widely recognized -- one less visible foundation of end-to-end AI safety is also the protection of the mental health and wellbeing of those who work to keep model outputs safe. In this paper, we argue that the unmet mental health needs of AI red-teamers are a critical workplace safety concern. Through analyzing the unique mental health impacts associated with the labor done by red teams, we propose potential individual and organizational strategies that could be used to meet these needs, and safeguard the mental health of red-teamers. We develop our proposed strategies through drawing parallels between common red-teaming practices and interactional labor common to other professions (including actors, mental health professionals, conflict photographers, and content moderators), describing how individuals and organizations within these professional spaces safeguard their mental health given similar psychological demands. Drawing on these protective practices, we describe how safeguards could be adapted for the distinct mental health challenges experienced by red teaming organizations as they mitigate emerging technological risks on the new digital frontlines.
Submitted 29 April, 2025;
originally announced April 2025.
-
Longitudinal Study on Social and Emotional Use of AI Conversational Agent
Authors:
Mohit Chandra,
Javier Hernandez,
Gonzalo Ramos,
Mahsa Ershadi,
Ananya Bhattacharjee,
Judith Amores,
Ebele Okoli,
Ann Paradiso,
Shahed Warreth,
Jina Suh
Abstract:
Development in digital technologies has continuously reshaped how individuals seek and receive social and emotional support. While online platforms and communities have long served this need, the increased integration of general-purpose conversational AI into daily lives has introduced new dynamics in how support is provided and experienced. Existing research has highlighted both benefits (e.g., wider access to well-being resources) and potential risks (e.g., over-reliance) of using AI for support seeking. In this five-week, exploratory study, we recruited 149 participants divided into two usage groups: a baseline usage group (BU, n=60) that used the internet and AI as usual, and an active usage group (AU, n=89) encouraged to use one of four commercially available AI tools (Microsoft Copilot, Google Gemini, PI AI, ChatGPT) for social and emotional interactions. Our analysis revealed significant increases in perceived attachment towards AI (32.99 percentage points), perceived AI empathy (25.8 p.p.), and motivation to use AI for entertainment (22.90 p.p.) among the AU group. We also observed that individual differences (e.g., gender identity, prior AI usage) influenced perceptions of AI empathy and attachment. Lastly, the AU group expressed higher comfort in seeking personal help, managing stress, obtaining social support, and talking about health with AI, indicating potential for broader emotional support while highlighting the need for safeguards against problematic usage. Overall, our exploratory findings underscore the importance of developing consumer-facing AI tools that support emotional well-being responsibly, while empowering users to understand the limitations of these tools.
Submitted 18 April, 2025;
originally announced April 2025.
-
Higher-Order Binding of Language Model Virtual Personas: a Study on Approximating Political Partisan Misperceptions
Authors:
Minwoo Kang,
Suhong Moon,
Seung Hyeong Lee,
Ayush Raj,
Joseph Suh,
David M. Chan
Abstract:
Large language models (LLMs) are increasingly capable of simulating human behavior, offering cost-effective ways to estimate user responses during the early phases of survey design. While previous studies have examined whether models can reflect individual opinions or attitudes, we argue that a higher-order binding of virtual personas requires successfully approximating not only the opinions of a user as an identified member of a group, but also the nuanced ways in which that user perceives and evaluates those outside the group. In particular, faithfully simulating how humans perceive different social groups is critical for applying LLMs to various political science studies, including timely topics on polarization dynamics, inter-group conflict, and democratic backsliding. To this end, we propose a novel methodology for constructing virtual personas with synthetic user "backstories" generated as extended, multi-turn interview transcripts. Our generated backstories are longer, rich in detail, and consistent in authentically describing a singular individual, compared to previous methods. We show that virtual personas conditioned on our backstories closely replicate human response distributions (up to an 87% improvement as measured by Wasserstein Distance) and produce effect sizes that closely match those observed in the original studies. Altogether, our work extends the applicability of LLMs beyond estimating individual self-opinions, enabling their use in a broader range of human studies.
Submitted 15 April, 2025;
originally announced April 2025.
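The abstract above compares simulated and human response distributions using Wasserstein Distance. As a rough illustration of that metric, the sketch below computes the 1-Wasserstein distance between two distributions over the same ordered answer options, assuming unit spacing between adjacent options; in that setting it equals the sum of absolute differences between the two cumulative distributions. The function name `wasserstein_1d` and the example distributions are hypothetical, and the abstract does not state which normalization or option spacing the authors use.

```python
from itertools import accumulate


def wasserstein_1d(p, q):
    """1-Wasserstein distance between two distributions on ordered options.

    Assumes both lists cover the same options in the same order, each sums
    to 1, and adjacent options are one unit apart (an assumption made for
    this sketch, not taken from the paper).
    """
    if len(p) != len(q):
        raise ValueError("distributions must cover the same options")
    if abs(sum(p) - 1.0) > 1e-9 or abs(sum(q) - 1.0) > 1e-9:
        raise ValueError("each distribution must sum to 1")
    cdf_p = list(accumulate(p))
    cdf_q = list(accumulate(q))
    # The final CDF entries are both 1, so they contribute nothing.
    return sum(abs(a - b) for a, b in zip(cdf_p[:-1], cdf_q[:-1]))


# Hypothetical response distributions over a three-option ordinal question.
human = [0.5, 0.5, 0.0]
model = [0.0, 0.5, 0.5]
distance = wasserstein_1d(human, model)
```

Unlike a per-option comparison, this distance grows with how far probability mass has to move along the answer scale, so confusing adjacent options costs less than confusing opposite ends.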
-
Efficient Implementation of Reinforcement Learning over Homomorphic Encryption
Authors:
Jihoon Suh,
Takashi Tanaka
Abstract:
We investigate encrypted control policy synthesis over the cloud. While encrypted control implementations have been studied previously, we focus on the less explored paradigm of privacy-preserving control synthesis, which can involve heavier computations ideal for cloud outsourcing. We classify control policy synthesis into model-based, simulator-driven, and data-driven approaches and examine their implementation over fully homomorphic encryption (FHE) for privacy enhancements. A key challenge arises from comparison operations (min or max) in standard reinforcement learning algorithms, which are difficult to execute over encrypted data. This observation motivates our focus on Relative-Entropy-regularized reinforcement learning (RL) problems, which simplify encrypted evaluation of synthesis algorithms due to their comparison-free structures. We demonstrate how linearly solvable value iteration, path integral control, and Z-learning can be readily implemented over FHE. We conduct a case study of our approach through numerical simulations of encrypted Z-learning in a grid world environment using the CKKS encryption scheme, showing convergence with acceptable approximation error. Our work suggests the potential for secure and efficient cloud-based reinforcement learning.
Submitted 12 April, 2025;
originally announced April 2025.
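To illustrate why the relative-entropy-regularized setting mentioned above is comparison-free, here is a minimal plaintext sketch of linearly solvable value iteration for a first-exit problem. It iterates on the desirability function $z(s) = e^{-v(s)}$, whose update $z(s) \leftarrow e^{-q(s)} \sum_{s'} p(s' \mid s)\, z(s')$ uses only multiplications and additions on the data, with no min or max. Nothing here is encrypted: the function name `desirability_iteration`, the five-state chain, and the costs are assumptions made for illustration, and the paper's FHE implementation is not reproduced.

```python
import math


def desirability_iteration(passive, state_cost, terminal, num_iters):
    """Linear Bellman iteration on z(s) = exp(-v(s)) for a first-exit problem.

    passive[s][t] is the passive transition probability from s to t,
    state_cost[s] is the state cost q(s), and terminal is a set of
    absorbing goal states. The update on z is linear: no min/max on data.
    """
    n = len(state_cost)
    # exp(-q(s)) depends only on the costs and is computed once up front.
    discount = [math.exp(-q) for q in state_cost]
    z = [1.0] * n
    for _ in range(num_iters):
        z_new = []
        for s in range(n):
            if s in terminal:
                # The branch is on the public state index, not on z values.
                z_new.append(discount[s])
            else:
                expected = sum(passive[s][t] * z[t] for t in range(n))
                z_new.append(discount[s] * expected)
        z = z_new
    return z


# Hypothetical 5-state chain: unbiased random walk, goal at the right end.
N_STATES = 5
PASSIVE = [[0.0] * N_STATES for _ in range(N_STATES)]
PASSIVE[0][0] = 0.5  # the left wall reflects
PASSIVE[0][1] = 0.5
for s in range(1, N_STATES - 1):
    PASSIVE[s][s - 1] = 0.5
    PASSIVE[s][s + 1] = 0.5
PASSIVE[N_STATES - 1][N_STATES - 1] = 1.0

STATE_COST = [1.0, 1.0, 1.0, 1.0, 0.0]
TERMINAL = {N_STATES - 1}

Z = desirability_iteration(PASSIVE, STATE_COST, TERMINAL, 200)
VALUES = [-math.log(z) for z in Z]  # cost-to-go recovered from desirability
```

States closer to the goal end up with higher desirability and lower cost-to-go, and the whole computation stays within operations that homomorphic schemes handle natively.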
-
Effective Automation to Support the Human Infrastructure in AI Red Teaming
Authors:
Alice Qian Zhang,
Jina Suh,
Mary L. Gray,
Hong Shen
Abstract:
As artificial intelligence (AI) systems become increasingly embedded in critical societal functions, the need for robust red teaming methodologies continues to grow. In this forum piece, we examine emerging approaches to automating AI red teaming, with a particular focus on how the application of automated methods affects human-driven efforts. We discuss the role of labor in automated red teaming processes, the benefits and limitations of automation, and its broader implications for AI safety and labor practices. Drawing on existing frameworks and case studies, we argue for a balanced approach that combines human expertise with automated tools to strengthen AI risk assessment. Finally, we highlight key challenges in scaling automated red teaming, including considerations around worker proficiency, agency, and context-awareness.
Submitted 27 March, 2025;
originally announced March 2025.
-
Uncovering inequalities in new knowledge learning by large language models across different languages
Authors:
Chenglong Wang,
Haoyu Tang,
Xiyuan Yang,
Yueqi Xie,
Jina Suh,
Sunayana Sitaram,
Junming Huang,
Yu Xie,
Zhaoya Gong,
Xing Xie,
Fangzhao Wu
Abstract:
As large language models (LLMs) gradually become integral tools for problem solving in daily life worldwide, understanding linguistic inequality is becoming increasingly important. Existing research has primarily focused on static analyses that assess the disparities in the existing knowledge and capabilities of LLMs across languages. However, LLMs are continuously evolving, acquiring new knowledge to generate up-to-date, domain-specific responses. Investigating linguistic inequalities within this dynamic process is, therefore, also essential. In this paper, we explore inequalities in new knowledge learning by LLMs across different languages and four key dimensions: effectiveness, transferability, prioritization, and robustness. Through extensive experiments under two settings (in-context learning and fine-tuning) using both proprietary and open-source models, we demonstrate that low-resource languages consistently face disadvantages across all four dimensions. By shedding light on these disparities, we aim to raise awareness of linguistic inequalities in LLMs' new knowledge learning, fostering the development of more inclusive and equitable future LLMs.
Submitted 5 March, 2025;
originally announced March 2025.
-
Taxation Perspectives from Large Language Models: A Case Study on Additional Tax Penalties
Authors:
Eunkyung Choi,
Young Jin Suh,
Hun Park,
Wonseok Hwang
Abstract:
How capable are large language models (LLMs) in the domain of taxation? Although numerous studies have explored the legal domain in general, research dedicated to taxation remains scarce. Moreover, the datasets used in these studies are either simplified, failing to reflect the real-world complexities, or unavailable as open source. To address this gap, we introduce PLAT, a new benchmark designed to assess the ability of LLMs to predict the legitimacy of additional tax penalties. PLAT is constructed to evaluate LLMs' understanding of tax law, particularly in cases where resolving the issue requires more than just applying related statutes. Our experiments with six LLMs reveal that their baseline capabilities are limited, especially when dealing with conflicting issues that demand a comprehensive understanding. However, we found that this limitation can be mitigated by enabling retrieval, self-reasoning, and discussion among multiple agents with specific role assignments.
Submitted 5 March, 2025;
originally announced March 2025.
-
Physics-Driven Data Generation for Contact-Rich Manipulation via Trajectory Optimization
Authors:
Lujie Yang,
H. J. Terry Suh,
Tong Zhao,
Bernhard Paus Graesdal,
Tarik Kelestemur,
Jiuguang Wang,
Tao Pang,
Russ Tedrake
Abstract:
We present a low-cost data generation pipeline that integrates physics-based simulation, human demonstrations, and model-based planning to efficiently generate large-scale, high-quality datasets for contact-rich robotic manipulation tasks. Starting with a small number of embodiment-flexible human demonstrations collected in a virtual reality simulation environment, the pipeline refines these demonstrations using optimization-based kinematic retargeting and trajectory optimization to adapt them across various robot embodiments and physical parameters. This process yields a diverse, physically consistent dataset that enables cross-embodiment data transfer, and offers the potential to reuse legacy datasets collected under different hardware configurations or physical parameters. We validate the pipeline's effectiveness by training diffusion policies from the generated datasets for challenging contact-rich manipulation tasks across multiple robot embodiments, including a floating Allegro hand and bimanual robot arms. The trained policies are deployed zero-shot on hardware for bimanual iiwa arms, achieving high success rates with minimal human input. Project website: https://lujieyang.github.io/physicsgen/.
Submitted 27 February, 2025;
originally announced February 2025.
-
Language Model Fine-Tuning on Scaled Survey Data for Predicting Distributions of Public Opinions
Authors:
Joseph Suh,
Erfan Jahanparast,
Suhong Moon,
Minwoo Kang,
Serina Chang
Abstract:
Large language models (LLMs) present novel opportunities in public opinion research by predicting survey responses in advance during the early stages of survey design. Prior methods steer LLMs via descriptions of subpopulations as LLMs' input prompt, yet such prompt engineering approaches have struggled to faithfully predict the distribution of survey responses from human subjects. In this work, we propose directly fine-tuning LLMs to predict response distributions by leveraging unique structural characteristics of survey data. To enable fine-tuning, we curate SubPOP, a significantly scaled dataset of 3,362 questions and 70K subpopulation-response pairs from well-established public opinion surveys. We show that fine-tuning on SubPOP greatly improves the match between LLM predictions and human responses across various subpopulations, reducing the LLM-human gap by up to 46% compared to baselines, and achieves strong generalization to unseen surveys and subpopulations. Our findings highlight the potential of survey-based fine-tuning to improve opinion prediction for diverse, real-world subpopulations and therefore enable more efficient survey designs. Our code is available at https://github.com/JosephJeesungSuh/subpop.
Submitted 23 February, 2025;
originally announced February 2025.
-
InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU
Authors:
Heejun Lee,
Geon Park,
Jaduk Suh,
Sung Ju Hwang
Abstract:
In modern large language models (LLMs), handling very long context lengths presents significant challenges as it causes slower inference speeds and increased memory costs. Additionally, most existing pre-trained LLMs fail to generalize beyond their original training sequence lengths. To enable efficient and practical long-context utilization, we introduce InfiniteHiP, a novel and practical LLM inference framework that accelerates processing by dynamically eliminating irrelevant context tokens through a modular hierarchical token pruning algorithm. Our method also allows generalization to longer sequences by selectively applying various RoPE adjustment methods according to the internal attention patterns within LLMs. Furthermore, we offload the key-value cache to host memory during inference, significantly reducing GPU memory pressure. As a result, InfiniteHiP enables the processing of up to 3 million tokens on a single L40s 48GB GPU -- 3x larger -- without any permanent loss of context information. Our framework achieves an 18.95x speedup in attention decoding for a 1 million token context without requiring additional training. We implement our method in the SGLang framework and demonstrate its effectiveness and practicality through extensive evaluations.
Submitted 12 February, 2025;
originally announced February 2025.
-
Encrypted Computation of Collision Probability for Secure Satellite Conjunction Analysis
Authors:
Jihoon Suh,
Michael Hibbard,
Kaoru Teranishi,
Takashi Tanaka,
Moriba Jah,
Maruthi Akella
Abstract:
The computation of collision probability ($\mathcal{P}_c$) is crucial for space environmentalism and sustainability by providing decision-making knowledge that can prevent collisions between anthropogenic space objects. However, the accuracy and precision of $\mathcal{P}_c$ computations are often compromised by limitations in computational resources and data availability. While significant improvements have been made in the computational aspects, the rising concerns regarding the privacy of collaborative data sharing can be a major limiting factor in future conjunction analysis and risk assessment, especially as the space environment grows increasingly privatized, competitive, and fraught with conflicting strategic interests. This paper argues that the importance of privacy measures in space situational awareness (SSA) is underappreciated, and regulatory and compliance measures currently in place are not sufficient by themselves, presenting a significant gap.
To address this gap, we introduce a novel encrypted architecture that leverages advanced cryptographic techniques, including homomorphic encryption (HE) and multi-party computation (MPC), to safeguard the privacy of entities computing space sustainability metrics, inter alia, $\mathcal{P}_c$. Our proposed protocol, Encrypted $\mathcal{P}_c$, integrates the Monte Carlo estimation algorithm with cryptographic solutions, enabling secure collision probability computation without exposing sensitive or proprietary information. This research advances secure conjunction analysis by developing a secure MPC protocol for $\mathcal{P}_c$ computation and highlights the need for innovative protocols to ensure a more secure and cooperative SSA landscape.
Submitted 13 January, 2025;
originally announced January 2025.
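The abstract above builds on Monte Carlo estimation of $\mathcal{P}_c$. As a plaintext illustration of that estimator only, the sketch below samples the relative position of two objects in a 2D encounter plane and counts the fraction of samples that fall within a combined hard-body radius. The isotropic Gaussian uncertainty, the 2D simplification, the function name `monte_carlo_pc`, and all numeric values are assumptions made for this sketch; the homomorphic encryption and multi-party computation layers that the paper contributes are not shown.

```python
import math
import random


def monte_carlo_pc(mean, sigma, hard_body_radius, num_samples, seed=0):
    """Plaintext Monte Carlo estimate of a collision probability.

    Samples the relative position from an isotropic 2D Gaussian with the
    given mean and standard deviation, and returns the fraction of samples
    whose miss distance is within the combined hard-body radius.
    """
    rng = random.Random(seed)
    radius_sq = hard_body_radius * hard_body_radius
    hits = 0
    for _ in range(num_samples):
        dx = rng.gauss(mean[0], sigma)
        dy = rng.gauss(mean[1], sigma)
        # This comparison is cheap in plaintext; comparisons are generally
        # among the costlier operations to carry out on encrypted data.
        if dx * dx + dy * dy <= radius_sq:
            hits += 1
    return hits / num_samples


# Hypothetical encounter: zero mean miss distance, unit sigma, unit radius.
PC_ESTIMATE = monte_carlo_pc((0.0, 0.0), 1.0, 1.0, 100_000)

# For a zero-mean isotropic 2D Gaussian the miss distance is Rayleigh
# distributed, so the exact value is 1 - exp(-R^2 / (2 * sigma^2)).
PC_EXACT = 1.0 - math.exp(-0.5)
```

In this special case the estimate can be checked against the closed form; for general covariances and offsets no such shortcut exists, which is where sampling becomes useful.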
-
AI red-teaming is a sociotechnical challenge: on values, labor, and harms
Authors:
Tarleton Gillespie,
Ryland Shaw,
Mary L. Gray,
Jina Suh
Abstract:
As generative AI technologies find more and more real-world applications, the importance of testing their performance and safety seems paramount. "Red-teaming" has quickly become the primary approach to test AI models--prioritized by AI companies, and enshrined in AI policy and regulation. Members of red teams act as adversaries, probing AI systems to test their safety mechanisms and uncover vulnerabilities. Yet we know far too little about this work or its implications. This essay calls for collaboration between computer scientists and social scientists to study the sociotechnical systems surrounding AI technologies, including the work of red-teaming, to avoid repeating the mistakes of the recent past. We highlight the importance of understanding the values and assumptions behind red-teaming, the labor arrangements involved, and the psychological impacts on red-teamers, drawing insights from the lessons learned around the work of content moderation.
Submitted 3 April, 2025; v1 submitted 12 December, 2024;
originally announced December 2024.
-
Numerical Analysis of HiPPO-LegS ODE for Deep State Space Models
Authors:
Jaesung R. Park,
Jaewook J. Suh,
Youngjoon Hong,
Ernest K. Ryu
Abstract:
In deep learning, the recently introduced state space models utilize HiPPO (High-order Polynomial Projection Operators) memory units to approximate continuous-time trajectories of input functions using ordinary differential equations (ODEs), and these techniques have shown empirical success in capturing long-range dependencies in long input sequences. However, the mathematical foundations of these ODEs, particularly the singular HiPPO-LegS (Legendre Scaled) ODE, and their corresponding numerical discretizations remain unsettled. In this work, we fill this gap by establishing that HiPPO-LegS ODE is well-posed despite its singularity, albeit without the freedom of arbitrary initial conditions. Further, we establish convergence of the associated numerical discretization schemes for Riemann integrable input functions.
Submitted 8 June, 2025; v1 submitted 11 December, 2024;
originally announced December 2024.
-
From Lived Experience to Insight: Unpacking the Psychological Risks of Using AI Conversational Agents
Authors:
Mohit Chandra,
Suchismita Naik,
Denae Ford,
Ebele Okoli,
Munmun De Choudhury,
Mahsa Ershadi,
Gonzalo Ramos,
Javier Hernandez,
Ananya Bhattacharjee,
Shahed Warreth,
Jina Suh
Abstract:
Recent gains in popularity of AI conversational agents have led to their increased use for improving productivity and supporting well-being. While previous research has aimed to understand the risks associated with interactions with AI conversational agents, these studies often fall short in capturing the lived experiences of individuals. Additionally, psychological risks have often been presented as a sub-category within broader AI-related risks in past taxonomy works, leading to under-representation of the impact of psychological risks of AI use. To address these challenges, our work presents a novel risk taxonomy focusing on psychological risks of using AI gathered through the lived experiences of individuals. We employed a mixed-method approach, involving a comprehensive survey with 283 people with lived mental health experience and workshops involving experts with lived experience to develop a psychological risk taxonomy. Our taxonomy features 19 AI behaviors, 21 negative psychological impacts, and 15 contexts related to individuals. Additionally, we propose a novel multi-path vignette-based framework for understanding the complex interplay between AI behaviors, psychological impacts, and individual user contexts. Finally, based on the feedback obtained from the workshop sessions, we present design recommendations for developing safer and more robust AI agents. Our work offers an in-depth understanding of the psychological risks associated with AI conversational agents and provides actionable recommendations for policymakers, researchers, and developers.
Submitted 29 May, 2025; v1 submitted 10 December, 2024;
originally announced December 2024.
-
Cross-Attention Head Position Patterns Can Align with Human Visual Concepts in Text-to-Image Generative Models
Authors:
Jungwon Park,
Jungmin Ko,
Dongnam Byun,
Jangwon Suh,
Wonjong Rhee
Abstract:
Recent text-to-image diffusion models leverage cross-attention layers, which have been effectively utilized to enhance a range of visual generative tasks. However, our understanding of cross-attention layers remains somewhat limited. In this study, we introduce a mechanistic interpretability approach for diffusion models by constructing Head Relevance Vectors (HRVs) that align with human-specified visual concepts. An HRV for a given visual concept has a length equal to the total number of cross-attention heads, with each element representing the importance of the corresponding head for the given visual concept. To validate HRVs as interpretable features, we develop an ordered weakening analysis that demonstrates their effectiveness. Furthermore, we propose concept strengthening and concept adjusting methods and apply them to enhance three visual generative tasks. Our results show that HRVs can reduce misinterpretations of polysemous words in image generation, successfully modify five challenging attributes in image editing, and mitigate catastrophic neglect in multi-concept generation. Overall, our work provides an advancement in understanding cross-attention layers and introduces new approaches for fine-controlling these layers at the head level.
Submitted 24 February, 2025; v1 submitted 3 December, 2024;
originally announced December 2024.
-
Is Linear Feedback on Smoothed Dynamics Sufficient for Stabilizing Contact-Rich Plans?
Authors:
Yuki Shirai,
Tong Zhao,
H. J. Terry Suh,
Huaijiang Zhu,
Xinpei Ni,
Jiuguang Wang,
Max Simchowitz,
Tao Pang
Abstract:
Designing planners and controllers for contact-rich manipulation is extremely challenging as contact violates the smoothness conditions that many gradient-based controller synthesis tools assume. Contact smoothing approximates a non-smooth system with a smooth one, allowing one to use these synthesis tools more effectively. However, applying classical control synthesis methods to smoothed contact dynamics remains relatively under-explored. This paper analyzes the efficacy of linear controller synthesis using differential simulators based on contact smoothing. We introduce natural baselines for leveraging contact smoothing to compute (a) open-loop plans robust to uncertain conditions and/or dynamics, and (b) feedback gains to stabilize around open-loop plans. Using robotic bimanual whole-body manipulation as a testbed, we perform extensive empirical experiments on over 300 trajectories and analyze why LQR seems insufficient for stabilizing contact-rich plans. The video summarizing this paper and hardware experiments is found here: https://youtu.be/HLaKi6qbwQg?si=_zCAmBBD6rGSitm9.
Submitted 14 May, 2025; v1 submitted 10 November, 2024;
originally announced November 2024.
-
Optimization Algorithm Design via Electric Circuits
Authors:
Stephen P. Boyd,
Tetiana Parshakova,
Ernest K. Ryu,
Jaewook J. Suh
Abstract:
We present a novel methodology for convex optimization algorithm design using ideas from electric RLC circuits. Given an optimization problem, the first stage of the methodology is to design an appropriate electric circuit whose continuous-time dynamics converge to the solution of the optimization problem at hand. Then, the second stage is an automated, computer-assisted discretization of the continuous-time dynamics, yielding a provably convergent discrete-time algorithm. Our methodology recovers many classical (distributed) optimization algorithms and enables users to quickly design and explore a wide range of new algorithms with convergence guarantees.
Submitted 20 January, 2025; v1 submitted 4 November, 2024;
originally announced November 2024.
-
AI on My Shoulder: Supporting Emotional Labor in Front-Office Roles with an LLM-based Empathetic Coworker
Authors:
Vedant Das Swain,
Qiuyue "Joy" Zhong,
Jash Rajesh Parekh,
Yechan Jeon,
Roy Zimmermann,
Mary Czerwinski,
Jina Suh,
Varun Mishra,
Koustuv Saha,
Javier Hernandez
Abstract:
Client-Service Representatives (CSRs) are vital to organizations. Frequent interactions with disgruntled clients, however, disrupt their mental well-being. To help CSRs regulate their emotions while interacting with uncivil clients, we designed Care-Pilot, an LLM-powered assistant, and evaluated its efficacy, perception, and use. Our comparative analyses between 665 human and Care-Pilot-generated support messages highlight Care-Pilot's ability to adapt to and demonstrate empathy in various incivility incidents. Additionally, 143 CSRs assessed Care-Pilot's empathy as more sincere and actionable than human messages. Finally, we interviewed 20 CSRs who interacted with Care-Pilot in a simulation exercise. They reported that Care-Pilot helped them avoid negative thinking, recenter thoughts, and humanize clients, showing potential for bridging gaps in coworker support. Yet, they also noted deployment challenges and emphasized the indispensability of shared experiences. We discuss future designs and societal implications of AI-mediated emotional labor, underscoring empathy as a critical function for AI assistants for worker mental health.
Submitted 27 February, 2025; v1 submitted 18 October, 2024;
originally announced November 2024.
-
AURA: Amplifying Understanding, Resilience, and Awareness for Responsible AI Content Work
Authors:
Alice Qian Zhang,
Judith Amores,
Mary L. Gray,
Mary Czerwinski,
Jina Suh
Abstract:
Behind the scenes of maintaining the safety of technology products from harmful and illegal digital content lies unrecognized human labor. The recent rise in the use of generative AI technologies and the accelerating demands to meet responsible AI (RAI) aims necessitate an increased focus on the labor behind such efforts in the age of AI. This study investigates the nature and challenges of content work that supports RAI efforts, or "RAI content work," which spans content moderation, data labeling, and red teaming, through the lived experiences of content workers. We conduct a formative survey and semi-structured interview studies to develop a conceptualization of RAI content work and a subsequent framework of recommendations for providing holistic support for content workers. We validate our recommendations through a series of workshops with content workers and derive considerations for and examples of implementing such recommendations. We discuss how our framework may guide future innovation to support the well-being and professional development of the RAI content workforce.
Submitted 2 November, 2024;
originally announced November 2024.
-
Understanding Communication Preferences of Information Workers in Engagement with Text-Based Conversational Agents
Authors:
Ananya Bhattacharjee,
Jina Suh,
Mahsa Ershadi,
Shamsi T. Iqbal,
Andrew D. Wilson,
Javier Hernandez
Abstract:
Communication traits in text-based human-AI conversations play pivotal roles in shaping user experiences and perceptions of systems. With the advancement of large language models (LLMs), it is now feasible to analyze these traits at a more granular level. In this study, we explore the preferences of information workers regarding chatbot communication traits across seven applications. Participants completed an interactive survey with adjustable sliders, through which they expressed their preferences for five key communication traits: formality, personification, empathy, sociability, and humor. Our findings reveal distinct communication preferences across different applications; for instance, there was a preference for relatively high empathy in wellbeing contexts and relatively low personification in coding. Similarities in preferences were also noted between applications such as chatbots for customer service and scheduling. These insights offer crucial design guidelines for future chatbots, emphasizing the need for nuanced trait adjustments for each application.
Submitted 1 November, 2024; v1 submitted 27 October, 2024;
originally announced October 2024.
-
Dear Diary: A randomized controlled trial of Generative AI coding tools in the workplace
Authors:
Jenna Butler,
Jina Suh,
Sankeerti Haniyur,
Constance Hadley
Abstract:
Generative AI coding tools are relatively new, and their impact on developers extends beyond traditional coding metrics, influencing beliefs about work and developers' roles in the workplace. This study aims to illuminate developers' preexisting beliefs about generative AI tools, their self perceptions, and how regular use of these tools may alter these beliefs. Using a mixed methods approach, including surveys, a randomized controlled trial, and a three week diary study, we explored the real world application of generative AI tools within a large multinational software company. Our findings reveal that the introduction and sustained use of generative AI coding tools significantly increases developers' perceptions of these tools as both useful and enjoyable. However, developers' views on the trustworthiness of AI generated code remained unchanged. We also discovered unexpected uses of these tools, such as replacing web searches and fostering creative ideation. Additionally, 84 percent of participants reported positive changes in their daily work practices, and 66 percent noted shifts in their feelings about their work, ranging from increased enthusiasm to heightened awareness of the need to stay current with technological advances. This research provides both qualitative and quantitative insights into the evolving role of generative AI in software development and offers practical recommendations for maximizing the benefits of this emerging technology, particularly in balancing the productivity gains from AI-generated code with the need for increased scrutiny and critical evaluation of its outputs.
Submitted 23 October, 2024;
originally announced October 2024.
-
Symmetry and parity in Frobenius action on cohomology
Authors:
Junecue Suh
Abstract:
We prove that the Newton polygons of Frobenius on the crystalline cohomology of proper smooth varieties satisfy a symmetry that results, in the case of projective smooth varieties, from Poincaré duality and the hard Lefschetz theorem. As a corollary, we deduce that the Betti numbers in odd degrees of any proper smooth variety over a field are even (a consequence of Hodge symmetry in characteristic zero), answering an old question of Serre. Then we give a generalization and a refinement for arbitrary varieties over finite fields, in response to later questions of Serre and of Katz.
Submitted 1 October, 2024;
originally announced October 2024.
-
Ordinary primes in Hilbert modular varieties
Authors:
Junecue Suh
Abstract:
A well-known conjecture, often attributed to Serre, asserts that any motive over any number field has infinitely many ordinary reductions (in the sense that the Newton polygon coincides with the Hodge polygon). In the case of Hilbert modular cuspforms $f$ of parallel weight $(2, \cdots , 2)$, we show how to produce more ordinary primes by using the Sato-Tate equidistribution and combining it with the Galois theory of the Hecke field. Under the assumption of stronger forms of Sato-Tate equidistribution, we get stronger (but conditional) results. In the case of higher weights, we formulate the ordinariness conjecture for submotives of the intersection cohomology of proper algebraic varieties with motivic coefficients, and verify it for the motives whose $l$-adic Galois realisations are abelian on a finite index subgroup. We get some results for Hilbert cuspforms of weight $(3, \cdots , 3)$, weaker than those for $(2, \cdots , 2)$.
Submitted 9 October, 2024; v1 submitted 1 October, 2024;
originally announced October 2024.
-
Electric Control of Polarity in Spin-Orbit Josephson Diode
Authors:
Junghyun Shin,
Jae-Ho Han,
Anjali Rathore,
Joon Sue Lee,
Seung-Bo Shim,
Jinwoong Cha,
Sunghun Park,
Junho Suh
Abstract:
The effect of spin-orbit coupling in a Josephson diode has not been elucidated due to its interplay with the complexity of Josephson devices. Here, we systematically control local electric fields in epitaxial Al-InAs Josephson junctions under in-plane magnetic fields and observe a polarity reversal of the Josephson diode. We interpret this polarity reversal as an effect of field-tunable spin-orbit coupling on nonreciprocal Josephson currents. A theoretical model, accounting for Rashba and Dresselhaus spin-orbit couplings in a planar Josephson junction containing many transverse subbands, aligns with the observed polarity reversal and its dependence on magnetic field. Our finding addresses spin-orbit control in a Josephson diode, enabling manipulation of Josephson harmonics.
Submitted 3 May, 2025; v1 submitted 26 September, 2024;
originally announced September 2024.
-
Rediscovering the Latent Dimensions of Personality with Large Language Models as Trait Descriptors
Authors:
Joseph Suh,
Suhong Moon,
Minwoo Kang,
David M. Chan
Abstract:
Assessing personality traits using large language models (LLMs) has emerged as an interesting and challenging area of research. While previous methods employ explicit questionnaires, often derived from the Big Five model of personality, we hypothesize that LLMs implicitly encode notions of personality when modeling next-token responses. To demonstrate this, we introduce a novel approach that uncovers latent personality dimensions in LLMs by applying singular value decomposition (SVD) to the log-probabilities of trait-descriptive adjectives. Our experiments show that LLMs "rediscover" core personality traits such as extraversion, agreeableness, conscientiousness, neuroticism, and openness without relying on direct questionnaire inputs, with the top-5 factors corresponding to Big Five traits explaining 74.3% of the variance in the latent space. Moreover, we can use the derived principal components to assess personality along the Big Five dimensions, and achieve improvements in average personality prediction accuracy of up to 5% over fine-tuned models, and up to 21% over direct LLM-based scoring techniques.
Submitted 15 September, 2024;
originally announced September 2024.
-
On Pseudo $B$-symmetric spacetimes and $f(\mathcal{R})$ gravity
Authors:
Young Jin Suh,
Krishnendu De,
Uday Chand De
Abstract:
This article characterizes pseudo $B$-symmetric spacetimes. We show that a pseudo $B$-symmetric spacetime admitting a Codazzi type of $B$-tensor represents a perfect fluid spacetime, and that if this spacetime satisfies the time-like convergence criterion, then it fulfills the cosmic strong energy criterion and contains pure matter. Besides, we find that in a pseudo $B$-symmetric spacetime with a Codazzi type of $B$-tensor, the electric part of the Weyl tensor vanishes and the spacetime has Riemann and Weyl compatible vector fields. Furthermore, it is established that the chosen spacetime with a Codazzi type of $B$-tensor is conformally flat and represents a Robertson-Walker spacetime. Also, we calculate the scale factor $\varPsi (t)$ for these spacetimes in a spatially flat Robertson-Walker spacetime. Finally, we study the impact of this spacetime under the $f(\mathcal{R})$ gravity scenario and deduce several energy conditions by considering a new model $f\left(\mathcal{R}\right)= e^{(\alpha\mathcal{R})}-\ln(\beta\mathcal{R})$ in which $\alpha$ and $\beta$ are positive constants.
Submitted 29 August, 2024;
originally announced August 2024.
-
TWLV-I: Analysis and Insights from Holistic Evaluation on Video Foundation Models
Authors:
Hyeongmin Lee,
Jin-Young Kim,
Kyungjune Baek,
Jihwan Kim,
Hyojun Go,
Seongsu Ha,
Seokjin Han,
Jiho Jang,
Raehyuk Jung,
Daewoo Kim,
GeunOh Kim,
JongMok Kim,
Jongseok Kim,
Junwan Kim,
Soonwoo Kwon,
Jangwon Lee,
Seungjoon Park,
Minjoon Seo,
Jay Suh,
Jaehyuk Yi,
Aiden Lee
Abstract:
In this work, we discuss evaluating video foundation models in a fair and robust manner. Unlike language or image foundation models, many video foundation models are evaluated with differing parameters (such as sampling rate, number of frames, pretraining steps, etc.), making fair and robust comparisons challenging. Therefore, we present a carefully designed evaluation framework for measuring two core capabilities of video comprehension: appearance and motion understanding. Our findings reveal that existing video foundation models, whether text-supervised like UMT or InternVideo2, or self-supervised like V-JEPA, exhibit limitations in at least one of these capabilities. As an alternative, we introduce TWLV-I, a new video foundation model that constructs robust visual representations for both motion- and appearance-based videos. Based on the average top-1 accuracy of linear probing on five action recognition benchmarks, pretrained only on publicly accessible datasets, our model shows a 4.6%p improvement compared to V-JEPA (ViT-L) and a 7.7%p improvement compared to UMT (ViT-L). Even when compared to much larger models, our model demonstrates a 7.2%p improvement compared to DFN (ViT-H), a 2.7%p improvement compared to V-JEPA (ViT-H) and a 2.8%p improvement compared to InternVideo2 (ViT-g). We provide embedding vectors obtained by TWLV-I from videos of several commonly used video benchmarks, along with evaluation source code that can directly utilize these embeddings. The code is available at https://github.com/twelvelabs-io/video-embeddings-evaluation-framework.
Submitted 22 August, 2024; v1 submitted 20 August, 2024;
originally announced August 2024.
-
Improving Domain-Specific ASR with LLM-Generated Contextual Descriptions
Authors:
Jiwon Suh,
Injae Na,
Woohwan Jung
Abstract:
End-to-end automatic speech recognition (E2E ASR) systems have significantly improved speech recognition through training on extensive datasets. Despite these advancements, they still struggle to accurately recognize domain-specific words, such as proper nouns and technical terms. To address this problem, we propose a method to utilize the state-of-the-art Whisper without modifying its architecture, preserving its generalization performance while enabling it to leverage descriptions effectively. Moreover, we propose two additional training techniques to improve domain-specific ASR: decoder fine-tuning and context perturbation. We also propose a method to use a Large Language Model (LLM) to generate descriptions from simple metadata when descriptions are unavailable. Our experiments demonstrate that the proposed methods notably enhance domain-specific ASR accuracy on real-life datasets, with LLM-generated descriptions outperforming human-crafted ones in effectiveness.
Submitted 25 July, 2024;
originally announced July 2024.
-
The Human Factor in AI Red Teaming: Perspectives from Social and Collaborative Computing
Authors:
Alice Qian Zhang,
Ryland Shaw,
Jacy Reese Anthis,
Ashlee Milton,
Emily Tseng,
Jina Suh,
Lama Ahmad,
Ram Shankar Siva Kumar,
Julian Posada,
Benjamin Shestakofsky,
Sarah T. Roberts,
Mary L. Gray
Abstract:
Rapid progress in general-purpose AI has sparked significant interest in "red teaming," a practice of adversarial testing originating in military and cybersecurity applications. AI red teaming raises many questions about the human factor, such as how red teamers are selected, biases and blindspots in how tests are conducted, and harmful content's psychological effects on red teamers. A growing body of HCI and CSCW literature examines related practices, including data labeling, content moderation, and algorithmic auditing. However, few, if any, have investigated red teaming itself. Future studies may explore topics ranging from fairness to mental health and other areas of potential harm. We aim to facilitate a community of researchers and practitioners who can begin to meet these challenges with creativity, innovation, and thoughtful reflection.
Submitted 11 September, 2024; v1 submitted 10 July, 2024;
originally announced July 2024.
-
Virtual Personas for Language Models via an Anthology of Backstories
Authors:
Suhong Moon,
Marwa Abdulhai,
Minwoo Kang,
Joseph Suh,
Widyadewi Soedarmadji,
Eran Kohen Behar,
David M. Chan
Abstract:
Large language models (LLMs) are trained from vast repositories of text authored by millions of distinct authors, reflecting an enormous diversity of human traits. While these models bear the potential to be used as approximations of human subjects in behavioral studies, prior efforts have been limited in steering model responses to match individual human users. In this work, we introduce "Anthology", a method for conditioning LLMs to particular virtual personas by harnessing open-ended life narratives, which we refer to as "backstories." We show that our methodology enhances the consistency and reliability of experimental outcomes while ensuring better representation of diverse sub-populations. Across three nationally representative human surveys conducted as part of Pew Research Center's American Trends Panel (ATP), we demonstrate that Anthology achieves up to 18% improvement in matching the response distributions of human respondents and 27% improvement in consistency metrics. Our code and generated backstories are available at https://github.com/CannyLab/anthology.
Submitted 1 November, 2024; v1 submitted 9 July, 2024;
originally announced July 2024.
-
A Training-free Sub-quadratic Cost Transformer Model Serving Framework With Hierarchically Pruned Attention
Authors:
Heejun Lee,
Geon Park,
Youngwan Lee,
Jaduk Suh,
Jina Kim,
Wonyoung Jeong,
Bumsik Kim,
Hyemin Lee,
Myeongjae Jeon,
Sung Ju Hwang
Abstract:
In modern large language models (LLMs), increasing the context length is crucial for improving comprehension and coherence in long-context, multi-modal, and retrieval-augmented language generation. While many recent transformer models attempt to extend their context length over a million tokens, they remain impractical due to the quadratic time and space complexities. Although recent works on linear and sparse attention mechanisms can achieve this goal, their real-world applicability is often limited by the need to re-train from scratch and significantly worse performance. In response, we propose a novel approach, Hierarchically Pruned Attention (HiP), which reduces the time complexity of the attention mechanism to $O(T \log T)$ and the space complexity to $O(T)$, where $T$ is the sequence length. We notice a pattern in the attention scores of pretrained LLMs where tokens close together tend to have similar scores, which we call "attention locality". Based on this observation, we utilize a novel tree-search-like algorithm that estimates the top-$k$ key tokens for a given query on the fly, which is mathematically guaranteed to have better performance than random attention pruning. In addition to improving the time complexity of the attention mechanism, we further optimize GPU memory usage by implementing KV cache offloading, which stores only $O(\log T)$ tokens on the GPU while maintaining similar decoding throughput. Experiments on benchmarks show that HiP, with its training-free nature, significantly reduces both prefill and decoding latencies, as well as memory usage, while maintaining high-quality generation with minimal degradation. HiP enables pretrained LLMs to scale up to millions of tokens on commodity GPUs, potentially unlocking long-context LLM applications previously deemed infeasible.
Submitted 23 January, 2025; v1 submitted 14 June, 2024;
originally announced June 2024.
-
Coherent control of a triangular exchange-only spin qubit
Authors:
Edwin Acuna,
Joseph D. Broz,
Kaushal Shyamsundar,
Antonio B. Mei,
Colin P. Feeney,
Valerie Smetanka,
Tiffany Davis,
Kangmu Lee,
Maxwell D. Choi,
Brydon Boyd,
June Suh,
Wonill D. Ha,
Cameron Jennings,
Andrew S. Pan,
Daniel S. Sanchez,
Matthew D. Reed,
Jason R. Petta
Abstract:
We demonstrate coherent control of a three-electron exchange-only spin qubit with the quantum dots arranged in a close-packed triangular geometry. The device is tuned to confine one electron in each quantum dot, as evidenced by pairwise charge stability diagrams. Time-domain control of the exchange coupling is demonstrated and qubit performance is characterized using blind randomized benchmarking, with an average single-qubit gate fidelity F = 99.84%. The compact triangular device geometry can be readily scaled to larger two-dimensional quantum dot arrays with high connectivity.
Submitted 5 June, 2024;
originally announced June 2024.
-
Diamond molecular balance: Revolutionizing high-resolution mass spectrometry from MDa to TDa at room temperature
Authors:
Donggeun Lee,
Seung-Woo Jeon,
Chang-Hwan Yi,
Yang-Hee Kim,
Yeeun Choi,
Sang-Hun Lee,
Jinwoong Cha,
Seung-Bo Shim,
Junho Suh,
Il-Young Kim,
Dongyeon Daniel Kang,
Hojoong Jung,
Cherlhyun Jeong,
Jae-pyoung Ahn,
Hee Chul Park,
Sang-Wook Han,
Chulki Kim
Abstract:
The significance of mass spectrometry lies in its unparalleled ability to accurately identify and quantify molecules in complex samples, providing invaluable insights into molecular structures and interactions. Here, we leverage diamond nanostructures as highly sensitive mass sensors by utilizing a self-excitation mechanism under an electron beam in a conventional scanning electron microscope (SEM). The diamond molecular balance (DMB) exhibits an exceptional mass resolution of 0.36 MDa, based on its outstanding mechanical quality factor and frequency stability, along with an extensive dynamic range from MDa to TDa. This positions the DMB at the forefront of molecular balances operating at room temperature. Notably, the DMB demonstrates its ability to measure the mass of a single bacteriophage T4 by precisely locating the analyte on the device. These findings highlight the groundbreaking potential of the DMB as a revolutionary tool for mass spectrometry at room temperature.
Submitted 25 July, 2024; v1 submitted 4 June, 2024;
originally announced June 2024.
-
Pegasus-v1 Technical Report
Authors:
Raehyuk Jung,
Hyojun Go,
Jaehyuk Yi,
Jiho Jang,
Daniel Kim,
Jay Suh,
Aiden Lee,
Cooper Han,
Jae Lee,
Jeff Kim,
Jin-Young Kim,
Junwan Kim,
Kyle Park,
Lucas Lee,
Mars Ha,
Minjoon Seo,
Abraham Jo,
Ed Park,
Hassan Kianinejad,
SJ Kim,
Tony Moon,
Wade Jeong,
Andrei Popescu,
Esther Kim,
EK Yoon
, et al. (19 additional authors not shown)
Abstract:
This technical report introduces Pegasus-1, a multimodal language model specialized in video content understanding and interaction through natural language. Pegasus-1 is designed to address the unique challenges posed by video data, such as interpreting spatiotemporal information, to offer nuanced video content comprehension across various lengths. This technical report overviews Pegasus-1's architecture, training strategies, and its performance in benchmarks on video conversation, zero-shot video question answering, and video summarization. We also explore qualitative characteristics of Pegasus-1, demonstrating its capabilities as well as its limitations, in order to provide readers a balanced view of its current state and its future direction.
Submitted 22 April, 2024;
originally announced April 2024.
-
Optimal Acceleration for Minimax and Fixed-Point Problems is Not Unique
Authors:
TaeHo Yoon,
Jaeyeon Kim,
Jaewook J. Suh,
Ernest K. Ryu
Abstract:
Recently, accelerated algorithms using the anchoring mechanism for minimax optimization and fixed-point problems have been proposed, and matching complexity lower bounds establish their optimality. In this work, we present the surprising observation that the optimal acceleration mechanism in minimax optimization and fixed-point problems is not unique. Our new algorithms achieve exactly the same worst-case convergence rates as existing anchor-based methods while using materially different acceleration mechanisms. Specifically, these new algorithms are dual to the prior anchor-based accelerated methods in the sense of H-duality. This finding opens a new avenue of research on accelerated algorithms since we now have a family of methods that empirically exhibit varied characteristics while having the same optimal worst-case guarantee.
Submitted 23 April, 2024; v1 submitted 19 April, 2024;
originally announced April 2024.
-
MathVC: An LLM-Simulated Multi-Character Virtual Classroom for Mathematics Education
Authors:
Murong Yue,
Wenhan Lyu,
Wijdane Mifdal,
Jennifer Suh,
Yixuan Zhang,
Ziyu Yao
Abstract:
Mathematical modeling (MM) is considered a fundamental skill for students in STEM disciplines. Practicing the MM skill is often the most effective when students can engage in group discussion and collaborative problem-solving. However, due to unevenly distributed teachers and educational resources needed to monitor such group activities, students do not always receive equal opportunities for this practice. Excitingly, large language models (LLMs) have recently demonstrated strong capability in both modeling mathematical problems and simulating characters with different traits and properties. Drawing inspiration from the advancement of LLMs, in this work, we present MATHVC, the very first LLM-powered virtual classroom containing multiple LLM-simulated student characters, with whom a human student can practice their MM skill. To encourage each LLM character's behaviors to be aligned with their specified math-relevant properties (termed "characteristics alignment") and the overall conversational procedure to be close to an authentic student MM discussion (termed "conversational procedural alignment"), we proposed three innovations: integrating MM domain knowledge into the simulation, defining a symbolic schema as the ground for character simulation, and designing a meta planner at the platform level to drive the conversational procedure. Through experiments and ablation studies, we confirmed the effectiveness of our simulation approach and showed the promise for MATHVC to benefit real-life students in the future.
Submitted 29 January, 2025; v1 submitted 9 April, 2024;
originally announced April 2024.
-
Large Language Models are Capable of Offering Cognitive Reappraisal, if Guided
Authors:
Hongli Zhan,
Allen Zheng,
Yoon Kyung Lee,
Jina Suh,
Junyi Jessy Li,
Desmond C. Ong
Abstract:
Large language models (LLMs) have offered new opportunities for emotional support, and recent work has shown that they can produce empathic responses to people in distress. However, long-term mental well-being requires emotional self-regulation, where a one-time empathic response falls short. This work takes a first step by engaging with cognitive reappraisals, a strategy from psychology practitioners that uses language to targetedly change negative appraisals that an individual makes of the situation; such appraisals are known to sit at the root of human emotional experience. We hypothesize that psychologically grounded principles could enable such advanced psychology capabilities in LLMs, and design RESORT, which consists of a series of reappraisal constitutions across multiple dimensions that can be used as LLM instructions. We conduct a first-of-its-kind expert evaluation (by clinical psychologists with M.S. or Ph.D. degrees) of an LLM's zero-shot ability to generate cognitive reappraisal responses to medium-length social media messages asking for support. This fine-grained evaluation showed that even LLMs at the 7B scale guided by RESORT are capable of generating empathic responses that can help users reappraise their situations.
Submitted 8 August, 2024; v1 submitted 1 April, 2024;
originally announced April 2024.
-
Large Language Models Produce Responses Perceived to be Empathic
Authors:
Yoon Kyung Lee,
Jina Suh,
Hongli Zhan,
Junyi Jessy Li,
Desmond C. Ong
Abstract:
Large Language Models (LLMs) have demonstrated surprising performance on many tasks, including writing supportive messages that display empathy. Here, we had these models generate empathic messages in response to posts describing common life experiences, such as workplace situations, parenting, relationships, and other anxiety- and anger-eliciting situations. Across two studies (N=192, 202), we showed human raters a variety of responses written by several models (GPT4 Turbo, Llama2, and Mistral), and had people rate these responses on how empathic they seemed to be. We found that LLM-generated responses were consistently rated as more empathic than human-written responses. Linguistic analyses also show that these models write in distinct, predictable ``styles'', in terms of their use of punctuation, emojis, and certain words. These results highlight the potential of using LLMs to enhance human peer support in contexts where empathy is important.
△ Less
Submitted 26 March, 2024;
originally announced March 2024.
-
DISCERN: Designing Decision Support Interfaces to Investigate the Complexities of Workplace Social Decision-Making With Line Managers
Authors:
Pranav Khadpe,
Lindy Le,
Kate Nowak,
Shamsi T. Iqbal,
Jina Suh
Abstract:
Line managers form the first level of management in organizations, and must make complex decisions, while maintaining relationships with those impacted by their decisions. Amidst growing interest in technology-supported decision-making at work, their needs remain understudied. Further, most existing design knowledge for supporting social decision-making comes from domains where decision-makers are more socially detached from those they decide for. We conducted iterative design research with line managers within a technology organization, investigating decision-making practices, and opportunities for technological support. Through formative research, development of a decision-representation tool -- DISCERN -- and user enactments, we identify their communication and analysis needs that lack adequate support. We found they preferred tools for externalizing reasoning rather than tools that replace interpersonal interactions, and they wanted tools to support a range of intuitive and calculative decision-making. We discuss how design of social decision-making supports, especially in the workplace, can more explicitly support highly interactional social decision-making.
Submitted 29 February, 2024;
originally announced February 2024.
-
Impact of projective curvature tensor in $f\left(R,G\right)$, $f\left(R,T\right)$ and $f\left(R,L_{m}\right)$-gravity
Authors:
Young Jin Suh,
Krishnendu De,
Uday Chand De
Abstract:
This article is concerned with the characterization of a spacetime and modified gravity, such as $f\left(R,G\right)$, $f\left(R,T\right)$ and $f\left(R,L_{m}\right)$-gravity equipped with the projective curvature tensor. We establish that a projectively flat perfect fluid spacetime represents a dark energy era. Also, we prove that a projectively flat perfect fluid spacetime is either locally isometric to Minkowski spacetime or a de-Sitter spacetime. Furthermore, it is shown that a perfect fluid spacetime permitting harmonic projective curvature tensor becomes a generalized Robertson-Walker spacetime and is of Petrov type $I$, $D$ or $O$. Lastly, we investigate the effect of projectively flat perfect fluid spacetime solutions in $f\left(R,G\right)$, $f\left(R,T\right)$ and $f\left(R,L_{m}\right)$-gravity, respectively. We also investigate the spacetime as an $f\left(R,G\right)$-gravity solution and use the flat Friedmann-Robertson-Walker metric to establish a relation among jerk, snap, and deceleration parameters. Numerous energy conditions are studied in terms of Ricci scalar with the model $f\left(R,G\right)=\exp(R)+α\left(6G\right)^β$. For this model, the strong energy condition is violated but the weak, dominant and null energy conditions are fulfilled, which is in excellent accordance with current observational investigations that show the universe is now accelerating.
Submitted 24 February, 2024;
originally announced February 2024.
-
IMBUE: Improving Interpersonal Effectiveness through Simulation and Just-in-time Feedback with Human-Language Model Interaction
Authors:
Inna Wanyin Lin,
Ashish Sharma,
Christopher Michael Rytting,
Adam S. Miner,
Jina Suh,
Tim Althoff
Abstract:
Navigating certain communication situations can be challenging due to individuals' lack of skills and the interference of strong emotions. However, effective learning opportunities are rarely accessible. In this work, we conduct a human-centered study that uses language models to simulate bespoke communication training and provide just-in-time feedback to support the practice and learning of interpersonal effectiveness skills. We apply the interpersonal effectiveness framework from Dialectical Behavioral Therapy (DBT), DEAR MAN, which focuses on both conversational and emotional skills. We present IMBUE, an interactive training system that provides feedback 25% more similar to experts' feedback, compared to that generated by GPT-4. IMBUE is the first to focus on communication skills and emotion management simultaneously, incorporate experts' domain knowledge in providing feedback, and be grounded in psychology theory. Through a randomized trial of 86 participants, we find that IMBUE's simulation-only variant significantly improves participants' self-efficacy (up to 17%) and reduces negative emotions (up to 25%). With IMBUE's additional just-in-time feedback, participants demonstrate 17% improvement in skill mastery, along with greater enhancements in self-efficacy (27% more) and reduction of negative emotions (16% more) compared to simulation-only. The improvement in skill mastery is the only measure that is transferred to new and more difficult situations; situation-specific training is necessary for improving self-efficacy and emotion reduction.
Submitted 19 February, 2024;
originally announced February 2024.
-
Making a prototype of Seoul historical sites chatbot using Langchain
Authors:
Jae Young Suh,
Minsoo Kwak,
Soo Yong Kim,
Hyoungseo Cho
Abstract:
In this paper, we are going to share a draft of the development of a conversational agent created to disseminate information about historical sites located in Seoul. The primary objective of the agent is to increase awareness among visitors who are not familiar with Seoul, about the presence and precise locations of valuable cultural heritage sites. It aims to promote a basic understanding of Korea's rich and diverse cultural history. The agent is thoughtfully designed for accessibility in English and utilizes data generously provided by the Seoul Metropolitan Government. Despite the limited data volume, it consistently delivers reliable and accurate responses, seamlessly aligning with the available information. We have meticulously detailed the methodologies employed in creating this agent and provided a comprehensive overview of its underlying structure within the paper. Additionally, we delve into potential improvements to enhance this initial version of the system, with a primary emphasis on expanding the available data through our prompting. In conclusion, we provide an in-depth discussion of our expectations regarding the future impact of this agent in promoting and facilitating the sharing of historical sites.
Submitted 10 February, 2024;
originally announced February 2024.
-
Effect of trivial bands on chiral anomaly-induced longitudinal magnetoconductivity in Weyl semimetals
Authors:
Jeonghyeon Suh,
Hongki Min
Abstract:
Including the effect of the trivial band near Weyl nodes, we evaluate the longitudinal magnetoconductivity (LMC) of Weyl semimetals along the magnetic field direction using the Boltzmann magnetotransport theory, and study its dependence on the magnetic field, Fermi energy, and temperature. We find that for weak internode and node-trivial band scatterings, the LMC is quadratic in the magnetic field and is inversely proportional to the fourth power of the Fermi energy at high densities due to internode scatterings, and to the square of the Fermi energy at low densities due to scatterings between a Weyl node and the trivial band. In the case of strong internode and node-trivial band scatterings, the magnetic field-driven anisotropy induced by the phase-space volume element and the orbital magnetic moment cannot be neglected. As a result, the LMC exhibits a significantly different trend compared to that in the weak internode and node-trivial band scattering limit. Finally, we calculate the temperature dependence of the LMC in the strong inelastic scattering limit and obtain its asymptotic behaviors at low and high temperatures, respectively, demonstrating that the temperature dependence is strongly affected by the existence of the trivial band.
Submitted 24 January, 2024;
originally announced January 2024.
-
From User Surveys to Telemetry-Driven AI Agents: Exploring the Potential of Personalized Productivity Solutions
Authors:
Subigya Nepal,
Javier Hernandez,
Talie Massachi,
Kael Rowan,
Judith Amores,
Jina Suh,
Gonzalo Ramos,
Brian Houck,
Shamsi T. Iqbal,
Mary Czerwinski
Abstract:
Information workers increasingly struggle with productivity challenges in modern workplaces, facing difficulties in managing time and effectively utilizing workplace analytics data for behavioral improvement. Despite the availability of productivity metrics through enterprise tools, workers often fail to translate this data into actionable insights. We present a comprehensive, user-centric approach to address these challenges through AI-based productivity agents tailored to users' needs. Utilizing a two-phase method, we first conducted a survey with 363 participants, exploring various aspects of productivity, communication style, agent approach, personality traits, personalization, and privacy. Drawing on the survey insights, we developed a GPT-4 powered personalized productivity agent that utilizes telemetry data gathered via Viva Insights from information workers to provide tailored assistance. We compared its performance with alternative productivity-assistive tools, such as dashboard and narrative, in a study involving 40 participants. Our findings highlight the importance of user-centric design, adaptability, and the balance between personalization and privacy in AI-assisted productivity tools. By building on these insights, our work provides important guidance for developing more effective productivity solutions, ultimately leading to optimized efficiency and user experiences for information workers.
Submitted 5 June, 2025; v1 submitted 16 January, 2024;
originally announced January 2024.
-
Youth WellTech: A Global Remote Co-Design Sprint for Youth Mental Health Technology
Authors:
Kenji Phang,
Siddharth Saarathi Pradhan,
Chino Ikwuegbu,
Gonzalo Ramos,
Denae Ford,
Ebele Okoli,
Salman Muin Kayser Chishti,
Jina Suh
Abstract:
Mental health is a pressing concern in today's digital age, particularly among youth who are deeply intertwined with technology. Despite the influx of technology solutions addressing mental health issues, youth often remain sidelined during the design process. While co-design methods have been employed to improve participation by youth, many such initiatives are limited to design activities and lack training for youth to research and develop solutions for themselves. In this case study, we detail our 8-week remote, collaborative research initiative called Youth WellTech, designed to facilitate remote co-design sprints aimed at equipping youth with the tools and knowledge to envision and design tech futures for their own communities. We pilot this initiative with 12 student technology evangelists across 8 countries globally to foster the sharing of mental health challenges and diverse perspectives. We highlight insights from our experiences running this global program remotely, its structure, and recommendations for co-research.
Submitted 11 January, 2024;
originally announced January 2024.
-
Enhancing Contrastive Learning with Efficient Combinatorial Positive Pairing
Authors:
Jaeill Kim,
Duhun Hwang,
Eunjung Lee,
Jangwon Suh,
Jimyeong Kim,
Wonjong Rhee
Abstract:
In the past few years, contrastive learning has played a central role in the success of visual unsupervised representation learning. Around the same time, high-performance non-contrastive learning methods have been developed as well. While most of the works utilize only two views, we carefully review the existing multi-view methods and propose a general multi-view strategy that can improve learning speed and performance of any contrastive or non-contrastive method. We first analyze CMC's full-graph paradigm and empirically show that the learning speed of $K$-views can be increased by $_{K}\mathrm{C}_{2}$ times for small learning rate and early training. Then, we upgrade CMC's full-graph by mixing views created by a crop-only augmentation, adopting small-size views as in SwAV multi-crop, and modifying the negative sampling. The resulting multi-view strategy is called ECPP (Efficient Combinatorial Positive Pairing). We investigate the effectiveness of ECPP by applying it to SimCLR and assessing the linear evaluation performance for CIFAR-10 and ImageNet-100. For each benchmark, we achieve a state-of-the-art performance. In the case of ImageNet-100, ECPP-boosted SimCLR outperforms supervised learning.
Submitted 11 January, 2024;
originally announced January 2024.
-
"I Want It That Way": Enabling Interactive Decision Support Using Large Language Models and Constraint Programming
Authors:
Connor Lawless,
Jakob Schoeffer,
Lindy Le,
Kael Rowan,
Shilad Sen,
Cristina St. Hill,
Jina Suh,
Bahareh Sarrafzadeh
Abstract:
A critical factor in the success of decision support systems is the accurate modeling of user preferences. Psychology research has demonstrated that users often develop their preferences during the elicitation process, highlighting the pivotal role of system-user interaction in developing personalized systems. This paper introduces a novel approach, combining Large Language Models (LLMs) with Constraint Programming to facilitate interactive decision support. We study this hybrid framework through the lens of meeting scheduling, a time-consuming daily activity faced by a multitude of information workers. We conduct three studies to evaluate the novel framework, including a diary study (n=64) to characterize contextual scheduling preferences, a quantitative evaluation of the system's performance, and a user study (n=10) with a prototype system. Our work highlights the potential for a hybrid LLM and optimization approach for iterative preference elicitation and design considerations for building systems that support human-system collaborative decision-making processes.
Submitted 1 October, 2024; v1 submitted 11 December, 2023;
originally announced December 2023.