-
Feedstack: Layering Structured Representations over Unstructured Feedback to Scaffold Human AI Conversation
Authors:
Hannah Vy Nguyen,
Yu-Chun Grace Yen,
Omar Shakir,
Hang Huynh,
Sebastian Gutierrez,
June A. Smith,
Sheila Jimenez,
Salma Abdelgelil,
Stephen MacNeil
Abstract:
Many conversational user interfaces facilitate linear conversations with turn-based dialogue, similar to face-to-face conversations between people. However, digital conversations can afford more than simple back-and-forth; they can be layered with interaction techniques and structured representations that scaffold exploration, reflection, and shared understanding between users and AI systems. We introduce Feedstack, a speculative interface that augments feedback conversations with layered affordances for organizing, navigating, and externalizing feedback. These layered structures serve as a shared representation of the conversation that can surface user intent and reveal underlying design principles. This work represents an early exploration of this vision using a research-through-design approach. We describe system features and design rationale, and present insights from two formative studies (n=8, n=8) that examine how novice designers engage with these layered supports. Rather than presenting a conclusive evaluation, we reflect on Feedstack as a design probe that opens up new directions for conversational feedback systems.
Submitted 3 June, 2025;
originally announced June 2025.
-
Predictive Anchoring: A Novel Interaction to Support Contextualized Suggestions for Grid Displays
Authors:
Cynthia Zastudil,
Christine Holyfield,
June A. Smith,
Hannah Vy Nguyen,
Stephen MacNeil
Abstract:
Grid displays are the most common form of augmentative and alternative communication (AAC) device recommended by speech-language pathologists for children. Grid displays present a large variety of vocabulary, which can be beneficial for a user's language development. However, the extensive navigation and cognitive overhead required of users of grid displays can negatively impact users' ability to actively participate in social interactions, which is an important factor in their language development. We present a novel interaction technique for grid displays, Predictive Anchoring, based on user interaction theory and language development theory. Our design is informed by existing literature in AAC research, presented in the form of a set of design goals and a preliminary design sketch. Future work in user studies and interaction design is also discussed.
Submitted 20 August, 2024;
originally announced August 2024.
-
AI-guided inverse design and discovery of recyclable vitrimeric polymers
Authors:
Yiwen Zheng,
Prakash Thakolkaran,
Agni K. Biswal,
Jake A. Smith,
Ziheng Lu,
Shuxin Zheng,
Bichlien H. Nguyen,
Siddhant Kumar,
Aniruddh Vashisth
Abstract:
Vitrimers are a new, exciting class of sustainable polymers that can heal thanks to their dynamic covalent adaptive networks, which can undergo associative rearrangement reactions. However, a limited choice of constituent molecules restricts their property space, prohibiting full realization of their potential applications. To overcome this challenge, we couple molecular dynamics (MD) simulations and a novel graph variational autoencoder (VAE) machine learning model for inverse design of vitrimer chemistries with desired glass transition temperature (Tg) and synthesize a novel vitrimer polymer. We build the first vitrimer dataset of one million chemistries and calculate Tg on 8,424 of them by high-throughput MD simulations calibrated by a Gaussian process model. The proposed novel VAE employs dual graph encoders and a latent dimension overlapping scheme which allows for individual representation of multi-component vitrimers. By constructing a continuous latent space containing necessary information of vitrimers, we demonstrate high accuracy and efficiency of our framework in discovering novel vitrimers with desirable Tg beyond the training regime. To validate the effectiveness of our framework in experiments, we generate novel vitrimer chemistries with a target Tg = 323 K. By incorporating chemical intuition, we synthesize a vitrimer with Tg of 311-317 K, and experimentally demonstrate healability and flowability. The proposed framework offers an exciting tool for polymer chemists to design and synthesize novel, sustainable vitrimer polymers for a range of applications.
Submitted 6 September, 2024; v1 submitted 6 December, 2023;
originally announced December 2023.
-
Why Target Networks Stabilise Temporal Difference Methods
Authors:
Mattie Fellows,
Matthew J. A. Smith,
Shimon Whiteson
Abstract:
Integral to recent successes in deep reinforcement learning has been a class of temporal difference methods that use infrequently updated target values for policy evaluation in a Markov Decision Process. Yet a complete theoretical explanation for the effectiveness of target networks remains elusive. In this work, we provide an analysis of this popular class of algorithms, to finally answer the question: "why do target networks stabilise TD learning?" To do so, we formalise the notion of a partially fitted policy evaluation method, which describes the use of target networks and bridges the gap between fitted methods and semigradient temporal difference algorithms. Using this framework we are able to uniquely characterise the so-called deadly triad - the use of TD updates with (nonlinear) function approximation and off-policy data - which often leads to nonconvergent algorithms. This insight leads us to conclude that the use of target networks can mitigate the effects of poor conditioning in the Jacobian of the TD update. Instead, we show that under mild regularity conditions and a well tuned target network update frequency, convergence can be guaranteed even in the extremely challenging off-policy sampling and nonlinear function approximation setting.
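As a rough illustration of what "infrequently updated target values" means in practice, the sketch below runs semi-gradient TD(0) policy evaluation with a separate, periodically synchronised copy of the weights. It is a minimal sketch under stated assumptions, not the partially fitted framework analysed in the paper: the function name, the linear value function, the two-state example, and all hyperparameters (`gamma`, `lr`, `sync_every`, `sweeps`) are hypothetical choices made for illustration.

```python
def td_with_target_network(transitions, features, gamma=0.9, lr=0.1,
                           sync_every=10, sweeps=2000):
    """Semi-gradient TD(0) with a linear value function and a target copy.

    transitions: list of (state, reward, next_state) tuples (hypothetical data).
    features:    dict mapping each state to its feature vector (list of floats).
    """
    dim = len(next(iter(features.values())))
    weights = [0.0] * dim      # online parameters, updated on every step
    target = list(weights)     # target parameters, refreshed only occasionally
    step = 0

    def value(w, state):
        return sum(wi * fi for wi, fi in zip(w, features[state]))

    for _ in range(sweeps):
        for state, reward, next_state in transitions:
            # The bootstrap target uses the frozen target parameters.
            td_target = reward + gamma * value(target, next_state)
            error = td_target - value(weights, state)
            weights = [wi + lr * error * fi
                       for wi, fi in zip(weights, features[state])]
            step += 1
            if step % sync_every == 0:
                target = list(weights)  # infrequent target update
    return weights


# Hypothetical two-state chain: A -> B with reward 1, B -> A with reward 0.
features = {"A": [1.0, 0.0], "B": [0.0, 1.0]}
transitions = [("A", 1.0, "B"), ("B", 0.0, "A")]
learned = td_with_target_network(transitions, features)
```

With one-hot features the estimates should approach the exact values V(A) = 1 / (1 - gamma^2) and V(B) = gamma * V(A); the `sync_every` parameter plays the role of the target network update frequency mentioned in the abstract.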
Submitted 11 August, 2023; v1 submitted 24 February, 2023;
originally announced February 2023.
-
Learning to correct spectral methods for simulating turbulent flows
Authors:
Gideon Dresdner,
Dmitrii Kochkov,
Peter Norgaard,
Leonardo Zepeda-Núñez,
Jamie A. Smith,
Michael P. Brenner,
Stephan Hoyer
Abstract:
Despite their ubiquity throughout science and engineering, only a handful of partial differential equations (PDEs) have analytical, or closed-form, solutions. This motivates a vast amount of classical work on numerical simulation of PDEs and, more recently, a whirlwind of research into data-driven techniques leveraging machine learning (ML). A recent line of work indicates that a hybrid of classical numerical techniques and machine learning can offer significant improvements over either approach alone. In this work, we show that the choice of the numerical scheme is crucial when incorporating physics-based priors. We build upon Fourier-based spectral methods, which are known to be more efficient than other numerical schemes for simulating PDEs with smooth and periodic solutions. Specifically, we develop ML-augmented spectral solvers for three common PDEs of fluid dynamics. Our models are more accurate (2-4x) than standard spectral solvers at the same resolution but have longer overall runtimes (~2x), due to the additional runtime cost of the neural network component. We also demonstrate a handful of key design principles for combining machine learning and numerical methods for solving PDEs.
Submitted 25 June, 2023; v1 submitted 1 July, 2022;
originally announced July 2022.
-
Synchronous Unsupervised STDP Learning with Stochastic STT-MRAM Switching
Authors:
Peng Zhou,
Julie A. Smith,
Laura Deremo,
Stephen K. Heinrich-Barna,
Joseph S. Friedman
Abstract:
The use of analog resistance states for storing weights in neuromorphic systems is impeded by fabrication imprecision and device stochasticity that limit the precision of synapse weights. This challenge can be resolved by emulating analog behavior with the stochastic switching of the binary states of spin-transfer torque magnetoresistive random-access memory (STT-MRAM). However, previous approaches based on STT-MRAM operate in an asynchronous manner that is difficult to implement experimentally. This paper proposes a synchronous spiking neural network system with clocked circuits that perform unsupervised learning leveraging the stochastic switching of STT-MRAM. The proposed system enables a single-layer network to achieve 90% inference accuracy on the MNIST dataset.
Submitted 10 December, 2021;
originally announced December 2021.
-
Variational Data Assimilation with a Learned Inverse Observation Operator
Authors:
Thomas Frerix,
Dmitrii Kochkov,
Jamie A. Smith,
Daniel Cremers,
Michael P. Brenner,
Stephan Hoyer
Abstract:
Variational data assimilation optimizes for an initial state of a dynamical system such that its evolution fits observational data. The physical model can subsequently be evolved into the future to make predictions. This principle is a cornerstone of large scale forecasting applications such as numerical weather prediction. As such, it is implemented in current operational systems of weather forecasting agencies across the globe. However, finding a good initial state poses a difficult optimization problem in part due to the non-invertible relationship between physical states and their corresponding observations. We learn a mapping from observational data to physical states and show how it can be used to improve optimizability. We employ this mapping in two ways: to better initialize the non-convex optimization problem, and to reformulate the objective function in better behaved physics space instead of observation space. Our experimental results for the Lorenz96 model and a two-dimensional turbulent fluid flow demonstrate that this procedure significantly improves forecast quality for chaotic systems.
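To make the setting concrete, the sketch below implements the standard Lorenz96 tendency, dx_i/dt = (x_{i+1} - x_{i-2}) x_{i-1} - x_i + F with cyclic indices, together with a classical observation-space cost for a candidate initial state. It is a minimal sketch under stated assumptions: the RK4 integrator, the step size, the forcing F = 8, the "observe every other variable" operator, and all function names are hypothetical illustration choices, and the learned inverse observation operator described in the abstract is not reproduced here.

```python
def lorenz96_tendency(x, forcing=8.0):
    """Time derivative of the Lorenz96 state with cyclic boundary conditions."""
    n = len(x)
    # Negative indices wrap around in Python lists, which gives x[i-1], x[i-2].
    return [(x[(i + 1) % n] - x[i - 2]) * x[i - 1] - x[i] + forcing
            for i in range(n)]


def rk4_step(x, dt=0.01, forcing=8.0):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(state, k, scale):
        return [s + scale * ki for s, ki in zip(state, k)]

    k1 = lorenz96_tendency(x, forcing)
    k2 = lorenz96_tendency(shifted(x, k1, dt / 2), forcing)
    k3 = lorenz96_tendency(shifted(x, k2, dt / 2), forcing)
    k4 = lorenz96_tendency(shifted(x, k3, dt), forcing)
    return [xi + dt / 6 * (a + 2 * b + 2 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]


def observe_every_other(state):
    """Hypothetical non-invertible observation operator: keep even indices."""
    return state[::2]


def assimilation_cost(x0, observations, observe, steps_between=5):
    """Sum of squared observation-space misfits along the trajectory from x0."""
    cost = 0.0
    state = list(x0)
    for y in observations:
        for _ in range(steps_between):
            state = rk4_step(state)
        cost += sum((h - yi) ** 2 for h, yi in zip(observe(state), y))
    return cost


# Synthetic "truth" and observations, for illustration only.
truth = [8.0] * 8
truth[0] += 0.01
state, observations = list(truth), []
for _ in range(4):
    for _ in range(5):
        state = rk4_step(state)
    observations.append(observe_every_other(state))
```

Variational assimilation would minimise `assimilation_cost` over `x0`; because only half of the variables are observed here, the observation operator cannot simply be inverted, which is the difficulty the abstract refers to.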
Submitted 20 May, 2021; v1 submitted 22 February, 2021;
originally announced February 2021.
-
Machine learning accelerated computational fluid dynamics
Authors:
Dmitrii Kochkov,
Jamie A. Smith,
Ayya Alieva,
Qing Wang,
Michael P. Brenner,
Stephan Hoyer
Abstract:
Numerical simulation of fluids plays an essential role in modeling many physical phenomena, such as weather, climate, aerodynamics and plasma physics. Fluids are well described by the Navier-Stokes equations, but solving these equations at scale remains daunting, limited by the computational cost of resolving the smallest spatiotemporal features. This leads to unfavorable trade-offs between accuracy and tractability. Here we use end-to-end deep learning to improve approximations inside computational fluid dynamics for modeling two-dimensional turbulent flows. For both direct numerical simulation of turbulence and large eddy simulation, our results are as accurate as baseline solvers with 8-10x finer resolution in each spatial dimension, resulting in 40- to 80-fold computational speedups. Our method remains stable during long simulations, and generalizes to forcing functions and Reynolds numbers outside of the flows where it is trained, in contrast to black box machine learning approaches. Our approach exemplifies how scientific computing can leverage machine learning and hardware accelerators to improve simulations without sacrificing accuracy or generalization.
Submitted 28 January, 2021;
originally announced February 2021.
-
Learning Memory Access Patterns
Authors:
Milad Hashemi,
Kevin Swersky,
Jamie A. Smith,
Grant Ayers,
Heiner Litz,
Jichuan Chang,
Christos Kozyrakis,
Parthasarathy Ranganathan
Abstract:
The explosion in workload complexity and the recent slow-down in Moore's law scaling call for new approaches towards efficient computing. Researchers are now beginning to use recent advances in machine learning in software optimizations, augmenting or replacing traditional heuristics and data structures. However, the space of machine learning for computer hardware architecture is only lightly explored. In this paper, we demonstrate the potential of deep learning to address the von Neumann bottleneck of memory performance. We focus on the critical problem of learning memory access patterns, with the goal of constructing accurate and efficient memory prefetchers. We relate contemporary prefetching strategies to n-gram models in natural language processing, and show how recurrent neural networks can serve as a drop-in replacement. On a suite of challenging benchmark datasets, we find that neural networks consistently demonstrate superior performance in terms of precision and recall. This work represents the first step towards practical neural-network based prefetching, and opens a wide range of exciting directions for machine learning in computer architecture research.
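The analogy between prefetching and n-gram models can be sketched as follows: treat the differences between consecutive memory addresses as "words" and predict the next delta from the most recent ones. This is a minimal sketch under stated assumptions, not the prefetchers or the recurrent models evaluated in the paper; the class name, the choice of address deltas as tokens, and the most-frequent-successor rule are hypothetical illustration choices.

```python
from collections import Counter, defaultdict


class DeltaNGramPrefetcher:
    """Predicts the next address from counts of (n-1)-delta contexts."""

    def __init__(self, n=3):
        if n < 2:
            raise ValueError("n must be at least 2")
        self.n = n
        self.counts = defaultdict(Counter)  # context tuple -> next-delta counts
        self.history = []                   # the most recent n - 1 deltas
        self.last_address = None

    def observe(self, address):
        """Record one memory access and update the n-gram counts."""
        if self.last_address is not None:
            delta = address - self.last_address
            context = tuple(self.history)
            if len(context) == self.n - 1:
                self.counts[context][delta] += 1
            # Keep only the last n - 1 deltas as the next context.
            self.history = (self.history + [delta])[-(self.n - 1):]
        self.last_address = address

    def predict(self):
        """Return the address to prefetch, or None if the context is unseen."""
        context = tuple(self.history)
        if len(context) < self.n - 1 or context not in self.counts:
            return None
        delta, _ = self.counts[context].most_common(1)[0]
        return self.last_address + delta


# Hypothetical access trace with an alternating +8 / +56 stride pattern.
prefetcher = DeltaNGramPrefetcher(n=3)
for address in [0, 8, 64, 72, 128, 136]:
    prefetcher.observe(address)
next_address = prefetcher.predict()
```

A table like `counts` grows with the number of distinct contexts and cannot generalise to unseen ones, which is one motivation for replacing it with a learned sequence model as the abstract proposes.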
Submitted 6 March, 2018;
originally announced March 2018.
-
Approaching near-perfect state discrimination of photonic Bell states through the use of unentangled ancilla photons
Authors:
Jake A. Smith,
Lev Kaplan
Abstract:
Despite well-established no-go theorems on a perfect linear optical Bell state analyzer, we find a numerical trend that appears to approach a near-perfect measurement if we incorporate eight or more unentangled ancilla photons into our device. Following this trend, we begin a promising inductive approach to building an ideal optical Bell measurement device. In the process, we determine that any Bell state analyzer that (even occasionally) bunches all photons into only two of the output modes cannot perform an ideal measurement, and we find a set of conditions on our linear optical circuit that prevent this outcome.
Submitted 28 February, 2018;
originally announced February 2018.
-
Optimal Encoding Capacity of a Linear Optical Quantum Channel
Authors:
Jake A. Smith,
Dmitry B. Uskov,
Lev Kaplan
Abstract:
Here, we study the capacity of a quantum channel, assuming linear optical encoding, as a function of available photons and optical modes. First, we observe that substantial improvement is made possible by not restricting ourselves to a rail-encoded qubit basis. Then, we derive an analytic formula for general channel capacity and show that this capacity is achieved without requiring the use of entangling operations typically required for scalable universal quantum computation, e.g. KLM measurement-assisted transformations. As an example, we provide an explicit encoding scheme using the resources required of standard dense coding using two dual-rail qubits (2 photons in 4 modes). In this case, our protocol encodes one additional bit of information. Greater gains are expected for larger systems.
Submitted 10 August, 2015; v1 submitted 23 June, 2015;
originally announced June 2015.
-
Roadmap Document on Stochastic Analysis
Authors:
Bo Friis Nielsen,
Flemming Nielson,
Henrik Pilegaard,
Michael James Andrew Smith,
Ender Yüksel,
Kebin Zeng,
Lijun Zhang
Abstract:
This document was prepared as part of the MT-LAB research centre. The research centre studies the Modelling of Information Technology and is a VKR Centre of Excellence funded for five years by the VILLUM Foundation. You can read more about MT-LAB at its webpage www.MT-LAB.dk.
The goal of the document is to serve as an introduction to new PhD students addressing the research goals of MT-LAB. As such it aims to provide an overview of a number of selected approaches to the modelling of stochastic systems. It should be readable not only by computer scientists with a background in formal methods but also by PhD students in stochastics who are interested in understanding the computer science approach to stochastic model checking.
We have no intention of being encyclopedic in our treatment of the approaches or the literature. Rather we have made the selection of material based on the competences of the groups involved in or closely affiliated to MT-LAB, so as to ease the task of the PhD students in navigating an otherwise vast amount of literature.
We have decided to publish the document in case other young researchers may find it helpful. The list of authors reflects those who have at times played a significant role in the production of the document.
Submitted 27 September, 2012;
originally announced September 2012.
-
Repository Replication Using NNTP and SMTP
Authors:
Joan A. Smith,
Martin Klein,
Michael L. Nelson
Abstract:
We present the results of a feasibility study using shared, existing, network-accessible infrastructure for repository replication. We investigate how dissemination of repository contents can be "piggybacked" on top of existing email and Usenet traffic. Long-term persistence of the replicated repository may be achieved thanks to current policies and procedures which ensure that mail messages and news posts are retrievable for evidentiary and other legal purposes for many years after the creation date. While the preservation issues of migration and emulation are not addressed with this approach, it does provide a simple method of refreshing content with unknown partners.
Submitted 2 November, 2006; v1 submitted 1 June, 2006;
originally announced June 2006.
-
Reconstructing Websites for the Lazy Webmaster
Authors:
Frank McCown,
Joan A. Smith,
Michael L. Nelson,
Johan Bollen
Abstract:
Backup or preservation of websites is often not considered until after a catastrophic event has occurred. In the face of complete website loss, "lazy" webmasters or concerned third parties may be able to recover some of their website from the Internet Archive. Other pages may also be salvaged from commercial search engine caches. We introduce the concept of "lazy preservation": digital preservation performed as a result of the normal operations of the Web infrastructure (search engines and caches). We present Warrick, a tool to automate the process of website reconstruction from the Internet Archive, Google, MSN and Yahoo. Using Warrick, we have reconstructed 24 websites of varying sizes and composition to demonstrate the feasibility and limitations of website reconstruction from the public Web infrastructure. To measure Warrick's window of opportunity, we have profiled the time required for new Web resources to enter and leave search engine caches.
Submitted 16 December, 2005;
originally announced December 2005.