-
Branching algebras for the general linear Lie superalgebra
Authors:
Soo Teck Lee,
Ruibin Zhang
Abstract:
We develop an algebraic approach to the branching of representations of the general linear Lie superalgebra $\mathfrak{gl}_{p|q}({\mathbb C})$, by constructing certain super commutative algebras whose structure encodes the branching rules. Using this approach, we derive the branching rules for restricting any irreducible polynomial representation $V$ of $\mathfrak{gl}_{p|q}({\mathbb C})$ to a regular subalgebra isomorphic to $\mathfrak{gl}_{r|s}({\mathbb C})\oplus \mathfrak{gl}_{r'|s'}({\mathbb C})$, $\mathfrak{gl}_{r|s}({\mathbb C})\oplus\mathfrak{gl}_1({\mathbb C})^{r'+s'}$ or $\mathfrak{gl}_{r|s}({\mathbb C})$, with $r+r'=p$ and $s+s'=q$. In the case of $\mathfrak{gl}_{r|s}({\mathbb C})\oplus\mathfrak{gl}_1({\mathbb C})^{r'+s'}$ with $s=0$ or $s=1$ but general $r$, we also construct a basis for the space of $\mathfrak{gl}_{r|s}({\mathbb C})$ highest weight vectors in $V$; when $r=s=0$, the branching rule leads to explicit expressions for the weight multiplicities of $V$ in terms of Kostka numbers.
Submitted 17 March, 2024;
originally announced March 2024.
-
Towards Embedding Dynamic Personas in Interactive Robots: Masquerading Animated Social Kinematics (MASK)
Authors:
Jeongeun Park,
Taemoon Jeong,
Hyeonseong Kim,
Taehyun Byun,
Seungyoon Shin,
Keunjun Choi,
Jaewoon Kwon,
Taeyoon Lee,
Matthew Pan,
Sungjoon Choi
Abstract:
This paper presents the design and development of an innovative interactive robotic system to enhance audience engagement using character-like personas. Built upon the foundations of persona-driven dialog agents, this work extends the agent's application to the physical realm, employing robots to provide a more captivating and interactive experience. The proposed system, named Masquerading Animated Social Kinematics (MASK), leverages an anthropomorphic robot which interacts with guests using non-verbal interactions, including facial expressions and gestures. A behavior generation system based upon a finite-state machine structure effectively conditions robotic behavior to convey distinct personas. The MASK framework integrates a perception engine, a behavior selection engine, and a comprehensive action library to enable real-time, dynamic interactions with minimal human intervention in behavior design. In user studies, we examined whether the users could recognize the intended character in both personality- and film-character-based persona conditions. We conclude by discussing the role of personas in interactive agents and the factors to consider for creating an engaging user experience.
Submitted 7 October, 2024; v1 submitted 15 March, 2024;
originally announced March 2024.
-
RECIPE4U: Student-ChatGPT Interaction Dataset in EFL Writing Education
Authors:
Jieun Han,
Haneul Yoo,
Junho Myung,
Minsun Kim,
Tak Yeon Lee,
So-Yeon Ahn,
Alice Oh
Abstract:
The integration of generative AI in education is expanding, yet empirical analyses of large-scale and real-world interactions between students and AI systems remain limited. Addressing this gap, we present RECIPE4U (RECIPE for University), a dataset sourced from a semester-long experiment with 212 college students in English as a Foreign Language (EFL) writing courses. During the study, students engaged in dialogues with ChatGPT to revise their essays. RECIPE4U includes comprehensive records of these interactions, including conversation logs, students' intent, students' self-rated satisfaction, and students' essay edit histories. In particular, we annotate the students' utterances in RECIPE4U with 13 intention labels based on our coding schemes. We establish baseline results for two subtasks in task-oriented dialogue systems within educational contexts: intent detection and satisfaction estimation. As a foundational step, we explore student-ChatGPT interaction patterns through RECIPE4U and analyze them by focusing on students' dialogue, essay data statistics, and students' essay edits. We further illustrate potential applications of the RECIPE4U dataset for enhancing the incorporation of LLMs in educational frameworks. RECIPE4U is publicly available at https://zeunie.github.io/RECIPE4U/.
Submitted 13 March, 2024;
originally announced March 2024.
-
Physics-Inspired Deep Learning Anti-Aliasing Framework in Efficient Channel State Feedback
Authors:
Yu-Chien Lin,
Yan Xin,
Ta-Sung Lee,
Charlie Zhang,
Zhi Ding
Abstract:
Acquiring downlink channel state information (CSI) at the base station is vital for optimizing performance in massive multiple-input multiple-output (MIMO) frequency-division duplexing (FDD) systems. While deep learning architectures have been successful in facilitating UE-side CSI feedback and gNB-side recovery, the undersampling issue prior to CSI feedback is often overlooked. This issue, which arises from low-density pilot placement in current standards, results in significant aliasing effects in outdoor channels and consequently limits CSI recovery performance. To this end, this work introduces a new CSI upsampling framework at the gNB as a post-processing solution to address the gaps caused by undersampling. Leveraging the physical principles of the discrete Fourier transform shifting theorem and multipath reciprocity, our framework effectively uses uplink CSI to mitigate aliasing effects. We further develop a learning-based method that integrates the proposed algorithm with the Iterative Shrinkage-Thresholding Algorithm Net (ISTA-Net) architecture, enhancing our approach for non-uniform sampling recovery. Our numerical results show that both our rule-based and deep learning methods significantly outperform traditional interpolation techniques and current state-of-the-art approaches.
Submitted 12 March, 2024;
originally announced March 2024.
-
First Constraints on the Epoch of Reionization Using the non-Gaussianity of the Kinematic Sunyaev-Zel{'}dovich Effect from the South Pole Telescope and {\it Herschel}-SPIRE Observations
Authors:
S. Raghunathan,
P. A. R. Ade,
A. J. Anderson,
B. Ansarinejad,
M. Archipley,
J. E. Austermann,
L. Balkenhol,
J. A. Beall,
K. Benabed,
A. N. Bender,
B. A. Benson,
F. Bianchini,
L. E. Bleem,
J. Bock,
F. R. Bouchet,
L. Bryant,
E. Camphuis,
J. E. Carlstrom,
T. W. Cecil,
C. L. Chang,
P. Chaubal,
H. C. Chiang,
P. M. Chichura,
T. -L. Chou,
R. Citron
, et al. (99 additional authors not shown)
Abstract:
We report results from an analysis aimed at detecting the trispectrum of the kinematic Sunyaev-Zel{'}dovich (kSZ) effect by combining data from the South Pole Telescope (SPT) and {\it Herschel}-SPIRE experiments over a 100 ${\rm deg}^{2}$ field. The SPT observations combine data from the previous and current surveys, namely SPTpol and SPT-3G, to achieve depths of 4.5, 3, and 16 $μ{\rm K-arcmin}$ in bands centered at 95, 150, and 220 GHz. For SPIRE, we include data from the 600 and 857 GHz bands. We reconstruct the velocity-induced large-scale correlation of the small-scale kSZ signal with a quadratic estimator that uses two cosmic microwave background (CMB) temperature maps, constructed by optimally combining data from all the frequency bands. We reject the null hypothesis of a zero trispectrum at the $10.3σ$ level. However, the measured trispectrum contains contributions from both the kSZ and other undesired components, such as CMB lensing and astrophysical foregrounds, with kSZ being sub-dominant. We use the \textsc{Agora} simulations to estimate the expected signal from CMB lensing and astrophysical foregrounds. After accounting for the contributions from CMB lensing and foreground signals, we do not detect an excess kSZ-only trispectrum and use this non-detection to set constraints on reionization. By applying a prior based on observations of the Gunn-Peterson trough, we obtain an upper limit on the duration of reionization of $Δz_{\rm re, 50} < 4.5$ (95\% C.L.). We find these constraints are fairly robust to foreground assumptions. This trispectrum measurement is independent of, but consistent with, {\it Planck}'s optical depth measurement. This result is the first constraint on the epoch of reionization using the non-Gaussian nature of the kSZ signal.
Submitted 15 August, 2024; v1 submitted 4 March, 2024;
originally announced March 2024.
-
Exploration of the polarization angle variability of the Crab Nebula with POLARBEAR and its application to the search for axion-like particles
Authors:
Shunsuke Adachi,
Tylor Adkins,
Carlo Baccigalupi,
Yuji Chinone,
Kevin T. Crowley,
Josquin Errard,
Giulio Fabbian,
Chang Feng,
Takuro Fujino,
Masaya Hasegawa,
Masashi Hazumi,
Oliver Jeong,
Daisuke Kaneko,
Brian Keating,
Akito Kusaka,
Adrian T. Lee,
Anto I. Lonappan,
Yuto Minami,
Masaaki Murata,
Lucio Piccirillo,
Christian L. Reichardt,
Praween Siritanasak,
Jacob Spisak,
Satoru Takakura,
Grant P. Teply
, et al. (1 additional authors not shown)
Abstract:
The Crab Nebula, also known as Tau A, is a polarized astronomical source at millimeter wavelengths. It has been used as a stable light source for polarization angle calibration in millimeter-wave astronomy. However, it is known that its intensity and polarization vary as a function of time at a variety of wavelengths. Thus, it is of interest to verify the stability of the millimeter-wave polarization. If detected, polarization variability may be used to better understand the dynamics of Tau A and to assess the validity of Tau A as a calibrator. One intriguing application of such observations is the search for axion-like particles (ALPs). Ultralight ALPs couple to photons through a Chern-Simons term, and induce a temporal oscillation in the polarization angle of linearly polarized sources. After assessing a number of systematic errors and testing for internal consistency, we evaluate the variability of the polarization angle of the Crab Nebula using 2015 and 2016 observations with the 150 GHz POLARBEAR instrument. We place a median 95% upper bound of polarization oscillation amplitude $A < 0.065^\circ$ over the oscillation frequencies from $0.75~\mathrm{year}^{-1}$ to $0.66~\mathrm{hour}^{-1}$. Assuming that no sources other than ALPs are causing Tau A's polarization angle variation, that the ALP constitutes all the dark matter, and that the ALP field is a stochastic Gaussian field, this bound translates into a median 95% upper bound of ALP-photon coupling $g_{aγγ} < 2.16\times10^{-12}\,\mathrm{GeV}^{-1}\times(m_a/10^{-21} \mathrm{eV})$ in the mass range from $9.9\times10^{-23} \mathrm{eV}$ to $7.7\times10^{-19} \mathrm{eV}$. This demonstrates that this type of analysis using bright polarized sources is as competitive as those using the polarization of cosmic microwave background in constraining ALPs.
Submitted 19 September, 2024; v1 submitted 4 March, 2024;
originally announced March 2024.
-
Dynamical Model of $J/Ψ$ photo-production on the nucleon
Authors:
S. Sakinah,
T. -S. H. Lee,
Ho-Meoyng Choi
Abstract:
A dynamical model based on a phenomenological charm quark-nucleon ($c$-N) potential $v_{cN}$ and the Pomeron-exchange mechanism is constructed to investigate the $J/Ψ$ photo-production on the nucleon from threshold to invariant mass $W=300$ GeV. The $J/Ψ$-N potential, $V_{J/ΨN}(r)$, is constructed by folding $v_{cN}$ into the wavefunction $Φ_{J/Ψ}(c\bar{c})$ of $J/Ψ$ within a Constituent Quark Model (CQM) of Ref.[43]. A photo-production amplitude is also generated by $v_{cN}$ by a $c\bar{c}$-loop integration over the $γ\rightarrow c\bar{c}$ vertex function and $Φ_{J/Ψ}(c\bar{c})$. No commonly used Vector Meson Dominance assumption is used to define this photo-production amplitude which is needed to describe the data near the threshold. The potential $v_{cN}(r)$ is parameterized in a form such that the predicted $V_{J/ΨN}(r)$ at large distances has the same Yukawa potential form extracted from a Lattice QCD (LQCD) calculation of Ref.[18]. The parameters of $v_{cN}$ are determined by fitting the total cross section data of JLab by performing calculations that include $J/Ψ$-N final state interactions (FSI). The resulting differential cross sections are found to be in good agreement with the data. It is shown that the FSI effects dominate the cross section in the very near threshold region, allowing for sensitive testing of the predicted $J/Ψ$-N scattering amplitudes. By imposing the constraints of $J/Ψ$-N potential extracted from the LQCD calculation, we have obtained three $J/Ψ$-N potentials which fit the JLab data equally well. The resulting $J/Ψ$-N scattering lengths are in the range of $a=(-0.05$ fm $\sim$ $-0.25$ fm). With the determined $v_{cN}(r)$ and the wavefunctions generated from the same CQM, the constructed model is used to predict the cross sections of photo-production of $η_c(1S)$ and $Ψ(2S)$ mesons for future experimental tests.
Submitted 10 April, 2024; v1 submitted 4 March, 2024;
originally announced March 2024.
-
Differentially Private Synthetic Data via Foundation Model APIs 2: Text
Authors:
Chulin Xie,
Zinan Lin,
Arturs Backurs,
Sivakanth Gopi,
Da Yu,
Huseyin A Inan,
Harsha Nori,
Haotian Jiang,
Huishuai Zhang,
Yin Tat Lee,
Bo Li,
Sergey Yekhanin
Abstract:
Text data has become extremely valuable due to the emergence of machine learning algorithms that learn from it. A lot of high-quality text data generated in the real world is private and therefore cannot be shared or used freely due to privacy concerns. Generating synthetic replicas of private text data with a formal privacy guarantee, i.e., differential privacy (DP), offers a promising and scalable solution. However, existing methods necessitate DP finetuning of large language models (LLMs) on private data to generate DP synthetic data. This approach is not viable for proprietary LLMs (e.g., GPT-3.5) and also demands considerable computational resources for open-source LLMs. Lin et al. (2024) recently introduced the Private Evolution (PE) algorithm to generate DP synthetic images with only API access to diffusion models. In this work, we propose an augmented PE algorithm, named Aug-PE, that applies to the complex setting of text. We use API access to an LLM and generate DP synthetic text without any model training. We conduct comprehensive experiments on three benchmark datasets. Our results demonstrate that Aug-PE produces DP synthetic text that yields competitive utility with the SOTA DP finetuning baselines. This underscores the feasibility of relying solely on API access of LLMs to produce high-quality DP synthetic texts, thereby facilitating more accessible routes to privacy-preserving LLM applications. Our code and data are available at https://github.com/AI-secure/aug-pe.
Submitted 23 July, 2024; v1 submitted 4 March, 2024;
originally announced March 2024.
-
A far-ultraviolet-driven photoevaporation flow observed in a protoplanetary disk
Authors:
Olivier Berné,
Emilie Habart,
Els Peeters,
Ilane Schroetter,
Amélie Canin,
Ameek Sidhu,
Ryan Chown,
Emeric Bron,
Thomas J. Haworth,
Pamela Klaassen,
Boris Trahin,
Dries Van De Putte,
Felipe Alarcón,
Marion Zannese,
Alain Abergel,
Edwin A. Bergin,
Jeronimo Bernard-Salas,
Christiaan Boersma,
Jan Cami,
Sara Cuadrado,
Emmanuel Dartois,
Daniel Dicken,
Meriem Elyajouri,
Asunción Fuente,
Javier R. Goicoechea
, et al. (121 additional authors not shown)
Abstract:
Most low-mass stars form in stellar clusters that also contain massive stars, which are sources of far-ultraviolet (FUV) radiation. Theoretical models predict that this FUV radiation produces photo-dissociation regions (PDRs) on the surfaces of protoplanetary disks around low-mass stars, impacting planet formation within the disks. We report JWST and Atacama Large Millimeter Array observations of a FUV-irradiated protoplanetary disk in the Orion Nebula. Emission lines are detected from the PDR; modelling their kinematics and excitation allows us to constrain the physical conditions within the gas. We quantify the mass-loss rate induced by the FUV irradiation, finding it is sufficient to remove gas from the disk in less than a million years. This is rapid enough to affect giant planet formation in the disk.
Submitted 29 February, 2024;
originally announced March 2024.
-
The $X$-semiprimeness of Rings
Authors:
Grigore Călugăreanu,
Tsiu-Kwen Lee,
Jerzy Matczuk
Abstract:
For a nonempty subset $X$ of a ring $R$, the ring $R$ is called $X$-semiprime if, given $a\in R$, $aXa=0$ implies $a=0$. This provides a proper class of semiprime rings. First, we clarify the relationship between idempotent semiprime and unit-semiprime rings. Secondly, given a Lie ideal $L$ of a ring $R$, we offer a criterion for $R$ to be $L$-semiprime. For a prime ring $R$, we characterize Lie ideals $L$ of $R$ such that $R$ is $L$-semiprime. Moreover, $X$-semiprimeness of matrix rings, prime rings (with a nontrivial idempotent), semiprime rings, regular rings, and subdirect products are studied.
Submitted 9 April, 2024; v1 submitted 29 February, 2024;
originally announced February 2024.
-
EmoBench: Evaluating the Emotional Intelligence of Large Language Models
Authors:
Sahand Sabour,
Siyang Liu,
Zheyuan Zhang,
June M. Liu,
Jinfeng Zhou,
Alvionna S. Sunaryo,
Juanzi Li,
Tatia M. C. Lee,
Rada Mihalcea,
Minlie Huang
Abstract:
Recent advances in Large Language Models (LLMs) have highlighted the need for robust, comprehensive, and challenging benchmarks. Yet, research on evaluating their Emotional Intelligence (EI) is considerably limited. Existing benchmarks have two major shortcomings: first, they mainly focus on emotion recognition, neglecting essential EI capabilities such as emotion regulation and thought facilitation through emotion understanding; second, they are primarily constructed from existing datasets, which include frequent patterns, explicit information, and annotation errors, leading to unreliable evaluation. We propose EmoBench, a benchmark that draws upon established psychological theories and proposes a comprehensive definition for machine EI, including Emotional Understanding and Emotional Application. EmoBench includes a set of 400 hand-crafted questions in English and Chinese, which are meticulously designed to require thorough reasoning and understanding. Our findings reveal a considerable gap between the EI of existing LLMs and the average human, highlighting a promising direction for future research. Our code and data are publicly available at https://github.com/Sahandfer/EmoBench.
Submitted 17 July, 2024; v1 submitted 19 February, 2024;
originally announced February 2024.
-
Learning to Learn Faster from Human Feedback with Language Model Predictive Control
Authors:
Jacky Liang,
Fei Xia,
Wenhao Yu,
Andy Zeng,
Montserrat Gonzalez Arenas,
Maria Attarian,
Maria Bauza,
Matthew Bennice,
Alex Bewley,
Adil Dostmohamed,
Chuyuan Kelly Fu,
Nimrod Gileadi,
Marissa Giustina,
Keerthana Gopalakrishnan,
Leonard Hasenclever,
Jan Humplik,
Jasmine Hsu,
Nikhil Joshi,
Ben Jyenis,
Chase Kew,
Sean Kirmani,
Tsang-Wei Edward Lee,
Kuang-Huei Lee,
Assaf Hurwitz Michaely,
Joss Moore
, et al. (25 additional authors not shown)
Abstract:
Large language models (LLMs) have been shown to exhibit a wide range of capabilities, such as writing robot code from language commands -- enabling non-experts to direct robot behaviors, modify them based on feedback, or compose them to perform new tasks. However, these capabilities (driven by in-context learning) are limited to short-term interactions, where users' feedback remains relevant for only as long as it fits within the context size of the LLM, and can be forgotten over longer interactions. In this work, we investigate fine-tuning the robot code-writing LLMs to remember their in-context interactions and improve their teachability, i.e., how efficiently they adapt to human inputs (measured by the average number of corrections before the user considers the task successful). Our key observation is that when human-robot interactions are viewed as a partially observable Markov decision process (in which human language inputs are observations, and robot code outputs are actions), then training an LLM to complete previous interactions is training a transition dynamics model -- that can be combined with classic robotics techniques such as model predictive control (MPC) to discover shorter paths to success. This gives rise to Language Model Predictive Control (LMPC), a framework that fine-tunes PaLM 2 to improve its teachability on 78 tasks across 5 robot embodiments -- improving non-expert teaching success rates of unseen tasks by 26.9% while reducing the average number of human corrections from 2.4 to 1.9. Experiments show that LMPC also produces strong meta-learners, improving the success rate of in-context learning new tasks on unseen robot embodiments and APIs by 31.5%. See videos, code, and demos at: https://robot-teaching.github.io/.
Submitted 31 May, 2024; v1 submitted 17 February, 2024;
originally announced February 2024.
-
GTC Spectroscopic Surveys of Planetary Nebulae in the Milky Way and M31
Authors:
Xuan Fang,
Haomiao Huang,
Martin A. Guerrero,
Letizia Stanghellini,
Ruben Garcia-Benito,
Ting-Hui Lee,
Yong Zhang
Abstract:
We report spectroscopic surveys of planetary nebulae (PNe) in the Milky Way and Andromeda (M31), using the 10.4-m Gran Telescopio Canarias (GTC). The spectra are of high quality and cover the whole optical range, mostly from 3650 Å to beyond 1 micron, enabling detection of nebular emission lines critical for spectral analysis as well as photoionization modeling. We obtained GTC spectra of 24 compact (angular diameter <5 arcsec) PNe located in the Galactic disk, ~3-20 kpc from the Galactic centre, which can be used to constrain stellar evolution models and derive radial abundance gradients of the Milky Way. We have observed 30 PNe in the outer halo of M31 using the GTC. These halo PNe are uniformly metal-rich and probably all evolved from low-mass stars, consistent with the conjecture that they all formed from the metal-rich gas in the M31 disk but were displaced to their present locations due to galaxy interactions.
Submitted 15 February, 2024;
originally announced February 2024.
-
PIVOT: Iterative Visual Prompting Elicits Actionable Knowledge for VLMs
Authors:
Soroush Nasiriany,
Fei Xia,
Wenhao Yu,
Ted Xiao,
Jacky Liang,
Ishita Dasgupta,
Annie Xie,
Danny Driess,
Ayzaan Wahid,
Zhuo Xu,
Quan Vuong,
Tingnan Zhang,
Tsang-Wei Edward Lee,
Kuang-Huei Lee,
Peng Xu,
Sean Kirmani,
Yuke Zhu,
Andy Zeng,
Karol Hausman,
Nicolas Heess,
Chelsea Finn,
Sergey Levine,
Brian Ichter
Abstract:
Vision language models (VLMs) have shown impressive capabilities across a variety of tasks, from logical reasoning to visual understanding. This opens the door to richer interaction with the world, for example robotic control. However, VLMs produce only textual outputs, while robotic control and other spatial tasks require outputting continuous coordinates, actions, or trajectories. How can we enable VLMs to handle such settings without fine-tuning on task-specific data?
In this paper, we propose a novel visual prompting approach for VLMs that we call Prompting with Iterative Visual Optimization (PIVOT), which casts tasks as iterative visual question answering. In each iteration, the image is annotated with a visual representation of proposals that the VLM can refer to (e.g., candidate robot actions, localizations, or trajectories). The VLM then selects the best ones for the task. These proposals are iteratively refined, allowing the VLM to eventually zero in on the best available answer. We investigate PIVOT on real-world robotic navigation, real-world manipulation from images, instruction following in simulation, and additional spatial inference tasks such as localization. We find, perhaps surprisingly, that our approach enables zero-shot control of robotic systems without any robot training data, navigation in a variety of environments, and other capabilities. Although current performance is far from perfect, our work highlights potentials and limitations of this new regime and shows a promising approach for Internet-Scale VLMs in robotic and spatial reasoning domains. Website: pivot-prompt.github.io and HuggingFace: https://huggingface.co/spaces/pivot-prompt/pivot-prompt-demo.
Submitted 12 February, 2024;
originally announced February 2024.
-
Eigenmode Decomposition Method for Full-Wave Modeling of Microring Resonators
Authors:
Yuriy Akimov,
Aswin Alexander Eapen,
Shiyang Zhu,
Doris K. T. Ng,
Nanxi Li,
Woon Leng Loh,
Lennon Y. T. Lee,
Alagappan Gandhi,
Aravind P. Anthur
Abstract:
We develop a theoretical predictive model for an all-pass ring resonator that enables the most complete description of linear coupling regimes. The model is based on eigenmode decomposition of Maxwell's equations with full account of the confined and leaky modes, as opposed to the existing phenomenological methods restricted to the confined modes only. This model enables quantitative description of all-pass ring resonators and provides insights into the physics underlying microring-waveguide coupling. We experimentally validate the model using transmission measurements in the linear regime of aluminium nitride resonators. The developed model is then used to explore the field enhancement in microrings crucial for nonlinear photonic applications.
Submitted 6 February, 2024;
originally announced February 2024.
-
Dance-to-Music Generation with Encoder-based Textual Inversion
Authors:
Sifei Li,
Weiming Dong,
Yuxin Zhang,
Fan Tang,
Chongyang Ma,
Oliver Deussen,
Tong-Yee Lee,
Changsheng Xu
Abstract:
The seamless integration of music with dance movements is essential for communicating the artistic intent of a dance piece. This alignment also significantly improves the immersive quality of gaming experiences and animation productions. Although there has been remarkable advancement in creating high-fidelity music from textual descriptions, current methodologies mainly focus on modulating overall characteristics such as genre and emotional tone. They often overlook the nuanced management of temporal rhythm, which is indispensable in crafting music for dance, since it intricately aligns the musical beats with the dancers' movements. Recognizing this gap, we propose an encoder-based textual inversion technique to augment text-to-music models with visual control, facilitating personalized music generation. Specifically, we develop dual-path rhythm-genre inversion to effectively integrate the rhythm and genre of a dance motion sequence into the textual space of a text-to-music model. Contrary to traditional textual inversion methods, which directly update text embeddings to reconstruct a single target object, our approach utilizes separate rhythm and genre encoders to obtain text embeddings for two pseudo-words, adapting to the varying rhythms and genres. We collect a new dataset called In-the-wild Dance Videos (InDV) and demonstrate that our approach outperforms state-of-the-art methods across multiple evaluation metrics. Furthermore, our method is able to adapt to changes in tempo and effectively integrates with the inherent text-guided generation capability of the pre-trained model. Our source code and demo videos are available at https://github.com/lsfhuihuiff/Dance-to-music_Siggraph_Asia_2024.
Submitted 12 September, 2024; v1 submitted 31 January, 2024;
originally announced January 2024.
-
Spreading and engulfment of a viscoelastic film onto a Newtonian droplet
Authors:
Chunheng Zhao,
Taehun Lee,
Andreas Carlson
Abstract:
We use the conservative phase-field lattice Boltzmann method to investigate the dynamics when a Newtonian droplet comes in contact with an immiscible viscoelastic liquid film. The dynamics of the three liquid phases are explored through numerical simulations, with a focus on illustrating the contact line dynamics and the viscoelastic effects described by the Oldroyd-B model. The droplet dynamics are contrasted with the case of a Newtonian fluid film. The simulations demonstrate that when the film is viscoelastic, the droplet dynamics become insensitive to the film thickness when the polymer viscosity and relaxation time are large. A viscoelastic ridge forms at the moving contact line, which evolves with a power-law dependence on time. By rescaling the interface profile of the ridge using its height and width, it appears to collapse onto a similar shape. Our findings reveal a strong correlation between the viscoelastic stress and the interface shape near the contact line.
Submitted 31 January, 2024;
originally announced January 2024.
-
Towards Generating Informative Textual Description for Neurons in Language Models
Authors:
Shrayani Mondal,
Rishabh Garodia,
Arbaaz Qureshi,
Taesung Lee,
Youngja Park
Abstract:
Recent developments in transformer-based language models have allowed them to capture a wide variety of world knowledge that can be adapted to downstream tasks with limited resources. However, what pieces of information are understood in these models is unclear, and neuron-level contributions in identifying them are largely unknown. Conventional approaches in neuron explainability either depend on a finite set of pre-defined descriptors or require manual annotations for training a secondary model that can then explain the neurons of the primary model. In this paper, we take BERT as an example and we try to remove these constraints and propose a novel and scalable framework that ties textual descriptions to neurons. We leverage the potential of generative language models to discover human-interpretable descriptors present in a dataset and use an unsupervised approach to explain neurons with these descriptors. Through various qualitative and quantitative analyses, we demonstrate the effectiveness of this framework in generating useful data-specific descriptors with little human involvement in identifying the neurons that encode these descriptors. In particular, our experiment shows that the proposed approach achieves 75% precision@2 and 50% recall@2.
Submitted 29 January, 2024;
originally announced January 2024.
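Note on the metrics quoted in the entry above: the abstract reports precision@2 and recall@2 without defining them. The sketch below uses the common ranking definitions (hits in the top k divided by k, and hits in the top k divided by the number of relevant items), which is an assumption about how the authors score descriptors. The function name and the toy descriptor labels are hypothetical.

```python
def precision_recall_at_k(ranked, relevant, k):
    """Return (precision@k, recall@k) for a ranked list of predicted items.

    ranked   -- predicted items, best first (hypothetical descriptor labels)
    relevant -- set of items considered correct
    k        -- cutoff rank
    """
    top_k = ranked[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Toy data, not taken from the paper.
ranked = ["negation", "sports", "finance"]
relevant = {"negation", "finance"}
p, r = precision_recall_at_k(ranked, relevant, k=2)
print(p, r)
```

With a fixed cutoff of 2, precision and recall can differ only through the number of relevant items per neuron, which is why the two reported percentages need not match.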
-
Hyperphosphorylation-Induced Phase Transition in Vesicle Delivery Dynamics of Motor Proteins in Neuronal Cells
Authors:
Eunsang Lee,
Donghee Kim,
Yo Han Song,
Kyujin Shin,
Sanggeun Song,
Minho Lee,
Yeongchang Goh,
Mi Hee Lim,
Ji-Hyun Kim,
Jaeyoung Sung,
Kang Taek Lee
Abstract:
Synaptic vesicle transport by motor proteins along microtubules is a crucial active process underlying neuronal communication. It is known that microtubules are destabilized by tau-hyperphosphorylation, which causes tau proteins to detach from microtubules and form neurofibril tangles. However, how tau-phosphorylation affects transport dynamics of motor proteins on the microtubule remains unknown. Here, we discover that long-distance unidirectional motion of vesicle-motor protein multiplexes (VMPMs) in living cells is suppressed under tau-hyperphosphorylation, with the consequent loss of fast vesicle-transport along the microtubule. The VMPMs in hyperphosphorylated cells exhibit seemingly bidirectional random motion, with dynamic properties far different from VMPM motion in normal cells. We establish a parsimonious physicochemical model of VMPM's active motion that provides a unified, quantitative explanation and predictions for our experimental results. Our analysis reveals that, under hyperphosphorylation conditions, motor-protein-multiplexes have both static and dynamic motility fluctuations. The loss of the fast vesicle-transport along the microtubule can be a mechanism of neurodegenerative disorders associated with tau-hyperphosphorylation.
Submitted 23 April, 2024; v1 submitted 27 January, 2024;
originally announced January 2024.
-
Speeding up Fermionic Lattice Calculations with Photonic Accelerated Inverters
Authors:
Felipe Attanasio,
Marc Bauer,
Jelle Dijkstra,
Timoteo Lee,
Jan M. Pawlowski,
Wolfram Pernice
Abstract:
Lattice field theory (LFT) is the standard non-perturbative method to perform numerical calculations of quantum field theory. However, the typical bottleneck of fermionic lattice calculations is the inversion of the Dirac matrix. This inversion is solved by iterative methods, like the conjugate gradient algorithm, where matrix-vector multiplications (MVMs) are the main operation. Photonic integrated circuits excel in performing quick and energy-efficient MVMs, but at the same time, they are known to have low accuracy. This can be overcome by using mixed precision methods. In this paper, we explore the idea of using photonic technology to fulfil the demand for computational power of fermionic lattice calculations. These methods have the potential to reduce computation costs by one order of magnitude. Because of the hybrid nature of these methods, we call these 'photonic accelerated inverters (PAIs)'.
Submitted 25 January, 2024;
originally announced January 2024.
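Note on the mixed-precision idea in the entry above: the abstract says low-accuracy matrix-vector multiplications can be compensated by mixed-precision methods but gives no algorithm. The sketch below shows one standard scheme of that kind, iterative refinement around a conjugate gradient solve. It is not the authors' method: the low-accuracy device is emulated by casting to float32 (an assumption), the test matrix is a random symmetric positive-definite stand-in for a Dirac-type operator, and all function names are hypothetical.

```python
import numpy as np


def cg(A, b, tol, max_iter):
    """Plain conjugate gradient for symmetric positive-definite A, in A's dtype."""
    x = np.zeros_like(b)
    r = b.copy()
    p = r.copy()
    rs = r @ r
    b_norm = np.sqrt(b @ b)
    for _ in range(max_iter):
        Ap = A @ p                      # the matrix-vector product dominates the cost
        alpha = rs / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) <= tol * b_norm:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x


def mixed_precision_solve(A, b, outer_iters=10, inner_tol=1e-3, tol=1e-12):
    """Iterative refinement: cheap low-precision solves, residuals in float64."""
    A_lo = A.astype(np.float32)         # assumed stand-in for low-accuracy hardware
    x = np.zeros_like(b)
    b_norm = np.linalg.norm(b)
    for _ in range(outer_iters):
        r = b - A @ x                   # accurate residual in float64
        if np.linalg.norm(r) <= tol * b_norm:
            break
        d = cg(A_lo, r.astype(np.float32), inner_tol, 200)
        x = x + d.astype(np.float64)    # correct the high-precision iterate
    return x


rng = np.random.default_rng(0)
M = rng.standard_normal((50, 50))
A = M @ M.T + 50 * np.eye(50)           # well-conditioned SPD test matrix
b = rng.standard_normal(50)
x = mixed_precision_solve(A, b)
rel_res = np.linalg.norm(b - A @ x) / np.linalg.norm(b)
print(rel_res)
```

The design point is that only the inner solve sees reduced precision; each outer step recomputes the residual accurately, so the final accuracy is set by the high-precision arithmetic rather than by the inner solver.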
-
CreativeSynth: Cross-Art-Attention for Artistic Image Synthesis with Multimodal Diffusion
Authors:
Nisha Huang,
Weiming Dong,
Yuxin Zhang,
Fan Tang,
Ronghui Li,
Chongyang Ma,
Xiu Li,
Tong-Yee Lee,
Changsheng Xu
Abstract:
Although remarkable progress has been made in image style transfer, style is just one of the components of artistic paintings. Directly transferring extracted style features to natural images often results in outputs with obvious synthetic traces. This is because key painting attributes including layout, perspective, shape, and semantics often cannot be conveyed and expressed through style transfer. Large-scale pretrained text-to-image generation models have demonstrated their capability to synthesize a vast amount of high-quality images. However, even with extensive textual descriptions, it is challenging to fully express the unique visual properties and details of paintings. Moreover, generic models often disrupt the overall artistic effect when modifying specific areas, making it more complicated to achieve a unified aesthetic in artworks. Our main novel idea is to integrate multimodal semantic information as a synthesis guide into artworks, rather than transferring style to the real world. We also aim to reduce the disruption to the harmony of artworks while simplifying the guidance conditions. Specifically, we propose an innovative multi-task unified framework called CreativeSynth, based on the diffusion model with the ability to coordinate multimodal inputs. CreativeSynth combines multimodal features with customized attention mechanisms to seamlessly integrate real-world semantic content into the art domain through Cross-Art-Attention for aesthetic maintenance and semantic fusion. We demonstrate the results of our method across a wide range of different art categories, proving that CreativeSynth bridges the gap between generative models and artistic expression. Code and results are available at https://github.com/haha-lisa/CreativeSynth.
Submitted 15 May, 2025; v1 submitted 25 January, 2024;
originally announced January 2024.
-
Flaring Stars in a Non-targeted mm-wave Survey with SPT-3G
Authors:
C. Tandoi,
S. Guns,
A. Foster,
P. A. R. Ade,
A. J. Anderson,
B. Ansarinejad,
M. Archipley,
L. Balkenhol,
K. Benabed,
A. N. Bender,
B. A. Benson,
F. Bianchini,
L. E. Bleem,
F. R. Bouchet,
L. Bryant,
E. Camphuis,
J. E. Carlstrom,
T. W. Cecil,
C. L. Chang,
P. Chaubal,
P. M. Chichura,
T. -L. Chou,
A. Coerver,
T. M. Crawford,
A. Cukierman
, et al. (74 additional authors not shown)
Abstract:
We present a flare star catalog from four years of non-targeted millimeter-wave survey data from the South Pole Telescope (SPT). The data were taken with the SPT-3G camera and cover a 1500-square-degree region of the sky from $20^{h}40^{m}0^{s}$ to $3^{h}20^{m}0^{s}$ in right ascension and $-42^{\circ}$ to $-70^{\circ}$ in declination. This region was observed on a nearly daily cadence from 2019-2022 and chosen to avoid the plane of the galaxy. A short-duration transient search of this survey yields 111 flaring events from 66 stars, increasing the number of both flaring events and detected flare stars by an order of magnitude from the previous SPT-3G data release. We provide cross-matching to Gaia DR3, as well as matches to X-ray point sources found in the second ROSAT all-sky survey. We have detected flaring stars across the main sequence, from early-type A stars to M dwarfs, as well as a large population of evolved stars. These stars are mostly nearby, spanning 10 to 1000 parsecs in distance. Most of the flare spectral indices are constant or gently rising as a function of frequency at 95/150/220 GHz. The timescale of these events can range from minutes to hours, and the peak $νL_ν$ luminosities range from $10^{27}$ to $10^{31}$ erg s$^{-1}$ in the SPT-3G frequency bands.
Submitted 24 January, 2024;
originally announced January 2024.
-
DOO-RE: A dataset of ambient sensors in a meeting room for activity recognition
Authors:
Hyunju Kim,
Geon Kim,
Taehoon Lee,
Kisoo Kim,
Dongman Lee
Abstract:
With the advancement of IoT technology, recognizing user activities with machine learning methods is a promising way to provide various smart services to users. High-quality data with privacy protection is essential for deploying such services in the real world. Data streams from surrounding ambient sensors are well suited to the requirement. Existing ambient sensor datasets only support constrained private spaces and those for public spaces have yet to be explored despite growing interest in research on them. To meet this need, we build a dataset collected from a meeting room equipped with ambient sensors. The dataset, DOO-RE, includes data streams from various ambient sensor types such as Sound and Projector. Each sensor data stream is segmented into activity units and multiple annotators provide activity labels through a cross-validation annotation process to improve annotation quality. We finally obtain 9 types of activities. To our best knowledge, DOO-RE is the first dataset to support the recognition of both single and group activities in a real meeting room with reliable annotations.
Submitted 16 January, 2024;
originally announced January 2024.
-
Encoding position by spins: Objectivity in the boson-spin model
Authors:
Tae-Hun Lee,
Jarosław K. Korbicz
Abstract:
We investigate quantum objectivity in the boson-spin model, where a central harmonic oscillator interacts with a thermal bath of spin-1/2 systems. We analyze how information about a continuous position variable can be encoded into discrete, finite-dimensional environments. More precisely, we study conditions under which the so-called Spectrum Broadcast Structures (SBS) can be formed in the model. These are multipartite quantum state structures, representing a mode-refined form of decoherence. Working in the recoil-less limit, we use the Floquet theory to show that despite its apparent simplicity, the model has a rich structure with different regimes, depending on the motion of the central system. In one of them, the faithful encoding of the position and hence objectivity are impossible irrespective of the resources used. In another, large enough collections of spins will faithfully encode the position information. We derive the characteristic length scales, corresponding to decoherence and precision of the encoding.
Submitted 3 May, 2024; v1 submitted 15 January, 2024;
originally announced January 2024.
-
Efficient Frameworks for Generalized Low-Rank Matrix Bandit Problems
Authors:
Yue Kang,
Cho-Jui Hsieh,
Thomas C. M. Lee
Abstract:
In the stochastic contextual low-rank matrix bandit problem, the expected reward of an action is given by the inner product between the action's feature matrix and some fixed, but initially unknown $d_1$ by $d_2$ matrix $Θ^*$ with rank $r \ll \{d_1, d_2\}$, and an agent sequentially takes actions based on past experience to maximize the cumulative reward. In this paper, we study the generalized low-rank matrix bandit problem, which has been recently proposed in \cite{lu2021low} under the Generalized Linear Model (GLM) framework. To overcome the computational infeasibility and theoretical restraints of existing algorithms on this problem, we first propose the G-ESTT framework that modifies the idea from \cite{jun2019bilinear} by using Stein's method on the subspace estimation and then leverage the estimated subspaces via a regularization idea. Furthermore, we remarkably improve the efficiency of G-ESTT by using a novel exclusion idea on the estimated subspace instead, and propose the G-ESTS framework. We also show that G-ESTT can achieve the $\tilde{O}(\sqrt{(d_1+d_2)MrT})$ bound of regret while G-ESTS can achieve the $\tilde{O}(\sqrt{(d_1+d_2)^{3/2}Mr^{3/2}T})$ bound of regret under mild assumptions up to logarithm terms, where $M$ is some problem dependent value. Under a reasonable assumption that $M = O((d_1+d_2)^2)$ in our problem setting, the regret of G-ESTT is consistent with the current best regret of $\tilde{O}((d_1+d_2)^{3/2} \sqrt{rT}/D_{rr})$~\citep{lu2021low} ($D_{rr}$ will be defined later). For completeness, we conduct experiments to illustrate that our proposed algorithms, especially G-ESTS, are also computationally tractable and consistently outperform other state-of-the-art (generalized) linear matrix bandit methods based on a suite of simulations.
Submitted 14 January, 2024;
originally announced January 2024.
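Note on the regret comparison in the entry above: the claim that the G-ESTT bound is consistent with the earlier bound follows from substituting the stated assumption $M = O((d_1+d_2)^2)$ into the G-ESTT rate. The short derivation below uses only the symbols from the abstract; the factor $1/D_{rr}$ in the earlier bound is left aside because the abstract does not define it.

```latex
\tilde{O}\!\left(\sqrt{(d_1+d_2)\,M\,r\,T}\right)
\;\xrightarrow{\;M = O\left((d_1+d_2)^2\right)\;}\;
\tilde{O}\!\left(\sqrt{(d_1+d_2)^{3}\,r\,T}\right)
= \tilde{O}\!\left((d_1+d_2)^{3/2}\sqrt{rT}\right)
```

The resulting dependence on $d_1+d_2$, $r$ and $T$ is the same as in the $\tilde{O}((d_1+d_2)^{3/2}\sqrt{rT}/D_{rr})$ bound quoted in the abstract.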
-
Creating Personalized Synthetic Voices from Articulation Impaired Speech Using Augmented Reconstruction Loss
Authors:
Yusheng Tian,
Jingyu Li,
Tan Lee
Abstract:
This research is about the creation of personalized synthetic voices for head and neck cancer survivors. It is focused particularly on tongue cancer patients whose speech might exhibit severe articulation impairment. Our goal is to restore normal articulation in the synthesized speech, while maximally preserving the target speaker's individuality in terms of both the voice timbre and speaking style. This is formulated as a task of learning from noisy labels. We propose to augment the commonly used speech reconstruction loss with two additional terms. The first term constitutes a regularization loss that mitigates the impact of distorted articulation in the training speech. The second term is a consistency loss that encourages correct articulation in the generated speech. These additional loss terms are obtained from frame-level articulation scores of original and generated speech, which are derived using a separately trained phone classifier. Experimental results on a real case of a tongue cancer patient confirm that the synthetic voice achieves comparable articulation quality to unimpaired natural speech, while effectively maintaining the target speaker's individuality. Audio samples are available at https://myspeechproject.github.io/ArticulationRepair/.
Submitted 8 January, 2024;
originally announced January 2024.
-
LUPET: Incorporating Hierarchical Information Path into Multilingual ASR
Authors:
Wei Liu,
Jingyong Hou,
Dong Yang,
Muyong Cao,
Tan Lee
Abstract:
Toward high-performance multilingual automatic speech recognition (ASR), various types of linguistic information and model design have demonstrated their effectiveness independently. They include language identity (LID), phoneme information, language-specific processing modules, and cross-lingual self-supervised speech representation. It is expected that leveraging their benefits synergistically in a unified solution would further improve the overall system performance. This paper presents a novel design of a hierarchical information path, named LUPET, which sequentially encodes, from the shallow layers to deep layers, multiple aspects of linguistic and acoustic information at diverse granularity scales. The path starts from LID prediction, followed by acoustic unit discovery, phoneme sharing, and finally token recognition routed by a mixture-of-expert. ASR experiments are carried out on 10 languages in the Common Voice corpus. The results demonstrate the superior performance of LUPET as compared to the baseline systems. Most importantly, LUPET effectively mitigates the issue of performance compromise of high-resource languages with low-resource ones in the multilingual setting.
Submitted 8 January, 2025; v1 submitted 8 January, 2024;
originally announced January 2024.
-
Certain functional identities on division rings
Authors:
Tsiu-Kwen Lee,
Jheng-Huei Lin
Abstract:
We study the functional identity $G(x)f(x)=H(x)$ on a division ring $D$, where $f \colon D\to D$ is an additive map and $G(X)\ne 0, H(X)$ are generalized polynomials in the variable $X$ with coefficients in $D$. Precisely, it is proved that either $D$ is finite-dimensional over its center or $f$ is an elementary operator. Applying the result and its consequences, we prove that if $D$ is a noncommutative division ring of characteristic not $2$, then the only solution of additive maps $f, g$ on $D$ satisfying the identity $f(x) = x^n g(x^{-1})$ with $n\ne 2$ a positive integer is the trivial case, that is, $f=0$ and $g=0$. This extends Catalano and Merchán's result in 2023 to get a complete solution.
Submitted 5 January, 2024;
originally announced January 2024.
-
Image Collage on Arbitrary Shape via Shape-Aware Slicing and Optimization
Authors:
Dong-Yi Wu,
Thi-Ngoc-Hanh Le,
Sheng-Yi Yao,
Yun-Chen Lin,
Tong-Yee Lee
Abstract:
Image collage is a very useful tool for visualizing an image collection. Most of the existing methods and commercial applications for generating image collages are designed on simple shapes, such as rectangular and circular layouts. This greatly limits the use of image collages in some artistic and creative settings. Although there are some methods that can generate irregularly-shaped image collages, they often suffer from severe image overlapping and excessive blank space. This prevents such methods from being effective information communication tools. In this paper, we present a shape slicing algorithm and an optimization scheme that can create image collages of arbitrary shapes in an informative and visually pleasing manner given an input shape and an image collection. To overcome the challenge of irregular shapes, we propose a novel algorithm, called Shape-Aware Slicing, which partitions the input shape into cells based on medial axis and binary slicing tree. Shape-Aware Slicing, which is designed specifically for irregular shapes, takes human perception and shape structure into account to generate visually pleasing partitions. Then, the layout is optimized by analyzing input images with the goal of maximizing the total salient regions of the images. To evaluate our method, we conduct extensive experiments and compare our results against previous work. The evaluations show that our proposed algorithm can efficiently arrange image collections on irregular shapes and create results that are visually superior to those of prior work and existing commercial tools.
Submitted 17 November, 2023;
originally announced January 2024.
-
SPT Clusters with DES and HST Weak Lensing. II. Cosmological Constraints from the Abundance of Massive Halos
Authors:
S. Bocquet,
S. Grandis,
L. E. Bleem,
M. Klein,
J. J. Mohr,
T. Schrabback,
T. M. C. Abbott,
P. A. R. Ade,
M. Aguena,
A. Alarcon,
S. Allam,
S. W. Allen,
O. Alves,
A. Amon,
A. J. Anderson,
J. Annis,
B. Ansarinejad,
J. E. Austermann,
S. Avila,
D. Bacon,
M. Bayliss,
J. A. Beall,
K. Bechtol,
M. R. Becker,
A. N. Bender
, et al. (171 additional authors not shown)
Abstract:
We present cosmological constraints from the abundance of galaxy clusters selected via the thermal Sunyaev-Zel'dovich (SZ) effect in South Pole Telescope (SPT) data with a simultaneous mass calibration using weak gravitational lensing data from the Dark Energy Survey (DES) and the Hubble Space Telescope (HST). The cluster sample is constructed from the combined SPT-SZ, SPTpol ECS, and SPTpol 500d surveys, and comprises 1,005 confirmed clusters in the redshift range $0.25-1.78$ over a total sky area of 5,200 deg$^2$. We use DES Year 3 weak-lensing data for 688 clusters with redshifts $z<0.95$ and HST weak-lensing data for 39 clusters with $0.6<z<1.7$. The weak-lensing measurements enable robust mass measurements of sample clusters and allow us to empirically constrain the SZ observable--mass relation. For a flat $Λ$CDM cosmology, and marginalizing over the sum of massive neutrinos, we measure $Ω_\mathrm{m}=0.286\pm0.032$, $σ_8=0.817\pm0.026$, and the parameter combination $σ_8\,(Ω_\mathrm{m}/0.3)^{0.25}=0.805\pm0.016$. Our measurement of $S_8\equivσ_8\,\sqrt{Ω_\mathrm{m}/0.3}=0.795\pm0.029$ and the constraint from Planck CMB anisotropies (2018 TT,TE,EE+lowE) differ by $1.1σ$. In combination with that Planck dataset, we place a 95% upper limit on the sum of neutrino masses $\sum m_ν<0.18$ eV. When additionally allowing the dark energy equation of state parameter $w$ to vary, we obtain $w=-1.45\pm0.31$ from our cluster-based analysis. In combination with Planck data, we measure $w=-1.34^{+0.22}_{-0.15}$, or a $2.2σ$ difference with a cosmological constant. We use the cluster abundance to measure $σ_8$ in five redshift bins between 0.25 and 1.8, and we find the results to be consistent with structure growth as predicted by the $Λ$CDM model fit to Planck primary CMB data.
Submitted 21 June, 2024; v1 submitted 4 January, 2024;
originally announced January 2024.
-
The expected value of sample information calculations for external validation of risk prediction models
Authors:
Mohsen Sadatsafavi,
Andrew J Vickers,
Tae Yoon Lee,
Paul Gustafson,
Laure Wynants
Abstract:
In designing external validation studies of clinical prediction models, contemporary sample size calculation methods are based on the frequentist inferential paradigm. One of the widely reported metrics of model performance is net benefit (NB), and the relevance of conventional inference around NB as a measure of clinical utility is doubtful. Value of Information methodology quantifies the consequences of uncertainty in terms of its impact on clinical utility of decisions. We introduce the expected value of sample information (EVSI) for validation as the expected gain in NB from conducting an external validation study of a given size. We propose algorithms for EVSI computation, and in a case study demonstrate how EVSI changes as a function of the amount of current information and future study's sample size. Value of Information methodology provides a decision-theoretic lens to the process of planning a validation study of a risk prediction model and can complement conventional methods when designing such studies.
Submitted 5 December, 2024; v1 submitted 3 January, 2024;
originally announced January 2024.
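Note on net benefit (NB) in the entry above: the abstract builds EVSI on NB without stating its formula. The sketch below computes NB as it is usually defined in decision curve analysis, the true-positive rate minus the false-positive rate weighted by the odds of the risk threshold. It does not attempt the EVSI algorithms themselves. The function name, the choice to treat patients whose predicted risk is at or above the threshold, and the toy data are assumptions for illustration.

```python
def net_benefit(outcomes, risks, threshold):
    """Net benefit of treating everyone whose predicted risk is >= threshold.

    outcomes  -- observed binary outcomes (1 = event, 0 = no event)
    risks     -- predicted risks from the model being validated
    threshold -- risk threshold z, with 0 < z < 1
    """
    n = len(outcomes)
    tp = sum(1 for y, p in zip(outcomes, risks) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(outcomes, risks) if p >= threshold and y == 0)
    # False positives are weighted by the odds z / (1 - z) of the threshold.
    return tp / n - (fp / n) * threshold / (1 - threshold)


# Toy validation sample, not taken from the paper.
outcomes = [1, 1, 0, 0]
risks = [0.9, 0.4, 0.6, 0.1]
nb_low = net_benefit(outcomes, risks, threshold=0.2)
nb_mid = net_benefit(outcomes, risks, threshold=0.5)
print(nb_low, nb_mid)
```

In this framing, EVSI as described in the abstract is the expected gain in this quantity from collecting a validation sample of a given size before choosing a strategy.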
-
Designing Electricity Distribution Networks: The Impact of Demand Coincidence
Authors:
Gunther Gust,
Alexander Schlüter,
Stefan Feuerriegel,
Ignacio Úbeda,
Jonathan T Lee,
Dirk Neumann
Abstract:
With the global effort to reduce carbon emissions, clean technologies such as electric vehicles and heat pumps are increasingly introduced into electricity distribution networks. These technologies considerably increase electricity flows and can lead to more coincident electricity demand. In this paper, we analyze how such increases in demand coincidence impact future distribution network investments. For this purpose, we develop a novel model for designing electricity distribution networks, called the distribution network reconfiguration problem with line-specific demand coincidence (DNRP-LSDC). Our analysis is two-fold: (1) We apply our model to a large sample of real-world networks from a Swiss distribution network operator. We find that a high demand coincidence due to, for example, a large-scale uptake of electric vehicles, requires a substantial amount of new network line construction and increases average network cost by 84% in comparison to the status quo. (2) We use a set of synthetic networks to isolate the effect of specific network characteristics. Here, we show that high coincidence has a more detrimental effect on large networks and on networks with low geographic consumer densities, as present in, e.g., rural areas. We also show that expansion measures are robust to variations in the cost parameters. Our results demonstrate the necessity of designing policies and operational protocols that reduce demand coincidence. Moreover, our findings show that operators of distribution networks must consider the demand coincidence of new electricity uses and adapt investment budgets accordingly. Here, our solution algorithms for the DNRP-LSDC problem can support operators of distribution networks in strategic and operational network design tasks.
Submitted 20 January, 2024; v1 submitted 20 December, 2023;
originally announced December 2023.
-
Anti-reflection coating with mullite and Duroid for large-diameter cryogenic sapphire and alumina optics
Authors:
Kana Sakaguri,
Masaya Hasegawa,
Yuki Sakurai,
Junna Sugiyama,
Nicole Farias,
Charles Hill,
Bradley R. Johnson,
Kuniaki Konishi,
Akito Kusaka,
Adrian T. Lee,
Tomotake Matsumura,
Edward J. Wollack,
Junji Yumoto
Abstract:
We developed a broadband two-layer anti-reflection (AR) coating for use on a sapphire half-wave plate (HWP) and an alumina infrared (IR) filter for cosmic microwave background (CMB) polarimetry. Measuring the faint CMB B-mode signals requires maximizing the number of photons reaching the detectors and minimizing spurious polarization due to reflection at off-axis incident angles. Sapphire and alumina have high refractive indices of 3.1 and are highly reflective without an AR coating. This paper presents the design, fabrication, quality control, and measured performance of an AR coating using thermally-sprayed mullite and Duroid 5880LZ. This technology enables large optical elements with diameters of 600 mm. We also present a newly developed thermography-based nondestructive quality control technique, which is key to assuring good adhesion and preventing delamination during thermal cycling. We demonstrate an average reflectance of about 2.6% (0.9%) for two observing bands centered at 90/150 (220/280) GHz. At room temperature, the average transmittance of a 105 mm square test sample at 220/280 GHz is 83%, and it will increase to 90% at 100 K owing to reduced absorption losses. Therefore, our developed layering technique has proved effective for 220/280 GHz applications, particularly in addressing dielectric loss concerns. This AR coating technology has been deployed in the cryogenic HWP and IR filters of the Simons Array and the Simons Observatory experiments and applies to future experiments such as CMB-S4.
Submitted 19 December, 2023;
originally announced December 2023.
-
Electrically reconfigurable phase-change transmissive metasurface
Authors:
Cosmin Constantin Popescu,
Kiumars Aryana,
Parth Garud,
Khoi Phuong Dao,
Steven Vitale,
Vladimir Liberman,
Hyung-Bin Bae,
Tae-Woo Lee,
Myungkoo Kang,
Kathleen A. Richardson,
Carlos A. Rios Ocampo,
Yifei Zhang,
Tian Gu,
Juejun Hu,
Hyun Jung Kim
Abstract:
Programmable and reconfigurable optics hold significant potential for transforming a broad spectrum of applications, spanning space exploration to biomedical imaging, gas sensing, and optical cloaking. The ability to adjust the optical properties of components like filters, lenses, and beam steering devices could result in dramatic reductions in size, weight, and power consumption in future optoelectronic devices. Among the potential candidates for reconfigurable optics, chalcogenide-based phase change materials (PCMs) offer great promise due to their non-volatile and analogue switching characteristics. Although PCMs have found widespread use in electronic data storage, these memory devices are deeply sub-micron-sized. To incorporate phase change materials into free-space optical components, it is essential to scale them up to beyond several hundreds of microns while maintaining reliable switching characteristics. This study demonstrates a non-mechanical, non-volatile transmissive filter based on low-loss PCMs with a 200 $μ$m $\times$ 200 $μ$m switching area. The device/metafilter can be consistently switched between low- and high-transmission states using electrical pulses with a switching contrast ratio of 5.5 dB. The device was reversibly switched for 1250 cycles before accelerated degradation took place. The work represents an important step toward realizing free-space reconfigurable optics based on PCMs.
Submitted 16 December, 2023;
originally announced December 2023.
-
Impact of beam far side-lobe knowledge in the presence of foregrounds for LiteBIRD
Authors:
C. Leloup,
G. Patanchon,
J. Errard,
C. Franceschet,
J. E. Gudmundsson,
S. Henrot-Versillé,
H. Imada,
H. Ishino,
T. Matsumura,
G. Puglisi,
W. Wang,
A. Adler,
J. Aumont,
R. Aurlien,
C. Baccigalupi,
M. Ballardini,
A. J. Banday,
R. B. Barreiro,
N. Bartolo,
A. Basyrov,
M. Bersanelli,
D. Blinov,
M. Bortolami,
T. Brinckmann,
P. Campeti
, et al. (86 additional authors not shown)
Abstract:
We present a study of the impact of an uncertainty in the beam far side-lobe knowledge on the measurement of the Cosmic Microwave Background $B$-mode signal at large scales. It is expected to be one of the main sources of systematic effects in future CMB observations. Because it is crucial for all-sky survey missions to take into account the interplay between beam systematic effects and all the data analysis steps, the primary goal of this paper is to provide the methodology to carry out the end-to-end study of their effect for a space-borne CMB polarization experiment, up to the cosmological results in the form of a bias $δr$ on the tensor-to-scalar ratio $r$. LiteBIRD targets the measurement of CMB primordial $B$ modes by reaching a sensitivity of $σ\left( r \right) \leq 10^{-3}$ assuming $r=0$. As a demonstration of our framework, we derive the relationship between the knowledge of the beam far side-lobes and the tentatively allocated error budget under given assumptions on design, simulation and component separation method. We assume no mitigation of the far side-lobe effect at any stage of the analysis pipeline. We show that $δr$ is mostly due to the integrated fractional power difference between the estimated beams and the true beams in the far side-lobe region, with little dependence on the actual shape of the beams, for low enough $δr$. Under our set of assumptions, in particular considering the specific foreground cleaning method we used, we find that the integrated fractional power in the far side-lobes should be known at a level as tight as $\sim 10^{-4}$, to achieve the required limit on the bias $δr < 1.9 \times 10^{-5}$. The framework and tools developed for this study can be easily adapted to provide requirements under different designs and data analysis frameworks, and for other future space-borne experiments beyond LiteBIRD.
Submitted 14 December, 2023;
originally announced December 2023.
-
Active Learning approach to simulations of Strongly Correlated Matter with the Ghost Gutzwiller Approximation
Authors:
Marius S. Frank,
Denis G. Artiukhin,
Tsung-Han Lee,
Yongxin Yao,
Kipton Barros,
Ove Christiansen,
Nicola Lanatà
Abstract:
Quantum embedding (QE) methods such as the Ghost Gutzwiller Approximation (gGA) offer a powerful approach to simulating strongly-correlated systems, but come with the computational bottleneck of computing the ground state of an auxiliary embedding Hamiltonian (EH) iteratively. In this work, we introduce an active learning (AL) framework integrated within the gGA to address this challenge. The methodology is applied to the single-band Hubbard model and results in a significant reduction in the number of instances where the EH must be solved. Through a principal component analysis (PCA), we find that the EH parameters form a low-dimensional structure that is largely independent of the geometric specifics of the systems, especially in the strongly-correlated regime. Our AL strategy enables us to discover this low-dimensionality structure on the fly, while leveraging it for reducing the computational cost of gGA, laying the groundwork for more efficient simulations of complex strongly-correlated materials.
Submitted 12 December, 2023; v1 submitted 8 December, 2023;
originally announced December 2023.
-
The Hoyle and associated excited states from the viewpoint of pocket resonances in alpha + 8Be reactions
Authors:
Teck-Ghee Lee,
Orhan Bayrak,
Ian J. Thompson,
Cheuk-Yin Wong
Abstract:
We examine the production of the Hoyle and associated excited states from the viewpoint of pocket resonances in the reaction of an $α$-particle on a ground state prolate $^8$Be nucleus within the optical model coupled-channel framework. The predicted reaction cross sections, as a function of the center-of-mass energy $E_{\rm cm}$, show prominent resonances, including the Hoyle resonance. The positions and widths of these resonances are sensitive to the target deformation ($β_2$ parameter) and the parity of the nuclear surface potential, which is deeper for the even-parity $L$ partial waves than for the odd-parity $L$ partial waves in the surface region because of the Bose-Einstein exchange of the $α$-bosons. Decomposing the reaction cross sections into different partial waves, we find that the resonance energies and widths reasonably agree with the available experimental data and previous hyperspherical calculations for the $0_2^+$ (Hoyle state), $0_3^+$, $1_1^-$ and $3_1^-$ states of $^{12}$C, except for the narrow theoretical width of the $2_2^+$ state. Analyzing the wavefunctions and the resonance widths, we identify the narrow and sharp $0_2^+$, $3_1^-$ and $2_2^+$ resonances as pocket resonances, that is, resonances which occur below the potential barrier, and the broad $0_3^+$ and $1_1^-$ resonances as above-the-barrier resonances. For astrophysical applications, we also evaluate the astrophysical $S(E_{\rm cm})$-factor for $E_{\rm cm} < 1.0$ MeV, for the fusion of $α$+$^8$Be into the $^{12}$C$(2^+)$ state based on our estimated $s$-wave $α$+$^8$Be reaction cross section and the associated $γ$- and $α$-decay widths for the decay of $^{12}$C excited states in the potential pocket.
Submitted 28 November, 2023;
originally announced November 2023.
-
Can Generalist Foundation Models Outcompete Special-Purpose Tuning? Case Study in Medicine
Authors:
Harsha Nori,
Yin Tat Lee,
Sheng Zhang,
Dean Carignan,
Richard Edgar,
Nicolo Fusi,
Nicholas King,
Jonathan Larson,
Yuanzhi Li,
Weishung Liu,
Renqian Luo,
Scott Mayer McKinney,
Robert Osazuwa Ness,
Hoifung Poon,
Tao Qin,
Naoto Usuyama,
Chris White,
Eric Horvitz
Abstract:
Generalist foundation models such as GPT-4 have displayed surprising capabilities in a wide variety of domains and tasks. Yet, there is a prevalent assumption that they cannot match specialist capabilities of fine-tuned models. For example, most explorations to date on medical competency benchmarks have leveraged domain-specific training, as exemplified by efforts on BioGPT and Med-PaLM. We build on a prior study of GPT-4's capabilities on medical challenge benchmarks in the absence of special training. Rather than using simple prompting to highlight the model's out-of-the-box capabilities, we perform a systematic exploration of prompt engineering. We find that prompting innovation can unlock deeper specialist capabilities and show that GPT-4 easily tops prior leading results for medical benchmarks. The prompting methods we explore are general purpose, and make no specific use of domain expertise, removing the need for expert-curated content. Our experimental design carefully controls for overfitting during the prompt engineering process. We introduce Medprompt, based on a composition of several prompting strategies. With Medprompt, GPT-4 achieves state-of-the-art results on all nine of the benchmark datasets in the MultiMedQA suite. The method outperforms leading specialist models such as Med-PaLM 2 by a significant margin with an order of magnitude fewer calls to the model. Steering GPT-4 with Medprompt achieves a 27% reduction in error rate on the MedQA dataset over the best methods to date achieved with specialist models and surpasses a score of 90% for the first time. Beyond medical problems, we show the power of Medprompt to generalize to other domains and provide evidence for the broad applicability of the approach via studies of the strategy on exams in electrical engineering, machine learning, philosophy, accounting, law, nursing, and clinical psychology.
Submitted 27 November, 2023;
originally announced November 2023.
-
On the quantum time complexity of divide and conquer
Authors:
Jonathan Allcock,
Jinge Bao,
Aleksandrs Belovs,
Troy Lee,
Miklos Santha
Abstract:
We initiate a systematic study of the time complexity of quantum divide and conquer algorithms for classical problems. We establish generic conditions under which search and minimization problems with classical divide and conquer algorithms are amenable to quantum speedup and apply these theorems to an array of problems involving strings, integers, and geometric objects. They include LONGEST DISTINCT SUBSTRING, KLEE'S COVERAGE, several optimization problems on stock transactions, and k-INCREASING SUBSEQUENCE. For most of these results, our quantum time upper bound matches the quantum query lower bound for the problem, up to polylogarithmic factors.
Submitted 27 November, 2023;
originally announced November 2023.
-
Noise Analysis for Performance Evaluation of Biopotential Recording Front-Ends
Authors:
Taeju Lee
Abstract:
Noise efficiency factor (NEF) and power efficiency factor (PEF) are widely used as figures of merit to quantify the performance of biopotential recording front-ends. NEF and PEF are discussed from the noise analysis to the trend survey. To provide a comprehensive performance comparison of the front-ends, a performance mapping is developed using the design parameters of technology node, NEF, PEF, |PEF - NEF|, and supply voltage. The quantity |PEF - NEF| indicates how well a front-end balances current-noise efficiency against power-noise efficiency, in other words, how biased a front-end is between current- and power-noise efficiencies. Also, the performance mappings of different front-end architectures are presented.
Submitted 27 November, 2023;
originally announced November 2023.
-
Positional Description Matters for Transformers Arithmetic
Authors:
Ruoqi Shen,
Sébastien Bubeck,
Ronen Eldan,
Yin Tat Lee,
Yuanzhi Li,
Yi Zhang
Abstract:
Transformers, central to the successes in modern Natural Language Processing, often falter on arithmetic tasks despite their vast capabilities, which paradoxically include remarkable coding abilities. We observe that a crucial challenge is their naive reliance on positional information to solve arithmetic problems with a small number of digits, leading to poor performance on larger numbers. Herein, we delve deeper into the role of positional encoding, and propose several ways to fix the issue, either by modifying the positional encoding directly, or by modifying the representation of the arithmetic task to leverage standard positional encoding differently. We investigate the value of these modifications for three tasks: (i) classical multiplication, (ii) length extrapolation in addition, and (iii) addition in natural language context. For (i) we train a small model on a small dataset (100M parameters and 300k samples) with remarkable aptitude in (direct, no scratchpad) 15-digit multiplication and essentially perfect accuracy up to 12 digits, while usual training in this context would give a model failing at 4-digit multiplication. In the experiments on addition, we use a mere 120k samples to demonstrate: for (ii) extrapolation from 10 digits to testing on 12-digit numbers while usual training would have no extrapolation, and for (iii) almost perfect accuracy up to 5 digits while usual training would be correct only up to 3 digits (which is essentially memorization with a training set of 120k samples).
Submitted 21 November, 2023;
originally announced November 2023.
-
RAISE -- Radiology AI Safety, an End-to-end lifecycle approach
Authors:
M. Jorge Cardoso,
Julia Moosbauer,
Tessa S. Cook,
B. Selnur Erdal,
Brad Genereaux,
Vikash Gupta,
Bennett A. Landman,
Tiarna Lee,
Parashkev Nachev,
Elanchezhian Somasundaram,
Ronald M. Summers,
Khaled Younis,
Sebastien Ourselin,
Franz MJ Pfister
Abstract:
The integration of AI into radiology introduces opportunities for improved clinical care provision and efficiency but, as with any other new technology, it demands a meticulous approach to mitigate potential risks. Beginning with rigorous pre-deployment evaluation and validation, the focus should be on ensuring models meet the highest standards of safety, effectiveness and efficacy for their intended applications. Input and output guardrails implemented during production usage act as an additional layer of protection, identifying and addressing individual failures as they occur. Continuous post-deployment monitoring allows for tracking population-level performance (data drift), fairness, and value delivery over time. Scheduling reviews of post-deployment model performance and educating radiologists about new algorithm-driven findings are critical for AI to be effective in clinical practice. Recognizing that no single AI solution can provide absolute assurance even when limited to its intended use, the synergistic application of quality assurance at multiple levels - regulatory, clinical, technical, and ethical - is emphasized. Collaborative efforts between stakeholders spanning healthcare systems, industry, academia, and government are imperative to address the multifaceted challenges involved. Trust in AI is an earned privilege, contingent on a broad set of goals, among them transparently demonstrating that the AI adheres to the same rigorous safety, effectiveness and efficacy standards as other established medical technologies. By doing so, developers can instil confidence among providers and patients alike, enabling the responsible scaling of AI and the realization of its potential benefits. The roadmap presented herein aims to expedite the achievement of deployable, reliable, and safe AI in radiology.
Submitted 24 November, 2023;
originally announced November 2023.
-
High-Quality Face Caricature via Style Translation
Authors:
Lamyanba Laishram,
Muhammad Shaheryar,
Jong Taek Lee,
Soon Ki Jung
Abstract:
Caricature is an exaggerated form of artistic portraiture that accentuates unique yet subtle characteristics of human faces. Recently, advancements in deep end-to-end techniques have yielded encouraging outcomes in capturing both style and elevated exaggerations in creating face caricatures. Most of these approaches tend to produce cartoon-like results that could be more practical for real-world applications. In this study, we propose a high-quality, unpaired face caricature method that is appropriate for use in the real world and uses computer vision techniques and GAN models. We attain the exaggeration of facial features and the stylization of appearance through a two-step process: face caricature generation and face caricature projection. The face caricature generation step creates new caricature face datasets from real images and trains a generative model using the real and newly created caricature datasets. The face caricature projection step employs an encoder trained with real and caricature faces, together with the pretrained generator, to project real and caricature faces. We perform an incremental facial exaggeration from the real image to the caricature faces using the latent space of the encoder and generator. Our projection preserves the facial identity, attributes, and expressions from the input image. It also accounts for facial occlusions, such as reading glasses or sunglasses, to enhance the robustness of our model. Furthermore, we conduct a comprehensive comparison of our approach with various state-of-the-art face caricature methods, highlighting our process's distinctiveness and exceptional realism.
Submitted 22 November, 2023;
originally announced November 2023.
-
Rotational spectroscopic characterisation of the [D2,C,S] system: an update from the laboratory and theory
Authors:
Natalia Inostroza-Pino,
Valerio Lattanzi,
C. Zachary Palmer,
Ryan C. Fortenberry,
Diego Mardones,
Paola Caselli,
Oko E. Godwin,
Timothy J. Lee
Abstract:
The synergy between high-resolution rotational spectroscopy and quantum-chemical calculations is essential for exploring future detection of molecules, especially when spectroscopic parameters are not yet available. By using highly correlated ab initio quartic force fields (QFFs) from explicitly correlated coupled cluster theory, a complete set of rotational constants and centrifugal distortion constants for D$_2$CS and cis/trans-DCSD isomers has been produced. Comparing our new ab initio results for D$_2$CS with new rotational spectroscopy laboratory data for the same species, the accuracy of the computed B and C rotational constants is within 0.1% while the A constant is only slightly higher. Additionally, quantum chemical vibrational frequencies are provided, and these spectral reference data and new experimental rotational lines will provide additional references for potential observation of these deuterated sulfur species with either ground-based radio telescopes or space-based infrared observatories.
Submitted 15 November, 2023;
originally announced November 2023.
-
The HIBEAM Instrument at the European Spallation Source
Authors:
V. Santoro,
D. Milstead,
P. Fierlinger,
W. M. Snow,
J. Amaral,
J. Barrow,
M. Bartis,
P. Bentley,
L. Björk,
G. Brooijmans,
L. Broussard,
A. Burgman,
G. Croci,
N. de la Cour,
D. D. Di Julio,
K. Dunne,
L. Eklund,
H. Eriksson,
M. J. Ferreira,
U. Friman-Gayer,
P. Golubev,
G. Gorini,
G. P. Guedes,
V. Hehl,
A. Heinz
, et al. (39 additional authors not shown)
Abstract:
The European Spallation Source (ESS) will be the world's brightest neutron source and will open a new intensity frontier in particle physics. The HIBEAM collaboration aims to exploit the unique potential of the ESS with a dedicated ESS instrument for particle physics which offers world-leading capability in a number of areas. The HIBEAM program includes the first search in thirty years for free neutrons converting to antineutrons and searches for sterile neutrons, ultralight axion dark matter and nonzero neutron electric charge. This paper outlines the capabilities, design, infrastructure, and scientific potential of the HIBEAM program, including its dedicated beamline, neutron optical system, magnetic shielding and control, and detectors for neutrons and antineutrons. Additionally, we discuss the long-term scientific exploitation of HIBEAM, which may include measurements of the neutron electric dipole moment and precision studies of neutron decays.
Submitted 7 April, 2025; v1 submitted 14 November, 2023;
originally announced November 2023.
-
Galaxy Clusters Discovered via the Thermal Sunyaev-Zel'dovich Effect in the 500-square-degree SPTpol Survey
Authors:
L. E. Bleem,
M. Klein,
T. M. C. Abbott,
P. A. R. Ade,
M. Aguena,
O. Alves,
A. J. Anderson,
F. Andrade-Oliveira,
B. Ansarinejad,
M. Archipley,
M. L. N. Ashby,
J. E. Austermann,
D. Bacon,
J. A. Beall,
A. N. Bender,
B. A. Benson,
F. Bianchini,
S. Bocquet,
D. Brooks,
D. L. Burke,
M. Calzadilla,
J. E. Carlstrom,
A. Carnero Rosell,
J. Carretero,
C. L. Chang
, et al. (103 additional authors not shown)
Abstract:
We present a catalog of 689 galaxy cluster candidates detected at significance $ξ>4$ via their thermal Sunyaev-Zel'dovich (SZ) effect signature in 95 and 150 GHz data from the 500-square-degree SPTpol survey. We use optical and infrared data from the Dark Energy Camera and the Wide-field Infrared Survey Explorer (WISE) and Spitzer satellites to confirm 544 of these candidates as clusters with $\sim94\%$ purity. The sample has an approximately redshift-independent mass threshold at redshift $z>0.25$ and spans $1.5 \times 10^{14} < M_{500c} < 9.1 \times 10^{14}$ $M_\odot/h_{70}$ and $0.03<z\lesssim1.6$ in mass and redshift, respectively; 21% of the confirmed clusters are at $z>1$. We use external radio data from the Sydney University Molonglo Sky Survey (SUMSS) to estimate contamination to the SZ signal from synchrotron sources. The contamination reduces the recovered $ξ$ by a median value of 0.032, or $\sim0.8\%$ of the $ξ=4$ threshold value, and $\sim7\%$ of candidates have a predicted contamination greater than $Δξ= 1$. With the exception of a small number of systems $(<1\%)$, an analysis of clusters detected in single-frequency 95 and 150 GHz data shows no significant contamination of the SZ signal by emission from dusty or synchrotron sources. This cluster sample will be a key component in upcoming astrophysical and cosmological analyses of clusters. The SPTpol millimeter-wave maps and associated data products used to produce this sample are available at https://pole.uchicago.edu/public/data/sptpol_500d_clusters/index.html, and the NASA LAMBDA website. An interactive sky server with the SPTpol maps and Dark Energy Survey data release 2 images is also available at NCSA https://skyviewer.ncsa.illinois.edu.
Submitted 8 February, 2024; v1 submitted 13 November, 2023;
originally announced November 2023.
-
Regenerating Arbitrary Video Sequences with Distillation Path-Finding
Authors:
Thi-Ngoc-Hanh Le,
Sheng-Yi Yao,
Chun-Te Wu,
Tong-Yee Lee
Abstract:
While video has long been regarded as a widespread form of visualization, the animation sequences within a video serve as a form of storytelling. Producing an animation requires intensive human labor from skilled professional artists to obtain plausible results in both content and motion direction, especially for animations with complex content, multiple moving objects, and dense movement. This paper presents an interactive framework to generate new sequences according to the user's preferred starting frame. The key difference between our approach and prior work and existing commercial applications is that our system produces novel sequences with an arbitrary starting frame that remain consistent in both content and motion direction. To achieve this effectively, we first learn the feature correlation on the frameset of the given video through a proposed network called RSFNet. Then, we develop a novel path-finding algorithm, SDPF, which formulates the knowledge of motion directions of the source video to estimate smooth and plausible sequences. Extensive experiments show that our framework can produce new animations on cartoon and natural scenes and improves on prior work and commercial applications, enabling users to obtain more predictable results.
Submitted 13 November, 2023;
originally announced November 2023.
-
Multi-Agent Reinforcement Learning for the Low-Level Control of a Quadrotor UAV
Authors:
Beomyeol Yu,
Taeyoung Lee
Abstract:
By leveraging the underlying structures of the quadrotor dynamics, we propose multi-agent reinforcement learning frameworks to innovate the low-level control of a quadrotor, where independent agents operate cooperatively to achieve a common goal. While single-agent reinforcement learning has been successfully applied in quadrotor controls, training a large monolithic network is often data-intensive and time-consuming. Moreover, achieving agile yawing control remains a significant challenge due to the strongly coupled nature of the quadrotor dynamics. To address this, we decompose the quadrotor dynamics into translational and yawing components and assign collaborative reinforcement learning agents to each part to facilitate more efficient training. Additionally, we introduce regularization terms to mitigate steady-state errors and prevent excessive maneuvers. Benchmark studies, including sim-to-sim transfer verification, demonstrate that our proposed training schemes substantially improve the convergence rate of training, while enhancing flight control performance and stability compared to traditional single-agent approaches.
Submitted 26 February, 2024; v1 submitted 10 November, 2023;
originally announced November 2023.
-
Multi Higgs Boson Signals of a Modified Muon Yukawa Coupling at a Muon Collider
Authors:
Radovan Dermisek,
Keith Hermanek,
Taegyu Lee,
Navin McGinnis,
Sangsik Yoon
Abstract:
We study di-Higgs and tri-Higgs boson productions at a muon collider as functions of the modification of the muon Yukawa coupling resulting from new physics parameterized by the dimension 6 mass operator. We show that the di-Higgs signal can be used to observe a deviation in the muon Yukawa coupling at the 10 % level for $\sqrt{s} = 10$ TeV and at the 3.5 % level for $\sqrt{s} = 30$ TeV. The tri-Higgs signal improves the sensitivity dramatically with increasing $\sqrt{s}$, reaching 0.8 % at $\sqrt{s} = 30$ TeV. We also study all processes involving Goldstone bosons originating from the same operator, discuss possible model dependence resulting from other operators of dimension 6 and higher, and identify multi-Higgs productions and one additional process as golden channels. We further extend the study to the two Higgs doublet model type-II and show that di-Higgs and tri-Higgs signals involving heavy Higgs bosons can be enhanced by a factor of $(\tan β)^6$, which results in the potential sensitivity to a modified muon Yukawa coupling at the $10^{-6}$ level already at a $\sqrt{s} = 10 $ TeV muon collider. The results can be easily customized for other extensions of the Higgs sector.
Submitted 24 May, 2024; v1 submitted 8 November, 2023;
originally announced November 2023.
-
Retargeting video with an end-to-end framework
Authors:
Thi-Ngoc-Hanh Le,
HuiGuang Huang,
Yi-Ru Chen,
Tong-Yee Lee
Abstract:
Video holds significance in computer graphics applications. Because of the heterogeneity of digital devices, retargeting videos becomes an essential function to enhance the user viewing experience in such applications. In video retargeting research, preserving the relevant visual content, avoiding flickering, and keeping processing time low are the vital challenges. Extending image retargeting techniques to the video domain is challenging due to the high running time. Prior work on video retargeting mainly relies on time-consuming preprocessing to analyze frames. In addition, tolerance of different video content, preventing important objects from shrinking, and support for arbitrary aspect ratios are limitations of these systems that remain to be resolved. In this paper, we present an end-to-end RETVI method to retarget videos to arbitrary aspect ratios. We eliminate the computational bottleneck of conventional approaches by designing RETVI with two modules, a content feature analyzer (CFA) and an adaptive deforming estimator (ADE). Extensive experiments and evaluations show that our system outperforms previous work in quality and running time. Visit our project website for more results at http://graphics.csie.ncku.edu.tw/RETVI.
Submitted 8 November, 2023; v1 submitted 7 November, 2023;
originally announced November 2023.