-
Improving Instruction Following in Language Models through Proxy-Based Uncertainty Estimation
Authors:
JoonHo Lee,
Jae Oh Woo,
Juree Seok,
Parisa Hassanzadeh,
Wooseok Jang,
JuYoun Son,
Sima Didari,
Baruch Gutow,
Heng Hao,
Hankyu Moon,
Wenjun Hu,
Yeong-Dae Kwon,
Taehee Lee,
Seungjai Min
Abstract:
Assessing response quality to instructions in language models is vital but challenging due to the complexity of human language across different contexts. This complexity often results in ambiguous or inconsistent interpretations, making accurate assessment difficult. To address this issue, we propose a novel Uncertainty-aware Reward Model (URM) that introduces a robust uncertainty estimation for the quality of paired responses based on Bayesian approximation. Trained with preference datasets, our uncertainty-enabled proxy not only scores rewards for responses but also evaluates their inherent uncertainty. Empirical results demonstrate significant benefits of incorporating the proposed proxy into language model training. Our method boosts the instruction following capability of language models by refining data curation for training and improving policy optimization objectives, thereby surpassing existing methods by a large margin on benchmarks such as Vicuna and MT-bench. These findings highlight that our proposed approach substantially advances language model training and paves a new way of harnessing uncertainty within language models.
Submitted 31 January, 2025; v1 submitted 10 May, 2024;
originally announced May 2024.
-
Real-Time 4K Super-Resolution of Compressed AVIF Images. AIS 2024 Challenge Survey
Authors:
Marcos V. Conde,
Zhijun Lei,
Wen Li,
Cosmin Stejerean,
Ioannis Katsavounidis,
Radu Timofte,
Kihwan Yoon,
Ganzorig Gankhuyag,
Jiangtao Lv,
Long Sun,
Jinshan Pan,
Jiangxin Dong,
Jinhui Tang,
Zhiyuan Li,
Hao Wei,
Chenyang Ge,
Dongyang Zhang,
Tianle Liu,
Huaian Chen,
Yi Jin,
Menghan Zhou,
Yiqiang Yan,
Si Gao,
Biao Wu,
Shaoli Liu
, et al. (50 additional authors not shown)
Abstract:
This paper introduces a novel benchmark as part of the AIS 2024 Real-Time Image Super-Resolution (RTSR) Challenge, which aims to upscale compressed images from 540p to 4K resolution (4x factor) in real-time on commercial GPUs. For this, we use a diverse test set containing a variety of 4K images ranging from digital art to gaming and photography. The images are compressed using the modern AVIF codec, instead of JPEG. All the proposed methods improve PSNR fidelity over Lanczos interpolation, and process images under 10ms. Out of the 160 participants, 25 teams submitted their code and models. The solutions present novel designs tailored for memory-efficiency and runtime on edge devices. This survey describes the best solutions for real-time SR of compressed high-resolution images.
Submitted 25 April, 2024;
originally announced April 2024.
-
Translation of Multifaceted Data without Re-Training of Machine Translation Systems
Authors:
Hyeonseok Moon,
Seungyoon Lee,
Seongtae Hong,
Seungjun Lee,
Chanjun Park,
Heuiseok Lim
Abstract:
Translating major language resources to build minor language resources has become a widely used approach. Particularly in translating complex data points composed of multiple components, it is common to translate each component separately. However, we argue that this practice often overlooks the interrelation between components within the same data point. To address this limitation, we propose a novel MT pipeline that considers the intra-data relation in implementing MT for training data. In our MT pipeline, all the components in a data point are concatenated to form a single translation sequence and subsequently reconstructed to the data components after translation. We introduce a Catalyst Statement (CS) to enhance the intra-data relation, and an Indicator Token (IT) to assist the decomposition of a translated sequence into its respective data components. Through our approach, we have achieved a considerable improvement in translation quality itself, along with its effectiveness as training data. Compared with the conventional approach that translates each data component separately, our method yields better training data that enhances the performance of the trained model by 2.690 points for the web page ranking (WPR) task, and 0.845 for the question generation (QG) task in the XGLUE benchmark.
Submitted 24 September, 2024; v1 submitted 24 April, 2024;
originally announced April 2024.
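A minimal sketch of the concatenate, translate, and decompose flow described in the abstract above. It assumes bracketed numeric markers such as [1] stand in for the indicator tokens, an optional free-text prefix stands in for the catalyst statement, and the caller supplies the `translate` function. All names and the marker format are illustrative assumptions, not the authors' implementation.

```python
import re
from typing import Callable, Dict, List


def concatenate(components: Dict[str, str], catalyst: str = "") -> str:
    """Join all components of one data point into a single sequence.

    Each component is prefixed with a hypothetical indicator token "[i]".
    An optional catalyst statement is placed in front of the sequence.
    """
    parts = [f"[{i}] {text}" for i, text in enumerate(components.values(), start=1)]
    sequence = " ".join(parts)
    return f"{catalyst} {sequence}" if catalyst else sequence


def decompose(translated: str, keys: List[str]) -> Dict[str, str]:
    """Split a translated sequence back into components via the indicator tokens."""
    # re.split with a capture group yields [prefix, "1", text1, "2", text2, ...].
    pieces = re.split(r"\[(\d+)\]", translated)
    found = {int(pieces[i]): pieces[i + 1].strip() for i in range(1, len(pieces) - 1, 2)}
    if sorted(found) != list(range(1, len(keys) + 1)):
        raise ValueError("indicator tokens were lost or altered during translation")
    return {key: found[i] for i, key in enumerate(keys, start=1)}


def translate_data_point(
    components: Dict[str, str],
    translate: Callable[[str], str],
    catalyst: str = "",
) -> Dict[str, str]:
    """Translate every component of a data point in one MT call."""
    sequence = concatenate(components, catalyst)
    return decompose(translate(sequence), list(components))


if __name__ == "__main__":
    # str.upper stands in for a real MT system in this toy run.
    point = {"question": "who wrote it?", "answer": "a poet"}
    print(translate_data_point(point, str.upper))
```

The explicit check in `decompose` reflects one possible design choice: a data point whose markers do not survive translation is rejected rather than silently misaligned.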
-
Development of a data overflow protection system for Super-Kamiokande to maximize data from nearby supernovae
Authors:
M. Mori,
K. Abe,
Y. Hayato,
K. Hiraide,
K. Hosokawa,
K. Ieki,
M. Ikeda,
J. Kameda,
Y. Kanemura,
R. Kaneshima,
Y. Kashiwagi,
Y. Kataoka,
S. Miki,
S. Mine,
M. Miura,
S. Moriyama,
Y. Nakano,
M. Nakahata,
S. Nakayama,
Y. Noguchi,
K. Okamoto,
K. Sato,
H. Sekiya,
H. Shiba,
K. Shimizu
, et al. (230 additional authors not shown)
Abstract:
Neutrinos from very nearby supernovae, such as Betelgeuse, are expected to generate more than ten million events over 10\,s in Super-Kamiokande (SK). At such large event rates, the buffers of the SK analog-to-digital conversion board (QBEE) will overflow, causing random loss of data that is critical for understanding the dynamics of the supernova explosion mechanism. In order to solve this problem, two new DAQ modules were developed to aid in the observation of very nearby supernovae. The first of these, the SN module, is designed to save only the number of hit PMTs during a supernova burst, and the second, the Veto module, prescales the high-rate neutrino events to prevent the QBEE from overflowing based on information from the SN module. In the event of a very nearby supernova, these modules allow SK to reconstruct the time evolution of the neutrino event rate from beginning to end using both QBEE and SN module data. This paper presents the development and testing of these modules together with an analysis of supernova-like data generated with a flashing laser diode. We demonstrate that the Veto module successfully prevents DAQ overflows for Betelgeuse-like supernovae, and we demonstrate the long-term stability of the new modules. During normal running the Veto module is found to issue DAQ vetoes a few times per month, resulting in a total dead time of less than 1\,ms, and does not influence ordinary operations. Additionally, using simulation data we find that supernovae closer than 800~pc will trigger the Veto module, resulting in a prescaling of the observed neutrino data.
Submitted 13 August, 2024; v1 submitted 12 April, 2024;
originally announced April 2024.
-
Evaluation of the performance of the event reconstruction algorithms in the JSNS$^2$ experiment using a $^{252}$Cf calibration source
Authors:
D. H. Lee,
M. K. Cheoun,
J. H. Choi,
J. Y. Choi,
T. Dodo,
J. Goh,
K. Haga,
M. Harada,
S. Hasegawa,
W. Hwang,
T. Iida,
H. I. Jang,
J. S. Jang,
K. K. Joo,
D. E. Jung,
S. K. Kang,
Y. Kasugai,
T. Kawasaki,
E. J. Kim,
J. Y. Kim,
S. B. Kim,
W. Kim,
H. Kinoshita,
T. Konno,
I. T. Lim
, et al. (28 additional authors not shown)
Abstract:
JSNS$^2$ searches for short baseline neutrino oscillations with a baseline of 24~meters and a target of 17~tonnes of Gd-loaded liquid scintillator. A correct event reconstruction algorithm, which determines the position and energy of neutrino interactions in the detector, is essential for the physics analysis of the data from the experiment. Therefore, the performance of the event reconstruction is carefully checked with calibrations using a $^{252}$Cf source. This manuscript describes the methodology and the performance of the event reconstruction.
Submitted 19 January, 2025; v1 submitted 5 April, 2024;
originally announced April 2024.
-
Pulse Shape Discrimination in JSNS$^2$
Authors:
T. Dodo,
M. K. Cheoun,
J. H. Choi,
J. Y. Choi,
J. Goh,
K. Haga,
M. Harada,
S. Hasegawa,
W. Hwang,
T. Iida,
H. I. Jang,
J. S. Jang,
K. K. Joo,
D. E. Jung,
S. K. Kang,
Y. Kasugai,
T. Kawasaki,
E. J. Kim,
J. Y. Kim,
S. B. Kim,
W. Kim,
H. Kinoshita,
T. Konno,
D. H. Lee,
I. T. Lim
, et al. (29 additional authors not shown)
Abstract:
JSNS$^2$ (J-PARC Sterile Neutrino Search at J-PARC Spallation Neutron Source) is an experiment that is searching for sterile neutrinos via the observation of $\bar{ν}_μ \rightarrow \bar{ν}_e$ appearance oscillations using neutrinos from muon decay-at-rest. For this search, rejecting cosmic-ray-induced neutron events by Pulse Shape Discrimination (PSD) is essential because the JSNS$^2$ detector is located above ground, on the third floor of the building. We have achieved 95$\%$ rejection of neutron events while keeping 90$\%$ of signal (electron-like) events using a data-driven likelihood method.
Submitted 22 February, 2025; v1 submitted 28 March, 2024;
originally announced April 2024.
-
Some remarks on the $\mathcal{K}_{p,1}$ Theorem
Authors:
Yeongrak Kim,
Hyunsuk Moon,
Euisung Park
Abstract:
Let $X$ be a non-degenerate projective irreducible variety of dimension $n \ge 1$, degree $d$, and codimension $e \ge 2$ over an algebraically closed field $\mathbb{K}$ of characteristic $0$. Let $β_{p,q} (X)$ be the $(p,q)$-th graded Betti number of $X$. M. Green proved the celebrated $\mathcal K_{p,1}$-theorem about the vanishing of $β_{p,1} (X)$ for high values of $p$ and potential examples of nonvanishing graded Betti numbers. Later, Nagel-Pitteloud and Brodmann-Schenzel classified varieties with nonvanishing $β_{e-1,1}(X)$. It is clear that $β_{e-1,1}(X) \neq 0$ when there is an $(n+1)$-dimensional variety of minimal degree containing $X$; however, this is not always the case, as seen in the example of the triple Veronese surface in $\mathbb{P}^9$. In this paper, we completely classify varieties $X$ with $β_{e-1,1}(X) \neq 0$ such that $X$ does not lie on an $(n+1)$-dimensional variety of minimal degree. They are exactly cones over smooth del Pezzo varieties whose Picard number is $\le n-1$.
Submitted 4 April, 2024;
originally announced April 2024.
-
Least Squares Inference for Data with Network Dependency
Authors:
Jing Lei,
Kehui Chen,
Haeun Moon
Abstract:
We address the inference problem concerning regression coefficients in a classical linear regression model using least squares estimates. The analysis is conducted under circumstances where network dependency exists across units in the sample. Neglecting the dependency among observations may lead to biased estimation of the asymptotic variance and often inflates the Type I error in coefficient inference. In this paper, we first establish a central limit theorem for the ordinary least squares estimate, with a verifiable dependence condition alongside corresponding neighborhood growth conditions. Subsequently, we propose a consistent estimator for the asymptotic variance of the estimated coefficients, which employs a data-driven method to balance the bias-variance trade-off. We find that the optimal tuning depends on the linear hypothesis under consideration and must be chosen adaptively. The presented theory and methods are illustrated and supported by numerical experiments and a data example.
Submitted 2 April, 2024;
originally announced April 2024.
-
Mining Sequential Patterns in Uncertain Databases Using Hierarchical Index Structure
Authors:
Kashob Kumar Roy,
Md Hasibul Haque Moon,
Md Mahmudur Rahman,
Chowdhury Farhan Ahmed,
Carson K. Leung
Abstract:
In this uncertain world, data uncertainty is inherent in many applications and its importance is growing drastically due to the rapid development of modern technologies. Nowadays, researchers have paid more attention to mining patterns in uncertain databases. A few recent works attempt to mine frequent uncertain sequential patterns. Despite their success, they fail to reduce the number of false-positive patterns generated during mining and to maintain the patterns efficiently. In this paper, we propose multiple theoretically tightened pruning upper bounds that remarkably reduce the mining space. A novel hierarchical structure is introduced to maintain the patterns in a space-efficient way. Afterward, we develop a versatile framework for mining uncertain sequential patterns that can effectively handle weight constraints as well. Besides, with the advent of incremental uncertain databases, existing works are not scalable. There exist several incremental sequential pattern mining algorithms, but they are limited to mining precise databases. Therefore, we propose a new technique to adapt our framework to mine patterns when the database is incremental. Finally, we conduct extensive experiments on several real-life datasets and show the efficacy of our framework in different applications.
Submitted 31 March, 2024;
originally announced April 2024.
-
Mining Weighted Sequential Patterns in Incremental Uncertain Databases
Authors:
Kashob Kumar Roy,
Md Hasibul Haque Moon,
Md Mahmudur Rahman,
Chowdhury Farhan Ahmed,
Carson Kai-Sang Leung
Abstract:
Due to the rapid development of science and technology, the importance of imprecise, noisy, and uncertain data is increasing at an exponential rate. Thus, mining patterns in uncertain databases has drawn the attention of researchers. Moreover, frequent sequences of items from these databases need to be discovered for meaningful knowledge with great impact. In many real cases, weights of items and patterns are introduced to find interesting sequences as a measure of importance. Hence, a constraint of weight needs to be handled while mining sequential patterns. Besides, due to the dynamic nature of databases, mining important information has become more challenging. Instead of mining patterns from scratch after each increment, incremental mining algorithms utilize previously mined information to update the result immediately. Several algorithms exist to mine frequent patterns and weighted sequences from incremental databases. However, these algorithms are confined to mining precise databases. Therefore, we have developed an algorithm to mine frequent sequences in an uncertain database in this work. Furthermore, we have proposed two new techniques for mining when the database is incremental. Extensive experiments have been conducted for performance evaluation. The analysis showed the efficiency of our proposed framework.
Submitted 31 March, 2024;
originally announced April 2024.
-
Augmented Doubly Robust Post-Imputation Inference for Proteomic Data
Authors:
Haeun Moon,
Jin-Hong Du,
Jing Lei,
Kathryn Roeder
Abstract:
Quantitative measurements produced by mass spectrometry proteomics experiments offer a direct way to explore the role of proteins in molecular mechanisms. However, analysis of such data is challenging due to the large proportion of missing values. A common strategy to address this issue is to utilize an imputed dataset, which often introduces systematic bias into downstream analyses if the imputation errors are ignored. In this paper, we propose a statistical framework inspired by doubly robust estimators that offers valid and efficient inference for proteomic data. Our framework combines powerful machine learning tools, such as variational autoencoders, to augment the imputation quality with high-dimensional peptide data, and a parametric model to estimate the propensity score for debiasing imputed outcomes. Our estimator is compatible with the double machine learning framework and has provable properties. In application to both single-cell and bulk-cell proteomic data, our method utilizes the imputed data to gain additional, meaningful discoveries while maintaining good control of false positives.
Submitted 20 January, 2025; v1 submitted 23 March, 2024;
originally announced March 2024.
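For orientation, the idea of debiasing imputed outcomes with a propensity score can be illustrated with the classical augmented inverse-probability-weighted (AIPW) mean estimator, the textbook doubly robust estimator. This is a sketch of that standard estimator only, not the authors' framework; the function and argument names are hypothetical.

```python
from typing import Optional, Sequence


def aipw_mean(
    y: Sequence[Optional[float]],
    observed: Sequence[bool],
    imputed: Sequence[float],
    propensity: Sequence[float],
) -> float:
    """Textbook AIPW estimate of the mean outcome under missing data.

    y[i]          observed outcome, or None when missing
    observed[i]   True when y[i] was actually measured
    imputed[i]    model prediction m(x_i), available for every unit
    propensity[i] estimated probability that unit i is observed (must be > 0)

    Each unit contributes m(x_i) + R_i * (y_i - m(x_i)) / pi(x_i), so the
    imputation is corrected by the inverse-probability-weighted residual
    on the observed units only.
    """
    total = 0.0
    for y_i, r_i, m_i, p_i in zip(y, observed, imputed, propensity):
        correction = (y_i - m_i) / p_i if r_i else 0.0
        total += m_i + correction
    return total / len(imputed)


if __name__ == "__main__":
    estimate = aipw_mean(
        y=[1.0, None, 3.0],
        observed=[True, False, True],
        imputed=[1.5, 2.0, 2.5],
        propensity=[0.5, 0.5, 0.5],
    )
    print(estimate)
```

The estimator remains consistent if either the imputation model or the propensity model is correctly specified, which is the sense in which it is called doubly robust.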
-
Measurements of the charge ratio and polarization of cosmic-ray muons with the Super-Kamiokande detector
Authors:
H. Kitagawa,
T. Tada,
K. Abe,
C. Bronner,
Y. Hayato,
K. Hiraide,
K. Hosokawa,
K. Ieki,
M. Ikeda,
J. Kameda,
Y. Kanemura,
R. Kaneshima,
Y. Kashiwagi,
Y. Kataoka,
S. Miki,
S. Mine,
M. Miura,
S. Moriyama,
Y. Nakano,
M. Nakahata,
S. Nakayama,
Y. Noguchi,
K. Okamoto,
K. Sato,
H. Sekiya
, et al. (231 additional authors not shown)
Abstract:
We present the results of the charge ratio ($R$) and polarization ($P^μ_{0}$) measurements using the decay electron events collected from 2008 September to 2022 June by the Super-Kamiokande detector. Because of its underground location and long operation, we performed high precision measurements by accumulating cosmic-ray muons. We measured the muon charge ratio to be $R=1.32 \pm 0.02$ $(\mathrm{stat.}{+}\mathrm{syst.})$ at $E_μ\cos θ_{\mathrm{Zenith}}=0.7^{+0.3}_{-0.2}$ $\mathrm{TeV}$, where $E_μ$ is the muon energy and $θ_{\mathrm{Zenith}}$ is the zenith angle of incoming cosmic-ray muons. This result is consistent with the Honda flux model, while it suggests a tension of $1.9σ$ with the $πK$ model. We also measured the muon polarization at the production location to be $P^μ_{0}=0.52 \pm 0.02$ $(\mathrm{stat.}{+}\mathrm{syst.})$ at the muon momentum of $0.9^{+0.6}_{-0.1}$ $\mathrm{TeV}/c$ at the surface of the mountain; this also suggests a tension of $1.5σ$ with the Honda flux model. This is the most precise experimental determination to date of the cosmic-ray muon polarization near $1~\mathrm{TeV}/c$. These measurement results are useful for improving the atmospheric neutrino simulations.
Submitted 4 November, 2024; v1 submitted 13 March, 2024;
originally announced March 2024.
-
Complementarity of which-path information in induced and stimulated coherences via four-wave mixing process from warm Rb atomic ensemble
Authors:
Danbi Kim,
Jiho Park,
Changhoon Baek,
Sun Kyung Lee,
Han Seb Moon
Abstract:
We report a systematic approach for establishing a complementary relationship between the interference visibility, concurrence, and predictability in the crossing of induced and stimulated coherences of two-mode squeezed coherent states. This is achieved using a double-path interferometer involving two independent four-wave mixing (FWM) atomic samples generated via spontaneous and stimulated FWM processes from a warm atomic ensemble of $^{87}$Rb. We demonstrate that the transition from quantum to classical behavior can be characterized by the induced coherence effect, distinguishing between the two-mode squeezed vacuum and coherent states. Moreover, our experimental scheme, employing two FWM atomic ensembles with long-coherent photons, provides valuable insights into the complementarity of which-path information in induced and stimulated coherences.
Submitted 13 March, 2024;
originally announced March 2024.
-
Second gadolinium loading to Super-Kamiokande
Authors:
K. Abe,
C. Bronner,
Y. Hayato,
K. Hiraide,
K. Hosokawa,
K. Ieki,
M. Ikeda,
J. Kameda,
Y. Kanemura,
R. Kaneshima,
Y. Kashiwagi,
Y. Kataoka,
S. Miki,
S. Mine,
M. Miura,
S. Moriyama,
Y. Nakano,
M. Nakahata,
S. Nakayama,
Y. Noguchi,
K. Sato,
H. Sekiya,
H. Shiba,
K. Shimizu,
M. Shiozawa
, et al. (225 additional authors not shown)
Abstract:
The first loading of gadolinium (Gd) into Super-Kamiokande in 2020 was successful, and the neutron capture efficiency on Gd reached 50\%. To further increase the Gd neutron capture efficiency to 75\%, 26.1 tons of $\rm Gd_2(\rm SO_4)_3\cdot \rm 8H_2O$ was additionally loaded into Super-Kamiokande (SK) from May 31 to July 4, 2022. As the amount of loaded $\rm Gd_2(\rm SO_4)_3\cdot \rm 8H_2O$ was doubled compared to the first loading, the capacity of the powder dissolving system was doubled. We also developed new batches of gadolinium sulfate with even further reduced radioactive impurities. In addition, a more efficient screening method was devised and implemented to evaluate these new batches of $\rm Gd_2(\rm SO_4)_3\cdot \rm 8H_2O$. Following the second loading, the Gd concentration in SK was measured to be $333.5\pm2.5$ ppm via an Atomic Absorption Spectrometer (AAS). From the mean neutron capture time constant of neutrons from an Am/Be calibration source, the Gd concentration was independently measured to be 332.7 $\pm$ 6.8(sys.) $\pm$ 1.1(stat.) ppm, consistent with the AAS result. Furthermore, during the loading the Gd concentration was monitored continually using the capture time constant of each spallation neutron produced by cosmic-ray muons, and the final neutron capture efficiency was shown to become 1.5 times higher than that of the first loaded phase, as expected.
Submitted 18 June, 2024; v1 submitted 12 March, 2024;
originally announced March 2024.
-
Performance of SK-Gd's Upgraded Real-time Supernova Monitoring System
Authors:
Y. Kashiwagi,
K. Abe,
C. Bronner,
Y. Hayato,
K. Hiraide,
K. Hosokawa,
K. Ieki,
M. Ikeda,
J. Kameda,
Y. Kanemura,
R. Kaneshima,
Y. Kataoka,
S. Miki,
S. Mine,
M. Miura,
S. Moriyama,
Y. Nakano,
M. Nakahata,
S. Nakayama,
Y. Noguchi,
K. Sato,
H. Sekiya,
H. Shiba,
K. Shimizu,
M. Shiozawa
, et al. (214 additional authors not shown)
Abstract:
Among multi-messenger observations of the next galactic core-collapse supernova, Super-Kamiokande (SK) plays a critical role in detecting the emitted supernova neutrinos, determining the direction to the supernova (SN), and notifying the astronomical community of these observations in advance of the optical signal. In 2022, SK increased the gadolinium dissolved in its water target (SK-Gd) and achieved a Gd concentration of 0.033%, resulting in enhanced neutron detection capability, which in turn enables more accurate determination of the supernova direction. Accordingly, SK-Gd's real-time supernova monitoring system (Abe et al. 2016b) has been upgraded. SK_SN Notice, a warning system that works together with this monitoring system, was released on December 13, 2021, and is available through GCN Notices (Barthelmy et al. 2000). When the monitoring system detects an SN-like burst of events, SK_SN Notice will automatically distribute an alarm with the reconstructed direction to the supernova candidate within a few minutes. In this paper, we present a systematic study of SK-Gd's response to a simulated galactic SN. Assuming a supernova situated at 10 kpc, neutrino fluxes from six supernova models are used to characterize SK-Gd's pointing accuracy using the same tools as the online monitoring system. The pointing accuracy is found to vary from 3-7$^\circ$ depending on the models. However, if the supernova is closer than 10 kpc, SK_SN Notice can issue an alarm with three-degree accuracy, which will benefit follow-up observations by optical telescopes with large fields of view.
Submitted 13 March, 2024; v1 submitted 11 March, 2024;
originally announced March 2024.
-
Precise Extraction of Deep Learning Models via Side-Channel Attacks on Edge/Endpoint Devices
Authors:
Younghan Lee,
Sohee Jun,
Yungi Cho,
Woorim Han,
Hyungon Moon,
Yunheung Paek
Abstract:
With growing popularity, deep learning (DL) models are becoming larger-scale, and only companies with vast training datasets and immense computing power can sustain a business serving such large models. Most of those DL models are proprietary to the companies, who thus strive to keep their private models safe from the model extraction attack (MEA), whose aim is to steal the model by training surrogate models. Nowadays, companies are inclined to offload the models from central servers to edge/endpoint devices. As revealed in the latest studies, adversaries exploit this opportunity as a new attack vector to launch side-channel attacks (SCA) on the device running the victim model and obtain various pieces of the model information, such as the model architecture (MA) and image dimension (ID). Our work provides, for the first time, a comprehensive understanding of the relationship between the information exposed by SCA and MEA, and would benefit future MEA studies on both the offensive and defensive sides in that they may learn which pieces of information exposed by SCA are more important than the others. Our analysis additionally reveals that by grasping the victim model information from SCA, MEA can become highly effective and successful even without any prior knowledge of the model. Finally, to evince the practicality of our analysis results, we empirically apply SCA and subsequently carry out MEA under realistic threat assumptions. The results show up to 5.8 times better performance than when the adversary has no model information about the victim model.
Submitted 5 March, 2024;
originally announced March 2024.
-
Collective biphoton temporal waveform of photon-pair generated from Doppler-broadened atomic ensemble
Authors:
Heewoo Kim,
Hansol Jeong,
Han Seb Moon
Abstract:
Photonic quantum states generated from atomic ensembles will play important roles in future quantum networks and long-distance quantum communication because their advantages, such as universal identity and narrow spectral bandwidth, are essential for quantum nodes and quantum repeaters based on atomic ensembles. In this study of the biphoton temporal waveform (BTW) of the photon pairs generated from a cascade-type two-photon transition, we report the collectively coherent superposition of biphoton wavefunctions emitted from different velocity classes in a Doppler-broadened cascade-type atomic ensemble. We experimentally demonstrate that the temporal widths of the two BTWs differ by a factor of three, depending on the wavelengths of the signal and idler photons from the $6S_{1/2}$-$6P_{3/2}$-$6D_{5/2}$ and $6S_{1/2}$-$6P_{3/2}$-$8S_{1/2}$ transitions of Cs, corresponding to the idler and signal wavelengths of 852 nm-917 nm and 852 nm-795 nm, respectively. Our results help in understanding the characteristics of biphoton sources from a warm atomic ensemble and can be applied to long-distance quantum networks and practical quantum repeaters based on atom-photon interactions.
Submitted 9 February, 2024;
originally announced February 2024.
-
A State-of-the-art Survey on Full-duplex Network Design
Authors:
Yonghwi Kim,
Hyung-Joo Moon,
Hanju Yoo,
Byoungnam Kim,
Kai-Kit Wong,
Chan-Byoung Chae
Abstract:
Full-duplex (FD) technology is gaining popularity for integration into a wide range of wireless networks due to its demonstrated potential in recent studies. In contrast to half-duplex (HD) technology, the implementation of FD in networks necessitates considering inter-node interference (INI) from various network perspectives. When deploying FD technology in networks, several critical factors must be taken into account. These include self-interference (SI) and the requisite SI cancellation (SIC) processes, as well as the selection of multiple user equipment (UE) per time slot. Additionally, inter-node interference (INI), including cross-link interference (CLI) and inter-cell interference (ICI), become crucial issues during concurrent uplink (UL) and downlink (DL) transmission and reception, similar to SI. Since most INI is challenging to eliminate, a comprehensive investigation that covers radio resource control (RRC), medium access control (MAC), and the physical layer (PHY) is essential in the context of FD network design, rather than focusing on individual network layers and types. This paper covers state-of-the-art studies, including protocols and documents from 3GPP for FD, MAC protocol, user scheduling, and CLI handling. The methods are also compared through a network-level system simulation based on 3D ray-tracing.
Submitted 7 February, 2024;
originally announced February 2024.
-
Toward Practical Automatic Speech Recognition and Post-Processing: a Call for Explainable Error Benchmark Guideline
Authors:
Seonmin Koo,
Chanjun Park,
Jinsung Kim,
Jaehyung Seo,
Sugyeong Eo,
Hyeonseok Moon,
Heuiseok Lim
Abstract:
Automatic speech recognition (ASR) outcomes serve as input for downstream tasks, substantially impacting the satisfaction level of end-users. Hence, the diagnosis and enhancement of the vulnerabilities present in the ASR model bear significant importance. However, traditional evaluation methodologies of ASR systems generate a singular, composite quantitative metric, which fails to provide comprehensive insight into specific vulnerabilities. This lack of detail extends to the post-processing stage, resulting in further obfuscation of potential weaknesses. Despite an ASR model's ability to recognize utterances accurately, subpar readability can negatively affect user satisfaction, giving rise to a trade-off between recognition accuracy and user-friendliness. To effectively address this, it is imperative to consider both the speech-level, crucial for recognition accuracy, and the text-level, critical for user-friendliness. Consequently, we propose the development of an Error Explainable Benchmark (EEB) dataset. This dataset, while considering both speech- and text-level, enables a granular understanding of the model's shortcomings. Our proposition provides a structured pathway for a more `real-world-centric' evaluation, a marked shift away from abstracted, traditional methods, allowing for the detection and rectification of nuanced system weaknesses, ultimately aiming for an improved user experience.
Submitted 25 January, 2024;
originally announced January 2024.
-
Solar neutrino measurements using the full data period of Super-Kamiokande-IV
Authors:
Super-Kamiokande Collaboration,
K. Abe,
C. Bronner,
Y. Hayato,
K. Hiraide,
K. Hosokawa,
K. Ieki,
M. Ikeda,
S. Imaizumi,
K. Iyogi,
J. Kameda,
Y. Kanemura,
R. Kaneshima,
Y. Kashiwagi,
Y. Kataoka,
Y. Kato,
Y. Kishimoto,
S. Miki,
S. Mine,
M. Miura,
T. Mochizuki,
S. Moriyama,
Y. Nagao,
M. Nakahata
, et al. (305 additional authors not shown)
Abstract:
An analysis of solar neutrino data from the fourth phase of Super-Kamiokande~(SK-IV) from October 2008 to May 2018 is performed and the results are presented. The observation time of the data set of SK-IV corresponds to $2970$~days and the total live time for all four phases is $5805$~days. For more precise solar neutrino measurements, several improvements are applied in this analysis: lowering the data acquisition threshold in May 2015, further reduction of the spallation background using neutron clustering events, precise energy reconstruction considering the time variation of the PMT gain. The observed number of solar neutrino events in $3.49$--$19.49$ MeV electron kinetic energy region during SK-IV is $65,443^{+390}_{-388}\,(\mathrm{stat.})\pm 925\,(\mathrm{syst.})$ events. Corresponding $\mathrm{^{8}B}$ solar neutrino flux is $(2.314 \pm 0.014\, \rm{(stat.)} \pm 0.040 \, \rm{(syst.)}) \times 10^{6}~\mathrm{cm^{-2}\,s^{-1}}$, assuming a pure electron-neutrino flavor component without neutrino oscillations. The flux combined with all SK phases up to SK-IV is $(2.336 \pm 0.011\, \rm{(stat.)} \pm 0.043 \, \rm{(syst.)}) \times 10^{6}~\mathrm{cm^{-2}\,s^{-1}}$. Based on the neutrino oscillation analysis from all solar experiments, including the SK $5805$~days data set, the best-fit neutrino oscillation parameters are $\rm{sin^{2} θ_{12,\,solar}} = 0.306 \pm 0.013 $ and $Δm^{2}_{21,\,\mathrm{solar}} = (6.10^{+ 0.95}_{-0.81}) \times 10^{-5}~\rm{eV}^{2}$, with a deviation of about 1.5$σ$ from the $Δm^{2}_{21}$ parameter obtained by KamLAND. The best-fit neutrino oscillation parameters obtained from all solar experiments and KamLAND are $\sin^{2} θ_{12,\,\mathrm{global}} = 0.307 \pm 0.012 $ and $Δm^{2}_{21,\,\mathrm{global}} = (7.50^{+ 0.19}_{-0.18}) \times 10^{-5}~\rm{eV}^{2}$.
Submitted 20 February, 2024; v1 submitted 20 December, 2023;
originally announced December 2023.
-
Nanoscale confinement and control of excitonic complexes in a monolayer WSe2
Authors:
Hyowon Moon,
Lukas Mennel,
Chitraleema Chakraborty,
Cheng Peng,
Jawaher Almutlaq,
Takashi Taniguchi,
Kenji Watanabe,
Dirk Englund
Abstract:
Nanoscale control and observation of photophysical processes in semiconductors is critical for basic understanding and applications from optoelectronics to quantum information processing. In particular, there are open questions and opportunities in controlling excitonic complexes in two-dimensional materials such as excitons, trions or biexcitons. However, neither conventional diffraction-limited optical spectroscopy nor lithography-limited electric control provides a proper tool to investigate these quasiparticles at the nanometer-scale at cryogenic temperature. Here, we introduce a cryogenic capacitive confocal optical microscope (C3OM) as a tool to study quasiparticle dynamics at the nanometer scale. Using a conductive atomic force microscope (AFM) tip as a gate electrode, we can modulate the electronic doping at the nanometer scale in WSe2 at 4K. This tool allows us to modulate with nanometer-scale confinement the exciton and trion peaks, as well as a distinct photoluminescence line associated with a larger excitonic complex that exhibits distinctive nonlinear optical response. Our results demonstrate nanoscale confinement and spectroscopy of exciton complexes at arbitrary positions, which should prove an important tool for quantitative understanding of complex optoelectronic properties in semiconductors as well as for applications ranging from quantum spin liquids to superresolution measurements to control of quantum emitters.
Submitted 30 November, 2023;
originally announced November 2023.
-
A Computing-in-Memory-based One-Class Hyperdimensional Computing Model for Outlier Detection
Authors:
Ruixuan Wang,
Sabrina Hassan Moon,
Xiaobo Sharon Hu,
Xun Jiao,
Dayane Reis
Abstract:
In this work, we present ODHD, an algorithm for outlier detection based on hyperdimensional computing (HDC), a non-classical learning paradigm. Along with the HDC-based algorithm, we propose IM-ODHD, a computing-in-memory (CiM) implementation based on hardware/software (HW/SW) codesign for improved latency and energy efficiency. The training and testing phases of ODHD may be performed with conventional CPU/GPU hardware or our IM-ODHD, SRAM-based CiM architecture using the proposed HW/SW codesign techniques. We evaluate the performance of ODHD on six datasets from different application domains using three metrics, namely accuracy, F1 score, and ROC-AUC, and compare it with multiple baseline methods such as OCSVM, isolation forest, and autoencoder. The experimental results indicate that ODHD outperforms all the baseline methods in terms of these three metrics on every dataset for both CPU/GPU and CiM implementations. Furthermore, we perform an extensive design space exploration to demonstrate the tradeoff between delay, energy efficiency, and performance of ODHD. We demonstrate that the HW/SW codesign implementation of the outlier detection on IM-ODHD is able to outperform the GPU-based implementation of ODHD by at least 331.5x/889x in terms of training/testing latency (and on average 14.0x/36.9x in terms of training/testing energy consumption).
Submitted 22 February, 2024; v1 submitted 29 November, 2023;
originally announced November 2023.
-
Four-set Hypergraphlets for Characterization of Directed Hypergraphs
Authors:
Heechan Moon,
Hyunju Kim,
Sunwoo Kim,
Kijung Shin
Abstract:
A directed hypergraph, which consists of nodes and hyperarcs, is a higher-order data structure that naturally models directional group interactions (e.g., chemical reactions of molecules). Although there have been extensive studies on local structures of (directed) graphs in the real world, those of directed hypergraphs remain unexplored. In this work, we focus on measurements, findings, and applications related to local structures of directed hypergraphs, and they together contribute to a systematic understanding of various real-world systems interconnected by directed group interactions. Our first contribution is to define 91 directed hypergraphlets (DHGs), which disjointly categorize directed connections and overlaps among four node sets that compose two incident hyperarcs. Our second contribution is to develop exact and approximate algorithms for counting the occurrences of each DHG. Our last contribution is to characterize 11 real-world directed hypergraphs and individual hyperarcs in them using the occurrences of DHGs, which reveals clear domain-based local structural patterns. Our experiments demonstrate that our DHG-based characterization gives up to 12% and 33% better performances on hypergraph clustering and hyperarc prediction, respectively, than baseline characterization methods. Moreover, we show that CODA-A, which is our proposed approximate algorithm, is up to 32X faster than its competitors with similar characterization quality.
Submitted 24 November, 2023;
originally announced November 2023.
-
Atmospheric neutrino oscillation analysis with neutron tagging and an expanded fiducial volume in Super-Kamiokande I-V
Authors:
Super-Kamiokande Collaboration,
T. Wester,
K. Abe,
C. Bronner,
Y. Hayato,
K. Hiraide,
K. Hosokawa,
K. Ieki,
M. Ikeda,
J. Kameda,
Y. Kanemura,
R. Kaneshima,
Y. Kashiwagi,
Y. Kataoka,
S. Miki,
S. Mine,
M. Miura,
S. Moriyama,
Y. Nakano,
M. Nakahata,
S. Nakayama,
Y. Noguchi,
K. Sato,
H. Sekiya
, et al. (212 additional authors not shown)
Abstract:
We present a measurement of neutrino oscillation parameters with the Super-Kamiokande detector using atmospheric neutrinos from the complete pure-water SK I-V (April 1996-July 2020) data set, including events from an expanded fiducial volume. The data set corresponds to 6511.3 live days and an exposure of 484.2 kiloton-years. Measurements of the neutrino oscillation parameters $Δm^2_{32}$, $\sin^2θ_{23}$, $\sin^2 θ_{13}$, $δ_{CP}$, and the preference for the neutrino mass ordering are presented with atmospheric neutrino data alone, and with constraints on $\sin^2 θ_{13}$ from reactor neutrino experiments. Our analysis including constraints on $\sin^2 θ_{13}$ favors the normal mass ordering at the 92.3% level.
Submitted 8 November, 2023;
originally announced November 2023.
-
The High Energy X-ray Probe: Resolved X-ray Populations in Extragalactic Environments
Authors:
Bret D. Lehmer,
Kristen Garofali,
Breanna A. Binder,
Francesca Fornasini,
Neven Vulic,
Andreas Zezas,
Ann Hornschemeier,
Margaret Lazzarini,
Hannah Moon,
Toni Venters,
Daniel Wik,
Mihoko Yukita,
Matteo Bachetti,
Javier A. García,
Brian Grefenstette,
Kristin Madsen,
Kaya Mori,
Daniel Stern
Abstract:
We construct simulated galaxy data sets based on the High Energy X-ray Probe (HEX-P) mission concept to demonstrate the significant advances in galaxy science that will be yielded by the HEX-P observatory. The combination of high spatial resolution imaging ($<$20 arcsec FWHM), broad spectral coverage (0.2-80 keV), and sensitivity superior to current facilities (e.g., XMM-Newton and NuSTAR) will enable HEX-P to detect hard (4-25 keV) X-ray emission from resolved point-source populations within $\sim$800 galaxies and integrated emission from $\sim$6000 galaxies out to 100 Mpc. These galaxies cover wide ranges of galaxy types (e.g., normal, starburst, and passive galaxies) and properties (e.g., metallicities and star-formation histories). In such galaxies, HEX-P will: (1) provide unique information about X-ray binary populations, including accretor demographics (black hole and neutron stars), distributions of accretion states and state transition cadences; (2) place order-of-magnitude more stringent constraints on inverse Compton emission associated with particle acceleration in starburst environments; and (3) put into clear context the contributions from X-ray emitting populations to both ionizing the surrounding interstellar medium in low-metallicity galaxies and heating the intergalactic medium in the $z > 8$ Universe.
Submitted 8 November, 2023;
originally announced November 2023.
-
Measurement of the neutrino-oxygen neutral-current quasielastic cross section using atmospheric neutrinos in the SK-Gd experiment
Authors:
S. Sakai,
K. Abe,
C. Bronner,
Y. Hayato,
K. Hiraide,
K. Hosokawa,
K. Ieki,
M. Ikeda,
J. Kameda,
Y. Kanemura,
R. Kaneshima,
Y. Kashiwagi,
Y. Kataoka,
S. Miki,
S. Mine,
M. Miura,
S. Moriyama,
Y. Nakano,
M. Nakahata,
S. Nakayama,
Y. Noguchi,
K. Sato,
H. Sekiya,
H. Shiba,
K. Shimizu
, et al. (211 additional authors not shown)
Abstract:
We report the first measurement of the atmospheric neutrino-oxygen neutral-current quasielastic (NCQE) cross section in the gadolinium-loaded Super-Kamiokande (SK) water Cherenkov detector. In June 2020, SK began a new experimental phase, named SK-Gd, by loading 0.011% by mass of gadolinium into the ultrapure water of the SK detector. The introduction of gadolinium to ultrapure water has the effect of improving the neutron-tagging efficiency. Using a 552.2 day data set from August 2020 to June 2022, we measure the NCQE cross section to be 0.74 $\pm$ 0.22(stat.) $^{+0.85}_{-0.15}$ (syst.) $\times$ 10$^{-38}$ cm$^{2}$/oxygen in the energy range from 160 MeV to 10 GeV, which is consistent with the atmospheric neutrino-flux-averaged theoretical NCQE cross section and the measurement in the SK pure-water phase within the uncertainties. Furthermore, we compare the models of the nucleon-nucleus interactions in water and find that the Binary Cascade model and the Liege Intranuclear Cascade model provide a somewhat better fit to the observed data than the Bertini Cascade model. Since the atmospheric neutrino-oxygen NCQE reactions are one of the main backgrounds in the search for diffuse supernova neutrino background (DSNB), these new results will contribute to future studies - and the potential discovery - of the DSNB in SK.
Submitted 7 November, 2023;
originally announced November 2023.
-
Photometry of the Didymos system across the DART impact apparition
Authors:
Nicholas Moskovitz,
Cristina Thomas,
Petr Pravec,
Tim Lister,
Tom Polakis,
David Osip,
Theodore Kareta,
Agata Rożek,
Steven R. Chesley,
Shantanu P. Naidu,
Peter Scheirich,
William Ryan,
Eileen Ryan,
Brian Skiff,
Colin Snodgrass,
Matthew M. Knight,
Andrew S. Rivkin,
Nancy L. Chabot,
Vova Ayvazian,
Irina Belskaya,
Zouhair Benkhaldoun,
Daniel N. Berteşteanu,
Mariangela Bonavita,
Terrence H. Bressi,
Melissa J. Brucker
, et al. (56 additional authors not shown)
Abstract:
On 26 September 2022, the Double Asteroid Redirection Test (DART) spacecraft impacted Dimorphos, the satellite of binary near-Earth asteroid (65803) Didymos. This demonstrated the efficacy of a kinetic impactor for planetary defense by changing the orbital period of Dimorphos by 33 minutes (Thomas et al. 2023). Measuring the period change relied heavily on a coordinated campaign of lightcurve photometry designed to detect mutual events (occultations and eclipses) as a direct probe of the satellite's orbital period. A total of 28 telescopes contributed 224 individual lightcurves during the impact apparition from July 2022 to February 2023. We focus here on decomposable lightcurves, i.e. those from which mutual events could be extracted. We describe our process of lightcurve decomposition and use that to release the full data set for future analysis. We leverage these data to place constraints on the post-impact evolution of ejecta. The measured depths of mutual events relative to models showed that the ejecta became optically thin within the first ~1 day after impact, and then faded with a decay time of about 25 days. The bulk magnitude of the system showed that ejecta no longer contributed measurable brightness enhancement after about 20 days post-impact. This bulk photometric behavior was not well represented by an HG photometric model. An HG1G2 model did fit the data well across a wide range of phase angles. Lastly, we note the presence of an ejecta tail through at least March 2023. Its persistence implied ongoing escape of ejecta from the system many months after DART impact.
Submitted 3 November, 2023;
originally announced November 2023.
-
Search for Periodic Time Variations of the Solar $^8$B Neutrino Flux between 1996 and 2018 in Super-Kamiokande
Authors:
K. Abe,
C. Bronner,
Y. Hayato,
K. Hiraide,
K. Hosokawa,
K. Ieki,
M. Ikeda,
J. Kameda,
Y. Kanemura,
R. Kaneshima,
Y. Kashiwagi,
Y. Kataoka,
S. Miki,
S. Mine,
M. Miura,
S. Moriyama,
Y. Nakano,
M. Nakahata,
S. Nakayama,
Y. Noguchi,
K. Sato,
H. Sekiya,
H. Shiba,
K. Shimizu,
M. Shiozawa
, et al. (211 additional authors not shown)
Abstract:
We report a search for time variations of the solar $^8$B neutrino flux using 5804 live days of Super-Kamiokande data collected between May 31, 1996, and May 30, 2018. Super-Kamiokande measured the precise time of each solar neutrino interaction over 22 calendar years to search for solar neutrino flux modulations with unprecedented precision. Periodic modulations are searched for in a dataset comprising five-day interval solar neutrino flux measurements with a maximum likelihood method. We also applied the Lomb-Scargle method to this dataset to compare it with previous reports. The only significant modulation found is due to the elliptic orbit of the Earth around the Sun. The observed modulation is consistent with astronomical data: we measured an eccentricity of (1.53$\pm$0.35)\%, and a perihelion shift of ($-$1.5$\pm$13.5) days.
Submitted 6 June, 2024; v1 submitted 2 November, 2023;
originally announced November 2023.
-
Bayesian Estimation of Panel Models under Potentially Sparse Heterogeneity
Authors:
Hyungsik Roger Moon,
Frank Schorfheide,
Boyuan Zhang
Abstract:
We incorporate a version of a spike and slab prior, comprising a pointmass at zero ("spike") and a Normal distribution around zero ("slab") into a dynamic panel data framework to model coefficient heterogeneity. In addition to homogeneity and full heterogeneity, our specification can also capture sparse heterogeneity, that is, there is a core group of units that share common parameters and a set of deviators with idiosyncratic parameters. We fit a model with unobserved components to income data from the Panel Study of Income Dynamics. We find evidence for sparse heterogeneity for balanced panels composed of individuals with long employment histories.
Submitted 5 February, 2024; v1 submitted 20 October, 2023;
originally announced October 2023.
-
On algebraic space filling curves
Authors:
Alana Campbell,
Flora Dedvukaj,
Donald McCormick III,
Han-Bom Moon,
Joshua Morales
Abstract:
Poonen and Gabber independently showed that any smooth geometrically irreducible projective scheme over a finite field has a smooth space filling curve, that is, a smooth curve defined over the field and passes through all points over the field. However, except the case of projective plane, no concrete example was found in literature. In this note, we construct explicit examples of algebraic space filling curves in three dimensional projective space, in particular the ones with minimum degree.
Submitted 4 September, 2023;
originally announced October 2023.
-
Hyperdimensional Computing as a Rescue for Efficient Privacy-Preserving Machine Learning-as-a-Service
Authors:
Jaewoo Park,
Chenghao Quan,
Hyungon Moon,
Jongeun Lee
Abstract:
Machine learning models are often provisioned as a cloud-based service where the clients send their data to the service provider to obtain the result. This setting is commonplace due to the high value of the models, but it requires the clients to forfeit the privacy that the query data may contain. Homomorphic encryption (HE) is a promising technique to address this adversity. With HE, the service provider can take encrypted data as a query and run the model without decrypting it. The result remains encrypted, and only the client can decrypt it. All these benefits come at a computational cost because HE turns simple floating-point arithmetic into the computation between long (degree over 1024) polynomials. Previous work has proposed to tailor deep neural networks for efficient computation over encrypted data, but the already high computational cost is again amplified by HE, hindering performance improvement. In this paper we show hyperdimensional computing can be a rescue for privacy-preserving machine learning over encrypted data. We find that the advantage of hyperdimensional computing in performance is amplified when working with HE. This observation led us to design HE-HDC, a machine-learning inference system that uses hyperdimensional computing with HE. We carefully structure the machine learning service so that the server will perform only the HE-friendly computation. Moreover, we adapt the computation and HE parameters to expedite computation while preserving accuracy and security. Our experimental result based on real measurements shows that HE-HDC outperforms existing systems by 26~3000 times with comparable classification accuracy.
Submitted 16 August, 2023;
originally announced October 2023.
-
Derived categories of symmetric products and moduli spaces of vector bundles on a curve
Authors:
Kyoung-Seog Lee,
Han-Bom Moon
Abstract:
We show that the derived categories of symmetric products of a curve are embedded into the derived categories of the moduli spaces of vector bundles of large ranks on the curve. It supports a prediction of the existence of a semiorthogonal decomposition of the derived category of the moduli space, expected by a motivic computation. As an application, we show that all Jacobian varieties, symmetric products of curves and all principally polarized abelian varieties of dimension at most three, are Fano visitors. We also obtain similar results for motives.
Submitted 27 September, 2023;
originally announced September 2023.
-
Pointing-and-Acquisition for Optical Wireless in 6G: From Algorithms to Performance Evaluation
Authors:
Hyung-Joo Moon,
Chan-Byoung Chae,
Kai-Kit Wong,
Mohamed-Slim Alouini
Abstract:
The increasing demand for wireless communication services has led to the development of non-terrestrial networks, which enables various air and space applications. Free-space optical (FSO) communication is considered one of the essential technologies capable of connecting terrestrial and non-terrestrial layers. In this article, we analyze considerations and challenges for FSO communications between gateways and aircraft from a pointing-and-acquisition perspective. Based on the analysis, we first develop a baseline method that utilizes conventional devices and mechanisms. Furthermore, we propose an algorithm that combines angle of arrival (AoA) estimation through supplementary radio frequency (RF) links and beam tracking using retroreflectors. Through extensive simulations, we demonstrate that the proposed method offers superior performance in terms of link acquisition and maintenance.
Submitted 19 September, 2023;
originally announced September 2023.
-
The acrylic vessel for JSNS$^{2}$-II neutrino target
Authors:
C. D. Shin,
S. Ajimura,
M. K. Cheoun,
J. H. Choi,
J. Y. Choi,
T. Dodo,
J. Goh,
K. Haga,
M. Harada,
S. Hasegawa,
T. Hiraiwa,
W. Hwang,
T. Iida,
H. I. Jang,
J. S. Jang,
H. Jeon,
S. Jeon,
K. K. Joo,
D. E. Jung,
S. K. Kang,
Y. Kasugai,
T. Kawasaki,
E. J. Kim,
J. Y. Kim,
S. B. Kim
, et al. (35 additional authors not shown)
Abstract:
JSNS$^{2}$ (J-PARC Sterile Neutrino Search at J-PARC Spallation Neutron Source) is an experiment designed to search for sterile neutrinos. The experiment is currently in its second phase, named JSNS$^{2}$-II, with two detectors at near and far locations from the neutrino source. One of the key components of the experiment is an acrylic vessel, which is used as the target volume for the detection of the anti-neutrinos. The specifications, design, and measured properties of the acrylic vessel are described.
Submitted 11 December, 2023; v1 submitted 4 September, 2023;
originally announced September 2023.
-
Publicly Shareable Clinical Large Language Model Built on Synthetic Clinical Notes
Authors:
Sunjun Kweon,
Junu Kim,
Jiyoun Kim,
Sujeong Im,
Eunbyeol Cho,
Seongsu Bae,
Jungwoo Oh,
Gyubok Lee,
Jong Hak Moon,
Seng Chan You,
Seungjin Baek,
Chang Hoon Han,
Yoon Bin Jung,
Yohan Jo,
Edward Choi
Abstract:
The development of large language models tailored for handling patients' clinical notes is often hindered by the limited accessibility and usability of these notes due to strict privacy regulations. To address these challenges, we first create synthetic large-scale clinical notes using publicly available case reports extracted from biomedical literature. We then use these synthetic notes to train our specialized clinical large language model, Asclepius. While Asclepius is trained on synthetic data, we assess its potential performance in real-world applications by evaluating it using real clinical notes. We benchmark Asclepius against several other large language models, including GPT-3.5-turbo and other open-source alternatives. To further validate our approach using synthetic notes, we also compare Asclepius with its variants trained on real clinical notes. Our findings convincingly demonstrate that synthetic clinical notes can serve as viable substitutes for real ones when constructing high-performing clinical language models. This conclusion is supported by detailed evaluations conducted by both GPT-4 and medical professionals. All resources including weights, codes, and data used in the development of Asclepius are made publicly accessible for future research. (https://github.com/starmpcc/Asclepius)
Submitted 29 July, 2024; v1 submitted 1 September, 2023;
originally announced September 2023.
-
Distribution of the number of zeros of polynomials over a finite field
Authors:
Ritik Jain,
Han-Bom Moon,
Peter Wu
Abstract:
We study the probability distribution of the number of zeros of multivariable polynomials with bounded degree over a finite field. We find the probability generating function for each set of bounded degree polynomials. In particular, in the single variable case, we show that as the degree of the polynomials and the order of the field simultaneously approach infinity, the distribution converges to a Poisson distribution.
Submitted 28 August, 2023;
originally announced August 2023.
-
Computation of GIT quotients of semisimple groups
Authors:
Patricio Gallardo,
Jesus Martinez-Garcia,
Han-Bom Moon,
David Swinarski
Abstract:
We describe three algorithms to determine the stable, semistable, and torus-polystable loci of the GIT quotient of a projective variety by a reductive group. The algorithms are efficient when the group is semisimple. By using an implementation of our algorithms for simple groups, we provide several applications to the moduli theory of algebraic varieties, including the K-moduli of algebraic varieties, the moduli of algebraic curves and the Mukai models of the moduli space of curves for low genus. We also discuss a number of potential improvements and some natural open problems arising from this work.
Submitted 15 August, 2023;
originally announced August 2023.
-
Study on the accidental background of the JSNS$^2$ experiment
Authors:
D. H. Lee,
S. Ajimura,
M. K. Cheoun,
J. H. Choi,
J. Y. Choi,
T. Dodo,
J. Goh,
K. Haga,
M. Harada,
S. Hasegawa,
T. Hiraiwa,
W. Hwang,
H. I. Jang,
J. S. Jang,
H. Jeon,
S. Jeon,
K. K. Joo,
D. E. Jung,
S. K. Kang,
Y. Kasugai,
T. Kawasaki,
E. J. Kim,
J. Y. Kim,
S. B. Kim,
W. Kim
, et al. (33 additional authors not shown)
Abstract:
JSNS$^2$ (J-PARC Sterile Neutrino Search at J-PARC Spallation Neutron Source) is an experiment that searches for sterile neutrinos via the observation of $\bar{ν}_{μ} \to \bar{ν}_{e}$ appearance oscillations using muon decay-at-rest neutrinos. Data taking for JSNS$^2$ has been carried out since 2021. In this manuscript, a study of the accidental background is presented. The rate of the accidental background is $(9.29 \pm 0.39) \times 10^{-8}$ per spill at 0.75 MW beam power, which is comparable to the number of signal events being searched for.
Submitted 22 April, 2024; v1 submitted 4 August, 2023;
originally announced August 2023.
-
Unsupervised Accuracy Estimation of Deep Visual Models using Domain-Adaptive Adversarial Perturbation without Source Samples
Authors:
JoonHo Lee,
Jae Oh Woo,
Hankyu Moon,
Kwonho Lee
Abstract:
Deploying deep visual models can lead to performance drops due to the discrepancies between source and target distributions. Several approaches leverage labeled source data to estimate target domain accuracy, but accessing labeled source data is often prohibitively difficult due to data confidentiality or resource limitations on serving devices. Our work proposes a new framework to estimate model accuracy on unlabeled target data without access to source data. We investigate the feasibility of using pseudo-labels for accuracy estimation and evolve this idea into adopting recent advances in source-free domain adaptation algorithms. Our approach measures the disagreement rate between the source hypothesis and the target pseudo-labeling function, adapted from the source hypothesis. We mitigate the impact of erroneous pseudo-labels that may arise due to a high ideal joint hypothesis risk by employing adaptive adversarial perturbation on the input of the target model. Our proposed source-free framework effectively addresses the challenging distribution shift scenarios and outperforms existing methods requiring source data and labels for training.
Submitted 19 July, 2023;
originally announced July 2023.
-
A Demand-Driven Perspective on Generative Audio AI
Authors:
Sangshin Oh,
Minsung Kang,
Hyeongi Moon,
Keunwoo Choi,
Ben Sangbae Chon
Abstract:
To achieve successful deployment of AI research, it is crucial to understand the demands of the industry. In this paper, we present the results of a survey conducted with professional audio engineers, in order to determine research priorities and define various research tasks. We also summarize the current challenges in audio quality and controllability based on the survey. Our analysis emphasizes that the availability of datasets is currently the main bottleneck for achieving high-quality audio generation. Finally, we suggest potential solutions for some revealed issues with empirical evidence.
Submitted 9 July, 2023;
originally announced July 2023.
-
Data-Driven Approach for Formality-Sensitive Machine Translation: Language-Specific Handling and Synthetic Data Generation
Authors:
Seugnjun Lee,
Hyeonseok Moon,
Chanjun Park,
Heuiseok Lim
Abstract:
In this paper, we introduce a data-driven approach for Formality-Sensitive Machine Translation (FSMT) that caters to the unique linguistic properties of four target languages. Our methodology centers on two core strategies: 1) language-specific data handling, and 2) synthetic data generation using large-scale language models and empirical prompt engineering. This approach demonstrates a considerable improvement over the baseline, highlighting the effectiveness of data-centric techniques. Our prompt engineering strategy further improves performance by producing superior synthetic translation examples.
Submitted 27 June, 2023; v1 submitted 26 June, 2023;
originally announced June 2023.
-
Synthetic Alone: Exploring the Dark Side of Synthetic Data for Grammatical Error Correction
Authors:
Chanjun Park,
Seonmin Koo,
Seolhwa Lee,
Jaehyung Seo,
Sugyeong Eo,
Hyeonseok Moon,
Heuiseok Lim
Abstract:
The data-centric AI approach aims to enhance model performance without modifying the model and has been shown to impact model performance positively. While recent attention has been given to data-centric AI based on synthetic data, due to its potential for performance improvement, data-centric AI has long been validated exclusively using real-world data and publicly available benchmark datasets. In this respect, data-centric AI still depends heavily on real-world data, and the verification of models using synthetic data has not yet been thoroughly carried out. Given these challenges, we ask the question: Does data quality control (noise injection and balanced data), a data-centric AI methodology acclaimed to have a positive impact, exhibit the same positive impact in models trained solely with synthetic data? To address this question, we conducted comparative analyses between models trained on synthetic and real-world data on the grammatical error correction (GEC) task. Our experimental results reveal that the data quality control method has a positive impact on models trained with real-world data, as previously reported in existing studies, while a negative impact is observed in models trained solely on synthetic data.
Submitted 25 June, 2023;
originally announced June 2023.
-
FALL-E: A Foley Sound Synthesis Model and Strategies
Authors:
Minsung Kang,
Sangshin Oh,
Hyeongi Moon,
Kyungyun Lee,
Ben Sangbae Chon
Abstract:
This paper introduces FALL-E, a foley synthesis system, and its training/inference strategies. The FALL-E model employs a cascaded approach comprising low-resolution spectrogram generation, spectrogram super-resolution, and a vocoder. We trained every sound-related model from scratch using our extensive datasets, and utilized a pre-trained language model. We conditioned the model with dataset-specific texts, enabling it to learn sound quality and recording environment based on text input. Moreover, we leveraged external language models to improve text descriptions of our datasets and performed prompt engineering for quality, coherence, and diversity. FALL-E was evaluated by an objective measure as well as listening tests in the DCASE 2023 challenge Task 7. The submission achieved second place on average, with the best score for diversity, second place for audio quality, and third place for class fitness.
Submitted 10 August, 2023; v1 submitted 16 June, 2023;
originally announced June 2023.
-
Light-field-driven non-Ohmic current and Keldysh crossover in a Weyl semimetal
Authors:
R. Ikeda,
H. Watanabe,
J. H. Moon,
M. H. Jung,
K. Takasan,
S. Kimura
Abstract:
In recent years, coherent electrons driven by light fields have attracted significant interest in exploring novel material phases and functionalities. However, observing coherent light-field-driven electron dynamics in solids is challenging because the electrons are scattered within several tens of femtoseconds in ordinary materials, and the coherence between light and electrons is disturbed. In Weyl semimetals, by contrast, the electron scattering time becomes relatively long (several hundred femtoseconds to several picoseconds), owing to the suppression of the back-scattering process. This study presents the light-field-driven dynamics induced by a THz pulse in the Weyl semimetal Co$_3$Sn$_2$S$_2$, where the intense THz pulse of a monocycle electric field nonlinearly generates direct current (DC) via coherent acceleration without scattering and non-adiabatic excitation (Landau-Zener transition). In other words, the non-Ohmic current appears in the Weyl semimetal through a combination of the long relaxation time and an intense THz pulse. This nonlinear DC generation also demonstrates a Keldysh crossover from a photon picture to a light-field picture with increasing electric field strength.
Submitted 15 June, 2023;
originally announced June 2023.
-
Towards Diverse and Effective Question-Answer Pair Generation from Children Storybooks
Authors:
Sugyeong Eo,
Hyeonseok Moon,
Jinsung Kim,
Yuna Hur,
Jeongwook Kim,
Songeun Lee,
Changwoo Chun,
Sungsoo Park,
Heuiseok Lim
Abstract:
Recent advances in QA pair generation (QAG) have raised interest in applying this technique to the educational field. However, the diversity of QA types remains a challenge despite its contributions to comprehensive learning and assessment of children. In this paper, we propose a QAG framework that enhances QA type diversity by producing different interrogative sentences and implicit/explicit answers. Our framework comprises a QFS-based answer generator, an iterative QA generator, and a relevancy-aware ranker. The two generators aim to expand the number of candidates while covering various types. The ranker trained on the in-context negative samples clarifies the top-N outputs based on the ranking score. Extensive evaluations and detailed analyses demonstrate that our approach outperforms previous state-of-the-art results by significant margins, achieving improved diversity and quality. Our task-oriented processes are consistent with real-world demand, which highlights our system's high applicability.
Submitted 11 June, 2023;
originally announced June 2023.
-
Addressing Negative Transfer in Diffusion Models
Authors:
Hyojun Go,
JinYoung Kim,
Yunsung Lee,
Seunghyun Lee,
Shinhyeok Oh,
Hyeongdon Moon,
Seungtaek Choi
Abstract:
Diffusion-based generative models have achieved remarkable success in various domains. They train a shared model on denoising tasks that encompass different noise levels simultaneously, representing a form of multi-task learning (MTL). However, analyzing and improving diffusion models from an MTL perspective remains under-explored. In particular, MTL can sometimes lead to the well-known phenomenon of negative transfer, which results in the performance degradation of certain tasks due to conflicts between tasks. In this paper, we first aim to analyze diffusion training from an MTL standpoint, presenting two key observations: (O1) the task affinity between denoising tasks diminishes as the gap between noise levels widens, and (O2) negative transfer can arise even in diffusion training. Building upon these observations, we aim to enhance diffusion training by mitigating negative transfer. To achieve this, we propose leveraging existing MTL methods, but the huge number of denoising tasks makes it computationally expensive to calculate the necessary per-task loss or gradient. To address this challenge, we propose clustering the denoising tasks into small task clusters and applying MTL methods to them. Specifically, based on (O2), we employ interval clustering to enforce temporal proximity among denoising tasks within clusters. We show that interval clustering can be solved using dynamic programming, utilizing signal-to-noise ratio, timestep, and task affinity for clustering objectives. Through this, our approach addresses the issue of negative transfer in diffusion models by allowing for efficient computation of MTL methods. We validate the efficacy of the proposed clustering and its integration with MTL methods through various experiments, demonstrating 1) improved generation quality and 2) faster training convergence of diffusion models.
Submitted 30 December, 2023; v1 submitted 1 June, 2023;
originally announced June 2023.
-
Cross Encoding as Augmentation: Towards Effective Educational Text Classification
Authors:
Hyun Seung Lee,
Seungtaek Choi,
Yunsung Lee,
Hyeongdon Moon,
Shinhyeok Oh,
Myeongho Jeong,
Hyojun Go,
Christian Wallraven
Abstract:
Text classification in education, usually called auto-tagging, is the automated process of assigning relevant tags to educational content, such as questions and textbooks. However, auto-tagging suffers from a data scarcity problem, which stems from two major challenges: 1) it possesses a large tag space and 2) it is multi-label. Though a retrieval approach is reportedly good at low-resource scenarios, there have been fewer efforts to directly address the data scarcity problem. To mitigate these issues, here we propose a novel retrieval approach CEAA that provides effective learning in educational text classification. Our main contributions are as follows: 1) we leverage transfer learning from question-answering datasets, and 2) we propose a simple but effective data augmentation method introducing cross-encoder style texts to a bi-encoder architecture for more efficient inference. An extensive set of experiments shows that our proposed method is effective in multi-label scenarios and low-resource tags compared to state-of-the-art models.
Submitted 30 May, 2023; v1 submitted 30 May, 2023;
originally announced May 2023.
-
Evaluation of Question Generation Needs More References
Authors:
Shinhyeok Oh,
Hyojun Go,
Hyeongdon Moon,
Yunsung Lee,
Myeongho Jeong,
Hyun Seung Lee,
Seungtaek Choi
Abstract:
Question generation (QG) is the task of generating a valid and fluent question based on a given context and the target answer. Depending on their purposes, instructors can ask questions about different concepts even given the same context, and even the same concept can be written in different ways. However, the evaluation for QG usually depends on single reference-based similarity metrics, such as n-gram-based or learned metrics, which are not sufficient to fully evaluate the potential of QG methods. To this end, we propose to paraphrase the reference question for a more robust QG evaluation. Using large language models such as GPT-3, we create semantically and syntactically diverse questions, then adopt the simple aggregation of the popular evaluation metrics as the final scores. Through our experiments, we found that using multiple (pseudo) references is more effective for QG evaluation, showing a higher correlation with human evaluations than evaluation with a single reference.
Submitted 26 May, 2023;
originally announced May 2023.
-
Search for $CP$ violation in $D^{+}_{(s)}\rightarrow K^{+}K^{0}_{S}h^{+}h^{-}$ $(h=K,π)$ decays and observation of the Cabibbo-suppressed decay $D^{+}_{s}\rightarrow K^{+}K^{-}K^{0}_{S}π^{+}$
Authors:
Belle Collaboration,
H. K. Moon,
E. Won,
I. Adachi,
H. Aihara,
D. M. Asner,
H. Atmacan,
V. Aulchenko,
T. Aushev,
R. Ayad,
V. Babu,
S. Bahinipati,
Sw. Banerjee,
M. Bauer,
P. Behera,
K. Belous,
J. Bennett,
M. Bessner,
V. Bhardwaj,
B. Bhuyan,
D. Biswas,
D. Bodrov,
J. Borah,
A. Bozek,
M. Bračko
, et al. (183 additional authors not shown)
Abstract:
We search for $CP$ violation by measuring a $T$-odd asymmetry in the Cabibbo-suppressed $D^{+}\rightarrow K^{+}K^{0}_{S}π^{+}π^{-}$ decay, and in the Cabibbo-favored $D^{+}_{s}\rightarrow K^{+}K^{0}_{S}π^{+}π^{-}$ and $D^{+}\rightarrow K^{+}K^{-}K^{0}_{S}π^{+}$ decays. We use 980 ${\rm fb}^{-1}$ of data collected by the Belle detector running at the KEKB asymmetric-energy $e^{+}e^{-}$ collider. The $CP$-violating $T$-odd parameter ${a}^{T\text{-}\rm{odd}}_{CP}$ is measured to be ${a}^{T\text{-}\rm{odd}}_{CP}(D^{+}\rightarrow K^{+}K^{0}_{S}π^{+}π^{-})=(0.34\pm0.87\pm0.32)\%$, ${a}^{T\text{-}\rm{odd}}_{CP}(D^{+}_{s}\rightarrow K^{+}K^{0}_{S}π^{+}π^{-})=(-0.46\pm0.63\pm0.38)\%$, and ${a}^{T\text{-}\rm{odd}}_{CP}(D^{+}\rightarrow K^{+}K^{-}K^{0}_{S}π^{+})=(-3.34\pm2.66\pm0.35)\%$, where the first uncertainty is statistical and the second is systematic. We also report the first observation of the Cabibbo-suppressed decay $D^{+}_{s}\rightarrow K^{+}K^{-}K^{0}_{S}π^{+}$. The branching fraction is measured relative to that of the analogous Cabibbo-favored decay: $B(D^{+}_{s}\rightarrow K^{+}K^{-}K^{0}_{S}π^{+}) / B(D^{+}_{s}\rightarrow K^{+}K^{0}_{S}π^{+}π^{-}) = (1.36\pm 0.15\pm 0.04)\%$.
Submitted 22 November, 2023; v1 submitted 18 May, 2023;
originally announced May 2023.
-
Randomized Smoothing with Masked Inference for Adversarially Robust Text Classifications
Authors:
Han Cheol Moon,
Shafiq Joty,
Ruochen Zhao,
Megh Thakkar,
Xu Chi
Abstract:
Large-scale pre-trained language models have shown outstanding performance in a variety of NLP tasks. However, they are also known to be significantly brittle against specifically crafted adversarial examples, leading to increasing interest in probing the adversarial robustness of NLP systems. We introduce RSMI, a novel two-stage framework that combines randomized smoothing (RS) with masked inference (MI) to improve the adversarial robustness of NLP systems. RS transforms a classifier into a smoothed classifier to obtain robust representations, whereas MI forces a model to exploit the surrounding context of a masked token in an input sequence. RSMI improves adversarial robustness by 2 to 3 times over existing state-of-the-art methods on benchmark datasets. We also perform in-depth qualitative analysis to validate the effectiveness of the different stages of RSMI and probe the impact of its components through extensive ablations. By empirically proving the stability of RSMI, we put it forward as a practical method to robustly train large-scale NLP models. Our code and datasets are available at https://github.com/Han8931/rsmi_nlp
Submitted 10 May, 2023;
originally announced May 2023.