-
One-Time Soft Alignment Enables Resilient Learning without Weight Transport
Authors:
Jeonghwan Cheon,
Jaehyuk Bae,
Se-Bum Paik
Abstract:
Backpropagation is the cornerstone of deep learning, but its reliance on symmetric weight transport and global synchronization makes it computationally expensive and biologically implausible. Feedback alignment offers a promising alternative by approximating error gradients through fixed random feedback, thereby avoiding symmetric weight transport. However, this approach often struggles with poor learning performance and instability, especially in deep networks. Here, we show that a one-time soft alignment between forward and feedback weights at initialization enables deep networks to achieve performance comparable to backpropagation, without requiring weight transport during learning. This simple initialization condition guides stable error minimization in the loss landscape, improving network trainability. Spectral analyses further reveal that initial alignment promotes smoother gradient flow and convergence to flatter minima, resulting in better generalization and robustness. Notably, we also find that allowing moderate deviations from exact weight symmetry can improve adversarial robustness compared to standard backpropagation. These findings demonstrate that a simple initialization strategy can enable effective learning in deep networks in a biologically plausible and resource-efficient manner.
Submitted 27 May, 2025;
originally announced May 2025.
-
Deepfake-Eval-2024: A Multi-Modal In-the-Wild Benchmark of Deepfakes Circulated in 2024
Authors:
Nuria Alina Chandra,
Ryan Murtfeldt,
Lin Qiu,
Arnab Karmakar,
Hannah Lee,
Emmanuel Tanumihardja,
Kevin Farhat,
Ben Caffee,
Sejin Paik,
Changyeon Lee,
Jongwook Choi,
Aerin Kim,
Oren Etzioni
Abstract:
In the age of increasingly realistic generative AI, robust deepfake detection is essential for mitigating fraud and disinformation. While many deepfake detectors report high accuracy on academic datasets, we show that these academic benchmarks are out of date and not representative of real-world deepfakes. We introduce Deepfake-Eval-2024, a new deepfake detection benchmark consisting of in-the-wild deepfakes collected from social media and deepfake detection platform users in 2024. Deepfake-Eval-2024 consists of 45 hours of videos, 56.5 hours of audio, and 1,975 images, encompassing the latest manipulation technologies. The benchmark contains diverse media content from 88 different websites in 52 different languages. We find that the performance of open-source state-of-the-art deepfake detection models drops precipitously when evaluated on Deepfake-Eval-2024, with AUC decreasing by 50% for video, 48% for audio, and 45% for image models compared to previous benchmarks. We also evaluate commercial deepfake detection models and models finetuned on Deepfake-Eval-2024, and find that they have superior performance to off-the-shelf open-source models, but do not yet reach the accuracy of deepfake forensic analysts. The dataset is available at https://github.com/nuriachandra/Deepfake-Eval-2024.
Submitted 27 May, 2025; v1 submitted 4 March, 2025;
originally announced March 2025.
-
DebiasPI: Inference-time Debiasing by Prompt Iteration of a Text-to-Image Generative Model
Authors:
Sarah Bonna,
Yu-Cheng Huang,
Ekaterina Novozhilova,
Sejin Paik,
Zhengyang Shan,
Michelle Yilin Feng,
Ge Gao,
Yonish Tayal,
Rushil Kulkarni,
Jialin Yu,
Nupur Divekar,
Deepti Ghadiyaram,
Derry Wijaya,
Margrit Betke
Abstract:
Ethical intervention prompting has emerged as a tool to counter demographic biases of text-to-image generative AI models. Existing solutions either require retraining the model or struggle to generate images that reflect desired distributions on gender and race. We propose an inference-time process called DebiasPI for Debiasing-by-Prompt-Iteration that provides prompt intervention by enabling the user to control the distributions of individuals' demographic attributes in image generation. DebiasPI keeps track of which attributes have been generated either by probing the internal state of the model or by using external attribute classifiers. Its control loop guides the text-to-image model to select not yet sufficiently represented attributes. With DebiasPI, we were able to create images with equal representations of race and gender that visualize challenging concepts of news headlines. We also experimented with the attributes age, body type, profession, and skin tone, and measured how attributes change when our intervention prompt targets the distribution of an unrelated attribute type. We found, for example, that if the text-to-image model is asked to balance racial representation, gender representation improves but the skin tone becomes less diverse. Attempts to cover a wide range of skin colors with various intervention prompts showed that the model struggles to generate the palest skin tones. We conducted various ablation studies, in which we removed DebiasPI's attribute control, that reveal the model's propensity to generate young, male characters. It sometimes visualized career success by generating two-panel images with a pre-success dark-skinned person becoming light-skinned with success, or switching gender from pre-success female to post-success male, thus further motivating ethical intervention prompting with DebiasPI.
Submitted 28 January, 2025;
originally announced January 2025.
-
Pretraining with random noise for uncertainty calibration
Authors:
Jeonghwan Cheon,
Se-Bum Paik
Abstract:
Uncertainty calibration is crucial for various machine learning applications, yet it remains challenging. Many models exhibit hallucinations - confident yet inaccurate responses - due to miscalibrated confidence. Here, we show that the common practice of random initialization in deep learning, often considered a standard technique, is an underlying cause of this miscalibration, leading to excessively high confidence in untrained networks. Our method, inspired by developmental neuroscience, addresses this issue by simply pretraining networks with random noise and labels, reducing overconfidence and bringing initial confidence levels closer to chance. This ensures optimal calibration, aligning confidence with accuracy during subsequent data training, without the need for additional pre- or post-processing. Pre-calibrated networks excel at identifying "unknown data," showing low confidence for out-of-distribution inputs, thereby resolving confidence miscalibration.
Submitted 27 March, 2025; v1 submitted 23 December, 2024;
originally announced December 2024.
-
Song Form-aware Full-Song Text-to-Lyrics Generation with Multi-Level Granularity Syllable Count Control
Authors:
Yunkee Chae,
Eunsik Shin,
Hwang Suntae,
Seungryeol Paik,
Kyogu Lee
Abstract:
Lyrics generation presents unique challenges, particularly in achieving precise syllable control while adhering to song form structures such as verses and choruses. Conventional line-by-line approaches often lead to unnatural phrasing, underscoring the need for more granular syllable management. We propose a framework for lyrics generation that enables multi-level syllable control at the word, phrase, line, and paragraph levels, aware of song form. Our approach generates complete lyrics conditioned on input text and song form, ensuring alignment with specified syllable constraints. Generated lyrics samples are available at: https://tinyurl.com/lyrics9999
Submitted 20 November, 2024;
originally announced November 2024.
-
Enhancing Emotion Prediction in News Headlines: Insights from ChatGPT and Seq2Seq Models for Free-Text Generation
Authors:
Ge Gao,
Jongin Kim,
Sejin Paik,
Ekaterina Novozhilova,
Yi Liu,
Sarah T. Bonna,
Margrit Betke,
Derry Tanti Wijaya
Abstract:
Predicting emotions elicited by news headlines can be challenging as the task is largely influenced by the varying nature of people's interpretations and backgrounds. Previous works have explored classifying discrete emotions directly from news headlines. We provide a different approach to tackling this problem by utilizing people's free-text explanations of how they feel after reading a news headline. Using the dataset BU-NEmo+ (Gao et al., 2022), we found that for emotion classification, the free-text explanations have a strong correlation with the dominant emotion elicited by the headlines. The free-text explanations also contain more sentimental context than the news headlines alone and can serve as a better input to emotion classification models. Therefore, in this work we explored generating emotion explanations from headlines by training a sequence-to-sequence transformer model and by using a pretrained large language model, ChatGPT (GPT-4). We then used the generated emotion explanations for emotion classification. In addition, we also experimented with training the pretrained T5 model for the intermediate task of explanation generation before fine-tuning it for emotion classification. Using McNemar's significance test, methods that incorporate GPT-generated free-text emotion explanations demonstrated significant improvement (P-value < 0.05) in emotion classification from headlines, compared to methods that only use headlines. This underscores the value of using intermediate free-text explanations for emotion prediction tasks with headlines.
Submitted 14 July, 2024;
originally announced July 2024.
-
Neuromimetic metaplasticity for adaptive continual learning
Authors:
Suhee Cho,
Hyeonsu Lee,
Seungdae Baek,
Se-Bum Paik
Abstract:
Conventional intelligent systems based on deep neural network (DNN) models encounter challenges in achieving human-like continual learning due to catastrophic forgetting. Here, we propose a metaplasticity model inspired by human working memory, enabling DNNs to perform catastrophic forgetting-free continual learning without any pre- or post-processing. A key aspect of our approach involves implementing distinct types of synapses from stable to flexible, and randomly intermixing them to train synaptic connections with different degrees of flexibility. This strategy allowed the network to successfully learn a continuous stream of information, even under unexpected changes in input length. The model achieved a balanced tradeoff between memory capacity and performance without requiring additional training or structural modifications, dynamically allocating memory resources to retain both old and new information. Furthermore, the model demonstrated robustness against data poisoning attacks by selectively filtering out erroneous memories, leveraging the Hebb repetition effect to reinforce the retention of significant data.
Submitted 9 July, 2024;
originally announced July 2024.
-
Pretraining with Random Noise for Fast and Robust Learning without Weight Transport
Authors:
Jeonghwan Cheon,
Sang Wan Lee,
Se-Bum Paik
Abstract:
The brain prepares for learning even before interacting with the environment, by refining and optimizing its structures through spontaneous neural activity that resembles random noise. However, the mechanism of such a process has yet to be thoroughly understood, and it is unclear whether this process can benefit machine learning algorithms. Here, we study this issue using a neural network with a feedback alignment algorithm, demonstrating that pretraining neural networks with random noise increases the learning efficiency as well as generalization abilities without weight transport. First, we found that random noise training modifies forward weights to match backward synaptic feedback, which is necessary for teaching errors by feedback alignment. As a result, a network with pre-aligned weights learns notably faster than a network without random noise training, even reaching a convergence speed comparable to that of a backpropagation algorithm. Sequential training with both random noise and data brings weights closer to synaptic feedback than training solely with data, enabling more precise credit assignment and faster learning. We also found that each readout probability approaches the chance level and that the effective dimensionality of weights decreases in a network pretrained with random noise. This pre-regularization allows the network to learn simple solutions of a low rank, reducing the generalization loss during subsequent training. This also enables the network to generalize robustly to a novel, out-of-distribution dataset. Lastly, we confirmed that random noise pretraining reduces the amount of meta-loss, enhancing the network's ability to adapt to various tasks. Overall, our results suggest that random noise training with feedback alignment offers a straightforward yet effective method of pretraining that facilitates quick and reliable learning without weight transport.
Submitted 9 May, 2025; v1 submitted 26 May, 2024;
originally announced May 2024.
-
Transfer Learning to Detect COVID-19 Coughs with Incremental Addition of Patient Coughs to Healthy People's Cough Detection Models
Authors:
Sudip Vhaduri,
Seungyeon Paik,
Jessica E Huber
Abstract:
Millions of people have died worldwide from COVID-19. In addition to its high death toll, COVID-19 has led to unbearable suffering for individuals and a huge global burden to the healthcare sector. Therefore, researchers have been trying to develop tools to detect symptoms of this human-transmissible disease remotely to control its rapid spread. Coughing is one of the common symptoms that researchers have been trying to detect objectively from smartphone microphone-sensing. While most of the approaches to detect and track cough symptoms rely on machine learning models developed from a large amount of patient data, this is not possible at the early stage of an outbreak. In this work, we present an incremental transfer learning approach that leverages the relationship between healthy people's coughs and COVID-19 patients' coughs to detect COVID-19 coughs with reasonable accuracy using a pre-trained healthy cough detection model and a relatively small set of patient coughs, reducing the need for a large patient dataset to train the model. This type of model can be a game changer in detecting the onset of a novel respiratory virus.
Submitted 11 November, 2023;
originally announced November 2023.
-
Integral Probability Metrics Meet Neural Networks: The Radon-Kolmogorov-Smirnov Test
Authors:
Seunghoon Paik,
Michael Celentano,
Alden Green,
Ryan J. Tibshirani
Abstract:
Integral probability metrics (IPMs) constitute a general class of nonparametric two-sample tests that are based on maximizing the mean difference between samples from one distribution $P$ versus another $Q$, over all choices of data transformations $f$ living in some function space $\mathcal{F}$. Inspired by recent work that connects what are known as functions of $\textit{Radon bounded variation}$ (RBV) and neural networks (Parhi and Nowak, 2021, 2023), we study the IPM defined by taking $\mathcal{F}$ to be the unit ball in the RBV space of a given smoothness degree $k \geq 0$. This test, which we refer to as the $\textit{Radon-Kolmogorov-Smirnov}$ (RKS) test, can be viewed as a generalization of the well-known and classical Kolmogorov-Smirnov (KS) test to multiple dimensions and higher orders of smoothness. It is also intimately connected to neural networks: we prove that the witness in the RKS test -- the function $f$ achieving the maximum mean difference -- is always a ridge spline of degree $k$, i.e., a single neuron in a neural network. We can thus leverage the power of modern neural network optimization toolkits to (approximately) maximize the criterion that underlies the RKS test. We prove that the RKS test has asymptotically full power at distinguishing any distinct pair $P \not= Q$ of distributions, derive its asymptotic null distribution, and carry out experiments to elucidate the strengths and weaknesses of the RKS test versus the more traditional kernel MMD test.
Submitted 12 January, 2025; v1 submitted 5 September, 2023;
originally announced September 2023.
-
Blind Estimation of Audio Processing Graph
Authors:
Sungho Lee,
Jaehyun Park,
Seungryeol Paik,
Kyogu Lee
Abstract:
Musicians and audio engineers sculpt and transform their sounds by connecting multiple processors, forming an audio processing graph. However, most deep-learning methods overlook this real-world practice and assume fixed graph settings. To bridge this gap, we develop a system that reconstructs the entire graph from a given reference audio. We first generate a realistic graph-reference pair dataset and train a simple blind estimation system composed of a convolutional reference encoder and a transformer-based graph decoder. We apply our model to singing voice effects and drum mixing estimation tasks. Evaluation results show that our method can reconstruct complex signal routings, including multi-band processing and sidechaining.
Submitted 7 May, 2023; v1 submitted 15 March, 2023;
originally announced March 2023.
-
On Measuring Social Biases in Prompt-Based Multi-Task Learning
Authors:
Afra Feyza Akyürek,
Sejin Paik,
Muhammed Yusuf Kocyigit,
Seda Akbiyik,
Şerife Leman Runyun,
Derry Wijaya
Abstract:
Large language models trained on a mixture of NLP tasks that are converted into a text-to-text format using prompts can generalize into novel forms of language and handle novel tasks. A large body of work within prompt engineering attempts to understand the effects of input forms and prompts in achieving superior performance. We consider an alternative measure and inquire whether the way in which an input is encoded affects social biases promoted in outputs. In this paper, we study T0, a large-scale multi-task text-to-text language model trained using prompt-based learning. We consider two different forms of semantically equivalent inputs: question-answer format and premise-hypothesis format. We use an existing bias benchmark for the former, BBQ, and create the first bias benchmark in natural language inference, BBNLI, with hand-written hypotheses, while also converting each benchmark into the other form. The results on two benchmarks suggest that given two different formulations of essentially the same input, T0 conspicuously acts more biased in question answering form, which is seen during training, compared to premise-hypothesis form, which is unlike its training examples. Code and data are released under https://github.com/feyzaakyurek/bbnli.
Submitted 23 May, 2022;
originally announced May 2022.
-
Challenges in Measuring Bias via Open-Ended Language Generation
Authors:
Afra Feyza Akyürek,
Muhammed Yusuf Kocyigit,
Sejin Paik,
Derry Wijaya
Abstract:
Researchers have devised numerous ways to quantify social biases vested in pretrained language models. As some language models are capable of generating coherent completions given a set of textual prompts, several prompting datasets have been proposed to measure biases between social groups -- posing language generation as a way of identifying biases. In this opinion paper, we analyze how specific choices of prompt sets, metrics, automatic tools and sampling strategies affect bias results. We find that the practice of measuring biases through text completion is prone to yielding contradictory results under different experiment settings. We additionally provide recommendations for reporting biases in open-ended language generation for a more complete outlook of biases exhibited by a given language model. Code to reproduce the results is released under https://github.com/feyzaakyurek/bias-textgen.
Submitted 23 May, 2022;
originally announced May 2022.
-
Semi-Parametric Contextual Bandits with Graph-Laplacian Regularization
Authors:
Young-Geun Choi,
Gi-Soo Kim,
Seunghoon Paik,
Myunghee Cho Paik
Abstract:
Non-stationarity is ubiquitous in human behavior, and addressing it in contextual bandits is challenging. Several works have addressed the problem by investigating semi-parametric contextual bandits and warned that ignoring non-stationarity could harm performance. Another prevalent human behavior is social interaction, which has become available in the form of a social network or graph structure. As a result, graph-based contextual bandits have received much attention. In this paper, we propose "SemiGraphTS," a novel contextual Thompson-sampling algorithm for a graph-based semi-parametric reward model. Our algorithm is the first to be proposed in this setting. We derive an upper bound of the cumulative regret that can be expressed as a multiple of a factor depending on the graph structure and the order for the semi-parametric model without a graph. We evaluate the proposed and existing algorithms via simulations and a real data example.
Submitted 17 May, 2022;
originally announced May 2022.
-
End-to-end Music Remastering System Using Self-supervised and Adversarial Training
Authors:
Junghyun Koo,
Seungryeol Paik,
Kyogu Lee
Abstract:
Mastering is an essential step in music production, but it is also a challenging task that has to go through the hands of experienced audio engineers, who adjust the tone, space, and volume of a song. Remastering follows the same technical process, in which the context lies in mastering a song for the times. As these tasks have high entry barriers, we aim to lower the barriers by proposing an end-to-end music remastering system that transforms the mastering style of input audio to that of the target. The system is trained in a self-supervised manner, using released pop songs as training data. We also expected the model to generate realistic audio reflecting the reference's mastering style by applying a pre-trained encoder and a projection discriminator. We validate our results with quantitative metrics and a subjective listening test and show that the model generated samples of mastering style similar to the target.
Submitted 17 February, 2022;
originally announced February 2022.
-
Improving Distinction between ASR Errors and Speech Disfluencies with Feature Space Interpolation
Authors:
Seongmin Park,
Dongchan Shin,
Sangyoun Paik,
Subong Choi,
Alena Kazakova,
Jihwa Lee
Abstract:
Fine-tuning pretrained language models (LMs) is a popular approach to automatic speech recognition (ASR) error detection during post-processing. While error detection systems often take advantage of statistical language archetypes captured by LMs, at times the pretrained knowledge can hinder error detection performance. For instance, the presence of speech disfluencies might confuse the post-processing system into tagging disfluent but accurate transcriptions as ASR errors. Such confusion occurs because both error detection and disfluency detection tasks attempt to identify tokens at statistically unlikely positions. This paper proposes a scheme to improve existing LM-based ASR error detection systems, both in terms of detection scores and resilience to such distracting auxiliary tasks. Our approach adopts the popular mixup method in text feature space and can be utilized with any black-box ASR output. To demonstrate the effectiveness of our method, we conduct post-processing experiments with both traditional and end-to-end ASR systems (both for English and Korean languages) with 5 different speech corpora. We find that our method both improves ASR error detection F1 scores and reduces the number of correctly transcribed disfluencies wrongly detected as ASR errors. Finally, we suggest methods to utilize resulting LMs directly in semi-supervised ASR training.
Submitted 3 August, 2021;
originally announced August 2021.
-
Reverb Conversion of Mixed Vocal Tracks Using an End-to-end Convolutional Deep Neural Network
Authors:
Junghyun Koo,
Seungryeol Paik,
Kyogu Lee
Abstract:
Reverb plays a critical role in music production, where it provides listeners with spatial realization, timbre, and texture of the music. Yet, it is challenging even for skilled engineers to reproduce the musical reverb of a reference music track. In response, we propose an end-to-end system capable of switching the musical reverb factor of two different mixed vocal tracks. This method enables us to apply the reverb of the reference track to the source track to which the effect is desired. Further, our model can perform de-reverberation when the reference track is used as a dry vocal source. The proposed model is trained in combination with an adversarial objective, which makes it possible to handle high-resolution audio samples. The perceptual evaluation confirmed that the proposed model can convert the reverb factor with a preference rate of 64.8%. To the best of our knowledge, this is the first attempt to apply deep neural networks to converting music reverb of vocal tracks.
Submitted 2 March, 2021;
originally announced March 2021.
-
High resolution, High contrast optical interface for defect qubits
Authors:
Jong Sung Moon,
Haneul Lee,
Jin Hee Lee,
Woong Bae Jeon,
Dowon Lee,
Junghyun Lee,
Seoyoung Paik,
Sang-Wook Han,
Rolf Reuter,
Andrej Denisenko,
Joerg Wrachtrup,
Sang-Yun Lee,
Je-Hyung Kim
Abstract:
Point defects in crystals provide important building blocks for quantum applications. To initialize, control, and read out their quantum states, an efficient optical interface for addressing defects with photons is required. However, conventional confocal fluorescence microscopy with high refractive index crystals has limited photon collection efficiency and spatial resolution. Here, we demonstrate high resolution, high contrast imaging for defect qubits using microsphere-assisted confocal microscopy. A microsphere provides an excellent optical interface for point defects with a magnified virtual image that improves the spatial resolution up to ~$λ$/5 and the optical signal-to-noise ratio by a factor of four. These features enable individual optical addressing of single photons and single spins of defects that are spatially unresolved in conventional confocal microscopy, with improved signal contrast. The combined optical tweezers show the possibility of positioning or scanning the microspheres for deterministic coupling and wide-field imaging of defects. The approach requires no complicated fabrication or additional optical system, only simple off-the-shelf micro-optics. With these distinctive advantages of the microspheres, our approach can provide an efficient way of imaging and addressing closely spaced defects with higher resolution and sensitivity.
Submitted 13 February, 2021;
originally announced February 2021.
-
A real-space renormalization-group calculation for the quantum Z_2 gauge theory on a square lattice
Authors:
Steve T. Paik
Abstract:
We revisit Fradkin and Raby's real-space renormalization-group method to study the quantum Z_2 gauge theory defined on links forming a two-dimensional square lattice. Following an old suggestion of theirs, a systematic perturbation expansion developed by Hirsch and Mazenko is used to improve the algorithm to second order in an intercell coupling, thereby incorporating the effects of discarded higher energy states. A careful derivation of gauge-invariant effective operators is presented in the Hamiltonian formalism. Renormalization group equations are analyzed near the nontrivial fixed point, reaffirming old work by Hirsch on the dual transverse field Ising model. In addition to recovering Hirsch's previous findings, critical exponents for the scaling of the spatial correlation length and energy gap in the electric free (deconfined) phase are compared. Unfortunately, their agreement is poor. The leading singular behavior of the ground state energy density is examined near the critical point: we compute a critical exponent and estimate a critical amplitude ratio.
Submitted 11 January, 2021; v1 submitted 15 September, 2020;
originally announced September 2020.
-
Graphene quantum dots prevent alpha-synucleinopathy in Parkinson's disease
Authors:
Donghoon Kim,
Je Min Yoo,
Heehong Hwang,
Junghee Lee,
Su Hyun Lee,
Seung Pil Yun,
Myung Jin Park,
MinJun Lee,
Seulah Choi,
Sang Ho Kwon,
Saebom Lee,
Seung-Hwan Kwon,
Sangjune Kim,
Yong Joo Park,
Misaki Kinoshita,
Young-Ho Lee,
Seokmin Shin,
Seung R. Paik,
Sung Joong Lee,
Seulki Lee,
Byung Hee Hong,
Han Seok Ko
Abstract:
While the emerging evidence indicates that the pathogenesis of Parkinson's disease (PD) is strongly correlated to the accumulation of alpha-synuclein (α-syn) aggregates, there has been no clinical success in anti-aggregation agents for the disease to date. Here we show that graphene quantum dots (GQDs) exhibit anti-amyloid activity via direct interaction with α-syn. Employing biophysical, biochemical, and cell-based assays as well as molecular dynamics (MD) simulation, we find that GQDs have notable potency in not only inhibiting fibrillization of α-syn but also disaggregating mature fibrils in a time-dependent manner. Remarkably, GQDs rescue neuronal death and synaptic loss, reduce Lewy body (LB)/Lewy neurite (LN) formation, ameliorate mitochondrial dysfunctions, and prevent neuron-to-neuron transmission of α-syn pathology induced by α-syn preformed fibrils (PFFs) in neurons. In addition, in vivo administration of GQDs protects against α-syn PFFs-induced loss of dopamine neurons, LB/LN pathology, and behavioural deficits through the penetration of the blood-brain barrier (BBB). The finding that GQDs function as an anti-aggregation agent provides a promising novel therapeutic target for the treatment of PD and related α-synucleinopathies.
Submitted 9 July, 2018; v1 submitted 17 October, 2017;
originally announced October 2017.
-
Teaching renormalization, scaling, and universality with an example from quantum mechanics
Authors:
Steve T. Paik
Abstract:
We discuss the quantum mechanics of a particle restricted to the half-line $x > 0$ with potential energy $V = α/x^2$ for $-1/4 < α < 0$. It is known that two scale-invariant theories may be defined. By regularizing the near-origin behavior of the potential by a finite square well with variable width $b$ and depth $g$, it is shown how these two scale-invariant theories occupy fixed points in the resulting $(b,g)$-space of Hamiltonians. A renormalization group flow exists in this space and scaling variables are shown to exist in a neighborhood of the fixed points. Consequently, the propagator of the regulated theory enjoys homogeneous scaling laws close to the fixed points. Using renormalization group arguments it is possible to discern the functional form of the propagator for long distances and long imaginary times, thus demonstrating the extent to which fixed points control the behavior of the cut-off theory.
By keeping the width fixed and varying only the well depth, we show how the mean position of a bound state diverges as $g$ approaches a critical value. It is proven that the exponent characterizing the divergence is universal in the sense that its value is independent of the choice of regulator.
Two classical interpretations of the results are discussed: standard Brownian motion on the real line, and the free energy of a certain one-dimensional chain of particles with prescribed boundary conditions. In the former example, $V$ appears as part of an expectation value in the Feynman-Kac formula. In the latter example, $V$ appears as the background potential for the chain, and the loss of extensivity is dictated by a universal power law.
Submitted 31 January, 2018; v1 submitted 14 July, 2017;
originally announced July 2017.
-
Coherent control of single spins in silicon carbide at room temperature
Authors:
Matthias Widmann,
Sang-Yun Lee,
Torsten Rendler,
Nguyen Tien Son,
Helmut Fedder,
Seoyoung Paik,
Li-Ping Yang,
Nan Zhao,
Sen Yang,
Ian Booker,
Andrej Denisenko,
Mohammad Jamali,
Seyed Ali Momenzadeh,
Ilja Gerhardt,
Takeshi Ohshima,
Adam Gali,
Erik Janzén,
Jörg Wrachtrup
Abstract:
Spins in solids are cornerstone elements of quantum spintronics. Leading contenders such as defects in diamond or individual phosphorus dopants in silicon have shown spectacular progress but lack either established nanotechnology or an efficient spin-photon interface. Silicon carbide (SiC) combines the strengths of both systems: it has a large bandgap with deep defects and benefits from mature fabrication techniques. Here we report the characterization of photoluminescence and optical spin polarization from single silicon vacancies in SiC, and demonstrate that single spins can be addressed at room temperature. We show coherent control of a single defect spin and find a long spin coherence time under ambient conditions. Our study provides evidence that SiC is a promising system for atomic-scale spintronics and quantum technology.
Submitted 31 October, 2014; v1 submitted 1 July, 2014;
originally announced July 2014.
-
Is the mean free path the mean of a distribution?
Authors:
Steve T. Paik
Abstract:
We bring attention to the fact that Maxwell's mean free path for a dilute hard-sphere gas in thermal equilibrium, $(\sqrt{2}σn)^{-1}$, which is ordinarily obtained by multiplying the average speed by the average time between collisions, is also the statistical mean of the distribution of free path lengths in such a gas.
Submitted 14 July, 2017; v1 submitted 4 September, 2013;
originally announced September 2013.
-
Modulation frequency dependence of continuous-wave optically/electrically detected magnetic resonance
Authors:
Sang-Yun Lee,
Seoyoung Paik,
Dane R. McCamey,
Christoph Boehme
Abstract:
Continuous wave optically and electrically detected magnetic resonance spectroscopy (cwODMR/cwEDMR) allow the investigation of paramagnetic states involved in spin-dependent transitions, like recombination and transport. Although experimentally similar to conventional electron spin resonance (ESR), there exist limitations when applying models originally developed for ESR to observables (luminescence and electric current) of cwODMR and cwEDMR. Here we present closed-form solutions for the modulation frequency dependence of cwODMR and cwEDMR based on an intermediate pair recombination model and discuss ambiguities which arise when attempting to distinguish the dominant spin-dependent processes underlying experimental data. These include: 1) a large number of quantitatively different models cannot be differentiated, 2) signs of signals are determined not only by recombination, but also by other processes like dissociation, intersystem-crossing, and pair generation, and even by experimental parameters such as modulation frequency, microwave power, and temperature, 3) radiative and non-radiative recombination cannot be distinguished due to the observed signs of cwODMR and cwEDMR experiments.
Submitted 28 August, 2012;
originally announced August 2012.
-
Screening in strongly coupled N=2* supersymmetric Yang-Mills plasma
Authors:
Carlos Hoyos,
Steve Paik,
Laurence G. Yaffe
Abstract:
Using gauge-gravity duality, we extend thermodynamic studies and present results for thermal screening masses in strongly coupled N=2* supersymmetric Yang-Mills theory. This non-conformal theory is a mass deformation of maximally supersymmetric N=4 gauge theory. Results are obtained for the entropy density, pressure, specific heat, equation of state, and screening masses, down to previously unexplored low temperatures. The temperature dependence of screening masses in various symmetry channels, which characterize the longest length scales over which thermal fluctuations in the non-Abelian plasma are correlated, is examined and found to be asymptotically linear in the low temperature regime.
Submitted 23 October, 2011; v1 submitted 9 August, 2011;
originally announced August 2011.
-
Thermodynamics of SU(2) N=2 supersymmetric Yang-Mills theory
Authors:
Steve Paik,
Laurence G. Yaffe
Abstract:
The thermodynamics of four-dimensional SU(2) N=2 super-Yang-Mills theory is examined in both high and low temperature regimes. At low temperatures, compelling evidence is found for two distinct equilibrium states related by a spontaneously broken discrete R-symmetry. These equilibrium states exist because the quantum moduli space of the theory has two singular points where extra massless states appear. At high temperature, a unique R-symmetry-preserving equilibrium state is found. Discrepancies with previous results in the literature are explained.
Submitted 17 January, 2010; v1 submitted 7 November, 2009;
originally announced November 2009.
-
A long-term optical and X-ray ephemeris of the polar EK Ursae Majoris
Authors:
K. Beuermann,
J. Diese,
S. Paik,
A. Ploch,
J. Zachmann,
A. D. Schwope,
F. V. Hessman
Abstract:
We searched for long-term period changes in the polar EK UMa using new optical data and archival X-ray/EUV data. An optical ephemeris was derived from data taken remotely with the MONET/N telescope and compared with the X-ray ephemeris based on Einstein, Rosat, and EUVE data. A three-parameter fit to the combined data sets yields the epoch, the period, and the phase offset between the optical minima and the X-ray absorption dips. An added quadratic term is insignificant and sets a limit to the period change. The derived linear ephemeris is valid over 30 years and the common optical and X-ray period is P=0.0795440225(24) days. There is no evidence of long-term O-C variations or a period change over the past 17 years, with $\Delta P = -0.14 \pm 0.50$ ms. We suggest that the observed period is the orbital period and that the system is tightly synchronized. The limit on $\Delta P$ and the phase constancy of the bright part of the light curve indicate that O-C variations of the type seen in the polars DP Leo and HU Aqr or the pre-CV NN Ser do not seem to occur in EK UMa. The X-ray dips lag the optical minima by $9.5 \pm 0.7$ deg in azimuth, providing some insight into the accretion geometry.
Submitted 6 November, 2009;
originally announced November 2009.
-
$T_1$- and $T_2$-spin relaxation time limitations of phosphorous donor electrons near crystalline silicon to silicon dioxide interface defects
Authors:
S. -Y. Paik,
S. -Y. Lee,
W. J. Baker,
D. R. McCamey,
C. Boehme
Abstract:
A study of donor electron spins and spin-dependent electronic transitions involving phosphorus ($^{31}$P) atoms in proximity of the (111) oriented crystalline silicon (c-Si) to silicon dioxide (SiO$_{2}$) interface is presented for [$^{31}$P] = 10$^{15}$ $\mathrm{cm}^{-3}$ and [$^{31}$P] = 10$^{16}$ $\mathrm{cm}^{-3}$ at about liquid $^4$He temperatures ($T = 5$ $\mathrm{K} - 15$ $\mathrm{K}$). Using pulsed electrically detected magnetic resonance (pEDMR), spin-dependent transitions between the $^{31}$P donor state and two distinguishable interface states are observed, namely (i) P$_\mathrm{b}$ centers which can be identified by their characteristic anisotropy and (ii) a more isotropic center which is attributed to E$^\prime$ defects of the SiO$_{2}$ bulk close to the interface. Correlation measurements of the dynamics of spin-dependent recombination confirm that previously proposed transitions between $^{31}$P and the interface defects take place. The influence of these electronic near-interface transitions on the $^{31}$P donor spin coherence time $T_2$ as well as the donor spin-lattice relaxation time $T_1$ is then investigated by comparison of spin Hahn-echo decay measurements obtained from conventional bulk sensitive pulsed electron paramagnetic resonance and surface sensitive pEDMR, as well as surface sensitive electrically detected inversion recovery experiments. The measurements reveal that both $T_2$ and $T_1$ of $^{31}$P donor electron spins in proximity of energetically lower interface states at $T\leq 13$ K are reduced by several orders of magnitude.
Submitted 4 May, 2009;
originally announced May 2009.
-
Holographic Double Diffractive Scattering
Authors:
Christopher P. Herzog,
Steve Paik,
Matthew J. Strassler,
Ethan G. Thompson
Abstract:
The holographic description of Pomeron exchange in a strongly-coupled gauge theory with an AdS dual is extended to the case of two to three scattering. We study the production event of a central particle via hadron-hadron scattering in the double Regge kinematic regime of large center-of-momentum energy and fixed momentum transfer. The computation reduces to the overlap of a holographic wave function for the central particle with a source function for the Pomerons. The formalism is applied to scalar glueball production and the resulting amplitude is studied in various kinematic limits.
Submitted 2 June, 2008;
originally announced June 2008.