-
An Effective Context-Balanced Adaptation Approach for Long-Tailed Speech Recognition
Authors:
Yi-Cheng Wang,
Li-Ting Pai,
Bi-Cheng Yan,
Hsin-Wei Wang,
Chi-Han Lin,
Berlin Chen
Abstract:
End-to-end (E2E) automatic speech recognition (ASR) models have become standard practice for various commercial applications. However, in real-world scenarios, the long-tailed nature of word distribution often leads E2E ASR models to perform well on common words but fall short in recognizing uncommon ones. Recently, the notion of a contextual adapter (CA) was proposed to infuse external knowledge represented by a context word list into E2E ASR models. Although CA can improve recognition performance on rare words, two crucial data imbalance problems remain. First, when low-frequency words are used as context words during training, these words rarely occur in the utterance, so CA becomes prone to overfitting by attending to the <no-context> token, because higher-frequency words are not present in the context list. Second, the long-tailed distribution within the context list itself still causes the model to perform poorly on low-frequency context words. In light of this, we explore in depth how altering the context list to contain words with different frequency distributions affects model performance, and we extend CA with a simple yet effective context-balanced learning objective. A series of experiments conducted on the AISHELL-1 benchmark dataset suggests that using all vocabulary words from the training corpus as the context list and pairing them with our balanced objective yields the best performance, reducing the character error rate (CER) by up to 1.21% and the error rate on zero-shot words by a more pronounced 9.44%.
Submitted 10 September, 2024;
originally announced September 2024.
-
Quasiperiods of Magic Labeling Quasipolynomials
Authors:
Margaret Bayer,
Amanda Burcroff,
Tyrrell B. McAllister,
Leilani Pai
Abstract:
A magic labeling of a graph is a labeling of the edges by nonnegative integers such that the label sum over the edges incident to every vertex is the same. This common label sum is known as the index. We count magic labelings by maximum edge label, rather than index, using an Ehrhart-theoretic approach. In contrast to Stanley's 1973 work showing that the function counting magic labelings with bounded index is a quasipolynomial with quasiperiod $2$, we show by construction that the minimum quasiperiod of the quasipolynomial counting magic labelings with bounded maximum label can be arbitrarily large, even for planar bipartite graphs. Unfortunately, this rules out a certain Ehrhart-theoretic approach to proving Hartsfield and Ringel's Antimagic Graph Conjecture. However, we show that this quasipolynomial is in fact a polynomial for any bipartite graph with matching preclusion number at most $1$, which includes any bipartite graph with a leaf.
Submitted 6 March, 2024;
originally announced March 2024.
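The counting function in this abstract can be illustrated on a small graph. The following is a minimal brute-force sketch, not the authors' Ehrhart-theoretic method; the function name `count_magic_labelings` and the choice of the 4-cycle as an example are assumptions made here for illustration.

```python
from itertools import product


def count_magic_labelings(num_vertices, edges, max_label):
    """Count labelings of `edges` by integers in 0..max_label such that
    the label sum over the edges at every vertex is the same."""
    count = 0
    for labels in product(range(max_label + 1), repeat=len(edges)):
        sums = [0] * num_vertices
        for (u, v), w in zip(edges, labels):
            sums[u] += w
            sums[v] += w
        # A labeling is magic when all vertex sums coincide.
        if len(set(sums)) == 1:
            count += 1
    return count


# Illustrative example: the 4-cycle on vertices 0..3.
c4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
counts = [count_magic_labelings(4, c4, t) for t in range(4)]
```

For the 4-cycle, opposite edges must carry equal labels, so the count for maximum label t is (t + 1)^2. Enumeration is exponential in the number of edges and is only practical for very small graphs.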
-
On the interval coloring impropriety of graphs
Authors:
MacKenzie Carr,
Eun-Kyung Cho,
Nicholas Crawford,
Vesna Iršič,
Leilani Pai,
Rebecca Robinson
Abstract:
An improper interval (edge) coloring of a graph $G$ is an assignment of colors to the edges of $G$ satisfying the condition that, for every vertex $v \in V(G)$, the set of colors assigned to the edges incident with $v$ forms an integral interval. An interval coloring is $k$-improper if at most $k$ edges with the same color all share a common endpoint. The minimum integer $k$ such that there exists a $k$-improper interval coloring of the graph $G$ is the interval coloring impropriety of $G$, denoted by $\mu_{int}(G)$. In this paper, we provide a construction of an interval coloring of a subclass of complete multipartite graphs. This provides additional evidence for the conjecture by Casselgren and Petrosyan that $\mu_{int}(G)\leq 2$ for all complete multipartite graphs $G$. Additionally, we determine improved upper bounds on the interval coloring impropriety of several classes of graphs, namely 2-trees, iterated triangulations, and outerplanar graphs. Finally, we investigate the interval coloring impropriety of the corona product of two graphs, $G\odot H$.
Submitted 30 May, 2024; v1 submitted 22 December, 2023;
originally announced December 2023.
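The definition of a $k$-improper interval coloring can be made concrete with a small checker. This is a sketch for illustration only: the function name `interval_impropriety` and the triangle example are assumptions introduced here, and the code assumes a simple graph with integer colors.

```python
from collections import Counter, defaultdict


def interval_impropriety(edges, colors):
    """Return the smallest k for which `colors` is a k-improper interval
    coloring of `edges`, or None if some vertex sees a non-interval color set."""
    at_vertex = defaultdict(Counter)
    for (u, v), c in zip(edges, colors):
        at_vertex[u][c] += 1
        at_vertex[v][c] += 1
    k = 0
    for counts in at_vertex.values():
        used = set(counts)
        # The colors at a vertex form an integer interval exactly when
        # the span max - min + 1 equals the number of distinct colors.
        if max(used) - min(used) + 1 != len(used):
            return None
        k = max(k, max(counts.values()))
    return k


triangle = [(0, 1), (1, 2), (0, 2)]
improper = interval_impropriety(triangle, [1, 1, 2])  # colors 1, 1, 2
proper_attempt = interval_impropriety(triangle, [1, 2, 3])
```

In the triangle, the coloring 1, 1, 2 is a valid interval coloring with two edges of color 1 meeting at vertex 1, while the proper coloring 1, 2, 3 fails the interval condition at the vertex that sees colors 1 and 3.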
-
The Phase Transition of Discrepancy in Random Hypergraphs
Authors:
Calum MacRury,
Tomáš Masařík,
Leilani Pai,
Xavier Pérez-Giménez
Abstract:
Motivated by the Beck-Fiala conjecture, we study the discrepancy problem in two related models of random hypergraphs on $n$ vertices and $m$ edges. In the first (edge-independent) model, a random hypergraph $H_1$ is constructed by fixing a parameter $p$ and allowing each of the $n$ vertices to join each of the $m$ edges independently with probability $p$. In the parameter range in which $pn \rightarrow \infty$ and $pm \rightarrow \infty$, we show that with high probability (w.h.p.) $H_1$ has discrepancy at least $\Omega(2^{-n/m} \sqrt{pn})$ when $m = O(n)$, and at least $\Omega(\sqrt{pn \log \gamma})$ when $m \gg n$, where $\gamma = \min\{ m/n, pn\}$. In the second (edge-dependent) model, $d$ is fixed and each vertex of $H_2$ independently joins exactly $d$ edges uniformly at random. We obtain analogous results for this model by generalizing the techniques used for the edge-independent model with $p=d/m$. Namely, for $d \rightarrow \infty$ and $dn/m \rightarrow \infty$, we prove that w.h.p. $H_{2}$ has discrepancy at least $\Omega(2^{-n/m} \sqrt{dn/m})$ when $m = O(n)$, and at least $\Omega(\sqrt{(dn/m) \log \gamma})$ when $m \gg n$, where $\gamma = \min\{m/n, dn/m\}$. Furthermore, we obtain nearly matching asymptotic upper bounds on the discrepancy in both models (when $p=d/m$), in the dense regime of $m \gg n$. Specifically, we apply the partial colouring lemma of Lovett and Meka to show that w.h.p. $H_{1}$ and $H_{2}$ each have discrepancy $O( \sqrt{dn/m} \log(m/n))$, provided $d \rightarrow \infty$, $d n/m \rightarrow \infty$ and $m \gg n$. This result is algorithmic, and together with the work of Bansal and Meka characterizes how the discrepancy of each random hypergraph model transitions from $\Theta(\sqrt{d})$ to $o(\sqrt{d})$ as $m$ varies from $m=\Theta(n)$ to $m \gg n$.
Submitted 22 October, 2021; v1 submitted 14 February, 2021;
originally announced February 2021.
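The edge-independent model and the notion of discrepancy can be sketched for tiny instances as follows. This is an exhaustive-search illustration under stated assumptions, not the algorithmic method based on the Lovett and Meka lemma; the function names `random_hypergraph` and `discrepancy`, and the seed used, are hypothetical choices made here.

```python
import random
from itertools import product


def random_hypergraph(n, m, p, rng):
    """Edge-independent model: each of the n vertices joins each of the
    m edges independently with probability p."""
    return [[v for v in range(n) if rng.random() < p] for _ in range(m)]


def discrepancy(n, edges):
    """Minimum over all +1/-1 colourings of the vertices of the largest
    absolute colour sum over any edge (exhaustive, so only for small n)."""
    best = None
    for colouring in product((-1, 1), repeat=n):
        worst = max((abs(sum(colouring[v] for v in e)) for e in edges), default=0)
        if best is None or worst < best:
            best = worst
    return best


# Deterministic check: the triangle, viewed as a 2-uniform hypergraph.
triangle_disc = discrepancy(3, [[0, 1], [1, 2], [0, 2]])

# A small random instance with a fixed seed for reproducibility.
rng = random.Random(0)
h1 = random_hypergraph(8, 5, 0.5, rng)
h1_disc = discrepancy(8, h1)
```

The triangle has discrepancy 2, since any two-colouring of three vertices gives two adjacent vertices the same colour. The search visits all 2^n colourings, so it is meant only to make the definitions concrete, not to probe the asymptotic regimes discussed in the abstract.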