-
Improved Long-Form Speech Recognition by Jointly Modeling the Primary and Non-primary Speakers
Authors:
Guru Prakash Arumugam,
Shuo-yiin Chang,
Tara N. Sainath,
Rohit Prabhavalkar,
Quan Wang,
Shaan Bijwadia
Abstract:
ASR models often suffer from a long-form deletion problem, in which the model predicts sequential blanks instead of words when transcribing lengthy audio (on the order of minutes or hours). From the perspective of a user or downstream system consuming the ASR results, this behavior can be perceived as the model "being stuck" and can make the product hard to use. One of the culprits for long-form deletion is training-test data mismatch, which can happen even when the model is trained on diverse and large-scale data collected from multiple application domains. In this work, we introduce a novel technique to simultaneously model different groups of speakers in the audio along with the standard transcript tokens. Speakers are grouped as primary and non-primary, which connects the application domains and significantly alleviates the long-form deletion problem. The improved model needs no additional training data and incurs no additional training or inference cost.
Submitted 18 December, 2023;
originally announced December 2023.
-
Can Balloons Produce Li-Fi? A Disaster Management Perspective
Authors:
Atchutananda Surampudi,
Sankalp Shirish Chapalgaonkar,
Paventhan Arumugam
Abstract:
Natural calamities and disasters disrupt conventional communication setups, and wireless bandwidth becomes constrained. A safe and cost-effective solution for communication and data access in such scenarios has long been needed. Light-Fidelity (Li-Fi), which promises wireless access to data at high speeds using visible light, can be a good option. Visible light is safe to use for wireless access in such affected environments and also provides illumination. Importantly, when a Li-Fi unit is attached to an air balloon and a network of such Li-Fi balloons is coordinated to form a Li-Fi balloon network, data can be accessed anytime and anywhere it is required, and hence many lives can be tracked and saved. We propose this idea of a Li-Fi balloon and give an overview of its design using the Philips Li-Fi hardware. Further, we propose the concept of a balloon network and coin the acronym LiBNet for it. We consider the balloons to be arranged as a homogeneous Poisson point process in the LiBNet, and we derive the mean co-channel interference for such an arrangement.
Submitted 13 December, 2017;
originally announced December 2017.
-
Optimal Evacuation Flows on Dynamic Paths with General Edge Capacities
Authors:
Guru Prakash Arumugam,
John Augustine,
Mordecai J. Golin,
Yuya Higashikawa,
Naoki Katoh,
Prashanth Srikanthan
Abstract:
A Dynamic Graph Network is a graph in which each edge has an associated travel time and a capacity (width) that limits the number of items that can travel in parallel along that edge. Each vertex in this dynamic graph network begins with a number of items that must be evacuated to designated sink vertices. A $k$-sink evacuation protocol finds the locations of $k$ sinks and an associated evacuation movement protocol that allows evacuating all the items to a sink in minimum time. The associated evacuation movement must impose a confluent flow, i.e., all items passing through a particular vertex exit that vertex using the same edge. In this paper we address the $k$-sink evacuation problem on a dynamic path network. We provide solutions that run in $O(n \log n)$ time for $k=1$ and $O(k n \log^2 n)$ time for $k>1$ and that work for arbitrary edge capacities.
Submitted 23 June, 2016;
originally announced June 2016.
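To make the 1-sink problem in the abstract above concrete, here is a minimal brute-force sketch in Python. It is not the paper's $O(n \log n)$ algorithm. It assumes a continuous-flow model in which the items to the left of a sink at vertex `s` finish arriving at time max over `i < s` of `(x[s] - x[i]) * tau + W(0..i) / cmin(i, s)`, where `W(0..i)` is the total number of items at vertices `0..i` and `cmin(i, s)` is the smallest edge capacity between vertex `i` and the sink (symmetrically for the right side). It also assumes sinks are restricted to vertices. The names `x`, `w`, `c`, `tau`, and both function names are illustrative and do not come from the listing.

```python
from typing import Sequence, Tuple


def evacuation_time(x: Sequence[float], w: Sequence[float],
                    c: Sequence[float], tau: float, s: int) -> float:
    """Evacuation time to a sink at vertex s (assumed continuous-flow model).

    x[i] is the position of vertex i, w[i] the number of items starting there,
    c[i] the capacity of the edge between vertices i and i+1, and tau the
    travel time per unit distance.
    """
    n = len(x)
    worst = 0.0

    # Left side: accumulate items from the far end towards the sink.
    total = 0.0
    for i in range(s):
        total += w[i]
        if total > 0:
            bottleneck = min(c[i:s])  # edges i .. s-1 lie between i and s
            worst = max(worst, (x[s] - x[i]) * tau + total / bottleneck)

    # Right side: mirror image of the left side.
    total = 0.0
    for i in range(n - 1, s, -1):
        total += w[i]
        if total > 0:
            bottleneck = min(c[s:i])  # edges s .. i-1 lie between s and i
            worst = max(worst, (x[i] - x[s]) * tau + total / bottleneck)

    return worst


def best_single_sink(x: Sequence[float], w: Sequence[float],
                     c: Sequence[float], tau: float) -> Tuple[int, float]:
    """Try every vertex as the sink and return (index, evacuation time)."""
    times = [evacuation_time(x, w, c, tau, s) for s in range(len(x))]
    s_best = min(range(len(x)), key=lambda s: times[s])
    return s_best, times[s_best]


if __name__ == "__main__":
    # Three vertices at positions 0, 1, 2; edge capacities 2 and 1.
    print(best_single_sink([0, 1, 2], [4, 0, 2], [2, 1], 1.0))
```

This brute force recomputes each bottleneck by scanning a slice, so it is quadratic or worse in the number of vertices. It is useful only as a reference for checking faster implementations on small inputs.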
-
A Polynomial Time Algorithm for Minimax-Regret Evacuation on a Dynamic Path
Authors:
Guru Prakash Arumugam,
John Augustine,
Mordecai J. Golin,
Prashanth Srikanthan
Abstract:
A dynamic path network is an undirected path with evacuees situated at each vertex. To evacuate the path, evacuees travel towards a designated sink (doorway) to exit. Each edge has a capacity, the number of evacuees that can enter the edge in unit time. Congestion occurs if an evacuee has to wait at a vertex for other evacuees to leave first. The basic problem is to place k sinks on the line, with an associated evacuation strategy, so as to minimize the total time needed to evacuate everyone. The minmax-regret version introduces uncertainty into the input, with the number of evacuees at vertices only being specified to within a range. The problem is to find a universal solution whose worst-case regret (difference from optimal for a given input), taken over all legal inputs, is minimized. The previously best known algorithms for the minmax-regret version of the problem ran in time exponential in k. In this paper, we derive new properties of solutions that yield the first polynomial time algorithms for solving the problem.
Submitted 22 April, 2014;
originally announced April 2014.
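The minmax-regret objective described in the abstract above can be written out as follows. The notation is an assumption made here for illustration and is not taken from the listing: $P$ denotes a placement of the k sinks, $s$ a scenario (one legal assignment of evacuee counts to vertices), $S$ the set of all legal scenarios, and $\Theta(P, s)$ the evacuation time of placement $P$ under scenario $s$.

$$
R(P, s) = \Theta(P, s) - \min_{P'} \Theta(P', s), \qquad
R_{\max}(P) = \max_{s \in S} R(P, s), \qquad
P^{*} = \arg\min_{P} R_{\max}(P)
$$

Here $R(P, s)$ is the regret of $P$ under a single scenario, $R_{\max}(P)$ is its worst case over all legal inputs, and $P^{*}$ is the universal solution the problem asks for.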