-
Sensing of inspiration events from speech: comparison of deep learning and linguistic methods
Authors:
Aki Härmä,
Ulf Grossekathöfer,
Okke Ouweltjes,
Venkata Srikanth Nallanthighal
Abstract:
A respiratory chest belt sensor can be used to measure the respiratory rate and other respiratory health parameters. Virtual Respiratory Belt (VRB) algorithms estimate the belt sensor waveform from speech audio. In this paper we compare the detection of inspiration events (IE) from respiratory belt sensor data using a novel neural VRB algorithm and the detections based on time-aligned linguistic content. The results show the superiority of the VRB method over word pause detection or grammatical content segmentation. The comparison of the methods shows that both read and spontaneous speech content have a significant amount of ungrammatical breathing, that is, breathing events that are not aligned with grammatically appropriate places in language. This study gives new insights into the development of VRB methods and adds to the general understanding of speech breathing behavior. Moreover, a new VRB method, VRBOLA, for the reconstruction of the continuous breathing waveform is demonstrated.
Submitted 19 May, 2023;
originally announced May 2023.
-
Improving Extrinsics between RADAR and LIDAR using Learning
Authors:
Peng Jiang,
Srikanth Saripalli
Abstract:
LIDAR and RADAR are two commonly used sensors in autonomous driving systems. The extrinsic calibration between the two is crucial for effective sensor fusion. The challenge arises from the low accuracy and sparse information in RADAR measurements. This paper presents a novel solution for 3D RADAR-LIDAR calibration in autonomous systems. The method employs simple targets to generate data, including correspondence registration and a one-step optimization algorithm. The optimization aims to minimize the reprojection error while utilizing a small multi-layer perceptron (MLP) to perform regression on the return energy of the sensor around the targets. The proposed approach uses a deep learning framework such as PyTorch and can be optimized through gradient descent. The experiment uses a 360-degree Ouster-128 LIDAR and a 360-degree Navtech RADAR, providing raw measurements. The results validate the effectiveness of the proposed method in achieving improved estimates of extrinsic calibration parameters.
Submitted 17 May, 2023;
originally announced May 2023.
-
HyHTM: Hyperbolic Geometry based Hierarchical Topic Models
Authors:
Simra Shahid,
Tanay Anand,
Nikitha Srikanth,
Sumit Bhatia,
Balaji Krishnamurthy,
Nikaash Puri
Abstract:
Hierarchical Topic Models (HTMs) are useful for discovering topic hierarchies in a collection of documents. However, traditional HTMs often produce hierarchies where lower-level topics are unrelated and not specific enough to their higher-level topics. Additionally, these methods can be computationally expensive. We present HyHTM, a Hyperbolic geometry based Hierarchical Topic Model, that addresses these limitations by incorporating hierarchical information from hyperbolic geometry to explicitly model hierarchies in topic models. Experimental results with four baselines show that HyHTM can better attend to parent-child relationships among topics. HyHTM produces coherent topic hierarchies that specialise in granularity from generic higher-level topics to specific lower-level topics. Further, our model is significantly faster and leaves a much smaller memory footprint than our best-performing baseline. We have made the source code for our algorithm publicly accessible.
Submitted 16 May, 2023;
originally announced May 2023.
-
Low-Degree Testing Over Grids
Authors:
Prashanth Amireddy,
Srikanth Srinivasan,
Madhu Sudan
Abstract:
We study the question of local testability of low (constant) degree functions from a product domain $S_1 \times \dots \times S_n$ to a field $\mathbb{F}$, where $S_i \subseteq \mathbb{F}$ can be arbitrary constant-sized sets. We show that this family is locally testable when the grid is "symmetric". That is, if $S_i = S$ for all $i$, there is a probabilistic algorithm using constantly many queries that distinguishes whether $f$ has a polynomial representation of degree at most $d$ or is $\Omega(1)$-far from having this property. In contrast, we show that there exist asymmetric grids with $|S_1| = \dots = |S_n| = 3$ for which testing requires $\omega_n(1)$ queries, thereby establishing that even in the context of polynomials, local testing depends on the structure of the domain and not just the distance of the underlying code.
The low-degree testing problem has been studied extensively over the years and a wide variety of tools have been applied to propose and analyze tests. Our work introduces yet another new connection in this rich field, by building low-degree tests out of tests for "junta-degrees". A function $f : {S}_1 \times \dots \times {S}_n \to {G}$, for an abelian group ${G}$ is said to be a junta-degree-$d$ function if it is a sum of $d$-juntas. We derive our low-degree test by giving a new local test for junta-degree-$d$ functions. For the analysis of our tests, we deduce a small-set expansion theorem for spherical noise over large grids, which may be of independent interest.
Submitted 10 November, 2024; v1 submitted 8 May, 2023;
originally announced May 2023.
-
Lessons Learned in ATCO2: 5000 hours of Air Traffic Control Communications for Robust Automatic Speech Recognition and Understanding
Authors:
Juan Zuluaga-Gomez,
Iuliia Nigmatulina,
Amrutha Prasad,
Petr Motlicek,
Driss Khalil,
Srikanth Madikeri,
Allan Tart,
Igor Szoke,
Vincent Lenders,
Mickael Rigault,
Khalid Choukri
Abstract:
Voice communication between air traffic controllers (ATCos) and pilots is critical for ensuring safe and efficient air traffic control (ATC). This task requires high levels of awareness from ATCos and can be tedious and error-prone. Recent attempts have been made to integrate artificial intelligence (AI) into ATC in order to reduce the workload of ATCos. However, the development of data-driven AI systems for ATC demands large-scale annotated datasets, which are currently lacking in the field. This paper explores the lessons learned from the ATCO2 project, which aimed to develop a unique platform to collect and preprocess large amounts of ATC data from airspace in real time. Audio and surveillance data were collected from publicly accessible radio frequency channels with VHF receivers owned by a community of volunteers and later uploaded to OpenSky Network servers, which can be considered an "unlimited source" of data. In addition, this paper reviews previous work from ATCO2 partners, including (i) robust automatic speech recognition, (ii) natural language processing, (iii) English language identification of ATC communications, and (iv) the integration of surveillance data such as ADS-B. We believe that the pipeline developed during the ATCO2 project, along with the open-sourcing of its data, will encourage research in the ATC field. A sample of the ATCO2 corpus is available on the following website: https://www.atco2.org/data, while the full corpus can be purchased through ELDA at http://catalog.elra.info/en-us/repository/browse/ELRA-S0484. We demonstrated that ATCO2 is an appropriate dataset to develop ASR engines when little to no ATC in-domain data is available. For instance, with the CNN-TDNNf Kaldi model, we reached WERs as low as 17.9% and 24.9% on public ATC datasets, which is 6.6/7.6% better than an "out-of-domain" but supervised CNN-TDNNf model.
Submitted 1 May, 2023;
originally announced May 2023.
-
Data-driven discovery of stochastic dynamical equations of collective motion
Authors:
Arshed Nabeel,
Vivek Jadhav,
Danny Raj M,
Clément Sire,
Guy Theraulaz,
Ramón Escobedo,
Srikanth K. Iyer,
Vishwesha Guttal
Abstract:
Coarse-grained descriptions of collective motion of flocking systems are often derived for the macroscopic or the thermodynamic limit. However, many real flocks are small (10 to 100 individuals), a regime called the mesoscopic scale, where stochasticity arising from the finite flock size is important. Developing mesoscopic-scale equations, typically in the form of stochastic differential equations, can be challenging even for the simplest of the collective motion models. Here, we take a novel data-driven equation learning approach to construct the stochastic mesoscopic descriptions of a simple self-propelled particle (SPP) model of collective motion. In our SPP model, a focal individual can interact with k randomly chosen neighbours within an interaction radius. We consider k = 1 (called stochastic pairwise interactions), k = 2 (stochastic ternary interactions), and k equalling all available neighbours within the interaction radius (equivalent to Vicsek-like local averaging). The data-driven mesoscopic equations reveal that the stochastic pairwise interaction model produces a novel form of collective motion driven by a multiplicative noise term (hence termed noise-induced flocking). In contrast, higher-order interactions (k > 1), including Vicsek-like averaging interactions, yield collective motion driven primarily by the deterministic forces. We find that the relation between the parameters of the mesoscopic equations describing the dynamics and the population size is sensitive to the density and to the interaction radius, exhibiting deviations from mean-field theoretical expectations. We provide semi-analytic arguments potentially explaining these observed deviations. In summary, our study emphasizes the importance of mesoscopic descriptions of flocking systems and demonstrates the potential of data-driven equation discovery methods for complex systems studies.
Submitted 19 April, 2023;
originally announced April 2023.
-
Dynamic Chunk Convolution for Unified Streaming and Non-Streaming Conformer ASR
Authors:
Xilai Li,
Goeric Huybrechts,
Srikanth Ronanki,
Jeff Farris,
Sravan Bodapati
Abstract:
Recently, there has been an increasing interest in unifying streaming and non-streaming speech recognition models to reduce development, training and deployment cost. The best-known approaches rely on either window-based or dynamic chunk-based attention strategy and causal convolutions to minimize the degradation due to streaming. However, the performance gap still remains relatively large between non-streaming and a full-contextual model trained independently. To address this, we propose a dynamic chunk-based convolution replacing the causal convolution in a hybrid Connectionist Temporal Classification (CTC)-Attention Conformer architecture. Additionally, we demonstrate further improvements through initialization of weights from a full-contextual model and parallelization of the convolution and self-attention modules. We evaluate our models on the open-source Voxpopuli, LibriSpeech and in-house conversational datasets. Overall, our proposed model reduces the degradation of the streaming mode over the non-streaming full-contextual model from 41.7% and 45.7% to 16.7% and 26.2% on the LibriSpeech test-clean and test-other datasets respectively, while improving by a relative 15.5% WER over the previous state-of-the-art unified model.
Submitted 25 April, 2023; v1 submitted 18 April, 2023;
originally announced April 2023.
-
High-speed turbulent flows towards the exascale: STREAmS-2 porting and performance
Authors:
Srikanth Sathyanarayana,
Matteo Bernardini,
Davide Modesti,
Sergio Pirozzoli,
Francesco Salvadore
Abstract:
Exascale High Performance Computing (HPC) represents a tremendous opportunity to push the boundaries of Computational Fluid Dynamics (CFD), but despite the consolidated trend towards the use of Graphics Processing Units (GPUs), programmability is still an issue. STREAmS-2 (Bernardini et al. Comput. Phys. Commun. 285 (2023) 108644) is a compressible solver for canonical wall-bounded turbulent flows capable of harvesting the potential of NVIDIA GPUs. Here we extend the already available CUDA Fortran backend with a novel HIPFort backend targeting AMD GPU architectures. The main implementation strategies are discussed along with a novel Python tool that can generate the HIPFort and CPU code versions allowing developers to focus their attention only on the CUDA Fortran backend. Single GPU performance is analysed focusing on NVIDIA A100 and AMD MI250x cards which are currently at the core of several HPC clusters. The gap between peak GPU performance and STREAmS-2 performance is found to be generally smaller for NVIDIA cards. Roofline analysis allows tracing this behavior to unexpectedly different computational intensities of the same kernel using the two cards. Parallel performance is measured on the two largest EuroHPC pre-exascale systems, LUMI (AMD GPUs) and Leonardo (NVIDIA GPUs). Strong scalability reveals more than 80% efficiency up to 16 nodes for Leonardo and up to 32 for LUMI. Weak scalability shows an impressive efficiency of over 95% up to the maximum number of nodes tested (256 for LUMI and 512 for Leonardo). This analysis shows that STREAmS-2 is the perfect candidate to fully exploit the power of current pre-exascale HPC systems in Europe, allowing users to simulate flows with over a trillion mesh points, thus reducing the gap between the Reynolds numbers achievable in high-fidelity simulations and those of real engineering applications.
Submitted 11 April, 2023;
originally announced April 2023.
-
High Frobenius pushforwards generate the bounded derived category
Authors:
Matthew R. Ballard,
Srikanth B. Iyengar,
Pat Lank,
Alapan Mukhopadhyay,
Josh Pollitz
Abstract:
This work concerns generators for the bounded derived category of coherent sheaves over a noetherian scheme $X$ of prime characteristic. The main result is that when the Frobenius map on $X$ is finite, for any compact generator $G$ of $\mathsf{D}(X)$ the Frobenius pushforward $F ^e_*G$ generates the bounded derived category whenever $p^e$ is larger than the codepth of $X$, an invariant that is a measure of the singularity of $X$. The conclusion holds for all positive integers $e$ when $X$ is locally complete intersection. The question of when one can take $G=\mathcal{O}_X$ is also investigated. For smooth projective complete intersections it reduces to a question of generation of the Kuznetsov component.
Submitted 13 April, 2023; v1 submitted 31 March, 2023;
originally announced March 2023.
-
Revolutionizing Modern Networks: Advances in AI, Machine Learning, and Blockchain for Quantum Satellites and UAV-based Communication
Authors:
Adarsh Kumar,
Ali Ismail Awad,
Gaurav Sharma,
Rajalakshmi Krishnamurthi,
Saurabh Jain,
P. Srikanth,
Kriti Sharma,
Mustapha Hedabou,
Surinder Sood
Abstract:
Quantum communication is the most secure technique of transmitting data available today. Fiber communication lines and satellite-to-ground links have served as the basis for the most successful quantum networks that have been developed so far. Using a UAV, satellite or both for free-space quantum communication reduces the need for permanent ground connections and takes advantage of the lower loss limit in space, which makes it more efficient. This work surveys recent developments in quantum satellite and quantum UAV-based networks. Here, the importance of the latest technologies, including quantum artificial intelligence, blockchain, quantum machine learning, quantum satellites and quantum UAVs, is explored from network perspectives. Further, this work discusses the role of satellite-based images and artificial intelligence.
Submitted 2 September, 2024; v1 submitted 21 March, 2023;
originally announced March 2023.
-
Beamformer-Guided Target Speaker Extraction
Authors:
Mohamed Elminshawi,
Srikanth Raj Chetupalli,
Emanuël A. P. Habets
Abstract:
We propose a Beamformer-guided Target Speaker Extraction (BG-TSE) method to extract a target speaker's voice from a multi-channel recording informed by the direction of arrival of the target. The proposed method employs a front-end beamformer steered towards the target speaker to provide an auxiliary signal to a single-channel TSE system. By allowing for time-varying embeddings in the single-channel TSE block, the proposed method fully exploits the correspondence between the front-end beamformer output and the target speech in the microphone signal. Experimental evaluation on simulated multi-channel 2-speaker mixtures, in both anechoic and reverberant conditions, demonstrates the advantage of the proposed method compared to recent single-channel and multi-channel baselines.
Submitted 15 March, 2023;
originally announced March 2023.
-
Multi-Microphone Speaker Separation by Spatial Regions
Authors:
Julian Wechsler,
Srikanth Raj Chetupalli,
Wolfgang Mack,
Emanuël A. P. Habets
Abstract:
We consider the task of region-based source separation of reverberant multi-microphone recordings. We assume pre-defined spatial regions with a single active source per region. The objective is to estimate the signals from the individual spatial regions as captured by a reference microphone while retaining a correspondence between signals and spatial regions. We propose a data-driven approach using a modified version of a state-of-the-art network, where different layers model spatial and spectro-temporal information. The network is trained to enforce a fixed mapping of regions to network outputs. Using speech from LibriMix, we construct a data set specifically designed to contain the region information. Additionally, we train the network with permutation invariant training. We show that both training methods result in a fixed mapping of regions to network outputs, achieve comparable performance, and that the networks exploit spatial information. The proposed network outperforms a baseline network by 1.5 dB in scale-invariant signal-to-distortion ratio.
Submitted 13 March, 2023;
originally announced March 2023.
-
Competing Magnetic Interactions and Field-Induced Metamagnetic Transition in Highly Crystalline Phase-Tunable Iron Oxide Nanorods
Authors:
Supun B. Attanayake,
Amit Chanda,
Thomas Hulse,
Raja Das,
Manh-Huong Phan,
Hariharan Srikanth
Abstract:
The inherent existence of multiple phases in iron oxide nanostructures highlights the importance of investigating them deliberately to understand and possibly control the phases. Here, the effects of annealing at 250 °C with a variable duration on the bulk magnetic and structural properties of high aspect ratio bi-phase iron oxide nanorods with ferrimagnetic Fe3O4 and antiferromagnetic alpha-Fe2O3 are explored. Increasing annealing time under a free flow of oxygen enhanced the alpha-Fe2O3 volume fraction and improved the crystallinity of the Fe3O4 phase, as identified in changes in the magnetization as a function of annealing time. A critical annealing time of approximately 3 hours maximized the presence of both phases, as observed via an enhancement in the magnetization and an interfacial pinning effect. This is attributed to disordered spins separating the magnetically distinct phases, which tend to align with the application of a magnetic field at high temperatures. The increased antiferromagnetic phase can be distinguished by the field-induced metamagnetic transitions observed in structures annealed for more than 3 hours, which were especially prominent in the 9-hour annealed sample. Our controlled study in determining the changes in volume fractions with annealing time will enable precise control over phase tunability in iron oxide nanorods, allowing custom-made phase volume fractions in different applications ranging from spintronics to biomedical applications.
Submitted 12 March, 2023;
originally announced March 2023.
-
A class of Gorenstein algebras and their dualities
Authors:
Wassilij Gnedin,
Srikanth B. Iyengar,
Henning Krause
Abstract:
In the recent paper "The Nakayama functor and its completion for Gorenstein algebras", a class of Gorenstein algebras over commutative noetherian rings was introduced, and duality theorems for various categories of representations were established. The manuscript on hand provides more context to the results presented in the aforementioned work, identifies new classes of Gorenstein algebras, and explores their behaviour under standard operations like taking tensor products and tilting.
Submitted 8 March, 2023;
originally announced March 2023.
-
Study on the global minimum and $H\to\gamma\gamma$ in the Dirac scotogenic model
Authors:
Raghavendra Srikanth Hundi
Abstract:
We have analyzed the vacuum structure of the Dirac scotogenic model, whose scalar sector consists of two complex Higgs doublets and a real singlet field. In this model, the standard-model-like Higgs doublet acquires a non-zero vacuum expectation value (VEV), whereas the other two fields acquire zero VEVs. This pattern of VEVs constitutes a minimum, which is the desired vacuum of the model. After analyzing the scalar potential of this model, we have found that other vacua are also possible. We have shown that a large region of parameter space exists where the desired vacuum of this model is the global minimum. We have studied the implications of the scalar sector of this model for the signal strength of the Higgs to diphoton decay. After evaluating this quantity, we have found that its current experimental values can be fitted in this model. Lastly, we have studied the possibility of making any of the additional scalar fields of this model a candidate for dark matter.
Submitted 7 July, 2023; v1 submitted 8 March, 2023;
originally announced March 2023.
-
A Deep Learning Perspective on Network Routing
Authors:
Yarin Perry,
Felipe Vieira Frujeri,
Chaim Hoch,
Srikanth Kandula,
Ishai Menache,
Michael Schapira,
Aviv Tamar
Abstract:
Routing is, arguably, the most fundamental task in computer networking, and the most extensively studied one. A key challenge for routing in real-world environments is the need to contend with uncertainty about future traffic demands. We present a new approach to routing under demand uncertainty: tackling this challenge as stochastic optimization, and employing deep learning to learn complex patterns in traffic demands. We show that our method provably converges to the global optimum in well-studied theoretical models of multicommodity flow. We exemplify the practical usefulness of our approach by zooming in on the real-world challenge of traffic engineering (TE) on wide-area networks (WANs). Our extensive empirical evaluation on real-world traffic and network topologies establishes that our approach's TE quality almost matches that of an (infeasible) omniscient oracle, outperforming previously proposed approaches, and also substantially lowers runtimes.
Submitted 5 March, 2023; v1 submitted 1 March, 2023;
originally announced March 2023.
-
CLR-GAM: Contrastive Point Cloud Learning with Guided Augmentation and Feature Mapping
Authors:
Srikanth Malla,
Yi-Ting Chen
Abstract:
Point cloud data plays an essential role in robotics and self-driving applications. Yet, annotating point cloud data is time-consuming and nontrivial, even though annotations enable learning discriminative 3D representations that empower downstream tasks, such as classification and segmentation. Recently, contrastive learning-based frameworks have shown promising results for learning 3D representations in a self-supervised manner. However, existing contrastive learning methods cannot precisely encode and associate structural features or search the higher-dimensional augmentation space efficiently. In this paper, we present CLR-GAM, a novel contrastive learning-based framework with Guided Augmentation (GA) for an efficient dynamic exploration strategy and Guided Feature Mapping (GFM) for similar structural feature association between augmented point clouds. We empirically demonstrate that the proposed approach achieves state-of-the-art performance on both simulated and real-world 3D point cloud datasets for three different downstream tasks, i.e., 3D point cloud classification, few-shot learning, and object part segmentation.
Submitted 27 February, 2023;
originally announced February 2023.
-
Local dualisable objects in local algebra
Authors:
Dave Benson,
Srikanth B. Iyengar,
Henning Krause,
Julia Pevtsova
Abstract:
We discuss dualisable objects in minimal subcategories of compactly generated tensor triangulated categories, paying special attention to the derived category of a commutative noetherian ring. A cohomological criterion for detecting these local dualisable objects is established. Generalisations to other related contexts are discussed.
Submitted 16 February, 2023;
originally announced February 2023.
-
Towards Optimal Depth-Reductions for Algebraic Formulas
Authors:
Hervé Fournier,
Nutan Limaye,
Guillaume Malod,
Srikanth Srinivasan,
Sébastien Tavenas
Abstract:
Classical results of Brent, Kuck and Maruyama (IEEE Trans. Computers 1973) and Brent (JACM 1974) show that any algebraic formula of size s can be converted to one of depth O(log s) with only a polynomial blow-up in size. In this paper, we consider a fine-grained version of this result depending on the degree of the polynomial computed by the algebraic formula. Given a homogeneous algebraic formula of size s computing a polynomial P of degree d, we show that P can also be computed by an (unbounded fan-in) algebraic formula of depth O(log d) and size poly(s). Our proof shows that this result also holds in the highly restricted setting of monotone, non-commutative algebraic formulas. This improves on previous results in the regime when d is small (i.e., d<<s). In particular, for the setting of d=O(log s), along with a result of Raz (STOC 2010, JACM 2013), our result implies the same depth reduction even for inhomogeneous formulas. This is particularly interesting in light of recent algebraic formula lower bounds, which work precisely in this ``low-degree'' and ``low-depth'' setting. We also show that these results cannot be improved in the monotone setting, even for commutative formulas.
Submitted 14 February, 2023;
originally announced February 2023.
-
Near-Optimal Non-Convex Stochastic Optimization under Generalized Smoothness
Authors:
Zijian Liu,
Srikanth Jagabathula,
Zhengyuan Zhou
Abstract:
The generalized smoothness condition, $(L_{0},L_{1})$-smoothness, has attracted interest because, as both empirical and theoretical evidence shows, it is more realistic in many optimization problems. Two recent works established the $O(ε^{-3})$ sample complexity to obtain an $O(ε)$-stationary point. However, both require a large batch size on the order of $\mathrm{poly}(ε^{-1})$, which is not only computationally burdensome but also unsuitable for streaming applications. Additionally, these existing convergence bounds are established only for the expected rate, which is inadequate as they do not supply a useful performance guarantee on a single run. In this work, we solve the prior two problems simultaneously by revisiting a simple variant of the STORM algorithm. Specifically, under the $(L_{0},L_{1})$-smoothness and affine-type noises, we establish the first near-optimal $O(\log(1/(δε))ε^{-3})$ high-probability sample complexity where $δ\in(0,1)$ is the failure probability. Besides, for the same algorithm, we also recover the optimal $O(ε^{-3})$ sample complexity for the expected convergence with improved dependence on the problem-dependent parameter. More importantly, our convergence results only require a constant batch size in contrast to the previous works.
Submitted 27 October, 2023; v1 submitted 12 February, 2023;
originally announced February 2023.
-
Analyzing DCTCP and Cubic Buffer Sharing under Diverse Router Configurations
Authors:
Santiago Vargas,
Aruna Balasubramanian,
Srikanth Sundaresan
Abstract:
In this work, we look at the impact of router configurations on DCTCP and Cubic traffic when both algorithms share router buffers in the data center. Modern data centers host traffic with mixed congestion controls, including DCTCP and Cubic traffic. Both DCTCP and Cubic in the data center can compete with each other and potentially starve and/or be unfair to each other when sharing buffer space in the data center. This happens since both algorithms are at odds with each other in terms of buffer utilization paradigms where DCTCP attempts to limit buffer utilization while Cubic generally fills buffers to obtain high throughput. As a result, we propose methods for a measurement-driven analysis of DCTCP and Cubic performance when sharing buffers in data center routers via simulation. We run around 10000 simulation experiments with unique router configurations and network conditions. Afterwards, we present a generalizable ML model to capture the effect that different buffer settings have on DCTCP and Cubic streaming traffic in the data center. Finally, we suggest that this model can be used to tune buffer settings in the data center.
Submitted 11 February, 2023;
originally announced February 2023.
-
StriderNET: A Graph Reinforcement Learning Approach to Optimize Atomic Structures on Rough Energy Landscapes
Authors:
Vaibhav Bihani,
Sahil Manchanda,
Srikanth Sastry,
Sayan Ranu,
N. M. Anoop Krishnan
Abstract:
Optimization of atomic structures presents a challenging problem, due to their highly rough and non-convex energy landscape, with wide applications in the fields of drug design, materials discovery, and mechanics. Here, we present a graph reinforcement learning approach, StriderNET, that learns a policy to displace the atoms towards low energy configurations. We evaluate the performance of StriderNET on three complex atomic systems, namely, binary Lennard-Jones particles, calcium silicate hydrates gel, and disordered silicon. We show that StriderNET outperforms all classical optimization algorithms and enables the discovery of a lower energy minimum. In addition, StriderNET exhibits a higher rate of reaching minima with energies, as confirmed by the average over multiple realizations. Finally, we show that StriderNET exhibits inductivity to unseen system sizes that are an order of magnitude different from the training system.
Submitted 29 January, 2023;
originally announced January 2023.
-
Kinetic reconstruction of free energies as a function of multiple order parameters
Authors:
Yagyik Goswami,
Srikanth Sastry
Abstract:
A vast array of phenomena, ranging from chemical reactions to phase transformations, are analysed in terms of a free energy surface defined with respect to a single or multiple order parameters. Enhanced sampling methods are typically used, especially in the presence of large free energy barriers, to estimate free energies using biasing protocols and sampling of transition paths. Kinetic reconstructions of free energy barriers of intermediate height have been performed, with respect to a single order parameter, employing the steady state properties of unconstrained simulation trajectories when barrier crossing is achievable with reasonable computational effort. Considering such cases, we describe a method to estimate free energy surfaces with respect to multiple order parameters from a steady state ensemble of trajectories. The approach applies to cases where the transition rates between the pairs of order parameter values considered are not affected by the presence of an absorbing boundary, whereas the macroscopic fluxes and sampling probabilities are. We demonstrate the applicability of our prescription on different test cases of random walkers executing Brownian motion in order parameter space with an underlying (free) energy landscape and discuss strategies to improve numerical estimates of the fluxes and sampling. We next use this approach to reconstruct the free energy surface for supercooled liquid silicon with respect to the degree of crystallinity and density, from unconstrained molecular dynamics simulations, and obtain results quantitatively consistent with earlier results from umbrella sampling.
Submitted 29 January, 2023;
originally announced January 2023.
-
Discontinuous rigidity transition associated with shear jamming in granular simulations
Authors:
Varghese Babu,
H. A Vinutha,
Dapeng Bi,
Srikanth Sastry
Abstract:
We investigate the rigidity transition associated with shear jamming in frictionless, as well as frictional, disk packings in the quasi-static regime and at low shear rates. For frictionless disks, the transition under quasi-static shear is discontinuous, with an instantaneous emergence of a system-spanning rigid cluster at the jamming transition. For frictional systems, the transition appears continuous for finite shear rates, but becomes sharper for lower shear rates. In the quasi-static limit, it is discontinuous as in the frictionless case. Thus, our results show that the rigidity transition associated with shear jamming is discontinuous, as demonstrated in the past for isotropic jamming of frictionless particles, and therefore a unifying feature of the jamming transition in general.
Submitted 28 January, 2023;
originally announced January 2023.
-
Sea ice motion as a stochastic process
Authors:
Srikanth Toppaladoddi
Abstract:
We use tools from statistical physics to develop a stochastic theory for the drift of a single Arctic sea-ice floe. Floe-floe interactions are modelled using a Coulomb friction term, with any change in the thickness or the size of the ice floe due to phase change and/or mechanical deformation being neglected. We obtain a Langevin equation for the fluctuating velocity and the corresponding Fokker-Planck equation for its probability density function (PDF). For values of ice compactness close to unity, the stationary PDFs for the individual components of the fluctuating velocity are found to be the Laplace distribution, in agreement with observations. A possible way of obtaining a more general model that accounts for thermal growth and mechanical deformation is also discussed.
Submitted 30 May, 2023; v1 submitted 1 January, 2023;
originally announced January 2023.
-
Effectiveness of Text, Acoustic, and Lattice-based representations in Spoken Language Understanding tasks
Authors:
Esaú Villatoro-Tello,
Srikanth Madikeri,
Juan Zuluaga-Gomez,
Bidisha Sharma,
Seyyed Saeed Sarfjoo,
Iuliia Nigmatulina,
Petr Motlicek,
Alexei V. Ivanov,
Aravind Ganapathiraju
Abstract:
In this paper, we perform an exhaustive evaluation of different representations to address the intent classification problem in a Spoken Language Understanding (SLU) setup. We benchmark three types of systems to perform the SLU intent detection task: 1) text-based, 2) lattice-based, and a novel 3) multimodal approach. Our work provides a comprehensive analysis of what could be the achievable performance of different state-of-the-art SLU systems under different circumstances, e.g., automatically- vs. manually-generated transcripts. We evaluate the systems on the publicly available SLURP spoken language resource corpus. Our results indicate that using richer forms of Automatic Speech Recognition (ASR) outputs, namely word-consensus-networks, allows the SLU system to improve in comparison to the 1-best setup (5.5% relative improvement). However, crossmodal approaches, i.e., learning from acoustic and text embeddings, obtain performance similar to the oracle setup, a relative improvement of 17.8% over the 1-best configuration, making them a recommended alternative to overcome the limitations of working with automatically generated transcripts.
Submitted 17 March, 2023; v1 submitted 16 December, 2022;
originally announced December 2022.
-
Simulations of star forming main sequence galaxies in Milgromian gravity
Authors:
Srikanth T. Nagesh,
Pavel Kroupa,
Indranil Banik,
Benoit Famaey,
Neda Ghafourian,
Mahmood Roshan,
Ingo Thies,
Hongsheng Zhao,
Nils Wittenburg
Abstract:
We conduct hydrodynamical MOND simulations of isolated disc galaxies over the stellar mass range $M_{\star}/M_\odot = 10^7 - 10^{11}$ using the adaptive mesh refinement code \textsc{phantom of ramses} (\textsc{por}), an adaptation of the \textsc{ramses} code with a Milgromian gravity solver. The scale lengths and gas fractions are based on observed galaxies, and the simulations are run for 5~Gyr. The main aim is to see whether existing sub-grid physics prescriptions for star formation and stellar feedback reproduce the observed main sequence and reasonably match the Kennicutt-Schmidt relation that captures how the local and global star formation rates relate to other properties. Star formation in the models starts soon after initialisation and continues as the models evolve. The initialized galaxies indeed evolve to a state which is on the observed main sequence, and reasonably matches the Kennicutt-Schmidt relation. The available formulation of sub-grid physics is therefore adequate and leads to galaxies that largely behave like observed galaxies, grow in radius, and have flat rotation curves $-$ provided we use Milgromian gravitation. Furthermore, the strength of the bars tends to be inversely correlated with the stellar mass of the galaxy, whereas the bar length strongly correlates with the stellar mass. Irrespective of the mass, the bar pattern speed stays constant with time, indicating that dynamical friction does not affect the bar dynamics. The models demonstrate Renzo's rule and form structures at large radii, much as in real galaxies. In this framework, baryonic physics is thus sufficiently understood to not pose major uncertainties in our modelling of global galaxy properties.
Submitted 14 December, 2022;
originally announced December 2022.
-
UNet Based Pipeline for Lung Segmentation from Chest X-Ray Images
Authors:
Shashank Shekhar,
Ritika Nandi,
H Srikanth Kamath
Abstract:
Biomedical image segmentation is one of the fastest growing fields which has seen extensive automation through the use of Artificial Intelligence. This has enabled widespread adoption of accurate techniques to expedite the screening and diagnostic processes which would otherwise take several days to finalize. In this paper, we present an end-to-end pipeline to segment lungs from chest X-ray images, training the neural network model on the Japanese Society of Radiological Technology (JSRT) dataset, using UNet to enable faster processing of initial screening for various lung disorders. The pipeline developed can be readily used by medical centers with just the provision of X-Ray images as input. The model will perform the preprocessing, and provide a segmented image as the final output. It is expected that this will drastically reduce the manual effort involved and lead to greater accessibility in resource-constrained locations.
Submitted 8 December, 2022;
originally announced December 2022.
-
Seasonal evolution of the Arctic sea ice thickness distribution
Authors:
Srikanth Toppaladoddi,
Woosok Moon,
John S. Wettlaufer
Abstract:
The Thorndike et al., (\emph{J. Geophys. Res.} {\bf 80} 4501, 1975) theory of the ice thickness distribution, $g(h)$, treats the dynamic and thermodynamic aggregate properties of the ice pack in a novel and physically self-consistent manner. Therefore, it has provided the conceptual basis of the treatment of sea-ice thickness categories in climate models. The approach, however, is not mathematically closed due to the treatment of mechanical deformation using the redistribution function $ψ$, the authors noting ``The present theory suffers from a burdensome and arbitrary redistribution function $ψ.$'' Toppaladoddi and Wettlaufer (\emph{Phys. Rev. Lett.} {\bf 115} 148501, 2015) showed how $ψ$ can be written in terms of $g(h)$, thereby solving the mathematical closure problem and writing the theory in terms of a Fokker-Planck equation, which they solved analytically to quantitatively reproduce the observed winter $g(h)$. Here, we extend this approach to include open water by formulating a new boundary condition for their Fokker-Planck equation, which is then coupled to the observationally consistent sea-ice growth model of Semtner (\emph{J. Phys. Oceanogr.} {\bf 6}(3), 379, 1976) to study the seasonal evolution of $g(h)$. We find that as the ice thins, $g(h)$ transitions from a single- to a double-peaked distribution, which is in agreement with observations. To understand the cause of this transition, we construct a simpler description of the system using the equivalent Langevin equation formulation and solve the resulting stochastic ordinary differential equation numerically. Finally, we solve the Fokker-Planck equation for $g(h)$ under different climatological conditions to study the evolution of the open-water fraction.
Submitted 18 March, 2023; v1 submitted 5 December, 2022;
originally announced December 2022.
-
Device Directedness with Contextual Cues for Spoken Dialog Systems
Authors:
Dhanush Bekal,
Sundararajan Srinivasan,
Sravan Bodapati,
Srikanth Ronanki,
Katrin Kirchhoff
Abstract:
In this work, we define barge-in verification as a supervised learning task where audio-only information is used to classify user spoken dialogue into true and false barge-ins. Following the success of pre-trained models, we use low-level speech representations from a self-supervised representation learning model for our downstream classification task. Further, we propose a novel technique to infuse lexical information directly into speech representations to improve the domain-specific language information implicitly learned during pre-training. Experiments conducted on spoken dialog data show that our proposed model trained to validate barge-in entirely from speech representations is faster by 38% relative and achieves 4.5% relative F1 score improvement over a baseline LSTM model that uses both audio and Automatic Speech Recognition (ASR) 1-best hypotheses. On top of this, our best proposed model with lexically infused representations along with contextual features provides a further relative improvement of 5.7% in the F1 score but is only 22% faster than the baseline.
Submitted 23 November, 2022;
originally announced November 2022.
-
Secure Quantum Computing for Healthcare Sector: A Short Analysis
Authors:
P. Srikanth,
Adarsh Kumar
Abstract:
Quantum computing research might lead to "quantum leaps," and it could have unanticipated repercussions in the medical field. This technique has the potential to be used in a broad range of contexts, some of which include the development of novel drugs, the individualization of medical treatments, and the speeding of DNA sequencing. This work has assembled a list of the numerous methodologies presently employed in quantum medicine and other disciplines pertaining to healthcare. This work has created a list of the most critical concerns that need to be addressed before the broad use of quantum computing can be realized. In addition, this work investigates in detail the ways in which potential future applications of quantum computing might compromise the safety of healthcare delivery systems from the perspective of the medical industry and the patient-centric healthcare system. The primary objective of this investigation into quantum cryptography is to locate any potential flaws in the cryptographic protocols and strategies that have only very recently been the focus of scrutiny from academic research community members.
Submitted 17 November, 2022;
originally announced November 2022.
-
Estimation of Appearance and Occupancy Information in Birds Eye View from Surround Monocular Images
Authors:
Sarthak Sharma,
Unnikrishnan R. Nair,
Udit Singh Parihar,
Midhun Menon S,
Srikanth Vidapanakal
Abstract:
Autonomous driving requires efficient reasoning about the location and appearance of the different agents in the scene, which aids in downstream tasks such as object detection, object tracking, and path planning. The past few years have witnessed a surge in approaches that combine the different task-based modules of the classic self-driving stack into an End-to-End (E2E) trainable learning system. These approaches replace perception, prediction, and sensor fusion modules with a single contiguous module with shared latent space embedding, from which one extracts a human-interpretable representation of the scene. One of the most popular representations is the Birds-eye View (BEV), which expresses the location of different traffic participants in the ego vehicle frame from a top-down view. However, a BEV does not capture the chromatic appearance information of the participants. To overcome this limitation, we propose a novel representation that captures the appearance and occupancy information of various traffic participants from an array of monocular cameras covering a 360 deg field of view (FOV). We use a learned image embedding of all camera images to generate a BEV of the scene at any instant that captures both appearance and occupancy of the scene, which can aid in downstream tasks such as object tracking and executing language-based commands. We test the efficacy of our approach on a synthetic dataset generated from CARLA. The code, data set, and results can be found at https://rebrand.ly/APP OCC-results.
Submitted 8 November, 2022;
originally announced November 2022.
-
Flow regimes and types of solid obstacle surface roughness in turbulent heat transfer inside periodic porous media
Authors:
Vishal Srikanth,
Dylan Peverall,
Andrey V. Kuznetsov
Abstract:
The focus of this paper is to systematically study the influence of solid obstacle surface roughness in porous media on the microscale flow physics and report its effect on macroscale drag and Nusselt number. The Reynolds averaged flow field is numerically simulated for a flow through a periodic porous medium consisting of an in-line arrangement of square cylinders with square roughness particles on the cylinder surface. Two flow regimes are identified with respect to the surface roughness particle height: fine and coarse roughness regimes. The effect of the roughness particles in the fine roughness regime is limited to the near-wall boundary layer around the solid obstacle surface. In the coarse roughness regime, the roughness particles modify the microscale flow field in the entire pore space of the porous medium. In the fine roughness regime, the heat transfer from the rough solid obstacles to the fluid inside the porous medium is less than that from a smooth solid obstacle. In the coarse roughness regime, there is an enhancement in the heat transfer from the rough solid obstacle to the fluid inside the porous medium. Total drag reduction is also observed in the fine roughness regime for the smallest roughness particle height. The surface roughness particle spacing determines the fractional area of the solid obstacle surface covered by recirculating, reattached, and stagnating flow. As the roughness particle spacing increases, there are two competing factors for the heat transfer rate: increase due to more surface area covered by reattached flow, and decrease due to the decrease in the number of roughness particles on the solid obstacle surface. Decreasing the porosity and increasing the Reynolds number amplify the effect of the surface roughness on the microscale flow.
Submitted 2 February, 2023; v1 submitted 28 October, 2022;
originally announced October 2022.
-
Towards Personalization of CTC Speech Recognition Models with Contextual Adapters and Adaptive Boosting
Authors:
Saket Dingliwal,
Monica Sunkara,
Sravan Bodapati,
Srikanth Ronanki,
Jeff Farris,
Katrin Kirchhoff
Abstract:
End-to-end speech recognition models trained using joint Connectionist Temporal Classification (CTC)-Attention loss have gained popularity recently. In these models, a non-autoregressive CTC decoder is often used at inference time due to its speed and simplicity. However, such models are hard to personalize because of their conditional independence assumption, which prevents output tokens from previous time steps from influencing future predictions. To tackle this, we propose a novel two-way approach that first biases the encoder with attention over a predefined list of rare long-tail and out-of-vocabulary (OOV) words and then uses dynamic boosting and a phone alignment network during decoding to further bias the subword predictions. We evaluate our approach on open-source VoxPopuli and in-house medical datasets to showcase a 60% improvement in F1 score on domain-specific rare words over a strong CTC baseline.
Submitted 13 November, 2022; v1 submitted 17 October, 2022;
originally announced October 2022.
-
Cross-domain Variational Capsules for Information Extraction
Authors:
Akash Nagaraj,
Akhil K,
Akshay Venkatesh,
Srikanth HR
Abstract:
In this paper, we present a characteristic extraction algorithm and the Multi-domain Image Characteristics Dataset of characteristic-tagged images to simulate the way a human brain classifies cross-domain information and generates insight. The intent was to identify prominent characteristics in data and use this identification mechanism to auto-generate insight from data in other unseen domains. An information extraction algorithm is proposed which is a combination of Variational Autoencoders (VAEs) and Capsule Networks. Capsule Networks are used to decompose images into their individual features and VAEs are used to explore variations on these decomposed features. This makes the model robust in recognizing characteristics from variations of the data. A noteworthy point is that the algorithm uses efficient hierarchical decoding of data, which helps in richer output interpretation. Noticing a dearth of datasets that contain visible characteristics in images belonging to various domains, we created the Multi-domain Image Characteristics Dataset and made it publicly available. It consists of thousands of images across three domains. This dataset was created with the intent of introducing a new benchmark for fine-grained characteristic recognition tasks in the future.
Submitted 13 October, 2022;
originally announced October 2022.
-
Homological dimensions of the Jacobson radical
Authors:
Xiao-Wu Chen,
Srikanth B. Iyengar,
René Marczinzik
Abstract:
This work presents results on the finiteness, and on the symmetry properties, of various homological dimensions associated to the Jacobson radical and its higher syzygies, of a semiperfect ring.
Submitted 16 October, 2022;
originally announced October 2022.
-
Online Multi Camera-IMU Calibration
Authors:
Jacob Hartzer,
Srikanth Saripalli
Abstract:
Visual-inertial navigation systems are powerful in their ability to accurately estimate localization of mobile systems within complex environments that preclude the use of global navigation satellite systems. However, these navigation systems are reliant on accurate and up-to-date temporospatial calibrations of the sensors being used. As such, online estimators for these parameters are useful in resilient systems. This paper presents an extension to existing Kalman filter-based frameworks for estimating and calibrating the extrinsic parameters of multi-camera IMU systems. In addition to extending the filter framework to include multiple camera sensors, the measurement model was reformulated to make use of measurement data that is typically made available in fiducial detection software. A secondary filter layer was used to estimate time translation parameters without closed-loop feedback of sensor data. Experimental calibration results, including the use of cameras with non-overlapping fields of view, were used to validate the stability and accuracy of the filter formulation when compared to offline methods. Finally, the generalized filter code has been open-sourced and is available online.
Submitted 7 October, 2023; v1 submitted 28 September, 2022;
originally announced September 2022.
-
Towards Human-Compatible XAI: Explaining Data Differentials with Concept Induction over Background Knowledge
Authors:
Cara Widmer,
Md Kamruzzaman Sarker,
Srikanth Nadella,
Joshua Fiechter,
Ion Juvina,
Brandon Minnery,
Pascal Hitzler,
Joshua Schwartz,
Michael Raymer
Abstract:
Concept induction, which is based on formal logical reasoning over description logics, has been used in ontology engineering in order to create ontology (TBox) axioms from the base data (ABox) graph. In this paper, we show that it can also be used to explain data differentials, for example in the context of Explainable AI (XAI), and we show that it can in fact be done in a way that is meaningful to a human observer. Our approach utilizes a large class hierarchy, curated from the Wikipedia category hierarchy, as background knowledge.
Submitted 27 September, 2022;
originally announced September 2022.
-
CAMEL: Learning Cost-maps Made Easy for Off-road Driving
Authors:
Kasi Vishwanath,
P. B. Sujit,
Srikanth Saripalli
Abstract:
Cost-maps are used by robotic vehicles to plan collision-free paths. The cost associated with each cell in the map represents the sensed environment information, which is often determined manually after several trial-and-error efforts. In off-road environments, due to the presence of several types of features, it is challenging to handcraft the cost values associated with each feature. Moreover, different handcrafted cost values can lead to different paths for the same environment, which is not desirable. In this paper, we address the problem of learning the cost-map values from the sensed environment for robust vehicle path planning. We propose a novel deep learning framework called CAMEL that learns the parameters through demonstrations, yielding an adaptive and robust cost-map for path planning. CAMEL has been trained on multi-modal datasets such as RELLIS-3D. The evaluation of CAMEL is carried out on an off-road scene simulator (MAVS) and on field data from the IISER-B campus. We also perform a real-world implementation of CAMEL on a ground rover. The results show flexible and robust motion of the vehicle without collisions in unstructured terrains.
Submitted 18 October, 2022; v1 submitted 26 September, 2022;
originally announced September 2022.
-
DRAMA: Joint Risk Localization and Captioning in Driving
Authors:
Srikanth Malla,
Chiho Choi,
Isht Dwivedi,
Joon Hee Choi,
Jiachen Li
Abstract:
Considering the functionality of situational awareness in safety-critical automation systems, the perception of risk in driving scenes and its explainability is of particular importance for autonomous and cooperative driving. Toward this goal, this paper proposes a new research direction of joint risk localization in driving scenes and its risk explanation as a natural language description. Due to the lack of standard benchmarks, we collected a large-scale dataset, DRAMA (Driving Risk Assessment Mechanism with A captioning module), which consists of 17,785 interactive driving scenarios collected in Tokyo, Japan. Our DRAMA dataset accommodates video- and object-level questions on driving risks with associated important objects to achieve the goal of visual captioning as a free-form language description utilizing closed and open-ended responses for multi-level questions, which can be used to evaluate a range of visual captioning capabilities in driving scenarios. We make this data available to the community for further research. Using DRAMA, we explore multiple facets of joint risk localization and captioning in interactive driving scenarios. In particular, we benchmark various multi-task prediction architectures and provide a detailed analysis of joint risk localization and risk captioning. The data set is available at https://usa.honda-ri.com/drama
Submitted 5 October, 2022; v1 submitted 21 September, 2022;
originally announced September 2022.
-
Leveraging Local Patch Differences in Multi-Object Scenes for Generative Adversarial Attacks
Authors:
Abhishek Aich,
Shasha Li,
Chengyu Song,
M. Salman Asif,
Srikanth V. Krishnamurthy,
Amit K. Roy-Chowdhury
Abstract:
State-of-the-art generative model-based attacks against image classifiers overwhelmingly focus on single-object (i.e., single dominant object) images. Different from such settings, we tackle a more practical problem of generating adversarial perturbations using multi-object (i.e., multiple dominant objects) images as they are representative of most real-world scenes. Our goal is to design an attack strategy that can learn from such natural scenes by leveraging the local patch differences that occur inherently in such images (e.g. difference between the local patch on the object `person' and the object `bike' in a traffic scene). Our key idea is to misclassify an adversarial multi-object image by confusing the victim classifier for each local patch in the image. Based on this, we propose a novel generative attack (called Local Patch Difference or LPD-Attack) where a novel contrastive loss function uses the aforesaid local differences in feature space of multi-object scenes to optimize the perturbation generator. Through various experiments across diverse victim convolutional neural networks, we show that our approach outperforms baseline generative attacks with highly transferable perturbations when evaluated under different white-box and black-box settings.
Submitted 3 October, 2022; v1 submitted 20 September, 2022;
originally announced September 2022.
-
GAMA: Generative Adversarial Multi-Object Scene Attacks
Authors:
Abhishek Aich,
Calvin-Khang Ta,
Akash Gupta,
Chengyu Song,
Srikanth V. Krishnamurthy,
M. Salman Asif,
Amit K. Roy-Chowdhury
Abstract:
The majority of methods for crafting adversarial attacks have focused on scenes with a single dominant object (e.g., images from ImageNet). On the other hand, natural scenes include multiple dominant objects that are semantically related. Thus, it is crucial to explore designing attack strategies that look beyond learning on single-object scenes or attacking single-object victim classifiers. Because generative models offer strong transferability of perturbations to unknown models, this paper presents the first approach to using them for adversarial attacks on multi-object scenes. In order to represent the relationships between different objects in the input scene, we leverage the open-source pre-trained vision-language model CLIP (Contrastive Language-Image Pre-training), with the motivation to exploit the encoded semantics in the language space along with the visual space. We call this attack approach Generative Adversarial Multi-object scene Attacks (GAMA). GAMA demonstrates the utility of the CLIP model as an attacker's tool to train formidable perturbation generators for multi-object scenes. Using the joint image-text features to train the generator, we show that GAMA can craft potent transferable perturbations in order to fool victim classifiers in various attack settings. For example, GAMA triggers ~16% more misclassification than state-of-the-art generative approaches in black-box settings where both the classifier architecture and data distribution of the attacker are different from the victim. Our code is available here: https://abhishekaich27.github.io/gama.html
Submitted 15 October, 2022; v1 submitted 20 September, 2022;
originally announced September 2022.
-
Lim Ulrich sequences and Boij-Söderberg cones
Authors:
Srikanth B. Iyengar,
Linquan Ma,
Mark E. Walker
Abstract:
This paper extends the results of Boij, Eisenbud, Erman, Schreyer, and Söderberg on the structure of Betti cones of finitely generated graded modules and finite free complexes over polynomial rings, to all finitely generated graded rings admitting linear Noether normalizations. The key new input is the existence of lim Ulrich sequences of graded modules over such rings.
Submitted 29 September, 2024; v1 submitted 7 September, 2022;
originally announced September 2022.
-
Freeness of Hecke modules at non-minimal levels
Authors:
Srikanth B. Iyengar,
Chandrashekhar B. Khare,
Jeffrey Manning
Abstract:
We build on the results of [6] to show that the homology groups $\mathrm{H}_{r_1+r_2}(Y_0(\mathcal{N}_Σ),\mathcal{O})_{\mathfrak{m}_Σ}$ of arithmetic manifolds are free over certain deformation rings $R_Σ$, when there are enough geometric characteristic 0 representations. Hitherto we had proved that the homology group has a nonzero free $R_Σ$-direct summand. The new ingredient is a commutative algebra argument involving congruence modules defined in higher codimension in [6].
Submitted 27 August, 2022;
originally announced August 2022.
-
Blackbox Attacks via Surrogate Ensemble Search
Authors:
Zikui Cai,
Chengyu Song,
Srikanth Krishnamurthy,
Amit Roy-Chowdhury,
M. Salman Asif
Abstract:
Blackbox adversarial attacks can be categorized into transfer- and query-based attacks. Transfer methods do not require any feedback from the victim model, but provide lower success rates compared to query-based methods. Query attacks often require a large number of queries for success. To achieve the best of both approaches, recent efforts have tried to combine them, but still require hundreds of queries to achieve high success rates (especially for targeted attacks). In this paper, we propose a novel method for Blackbox Attacks via Surrogate Ensemble Search (BASES) that can generate highly successful blackbox attacks using an extremely small number of queries. We first define a perturbation machine that generates a perturbed image by minimizing a weighted loss function over a fixed set of surrogate models. To generate an attack for a given victim model, we search over the weights in the loss function using queries generated by the perturbation machine. Since the dimension of the search space is small (same as the number of surrogate models), the search requires a small number of queries. We demonstrate that our proposed method achieves a better success rate with at least 30x fewer queries compared to state-of-the-art methods on different image classifiers trained with ImageNet. In particular, our method requires as few as 3 queries per image (on average) to achieve more than a 90% success rate for targeted attacks and 1-2 queries per image for over a 99% success rate for untargeted attacks. Our method is also effective on Google Cloud Vision API and achieved a 91% untargeted attack success rate with 2.9 queries per image. We also show that the perturbations generated by our proposed method are highly transferable and can be adopted for hard-label blackbox attacks. Finally, we show the effectiveness of BASES for hiding attacks on object detectors.
Submitted 23 November, 2022; v1 submitted 6 August, 2022;
originally announced August 2022.
-
Supergranular Fractal Dimension and Solar Rotation
Authors:
Sowmya G. M.,
Rajani G.,
U. Paniveni,
R. Srikanth
Abstract:
We present findings from an analysis of the fractal dimension of solar supergranulation as a function of latitude, supergranular cell size and solar rotation, employing spectroheliographic data in the Ca II K line of solar cycle no. 23. We find that the fractal dimension tends to decrease from about 1.37 at the equator to about 1 at 20 degree latitude in either hemisphere, suggesting that solar rotation rate has the effect of augmenting the irregularity of supergranular boundaries. Considering that supergranular cell size is directly correlated with fractal dimension, we conclude that the mechanism behind our observation is that solar rotation influences the cell outflow strength, and thereby cell size, with the latitude dependence of the supergranular fractal dimension being a consequence thereof.
Submitted 21 July, 2022;
originally announced July 2022.
-
NESC: Robust Neural End-2-End Speech Coding with GANs
Authors:
Nicola Pia,
Kishan Gupta,
Srikanth Korse,
Markus Multrus,
Guillaume Fuchs
Abstract:
Neural networks have proven to be a formidable tool to tackle the problem of speech coding at very low bit rates. However, the design of a neural coder that can be operated robustly under real-world conditions remains a major challenge. Therefore, we present Neural End-2-End Speech Codec (NESC), a robust, scalable end-to-end neural speech codec for high-quality wideband speech coding at 3 kbps. The encoder uses a new architecture configuration, which relies on our proposed Dual-PathConvRNN (DPCRNN) layer, while the decoder architecture is based on our previous work, Streamwise-StyleMelGAN. Our subjective listening tests on clean and noisy speech show that NESC is particularly robust to unseen conditions and signal perturbations.
Submitted 7 July, 2022;
originally announced July 2022.
-
Analyzing the impact of SARS-CoV-2 variants on respiratory sound signals
Authors:
Debarpan Bhattacharya,
Debottam Dutta,
Neeraj Kumar Sharma,
Srikanth Raj Chetupalli,
Pravin Mote,
Sriram Ganapathy,
Chandrakiran C,
Sahiti Nori,
Suhail K K,
Sadhana Gonuguntla,
Murali Alagesan
Abstract:
The COVID-19 outbreak resulted in multiple waves of infections that have been associated with different SARS-CoV-2 variants. Studies have reported a differential impact of the variants on the respiratory health of patients. We explore whether acoustic signals, collected from COVID-19 subjects, show computationally distinguishable acoustic patterns, suggesting the possibility of predicting the underlying virus variant. We analyze the Coswara dataset, which is collected from three subject pools, namely, i) healthy subjects, ii) COVID-19 subjects recorded during the delta variant dominant period, and iii) COVID-19 subjects recorded during the omicron surge. Our findings suggest that multiple sound categories, such as cough, breathing, and speech, indicate significant acoustic feature differences when comparing COVID-19 subjects with omicron and delta variants. The classification areas-under-the-curve are significantly above chance for differentiating subjects infected by omicron from those infected by delta. Using a score fusion from multiple sound categories, we obtained an area-under-the-curve of 89% and 52.4% sensitivity at 95% specificity. Additionally, a hierarchical three-class approach was used to classify the acoustic data into healthy and COVID-19 positive, and further to classify COVID-19 subjects into delta and omicron variants, providing a high level of 3-class classification accuracy. These results suggest new ways of designing sound-based COVID-19 diagnosis approaches.
Submitted 24 June, 2022;
originally announced June 2022.
-
Contrastive Learning of Features between Images and LiDAR
Authors:
Peng Jiang,
Srikanth Saripalli
Abstract:
Images and point clouds provide different information for robots. Finding the correspondences between data from different sensors is crucial for various tasks such as localization, mapping, and navigation. Learning-based descriptors have been developed for single sensors, but there is little work on cross-modal features. This work treats learning cross-modal features as a dense contrastive learning problem. We propose a Tuple-Circle loss function for cross-modality feature learning. Furthermore, to learn good features and not lose generality, we developed a variant of the widely used PointNet++ architecture for point clouds and of the U-Net CNN architecture for images. Moreover, we conduct experiments on a real-world dataset to show the effectiveness of our loss function and network structure. By visualizing the features, we show that our models indeed learn information from both images and LiDAR.
Submitted 24 June, 2022;
originally announced June 2022.
-
Congruence modules and the Wiles-Lenstra-Diamond numerical criterion in higher codimensions
Authors:
Srikanth B. Iyengar,
Chandrashekhar B. Khare,
Jeffrey Manning
Abstract:
We define a congruence module $Ψ_A(M)$ associated to a surjective $\mathcal O$-algebra morphism $λ\colon A \to \mathcal{O}$, with $\mathcal{O}$ a discrete valuation ring, $A$ a complete noetherian local $\mathcal{O}$-algebra regular at $\mathfrak{p}$, the kernel of $λ$, and $M$ a finitely generated $A$-module. We establish a numerical criterion for $M$ to have a free direct summand over $A$ of positive rank. It is in terms of the lengths of $Ψ_A(M)$ and the torsion part of $\mathfrak{p}/\mathfrak{p}^2$. It generalizes results of Wiles, Lenstra, and Diamond, that deal with the case when the codimension of $\mathfrak{p}$ is zero.
Number theoretic applications include integral (non-minimal) $R=\mathbb T$ theorems in situations of positive defect conditional on certain standard conjectures. Here $R$ is a deformation ring parametrizing certain Galois representations and $\mathbb T$ is a Hecke algebra. An example is a modularity lifting for 2-dimensional $\ell$-adic Galois representations over an imaginary quadratic field. The proofs combine our commutative algebra results with a generalization due to Calegari and Geraghty of the patching method of Wiles and Taylor--Wiles and level raising arguments that go back to Ribet. The results provide new evidence in favor of the intriguing, and as yet fledgling, torsion analog of the classical Langlands correspondence.
We also prove unconditional integral $R=\mathbb T$ results for Hecke algebras $\mathbb T$ acting on weight one cohomology of Shimura curves over $\mathbb Q$. This leads to a torsion Jacquet--Langlands correspondence comparing integral Hecke algebras acting on weight one cohomology of Shimura curves and modular curves. In this case the cohomology has abundant torsion and so our correspondence cannot be deduced by means of the classical Jacquet--Langlands correspondence.
Submitted 29 September, 2024; v1 submitted 16 June, 2022;
originally announced June 2022.