Search | arXiv e-print repository

Granite-speech: open-source speech-aware LLMs with strong English ASR capabilities

Authors: George Saon, Avihu Dekel, Alexander Brooks, Tohru Nagano, Abraham Daniels, Aharon Satt, Ashish Mittal, Brian Kingsbury, David Haws, Edmilson Morais, Gakuto Kurata, Hagai Aronowitz, Ibrahim Ibrahim, Jeff Kuo, Kate Soule, Luis Lastras, Masayuki Suzuki, Ron Hoory, Samuel Thomas, Sashi Novitasari, Takashi Fukuda, Vishal Sunder, Xiaodong Cui, Zvi Kons

Abstract: Granite-speech LLMs are compact and efficient speech language models specifically designed for English ASR and automatic speech translation (AST). The models were trained by modality aligning the 2B and 8B parameter variants of granite-3.3-instruct to speech on publicly available open-source corpora containing audio inputs and text targets consisting of either human transcripts for ASR or automatically generated translations for AST. Comprehensive benchmarking shows that on English ASR, which was our primary focus, they outperform several competitors' models that were trained on orders of magnitude more proprietary data, and they keep pace on English-to-X AST for major European languages, Japanese, and Chinese. The speech-specific components are: a conformer acoustic encoder using block attention and self-conditioning trained with connectionist temporal classification, a windowed query-transformer speech modality adapter used to do temporal downsampling of the acoustic embeddings and map them to the LLM text embedding space, and LoRA adapters to further fine-tune the text LLM. Granite-speech-3.3 operates in two modes: in speech mode, it performs ASR and AST by activating the encoder, projector, and LoRA adapters; in text mode, it calls the underlying granite-3.3-instruct model directly (without LoRA), essentially preserving all the text LLM capabilities and safety. Both models are freely available on HuggingFace (https://huggingface.co/ibm-granite/granite-speech-3.3-2b and https://huggingface.co/ibm-granite/granite-speech-3.3-8b) and can be used for both research and commercial purposes under a permissive Apache 2.0 license.

Submitted 13 May, 2025; v1 submitted 13 May, 2025; originally announced May 2025.

Comments: 7 pages, 9 figures

arXiv:2501.09104 [pdf, other]

A Non-autoregressive Model for Joint STT and TTS

Authors: Vishal Sunder, Brian Kingsbury, George Saon, Samuel Thomas, Slava Shechtman, Hagai Aronowitz, Eric Fosler-Lussier, Luis Lastras

Abstract: In this paper, we take a step towards jointly modeling automatic speech recognition (STT) and speech synthesis (TTS) in a fully non-autoregressive way. We develop a novel multimodal framework capable of handling the speech and text modalities as input either individually or together. The proposed model can also be trained with unpaired speech or text data owing to its multimodal nature. We further propose an iterative refinement strategy to improve the STT and TTS performance of our model such that the partial hypothesis at the output can be fed back to the input of our model, thus iteratively improving both STT and TTS predictions. We show that our joint model can effectively perform both STT and TTS tasks, outperforming the STT-specific baseline in all tasks and performing competitively with the TTS-specific baseline across a wide range of evaluation metrics.

Submitted 20 January, 2025; v1 submitted 15 January, 2025; originally announced January 2025.

Comments: 5 pages, 3 figures, 3 tables

arXiv:2501.01936 [pdf, other]

Improving Transducer-Based Spoken Language Understanding with Self-Conditioned CTC and Knowledge Transfer

Authors: Vishal Sunder, Eric Fosler-Lussier

Abstract: In this paper, we propose to improve end-to-end (E2E) spoken language understanding (SLU) in an RNN transducer model (RNN-T) by incorporating a joint self-conditioned CTC automatic speech recognition (ASR) objective. Our proposed model is akin to an E2E differentiable cascaded model which performs ASR and SLU sequentially and we ensure that the SLU task is conditioned on the ASR task by having CTC self conditioning. This novel joint modeling of ASR and SLU improves SLU performance significantly over just using SLU optimization. We further improve the performance by aligning the acoustic embeddings of this model with the semantically richer BERT model. Our proposed knowledge transfer strategy makes use of a bag-of-entity prediction layer on the aligned embeddings and the output of this is used to condition the RNN-T based SLU decoding. These techniques show significant improvement over several strong baselines and can perform at par with large models like Whisper with significantly fewer parameters.

Submitted 3 January, 2025; originally announced January 2025.

Comments: 8 pages, 4 figures

arXiv:2310.11486 [pdf, other]

End-to-End real time tracking of children's reading with pointer network

Authors: Vishal Sunder, Beulah Karrolla, Eric Fosler-Lussier

Abstract: In this work, we explore how a real time reading tracker can be built efficiently for children's voices. While previously proposed reading trackers focused on ASR-based cascaded approaches, we propose a fully end-to-end model making it less prone to lags in voice tracking. We employ a pointer network that directly learns to predict positions in the ground truth text conditioned on the streaming speech. To train this pointer network, we generate ground truth training signals by using forced alignment between the read speech and the text being read on the training set. Exploring different forced alignment models, we find a neural attention based model is at least as close in alignment accuracy to the Montreal Forced Aligner, but surprisingly is a better training signal for the pointer network. Our results are reported on one adult speech dataset (TIMIT) and two children's speech datasets (CMU Kids and Reading Races). Our best model can accurately track adult speech with 87.8% accuracy and the much harder and disfluent children's speech with 77.1% accuracy on CMU Kids data and 65.3% accuracy on the Reading Races dataset.

Submitted 17 October, 2023; originally announced October 2023.

Comments: 5 pages, 3 figures

arXiv:2204.05188 [pdf, other]

Tokenwise Contrastive Pretraining for Finer Speech-to-BERT Alignment in End-to-End Speech-to-Intent Systems

Authors: Vishal Sunder, Eric Fosler-Lussier, Samuel Thomas, Hong-Kwang J. Kuo, Brian Kingsbury

Abstract: Recent advances in End-to-End (E2E) Spoken Language Understanding (SLU) have been primarily due to effective pretraining of speech representations. One such pretraining paradigm is the distillation of semantic knowledge from state-of-the-art text-based models like BERT to speech encoder neural networks. This work is a step towards doing the same in a much more efficient and fine-grained manner where we align speech embeddings and BERT embeddings on a token-by-token basis. We introduce a simple yet novel technique that uses a cross-modal attention mechanism to extract token-level contextual embeddings from a speech encoder such that these can be directly compared and aligned with BERT based contextual embeddings. This alignment is performed using a novel tokenwise contrastive loss. Fine-tuning such a pretrained model to perform intent recognition using speech directly yields state-of-the-art performance on two widely used SLU datasets. Our model improves further when fine-tuned with additional regularization using SpecAugment especially when speech is noisy, giving an absolute improvement as high as 8% over previous results.

Submitted 1 July, 2022; v1 submitted 11 April, 2022; originally announced April 2022.

Comments: 5 pages, 2 figures

arXiv:2204.05183 [pdf, other]

Building an ASR Error Robust Spoken Virtual Patient System in a Highly Class-Imbalanced Scenario Without Speech Data

Authors: Vishal Sunder, Prashant Serai, Eric Fosler-Lussier

Abstract: A Virtual Patient (VP) is a powerful tool for training medical students to take patient histories, where responding to a diverse set of spoken questions is essential to simulate natural conversations with a student. The performance of such a Spoken Language Understanding system (SLU) can be adversely affected by both the presence of Automatic Speech Recognition (ASR) errors in the test data and a high degree of class imbalance in the SLU training data. While these two issues have been addressed separately in prior work, we develop a novel two-step training methodology that tackles both these issues effectively in a single dialog agent. As it is difficult to collect spoken data from users without a functioning SLU system, our method does not rely on spoken data for training, rather we use an ASR error predictor to "speechify" the text data. Our method shows significant improvements over strong baselines on the VP intent classification task at various word error rate settings.

Submitted 1 July, 2022; v1 submitted 11 April, 2022; originally announced April 2022.

Comments: 5 pages, 3 figures

arXiv:2204.05169 [pdf, other]

Towards End-to-End Integration of Dialog History for Improved Spoken Language Understanding

Authors: Vishal Sunder, Samuel Thomas, Hong-Kwang J. Kuo, Jatin Ganhotra, Brian Kingsbury, Eric Fosler-Lussier

Abstract: Dialog history plays an important role in spoken language understanding (SLU) performance in a dialog system. For end-to-end (E2E) SLU, previous work has used dialog history in text form, which makes the model dependent on a cascaded automatic speech recognizer (ASR). This rescinds the benefits of an E2E system which is intended to be compact and robust to ASR errors. In this paper, we propose a hierarchical conversation model that is capable of directly using dialog history in speech form, making it fully E2E. We also distill semantic knowledge from the available gold conversation transcripts by jointly training a similar text-based conversation model with an explicit tying of acoustic and semantic embeddings. We also propose a novel technique that we call DropFrame to deal with the long training time incurred by adding dialog history in an E2E manner. On the HarperValleyBank dialog dataset, our E2E history integration outperforms a history independent baseline by 7.7% absolute F1 score on the task of dialog action recognition. Our model performs competitively with the state-of-the-art history based cascaded baseline, but uses 48% fewer parameters. In the absence of gold transcripts to fine-tune an ASR model, our model outperforms this baseline by a significant margin of 10% absolute F1 score.

Submitted 11 April, 2022; originally announced April 2022.

Comments: 5 pages, 1 figure

arXiv:2103.12258 [pdf, other]

Hallucination of speech recognition errors with sequence to sequence learning

Authors: Prashant Serai, Vishal Sunder, Eric Fosler-Lussier

Abstract: Automatic Speech Recognition (ASR) is an imperfect process that results in certain mismatches in ASR output text when compared to plain written text or transcriptions. When plain text data is to be used to train systems for spoken language understanding or ASR, a proven strategy to reduce said mismatch and prevent degradations, is to hallucinate what the ASR outputs would be given a gold transcription. Prior work in this domain has focused on modeling errors at the phonetic level, while using a lexicon to convert the phones to words, usually accompanied by an FST Language model. We present novel end-to-end models to directly predict hallucinated ASR word sequence outputs, conditioning on an input word sequence as well as a corresponding phoneme sequence. This improves prior published results for recall of errors from an in-domain ASR system's transcription of unseen data, as well as an out-of-domain ASR system's transcriptions of audio from an unrelated task, while additionally exploring an in-between scenario when limited characterization data from the test ASR system is obtainable. To verify the extrinsic validity of the method, we also use our hallucinated ASR errors to augment training for a spoken question classifier, finding that they enable robustness to real ASR errors in a downstream task, when scarce or even zero task-specific audio was available at train-time.

Submitted 31 March, 2021; v1 submitted 22 March, 2021; originally announced March 2021.

Comments: Submitted to IEEE/ACM Transactions on Audio Speech and Language Processing

arXiv:2010.15090 [pdf, other]

Handling Class Imbalance in Low-Resource Dialogue Systems by Combining Few-Shot Classification and Interpolation

Authors: Vishal Sunder, Eric Fosler-Lussier

Abstract: Utterance classification performance in low-resource dialogue systems is constrained by an inevitably high degree of data imbalance in class labels. We present a new end-to-end pairwise learning framework that is designed specifically to tackle this phenomenon by inducing a few-shot classification capability in the utterance representations and augmenting data through an interpolation of utterance representations. Our approach is a general purpose training methodology, agnostic to the neural architecture used for encoding utterances. We show significant improvements in macro-F1 score over standard cross-entropy training for three different neural architectures, demonstrating improvements on a Virtual Patient dialogue dataset as well as a low-resourced emulation of the Switchboard dialogue act classification dataset.

Submitted 28 October, 2020; originally announced October 2020.

Comments: 5 pages, 4 figures, 3 tables

arXiv:1912.07228 [pdf, ps, other]

Planar algebras, quantum information theory and subfactors

Authors: Vijay Kodiyalam, Sruthymurali, V. S. Sunder

Abstract: We define generalised notions of biunitary elements in planar algebras and show that objects arising in quantum information theory such as Hadamard matrices, quantum latin squares and unitary error bases are all given by biunitary elements in the spin planar algebra. We show that there are natural subfactor planar algebras associated with biunitary elements.

Submitted 16 December, 2019; originally announced December 2019.

Comments: 18 pages, 25 figures

MSC Class: 46L37; 81P45; 81P68

arXiv:1906.02427 [pdf, other]

One-shot Information Extraction from Document Images using Neuro-Deductive Program Synthesis

Authors: Vishal Sunder, Ashwin Srinivasan, Lovekesh Vig, Gautam Shroff, Rohit Rahul

Abstract: Our interest in this paper is in meeting a rapidly growing industrial demand for information extraction from images of documents such as invoices, bills, receipts etc. In practice users are able to provide a very small number of example images labeled with the information that needs to be extracted. We adopt a novel two-level neuro-deductive approach where (a) we use pre-trained deep neural networks to populate a relational database with facts about each document-image; and (b) we use a form of deductive reasoning, related to meta-interpretive learning of transition systems to learn extraction programs: Given task-specific transitions defined using the entities and relations identified by the neural detectors and a small number of instances (usually 1, sometimes 2) of images and the desired outputs, a resource-bounded meta-interpreter constructs proofs for the instance(s) via logical deduction; a set of logic programs that extract each desired entity is easily synthesized from such proofs. In most cases a single training example together with a noisy-clone of itself suffices to learn a program-set that generalizes well on test documents, at which time the value of each entity is determined by a majority vote across its program-set. We demonstrate our two-level neuro-deductive approach on publicly available datasets ("Patent" and "Doctor's Bills") and also describe its use in a real-life industrial problem.

Submitted 6 June, 2019; originally announced June 2019.

Comments: 11 pages, appears in the 13th International Workshop on Neural-Symbolic Learning and Reasoning at IJCAI 2019

arXiv:1901.11191 [pdf, ps, other]

On a presentation of the spin planar algebra

Authors: Vijay Kodiyalam, Sohan Lal Saini, Sruthymurali, V. S. Sunder

Abstract: We define a certain abstract planar algebra by generators and relations, study various aspects of its structure, and then identify it with Jones' spin planar algebra.

Submitted 30 January, 2019; originally announced January 2019.

Comments: 11 pages, 12 figures

MSC Class: 46L37

arXiv:1809.07066 [pdf, other]

Prosocial or Selfish? Agents with different behaviors for Contract Negotiation using Reinforcement Learning

Authors: Vishal Sunder, Lovekesh Vig, Arnab Chatterjee, Gautam Shroff

Abstract: We present an effective technique for training deep learning agents capable of negotiating on a set of clauses in a contract agreement using a simple communication protocol. We use Multi Agent Reinforcement Learning to train both agents simultaneously as they negotiate with each other in the training environment. We also model selfish and prosocial behavior to varying degrees in these agents. Empirical evidence is provided showing consistency in agent behaviors. We further train a meta agent with a mixture of behaviors by learning an ensemble of different models using reinforcement learning. Finally, to ascertain the deployability of the negotiating agents, we conducted experiments pitting the trained agents against human players. Results demonstrate that the agents are able to hold their own against human players, often emerging as winners in the negotiation. Our experiments demonstrate that the meta agent is able to reasonably emulate human behavior.

Submitted 19 September, 2018; originally announced September 2018.

Comments: Proceedings of the 11th International Workshop on Automated Negotiations (held in conjunction with IJCAI 2018)

arXiv:1804.01000 [pdf, other]

CIKM AnalytiCup 2017 Lazada Product Title Quality Challenge An Ensemble of Deep and Shallow Learning to predict the Quality of Product Titles

Authors: Karamjit Singh, Vishal Sunder

Abstract: We present an approach where two different models (Deep and Shallow) are trained separately on the data and a weighted average of the outputs is taken as the final result. For the Deep approach, we use different combinations of models like Convolution Neural Network, pretrained word2vec embeddings and LSTMs to get representations which are then used to train a Deep Neural Network. For Clarity prediction, we also use an Attentive Pooling approach for the pooling operation so as to be aware of the Title-Category pair. For the shallow approach, we use boosting technique LightGBM on features generated using title and categories. We find that an ensemble of these approaches does a better job than using them alone, suggesting that the results of the deep and shallow approach are highly complementary.

Submitted 1 April, 2018; originally announced April 2018.

arXiv:1509.04884 [pdf, ps, other]

doi 10.1007/s11117-015-0377-x

On a tensor-analogue of the Schur product

Authors: K. Sumesh, V. S. Sunder

Abstract: We consider the tensorial Schur product $R \circ^\otimes S = [r_{ij} \otimes s_{ij}]$ for $R \in M_n(\mathcal{A}), S\in M_n(\mathcal{B}),$ with $\mathcal{A}, \mathcal{B}$ unital $C^*$-algebras, verify that such a `tensorial Schur product' of positive operators is again positive, and then use this fact to prove (an apparently marginally more general version of) the classical result of Choi that a linear map $φ:M_n \to M_d$ is completely positive if and only if $[φ(E_{ij})] \in M_n(M_d)^+$, where of course $\{E_{ij}:1 \leq i,j \leq n\}$ denotes the usual system of matrix units in $M_n (:= M_n(\mathbb{C}))$. We also discuss some other corollaries of the main result.

Submitted 14 October, 2015; v1 submitted 16 September, 2015; originally announced September 2015.

Comments: Corrected typos, The final publication (with marginal changes) is available at Springer via http://dx.doi.org/[10.1007/s11117-015-0377-x]

arXiv:1410.7188 [pdf, ps, other]

The Functional Analysis of Quantum Information Theory

Authors: Ved Prakash Gupta, Prabha Mandayam, V. S. Sunder

Abstract: This book is a compilation of notes from a two-week international workshop on "The Functional Analysis of Quantum Information Theory" that was held at the Institute of Mathematical Sciences during 26/12/2011-06/01/2012. The workshop was devoted to the mathematical framework of quantized functional analysis (QFA), and aimed at illustrating its applications to problems in quantum communication. The lectures were given by Gilles Pisier (Pierre and Marie Curie University and Texas A&M), K.R. Parthasarathy (ISI Delhi), Vern Paulsen (University of Houston), and Andreas Winter (Universitat Autonoma de Barcelona). Topics discussed include Operator Spaces and Completely bounded maps, Schmidt number and Schmidt rank of bipartite entangled states, Operator Systems and Completely Positive Maps, and Operator Methods in Quantum Information.

Submitted 28 April, 2015; v1 submitted 27 October, 2014; originally announced October 2014.

Comments: v3; 123 pages; 4 chapters; To appear in Springer Lecture Notes in Physics; Video recordings of the lectures can be found here: https://www.youtube.com/playlist?list=PLD3E479AB374A718F&spfreload=10

arXiv:1211.2576 [pdf, ps, other]

Extendable endomorphisms on factors

Authors: Panchugopal Bikram, Masaki Izumi, R. Srinivasan, V. S. Sunder

Abstract: We begin this note with a von Neumann algebraic version of the elementary but extremely useful fact about being able to extend inner-product preserving maps from a total set of the domain Hilbert space to an isometry defined on the entire domain. This leads us to the notion of when `good' endomorphisms of a factorial probability space $(M,φ)$ (which we call equi-modular) admit a natural extension to endomorphisms of $L^2(M,φ)$. We exhibit examples of such extendable endomorphisms. We then pass to $E_0$-semigroups $α = \{α_t: t \geq 0\}$ of factors, and observe that extendability of this semigroup (i.e., extendability of each $α_t$) is a cocycle-conjugacy invariant of the semigroup. We identify a necessary condition for extendability of such an $E_0$-semigroup, which we then use to show that the Clifford flow on the hyperfinite $II_1$ factor is not extendable.

Submitted 11 October, 2013; v1 submitted 12 November, 2012; originally announced November 2012.

Comments: 26 pages. New co-author (Izumi) added in view of his contributions

MSC Class: 46L55

arXiv:1210.7581 [pdf, ps, other]

Continuous minimax theorems

Authors: Madhushree Basu, V. S. Sunder

Abstract: In classical matrix theory, there exist useful extremal characterizations of eigenvalues and their sums for Hermitian matrices (due to Ky Fan, Courant-Fischer-Weyl and Wielandt) and some consequences such as the majorization assertion in Lidskii's theorem. In this paper, we extend these results to the context of self adjoint elements of finite von Neumann algebras, and their distribution and quantile functions. This work was motivated by a lemma in a paper by Voiculescu and Bercovici, that described such an extremal characterization of the distribution of a self-adjoint operator affiliated to a finite von Neumann algebra - suggesting a possible analogue of the classical Courant-Fischer-Weyl minmax theorem, for a self adjoint operator in a finite von Neumann algebra. It is to be noted that the only von Neumann algebras considered here have separable pre-duals.

Submitted 11 November, 2013; v1 submitted 29 October, 2012; originally announced October 2012.

MSC Class: 46L10; 60B11; 34L15

arXiv:1102.4663 [pdf, ps, other]

Hilbert von Neumann modules

Authors: Panchugopal Bikram, Kunal Mukherjee, R. Srinivasan, V. S. Sunder

Abstract: We introduce a way of regarding Hilbert von Neumann modules as spaces of operators between Hilbert spaces, not unlike [Skei], but in an apparently much simpler manner and involving far less machinery. We verify that our definition is equivalent to that of [Skei], by verifying the `Riesz lemma' or what is called `self-duality' in [Skei]. An advantage with our approach is that we can totally side-step the need to go through $C^*$-modules and avoid the two stages of completion - first in norm, then in the strong operator topology - involved in the former approach. We establish the analogue of the Stinespring dilation theorem for Hilbert von Neumann bimodules, and we develop our version of `internal tensor products' which we refer to as Connes fusion for obvious reasons. In our discussion of examples, we examine the bimodules arising from automorphisms of von Neumann algebras, verify that fusion of bimodules corresponds to composition of automorphisms in this case, and that the isomorphism class of such a bimodule depends only on the inner conjugacy class of the automorphism. We also relate Jones' basic construction to the Stinespring dilation associated to the conditional expectation onto a finite-index inclusion (by invoking the uniqueness assertion regarding the latter).

Submitted 23 February, 2011; originally announced February 2011.

Comments: 20 pages

MSC Class: 46L10

arXiv:1102.4413 [pdf, ps, other]

From graphs to free products

Authors: Madhushree Basu, Vijay Kodiyalam, V. S. Sunder

Abstract: We investigate a construction which associates a finite von Neumann algebra $M(Γ,μ)$ to a finite weighted graph $(Γ,μ)$. Pleasantly, but not surprisingly, the von Neumann algebra associated to a `flower with $n$ petals' is the group von Neumann algebra of the free group on $n$ generators. In general, the algebra $M(Γ,μ)$ is a free product, with amalgamation over a finite-dimensional abelian subalgebra corresponding to the vertex set, of algebras associated to subgraphs `with one edge' (or actually a pair of dual edges). This also yields `natural' examples of (i) a Fock-type model of an operator with a free Poisson distribution; and (ii) $\mathbb{C} \oplus \mathbb{C}$-valued circular and semi-circular operators.

Submitted 22 February, 2011; originally announced February 2011.

Comments: 14 pages, 1 figure

MSC Class: 46L54

arXiv:0911.2047 [pdf, ps, other]

On the Guionnet-Jones-Shlyakhtenko construction for graphs

Authors: Vijay Kodiyalam, V. S. Sunder

Abstract: Using an analogue of the Guionnet-Jones-Shlyakhtenko construction for graphs we show that their construction applied to any subfactor planar algebra of finite depth yields an inclusion of interpolated free group factors with finite parameter, thereby giving another proof of their universality for finite depth planar algebras.

Submitted 24 March, 2010; v1 submitted 11 November, 2009; originally announced November 2009.

Comments: 42 pages, 12 figures. v2 has updated references and minor changes. v3 corrects some typos.

MSC Class: 46L37; 46L54

arXiv:0901.3180 [pdf, ps, other]

Guionnet-Jones-Shlyakhtenko subfactors associated to finite-dimensional Kac algebras

Authors: Vijay Kodiyalam, V. S. Sunder

Abstract: We analyse the Guionnet-Jones-Shlyakhtenko construction for the planar algebra associated to a finite-dimensional Kac algebra and identify the factors that arise as finite interpolated free group factors.

Submitted 10 March, 2009; v1 submitted 20 January, 2009; originally announced January 2009.

Comments: 18 pages, 21 figures, corrected typos

arXiv:0807.3704 [pdf, ps, other]

From subfactor planar algebras to subfactors

Authors: Vijay Kodiyalam, V. S. Sunder

Abstract: We present a purely planar algebraic proof of the main result of a paper of Guionnet-Jones-Shlyakhtenko which constructs an extremal subfactor from a subfactor planar algebra whose standard invariant is given by that planar algebra.

Submitted 23 July, 2008; originally announced July 2008.

Comments: 22 pages, 25 figures

MSC Class: 46L37

arXiv:math/0509302 [pdf, ps, other]

Planar algebras and Kuperberg's 3-manifold invariant

Authors: Vijay Kodiyalam, V. S. Sunder

Abstract: We recapture Kuperberg's numerical invariant of 3-manifolds associated to a semisimple and cosemisimple Hopf algebra through a `planar algebra construction'. A result of possibly independent interest, used during the proof, which relates duality in planar graphs and Hopf algebras, is the subject of a final section.

Submitted 14 September, 2005; originally announced September 2005.

Comments: 19 pages, 9 figures

MSC Class: 57M27; 16W30

arXiv:math/0507050 [pdf, ps, other]

doi 10.1142/S0129167X07003923

Subfactors and 1+1-dimensional TQFTs

Authors: Vijay Kodiyalam, Vishwambhar Pati, V. S. Sunder

Abstract: We construct a certain `cobordism category' ${\cal D}$ whose morphisms are suitably decorated cobordism classes between similarly decorated closed oriented 1-manifolds, and show that there is essentially a bijection between (1+1-dimensional) unitary topological quantum field theories (TQFTs) defined on ${\cal D}$, on the one hand, and Jones' subfactor planar algebras, on the other.

Submitted 4 July, 2005; originally announced July 2005.

Comments: 57 pages, 9 figures

MSC Class: 46L37

Journal ref: Int. J. Math. 18:69-112, 2007

arXiv:math/0506153 [pdf, ps, other]

The planar algebra of a semisimple and cosemisimple Hopf algebra

Authors: Vijay Kodiyalam, V. S. Sunder

Abstract: To a semisimple and cosemisimple Hopf algebra over an algebraically closed field, we associate a planar algebra defined by generators and relations and show that it is a connected, irreducible, spherical, non-degenerate planar algebra with non-zero modulus and of depth two. This association is shown to yield a bijection between (the isomorphism classes, on both sides, of) such objects.

Submitted 20 June, 2005; v1 submitted 9 June, 2005; originally announced June 2005.

Comments: 16 pages, 20 figures; content added

MSC Class: 16W30; 46L37

Showing 1–26 of 26 results for author: Sunder, V