-
Granite-speech: open-source speech-aware LLMs with strong English ASR capabilities
Authors:
George Saon,
Avihu Dekel,
Alexander Brooks,
Tohru Nagano,
Abraham Daniels,
Aharon Satt,
Ashish Mittal,
Brian Kingsbury,
David Haws,
Edmilson Morais,
Gakuto Kurata,
Hagai Aronowitz,
Ibrahim Ibrahim,
Jeff Kuo,
Kate Soule,
Luis Lastras,
Masayuki Suzuki,
Ron Hoory,
Samuel Thomas,
Sashi Novitasari,
Takashi Fukuda,
Vishal Sunder,
Xiaodong Cui,
Zvi Kons
Abstract:
Granite-speech LLMs are compact and efficient speech language models specifically designed for English ASR and automatic speech translation (AST). The models were trained by modality aligning the 2B and 8B parameter variants of granite-3.3-instruct to speech on publicly available open-source corpora containing audio inputs and text targets consisting of either human transcripts for ASR or automatically generated translations for AST. Comprehensive benchmarking shows that on English ASR, which was our primary focus, they outperform several competitors' models that were trained on orders of magnitude more proprietary data, and they keep pace on English-to-X AST for major European languages, Japanese, and Chinese. The speech-specific components are: a conformer acoustic encoder using block attention and self-conditioning trained with connectionist temporal classification, a windowed query-transformer speech modality adapter used to do temporal downsampling of the acoustic embeddings and map them to the LLM text embedding space, and LoRA adapters to further fine-tune the text LLM. Granite-speech-3.3 operates in two modes: in speech mode, it performs ASR and AST by activating the encoder, projector, and LoRA adapters; in text mode, it calls the underlying granite-3.3-instruct model directly (without LoRA), essentially preserving all the text LLM capabilities and safety. Both models are freely available on HuggingFace (https://huggingface.co/ibm-granite/granite-speech-3.3-2b and https://huggingface.co/ibm-granite/granite-speech-3.3-8b) and can be used for both research and commercial purposes under a permissive Apache 2.0 license.
Submitted 13 May, 2025; v1 submitted 13 May, 2025;
originally announced May 2025.
-
A Non-autoregressive Model for Joint STT and TTS
Authors:
Vishal Sunder,
Brian Kingsbury,
George Saon,
Samuel Thomas,
Slava Shechtman,
Hagai Aronowitz,
Eric Fosler-Lussier,
Luis Lastras
Abstract:
In this paper, we take a step towards jointly modeling automatic speech recognition (STT) and speech synthesis (TTS) in a fully non-autoregressive way. We develop a novel multimodal framework capable of handling the speech and text modalities as input either individually or together. The proposed model can also be trained with unpaired speech or text data owing to its multimodal nature. We further propose an iterative refinement strategy to improve the STT and TTS performance of our model such that the partial hypothesis at the output can be fed back to the input of our model, thus iteratively improving both STT and TTS predictions. We show that our joint model can effectively perform both STT and TTS tasks, outperforming the STT-specific baseline in all tasks and performing competitively with the TTS-specific baseline across a wide range of evaluation metrics.
Submitted 20 January, 2025; v1 submitted 15 January, 2025;
originally announced January 2025.
-
Improving Transducer-Based Spoken Language Understanding with Self-Conditioned CTC and Knowledge Transfer
Authors:
Vishal Sunder,
Eric Fosler-Lussier
Abstract:
In this paper, we propose to improve end-to-end (E2E) spoken language understanding (SLU) in an RNN transducer model (RNN-T) by incorporating a joint self-conditioned CTC automatic speech recognition (ASR) objective. Our proposed model is akin to an E2E differentiable cascaded model which performs ASR and SLU sequentially, and we ensure that the SLU task is conditioned on the ASR task by having CTC self-conditioning. This novel joint modeling of ASR and SLU improves SLU performance significantly over just using SLU optimization. We further improve the performance by aligning the acoustic embeddings of this model with the semantically richer BERT model. Our proposed knowledge transfer strategy makes use of a bag-of-entity prediction layer on the aligned embeddings and the output of this is used to condition the RNN-T based SLU decoding. These techniques show significant improvement over several strong baselines and can perform at par with large models like Whisper with significantly fewer parameters.
Submitted 3 January, 2025;
originally announced January 2025.
-
End-to-End real time tracking of children's reading with pointer network
Authors:
Vishal Sunder,
Beulah Karrolla,
Eric Fosler-Lussier
Abstract:
In this work, we explore how a real time reading tracker can be built efficiently for children's voices. While previously proposed reading trackers focused on ASR-based cascaded approaches, we propose a fully end-to-end model making it less prone to lags in voice tracking. We employ a pointer network that directly learns to predict positions in the ground truth text conditioned on the streaming speech. To train this pointer network, we generate ground truth training signals by using forced alignment between the read speech and the text being read on the training set. Exploring different forced alignment models, we find that a neural attention-based model is at least comparable in alignment accuracy to the Montreal Forced Aligner and, surprisingly, provides a better training signal for the pointer network. Our results are reported on one adult speech dataset (TIMIT) and two children's speech datasets (CMU Kids and Reading Races). Our best model can accurately track adult speech with 87.8% accuracy and the much harder and disfluent children's speech with 77.1% accuracy on the CMU Kids data and 65.3% accuracy on the Reading Races dataset.
Submitted 17 October, 2023;
originally announced October 2023.
-
Tokenwise Contrastive Pretraining for Finer Speech-to-BERT Alignment in End-to-End Speech-to-Intent Systems
Authors:
Vishal Sunder,
Eric Fosler-Lussier,
Samuel Thomas,
Hong-Kwang J. Kuo,
Brian Kingsbury
Abstract:
Recent advances in End-to-End (E2E) Spoken Language Understanding (SLU) have been primarily due to effective pretraining of speech representations. One such pretraining paradigm is the distillation of semantic knowledge from state-of-the-art text-based models like BERT to speech encoder neural networks. This work is a step towards doing the same in a much more efficient and fine-grained manner where we align speech embeddings and BERT embeddings on a token-by-token basis. We introduce a simple yet novel technique that uses a cross-modal attention mechanism to extract token-level contextual embeddings from a speech encoder such that these can be directly compared and aligned with BERT based contextual embeddings. This alignment is performed using a novel tokenwise contrastive loss. Fine-tuning such a pretrained model to perform intent recognition using speech directly yields state-of-the-art performance on two widely used SLU datasets. Our model improves further when fine-tuned with additional regularization using SpecAugment especially when speech is noisy, giving an absolute improvement as high as 8% over previous results.
Submitted 1 July, 2022; v1 submitted 11 April, 2022;
originally announced April 2022.
-
Building an ASR Error Robust Spoken Virtual Patient System in a Highly Class-Imbalanced Scenario Without Speech Data
Authors:
Vishal Sunder,
Prashant Serai,
Eric Fosler-Lussier
Abstract:
A Virtual Patient (VP) is a powerful tool for training medical students to take patient histories, where responding to a diverse set of spoken questions is essential to simulate natural conversations with a student. The performance of such a Spoken Language Understanding system (SLU) can be adversely affected by both the presence of Automatic Speech Recognition (ASR) errors in the test data and a high degree of class imbalance in the SLU training data. While these two issues have been addressed separately in prior work, we develop a novel two-step training methodology that tackles both these issues effectively in a single dialog agent. As it is difficult to collect spoken data from users without a functioning SLU system, our method does not rely on spoken data for training, rather we use an ASR error predictor to "speechify" the text data. Our method shows significant improvements over strong baselines on the VP intent classification task at various word error rate settings.
Submitted 1 July, 2022; v1 submitted 11 April, 2022;
originally announced April 2022.
-
Towards End-to-End Integration of Dialog History for Improved Spoken Language Understanding
Authors:
Vishal Sunder,
Samuel Thomas,
Hong-Kwang J. Kuo,
Jatin Ganhotra,
Brian Kingsbury,
Eric Fosler-Lussier
Abstract:
Dialog history plays an important role in spoken language understanding (SLU) performance in a dialog system. For end-to-end (E2E) SLU, previous work has used dialog history in text form, which makes the model dependent on a cascaded automatic speech recognizer (ASR). This rescinds the benefits of an E2E system which is intended to be compact and robust to ASR errors. In this paper, we propose a hierarchical conversation model that is capable of directly using dialog history in speech form, making it fully E2E. We also distill semantic knowledge from the available gold conversation transcripts by jointly training a similar text-based conversation model with an explicit tying of acoustic and semantic embeddings. We also propose a novel technique that we call DropFrame to deal with the long training time incurred by adding dialog history in an E2E manner. On the HarperValleyBank dialog dataset, our E2E history integration outperforms a history independent baseline by 7.7% absolute F1 score on the task of dialog action recognition. Our model performs competitively with the state-of-the-art history based cascaded baseline, but uses 48% fewer parameters. In the absence of gold transcripts to fine-tune an ASR model, our model outperforms this baseline by a significant margin of 10% absolute F1 score.
Submitted 11 April, 2022;
originally announced April 2022.
-
Hallucination of speech recognition errors with sequence to sequence learning
Authors:
Prashant Serai,
Vishal Sunder,
Eric Fosler-Lussier
Abstract:
Automatic Speech Recognition (ASR) is an imperfect process that results in certain mismatches in ASR output text when compared to plain written text or transcriptions. When plain text data is to be used to train systems for spoken language understanding or ASR, a proven strategy to reduce said mismatch and prevent degradations, is to hallucinate what the ASR outputs would be given a gold transcription. Prior work in this domain has focused on modeling errors at the phonetic level, while using a lexicon to convert the phones to words, usually accompanied by an FST Language model. We present novel end-to-end models to directly predict hallucinated ASR word sequence outputs, conditioning on an input word sequence as well as a corresponding phoneme sequence. This improves prior published results for recall of errors from an in-domain ASR system's transcription of unseen data, as well as an out-of-domain ASR system's transcriptions of audio from an unrelated task, while additionally exploring an in-between scenario when limited characterization data from the test ASR system is obtainable. To verify the extrinsic validity of the method, we also use our hallucinated ASR errors to augment training for a spoken question classifier, finding that they enable robustness to real ASR errors in a downstream task, when scarce or even zero task-specific audio was available at train-time.
Submitted 31 March, 2021; v1 submitted 22 March, 2021;
originally announced March 2021.
-
Handling Class Imbalance in Low-Resource Dialogue Systems by Combining Few-Shot Classification and Interpolation
Authors:
Vishal Sunder,
Eric Fosler-Lussier
Abstract:
Utterance classification performance in low-resource dialogue systems is constrained by an inevitably high degree of data imbalance in class labels. We present a new end-to-end pairwise learning framework that is designed specifically to tackle this phenomenon by inducing a few-shot classification capability in the utterance representations and augmenting data through an interpolation of utterance representations. Our approach is a general purpose training methodology, agnostic to the neural architecture used for encoding utterances. We show significant improvements in macro-F1 score over standard cross-entropy training for three different neural architectures, demonstrating improvements on a Virtual Patient dialogue dataset as well as a low-resourced emulation of the Switchboard dialogue act classification dataset.
Submitted 28 October, 2020;
originally announced October 2020.
-
Planar algebras, quantum information theory and subfactors
Authors:
Vijay Kodiyalam,
Sruthymurali,
V. S. Sunder
Abstract:
We define generalised notions of biunitary elements in planar algebras and show that objects arising in quantum information theory such as Hadamard matrices, quantum latin squares and unitary error bases are all given by biunitary elements in the spin planar algebra. We show that there are natural subfactor planar algebras associated with biunitary elements.
Submitted 16 December, 2019;
originally announced December 2019.
-
One-shot Information Extraction from Document Images using Neuro-Deductive Program Synthesis
Authors:
Vishal Sunder,
Ashwin Srinivasan,
Lovekesh Vig,
Gautam Shroff,
Rohit Rahul
Abstract:
Our interest in this paper is in meeting a rapidly growing industrial demand for information extraction from images of documents such as invoices, bills, receipts, etc. In practice, users are able to provide a very small number of example images labeled with the information that needs to be extracted. We adopt a novel two-level neuro-deductive approach where (a) we use pre-trained deep neural networks to populate a relational database with facts about each document-image; and (b) we use a form of deductive reasoning, related to meta-interpretive learning of transition systems, to learn extraction programs: given task-specific transitions defined using the entities and relations identified by the neural detectors and a small number of instances (usually 1, sometimes 2) of images and the desired outputs, a resource-bounded meta-interpreter constructs proofs for the instance(s) via logical deduction; a set of logic programs that extract each desired entity is easily synthesized from such proofs. In most cases a single training example together with a noisy-clone of itself suffices to learn a program-set that generalizes well on test documents, at which time the value of each entity is determined by a majority vote across its program-set. We demonstrate our two-level neuro-deductive approach on publicly available datasets ("Patent" and "Doctor's Bills") and also describe its use in a real-life industrial problem.
Submitted 6 June, 2019;
originally announced June 2019.
-
On a presentation of the spin planar algebra
Authors:
Vijay Kodiyalam,
Sohan Lal Saini,
Sruthymurali,
V. S. Sunder
Abstract:
We define a certain abstract planar algebra by generators and relations, study various aspects of its structure, and then identify it with Jones' spin planar algebra.
Submitted 30 January, 2019;
originally announced January 2019.
-
Prosocial or Selfish? Agents with different behaviors for Contract Negotiation using Reinforcement Learning
Authors:
Vishal Sunder,
Lovekesh Vig,
Arnab Chatterjee,
Gautam Shroff
Abstract:
We present an effective technique for training deep learning agents capable of negotiating on a set of clauses in a contract agreement using a simple communication protocol. We use Multi Agent Reinforcement Learning to train both agents simultaneously as they negotiate with each other in the training environment. We also model selfish and prosocial behavior to varying degrees in these agents. Empirical evidence is provided showing consistency in agent behaviors. We further train a meta agent with a mixture of behaviors by learning an ensemble of different models using reinforcement learning. Finally, to ascertain the deployability of the negotiating agents, we conducted experiments pitting the trained agents against human players. Results demonstrate that the agents are able to hold their own against human players, often emerging as winners in the negotiation. Our experiments demonstrate that the meta agent is able to reasonably emulate human behavior.
Submitted 19 September, 2018;
originally announced September 2018.
-
CIKM AnalytiCup 2017 Lazada Product Title Quality Challenge An Ensemble of Deep and Shallow Learning to predict the Quality of Product Titles
Authors:
Karamjit Singh,
Vishal Sunder
Abstract:
We present an approach where two different models (Deep and Shallow) are trained separately on the data and a weighted average of the outputs is taken as the final result. For the Deep approach, we use different combinations of models like Convolution Neural Network, pretrained word2vec embeddings and LSTMs to get representations which are then used to train a Deep Neural Network. For Clarity prediction, we also use an Attentive Pooling approach for the pooling operation so as to be aware of the Title-Category pair. For the Shallow approach, we use the boosting technique LightGBM on features generated using title and categories. We find that an ensemble of these approaches does a better job than using them alone, suggesting that the results of the deep and shallow approaches are highly complementary.
Submitted 1 April, 2018;
originally announced April 2018.
-
On a tensor-analogue of the Schur product
Authors:
K. Sumesh,
V. S. Sunder
Abstract:
We consider the tensorial Schur product $R \circ^\otimes S = [r_{ij} \otimes s_{ij}]$ for $R \in M_n(\mathcal{A}), S\in M_n(\mathcal{B}),$ with $\mathcal{A}, \mathcal{B}$ unital $C^*$-algebras, verify that such a `tensorial Schur product' of positive operators is again positive, and then use this fact to prove (an apparently marginally more general version of) the classical result of Choi that a linear map $φ:M_n \to M_d$ is completely positive if and only if $[φ(E_{ij})] \in M_n(M_d)^+$, where of course $\{E_{ij}:1 \leq i,j \leq n\}$ denotes the usual system of matrix units in $M_n (:= M_n(\mathbb{C}))$. We also discuss some other corollaries of the main result.
Submitted 14 October, 2015; v1 submitted 16 September, 2015;
originally announced September 2015.
-
The Functional Analysis of Quantum Information Theory
Authors:
Ved Prakash Gupta,
Prabha Mandayam,
V. S. Sunder
Abstract:
This book is a compilation of notes from a two-week international workshop on "The Functional Analysis of Quantum Information Theory" that was held at the Institute of Mathematical Sciences during 26/12/2011-06/01/2012. The workshop was devoted to the mathematical framework of quantized functional analysis (QFA), and aimed at illustrating its applications to problems in quantum communication. The lectures were given by Gilles Pisier (Pierre and Marie Curie University and Texas A&M), K.R. Parthasarathy (ISI Delhi), Vern Paulsen (University of Houston), and Andreas Winter (Universitat Autonoma de Barcelona). Topics discussed include Operator Spaces and Completely bounded maps, Schmidt number and Schmidt rank of bipartite entangled states, Operator Systems and Completely Positive Maps, and Operator Methods in Quantum Information.
Submitted 28 April, 2015; v1 submitted 27 October, 2014;
originally announced October 2014.
-
Extendable endomorphisms on factors
Authors:
Panchugopal Bikram,
Masaki Izumi,
R. Srinivasan,
V. S. Sunder
Abstract:
We begin this note with a von Neumann algebraic version of the elementary but extremely useful fact about being able to extend inner-product preserving maps from a total set of the domain Hilbert space to an isometry defined on the entire domain. This leads us to the notion of when `good' endomorphisms of a factorial probability space $(M,φ)$ (which we call equi-modular) admit a natural extension to endomorphisms of $L^2(M,φ)$. We exhibit examples of such extendable endomorphisms.
We then pass to $E_0$-semigroups $α = \{α_t : t \geq 0\}$ of factors, and observe that extendability of this semigroup (i.e., extendability of each $α_t$) is a cocycle-conjugacy invariant of the semigroup. We identify a necessary condition for extendability of such an $E_0$-semigroup, which we then use to show that the Clifford flow on the hyperfinite $II_1$ factor is not extendable.
Submitted 11 October, 2013; v1 submitted 12 November, 2012;
originally announced November 2012.
-
Continuous minimax theorems
Authors:
Madhushree Basu,
V. S. Sunder
Abstract:
In classical matrix theory, there exist useful extremal characterizations of eigenvalues and their sums for Hermitian matrices (due to Ky Fan, Courant-Fischer-Weyl and Wielandt) and some consequences such as the majorization assertion in Lidskii's theorem. In this paper, we extend these results to the context of self adjoint elements of finite von Neumann algebras, and their distribution and quantile functions. This work was motivated by a lemma in a paper by Voiculescu and Bercovici, that described such an extremal characterization of the distribution of a self-adjoint operator affiliated to a finite von Neumann algebra - suggesting a possible analogue of the classical Courant-Fischer-Weyl minmax theorem, for a self adjoint operator in a finite von Neumann algebra. It is to be noted that the only von Neumann algebras considered here have separable pre-duals.
Submitted 11 November, 2013; v1 submitted 29 October, 2012;
originally announced October 2012.
-
Hilbert von Neumann modules
Authors:
Panchugopal Bikram,
Kunal Mukherjee,
R. Srinivasan,
V. S. Sunder
Abstract:
We introduce a way of regarding Hilbert von Neumann modules as spaces of operators between Hilbert spaces, not unlike [Skei], but in an apparently much simpler manner and involving far less machinery. We verify that our definition is equivalent to that of [Skei], by verifying the `Riesz lemma' or what is called `self-duality' in [Skei]. An advantage with our approach is that we can totally side-step the need to go through $C^*$-modules and avoid the two stages of completion - first in norm, then in the strong operator topology - involved in the former approach.
We establish the analogue of the Stinespring dilation theorem for Hilbert von Neumann bimodules, and we develop our version of `internal tensor products' which we refer to as Connes fusion for obvious reasons.
In our discussion of examples, we examine the bimodules arising from automorphisms of von Neumann algebras, verify that fusion of bimodules corresponds to composition of automorphisms in this case, and that the isomorphism class of such a bimodule depends only on the inner conjugacy class of the automorphism. We also relate Jones' basic construction to the Stinespring dilation associated to the conditional expectation onto a finite-index inclusion (by invoking the uniqueness assertion regarding the latter).
Submitted 23 February, 2011;
originally announced February 2011.
-
From graphs to free products
Authors:
Madhushree Basu,
Vijay Kodiyalam,
V. S. Sunder
Abstract:
We investigate a construction which associates a finite von Neumann algebra $M(Γ,μ)$ to a finite weighted graph $(Γ,μ)$. Pleasantly, but not surprisingly, the von Neumann algebra associated to a `flower with $n$ petals' is the group von Neumann algebra of the free group on $n$ generators. In general, the algebra $M(Γ,μ)$ is a free product, with amalgamation over a finite-dimensional abelian subalgebra corresponding to the vertex set, of algebras associated to subgraphs `with one edge' (or actually a pair of dual edges). This also yields `natural' examples of (i) a Fock-type model of an operator with a free Poisson distribution; and (ii) $\mathbb{C} \oplus \mathbb{C}$-valued circular and semi-circular operators.
Submitted 22 February, 2011;
originally announced February 2011.
-
On the Guionnet-Jones-Shlyakhtenko construction for graphs
Authors:
Vijay Kodiyalam,
V. S. Sunder
Abstract:
Using an analogue of the Guionnet-Jones-Shlyakhtenko construction for graphs we show that their construction applied to any subfactor planar algebra of finite depth yields an inclusion of interpolated free group factors with finite parameter, thereby giving another proof of their universality for finite depth planar algebras.
Submitted 24 March, 2010; v1 submitted 11 November, 2009;
originally announced November 2009.
-
Guionnet-Jones-Shlyakhtenko subfactors associated to finite-dimensional Kac algebras
Authors:
Vijay Kodiyalam,
V. S. Sunder
Abstract:
We analyse the Guionnet-Jones-Shlyakhtenko construction for the planar algebra associated to a finite-dimensional Kac algebra and identify the factors that arise as finite interpolated free group factors.
Submitted 10 March, 2009; v1 submitted 20 January, 2009;
originally announced January 2009.
-
From subfactor planar algebras to subfactors
Authors:
Vijay Kodiyalam,
V. S. Sunder
Abstract:
We present a purely planar algebraic proof of the main result of a paper of Guionnet-Jones-Shlyakhtenko which constructs an extremal subfactor from a subfactor planar algebra whose standard invariant is given by that planar algebra.
Submitted 23 July, 2008;
originally announced July 2008.
-
Planar algebras and Kuperberg's 3-manifold invariant
Authors:
Vijay Kodiyalam,
V. S. Sunder
Abstract:
We recapture Kuperberg's numerical invariant of 3-manifolds associated to a semisimple and cosemisimple Hopf algebra through a `planar algebra construction'. A result of possibly independent interest, used during the proof, which relates duality in planar graphs and Hopf algebras, is the subject of a final section.
Submitted 14 September, 2005;
originally announced September 2005.
-
Subfactors and 1+1-dimensional TQFTs
Authors:
Vijay Kodiyalam,
Vishwambhar Pati,
V. S. Sunder
Abstract:
We construct a certain `cobordism category' ${\cal D}$ whose morphisms are suitably decorated cobordism classes between similarly decorated closed oriented 1-manifolds, and show that there is essentially a bijection between (1+1-dimensional) unitary topological quantum field theories (TQFTs) defined on ${\cal D}$, on the one hand, and Jones' subfactor planar algebras, on the other.
Submitted 4 July, 2005;
originally announced July 2005.
-
The planar algebra of a semisimple and cosemisimple Hopf algebra
Authors:
Vijay Kodiyalam,
V. S. Sunder
Abstract:
To a semisimple and cosemisimple Hopf algebra over an algebraically closed field, we associate a planar algebra defined by generators and relations and show that it is a connected, irreducible, spherical, non-degenerate planar algebra with non-zero modulus and of depth two. This association is shown to yield a bijection between (the isomorphism classes, on both sides, of) such objects.
Submitted 20 June, 2005; v1 submitted 9 June, 2005;
originally announced June 2005.