-
A multidimensional measurement of photorealistic avatar quality of experience
Authors:
Ross Cutler,
Babak Naderi,
Vishak Gopal,
Dharmendar Palle
Abstract:
Photorealistic avatars are human avatars that look, move, and talk like real people. The performance of photorealistic avatars has significantly improved recently based on objective metrics such as PSNR, SSIM, LPIPS, FID, and FVD. However, recent photorealistic avatar publications do not provide subjective tests of the avatars to measure human usability factors. We provide an open source test framework to subjectively measure photorealistic avatar performance in ten dimensions: realism, trust, comfortableness using, comfortableness interacting with, appropriateness for work, creepiness, formality, affinity, resemblance to the person, and emotion accuracy. Using telecommunication scenarios, we show that the correlation of nine of these subjective metrics with PSNR, SSIM, LPIPS, FID, and FVD is weak, and moderate for emotion accuracy. The crowdsourced subjective test framework is highly reproducible and accurate when compared to a panel of experts. We analyze a wide range of avatars from photorealistic to cartoon-like and show that some photorealistic avatars are approaching real video performance based on these dimensions. We also find that for avatars above a certain level of realism, eight of these measured dimensions are strongly correlated. This means that avatars that are not as realistic as real video will have lower trust, comfortableness using, comfortableness interacting with, appropriateness for work, formality, and affinity, and higher creepiness compared to real video. In addition, because there is a strong linear relationship between avatar affinity and realism, there is no uncanny valley effect for photorealistic avatars in the telecommunication scenario. We suggest several extensions of this test framework for future work and discuss design implications for telecommunication systems. The test framework is available at https://github.com/microsoft/P.910.
Submitted 5 April, 2025; v1 submitted 13 November, 2024;
originally announced November 2024.
-
Streetwise Agents: Empowering Offline RL Policies to Outsmart Exogenous Stochastic Disturbances in RTC
Authors:
Aditya Soni,
Mayukh Das,
Anjaly Parayil,
Supriyo Ghosh,
Shivam Shandilya,
Ching-An Cheng,
Vishak Gopal,
Sami Khairy,
Gabriel Mittag,
Yasaman Hosseinkashi,
Chetan Bansal
Abstract:
The difficulty of exploring and training online on real production systems limits the scope of real-time online data/feedback-driven decision making. The most feasible approach is to adopt offline reinforcement learning from limited trajectory samples. However, after deployment, such policies fail due to exogenous factors that temporarily or permanently disturb/alter the transition distribution of the assumed decision process structure induced by offline samples. This results in critical policy failures and generalization errors in sensitive domains like Real-Time Communication (RTC). We solve this crucial problem of identifying robust actions in the presence of domain shifts due to unseen exogenous stochastic factors in the wild. As it is impossible to learn generalized offline policies within the support of offline data that are robust to these unseen exogenous disturbances, we propose a novel post-deployment shaping of policies (Streetwise), conditioned on real-time characterization of out-of-distribution sub-spaces. This leads to robust actions in bandwidth estimation (BWE) of network bottlenecks in RTC and in standard benchmarks. Our extensive experimental results on BWE and other standard offline RL benchmark environments demonstrate a significant improvement ($\approx$ 18% in some scenarios) in final returns with respect to end-user metrics over state-of-the-art baselines.
Submitted 11 November, 2024;
originally announced November 2024.
-
Balancing Generalization and Specialization: Offline Metalearning for Bandwidth Estimation
Authors:
Aashish Gottipati,
Sami Khairy,
Yasaman Hosseinkashi,
Gabriel Mittag,
Vishak Gopal,
Francis Y. Yan,
Ross Cutler
Abstract:
User experience in real-time video applications requires continuously adjusting video encoding bitrates to match available network capacity, which hinges on accurate bandwidth estimation (BWE). However, network heterogeneity prevents a one-size-fits-all solution to BWE, motivating the demand for personalized approaches. Although personalizing BWE algorithms offers benefits such as improved adaptability to individual network conditions, it faces the challenge of data drift -- where estimators degrade over time due to evolving network environments. To address this, we introduce Ivy, a novel method for BWE that leverages offline metalearning to tackle data drift and maximize end-user Quality of Experience (QoE). Our key insight is that dynamically selecting the most suitable BWE algorithm for current network conditions allows for more effective adaptation to changing environments. Ivy is trained entirely offline using Implicit Q-learning, enabling it to learn from individual network conditions without a single live videoconferencing interaction, thereby reducing deployment complexity and making Ivy more practical for real-world personalization. We implemented our method in a popular videoconferencing application and demonstrated that Ivy can enhance QoE by 5.9% to 11.2% over individual BWE algorithms and by 6.3% to 11.4% compared to existing online meta heuristics.
Submitted 29 September, 2024;
originally announced September 2024.
-
ACM MMSys 2024 Bandwidth Estimation in Real Time Communications Challenge
Authors:
Sami Khairy,
Gabriel Mittag,
Vishak Gopal,
Francis Y. Yan,
Zhixiong Niu,
Ezra Ameri,
Scott Inglis,
Mehrsa Golestaneh,
Ross Cutler
Abstract:
The quality of experience (QoE) delivered by video conferencing systems to end users depends in part on correctly estimating the capacity of the bottleneck link between the sender and the receiver over time. Bandwidth estimation for real-time communications (RTC) remains a significant challenge, primarily due to the continuously evolving heterogeneous network architectures and technologies. From the first bandwidth estimation challenge which was hosted at ACM MMSys 2021, we learned that bandwidth estimation models trained with reinforcement learning (RL) in simulations to maximize network-based reward functions may not be optimal in reality due to the sim-to-real gap and the difficulty of aligning network-based rewards with user-perceived QoE. This grand challenge aims to advance bandwidth estimation model design by aligning reward maximization with user-perceived QoE optimization using offline RL and a real-world dataset with objective rewards which have high correlations with subjective audio/video quality in Microsoft Teams. All models submitted to the grand challenge underwent initial evaluation on our emulation platform. For a comprehensive evaluation under diverse network conditions with temporal fluctuations, top models were further evaluated on our geographically distributed testbed by using each model to conduct 600 calls within a 12-day period. The winning model is shown to deliver comparable performance to the top behavior policy in the released dataset. By leveraging real-world data and integrating objective audio/video quality scores as rewards, offline RL can therefore facilitate the development of competitive bandwidth estimators for RTC.
Submitted 15 March, 2024; v1 submitted 10 March, 2024;
originally announced March 2024.
-
3D photonics for ultra-low energy, high bandwidth-density chip data links
Authors:
Stuart Daudlin,
Anthony Rizzo,
Sunwoo Lee,
Devesh Khilwani,
Christine Ou,
Songli Wang,
Asher Novick,
Vignesh Gopal,
Michael Cullen,
Robert Parsons,
Alyosha Molnar,
Keren Bergman
Abstract:
Artificial intelligence (AI) hardware is positioned to unlock revolutionary computational abilities across diverse fields ranging from fundamental science [1] to medicine [2] and environmental science [3] by leveraging advanced semiconductor chips interconnected in vast distributed networks. However, AI chip development has far outpaced that of the networks that connect them, as chip computation speeds have accelerated a thousandfold faster than communication bandwidth over the last two decades [4, 5]. This gap is the largest barrier for scaling AI performance [6, 7] and results from the disproportionately high energy expended to transmit data [8], which is two orders of magnitude more intensive than computing [9]. Here, we show a leveling of this long-standing discrepancy and achieve the lowest energy optical data link to date through dense 3D integration of photonic and electronic chips. At 120 fJ of consumed energy per communicated bit and 5.3 Tb/s bandwidth per square millimeter of chip area, our platform simultaneously achieves a twofold improvement in both energy consumption and bandwidth density relative to prior demonstrations [10, 11]. These improvements are realized through employing massively parallel 80-channel microresonator-based transmitter and receiver arrays operating at 10 Gb/s per channel, occupying a combined chip footprint of only 0.32 mm². Furthermore, commercial complementary metal-oxide-semiconductor (CMOS) foundries fabricate both the electronic and photonic chips on 300 mm wafers, providing a clear avenue to volume scaling. Through these demonstrated ultra-energy-efficient, high-bandwidth data communication links, this work eliminates the bandwidth bottleneck between spatially distanced compute nodes and will enable a fundamentally new scale of future AI computing hardware without constraints on data locality.
Submitted 2 October, 2023;
originally announced October 2023.
-
Confidence Intervals for the F1 Score: A Comparison of Four Methods
Authors:
Kevin Fu Yuan Lam,
Vikneswaran Gopal,
Jiang Qian
Abstract:
In Natural Language Processing (NLP), binary classification algorithms are often evaluated using the F1 score. Because the sample F1 score is an estimate of the population F1 score, it is not sufficient to report the sample F1 score without an indication of how accurate it is. Confidence intervals are an indication of how accurate the sample F1 score is. However, most studies either do not report them or report them using methods that demonstrate poor statistical properties. In the present study, I review current analytical methods (i.e., Clopper-Pearson method and Wald method) to construct confidence intervals for the population F1 score, propose two new analytical methods (i.e., Wilson direct method and Wilson indirect method) to do so, and compare these methods based on their coverage probabilities and interval lengths, as well as whether these methods suffer from overshoot and degeneracy. Theoretical results demonstrate that both proposed methods do not suffer from overshoot and degeneracy. Experimental results suggest that both proposed methods perform better, as compared to current methods, in terms of coverage probabilities and interval lengths. I illustrate both current and proposed methods on two suggestion mining tasks. I discuss the practical implications of these results, and suggest areas for future research.
Submitted 9 June, 2024; v1 submitted 25 September, 2023;
originally announced September 2023.
-
Offline to Online Learning for Real-Time Bandwidth Estimation
Authors:
Aashish Gottipati,
Sami Khairy,
Gabriel Mittag,
Vishak Gopal,
Ross Cutler
Abstract:
Real-time video applications require accurate bandwidth estimation (BWE) to maintain user experience across varying network conditions. However, increasing network heterogeneity challenges general-purpose BWE algorithms, necessitating solutions that adapt to end-user environments. While widely adopted, heuristic-based methods are difficult to individualize without extensive domain expertise. Conversely, online reinforcement learning (RL) offers ease of customization but neglects prior domain expertise and suffers from sample inefficiency. Thus, we present Merlin, an imitation learning-based solution that replaces the manual parameter tuning of heuristic-based methods with data-driven updates to streamline end-user personalization. Our key insight is that transforming heuristic-based BWE algorithms into neural networks facilitates data-driven personalization. Merlin utilizes Behavioral Cloning to efficiently learn from offline telemetry logs, capturing heuristic policies without live network interactions. The cloned policy can then be seamlessly tailored to end user network conditions through online finetuning. In real intercontinental videoconferencing calls, Merlin matches our heuristic's policy with no statistically significant differences in user quality of experience (QoE). Finetuning Merlin's control policy to end-user environments enables QoE improvements of up to 7.8% compared to the heuristic policy. Lastly, our IL-based design performs competitively with current state-of-the-art online RL techniques but converges with 80% fewer videoconferencing samples, facilitating practical end-user personalization.
Submitted 11 February, 2025; v1 submitted 23 September, 2023;
originally announced September 2023.
-
A Real-Time Active Speaker Detection System Integrating an Audio-Visual Signal with a Spatial Querying Mechanism
Authors:
Ilya Gurvich,
Ido Leichter,
Dharmendar Reddy Palle,
Yossi Asher,
Alon Vinnikov,
Igor Abramovski,
Vishak Gopal,
Ross Cutler,
Eyal Krupka
Abstract:
We introduce a distinctive real-time, causal, neural network-based active speaker detection system optimized for low-power edge computing. This system drives a virtual cinematography module and is deployed on a commercial device. The system uses data originating from a microphone array and a 360-degree camera. Our network requires only 127 MFLOPs per participant, for a meeting with 14 participants. Unlike previous work, we examine the error rate of our network when the computational budget is exhausted, and find that it exhibits graceful degradation, allowing the system to operate reasonably well even in this case. Departing from conventional DOA estimation approaches, our network learns to query the available acoustic data, considering the detected head locations. We train and evaluate our algorithm on a realistic meetings dataset featuring up to 14 participants in the same meeting, overlapped speech, and other challenging scenarios.
Submitted 15 September, 2023;
originally announced September 2023.
-
LSTM-based Video Quality Prediction Accounting for Temporal Distortions in Videoconferencing Calls
Authors:
Gabriel Mittag,
Babak Naderi,
Vishak Gopal,
Ross Cutler
Abstract:
Current state-of-the-art video quality models, such as VMAF, give excellent prediction results by comparing the degraded video with its reference video. However, they do not consider temporal distortions (e.g., frame freezes or skips) that occur during videoconferencing calls. In this paper, we present a data-driven approach for modeling such distortions automatically by training an LSTM with subjective quality ratings labeled via crowdsourcing. The videos were collected from live videoconferencing calls in 83 different network conditions. We applied QR codes as markers on the source videos to create aligned references and compute temporal features based on the alignment vectors. Using these features together with VMAF core features, our proposed model achieves a PCC of 0.99 on the validation set. Furthermore, our model outputs per-frame quality that gives detailed insight into the cause of video quality impairments. The VCM model and dataset are open-sourced at https://github.com/microsoft/Video_Call_MOS.
Submitted 22 March, 2023;
originally announced March 2023.
-
ICASSP 2023 Deep Noise Suppression Challenge
Authors:
Harishchandra Dubey,
Ashkan Aazami,
Vishak Gopal,
Babak Naderi,
Sebastian Braun,
Ross Cutler,
Alex Ju,
Mehdi Zohourian,
Min Tang,
Hannes Gamper,
Mehrsa Golestaneh,
Robert Aichner
Abstract:
The Deep Speech Enhancement Challenge is the 5th edition of the deep noise suppression (DNS) challenges, organized at the ICASSP 2023 Signal Processing Grand Challenges. DNS challenges were organized during 2019-2023 to stimulate research in deep speech enhancement (DSE). Previous DNS challenges were organized at INTERSPEECH 2020, ICASSP 2021, INTERSPEECH 2021, and ICASSP 2022. From prior editions, we learnt that improving signal quality (SIG) is challenging, particularly in the presence of simultaneously active interfering talkers and noise. This challenge aims to develop models for joint denoising, dereverberation, and suppression of interfering talkers. When the primary talker wears a headphone, certain acoustic properties of their speech, such as the direct-to-reverberation ratio (DRR), signal-to-noise ratio (SNR), etc., make it possible to suppress neighboring talkers even without enrollment data for the primary talker. This motivated us to create two tracks for this challenge: (i) Track-1 Headset; (ii) Track-2 Speakerphone. Both tracks have fullband (48 kHz) training data and test sets, and each test clip has corresponding enrollment data (10-30 s duration) for the primary talker. Each track invited submissions of personalized and non-personalized models, all of which are evaluated through the same subjective evaluation. Most models submitted to the challenge were personalized models. The same team won both tracks, where the best models have improvements of 0.145 and 0.141 in the challenge's Score compared to the noisy blind test set.
Submitted 8 May, 2023; v1 submitted 20 March, 2023;
originally announced March 2023.
-
A photonic integrated chip platform for interlayer exciton valley routing
Authors:
Kishor K Mandal,
Yashika Gupta,
Mandar Sohoni,
Achanta Venu Gopal,
Anshuman Kumar
Abstract:
Interlayer excitons in two-dimensional semiconductor heterostructures show suppressed electron-hole overlap, resulting in longer radiative lifetimes compared to intralayer excitons. Such tightly bound interlayer excitons are relevant for important optoelectronic applications including light storage and quantum communication. Their optical accessibility is, however, limited due to their out-of-plane transition dipole moment. In this work, we design a CMOS-compatible photonic integrated chip platform for enhanced near-field coupling of these interlayer excitons with the whispering gallery modes of a microresonator, exploiting the high confinement of light in a small modal volume and the high quality factor of the system. Our platform allows for highly selective emission routing via engineering an asymmetric light transmission, which facilitates efficient readout and channeling of the excitonic valley state from such systems.
Submitted 7 March, 2022;
originally announced March 2022.
-
ICASSP 2022 Deep Noise Suppression Challenge
Authors:
Harishchandra Dubey,
Vishak Gopal,
Ross Cutler,
Ashkan Aazami,
Sergiy Matusevych,
Sebastian Braun,
Sefik Emre Eskimez,
Manthan Thakker,
Takuya Yoshioka,
Hannes Gamper,
Robert Aichner
Abstract:
The Deep Noise Suppression (DNS) challenge is designed to foster innovation in the area of noise suppression to achieve superior perceptual speech quality. This is the 4th DNS challenge, with the previous editions held at INTERSPEECH 2020, ICASSP 2021, and INTERSPEECH 2021. We open-source datasets and test sets for researchers to train their deep noise suppression models, as well as a subjective evaluation framework based on ITU-T P.835 to rate and rank-order the challenge entries. We provide access to DNSMOS P.835 and word accuracy (WAcc) APIs to challenge participants to help with iterative model improvements. In this challenge, we introduced the following changes: (i) Included mobile device scenarios in the blind test set; (ii) Included a personalized noise suppression track with baseline; (iii) Added WAcc as an objective metric; (iv) Included DNSMOS P.835; (v) Made the training datasets and test sets fullband (48 kHz). We use an average of WAcc and subjective scores P.835 SIG, BAK, and OVRL to get the final score for ranking the DNS models. We believe that as a research community, we still have a long way to go in achieving excellent speech quality in challenging noisy real-world scenarios.
Submitted 26 February, 2022;
originally announced February 2022.
-
Performance optimizations on deep noise suppression models
Authors:
Jerry Chee,
Sebastian Braun,
Vishak Gopal,
Ross Cutler
Abstract:
We study the role of magnitude structured pruning as an architecture search to speed up the inference time of a deep noise suppression (DNS) model. While deep learning approaches have been remarkably successful in enhancing audio quality, their increased complexity inhibits their deployment in real-time applications. We achieve up to a 7.25X inference speedup over the baseline, with a smooth model performance degradation. Ablation studies indicate that our proposed network re-parameterization (i.e., size per layer) is the major driver of the speedup, and that magnitude structured pruning performs comparably to directly training a model in the smaller size. We report inference speed because a parameter reduction does not necessarily yield a speedup, and we measure model quality using an accurate non-intrusive objective speech quality metric.
Submitted 8 October, 2021;
originally announced October 2021.
-
DNSMOS P.835: A Non-Intrusive Perceptual Objective Speech Quality Metric to Evaluate Noise Suppressors
Authors:
Chandan K A Reddy,
Vishak Gopal,
Ross Cutler
Abstract:
Human subjective evaluation is the gold standard to evaluate speech quality optimized for human perception. Perceptual objective metrics serve as a proxy for subjective scores. We have recently developed a non-intrusive speech quality metric called Deep Noise Suppression Mean Opinion Score (DNSMOS) using the scores from ITU-T Rec. P.808 subjective evaluation. The P.808 scores reflect the overall quality of the audio clip. ITU-T Rec. P.835 subjective evaluation framework gives the standalone quality scores of speech and background noise in addition to the overall quality. In this work, we train an objective metric based on P.835 human ratings that outputs 3 scores: i) speech quality (SIG), ii) background noise quality (BAK), and iii) the overall quality (OVRL) of the audio. The developed metric is highly correlated with human ratings, with a Pearson's Correlation Coefficient (PCC)=0.94 for SIG and PCC=0.98 for BAK and OVRL. This is the first non-intrusive P.835 predictor we are aware of. DNSMOS P.835 is made publicly available as an Azure service.
Submitted 4 February, 2022; v1 submitted 4 October, 2021;
originally announced October 2021.
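The PCC values quoted in the abstract above are Pearson correlation coefficients between predicted scores and human ratings. As a rough illustration of how such a figure is computed, here is a minimal Python sketch. The function name `pearson_cc` and the sample score lists are hypothetical and made up for this example; they are not taken from DNSMOS or its dataset.

```python
import math


def pearson_cc(predicted, human):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(predicted) != len(human) or len(predicted) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_h = sum(human) / n
    # Covariance and variances, all left unnormalised (the 1/n factors cancel).
    cov = sum((p - mean_p) * (h - mean_h) for p, h in zip(predicted, human))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_h = sum((h - mean_h) ** 2 for h in human)
    if var_p == 0 or var_h == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_h)


# Hypothetical per-clip scores on a 1-5 MOS scale (illustrative numbers only).
model_scores = [1.8, 2.9, 3.4, 4.1, 4.6]
human_scores = [2.0, 2.7, 3.5, 4.0, 4.8]
print(round(pearson_cc(model_scores, human_scores), 3))
```

A value near 1 means the objective metric ranks and spaces clips much as human raters do. It says nothing about a constant offset between the two scales.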
-
Integrated Kerr frequency comb-driven silicon photonic transmitter
Authors:
Anthony Rizzo,
Asher Novick,
Vignesh Gopal,
Bok Young Kim,
Xingchen Ji,
Stuart Daudlin,
Yoshitomo Okawachi,
Qixiang Cheng,
Michal Lipson,
Alexander L. Gaeta,
Keren Bergman
Abstract:
The exponential growth of computing needs for artificial intelligence and machine learning has had a dramatic impact on data centre energy consumption, which has risen to environmentally significant levels. Using light to send information between compute nodes can dramatically decrease this energy consumption while simultaneously increasing bandwidth. Through wavelength-division multiplexing with chip-based microresonator Kerr frequency combs, independent information channels can be encoded onto many distinct colours of light in the same optical fibre for massively parallel data transmission with low energy. While previous demonstrations have relied on benchtop equipment for filtering and modulating Kerr comb wavelength channels, data centre interconnects require a compact on-chip form factor for these operations. Here, we demonstrate the first integrated silicon photonic transmitter using a Kerr comb source. The demonstrated architecture is scalable to hundreds of wavelength channels, enabling a fundamentally new class of massively parallel terabit-scale optical interconnects for future green hyperscale data centres.
Submitted 8 September, 2021;
originally announced September 2021.
-
Intel HEXL: Accelerating Homomorphic Encryption with Intel AVX512-IFMA52
Authors:
Fabian Boemer,
Sejun Kim,
Gelila Seifu,
Fillipe D. M. de Souza,
Vinodh Gopal
Abstract:
Modern implementations of homomorphic encryption (HE) rely heavily on polynomial arithmetic over a finite field. This is particularly true of the CKKS, BFV, and BGV HE schemes. Two of the biggest performance bottlenecks in HE primitives and applications are polynomial modular multiplication and the forward and inverse number-theoretic transform (NTT). Here, we introduce Intel Homomorphic Encryption Acceleration Library (Intel HEXL), a C++ library which provides optimized implementations of polynomial arithmetic for Intel processors. Intel HEXL takes advantage of the recent Intel Advanced Vector Extensions 512 (Intel AVX512) instruction set to provide state-of-the-art implementations of the NTT and modular multiplication. On the forward and inverse NTT, Intel HEXL provides up to 7.2x and 6.7x speedup, respectively, over a native C++ implementation. Intel HEXL also provides up to 6.0x speedup on the element-wise vector-vector modular multiplication, and 1.7x speedup on the element-wise vector-scalar modular multiplication. Intel HEXL is available open-source at https://github.com/intel/hexl under the Apache 2.0 license and has been adopted by the Microsoft SEAL and PALISADE homomorphic encryption libraries.
Submitted 9 July, 2021; v1 submitted 30 March, 2021;
originally announced March 2021.
-
Interspeech 2021 Deep Noise Suppression Challenge
Authors:
Chandan K A Reddy,
Harishchandra Dubey,
Kazuhito Koishida,
Arun Nair,
Vishak Gopal,
Ross Cutler,
Sebastian Braun,
Hannes Gamper,
Robert Aichner,
Sriram Srinivasan
Abstract:
The Deep Noise Suppression (DNS) challenge is designed to foster innovation in the area of noise suppression to achieve superior perceptual speech quality. We recently organized a DNS challenge special session at INTERSPEECH and ICASSP 2020. We open-sourced training and test datasets for the wideband scenario. We also open-sourced a subjective evaluation framework based on ITU-T standard P.808, which was also used to evaluate participants of the challenge. Many researchers from academia and industry made significant contributions to push the field forward, yet even the best noise suppressor was far from achieving superior speech quality in challenging scenarios. In this version of the challenge organized at INTERSPEECH 2021, we are expanding both our training and test datasets to accommodate full band scenarios. The two tracks in this challenge will focus on real-time denoising for (i) wide band, and (ii) full band scenarios. We are also making available a reliable non-intrusive objective speech quality metric called DNSMOS for the participants to use during their development phase.
Submitted 4 April, 2021; v1 submitted 6 January, 2021;
originally announced January 2021.
-
Resonance: Replacing Software Constants with Context-Aware Models in Real-time Communication
Authors:
Jayant Gupchup,
Ashkan Aazami,
Yaran Fan,
Senja Filipi,
Tom Finley,
Scott Inglis,
Marcus Asteborg,
Luke Caroll,
Rajan Chari,
Markus Cozowicz,
Vishak Gopal,
Vinod Prakash,
Sasikanth Bendapudi,
Jack Gerrits,
Eric Lau,
Huazhou Liu,
Marco Rossi,
Dima Slobodianyk,
Dmitri Birjukov,
Matty Cooper,
Nilesh Javar,
Dmitriy Perednya,
Sriram Srinivasan,
John Langford,
Ross Cutler
, et al. (1 additional author not shown)
Abstract:
Large software systems tune hundreds of 'constants' to optimize their runtime performance. These values are commonly derived through intuition, lab tests, or A/B tests. A 'one-size-fits-all' approach is often sub-optimal as the best value depends on runtime context. In this paper, we provide an experimental approach to replace constants with learned contextual functions for Skype, a widely used real-time communication (RTC) application. We present Resonance, a system based on contextual bandits (CB). We describe experiences from three real-world experiments: applying it to the audio, video, and transport components in Skype. We surface a unique and practical challenge of performing machine learning (ML) inference in large software systems written using encapsulation principles. Finally, we open-source FeatureBroker, a library to reduce the friction in adopting ML models in such development environments.
Submitted 22 November, 2020;
originally announced November 2020.
-
DNSMOS: A Non-Intrusive Perceptual Objective Speech Quality metric to evaluate Noise Suppressors
Authors:
Chandan K A Reddy,
Vishak Gopal,
Ross Cutler
Abstract:
Human subjective evaluation is the gold standard to evaluate speech quality optimized for human perception. Perceptual objective metrics serve as a proxy for subjective scores. The conventional and widely used metrics require a reference clean speech signal, which is unavailable in real recordings. The no-reference approaches correlate poorly with human ratings and are not widely adopted in the research community. One of the biggest use cases of these perceptual objective metrics is to evaluate noise suppression algorithms. This paper introduces a multi-stage self-teaching based perceptual objective metric that is designed to evaluate noise suppressors. The proposed method generalizes well in challenging test conditions with a high correlation to human ratings.
Submitted 10 February, 2021; v1 submitted 28 October, 2020;
originally announced October 2020.
-
ICASSP 2021 Deep Noise Suppression Challenge
Authors:
Chandan K A Reddy,
Harishchandra Dubey,
Vishak Gopal,
Ross Cutler,
Sebastian Braun,
Hannes Gamper,
Robert Aichner,
Sriram Srinivasan
Abstract:
The Deep Noise Suppression (DNS) challenge is designed to foster innovation in the area of noise suppression to achieve superior perceptual speech quality. We recently organized a DNS challenge special session at INTERSPEECH 2020. We open sourced training and test datasets for researchers to train their noise suppression models. We also open sourced a subjective evaluation framework and used the tool to evaluate and pick the final winners. Many researchers from academia and industry made significant contributions to push the field forward. We also learned that as a research community, we still have a long way to go in achieving excellent speech quality in challenging noisy real-time conditions. In this challenge, we are expanding both our training and test datasets. There are two tracks with one focusing on real-time denoising and the other focusing on real-time personalized deep noise suppression. We also make a non-intrusive objective speech quality metric called DNSMOS available for participants to use during their development stages. The final evaluation will be based on subjective tests.
Submitted 26 October, 2020; v1 submitted 13 September, 2020;
originally announced September 2020.
-
Lumos: A Library for Diagnosing Metric Regressions in Web-Scale Applications
Authors:
Jamie Pool,
Ebrahim Beyrami,
Vishak Gopal,
Ashkan Aazami,
Jayant Gupchup,
Jeff Rowland,
Binlong Li,
Pritesh Kanani,
Ross Cutler,
Johannes Gehrke
Abstract:
Web-scale applications can ship code on a daily to weekly cadence. These applications rely on online metrics to monitor the health of new releases. Regressions in metric values need to be detected and diagnosed as early as possible to reduce the disruption to users and product owners. Regressions in metrics can surface due to a variety of reasons: genuine product regressions, changes in user population, and bias due to telemetry loss (or processing) are among the common causes. Diagnosing the cause of these metric regressions is costly for engineering teams as they need to invest time in finding the root cause of the issue as soon as possible. We present Lumos, a Python library built using the principles of AB testing to systematically diagnose metric regressions to automate such analysis. Lumos has been deployed across the component teams in Microsoft's Real-Time Communication applications Skype and Microsoft Teams. It has enabled engineering teams to detect 100s of real changes in metrics and reject 1000s of false alarms detected by anomaly detectors. The application of Lumos has resulted in freeing up as much as 95% of the time allocated to metric-based investigations. In this work, we open source Lumos and present our results from applying it to two different components within the RTC group over millions of sessions. This general library can be coupled with any production system to manage the volume of alerting efficiently.
Submitted 23 June, 2020;
originally announced June 2020.
-
The INTERSPEECH 2020 Deep Noise Suppression Challenge: Datasets, Subjective Testing Framework, and Challenge Results
Authors:
Chandan K. A. Reddy,
Vishak Gopal,
Ross Cutler,
Ebrahim Beyrami,
Roger Cheng,
Harishchandra Dubey,
Sergiy Matusevych,
Robert Aichner,
Ashkan Aazami,
Sebastian Braun,
Puneet Rana,
Sriram Srinivasan,
Johannes Gehrke
Abstract:
The INTERSPEECH 2020 Deep Noise Suppression (DNS) Challenge is intended to promote collaborative research in real-time single-channel Speech Enhancement aimed at maximizing the subjective (perceptual) quality of the enhanced speech. A typical approach to evaluating noise suppression methods is to use objective metrics on a test set obtained by splitting the original dataset. While the performance is good on the synthetic test set, the model performance often degrades significantly on real recordings. Also, most of the conventional objective metrics do not correlate well with subjective tests, and lab subjective tests are not scalable for a large test set. In this challenge, we open-sourced a large clean speech and noise corpus for training the noise suppression models and a test set representative of real-world scenarios, consisting of both synthetic and real recordings. We also open-sourced an online subjective test framework based on ITU-T P.808 for researchers to reliably test their developments. We evaluated the results using P.808 on a blind test set. The results and the key learnings from the challenge are discussed. The datasets and scripts are available at https://github.com/microsoft/DNS-Challenge.
Submitted 18 October, 2020; v1 submitted 16 May, 2020;
originally announced May 2020.
-
The INTERSPEECH 2020 Deep Noise Suppression Challenge: Datasets, Subjective Speech Quality and Testing Framework
Authors:
Chandan K. A. Reddy,
Ebrahim Beyrami,
Harishchandra Dubey,
Vishak Gopal,
Roger Cheng,
Ross Cutler,
Sergiy Matusevych,
Robert Aichner,
Ashkan Aazami,
Sebastian Braun,
Puneet Rana,
Sriram Srinivasan,
Johannes Gehrke
Abstract:
The INTERSPEECH 2020 Deep Noise Suppression Challenge is intended to promote collaborative research in real-time single-channel Speech Enhancement aimed at maximizing the subjective (perceptual) quality of the enhanced speech. A typical approach to evaluating noise suppression methods is to use objective metrics on a test set obtained by splitting the original dataset. Many publications report reasonable performance on a synthetic test set drawn from the same distribution as the training set. However, the model performance often degrades significantly on real recordings. Also, most of the conventional objective metrics do not correlate well with subjective tests, and lab subjective tests are not scalable for a large test set. In this challenge, we open-source a large clean speech and noise corpus for training the noise suppression models and a test set representative of real-world scenarios, consisting of both synthetic and real recordings. We also open-source an online subjective test framework based on ITU-T P.808 for researchers to quickly test their developments. The winners of this challenge will be selected based on subjective evaluation on a representative test set using the P.808 framework.
Submitted 19 April, 2020; v1 submitted 23 January, 2020;
originally announced January 2020.
-
Assessing the Impact of Gamification on Self-Directed Learning in Medical Students
Authors:
De-Zhang Lee,
Vik Gopal,
Jia-Min Chan,
Li-Shia Ng,
Eng-Tat Ang
Abstract:
Gamification refers to the process of adding game elements to a task. Of late, this process has been introduced in pedagogical settings to capture the attention and interest of students. In our study, we apply the process to Anatomy students and assess the impact on their learning behaviour. We apply a novel path analysis to assess the change in their learning behaviour after a semester of games-enhanced small group sessions. We find that too many games could reduce their enjoyment of the underlying learning. However, we also find that students appreciate a change in the traditional model of instruction: they embraced peer-to-peer learning in the classroom.
Submitted 22 October, 2018;
originally announced October 2018.
-
Visible absorbing TiO2 thin films by physical deposition methods
Authors:
Litty Varghese,
Anuradha Patra,
Biswajit Mishra,
Deepa Khushalani,
Achanta Venu Gopal
Abstract:
Titanium dioxide is one of the most widely used wide bandgap materials. However, the TiO2 deposited on a substrate is not always transparent, leading to a loss in efficiency of the device, especially the photo response. Herein, we show that atomic layer deposition (ALD) and sputtered TiO2 thin films can be highly absorbing in the visible region. While in ALD the mechanism is purported to be due to oxygen deficiency, intriguingly, in sputtered films it has been observed that an oxygen-rich atmosphere in fact leads to visible absorption. We show that the resistivity of the film can be controlled through the oxygen content during deposition, and we evaluate the photocatalysis response of both the ALD and sputtered films. High resolution TEM and STEM studies show that the origin of visible absorption could be due to the presence of nanoparticles with surface defects inside the amorphous film.
Submitted 31 July, 2018;
originally announced July 2018.
-
A Constrained Conditional Likelihood Approach for Estimating the Means of Selected Populations
Authors:
Claudio Fuentes,
Vik Gopal
Abstract:
Given p independent normal populations, we consider the problem of estimating the means of those populations that, based on the observed data, give the strongest signals. We explicitly condition on the ranking of the sample means, and consider a constrained conditional maximum likelihood (CCMLE) approach, avoiding the use of any priors and of any sparsity requirement between the population means. Our results show that if the observed means are too close together, we should in fact use the grand mean to estimate the mean of the population with the larger sample mean. If they are separated by more than a certain threshold, we should shrink the observed means towards each other. As intuition suggests, it is only if the observed means are far apart that we should conclude that the magnitude of separation and consequent ranking are not due to chance. Unlike other methods, our approach does not need to pre-specify the number of selected populations, and the proposed CCMLE is able to perform simultaneous inference. Our method, which is conceptually straightforward, can be easily adapted to incorporate other selection criteria.
Keywords: Selected populations, Maximum likelihood, Constrained MLE, Post-selection inference
Submitted 24 February, 2017;
originally announced February 2017.
-
Coherent perfect absorption mediated enhancement and optical bistability in phase conjugation
Authors:
K. Nireekshan Reddy,
Achanta Venu Gopal,
S. Dutta Gupta
Abstract:
We study phase conjugation in a nonlinear composite slab when the counter-propagating pump waves are completely absorbed by means of coherent perfect absorption. Under the undepleted pump approximation, the coupling constant and the phase-conjugated reflectivity are shown to undergo a substantial increase and exhibit a multivalued response. The effect can be used for efficient switching of the phase-conjugated reflectivity in photonic circuits.
Submitted 26 November, 2016; v1 submitted 21 November, 2016;
originally announced November 2016.
-
Transverse spin with coupled plasmons
Authors:
Samyobrata Mukherjee,
A V Gopal,
S Dutta Gupta
Abstract:
We study theoretically the transverse spin associated with the eigenmodes of a thin metal film embedded in a dielectric. We show that the transverse spin has a direct dependence on the nature and strength of the coupling, leading to two distinct branches for the long- and short-range modes. We show that the short-range mode exhibits larger extraordinary spin because of its more 'structured' nature due to higher decay in propagation. In contrast to some of the earlier studies, calculations are performed retaining the full lossy character of the metal. In the limit of vanishing losses, we present analytical results for the extraordinary spin for both coupled modes. The results can have direct implications for enhancing the elusive transverse spin by exploiting coupled plasmon structures.
Submitted 14 October, 2016; v1 submitted 3 October, 2016;
originally announced October 2016.
-
A Spatio-Temporal Modeling Approach for Weather Radar Reflectivity Data and Its Applications in Tropical Southeast Asia
Authors:
Xiao Liu,
Viknesswaran Gopal,
Jayant Kalagnanam
Abstract:
Weather radar echoes, correlated in both space and time, are the most important input data for short-term precipitation forecasting. Motivated by real datasets, this paper is concerned with the spatio-temporal modeling of two-dimensional radar reflectivity fields from a sequence of radar images. Under a Lagrangian integration scheme, we model the radar reflectivity data by a spatio-temporal conditional autoregressive process which is driven by two hidden sub-processes. The first sub-process is the dynamic velocity field, which determines the motion of the weather system, while the second sub-process governs the growth or decay of the strength of radar reflectivity. The proposed method is demonstrated, and compared with existing methods, using real radar data collected from tropical Southeast Asia. Note that, since tropical storms are known to be highly chaotic and extremely difficult to predict, we focus only on the modeling of reflectivity data within a short period of time and consider the short-term prediction problem based on the proposed model. This is often referred to as the nowcasting problem in the meteorology community.
Submitted 30 September, 2016;
originally announced September 2016.
-
Single and multiband THz Metamaterial Polarizers
Authors:
Bagvanth Reddy Sangala,
Arvind Nagarajan,
Prathmesh Deshmukh,
Harshad Surdi,
Goutam Rana,
Achanta Venu Gopal,
S. S. Prabhu
Abstract:
We report single and multiband linear polarizers for terahertz (THz) frequencies using cut-wire metamaterials (MM). The MMs are designed by the finite element method, fabricated by electron beam lithography, and characterized by THz time-domain spectroscopy. The MM unit cells consist of single or multiple length cut-wire pads of gold on semi-insulating Gallium Arsenide for single or multiple band polarizers. The dependence of the resonance frequency of the single band polarizer on the length of the cut-wires is explained based on a transmission line model.
Submitted 12 February, 2015;
originally announced February 2015.
-
A Broadband Dipolar Resonance in THz Metamaterials
Authors:
Bagvanth Reddy Sangala,
Harshad Surdi,
Achanta Venu Gopal,
S. S. Prabhu
Abstract:
We demonstrate a THz metamaterial with a broadband dipole resonance originating from the hybridization of LC resonances. The structure, optimized by finite element method simulations, is fabricated by electron beam lithography and characterized by terahertz time-domain spectroscopy. Numerically, we found that when two LC metamaterial resonators are brought together, an electric dipole resonance arises in addition to the LC resonances. We observed a strong dependence of the width of these resonances on the separation between the resonators. This dependence can be explained based on series and parallel RLC circuit analogies. The broadband dipole resonance appears when both the resonators are fused together. The metamaterial has a stopband with FWHM of 0.47 THz centered at 1.12 THz. The experimentally measured band features are in reasonable agreement with the simulated ones. The experimental power extinction ratio of THz in the stopbands is found to be 15 dB.
Submitted 24 November, 2014;
originally announced November 2014.
-
Plasmonic quasicrystals for designable ultra broadband transmission enhancement and second harmonic generation
Authors:
Sachin Kasture,
Ajith P R,
V J Yallapragada,
Raj Patil,
Nikesh V. V.,
Gajendra Mulay,
Achanta Venu Gopal
Abstract:
Quasicrystals are intriguing as they exhibit rotational symmetry and long range ordering but lack translational symmetry. Two-dimensional metal-dielectric patterns are of interest for exploiting surface plasmon polariton (SPP) mediated local field enhancement and for nearly dispersionless SPP modes. In plasmonic crystals, the orientation and periodicity of the pattern dictate the polarization response and the discrete plasmon resonances, while the interfaces define the plasmon dispersion. However, the unique properties of plasmonic quasicrystals lead to polarization independence, designable k-space and broadband transmission enhancement due to SPP mediation. These are useful in many applications such as energy harvesting, nonlinear optics and quantum plasmonics. We demonstrate the design and fabrication of large area quasicrystal air hole patterns of pi/5 symmetry in a metal film, in which broadband, launch angle and polarization independent transmission enhancement as well as broadband second harmonic generation are observed. Designable transmission response, other symmetries and tilings are possible.
Submitted 12 September, 2013;
originally announced September 2013.
-
Superluminal propagation and broadband omnidirectional antireflection in optical reflectionless potentials
Authors:
L. V. Thekkekara,
Achanta Venu Gopal,
Sachin Kasture,
Gajendra Mulay,
S. Dutta Gupta
Abstract:
Reflectionless potentials (RPs) represent a class of potentials that offer total transmission in the context of one dimensional scattering. Optical realizations of RPs in a stratified medium can exhibit a broadband omnidirectional antireflection property. In addition to the antireflection property, RPs are also expected to demonstrate negative delay. We designed refractive index profiles conforming to RPs and realized them in stratified media consisting of Al2O3 and TiO2 heterolayers. In these structures we observed < 1% reflection over the broad wavelength range of 350 nm to 2500 nm for angles of incidence from 0 to 50 degrees. The observed reflection and transmission responses of the RPs are polarization independent. A negative delay of about 31 fs with discernible pulse narrowing was observed in passage through two RPs. These RPs can be of interest for optical instrumentation as broadband, omnidirectional antireflection coatings, as well as in pulse control and transmission applications such as delay lines.
Submitted 2 September, 2013;
originally announced September 2013.
-
Nonlinearity Induced Critical Coupling
Authors:
K. Nireekshan Reddy,
Achanta Venu Gopal,
S. Dutta Gupta
Abstract:
We study a critically coupled system (Opt. Lett. 32, 1483 (2007)) with a Kerr-nonlinear spacer layer. Nonlinearity is shown to inhibit null-scattering in a critically coupled system at low powers. However, a system detuned from critical coupling can exhibit near-complete suppression of scattering by means of nonlinearity-induced changes in refractive index. Our studies reveal clearly an important aspect of critical coupling as a delicate balance in both the amplitude and the phase relations, while a nonlinear resonance in dispersive bistability concerns only the phase.
Submitted 25 May, 2013;
originally announced May 2013.
-
Coherent perfect absorption mediated anomalous reflection and refraction
Authors:
Shourya Dutta-Gupta,
Rahul Deshmukh,
Achanta Venu Gopal,
Olivier J. F. Martin,
S. Dutta Gupta
Abstract:
We demonstrate bending of light on the same side of the normal in a free-standing corrugated metal film under bi-directional illumination. Coherent perfect absorption (CPA) is exploited to suppress the specular zeroth order, leading to effective back-bending of light into the '-1' order, while the '+1' order is resonant with the surface mode. The effect is shown to be phase sensitive, yielding CPA and superscattering in the same geometry.
Submitted 23 July, 2012;
originally announced July 2012.
-
Simultaneous SNP identification in association studies with missing data
Authors:
Zhen Li,
Vikneswaran Gopal,
Xiaobo Li,
John M. Davis,
George Casella
Abstract:
Association testing aims to discover the underlying relationship between genotypes (usually Single Nucleotide Polymorphisms, or SNPs) and phenotypes (attributes, or traits). The typically large data sets used in association testing often contain missing values. Standard statistical methods either impute the missing values using relatively simple assumptions, or delete them, or both, which can generate biased results. Here we describe the Bayesian hierarchical model BAMD (Bayesian Association with Missing Data). BAMD is a Gibbs sampler, in which missing values are multiply imputed based upon all of the available information in the data set. We estimate the parameters and prove that updating one SNP at each iteration preserves the ergodic property of the Markov chain, and at the same time improves computational speed. We also implement a model selection option in BAMD, which enables potential detection of SNP interactions. Simulations show that unbiased estimates of SNP effects are recovered with missing genotype data. Also, we validate associations between SNPs and a carbon isotope discrimination phenotype that were previously reported using a family based method, and discover an additional SNP associated with the trait. BAMD is available as an R-package from http://cran.r-project.org/package=BAMD
Submitted 3 July, 2012; v1 submitted 2 July, 2012;
originally announced July 2012.
-
Interfacing a quantum dot spin with a photonic circuit
Authors:
Isaac J. Luxmoore,
Nicholas A. Wasley,
Andrew J. Ramsay,
Arthur C. T. Thijssen,
Ruth Oulton,
Maxime Hugues,
Sachin Kasture,
Achanta V. Gopal,
A. Mark Fox,
Maurice S. Skolnick
Abstract:
A scalable optical quantum information processor is likely to be a waveguide circuit with integrated sources, detectors, and either deterministic quantum-logic or quantum memory elements. With microsecond coherence times, ultrafast coherent control, and lifetime-limited transitions, semiconductor quantum-dot spins are a natural choice for the static qubits. However, their integration with flying photonic qubits requires an on-chip spin-photon interface, which presents a fundamental problem: the spin-state is measured and controlled via circularly-polarised photons, but waveguides support only linear polarisation. We demonstrate here a solution based on two orthogonal photonic nanowires, in which the spin-state is mapped to a path-encoded photon, thus providing a blueprint for a scalable spin-photon network. Furthermore, for some devices we observe that the circular polarisation state is directly mapped to orthogonal nanowires. This result, which is physically surprising for a non-chiral structure, is shown to be related to the nano-positioning of the quantum-dot with respect to the photonic circuit.
Submitted 14 June, 2012;
originally announced June 2012.
-
Modulation of a surface plasmon-polariton resonance by sub-terahertz diffracted coherent phonons
Authors:
Christian Brüggemann,
Andrey V. Akimov,
Boris A. Glavin,
Vladimir I. Belotelov,
Ilya A. Akimov,
Jasmin Jäger,
Sachin Kasture,
Achanta Venu Gopal,
Arvind S. Vengurlekar,
Dmitri R. Yakovlev,
Anthony J. Kent,
Manfred Bayer
Abstract:
Coherent sub-THz phonons incident on a gold grating that is deposited on a dielectric substrate undergo diffraction and thereby induce an alteration of the surface plasmon-polariton resonance. This results in efficient high-frequency modulation (up to 110 GHz) of the structure's reflectivity for visible light in the vicinity of the plasmon-polariton resonance. High modulation efficiency is achieved by designing a periodic nanostructure which provides both plasmon-polariton and phonon resonances. Our theoretical analysis shows that the dynamical alteration of the plasmon-polariton resonance is governed by modulation of the slit widths within the grating at the frequencies of higher-order phonon resonances.
Submitted 14 June, 2012; v1 submitted 13 June, 2012;
originally announced June 2012.
-
Near dispersion-less surface plasmon polariton resonances at a metal-dielectric interface
Authors:
Sachin Kasture,
P. Mandal,
Amandev Singh,
Andrew Ramsay,
Arvind S. Vengurlekar,
S. Dutta Gupta,
V. I. Belotelov,
Achanta Venu Gopal
Abstract:
Omni-directional light coupling to surface plasmon polariton (SPP) modes to make use of plasmon mediated near-field enhancement is challenging. We report the possibility of near dispersion-less modes in structures with unpatterned metal-dielectric interfaces having 2-D dielectric patterns on top. We show that the position and dispersion of the excited modes can be controlled by the excitation geometry and the 2-D pattern. The anti-crossings resulting from the in-plane coupling of different SPP modes are also shown.
Submitted 1 August, 2012; v1 submitted 22 December, 2011;
originally announced December 2011.
-
Plasmonic crystals for ultrafast nanophotonics: Optical switching of surface plasmon polaritons
Authors:
M. Pohl,
V. I. Belotelov,
I. A. Akimov,
S. Kasture,
A. S. Vengurlekar,
A. V. Gopal,
A. K. Zvezdin,
D. R. Yakovlev,
M. Bayer
Abstract:
We demonstrate that the dispersion of surface plasmon polaritons in a periodically perforated gold film can be efficiently manipulated by femtosecond laser pulses with wavelengths far from the intrinsic resonances of gold. Using a time- and frequency-resolved pump-probe technique, we observe shifting of the plasmon polariton resonances with response times from 200 to 800 fs depending on the probe photon energy, through which we obtain comprehensive insight into the electron dynamics in gold. We show that Wood anomalies in the optical spectra provide pronounced resonances in differential transmission and reflection with magnitudes up to 3% for moderate pump fluences of 0.5 mJ/cm^2.
Submitted 7 December, 2011;
originally announced December 2011.
-
Effect of detuning on the phonon induced dephasing of optically driven InGaAs/GaAs quantum dots
Authors:
A. J. Ramsay,
T. M. Godden,
S. J. Boyle,
E. M. Gauger,
A. Nazir,
B. W. Lovett,
Achanta Venu Gopal,
A. M. Fox,
M. S. Skolnick
Abstract:
Recently, longitudinal acoustic phonons have been identified as the main source of the intensity damping observed in Rabi rotation measurements of the ground-state exciton of a single InAs/GaAs quantum dot. Here we report experiments on intensity damped Rabi rotations in the case of detuned laser pulses. The results have implications for the coherent optical control of both excitons and spins using detuned laser pulses.
Submitted 1 June, 2011;
originally announced June 2011.
-
Extraordinary Magnetooptics in Plasmonic Crystals
Authors:
V. I. Belotelov,
I. A. Akimov,
M. Poh,
V. A. Kotov,
S. Kasture,
A. S. Vengurlekar,
A. V. Gopal,
D. Yakovlev,
A. K. Zvezdin,
M. Bayer
Abstract:
Plasmonics has been attracting considerable interest as it allows localization of light at nanoscale dimensions. A breakthrough in integrated nanophotonics can be obtained by fabricating plasmonic functional materials. Such systems may show a rich variety of novel phenomena and also have huge application potential. In particular magnetooptical materials are appealing as they may provide ultrafast control of laser light and surface plasmons via an external magnetic field. Here we demonstrate a new magnetooptical material: a one-dimensional plasmonic crystal formed by a periodically perforated noble metal film on top of a ferromagnetic dielectric film. It provides giant Faraday and Kerr effects as proved by the observation of enhancement of the transverse Kerr effect near Ebbesen's extraordinary transmission peaks by three orders of magnitude. Surface plasmon polaritons play a decisive role in this enhancement, as the Kerr effect depends sensitively on their properties. The plasmonic crystal can be operated in transmission, so that it may be implemented in devices for telecommunication, plasmonic circuitry, magnetic field sensing and all-optical magnetic data storage.
Submitted 10 November, 2010;
originally announced November 2010.
-
Damping of Exciton Rabi Rotations by Acoustic Phonons in Optically Excited InGaAs/GaAs Quantum Dots
Authors:
A. J. Ramsay,
Achanta Venu Gopal,
E. M. Gauger,
A. Nazir,
B. W. Lovett,
A. M. Fox,
M. S. Skolnick
Abstract:
We report experimental evidence identifying acoustic phonons as the principal source of the excitation-induced-dephasing (EID) responsible for the intensity damping of quantum dot excitonic Rabi rotations. The rate of EID is extracted from temperature dependent Rabi rotation measurements of the ground-state excitonic transition, and is found to be in close quantitative agreement with an acoustic-phonon model.
Submitted 29 January, 2010; v1 submitted 30 March, 2009;
originally announced March 2009.
-
Light-Shift Imbalance Induced Blockade of Collective Excitations Beyond the Lowest Order
Authors:
M. S. Shahriar,
P. Pradhan,
G. S. Pati,
V. Gopal,
K. Salit
Abstract:
Current proposals focusing on neutral atoms for quantum computing are mostly based on using single atoms as quantum bits (qubits), while using cavity induced coupling or dipole-dipole interaction for two-qubit operations. An alternative approach is to use atomic ensembles as qubits. However, when an atomic ensemble is excited, by a laser beam matched to a two-level transition (or a Raman transition) for example, it leads to a cascade of many states as more and more photons are absorbed^1. In order to make use of an ensemble as a qubit, it is necessary to disrupt this cascade, and restrict the excitation to the absorption (and emission) of a single photon only. Here, we show how this can be achieved by using a new type of blockade mechanism, based on the light-shift imbalance (LSI) in a Raman transition. We first describe a simple example illustrating the concept of light-shift imbalance induced blockade (LSIIB) using a multi-level structure in a single atom, and show verifications of the analytic prediction using numerical simulations. We then extend this model to show how a blockade can be realized by using LSI in the excitation of an ensemble. Specifically, we show how the LSIIB process enables one to treat the ensemble as a two-level atom that undergoes fully deterministic Rabi oscillations between two collective quantum states, while suppressing excitations of higher order collective states.
Submitted 17 April, 2006; v1 submitted 17 April, 2006;
originally announced April 2006.
-
Enhancement of interferometric precision using fast light
Authors:
M. S. Shahriar,
R. Tripathi,
G. S. Pati,
V. Gopal,
M. Messall,
K. Salit
Abstract:
We show that anomalous dispersion characteristic of fast-light can be used to enhance the sensitivity of optical interferometry under certain conditions. In particular, we show that a dual-chamber Fabry-Perot interferometer with a shared mirror-pair can be used in a way so that its sensitivity is increased by operating near the critically anomalous dispersion condition where the group index is much less than unity. The enhancement factor can be as high as 108 for realistic conditions. The process of bi-frequency pumped Raman gain in a lambda-type atomic medium can be used to achieve this effect.
Submitted 14 July, 2005;
originally announced July 2005.
-
Ultrahigh Precision Absolute and Relative Rotation Sensing using Fast and Slow Light
Authors:
M. S. Shahriar,
G. S. Pati,
R. Tripathi,
V. Gopal,
M. Messall,
K. Salit
Abstract:
We describe a resonator based optical gyroscope whose sensitivity for measuring absolute rotation is enhanced via use of the anomalous dispersion characteristic of superluminal light propagation. The enhancement is given by the inverse of the group index, saturating to a bound determined by the group velocity dispersion. We also show how the offsetting effect of the concomitant broadening of the resonator linewidth may be circumvented by using an active cavity. For realistic conditions, the enhancement factor is as high as 106. We also show how normal dispersion used for slow light can enhance relative rotation sensing in a specially designed Sagnac interferometer, with the enhancement given by the slowing factor.
Submitted 2 March, 2007; v1 submitted 25 May, 2005;
originally announced May 2005.
-
Wavelength Teleportation via Distant Quantum Entanglement
Authors:
M. S. Shahriar,
P. Pradhan,
V. Gopal,
G. Pati,
G. C. Cardoso
Abstract:
Recently, we have shown theoretically [1] as well as experimentally [2] how the phase of an electromagnetic field can be determined by measuring the population of either of the two states of a two-level atomic system excited by this field, via the so-called Bloch-Siegert oscillation resulting from the interference between the co- and counter-rotating excitations. Here, we show how a degenerate entanglement, created without transmitting any timing signal, can be used to teleport this phase information. This phase-teleportation process may be applied to achieve wavelength teleportation, which in turn may be used for frequency-locking of remote oscillators.
Submitted 24 September, 2003; v1 submitted 10 September, 2003;
originally announced September 2003.
-
Small-worlds: How and why
Authors:
Nisha Mathias,
Venkatesh Gopal
Abstract:
We investigate small-world networks from the point of view of their origin. While the characteristics of small-world networks are now fairly well understood, there is as yet no work on what drives the emergence of such a network architecture. In situations such as neural or transportation networks, where a physical distance between the nodes of the network exists, we study whether the small-world topology arises as a consequence of a tradeoff between maximal connectivity and minimal wiring. Using simulated annealing, we study the properties of a randomly rewired network as the relative tradeoff between wiring and connectivity is varied. When the network seeks to minimize wiring, a regular graph results. At the other extreme, when connectivity is maximized, a near random network is obtained. In the intermediate regime, a small-world network is formed. However, unlike the model of Watts and Strogatz (Nature 393, 440 (1998)), we find an alternate route to small-world behaviour through the formation of hubs, small clusters where one vertex is connected to a large number of neighbours.
Submitted 5 February, 2000;
originally announced February 2000.
-
The backscattering of polarised light from turbid media - An analysis of the azimuthal intensity variations and its implications for the position of the source of diffusing radiation
Authors:
Venkatesh Gopal,
Hema Ramachandran,
A. K. Sood
Abstract:
We study the azimuthal variation in the backscattered intensity that is seen when polarised light is scattered by a turbid medium. We present experimental observations of these intensity variations in colloidal suspensions over a wide range of optical densities. For a medium composed of spherical scatterers, we have developed, using Mie scattering theory and Monte Carlo simulations of photon transport, a model which calculates the constant intensity contours of the backscattered intensity. Comparisons of calculated and experimentally obtained contours show very good agreement. To our knowledge, this is the first model to provide a quantitative comparison with experimental data. Close to the exact backscattering direction, where we have made our intensity measurements, we show that the patterns are formed by what we have called `reflected snake photons'. These are photons that have been backscattered once and have maintained their direction of propagation thereafter until they exit the medium. We also find that the reflected snake photons originate within a depth of about $4l^{*}$ from the point of entry of the incident beam, where $l^{*}$ is the photon transport mean free path. Further, in a novel approach, we have used these patterns as a probe of the assumptions underlying the diffusion approximation and present new results on the position of the apparent source of diffusing radiation within the medium. Possible applications are also discussed.
Submitted 15 June, 1999;
originally announced June 1999.
-
Imaging in turbid media using quasi-ballistic photons
Authors:
Venkatesh Gopal,
Sushil Mujumdar,
Hema Ramachandran,
A. K. Sood
Abstract:
We study, by means of experiments and Monte Carlo simulations, the scattering of light in random media to determine the distance up to which photons travel along almost undeviated paths within a scattering medium, and are therefore capable of casting a shadow of an opaque inclusion embedded within the medium. Such photons are isolated by polarisation discrimination, wherein the plane of linear polarisation of the input light is continuously rotated and the polarisation preserving component of the emerging light is extracted by means of a Fourier transform. This technique is a software implementation of lock-in detection. We find that images may be recovered to a depth far in excess of what is predicted by the diffusion theory of photon propagation. To understand our experimental results, we perform Monte Carlo simulations to model the random walk behaviour of the multiply scattered photons. We present a new definition of a diffusing photon in terms of the memory of its initial direction of propagation, which we then quantify in terms of an angular correlation function. This redefinition yields the penetration depth of the polarisation preserving photons. Based on these results, we have formulated a model to understand shadow formation in a turbid medium, the predictions of which are in good agreement with our experimental results.
Submitted 13 June, 1999;
originally announced June 1999.