-
MS-IQA: A Multi-Scale Feature Fusion Network for PET/CT Image Quality Assessment
Authors:
Siqiao Li,
Chen Hui,
Wei Zhang,
Rui Liang,
Chenyue Song,
Feng Jiang,
Haiqi Zhu,
Zhixuan Li,
Hong Huang,
Xiang Li
Abstract:
Positron Emission Tomography / Computed Tomography (PET/CT) plays a critical role in medical imaging, combining functional and anatomical information to aid in accurate diagnosis. However, image quality degradation due to noise, compression and other factors could potentially lead to diagnostic uncertainty and increase the risk of misdiagnosis. When evaluating the quality of a PET/CT image, both l…
▽ More
Positron Emission Tomography / Computed Tomography (PET/CT) plays a critical role in medical imaging, combining functional and anatomical information to aid in accurate diagnosis. However, image quality degradation due to noise, compression and other factors could potentially lead to diagnostic uncertainty and increase the risk of misdiagnosis. When evaluating the quality of a PET/CT image, both low-level features like distortions and high-level features like organ anatomical structures affect the diagnostic value of the image. However, existing medical image quality assessment (IQA) methods are unable to account for both feature types simultaneously. In this work, we propose MS-IQA, a novel multi-scale feature fusion network for PET/CT IQA, which utilizes multi-scale features from various intermediate layers of ResNet and Swin Transformer, enhancing its ability of perceiving both local and global information. In addition, a multi-scale feature fusion module is also introduced to effectively combine high-level and low-level information through a dynamically weighted channel attention mechanism. Finally, to fill the blank of PET/CT IQA dataset, we construct PET-CT-IQA-DS, a dataset containing 2,700 varying-quality PET/CT images with quality scores assigned by radiologists. Experiments on our dataset and the publicly available LDCTIQAC2023 dataset demonstrate that our proposed model has achieved superior performance against existing state-of-the-art methods in various IQA metrics. This work provides an accurate and efficient IQA method for PET/CT. Our code and dataset are available at https://github.com/MS-IQA/MS-IQA/.
△ Less
Submitted 25 June, 2025;
originally announced June 2025.
-
LVPNet: A Latent-variable-based Prediction-driven End-to-end Framework for Lossless Compression of Medical Images
Authors:
Chenyue Song,
Chen Hui,
Qing Lin,
Wei Zhang,
Siqiao Li,
Haiqi Zhu,
Zhixuan Li,
Shengping Zhang,
Shaohui Liu,
Feng Jiang,
Xiang Li
Abstract:
Autoregressive Initial Bits is a framework that integrates sub-image autoregression and latent variable modeling, demonstrating its advantages in lossless medical image compression. However, in existing methods, the image segmentation process leads to an even distribution of latent variable information across each sub-image, which in turn causes posterior collapse and inefficient utilization of la…
▽ More
Autoregressive Initial Bits is a framework that integrates sub-image autoregression and latent variable modeling, demonstrating its advantages in lossless medical image compression. However, in existing methods, the image segmentation process leads to an even distribution of latent variable information across each sub-image, which in turn causes posterior collapse and inefficient utilization of latent variables. To deal with these issues, we propose a prediction-based end-to-end lossless medical image compression method named LVPNet, leveraging global latent variables to predict pixel values and encoding predicted probabilities for lossless compression. Specifically, we introduce the Global Multi-scale Sensing Module (GMSM), which extracts compact and informative latent representations from the entire image, effectively capturing spatial dependencies within the latent space. Furthermore, to mitigate the information loss introduced during quantization, we propose the Quantization Compensation Module (QCM), which learns the distribution of quantization errors and refines the quantized features to compensate for quantization loss. Extensive experiments on challenging benchmarks demonstrate that our method achieves superior compression efficiency compared to state-of-the-art lossless image compression approaches, while maintaining competitive inference speed. The code is at https://github.com/scy-Jackel/LVPNet.
△ Less
Submitted 25 June, 2025; v1 submitted 22 June, 2025;
originally announced June 2025.
-
BPCLIP: A Bottom-up Image Quality Assessment from Distortion to Semantics Based on CLIP
Authors:
Chenyue Song,
Chen Hui,
Wei Zhang,
Haiqi Zhu,
Shaohui Liu,
Hong Huang,
Feng Jiang
Abstract:
Image Quality Assessment (IQA) aims to evaluate the perceptual quality of images based on human subjective perception. Existing methods generally combine multiscale features to achieve high performance, but most rely on straightforward linear fusion of these features, which may not adequately capture the impact of distortions on semantic content. To address this, we propose a bottom-up image quali…
▽ More
Image Quality Assessment (IQA) aims to evaluate the perceptual quality of images based on human subjective perception. Existing methods generally combine multiscale features to achieve high performance, but most rely on straightforward linear fusion of these features, which may not adequately capture the impact of distortions on semantic content. To address this, we propose a bottom-up image quality assessment approach based on the Contrastive Language-Image Pre-training (CLIP, a recently proposed model that aligns images and text in a shared feature space), named BPCLIP, which progressively extracts the impact of low-level distortions on high-level semantics. Specifically, we utilize an encoder to extract multiscale features from the input image and introduce a bottom-up multiscale cross attention module designed to capture the relationships between shallow and deep features. In addition, by incorporating 40 image quality adjectives across six distinct dimensions, we enable the pre-trained CLIP text encoder to generate representations of the intrinsic quality of the image, thereby strengthening the connection between image quality perception and human language. Our method achieves superior results on most public Full-Reference (FR) and No-Reference (NR) IQA benchmarks, while demonstrating greater robustness.
△ Less
Submitted 22 June, 2025;
originally announced June 2025.
-
MVQA: Mamba with Unified Sampling for Efficient Video Quality Assessment
Authors:
Yachun Mi,
Yu Li,
Weicheng Meng,
Chaofeng Chen,
Chen Hui,
Shaohui Liu
Abstract:
The rapid growth of long-duration, high-definition videos has made efficient video quality assessment (VQA) a critical challenge. Existing research typically tackles this problem through two main strategies: reducing model parameters and resampling inputs. However, light-weight Convolution Neural Networks (CNN) and Transformers often struggle to balance efficiency with high performance due to the…
▽ More
The rapid growth of long-duration, high-definition videos has made efficient video quality assessment (VQA) a critical challenge. Existing research typically tackles this problem through two main strategies: reducing model parameters and resampling inputs. However, light-weight Convolution Neural Networks (CNN) and Transformers often struggle to balance efficiency with high performance due to the requirement of long-range modeling capabilities. Recently, the state-space model, particularly Mamba, has emerged as a promising alternative, offering linear complexity with respect to sequence length. Meanwhile, efficient VQA heavily depends on resampling long sequences to minimize computational costs, yet current resampling methods are often weak in preserving essential semantic information. In this work, we present MVQA, a Mamba-based model designed for efficient VQA along with a novel Unified Semantic and Distortion Sampling (USDS) approach. USDS combines semantic patch sampling from low-resolution videos and distortion patch sampling from original-resolution videos. The former captures semantically dense regions, while the latter retains critical distortion details. To prevent computation increase from dual inputs, we propose a fusion mechanism using pre-defined masks, enabling a unified sampling strategy that captures both semantic and quality information without additional computational burden. Experiments show that the proposed MVQA, equipped with USDS, achieve comparable performance to state-of-the-art methods while being $2\times$ as fast and requiring only $1/5$ GPU memory.
△ Less
Submitted 22 April, 2025;
originally announced April 2025.
-
LLM Sensitivity Evaluation Framework for Clinical Diagnosis
Authors:
Chenwei Yan,
Xiangling Fu,
Yuxuan Xiong,
Tianyi Wang,
Siu Cheung Hui,
Ji Wu,
Xien Liu
Abstract:
Large language models (LLMs) have demonstrated impressive performance across various domains. However, for clinical diagnosis, higher expectations are required for LLM's reliability and sensitivity: thinking like physicians and remaining sensitive to key medical information that affects diagnostic reasoning, as subtle variations can lead to different diagnosis results. Yet, existing works focus ma…
▽ More
Large language models (LLMs) have demonstrated impressive performance across various domains. However, for clinical diagnosis, higher expectations are required for LLM's reliability and sensitivity: thinking like physicians and remaining sensitive to key medical information that affects diagnostic reasoning, as subtle variations can lead to different diagnosis results. Yet, existing works focus mainly on investigating the sensitivity of LLMs to irrelevant context and overlook the importance of key information. In this paper, we investigate the sensitivity of LLMs, i.e. GPT-3.5, GPT-4, Gemini, Claude3 and LLaMA2-7b, to key medical information by introducing different perturbation strategies. The evaluation results highlight the limitations of current LLMs in remaining sensitive to key medical information for diagnostic decision-making. The evolution of LLMs must focus on improving their reliability, enhancing their ability to be sensitive to key information, and effectively utilizing this information. These improvements will enhance human trust in LLMs and facilitate their practical application in real-world scenarios. Our code and dataset are available at https://github.com/chenwei23333/DiagnosisQA.
△ Less
Submitted 18 April, 2025;
originally announced April 2025.
-
ASRL:A robust loss function with potential for development
Authors:
Chenyu Hui,
Anran Zhang,
Xintong Li
Abstract:
In this article, we proposed a partition:wise robust loss function based on the previous robust loss function. The characteristics of this loss function are that it achieves high robustness and a wide range of applicability through partition-wise design and adaptive parameter adjustment. Finally, the advantages and development potential of this loss function were verified by applying this loss fun…
▽ More
In this article, we proposed a partition:wise robust loss function based on the previous robust loss function. The characteristics of this loss function are that it achieves high robustness and a wide range of applicability through partition-wise design and adaptive parameter adjustment. Finally, the advantages and development potential of this loss function were verified by applying this loss function to the regression question and using five different datasets (with different dimensions, different sample numbers, and different fields) to compare with the other loss functions. The results of multiple experiments have proven the advantages of our loss function .
△ Less
Submitted 9 April, 2025;
originally announced April 2025.
-
Batch Aggregation: An Approach to Enhance Text Classification with Correlated Augmented Data
Authors:
Charco Hui,
Yalu Wen
Abstract:
Natural language processing models often face challenges due to limited labeled data, especially in domain specific areas, e.g., clinical trials. To overcome this, text augmentation techniques are commonly used to increases sample size by transforming the original input data into artificial ones with the label preserved. However, traditional text classification methods ignores the relationship bet…
▽ More
Natural language processing models often face challenges due to limited labeled data, especially in domain specific areas, e.g., clinical trials. To overcome this, text augmentation techniques are commonly used to increases sample size by transforming the original input data into artificial ones with the label preserved. However, traditional text classification methods ignores the relationship between augmented texts and treats them as independent samples which may introduce classification error. Therefore, we propose a novel approach called 'Batch Aggregation' (BAGG) which explicitly models the dependence of text inputs generated through augmentation by incorporating an additional layer that aggregates results from correlated texts. Through studying multiple benchmark data sets across different domains, we found that BAGG can improve classification accuracy. We also found that the increase of performance with BAGG is more obvious in domain specific data sets, with accuracy improvements of up to 10-29%. Through the analysis of benchmark data, the proposed method addresses limitations of traditional techniques and improves robustness in text classification tasks. Our result demonstrates that BAGG offers more robust results and outperforms traditional approaches when training data is limited.
△ Less
Submitted 7 April, 2025;
originally announced April 2025.
-
DeepGrav: Anomalous Gravitational-Wave Detection Through Deep Latent Features
Authors:
Jianqi Yan,
Alex P. Leung,
Zhiyuan Pei,
David C. Y. Hui,
Sangin Kim
Abstract:
This work introduces a novel deep learning-based approach for gravitational wave anomaly detection, aiming to overcome the limitations of traditional matched filtering techniques in identifying unknown waveform gravitational wave signals. We introduce a modified convolutional neural network architecture inspired by ResNet that leverages residual blocks to extract high-dimensional features, effecti…
▽ More
This work introduces a novel deep learning-based approach for gravitational wave anomaly detection, aiming to overcome the limitations of traditional matched filtering techniques in identifying unknown waveform gravitational wave signals. We introduce a modified convolutional neural network architecture inspired by ResNet that leverages residual blocks to extract high-dimensional features, effectively capturing subtle differences between background noise and gravitational wave signals. This network architecture learns a high-dimensional projection while preserving discrepancies with the original input, facilitating precise identification of gravitational wave signals. In our experiments, we implement an innovative data augmentation strategy that generates new data by computing the arithmetic mean of multiple signal samples while retaining the key features of the original signals.
In the NSF HDR A3D3: Detecting Anomalous Gravitational Wave Signals competition, it is honorable for us (group name: easonyan123) to get to the first place at the end with our model achieving a true negative rate (TNR) of 0.9708 during development/validation phase and 0.9832 on an unseen challenge dataset during final/testing phase, the highest among all competitors. These results demonstrate that our method not only achieves excellent generalization performance but also maintains robust adaptability in addressing the complex uncertainties inherent in gravitational wave anomaly detection.
△ Less
Submitted 15 March, 2025; v1 submitted 5 March, 2025;
originally announced March 2025.
-
Building Machine Learning Challenges for Anomaly Detection in Science
Authors:
Elizabeth G. Campolongo,
Yuan-Tang Chou,
Ekaterina Govorkova,
Wahid Bhimji,
Wei-Lun Chao,
Chris Harris,
Shih-Chieh Hsu,
Hilmar Lapp,
Mark S. Neubauer,
Josephine Namayanja,
Aneesh Subramanian,
Philip Harris,
Advaith Anand,
David E. Carlyn,
Subhankar Ghosh,
Christopher Lawrence,
Eric Moreno,
Ryan Raikman,
Jiaman Wu,
Ziheng Zhang,
Bayu Adhi,
Mohammad Ahmadi Gharehtoragh,
Saúl Alonso Monsalve,
Marta Babicz,
Furqan Baig
, et al. (125 additional authors not shown)
Abstract:
Scientific discoveries are often made by finding a pattern or object that was not predicted by the known rules of science. Oftentimes, these anomalous events or objects that do not conform to the norms are an indication that the rules of science governing the data are incomplete, and something new needs to be present to explain these unexpected outliers. The challenge of finding anomalies can be c…
▽ More
Scientific discoveries are often made by finding a pattern or object that was not predicted by the known rules of science. Oftentimes, these anomalous events or objects that do not conform to the norms are an indication that the rules of science governing the data are incomplete, and something new needs to be present to explain these unexpected outliers. The challenge of finding anomalies can be confounding since it requires codifying a complete knowledge of the known scientific behaviors and then projecting these known behaviors on the data to look for deviations. When utilizing machine learning, this presents a particular challenge since we require that the model not only understands scientific data perfectly but also recognizes when the data is inconsistent and out of the scope of its trained behavior. In this paper, we present three datasets aimed at developing machine learning-based anomaly detection for disparate scientific domains covering astrophysics, genomics, and polar science. We present the different datasets along with a scheme to make machine learning challenges around the three datasets findable, accessible, interoperable, and reusable (FAIR). Furthermore, we present an approach that generalizes to future machine learning challenges, enabling the possibility of large, more compute-intensive challenges that can ultimately lead to scientific discovery.
△ Less
Submitted 29 March, 2025; v1 submitted 3 March, 2025;
originally announced March 2025.
-
A Structure-aware Generative Model for Biomedical Event Extraction
Authors:
Haohan Yuan,
Siu Cheung Hui,
Haopeng Zhang
Abstract:
Biomedical Event Extraction (BEE) is a challenging task that involves modeling complex relationships between fine-grained entities in biomedical text. BEE has traditionally been formulated as a classification problem. With recent advancements in large language models (LLMs), generation-based models that cast event extraction as a sequence generation problem have attracted attention in the NLP rese…
▽ More
Biomedical Event Extraction (BEE) is a challenging task that involves modeling complex relationships between fine-grained entities in biomedical text. BEE has traditionally been formulated as a classification problem. With recent advancements in large language models (LLMs), generation-based models that cast event extraction as a sequence generation problem have attracted attention in the NLP research community. However, current generative models often overlook cross-instance information in complex event structures, such as nested and overlapping events, which constitute over 20% of events in benchmark datasets. In this paper, we propose GenBEE, an event structure-aware generative model that captures complex event structures in biomedical text for biomedical event extraction. GenBEE constructs event prompts that distill knowledge from LLMs to incorporate both label semantics and argument dependency relationships. In addition, GenBEE generates prefixes with event structural prompts to incorporate structural features to improve the model's overall performance. We have evaluated the proposed GenBEE model on three widely used BEE benchmark datasets, namely MLEE, GE11, and PHEE. Experimental results show that GenBEE has achieved state-of-the-art performance on the MLEE and GE11 datasets, and achieved competitive results when compared to the state-of-the-art classification-based models on the PHEE dataset.
△ Less
Submitted 20 February, 2025; v1 submitted 12 August, 2024;
originally announced August 2024.
-
ISPO: An Integrated Ontology of Symptom Phenotypes for Semantic Integration of Traditional Chinese Medical Data
Authors:
Zixin Shu,
Rui Hua,
Dengying Yan,
Chenxia Lu,
Ning Xu,
Jun Li,
Hui Zhu,
Jia Zhang,
Dan Zhao,
Chenyang Hui,
Junqiu Ye,
Chu Liao,
Qi Hao,
Wen Ye,
Cheng Luo,
Xinyan Wang,
Chuang Cheng,
Xiaodong Li,
Baoyan Liu,
Xiaji Zhou,
Runshun Zhang,
Min Xu,
Xuezhong Zhou
Abstract:
Symptom phenotypes are one of the key types of manifestations for diagnosis and treatment of various disease conditions. However, the diversity of symptom terminologies is one of the major obstacles hindering the analysis and knowledge sharing of various types of symptom-related medical data particularly in the fields of Traditional Chinese Medicine (TCM). Objective: This study aimed to construct…
▽ More
Symptom phenotypes are one of the key types of manifestations for diagnosis and treatment of various disease conditions. However, the diversity of symptom terminologies is one of the major obstacles hindering the analysis and knowledge sharing of various types of symptom-related medical data particularly in the fields of Traditional Chinese Medicine (TCM). Objective: This study aimed to construct an Integrated Ontology of symptom phenotypes (ISPO) to support the data mining of Chinese EMRs and real-world study in TCM field. Methods: To construct an integrated ontology of symptom phenotypes (ISPO), we manually annotated classical TCM textbooks and large-scale Chinese electronic medical records (EMRs) to collect symptom terms with support from a medical text annotation system. Furthermore, to facilitate the semantic interoperability between different terminologies, we incorporated public available biomedical vocabularies by manual mapping between Chinese terms and English terms with cross-references to source vocabularies. In addition, we evaluated the ISPO using independent clinical EMRs to provide a high-usable medical ontology for clinical data analysis. Results: By integrating 78,696 inpatient cases of EMRs, 5 biomedical vocabularies, 21 TCM books and dictionaries, ISPO provides 3,147 concepts, 23,475 terms, and 55,552 definition or contextual texts. Adhering to the taxonomical structure of the related anatomical systems of symptom phenotypes, ISPO provides 12 top-level categories and 79 middle-level sub-categories. The validation of data analysis showed the ISPO has a coverage rate of 95.35%, 98.53% and 92.66% for symptom terms with occurrence rates of 0.5% in additional three independent curated clinical datasets, which can demonstrate the significant value of ISPO in mapping clinical terms to ontologies.
△ Less
Submitted 8 July, 2024;
originally announced July 2024.
-
Image Quality Assessment With Compressed Sampling
Authors:
Ronghua Liao,
Chen Hui,
Lang Yuan,
Haiqi Zhu,
Feng Jiang
Abstract:
No-Reference Image Quality Assessment (NR-IQA) aims at estimating image quality in accordance with subjective human perception. However, most methods focus on exploring increasingly complex networks to improve the final performance,accompanied by limitations on input images. Especially when applied to high-resolution (HR) images, these methods offen have to adjust the size of original image to mee…
▽ More
No-Reference Image Quality Assessment (NR-IQA) aims at estimating image quality in accordance with subjective human perception. However, most methods focus on exploring increasingly complex networks to improve the final performance,accompanied by limitations on input images. Especially when applied to high-resolution (HR) images, these methods offen have to adjust the size of original image to meet model input.To further alleviate the aforementioned issue, we propose two networks for NR-IQA with Compressive Sampling (dubbed CL-IQA and CS-IQA). They consist of four components: (1) The Compressed Sampling Module (CSM) to sample the image (2)The Adaptive Embedding Module (AEM). The measurements are embedded by AEM to extract high-level features. (3) The Vision Transformer and Scale Swin TranBlocksformer Moudle(SSTM) to extract deep features. (4) The Dual Branch (DB) to get final quality score. Experiments show that our proposed methods outperform other methods on various datasets with less data usage.
△ Less
Submitted 11 September, 2024; v1 submitted 26 April, 2024;
originally announced April 2024.
-
AntiPhishStack: LSTM-based Stacked Generalization Model for Optimized Phishing URL Detection
Authors:
Saba Aslam,
Hafsa Aslam,
Arslan Manzoor,
Chen Hui,
Abdur Rasool
Abstract:
The escalating reliance on revolutionary online web services has introduced heightened security risks, with persistent challenges posed by phishing despite extensive security measures. Traditional phishing systems, reliant on machine learning and manual features, struggle with evolving tactics. Recent advances in deep learning offer promising avenues for tackling novel phishing challenges and mali…
▽ More
The escalating reliance on revolutionary online web services has introduced heightened security risks, with persistent challenges posed by phishing despite extensive security measures. Traditional phishing systems, reliant on machine learning and manual features, struggle with evolving tactics. Recent advances in deep learning offer promising avenues for tackling novel phishing challenges and malicious URLs. This paper introduces a two-phase stack generalized model named AntiPhishStack, designed to detect phishing sites. The model leverages the learning of URLs and character-level TF-IDF features symmetrically, enhancing its ability to combat emerging phishing threats. In Phase I, features are trained on a base machine learning classifier, employing K-fold cross-validation for robust mean prediction. Phase II employs a two-layered stacked-based LSTM network with five adaptive optimizers for dynamic compilation, ensuring premier prediction on these features. Additionally, the symmetrical predictions from both phases are optimized and integrated to train a meta-XGBoost classifier, contributing to a final robust prediction. The significance of this work lies in advancing phishing detection with AntiPhishStack, operating without prior phishing-specific feature knowledge. Experimental validation on two benchmark datasets, comprising benign and phishing or malicious URLs, demonstrates the model's exceptional performance, achieving a notable 96.04% accuracy compared to existing studies. This research adds value to the ongoing discourse on symmetry and asymmetry in information security and provides a forward-thinking solution for enhancing network security in the face of evolving cyber threats.
△ Less
Submitted 21 January, 2024; v1 submitted 16 January, 2024;
originally announced January 2024.
-
CLiF-VQA: Enhancing Video Quality Assessment by Incorporating High-Level Semantic Information related to Human Feelings
Authors:
Yachun Mi,
Yu Li,
Yan Shu,
Chen Hui,
Puchao Zhou,
Shaohui Liu
Abstract:
Video Quality Assessment (VQA) aims to simulate the process of perceiving video quality by the human visual system (HVS). The judgments made by HVS are always influenced by human subjective feelings. However, most of the current VQA research focuses on capturing various distortions in the spatial and temporal domains of videos, while ignoring the impact of human feelings. In this paper, we propose…
▽ More
Video Quality Assessment (VQA) aims to simulate the process of perceiving video quality by the human visual system (HVS). The judgments made by HVS are always influenced by human subjective feelings. However, most of the current VQA research focuses on capturing various distortions in the spatial and temporal domains of videos, while ignoring the impact of human feelings. In this paper, we propose CLiF-VQA, which considers both features related to human feelings and spatial features of videos. In order to effectively extract features related to human feelings from videos, we explore the consistency between CLIP and human feelings in video perception for the first time. Specifically, we design multiple objective and subjective descriptions closely related to human feelings as prompts. Further we propose a novel CLIP-based semantic feature extractor (SFE) which extracts features related to human feelings by sliding over multiple regions of the video frame. In addition, we further capture the low-level-aware features of the video through a spatial feature extraction module. The two different features are then aggregated thereby obtaining the quality score of the video. Extensive experiments show that the proposed CLiF-VQA exhibits excellent performance on several VQA datasets.
△ Less
Submitted 13 November, 2023;
originally announced November 2023.
-
Domain Transfer in Latent Space (DTLS) Wins on Image Super-Resolution -- a Non-Denoising Model
Authors:
Chun-Chuen Hui,
Wan-Chi Siu,
Ngai-Fong Law
Abstract:
Large scale image super-resolution is a challenging computer vision task, since vast information is missing in a highly degraded image, say for example forscale x16 super-resolution. Diffusion models are used successfully in recent years in extreme super-resolution applications, in which Gaussian noise is used as a means to form a latent photo-realistic space, and acts as a link between the space…
▽ More
Large scale image super-resolution is a challenging computer vision task, since vast information is missing in a highly degraded image, say for example forscale x16 super-resolution. Diffusion models are used successfully in recent years in extreme super-resolution applications, in which Gaussian noise is used as a means to form a latent photo-realistic space, and acts as a link between the space of latent vectors and the latent photo-realistic space. There are quite a few sophisticated mathematical derivations on mapping the statistics of Gaussian noises making Diffusion Models successful. In this paper we propose a simple approach which gets away from using Gaussian noise but adopts some basic structures of diffusion models for efficient image super-resolution. Essentially, we propose a DNN to perform domain transfer between neighbor domains, which can learn the differences in statistical properties to facilitate gradual interpolation with results of reasonable quality. Further quality improvement is achieved by conditioning the domain transfer with reference to the input LR image. Experimental results show that our method outperforms not only state-of-the-art large scale super resolution models, but also the current diffusion models for image super-resolution. The approach can readily be extended to other image-to-image tasks, such as image enlightening, inpainting, denoising, etc.
△ Less
Submitted 21 December, 2023; v1 submitted 4 November, 2023;
originally announced November 2023.
-
Hierarchical Interactive Reconstruction Network For Video Compressive Sensing
Authors:
Tong Zhang,
Wenxue Cui,
Chen Hui,
Feng Jiang
Abstract:
Deep network-based image and video Compressive Sensing(CS) has attracted increasing attentions in recent years. However, in the existing deep network-based CS methods, a simple stacked convolutional network is usually adopted, which not only weakens the perception of rich contextual prior knowledge, but also limits the exploration of the correlations between temporal video frames. In this paper, w…
▽ More
Deep network-based image and video Compressive Sensing(CS) has attracted increasing attentions in recent years. However, in the existing deep network-based CS methods, a simple stacked convolutional network is usually adopted, which not only weakens the perception of rich contextual prior knowledge, but also limits the exploration of the correlations between temporal video frames. In this paper, we propose a novel Hierarchical InTeractive Video CS Reconstruction Network(HIT-VCSNet), which can cooperatively exploit the deep priors in both spatial and temporal domains to improve the reconstruction quality. Specifically, in the spatial domain, a novel hierarchical structure is designed, which can hierarchically extract deep features from keyframes and non-keyframes. In the temporal domain, a novel hierarchical interaction mechanism is proposed, which can cooperatively learn the correlations among different frames in the multiscale space. Extensive experiments manifest that the proposed HIT-VCSNet outperforms the existing state-of-the-art video and image CS methods in a large margin.
△ Less
Submitted 15 April, 2023;
originally announced April 2023.
-
A Multimodal Data-driven Framework for Anxiety Screening
Authors:
Haimiao Mo,
Shuai Ding,
Siu Cheung Hui
Abstract:
Early screening for anxiety and appropriate interventions are essential to reduce the incidence of self-harm and suicide in patients. Due to limited medical resources, traditional methods that overly rely on physician expertise and specialized equipment cannot simultaneously meet the needs for high accuracy and model interpretability. Multimodal data can provide more objective evidence for anxiety…
▽ More
Early screening for anxiety and appropriate interventions are essential to reduce the incidence of self-harm and suicide in patients. Due to limited medical resources, traditional methods that overly rely on physician expertise and specialized equipment cannot simultaneously meet the needs for high accuracy and model interpretability. Multimodal data can provide more objective evidence for anxiety screening to improve the accuracy of models. The large amount of noise in multimodal data and the unbalanced nature of the data make the model prone to overfitting. However, it is a non-differentiable problem when high-dimensional and multimodal feature combinations are used as model inputs and incorporated into model training. This causes existing anxiety screening methods based on machine learning and deep learning to be inapplicable. Therefore, we propose a multimodal data-driven anxiety screening framework, namely MMD-AS, and conduct experiments on the collected health data of over 200 seafarers by smartphones. The proposed framework's feature extraction, dimension reduction, feature selection, and anxiety inference are jointly trained to improve the model's performance. In the feature selection step, a feature selection method based on the Improved Fireworks Algorithm is used to solve the non-differentiable problem of feature combination to remove redundant features and search for the ideal feature subset. The experimental results show that our framework outperforms the comparison methods.
△ Less
Submitted 15 March, 2023;
originally announced March 2023.
-
Dialogue State Distillation Network with Inter-slot Contrastive Learning for Dialogue State Tracking
Authors:
Jing Xu,
Dandan Song,
Chong Liu,
Siu Cheung Hui,
Fei Li,
Qiang Ju,
Xiaonan He,
Jian Xie
Abstract:
In task-oriented dialogue systems, Dialogue State Tracking (DST) aims to extract users' intentions from the dialogue history. Currently, most existing approaches suffer from error propagation and are unable to dynamically select relevant information when utilizing previous dialogue states. Moreover, the relations between the updates of different slots provide vital clues for DST. However, the exis…
▽ More
In task-oriented dialogue systems, Dialogue State Tracking (DST) aims to extract users' intentions from the dialogue history. Currently, most existing approaches suffer from error propagation and are unable to dynamically select relevant information when utilizing previous dialogue states. Moreover, the relations between the updates of different slots provide vital clues for DST. However, the existing approaches rely only on predefined graphs to indirectly capture the relations. In this paper, we propose a Dialogue State Distillation Network (DSDN) to utilize relevant information of previous dialogue states and migrate the gap of utilization between training and testing. Thus, it can dynamically exploit previous dialogue states and avoid introducing error propagation simultaneously. Further, we propose an inter-slot contrastive learning loss to effectively capture the slot co-update relations from dialogue context. Experiments are conducted on the widely used MultiWOZ 2.0 and MultiWOZ 2.1 datasets. The experimental results show that our proposed model achieves the state-of-the-art performance for DST.
△ Less
Submitted 7 March, 2023; v1 submitted 16 February, 2023;
originally announced February 2023.
-
Semantic Pivoting Model for Effective Event Detection
Authors:
Anran Hao,
Siu Cheung Hui,
Jian Su
Abstract:
Event Detection, which aims to identify and classify mentions of event instances from unstructured articles, is an important task in Natural Language Processing (NLP). Existing techniques for event detection only use homogeneous one-hot vectors to represent the event type classes, ignoring the fact that the semantic meaning of the types is important to the task. Such an approach is inefficient and…
▽ More
Event Detection, which aims to identify and classify mentions of event instances from unstructured articles, is an important task in Natural Language Processing (NLP). Existing techniques for event detection only use homogeneous one-hot vectors to represent the event type classes, ignoring the fact that the semantic meaning of the types is important to the task. Such an approach is inefficient and prone to overfitting. In this paper, we propose a Semantic Pivoting Model for Effective Event Detection (SPEED), which explicitly incorporates prior information during training and captures semantically meaningful correlations between input and events. Experimental results show that our proposed model achieves state-of-the-art performance and outperforms the baselines in multiple settings without using any external resources.
△ Less
Submitted 1 November, 2022;
originally announced November 2022.
-
SoccerNet 2022 Challenges Results
Authors:
Silvio Giancola,
Anthony Cioppa,
Adrien Deliège,
Floriane Magera,
Vladimir Somers,
Le Kang,
Xin Zhou,
Olivier Barnich,
Christophe De Vleeschouwer,
Alexandre Alahi,
Bernard Ghanem,
Marc Van Droogenbroeck,
Abdulrahman Darwish,
Adrien Maglo,
Albert Clapés,
Andreas Luyts,
Andrei Boiarov,
Artur Xarles,
Astrid Orcesi,
Avijit Shah,
Baoyu Fan,
Bharath Comandur,
Chen Chen,
Chen Zhang,
Chen Zhao
, et al. (69 additional authors not shown)
Abstract:
The SoccerNet 2022 challenges were the second annual video understanding challenges organized by the SoccerNet team. In 2022, the challenges were composed of 6 vision-based tasks: (1) action spotting, focusing on retrieving action timestamps in long untrimmed videos, (2) replay grounding, focusing on retrieving the live moment of an action shown in a replay, (3) pitch localization, focusing on det…
▽ More
The SoccerNet 2022 challenges were the second annual video understanding challenges organized by the SoccerNet team. In 2022, the challenges were composed of 6 vision-based tasks: (1) action spotting, focusing on retrieving action timestamps in long untrimmed videos, (2) replay grounding, focusing on retrieving the live moment of an action shown in a replay, (3) pitch localization, focusing on detecting line and goal part elements, (4) camera calibration, dedicated to retrieving the intrinsic and extrinsic camera parameters, (5) player re-identification, focusing on retrieving the same players across multiple views, and (6) multiple object tracking, focusing on tracking players and the ball through unedited video streams. Compared to last year's challenges, tasks (1-2) had their evaluation metrics redefined to consider tighter temporal accuracies, and tasks (3-6) were novel, including their underlying data and annotations. More information on the tasks, challenges and leaderboards are available on https://www.soccer-net.org. Baselines and development kits are available on https://github.com/SoccerNet.
△ Less
Submitted 5 October, 2022;
originally announced October 2022.
-
On Improving the Performance of Glitch Classification for Gravitational Wave Detection by using Generative Adversarial Networks
Authors:
Jianqi Yan,
Alex P. Leung,
David C. Y. Hui
Abstract:
Spectrogram classification plays an important role in analyzing gravitational wave data. In this paper, we propose a framework to improve the classification performance by using Generative Adversarial Networks (GANs). As substantial efforts and expertise are required to annotate spectrograms, the number of training examples is very limited. However, it is well known that deep networks can perform…
▽ More
Spectrogram classification plays an important role in analyzing gravitational wave data. In this paper, we propose a framework to improve the classification performance by using Generative Adversarial Networks (GANs). As substantial efforts and expertise are required to annotate spectrograms, the number of training examples is very limited. However, it is well known that deep networks can perform well only when the sample size of the training set is sufficiently large. Furthermore, the imbalanced sample sizes in different classes can also hamper the performance. In order to tackle these problems, we propose a GAN-based data augmentation framework. While standard data augmentation methods for conventional images cannot be applied on spectrograms, we found that a variant of GANs, ProGAN, is capable of generating high-resolution spectrograms which are consistent with the quality of the high-resolution original images and provide a desirable diversity. We have validated our framework by classifying glitches in the {\it Gravity Spy} dataset with the GAN-generated spectrograms for training. We show that the proposed method can provide an alternative to transfer learning for the classification of spectrograms using deep networks, i.e. using a high-resolution GAN for data augmentation instead. Furthermore, fluctuations in classification performance with small sample sizes for training and evaluation can be greatly reduced. Using the trained network in our framework, we have also examined the spectrograms with label anomalies in {\it Gravity Spy}.
△ Less
Submitted 8 July, 2022;
originally announced July 2022.
-
Machine Learning for Food Review and Recommendation
Authors:
Tan Khang Le,
Siu Cheung Hui
Abstract:
Food reviews and recommendations have always been important for online food service websites. However, reviewing and recommending food is not simple as it is likely to be overwhelmed by disparate contexts and meanings. In this paper, we use different deep learning approaches to address the problems of sentiment analysis, automatic review tag generation, and retrieval of food reviews. We propose to…
▽ More
Food reviews and recommendations have always been important for online food service websites. However, reviewing and recommending food is not simple as it is likely to be overwhelmed by disparate contexts and meanings. In this paper, we use different deep learning approaches to address the problems of sentiment analysis, automatic review tag generation, and retrieval of food reviews. We propose to develop a web-based food review system at Nanyang Technological University (NTU) named NTU Food Hunter, which incorporates different deep learning approaches that help users with food selection. First, we implement the BERT and LSTM deep learning models into the system for sentiment analysis of food reviews. Then, we develop a Part-of-Speech (POS) algorithm to automatically identify and extract adjective-noun pairs from the review content for review tag generation based on POS tagging and dependency parsing. Finally, we also train a RankNet model for the re-ranking of the retrieval results to improve the accuracy in our Solr-based food reviews search system. The experimental results show that our proposed deep learning approaches are promising for the applications of real-world problems.
△ Less
Submitted 14 January, 2022;
originally announced January 2022.
-
Improving Knowledge Graph Representation Learning by Structure Contextual Pre-training
Authors:
Ganqiang Ye,
Wen Zhang,
Zhen Bi,
Chi Man Wong,
Chen Hui,
Huajun Chen
Abstract:
Representation learning models for Knowledge Graphs (KG) have proven to be effective in encoding structural information and performing reasoning over KGs. In this paper, we propose a novel pre-training-then-fine-tuning framework for knowledge graph representation learning, in which a KG model is firstly pre-trained with triple classification task, followed by discriminative fine-tuning on specific…
▽ More
Representation learning models for Knowledge Graphs (KG) have proven to be effective in encoding structural information and performing reasoning over KGs. In this paper, we propose a novel pre-training-then-fine-tuning framework for knowledge graph representation learning, in which a KG model is firstly pre-trained with triple classification task, followed by discriminative fine-tuning on specific downstream tasks such as entity type prediction and entity alignment. Drawing on the general ideas of learning deep contextualized word representations in typical pre-trained language models, we propose SCoP to learn pre-trained KG representations with structural and contextual triples of the target triple encoded. Experimental results demonstrate that fine-tuning SCoP not only outperforms results of baselines on a portfolio of downstream tasks but also avoids tedious task-specific model design and parameter training.
△ Less
Submitted 7 December, 2021;
originally announced December 2021.
-
Beyond Fully-Connected Layers with Quaternions: Parameterization of Hypercomplex Multiplications with $1/n$ Parameters
Authors:
Aston Zhang,
Yi Tay,
Shuai Zhang,
Alvin Chan,
Anh Tuan Luu,
Siu Cheung Hui,
Jie Fu
Abstract:
Recent works have demonstrated reasonable success of representation learning in hypercomplex space. Specifically, "fully-connected layers with Quaternions" (4D hypercomplex numbers), which replace real-valued matrix multiplications in fully-connected layers with Hamilton products of Quaternions, both enjoy parameter savings with only 1/4 learnable parameters and achieve comparable performance in v…
▽ More
Recent works have demonstrated reasonable success of representation learning in hypercomplex space. Specifically, "fully-connected layers with Quaternions" (4D hypercomplex numbers), which replace real-valued matrix multiplications in fully-connected layers with Hamilton products of Quaternions, both enjoy parameter savings with only 1/4 learnable parameters and achieve comparable performance in various applications. However, one key caveat is that hypercomplex space only exists at very few predefined dimensions (4D, 8D, and 16D). This restricts the flexibility of models that leverage hypercomplex multiplications. To this end, we propose parameterizing hypercomplex multiplications, allowing models to learn multiplication rules from data regardless of whether such rules are predefined. As a result, our method not only subsumes the Hamilton product, but also learns to operate on any arbitrary nD hypercomplex space, providing more architectural flexibility using arbitrarily $1/n$ learnable parameters compared with the fully-connected layer counterpart. Experiments of applications to the LSTM and Transformer models on natural language inference, machine translation, text style transfer, and subject verb agreement demonstrate architectural flexibility and effectiveness of the proposed approach.
△ Less
Submitted 17 February, 2021;
originally announced February 2021.
-
Generalized Matrix Factorization: efficient algorithms for fitting generalized linear latent variable models to large data arrays
Authors:
Łukasz Kidziński,
Francis K. C. Hui,
David I. Warton,
Trevor Hastie
Abstract:
Unmeasured or latent variables are often the cause of correlations between multivariate measurements, which are studied in a variety of fields such as psychology, ecology, and medicine. For Gaussian measurements, there are classical tools such as factor analysis or principal component analysis with a well-established theory and fast algorithms. Generalized Linear Latent Variable models (GLLVMs) ge…
▽ More
Unmeasured or latent variables are often the cause of correlations between multivariate measurements, which are studied in a variety of fields such as psychology, ecology, and medicine. For Gaussian measurements, there are classical tools such as factor analysis or principal component analysis with a well-established theory and fast algorithms. Generalized Linear Latent Variable models (GLLVMs) generalize such factor models to non-Gaussian responses. However, current algorithms for estimating model parameters in GLLVMs require intensive computation and do not scale to large datasets with thousands of observational units or responses.
In this article, we propose a new approach for fitting GLLVMs to high-dimensional datasets, based on approximating the model using penalized quasi-likelihood and then using a Newton method and Fisher scoring to learn the model parameters. Computationally, our method is noticeably faster and more stable, enabling GLLVM fits to much larger matrices than previously possible. We apply our method on a dataset of 48,000 observational units with over 2,000 observed species in each unit and find that most of the variability can be explained with a handful of factors. We publish an easy-to-use implementation of our proposed fitting algorithm.
△ Less
Submitted 27 January, 2022; v1 submitted 6 October, 2020;
originally announced October 2020.
-
A Reliable Gravity Compensation Control Strategy for dVRK Robotic Arms With Nonlinear Disturbance Forces
Authors:
Hongbin Lin,
C. W. Vincent Hui,
Yan Wang,
Anton Deguet,
Peter Kazanzides,
K. W. Samuel Au
Abstract:
External disturbance forces caused by nonlinear springy electrical cables in the Master Tool Manipulator (MTM) of the da Vinci Research Kit (dVRK) limits the usage of the existing gravity compensation methods. Significant motion drifts at the MTM tip are often observed when the MTM is located far from its identification trajectory, preventing the usage of these methods for the entire workspace rel…
▽ More
External disturbance forces caused by nonlinear springy electrical cables in the Master Tool Manipulator (MTM) of the da Vinci Research Kit (dVRK) limits the usage of the existing gravity compensation methods. Significant motion drifts at the MTM tip are often observed when the MTM is located far from its identification trajectory, preventing the usage of these methods for the entire workspace reliably. In this paper, we propose a general and systematic framework to address the problems of the gravity compensation for the MTM of the dVRK. Particularly, high order polynomial models were used to capture the highly nonlinear disturbance forces and integrated with the Multi-step Least Square Estimation (MLSE) framework. This method allows us to identify the parameters of both the gravitational and disturbance forces for each link sequentially, preventing residual error passing among the links of the MTM with uneven mass distribution. A corresponding gravity compensation controller was developed to compensate the gravitational and disturbance forces. The method was validated with extensive experiments in the majority of the manipulator's workspace, showing significant performance enhancements over existing methods. Finally, a deliverable software package in MATLAB and C++ was integrated with dVRK and published in the dVRK community for open-source research and development.
△ Less
Submitted 16 January, 2020;
originally announced January 2020.
-
Heterogeneous Collaborative Filtering
Authors:
Yifang Liu,
Zhentao Xu,
Cong Hui,
Yi Xuan,
Jessie Chen,
Yuanming Shan
Abstract:
Recommendation system is important to a content sharing/creating social network. Collaborative filtering is a widely-adopted technology in conventional recommenders, which is based on similarity between positively engaged content items involving the same users. Conventional collaborative filtering (CCF) suffers from cold start problem and narrow content diversity. We propose a new recommendation a…
▽ More
Recommendation system is important to a content sharing/creating social network. Collaborative filtering is a widely-adopted technology in conventional recommenders, which is based on similarity between positively engaged content items involving the same users. Conventional collaborative filtering (CCF) suffers from cold start problem and narrow content diversity. We propose a new recommendation approach, heterogeneous collaborative filtering (HCF) to tackle these challenges at the root, while keeping the strength of collaborative filtering. We present two implementation algorithms of HCF for content recommendation and content dissemination. Experiment results demonstrate that our approach improve the recommendation quality in a real world social network for content creating and sharing.
△ Less
Submitted 31 August, 2019;
originally announced September 2019.
-
Collecting and Analyzing Multidimensional Data with Local Differential Privacy
Authors:
Ning Wang,
Xiaokui Xiao,
Yin Yang,
Jun Zhao,
Siu Cheung Hui,
Hyejin Shin,
Junbum Shin,
Ge Yu
Abstract:
Local differential privacy (LDP) is a recently proposed privacy standard for collecting and analyzing data, which has been used, e.g., in the Chrome browser, iOS and macOS. In LDP, each user perturbs her information locally, and only sends the randomized version to an aggregator who performs analyses, which protects both the users and the aggregator against private information leaks. Although LDP…
▽ More
Local differential privacy (LDP) is a recently proposed privacy standard for collecting and analyzing data, which has been used, e.g., in the Chrome browser, iOS and macOS. In LDP, each user perturbs her information locally, and only sends the randomized version to an aggregator who performs analyses, which protects both the users and the aggregator against private information leaks. Although LDP has attracted much research attention in recent years, the majority of existing work focuses on applying LDP to complex data and/or analysis tasks. In this paper, we point out that the fundamental problem of collecting multidimensional data under LDP has not been addressed sufficiently, and there remains much room for improvement even for basic tasks such as computing the mean value over a single numeric attribute under LDP. Motivated by this, we first propose novel LDP mechanisms for collecting a numeric attribute, whose accuracy is at least no worse (and usually better) than existing solutions in terms of worst-case noise variance. Then, we extend these mechanisms to multidimensional data that can contain both numeric and categorical attributes, where our mechanisms always outperform existing solutions regarding worst-case noise variance. As a case study, we apply our solutions to build an LDP-compliant stochastic gradient descent algorithm (SGD), which powers many important machine learning tasks. Experiments using real datasets confirm the effectiveness of our methods, and their advantages over existing solutions.
△ Less
Submitted 28 June, 2019;
originally announced July 2019.
-
Lightweight and Efficient Neural Natural Language Processing with Quaternion Networks
Authors:
Yi Tay,
Aston Zhang,
Luu Anh Tuan,
Jinfeng Rao,
Shuai Zhang,
Shuohang Wang,
Jie Fu,
Siu Cheung Hui
Abstract:
Many state-of-the-art neural models for NLP are heavily parameterized and thus memory inefficient. This paper proposes a series of lightweight and memory efficient neural architectures for a potpourri of natural language processing (NLP) tasks. To this end, our models exploit computation using Quaternion algebra and hypercomplex spaces, enabling not only expressive inter-component interactions but…
▽ More
Many state-of-the-art neural models for NLP are heavily parameterized and thus memory inefficient. This paper proposes a series of lightweight and memory efficient neural architectures for a potpourri of natural language processing (NLP) tasks. To this end, our models exploit computation using Quaternion algebra and hypercomplex spaces, enabling not only expressive inter-component interactions but also significantly ($75\%$) reduced parameter size due to lesser degrees of freedom in the Hamilton product. We propose Quaternion variants of models, giving rise to new architectures such as the Quaternion attention Model and Quaternion Transformer. Extensive experiments on a battery of NLP tasks demonstrates the utility of proposed Quaternion-inspired models, enabling up to $75\%$ reduction in parameter size without significant loss in performance.
△ Less
Submitted 11 June, 2019;
originally announced June 2019.
-
Simple and Effective Curriculum Pointer-Generator Networks for Reading Comprehension over Long Narratives
Authors:
Yi Tay,
Shuohang Wang,
Luu Anh Tuan,
Jie Fu,
Minh C. Phan,
Xingdi Yuan,
Jinfeng Rao,
Siu Cheung Hui,
Aston Zhang
Abstract:
This paper tackles the problem of reading comprehension over long narratives where documents easily span over thousands of tokens. We propose a curriculum learning (CL) based Pointer-Generator framework for reading/sampling over large documents, enabling diverse training of the neural model based on the notion of alternating contextual difficulty. This can be interpreted as a form of domain random…
▽ More
This paper tackles the problem of reading comprehension over long narratives where documents easily span over thousands of tokens. We propose a curriculum learning (CL) based Pointer-Generator framework for reading/sampling over large documents, enabling diverse training of the neural model based on the notion of alternating contextual difficulty. This can be interpreted as a form of domain randomization and/or generative pretraining during training. To this end, the usage of the Pointer-Generator softens the requirement of having the answer within the context, enabling us to construct diverse training samples for learning. Additionally, we propose a new Introspective Alignment Layer (IAL), which reasons over decomposed alignments using block-based self-attention. We evaluate our proposed method on the NarrativeQA reading comprehension benchmark, achieving state-of-the-art performance, improving existing baselines by $51\%$ relative improvement on BLEU-4 and $17\%$ relative improvement on Rouge-L. Extensive ablations confirm the effectiveness of our proposed IAL and CL components.
△ Less
Submitted 26 May, 2019;
originally announced May 2019.
-
Recurrently Controlled Recurrent Networks
Authors:
Yi Tay,
Luu Anh Tuan,
Siu Cheung Hui
Abstract:
Recurrent neural networks (RNNs) such as long short-term memory and gated recurrent units are pivotal building blocks across a broad spectrum of sequence modeling problems. This paper proposes a recurrently controlled recurrent network (RCRN) for expressive and powerful sequence encoding. More concretely, the key idea behind our approach is to learn the recurrent gating functions using recurrent n…
▽ More
Recurrent neural networks (RNNs) such as long short-term memory and gated recurrent units are pivotal building blocks across a broad spectrum of sequence modeling problems. This paper proposes a recurrently controlled recurrent network (RCRN) for expressive and powerful sequence encoding. More concretely, the key idea behind our approach is to learn the recurrent gating functions using recurrent networks. Our architecture is split into two components - a controller cell and a listener cell whereby the recurrent controller actively influences the compositionality of the listener cell. We conduct extensive experiments on a myriad of tasks in the NLP domain such as sentiment analysis (SST, IMDb, Amazon reviews, etc.), question classification (TREC), entailment classification (SNLI, SciTail), answer selection (WikiQA, TrecQA) and reading comprehension (NarrativeQA). Across all 26 datasets, our results demonstrate that RCRN not only consistently outperforms BiLSTMs but also stacked BiLSTMs, suggesting that our controller architecture might be a suitable replacement for the widely adopted stacked architecture.
△ Less
Submitted 24 November, 2018;
originally announced November 2018.
-
Densely Connected Attention Propagation for Reading Comprehension
Authors:
Yi Tay,
Luu Anh Tuan,
Siu Cheung Hui,
Jian Su
Abstract:
We propose DecaProp (Densely Connected Attention Propagation), a new densely connected neural architecture for reading comprehension (RC). There are two distinct characteristics of our model. Firstly, our model densely connects all pairwise layers of the network, modeling relationships between passage and query across all hierarchical levels. Secondly, the dense connectors in our network are learn…
▽ More
We propose DecaProp (Densely Connected Attention Propagation), a new densely connected neural architecture for reading comprehension (RC). There are two distinct characteristics of our model. Firstly, our model densely connects all pairwise layers of the network, modeling relationships between passage and query across all hierarchical levels. Secondly, the dense connectors in our network are learned via attention instead of standard residual skip-connectors. To this end, we propose novel Bidirectional Attention Connectors (BAC) for efficiently forging connections throughout the network. We conduct extensive experiments on four challenging RC benchmarks. Our proposed approach achieves state-of-the-art results on all four, outperforming existing baselines by up to $2.6\%-14.2\%$ in absolute F1 score.
△ Less
Submitted 2 April, 2019; v1 submitted 10 November, 2018;
originally announced November 2018.
-
Co-Stack Residual Affinity Networks with Multi-level Attention Refinement for Matching Text Sequences
Authors:
Yi Tay,
Luu Anh Tuan,
Siu Cheung Hui
Abstract:
Learning a matching function between two text sequences is a long standing problem in NLP research. This task enables many potential applications such as question answering and paraphrase identification. This paper proposes Co-Stack Residual Affinity Networks (CSRAN), a new and universal neural architecture for this problem. CSRAN is a deep architecture, involving stacked (multi-layered) recurrent…
▽ More
Learning a matching function between two text sequences is a long standing problem in NLP research. This task enables many potential applications such as question answering and paraphrase identification. This paper proposes Co-Stack Residual Affinity Networks (CSRAN), a new and universal neural architecture for this problem. CSRAN is a deep architecture, involving stacked (multi-layered) recurrent encoders. Stacked/Deep architectures are traditionally difficult to train, due to the inherent weaknesses such as difficulty with feature propagation and vanishing gradients. CSRAN incorporates two novel components to take advantage of the stacked architecture. Firstly, it introduces a new bidirectional alignment mechanism that learns affinity weights by fusing sequence pairs across stacked hierarchies. Secondly, it leverages a multi-level attention refinement component between stacked recurrent layers. The key intuition is that, by leveraging information across all network hierarchies, we can not only improve gradient flow but also improve overall performance. We conduct extensive experiments on six well-studied text sequence matching datasets, achieving state-of-the-art performance on all.
△ Less
Submitted 6 October, 2018;
originally announced October 2018.
-
Self-Attentive Neural Collaborative Filtering
Authors:
Yi Tay,
Shuai Zhang,
Luu Anh Tuan,
Siu Cheung Hui
Abstract:
This paper has been withdrawn as we discovered a bug in our tensorflow implementation that involved accidental mixing of vectors across batches. This lead to different inference results given different batch sizes which is completely strange. The performance scores still remain the same but we concluded that it was not the self-attention that contributed to the performance. We are withdrawing the…
▽ More
This paper has been withdrawn as we discovered a bug in our tensorflow implementation that involved accidental mixing of vectors across batches. This lead to different inference results given different batch sizes which is completely strange. The performance scores still remain the same but we concluded that it was not the self-attention that contributed to the performance. We are withdrawing the paper because this renders the main claim of the paper false. Thanks to Guan Xinyu from NUS for discovering this issue in our previously open source code.
△ Less
Submitted 19 July, 2018; v1 submitted 17 June, 2018;
originally announced June 2018.
-
Multi-Cast Attention Networks for Retrieval-based Question Answering and Response Prediction
Authors:
Yi Tay,
Luu Anh Tuan,
Siu Cheung Hui
Abstract:
Attention is typically used to select informative sub-phrases that are used for prediction. This paper investigates the novel use of attention as a form of feature augmentation, i.e, casted attention. We propose Multi-Cast Attention Networks (MCAN), a new attention mechanism and general model architecture for a potpourri of ranking tasks in the conversational modeling and question answering domain…
▽ More
Attention is typically used to select informative sub-phrases that are used for prediction. This paper investigates the novel use of attention as a form of feature augmentation, i.e, casted attention. We propose Multi-Cast Attention Networks (MCAN), a new attention mechanism and general model architecture for a potpourri of ranking tasks in the conversational modeling and question answering domains. Our approach performs a series of soft attention operations, each time casting a scalar feature upon the inner word embeddings. The key idea is to provide a real-valued hint (feature) to a subsequent encoder layer and is targeted at improving the representation learning process. There are several advantages to this design, e.g., it allows an arbitrary number of attention mechanisms to be casted, allowing for multiple attention types (e.g., co-attention, intra-attention) and attention variants (e.g., alignment-pooling, max-pooling, mean-pooling) to be executed simultaneously. This not only eliminates the costly need to tune the nature of the co-attention layer, but also provides greater extents of explainability to practitioners. Via extensive experiments on four well-known benchmark datasets, we show that MCAN achieves state-of-the-art performance. On the Ubuntu Dialogue Corpus, MCAN outperforms existing state-of-the-art models by $9\%$. MCAN also achieves the best performing score to date on the well-studied TrecQA dataset.
△ Less
Submitted 3 June, 2018;
originally announced June 2018.
-
CoupleNet: Paying Attention to Couples with Coupled Attention for Relationship Recommendation
Authors:
Yi Tay,
Anh Tuan Luu,
Siu Cheung Hui
Abstract:
Dating and romantic relationships not only play a huge role in our personal lives but also collectively influence and shape society. Today, many romantic partnerships originate from the Internet, signifying the importance of technology and the web in modern dating. In this paper, we present a text-based computational approach for estimating the relationship compatibility of two users on social med…
▽ More
Dating and romantic relationships not only play a huge role in our personal lives but also collectively influence and shape society. Today, many romantic partnerships originate from the Internet, signifying the importance of technology and the web in modern dating. In this paper, we present a text-based computational approach for estimating the relationship compatibility of two users on social media. Unlike many previous works that propose reciprocal recommender systems for online dating websites, we devise a distant supervision heuristic to obtain real world couples from social platforms such as Twitter. Our approach, the CoupleNet is an end-to-end deep learning based estimator that analyzes the social profiles of two users and subsequently performs a similarity match between the users. Intuitively, our approach performs both user profiling and match-making within a unified end-to-end framework. CoupleNet utilizes hierarchical recurrent neural models for learning representations of user profiles and subsequently coupled attention mechanisms to fuse information aggregated from two users. To the best of our knowledge, our approach is the first data-driven deep learning approach for our novel relationship recommendation problem. We benchmark our CoupleNet against several machine learning and deep learning baselines. Experimental results show that our approach outperforms all approaches significantly in terms of precision. Qualitative analysis shows that our model is capable of also producing explainable results to users.
△ Less
Submitted 29 May, 2018;
originally announced May 2018.
-
Standard Cell Library Evaluation with Multiple lithography-compliant verification and Improved Synopsys Pin Access Checking Utility
Authors:
Yongfu Li,
Wan Chia Ang,
Chin Hui Lee,
Kok Peng Chua,
Yoong Seang Jonathan Ong,
Chiu Wing Colin Hui
Abstract:
While standard cell layouts are drawn with minimum design rules to maximize the benefit of design area shrinkage, the complicated design rules have caused difficulties with signal routes accessing the pins in standard cell layouts. As a result, it has become a great challenge for physical layout designers to design a standard cell layout that is optimized for area, power, timing, signal integrity,…
▽ More
While standard cell layouts are drawn with minimum design rules to maximize the benefit of design area shrinkage, the complicated design rules have caused difficulties with signal routes accessing the pins in standard cell layouts. As a result, it has become a great challenge for physical layout designers to design a standard cell layout that is optimized for area, power, timing, signal integrity, and printability. Multiple design iterations are required to consider pin accessibility during standard cells layout to increase the number of feasible solutions available to the router. In this work, we will demonstrate several improvements with the Synopsys PAC methodology, such as reducing the number of cells required for each Synopsys 'testcell' with the same cell abutment condition, increasing the complexity of the pin connection for better pin accessibility evaluation. We also recommend additional constraints to improve the probability of detecting pin accessibility issues. We also integrate other physical verification methods to access the design rule compliance and the printability of standard cells. We hope that the easy to use utility enables layout engineers to perform the verification, simplifying the verification methodology.
△ Less
Submitted 27 May, 2018;
originally announced May 2018.
-
Multiple-Lithography-Compliant Verification for Standard Cell Library Development Flow
Authors:
Yongfu Li,
Wan Chia Ang,
Chin Hui Lee,
Kok Peng Chua,
Yoong Seang Jonathan Ong,
Chiu Wing Colin Hui
Abstract:
Starting from 22-nm, a standard cell must be designed to be full lithography-compliant, which includes Design Rule Check, Design-for-Manufacturability and Double-Patterning compliant. It has become a great challenge for physical layout designers to provide a full lithography-compliant standard cell layout that is optimized for area, power, timing, signal integrity, and yield. This challenge is fur…
▽ More
Starting from 22-nm, a standard cell must be designed to be full lithography-compliant, which includes Design Rule Check, Design-for-Manufacturability and Double-Patterning compliant. It has become a great challenge for physical layout designers to provide a full lithography-compliant standard cell layout that is optimized for area, power, timing, signal integrity, and yield. This challenge is further exacerbated with abutted single- and multiple-height standard cells. At present, different foundries and library vendors have different approaches for full lithography-compliant library preparation and validation. To the best of our knowledge, there is no single tool integrates all types of lithography-compliant check in standard cell libraries validation flow. In this work, we will demonstrate multiple lithography-compliant verification for standard cell library development flow. Validation flow and detailed algorithm implementation will be explained to assist engineers to achieve full lithography-compliant standard cell libraries. An area-efficient standard cell placement methodology will also be discussed to validate the issues arises from standard cell abutment.
△ Less
Submitted 27 May, 2018;
originally announced May 2018.
-
Constraining the Synopsys Pin Access Checker Utility for Improved Standard Cells Library Verification Flow
Authors:
Yongfu Li,
Chin Hui Lee,
Wan Chia Ang,
Kok Peng Chua,
Yoong Seang Jonathan Ong,
Chiu Wing Colin Hui
Abstract:
While standard cell layouts are drawn with minimum design rules for maximum benefit of design area shrinkage, the complicated design rules begin to cause difficulties with signal routes accessing the pins in standard cell layouts. Multiple design iterations are required to resolve routing issues, thus increasing the runtime and the overall chip area. To optimize the chip performance, power and are…
▽ More
While standard cell layouts are drawn with minimum design rules for maximum benefit of design area shrinkage, the complicated design rules begin to cause difficulties with signal routes accessing the pins in standard cell layouts. Multiple design iterations are required to resolve routing issues, thus increasing the runtime and the overall chip area. To optimize the chip performance, power and area (PPA) and improve the routability, it is necessary to consider the pin accessibility during standard cell development phase so that each cell is designed to maximize the number of feasible pin-access solutions available to the router. As part of the Synopsys IC Compiler Library Preparation Reference Methodology, the Synopsys Pin Access Checker (PAC) reports DRC violations associated with the standard cell. Based on Synopsys PAC's methodology, we demonstrate several methods to improve the probability of detecting pin accessibility issues, such as reducing the number of cells required for each Synopsys 'testcell', increasing the complexity of the pin connectivity assignment and recommending the router constraints.
△ Less
Submitted 25 May, 2018;
originally announced May 2018.
-
Reasoning with Sarcasm by Reading In-between
Authors:
Yi Tay,
Luu Anh Tuan,
Siu Cheung Hui,
Jian Su
Abstract:
Sarcasm is a sophisticated speech act which commonly manifests on social communities such as Twitter and Reddit. The prevalence of sarcasm on the social web is highly disruptive to opinion mining systems due to not only its tendency of polarity flipping but also usage of figurative language. Sarcasm commonly manifests with a contrastive theme either between positive-negative sentiments or between…
▽ More
Sarcasm is a sophisticated speech act which commonly manifests on social communities such as Twitter and Reddit. The prevalence of sarcasm on the social web is highly disruptive to opinion mining systems due to not only its tendency of polarity flipping but also usage of figurative language. Sarcasm commonly manifests with a contrastive theme either between positive-negative sentiments or between literal-figurative scenarios. In this paper, we revisit the notion of modeling contrast in order to reason with sarcasm. More specifically, we propose an attention-based neural model that looks in-between instead of across, enabling it to explicitly model contrast and incongruity. We conduct extensive experiments on six benchmark datasets from Twitter, Reddit and the Internet Argument Corpus. Our proposed model not only achieves state-of-the-art performance on all datasets but also enjoys improved interpretability.
△ Less
Submitted 8 May, 2018;
originally announced May 2018.
-
Multi-range Reasoning for Machine Comprehension
Authors:
Yi Tay,
Luu Anh Tuan,
Siu Cheung Hui
Abstract:
We propose MRU (Multi-Range Reasoning Units), a new fast compositional encoder for machine comprehension (MC). Our proposed MRU encoders are characterized by multi-ranged gating, executing a series of parameterized contract-and-expand layers for learning gating vectors that benefit from long and short-term dependencies. The aims of our approach are as follows: (1) learning representations that are…
▽ More
We propose MRU (Multi-Range Reasoning Units), a new fast compositional encoder for machine comprehension (MC). Our proposed MRU encoders are characterized by multi-ranged gating, executing a series of parameterized contract-and-expand layers for learning gating vectors that benefit from long and short-term dependencies. The aims of our approach are as follows: (1) learning representations that are concurrently aware of long and short-term context, (2) modeling relationships between intra-document blocks and (3) fast and efficient sequence encoding. We show that our proposed encoder demonstrates promising results both as a standalone encoder and as well as a complementary building block. We conduct extensive experiments on three challenging MC datasets, namely RACE, SearchQA and NarrativeQA, achieving highly competitive performance on all. On the RACE benchmark, our model outperforms DFN (Dynamic Fusion Networks) by 1.5%-6% without using any recurrent or convolution layers. Similarly, we achieve competitive performance relative to AMANDA on the SearchQA benchmark and BiDAF on the NarrativeQA benchmark without using any LSTM/GRU layers. Finally, incorporating MRU encoders with standard BiLSTM architectures further improves performance, achieving state-of-the-art results.
△ Less
Submitted 24 March, 2018;
originally announced March 2018.
-
Multi-Pointer Co-Attention Networks for Recommendation
Authors:
Yi Tay,
Luu Anh Tuan,
Siu Cheung Hui
Abstract:
Many recent state-of-the-art recommender systems such as D-ATT, TransNet and DeepCoNN exploit reviews for representation learning. This paper proposes a new neural architecture for recommendation with reviews. Our model operates on a multi-hierarchical paradigm and is based on the intuition that not all reviews are created equal, i.e., only a select few are important. The importance, however, shou…
▽ More
Many recent state-of-the-art recommender systems such as D-ATT, TransNet and DeepCoNN exploit reviews for representation learning. This paper proposes a new neural architecture for recommendation with reviews. Our model operates on a multi-hierarchical paradigm and is based on the intuition that not all reviews are created equal, i.e., only a select few are important. The importance, however, should be dynamically inferred depending on the current target. To this end, we propose a review-by-review pointer-based learning scheme that extracts important reviews, subsequently matching them in a word-by-word fashion. This enables not only the most informative reviews to be utilized for prediction but also a deeper word-level interaction. Our pointer-based method operates with a novel gumbel-softmax based pointer mechanism that enables the incorporation of discrete vectors within differentiable neural architectures. Our pointer mechanism is co-attentive in nature, learning pointers which are co-dependent on user-item relationships. Finally, we propose a multi-pointer learning scheme that learns to combine multiple views of interactions between user and item. Overall, we demonstrate the effectiveness of our proposed model via extensive experiments on \textbf{24} benchmark datasets from Amazon and Yelp. Empirical results show that our approach significantly outperforms existing state-of-the-art, with up to 19% and 71% relative improvement when compared to TransNet and DeepCoNN respectively. We study the behavior of our multi-pointer learning mechanism, shedding light on evidence aggregation patterns in review-based recommender systems.
△ Less
Submitted 21 June, 2018; v1 submitted 28 January, 2018;
originally announced January 2018.
-
Compare, Compress and Propagate: Enhancing Neural Architectures with Alignment Factorization for Natural Language Inference
Authors:
Yi Tay,
Luu Anh Tuan,
Siu Cheung Hui
Abstract:
This paper presents a new deep learning architecture for Natural Language Inference (NLI). Firstly, we introduce a new architecture where alignment pairs are compared, compressed and then propagated to upper layers for enhanced representation learning. Secondly, we adopt factorization layers for efficient and expressive compression of alignment vectors into scalar features, which are then used to…
▽ More
This paper presents a new deep learning architecture for Natural Language Inference (NLI). Firstly, we introduce a new architecture where alignment pairs are compared, compressed and then propagated to upper layers for enhanced representation learning. Secondly, we adopt factorization layers for efficient and expressive compression of alignment vectors into scalar features, which are then used to augment the base word representations. The design of our approach is aimed to be conceptually simple, compact and yet powerful. We conduct experiments on three popular benchmarks, SNLI, MultiNLI and SciTail, achieving competitive performance on all. A lightweight parameterization of our model also enjoys a $\approx 3$ times reduction in parameter size compared to the existing state-of-the-art models, e.g., ESIM and DIIN, while maintaining competitive performance. Additionally, visual analysis shows that our propagated features are highly interpretable.
△ Less
Submitted 10 September, 2018; v1 submitted 30 December, 2017;
originally announced January 2018.
-
Learning to Attend via Word-Aspect Associative Fusion for Aspect-based Sentiment Analysis
Authors:
Yi Tay,
Anh Tuan Luu,
Siu Cheung Hui
Abstract:
Aspect-based sentiment analysis (ABSA) tries to predict the polarity of a given document with respect to a given aspect entity. While neural network architectures have been successful in predicting the overall polarity of sentences, aspect-specific sentiment analysis still remains as an open problem. In this paper, we propose a novel method for integrating aspect information into the neural model.…
▽ More
Aspect-based sentiment analysis (ABSA) tries to predict the polarity of a given document with respect to a given aspect entity. While neural network architectures have been successful in predicting the overall polarity of sentences, aspect-specific sentiment analysis still remains as an open problem. In this paper, we propose a novel method for integrating aspect information into the neural model. More specifically, we incorporate aspect information into the neural model by modeling word-aspect relationships. Our novel model, \textit{Aspect Fusion LSTM} (AF-LSTM) learns to attend based on associative relationships between sentence words and aspect which allows our model to adaptively focus on the correct words given an aspect term. This ameliorates the flaws of other state-of-the-art models that utilize naive concatenations to model word-aspect similarity. Instead, our model adopts circular convolution and circular correlation to model the similarity between aspect and words and elegantly incorporates this within a differentiable neural attention framework. Finally, our model is end-to-end differentiable and highly related to convolution-correlation (holographic like) memories. Our proposed neural model achieves state-of-the-art performance on benchmark datasets, outperforming ATAE-LSTM by $4\%-5\%$ on average across multiple datasets.
△ Less
Submitted 14 December, 2017;
originally announced December 2017.
-
Cross Temporal Recurrent Networks for Ranking Question Answer Pairs
Authors:
Yi Tay,
Luu Anh Tuan,
Siu Cheung Hui
Abstract:
Temporal gates play a significant role in modern recurrent-based neural encoders, enabling fine-grained control over recursive compositional operations over time. In recurrent models such as the long short-term memory (LSTM), temporal gates control the amount of information retained or discarded over time, not only playing an important role in influencing the learned representations but also servi…
▽ More
Temporal gates play a significant role in modern recurrent-based neural encoders, enabling fine-grained control over recursive compositional operations over time. In recurrent models such as the long short-term memory (LSTM), temporal gates control the amount of information retained or discarded over time, not only playing an important role in influencing the learned representations but also serving as a protection against vanishing gradients. This paper explores the idea of learning temporal gates for sequence pairs (question and answer), jointly influencing the learned representations in a pairwise manner. In our approach, temporal gates are learned via 1D convolutional layers and then subsequently cross applied across question and answer for joint learning. Empirically, we show that this conceptually simple sharing of temporal gates can lead to competitive performance across multiple benchmarks. Intuitively, what our network achieves can be interpreted as learning representations of question and answer pairs that are aware of what each other is remembering or forgetting, i.e., pairwise temporal gating. Via extensive experiments, we show that our proposed model achieves state-of-the-art performance on two community-based QA datasets and competitive performance on one factoid-based QA dataset.
△ Less
Submitted 21 November, 2017;
originally announced November 2017.
-
SkipFlow: Incorporating Neural Coherence Features for End-to-End Automatic Text Scoring
Authors:
Yi Tay,
Minh C. Phan,
Luu Anh Tuan,
Siu Cheung Hui
Abstract:
Deep learning has demonstrated tremendous potential for Automatic Text Scoring (ATS) tasks. In this paper, we describe a new neural architecture that enhances vanilla neural network models with auxiliary neural coherence features. Our new method proposes a new \textsc{SkipFlow} mechanism that models relationships between snapshots of the hidden representations of a long short-term memory (LSTM) ne…
▽ More
Deep learning has demonstrated tremendous potential for Automatic Text Scoring (ATS) tasks. In this paper, we describe a new neural architecture that enhances vanilla neural network models with auxiliary neural coherence features. Our new method proposes a new \textsc{SkipFlow} mechanism that models relationships between snapshots of the hidden representations of a long short-term memory (LSTM) network as it reads. Subsequently, the semantic relationships between multiple snapshots are used as auxiliary features for prediction. This has two main benefits. Firstly, essays are typically long sequences and therefore the memorization capability of the LSTM network may be insufficient. Implicit access to multiple snapshots can alleviate this problem by acting as a protection against vanishing gradients. The parameters of the \textsc{SkipFlow} mechanism also acts as an auxiliary memory. Secondly, modeling relationships between multiple positions allows our model to learn features that represent and approximate textual coherence. In our model, we call this \textit{neural coherence} features. Overall, we present a unified deep learning architecture that generates neural coherence features as it reads in an end-to-end fashion. Our approach demonstrates state-of-the-art performance on the benchmark ASAP dataset, outperforming not only feature engineering baselines but also other deep learning models.
△ Less
Submitted 14 November, 2017;
originally announced November 2017.
-
Differentially Private Regression for Discrete-Time Survival Analysis
Authors:
Thông T. Nguyên,
Siu Cheung Hui
Abstract:
In survival analysis, regression models are used to understand the effects of explanatory variables (e.g., age, sex, weight, etc.) to the survival probability. However, for sensitive survival data such as medical data, there are serious concerns about the privacy of individuals in the data set when medical data is used to fit the regression models. The closest work addressing such privacy concerns…
▽ More
In survival analysis, regression models are used to understand the effects of explanatory variables (e.g., age, sex, weight, etc.) to the survival probability. However, for sensitive survival data such as medical data, there are serious concerns about the privacy of individuals in the data set when medical data is used to fit the regression models. The closest work addressing such privacy concerns is the work on Cox regression which linearly projects the original data to a lower dimensional space. However, the weakness of this approach is that there is no formal privacy guarantee for such projection. In this work, we aim to propose solutions for the regression problem in survival analysis with the protection of differential privacy which is a golden standard of privacy protection in data privacy research. To this end, we extend the Output Perturbation and Objective Perturbation approaches which are originally proposed to protect differential privacy for the Empirical Risk Minimization (ERM) problems. In addition, we also propose a novel sampling approach based on the Markov Chain Monte Carlo (MCMC) method to practically guarantee differential privacy with better accuracy. We show that our proposed approaches achieve good accuracy as compared to the non-private results while guaranteeing differential privacy for individuals in the private data set.
△ Less
Submitted 24 August, 2017; v1 submitted 24 August, 2017;
originally announced August 2017.
-
Multi-task Neural Network for Non-discrete Attribute Prediction in Knowledge Graphs
Authors:
Yi Tay,
Luu Anh Tuan,
Minh C. Phan,
Siu Cheung Hui
Abstract:
Many popular knowledge graphs such as Freebase, YAGO or DBPedia maintain a list of non-discrete attributes for each entity. Intuitively, these attributes such as height, price or population count are able to richly characterize entities in knowledge graphs. This additional source of information may help to alleviate the inherent sparsity and incompleteness problem that are prevalent in knowledge g…
▽ More
Many popular knowledge graphs such as Freebase, YAGO or DBPedia maintain a list of non-discrete attributes for each entity. Intuitively, these attributes such as height, price or population count are able to richly characterize entities in knowledge graphs. This additional source of information may help to alleviate the inherent sparsity and incompleteness problem that are prevalent in knowledge graphs. Unfortunately, many state-of-the-art relational learning models ignore this information due to the challenging nature of dealing with non-discrete data types in the inherently binary-natured knowledge graphs. In this paper, we propose a novel multi-task neural network approach for both encoding and prediction of non-discrete attribute information in a relational setting. Specifically, we train a neural network for triplet prediction along with a separate network for attribute value regression. Via multi-task learning, we are able to learn representations of entities, relations and attributes that encode information about both tasks. Moreover, such attributes are not only central to many predictive tasks as an information source but also as a prediction target. Therefore, models that are able to encode, incorporate and predict such information in a relational learning context are highly attractive as well. We show that our approach outperforms many state-of-the-art methods for the tasks of relational triplet classification and attribute value prediction.
△ Less
Submitted 16 August, 2017;
originally announced August 2017.
-
Privacy-Preserving Mechanisms for Parametric Survival Analysis with Weibull Distribution
Authors:
Thông T. Nguyên,
Siu Cheung Hui
Abstract:
Survival analysis studies the statistical properties of the time until an event of interest occurs. It has been commonly used to study the effectiveness of medical treatments or the lifespan of a population. However, survival analysis can potentially leak confidential information of individuals in the dataset. The state-of-the-art techniques apply ad-hoc privacy-preserving mechanisms on publishing…
▽ More
Survival analysis studies the statistical properties of the time until an event of interest occurs. It has been commonly used to study the effectiveness of medical treatments or the lifespan of a population. However, survival analysis can potentially leak confidential information of individuals in the dataset. The state-of-the-art techniques apply ad-hoc privacy-preserving mechanisms on publishing results to protect the privacy. These techniques usually publish sanitized and randomized answers which promise to protect the privacy of individuals in the dataset but without providing any formal mechanism on privacy protection. In this paper, we propose private mechanisms for parametric survival analysis with Weibull distribution. We prove that our proposed mechanisms achieve differential privacy, a robust and rigorous definition of privacy-preservation. Our mechanisms exploit the property of local sensitivity to carefully design a utility function which enables us to publish parameters of Weibull distribution with high precision. Our experimental studies show that our mechanisms can publish useful answers and outperform other differentially private techniques on real datasets.
△ Less
Submitted 24 August, 2017; v1 submitted 1 July, 2017;
originally announced August 2017.
-
Hyperbolic Representation Learning for Fast and Efficient Neural Question Answering
Authors:
Yi Tay,
Luu Anh Tuan,
Siu Cheung Hui
Abstract:
The dominant neural architectures in question answer retrieval are based on recurrent or convolutional encoders configured with complex word matching layers. Given that recent architectural innovations are mostly new word interaction layers or attention-based matching mechanisms, it seems to be a well-established fact that these components are mandatory for good performance. Unfortunately, the mem…
▽ More
The dominant neural architectures in question answer retrieval are based on recurrent or convolutional encoders configured with complex word matching layers. Given that recent architectural innovations are mostly new word interaction layers or attention-based matching mechanisms, it seems to be a well-established fact that these components are mandatory for good performance. Unfortunately, the memory and computation cost incurred by these complex mechanisms are undesirable for practical applications. As such, this paper tackles the question of whether it is possible to achieve competitive performance with simple neural architectures. We propose a simple but novel deep learning architecture for fast and efficient question-answer ranking and retrieval. More specifically, our proposed model, \textsc{HyperQA}, is a parameter efficient neural network that outperforms other parameter intensive models such as Attentive Pooling BiLSTMs and Multi-Perspective CNNs on multiple QA benchmarks. The novelty behind \textsc{HyperQA} is a pairwise ranking objective that models the relationship between question and answer embeddings in Hyperbolic space instead of Euclidean space. This empowers our model with a self-organizing ability and enables automatic discovery of latent hierarchies while learning embeddings of questions and answers. Our model requires no feature engineering, no similarity matrix matching, no complicated attention mechanisms nor over-parameterized layers and yet outperforms and remains competitive to many models that have these functionalities on multiple benchmarks.
△ Less
Submitted 23 November, 2017; v1 submitted 25 July, 2017;
originally announced July 2017.