Search | arXiv e-print repository

Spatial-temporal Hierarchical Reinforcement Learning for Interpretable Pathology Image Super-Resolution

Authors: Wenting Chen, Jie Liu, Tommy W. S. Chow, Yixuan Yuan

Abstract: Pathology images are essential for accurately interpreting lesion cells in cytopathology screening, but acquiring high-resolution digital slides requires specialized equipment and long scanning times. Though super-resolution (SR) techniques can alleviate this problem, existing deep learning models recover pathology images in a black-box manner, which can lead to untruthful biological details and misdiagnosis. Additionally, current methods allocate the same computational resources to recover each pixel of a pathology image, leading to sub-optimal recovery due to the large variation among pathology images. In this paper, we propose the first hierarchical reinforcement learning framework named Spatial-Temporal hierARchical Reinforcement Learning (STAR-RL), mainly for addressing the aforementioned issues in the pathology image super-resolution problem. We reformulate the SR problem as a Markov decision process of interpretable operations and adopt the hierarchical recovery mechanism at the patch level to avoid sub-optimal recovery. Specifically, the higher-level spatial manager is proposed to pick out the most corrupted patch for the lower-level patch worker. Moreover, the higher-level temporal manager is introduced to evaluate the selected patch and determine whether the optimization should be stopped earlier, thereby avoiding over-processing. Under the guidance of the spatial-temporal managers, the lower-level patch worker processes the selected patch with pixel-wise interpretable actions at each time step. Experimental results on medical images degraded by different kernels show the effectiveness of STAR-RL. Furthermore, STAR-RL improves tumor diagnosis by a large margin and shows generalizability under various degradations. The source code is available at https://github.com/CUHK-AIM-Group/STAR-RL.

Submitted 26 June, 2024; originally announced June 2024.

Comments: Accepted to IEEE TRANSACTIONS ON MEDICAL IMAGING (TMI)

arXiv:2406.07061 [pdf, other]

Triage of 3D pathology data via 2.5D multiple-instance learning to guide pathologist assessments

Authors: Gan Gao, Andrew H. Song, Fiona Wang, David Brenes, Rui Wang, Sarah S. L. Chow, Kevin W. Bishop, Lawrence D. True, Faisal Mahmood, Jonathan T. C. Liu

Abstract: Accurate patient diagnoses based on human tissue biopsies are hindered by current clinical practice, where pathologists assess only a limited number of thin 2D tissue slices sectioned from 3D volumetric tissue. Recent advances in non-destructive 3D pathology, such as open-top light-sheet microscopy, enable comprehensive imaging of spatially heterogeneous tissue morphologies, offering the feasibility to improve diagnostic determinations. A potential early route towards clinical adoption for 3D pathology is to rely on pathologists for final diagnosis based on viewing familiar 2D H&E-like image sections from the 3D datasets. However, manual examination of the massive 3D pathology datasets is infeasible. To address this, we present CARP3D, a deep learning triage approach that automatically identifies the highest-risk 2D slices within 3D volumetric biopsy, enabling time-efficient review by pathologists. For a given slice in the biopsy, we estimate its risk by performing attention-based aggregation of 2D patches within each slice, followed by pooling of the neighboring slices to compute a context-aware 2.5D risk score. For prostate cancer risk stratification, CARP3D achieves an area under the curve (AUC) of 90.4% for triaging slices, outperforming methods relying on independent analysis of 2D sections (AUC=81.3%). These results suggest that integrating additional depth context enhances the model's discriminative capabilities. In conclusion, CARP3D has the potential to improve pathologist diagnosis via accurate triage of high-risk slices within large-volume 3D pathology datasets.

Submitted 11 June, 2024; originally announced June 2024.

Comments: CVPR CVMI 2024

Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024, pp. 6955-6965
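The two-stage scoring described in the CARP3D abstract (attention-based aggregation within a slice, then pooling over neighboring slices) can be illustrated with a toy sketch. Everything below is an assumption for illustration only: the function names, the use of scalar per-patch risks in place of learned patch features, the softmax attention, and the mean pooling window are hypothetical and are not taken from the authors' implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def slice_score(patch_risks, attention_logits):
    """Attention-weighted average of per-patch risks within one 2D slice."""
    weights = softmax(attention_logits)
    return sum(w * r for w, r in zip(weights, patch_risks))


def context_scores(slice_scores, radius=1):
    """Mean-pool each slice score with its neighbors within `radius` slices."""
    pooled = []
    for i in range(len(slice_scores)):
        lo = max(0, i - radius)
        hi = min(len(slice_scores), i + radius + 1)
        window = slice_scores[lo:hi]
        pooled.append(sum(window) / len(window))
    return pooled


def triage(volume, radius=1, top_k=1):
    """Rank slices by pooled (2.5D) score; `volume` is a list of
    (patch_risks, attention_logits) pairs, one pair per slice."""
    per_slice = [slice_score(risks, logits) for risks, logits in volume]
    pooled = context_scores(per_slice, radius)
    ranked = sorted(range(len(pooled)), key=lambda i: pooled[i], reverse=True)
    return ranked[:top_k], pooled


# Hypothetical three-slice volume with uniform attention (all logits zero).
toy_volume = [
    ([0.1, 0.1], [0.0, 0.0]),
    ([0.9, 0.7], [0.0, 0.0]),
    ([0.2, 0.4], [0.0, 0.0]),
]
top_slices, pooled_scores = triage(toy_volume, radius=1, top_k=1)
```

In this toy volume the pooled score can rank a slice differently from its stand-alone score, which is the kind of depth-context effect the abstract refers to.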

arXiv:2405.08194 [pdf, other]

Distributionally Robust Degree Optimization for BATS Codes

Authors: Hoover H. F. Yin, Jie Wang, Sherman S. M. Chow

Abstract: Batched sparse (BATS) code is a network coding solution for multi-hop wireless networks with packet loss. Achieving a close-to-optimal rate relies on an optimal degree distribution. Technical challenges arise from the sensitivity of this distribution to the often empirically obtained rank distribution at the destination node. Specifically, if the empirical distribution overestimates the channel, BATS codes experience a significant rate degradation, leading to unstable rates across different runs and hence unpredictable transmission costs. Confronting this unresolved obstacle, we introduce a formulation for distributionally robust optimization in degree optimization. Deploying the resulting degree distribution resolves the instability of empirical rank distributions, ensuring a close-to-optimal rate, and unleashing the potential of applying BATS codes in real-world scenarios.

Submitted 13 May, 2024; originally announced May 2024.

Comments: 8 pages, accepted by 2024 IEEE International Symposium on Information Theory

arXiv:2309.06746 [pdf, other]

doi 10.1145/3576915.3616592

DP-Forward: Fine-tuning and Inference on Language Models with Differential Privacy in Forward Pass

Authors: Minxin Du, Xiang Yue, Sherman S. M. Chow, Tianhao Wang, Chenyu Huang, Huan Sun

Abstract: Differentially private stochastic gradient descent (DP-SGD) adds noise to gradients in back-propagation, safeguarding training data from privacy leakage, particularly membership inference. It fails to cover (inference-time) threats like embedding inversion and sensitive attribute inference. It is also costly in storage and computation when used to fine-tune large pre-trained language models (LMs). We propose DP-Forward, which directly perturbs embedding matrices in the forward pass of LMs. It satisfies stringent local DP requirements for training and inference data. To instantiate it using the smallest matrix-valued noise, we devise an analytic matrix Gaussian mechanism (aMGM) by drawing possibly non-i.i.d. noise from a matrix Gaussian distribution. We then investigate perturbing outputs from different hidden (sub-)layers of LMs with aMGM noises. Its utility on three typical tasks almost hits the non-private baseline and outperforms DP-SGD by up to 7.7pp at a moderate privacy level. It saves 3× time and memory costs compared to DP-SGD with the latest high-speed library. It also reduces the average success rates of embedding inversion and sensitive attribute inference by up to 88pp and 41pp, respectively, whereas DP-SGD fails.

Submitted 19 September, 2023; v1 submitted 13 September, 2023; originally announced September 2023.

Comments: To appear at ACM CCS '23. This is the full version. The first two authors contribute equally
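The idea of perturbing an embedding matrix in the forward pass can be sketched with the classical (i.i.d.) Gaussian mechanism. This is not the paper's analytic matrix Gaussian mechanism (aMGM): the helper names, the Frobenius-norm clipping step, and the classical noise calibration (which is only valid for 0 < epsilon < 1) are assumptions chosen to keep the example short.

```python
import math
import random


def clip_frobenius(matrix, bound):
    """Scale the matrix down so its Frobenius norm is at most `bound`."""
    norm = math.sqrt(sum(x * x for row in matrix for x in row))
    scale = min(1.0, bound / norm) if norm > 0 else 1.0
    return [[x * scale for x in row] for row in matrix]


def gaussian_sigma(epsilon, delta, sensitivity):
    """Classical Gaussian-mechanism noise scale (assumes 0 < epsilon < 1)."""
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def perturb_embeddings(matrix, epsilon, delta, bound=1.0, rng=None):
    """Clip an embedding matrix, then add i.i.d. Gaussian noise to each entry."""
    rng = rng or random.Random()
    clipped = clip_frobenius(matrix, bound)
    # Any two clipped inputs differ by at most 2 * bound in Frobenius norm,
    # which serves as the L2 sensitivity in this local-DP style setting.
    sigma = gaussian_sigma(epsilon, delta, 2.0 * bound)
    return [[x + rng.gauss(0.0, sigma) for x in row] for row in clipped]


# Hypothetical 2x3 "embedding matrix" and privacy parameters.
embeddings = [[0.2, -0.1, 0.4], [0.3, 0.0, -0.5]]
noisy = perturb_embeddings(embeddings, epsilon=0.5, delta=1e-5,
                           rng=random.Random(0))
```

The noise is added to the data representation itself rather than to gradients, so the same perturbation step applies at both training and inference time, which is the contrast with DP-SGD drawn in the abstract.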

arXiv:2305.07593 [pdf, ps, other]

Unconditionally Secure Access Control Encryption

Authors: Cheuk Ting Li, Sherman S. M. Chow

Abstract: Access control encryption (ACE) enforces, through a sanitizer as the mediator, that only legitimate sender-receiver pairs can communicate, without the sanitizer knowing the communication metadata, including its sender and recipient identity, the policy over them, and the underlying plaintext. Any illegitimate transmission is indistinguishable from pure noise. Existing works focused on computational security and require trapdoor functions and possibly other heavyweight primitives. We present the first ACE scheme with information-theoretic security (unconditionally against unbounded adversaries). Our novel randomization techniques over matrices realize sanitization (traditionally via homomorphism over a fixed randomness space) such that the secret message in the hidden message subspace remains intact if and only if there is no illegitimate transmission.

Submitted 12 May, 2023; originally announced May 2023.

Comments: 10 pages. This is the long version of a paper to be presented at 2023 IEEE International Symposium on Information Theory

arXiv:2304.07971 [pdf, other]

doi 10.1145/3539618.3591649

Collaborative Residual Metric Learning

Authors: Tianjun Wei, Jianghong Ma, Tommy W. S. Chow

Abstract: In collaborative filtering, distance metric learning has been applied to matrix factorization techniques with promising results. However, matrix factorization lacks the ability to capture collaborative information, which has been noted by recent works and improved by interpreting user interactions as signals. This paper aims to find out how metric learning connects to these signal-based models. By adopting a generalized distance metric, we discovered that in signal-based models, it is easier to estimate the residual of distances, which refers to the difference between the distances from a user to a target item and another item, rather than estimating the distances themselves. Further analysis also uncovers a link between the normalization strength of interaction signals and the novelty of recommendation, which has been overlooked by existing studies. Based on the above findings, we propose a novel model to learn a generalized user-item distance metric to capture user preference in interaction signals by modeling the residuals of distance. The proposed CoRML model is then further improved in training efficiency by a newly introduced approximated ranking weight. Extensive experiments conducted on 4 public datasets demonstrate the superior performance of CoRML compared to the state-of-the-art baselines in collaborative filtering, along with high efficiency and the ability to provide novelty-promoted recommendations, shedding new light on the study of metric learning-based recommender systems.

Submitted 16 April, 2023; originally announced April 2023.

Comments: Accepted by SIGIR '23

arXiv:2304.03841 [pdf, other]

Efficient Secure Aggregation for Privacy-Preserving Federated Machine Learning

Authors: Rouzbeh Behnia, Arman Riasi, Reza Ebrahimi, Sherman S. M. Chow, Balaji Padmanabhan, Thang Hoang

Abstract: Secure aggregation protocols ensure the privacy of users' data in federated learning by preventing the disclosure of local gradients. Many existing protocols impose significant communication and computational burdens on participants and may not efficiently handle the large update vectors typical of machine learning models. Correspondingly, we present e-SeaFL, an efficient verifiable secure aggregation protocol taking only one communication round during the aggregation phase. e-SeaFL allows the aggregation server to generate proof of honest aggregation to participants via authenticated homomorphic vector commitments. Our core idea is the use of assisting nodes to help the aggregation server, under similar trust assumptions existing works place upon the participating users. Our experiments show that the user enjoys an order of magnitude efficiency improvement over the state-of-the-art (IEEE S&P 2023) for large gradient vectors with thousands of parameters. Our open-source implementation is available at https://github.com/vt-asaplab/e-SeaFL.

Submitted 8 November, 2024; v1 submitted 7 April, 2023; originally announced April 2023.

Comments: Accepted in ACSAC 2024
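To make the role of helper parties in secure aggregation concrete, here is a generic additive-masking sketch. It is not the e-SeaFL protocol: the seed-sharing layout, the function names, and the use of Python's `random.Random` as a stand-in mask generator (which is not cryptographically secure) are all assumptions, and the verifiability step based on authenticated homomorphic vector commitments is omitted entirely.

```python
import random

MODULUS = 2 ** 32  # all arithmetic is done modulo this value


def mask_vector(seed, length):
    """Expand a shared seed into a mask vector (toy PRG, NOT secure)."""
    rng = random.Random(seed)
    return [rng.randrange(MODULUS) for _ in range(length)]


def add_mod(a, b):
    return [(x + y) % MODULUS for x, y in zip(a, b)]


def sub_mod(a, b):
    return [(x - y) % MODULUS for x, y in zip(a, b)]


def user_masked_update(update, seeds_for_user):
    """A user hides its update under one mask per assisting node."""
    masked = [u % MODULUS for u in update]
    for seed in seeds_for_user:
        masked = add_mod(masked, mask_vector(seed, len(update)))
    return masked


def node_mask_sum(seeds_for_node, length):
    """An assisting node sums the masks it shares with every user."""
    total = [0] * length
    for seed in seeds_for_node:
        total = add_mod(total, mask_vector(seed, length))
    return total


def aggregate(masked_updates, node_sums):
    """The server adds masked updates and removes the nodes' mask sums."""
    total = [0] * len(masked_updates[0])
    for masked in masked_updates:
        total = add_mod(total, masked)
    for mask_sum in node_sums:
        total = sub_mod(total, mask_sum)
    return total


# Hypothetical setup: 2 users, 2 assisting nodes, seeds[i][j] shared
# between user i and node j.
updates = [[1, 2, 3], [10, 20, 30]]
seeds = [[100, 101], [200, 201]]
masked = [user_masked_update(updates[i], seeds[i]) for i in range(2)]
node_sums = [node_mask_sum([seeds[i][j] for i in range(2)], 3)
             for j in range(2)]
result = aggregate(masked, node_sums)
```

Each user sends a single masked vector and the server only ever sees masked updates plus aggregated masks, so in this sketch the server learns the sum without seeing any individual update.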

arXiv:2210.10244 [pdf, other]

Prove You Owned Me: One Step beyond RFID Tag/Mutual Authentication

Authors: Shaoying Cai, Yingjiu Li, Changshe Ma, Sherman S. M. Chow, Robert H. Deng

Abstract: Radio Frequency Identification (RFID) is a key technology used in many applications. In the past decades, plenty of secure and privacy-preserving RFID tag/mutual authentication protocols as well as formal frameworks for evaluating them have been proposed. However, we notice that a property, namely proof of possession (PoP), has not been rigorously studied till now, despite its significant value in many RFID applications. For example, in RFID-enabled supply chains, PoP helps prevent dishonest parties from publishing information about products/tags that they actually have never processed. We propose the first formal framework for RFID tag/mutual authentication with PoP after correcting deficiencies of some existing RFID formal frameworks. We provide a generic construction to transform an RFID tag/mutual authentication protocol to one that supports PoP using a cryptographic hash function, a pseudorandom function (PRF) and a signature scheme. We prove that the constructed protocol is secure and privacy-preserving under our framework if all the building blocks possess desired security properties. Finally, we show an RFID mutual authentication protocol with PoP. Arming tag/mutual authentication protocols with PoP is an important step to strengthen RFID-enabled systems as it bridges the security gap between physical layer and data layer, and reduces the misuses of RFID-related data.

Submitted 18 October, 2022; originally announced October 2022.

arXiv:2209.09430 [pdf, other]

doi 10.1109/TCYB.2021.3117700

Modeling sequential annotations for sequence labeling with crowds

Authors: Xiaolei Lu, Tommy W. S. Chow

Abstract: Crowd sequential annotations can be an efficient and cost-effective way to build large datasets for sequence labeling. Different from tagging independent instances, for crowd sequential annotations the quality of label sequence relies on the expertise level of annotators in capturing internal dependencies for each token in the sequence. In this paper, we propose Modeling sequential annotation for sequence labeling with crowds (SA-SLC). First, a conditional probabilistic model is developed to jointly model sequential data and annotators' expertise, in which categorical distribution is introduced to estimate the reliability of each annotator in capturing local and non-local label dependency for sequential annotation. To accelerate the marginalization of the proposed model, a valid label sequence inference (VLSE) method is proposed to derive the valid ground-truth label sequences from crowd sequential annotations. VLSE derives possible ground-truth labels from the token-wise level and further prunes sub-paths in the forward inference for label sequence decoding. VLSE reduces the number of candidate label sequences and improves the quality of possible ground-truth label sequences. The experimental results on several sequence labeling tasks of Natural Language Processing show the effectiveness of the proposed model.

Submitted 19 September, 2022; originally announced September 2022.

Journal ref: IEEE Transactions on Cybernetics, 2021, 1-11

arXiv:2209.09410 [pdf, other]

doi 10.1109/TCYB.2020.3000053

Weak Disambiguation for Partial Structured Output Learning

Authors: Xiaolei Lu, Tommy W. S. Chow

Abstract: Existing disambiguation strategies for partial structured output learning cannot generalize well to solve the problem that some candidates can be false positives or similar to the ground-truth label. In this paper, we propose a novel weak disambiguation for partial structured output learning (WD-PSL). First, a piecewise large margin formulation is generalized to partial structured output learning, which effectively avoids handling a large number of candidate structured outputs for complex structures. Second, in the proposed weak disambiguation strategy, each candidate label is assigned a confidence value indicating how likely it is the true label, which aims to reduce the negative effects of wrong ground-truth label assignment in the learning process. Then two large margins are formulated to combine two types of constraints, which are the disambiguation between candidates and non-candidates, and the weak disambiguation for candidates. In the framework of alternating optimization, a new 2n-slack variables cutting plane algorithm is developed to accelerate each iteration of optimization. The experimental results on several sequence labeling tasks of Natural Language Processing show the effectiveness of the proposed model.

Submitted 19 September, 2022; originally announced September 2022.

Journal ref: IEEE Transactions on Cybernetics ( Volume: 52, Issue: 2, February 2022)

arXiv:2209.09397 [pdf, ps, other]

doi 10.1109/TNNLS.2022.3191726

Partial sequence labeling with structured Gaussian Processes

Authors: Xiaolei Lu, Tommy W. S. Chow

Abstract: Existing partial sequence labeling models mainly focus on the max-margin framework, which fails to provide an uncertainty estimation of the prediction. Further, the unique ground truth disambiguation strategy employed by these models may include wrong label information for parameter learning. In this paper, we propose structured Gaussian Processes for partial sequence labeling (SGPPSL), which encodes uncertainty in the prediction and does not need extra effort for model selection and hyperparameter learning. The model employs factor-as-piece approximation that divides the linear-chain graph structure into the set of pieces, which preserves the basic Markov Random Field structure and effectively avoids handling a large number of candidate output sequences generated by partially annotated data. Then a confidence measure is introduced in the model to address different contributions of candidate labels, which enables the ground-truth label information to be utilized in parameter learning. Based on the derived lower bound of the variational lower bound of the proposed model, variational parameters and confidence measures are estimated in the framework of alternating optimization. Moreover, a weighted Viterbi algorithm is proposed to incorporate the confidence measure into sequence prediction, which considers label ambiguity arising from multiple annotations in the training data and thus helps improve the performance. SGPPSL is evaluated on several sequence labeling tasks and the experimental results show the effectiveness of the proposed model.

Submitted 19 September, 2022; originally announced September 2022.

Journal ref: IEEE Transactions on Neural Networks and Learning Systems, 2022, 1 - 10

arXiv:2209.09149 [pdf, ps, other]

doi 10.1109/TKDE.2019.2942295

Duration modeling with semi-Markov Conditional Random Fields for keyphrase extraction

Authors: Xiaolei Lu, Tommy W. S. Chow

Abstract: Existing methods for keyphrase extraction need preprocessing to generate candidate phrases or post-processing to transform keywords into keyphrases. In this paper, we propose a novel approach called duration modeling with semi-Markov Conditional Random Fields (DM-SMCRFs) for keyphrase extraction. First of all, based on the property of the semi-Markov chain, DM-SMCRFs can encode segment-level features and sequentially classify the phrase in the sentence as keyphrase or non-keyphrase. Second, by assuming the independence between state transition and state duration, DM-SMCRFs model the distribution of duration (length) of keyphrases to further explore state duration information, which can help identify the size of a keyphrase. Based on the convexity of the parametric duration feature derived from the duration distribution, a constrained Viterbi algorithm is derived to improve the performance of decoding in DM-SMCRFs. We thoroughly evaluate the performance of DM-SMCRFs on datasets from various domains. The experimental results demonstrate the effectiveness of the proposed model.

Submitted 19 September, 2022; originally announced September 2022.

Journal ref: IEEE Transactions on Knowledge and Data Engineering,Volume: 33, Issue: 4, 01 April 2021

arXiv:2207.05959 [pdf, other]

doi 10.1145/3543507.3583240

Fine-tuning Partition-aware Item Similarities for Efficient and Scalable Recommendation

Authors: Tianjun Wei, Jianghong Ma, Tommy W. S. Chow

Abstract: Collaborative filtering (CF) is widely studied in recommendation with various types of solutions. Recent success of Graph Convolution Networks (GCN) in CF demonstrates the effectiveness of modeling high-order relationships through graphs, while repetitive graph convolution and iterative batch optimization limit their efficiency. Instead, item similarity models attempt to construct direct relationships through efficient interaction encoding. Despite their great performance, the growing item numbers result in quadratic growth in the similarity modeling process, posing critical scalability problems. In this paper, we investigate the graph sampling strategy adopted in the latest GCN model for improving efficiency, and identify the potential item group structure in the sampled graph. Based on this, we propose a novel item similarity model which introduces graph partitioning to restrict the item similarity modeling within each partition. Specifically, we show that the spectral information of the original graph preserves global-level information well. Then, it is added to fine-tune local item similarities with a new data augmentation strategy acting as partition-aware prior knowledge, jointly coping with the information loss brought by partitioning. Experiments carried out on 4 datasets show that the proposed model outperforms state-of-the-art GCN models with 10x speed-up and item similarity models with 95% parameter storage savings.

Submitted 10 February, 2023; v1 submitted 13 July, 2022; originally announced July 2022.

Comments: Accepted by The 2023 ACM Web Conference (WWW 2023)

arXiv:2206.04855 [pdf, other]

Beyond the Gates of Euclidean Space: Temporal-Discrimination-Fusions and Attention-based Graph Neural Network for Human Activity Recognition

Authors: Nafees Ahmad, Savio Ho-Chit Chow, Ho-fung Leung

Abstract: Human activity recognition (HAR) through wearable devices has received much interest due to its numerous applications in fitness tracking, wellness screening, and supported living. As a result, we have seen a great deal of work in this field. Traditional deep learning (DL) has set state-of-the-art performance for the HAR domain. However, it ignores the data's structure and the association between consecutive time stamps. To address this constraint, we offer an approach based on Graph Neural Networks (GNNs) for structuring the input representation and exploiting the relations among the samples. However, even when using a simple graph convolution network to eliminate this shortage, there are still several limiting factors, such as inter-class activities issues, skewed class distribution, and a lack of consideration for sensor data priority, all of which harm the HAR model's performance. To improve the current HAR model's performance, we investigate novel possibilities within the framework of graph structure to achieve highly discriminated and rich activity features. We propose a model with (1) a time-series-graph module that converts raw data from the HAR dataset into graphs; (2) Graph Convolutional Neural Networks (GCNs) to discover local dependencies and correlations between neighboring nodes; and (3) a self-attention GNN encoder to identify sensor interactions and data priorities. To the best of our knowledge, this is the first work for HAR that introduces a GNN-based approach incorporating both the GCN and the attention mechanism. By employing a uniform evaluation method, our framework significantly improves the performance on a hospital patients' activities dataset compared with other state-of-the-art baseline methods.

Submitted 9 June, 2022; originally announced June 2022.

arXiv:2106.01221 [pdf, other]

Differential Privacy for Text Analytics via Natural Text Sanitization

Authors: Xiang Yue, Minxin Du, Tianhao Wang, Yaliang Li, Huan Sun, Sherman S. M. Chow

Abstract: Texts convey sophisticated knowledge. However, texts also convey sensitive information. Despite the success of general-purpose language models and domain-specific mechanisms with differential privacy (DP), existing text sanitization mechanisms still provide low utility, as cursed by the high-dimensional text representation. The companion issue of utilizing sanitized texts for downstream analytics is also under-explored. This paper takes a direct approach to text sanitization. Our insight is to consider both sensitivity and similarity via our new local DP notion. The sanitized texts also contribute to our sanitization-aware pretraining and fine-tuning, enabling privacy-preserving natural language processing over the BERT language model with promising utility. Surprisingly, the high utility does not boost up the success rate of inference attacks.

Submitted 2 June, 2021; originally announced June 2021.

Comments: ACL-IJCNLP'21 Findings; The first two authors contributed equally

arXiv:2002.10944 [pdf, other]

Optimizing Privacy-Preserving Outsourced Convolutional Neural Network Predictions

Authors: Minghui Li, Sherman S. M. Chow, Shengshan Hu, Yuejing Yan, Chao Shen, Qian Wang

Abstract: The convolutional neural network is a machine-learning model widely applied in various prediction tasks, such as computer vision and medical image analysis. Its great predictive power requires extensive computation, which encourages model owners to host the prediction service in a cloud platform. Recent research focuses on the privacy of the query and results, but existing schemes do not provide model privacy against the model-hosting server and may leak partial information about the results. Some of them further require frequent interactions with the querier or heavy computation overheads, which discourages queriers from using the prediction service. This paper proposes a new scheme for privacy-preserving neural network prediction in the outsourced setting, i.e., the server cannot learn the query, (intermediate) results, and the model. Similar to SecureML (S&P'17), a representative work that provides model privacy, we leverage two non-colluding servers with secret sharing and triplet generation to minimize the usage of heavyweight cryptography. Further, we adopt asynchronous computation to improve the throughput, and design garbled circuits for the non-polynomial activation function to keep the same accuracy as the underlying network (instead of approximating it). Our experiments on the MNIST dataset show that our scheme achieves an average of 122x, 14.63x, and 36.69x reduction in latency compared to SecureML, MiniONN (CCS'17), and EzPC (EuroS&P'19), respectively. For the communication costs, our scheme outperforms SecureML by 1.09x, MiniONN by 36.69x, and EzPC by 31.32x on average. On the CIFAR dataset, our scheme achieves a lower latency by a factor of 7.14x and 3.48x compared to MiniONN and EzPC, respectively. Our scheme also provides 13.88x and 77.46x lower communication costs than MiniONN and EzPC on the CIFAR dataset.

Submitted 29 June, 2020; v1 submitted 22 February, 2020; originally announced February 2020.
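
Illustrative note: the abstract above mentions two non-colluding servers with secret sharing and triplet generation. The following is a minimal sketch of that general idea (additive secret sharing plus one multiplication triplet), not the scheme from the paper. The modulus `P`, the trusted-dealer `beaver_triple` helper, and all function names are assumptions made for illustration only.

```python
import secrets

# Prime modulus for the share arithmetic (an assumption for this sketch).
P = 2**61 - 1


def share(x):
    """Split x into two additive shares modulo P, one per server."""
    r = secrets.randbelow(P)
    return r, (x - r) % P


def reconstruct(s0, s1):
    """Recombine two additive shares into the underlying value."""
    return (s0 + s1) % P


def beaver_triple():
    """Hypothetical trusted dealer: shares of random a, b and of c = a * b."""
    a = secrets.randbelow(P)
    b = secrets.randbelow(P)
    return share(a), share(b), share(a * b % P)


def secure_multiply(x_sh, y_sh):
    """Return shares of x * y given shares of x and y.

    Both servers are simulated in one process; index 0 and index 1 are
    the values each server would hold locally.
    """
    a_sh, b_sh, c_sh = beaver_triple()
    # Each server masks its shares; only d = x - a and e = y - b are opened.
    d = reconstruct((x_sh[0] - a_sh[0]) % P, (x_sh[1] - a_sh[1]) % P)
    e = reconstruct((y_sh[0] - b_sh[0]) % P, (y_sh[1] - b_sh[1]) % P)
    # x * y = c + d*b + e*a + d*e; the public d*e term is added by server 0 only.
    z0 = (c_sh[0] + d * b_sh[0] + e * a_sh[0] + d * e) % P
    z1 = (c_sh[1] + d * b_sh[1] + e * a_sh[1]) % P
    return z0, z1
```

For example, `reconstruct(*secure_multiply(share(6), share(7)))` recovers the product while neither simulated server ever holds 6 or 7 in the clear. A real deployment would also need the triplets to be generated without a trusted dealer, which this sketch does not attempt.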

arXiv:2001.09961 [pdf, ps, other]

Efficient Algorithms towards Network Intervention

Authors: Hui-Ju Hung, Wang-Chien Lee, De-Nian Yang, Chih-Ya Shen, Zhen Lei, Sy-Miin Chow

Abstract: Research suggests that social relationships have substantial impacts on individuals' health outcomes. Network intervention, through careful planning, can assist a network of users to build healthy relationships. However, most previous work is not designed to assist such planning by carefully examining and improving multiple network characteristics. In this paper, we propose and evaluate algorithms that facilitate network intervention planning through simultaneous optimization of network degree, closeness, betweenness, and local clustering coefficient, under scenarios involving Network Intervention with Limited Degradation - for Single target (NILD-S) and Network Intervention with Limited Degradation - for Multiple targets (NILD-M). We prove that NILD-S and NILD-M are NP-hard and cannot be approximated within any ratio in polynomial time unless P=NP. We propose the Candidate Re-selection with Preserved Dependency (CRPD) algorithm for NILD-S, and the Objective-aware Intervention edge Selection and Adjustment (OISA) algorithm for NILD-M. Various pruning strategies are designed to boost the efficiency of the proposed algorithms. Extensive experiments on various real social networks collected from public schools and the Web, together with an empirical study, show that CRPD and OISA outperform the baselines in both efficiency and effectiveness.

Submitted 27 January, 2020; originally announced January 2020.

arXiv:1906.09445 [pdf, other]

doi 10.1016/j.ins.2019.05.020

Learning with fuzzy hypergraphs: a topical approach to query-oriented text summarization

Authors: Hadrien Van Lierde, Tommy W. S. Chow

Abstract: Existing graph-based methods for extractive document summarization represent sentences of a corpus as the nodes of a graph or a hypergraph in which edges depict relationships of lexical similarity between sentences. Such approaches fail to capture semantic similarities between sentences when they express similar information but have few words in common and are thus lexically dissimilar. To overcome this issue, we propose to extract semantic similarities based on topical representations of sentences. Inspired by the Hierarchical Dirichlet Process, we propose a probabilistic topic model in order to infer topic distributions of sentences. As each topic defines a semantic connection among a group of sentences with a certain degree of membership for each sentence, we propose a fuzzy hypergraph model in which nodes are sentences and fuzzy hyperedges are topics. To produce an informative summary, we extract a set of sentences from the corpus by simultaneously maximizing their relevance to a user-defined query, their centrality in the fuzzy hypergraph and their coverage of topics present in the corpus. We formulate a polynomial time algorithm building on the theory of submodular functions to solve the associated optimization problem. A thorough comparative analysis with other graph-based summarization systems is included in the paper. Our results show the superiority of our method in terms of content coverage of the summaries.

Submitted 22 June, 2019; originally announced June 2019.

Comments: 8 figures

Journal ref: Information Sciences, 496 (2019), 212-224

arXiv:1902.00672 [pdf, other]

doi 10.1016/j.ipm.2019.03.003

Query-oriented text summarization based on hypergraph transversals

Authors: Hadrien Van Lierde, Tommy W. S. Chow

Abstract: Existing graph- and hypergraph-based algorithms for document summarization represent the sentences of a corpus as the nodes of a graph or a hypergraph in which the edges represent relationships of lexical similarities between sentences. Each sentence of the corpus is then scored individually, using popular node ranking algorithms, and a summary is produced by extracting highly scored sentences. This approach fails to select a subset of jointly relevant sentences and it may produce redundant summaries that are missing important topics of the corpus. To alleviate this issue, a new hypergraph-based summarizer is proposed in this paper, in which each node is a sentence and each hyperedge is a theme, namely a group of sentences sharing a topic. Themes are weighted in terms of their prominence in the corpus and their relevance to a user-defined query. It is further shown that the problem of identifying a subset of sentences covering the relevant themes of the corpus is equivalent to that of finding a hypergraph transversal in our theme-based hypergraph. Two extensions of the notion of hypergraph transversal are proposed for the purpose of summarization, and polynomial time algorithms building on the theory of submodular functions are proposed for solving the associated discrete optimization problems. The worst-case time complexity of the proposed algorithms is quadratic in the number of terms, which makes them cheaper than the existing hypergraph-based methods. A thorough comparative analysis with related models on DUC benchmark datasets demonstrates the effectiveness of our approach, which outperforms existing graph- or hypergraph-based methods by at least 6% of ROUGE-SU4 score.

Submitted 2 February, 2019; originally announced February 2019.

Comments: This is the unrefereed Author's Original Version (or pre-print Version) of the article

Journal ref: Information Processing & Management, Volume 56, Issue 4, July 2019, Pages 1317-1338
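
Illustrative note: the abstract above describes selecting sentences that cover weighted themes (hyperedges) using algorithms built on submodular functions. The sketch below is a generic greedy weighted-coverage heuristic of that flavor, not the algorithm from the paper. The function name `greedy_theme_cover`, the `budget` parameter (a cap on the number of sentences), and the toy data are all assumptions for illustration.

```python
def greedy_theme_cover(themes, weights, budget):
    """Greedily pick up to `budget` sentences covering the most theme weight.

    themes:  dict mapping a theme id to the set of sentence ids it contains
    weights: dict mapping a theme id to a non-negative weight
    Returns the chosen sentence ids in the order they were selected.
    """
    chosen = []
    covered = set()  # themes hit by at least one chosen sentence
    candidates = sorted(set().union(*themes.values()))
    while len(chosen) < budget:
        best, best_gain = None, 0.0
        for s in candidates:
            if s in chosen:
                continue
            # Marginal gain: total weight of still-uncovered themes containing s.
            gain = sum(
                weights[t]
                for t, members in themes.items()
                if t not in covered and s in members
            )
            if gain > best_gain:
                best, best_gain = s, gain
        if best is None:  # no remaining sentence adds coverage
            break
        chosen.append(best)
        covered |= {t for t, members in themes.items() if best in members}
    return chosen


# Toy example: sentence 1 belongs to themes A and B, so it is picked first.
toy_themes = {"A": {0, 1}, "B": {1, 2}, "C": {3}}
toy_weights = {"A": 1.0, "B": 1.0, "C": 0.5}
selection = greedy_theme_cover(toy_themes, toy_weights, budget=2)
```

Because weighted coverage is monotone and submodular, this kind of greedy rule is a common baseline for such problems; the paper's own transversal formulations and guarantees may differ.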

arXiv:1805.00862 [pdf, other]

doi 10.1093/comnet/cny011

Spectral clustering algorithms for the detection of clusters in block-cyclic and block-acyclic graphs

Authors: H. Van Lierde, T. W. S. Chow, J. -C. Delvenne

Abstract: We propose two spectral algorithms for partitioning nodes in directed graphs respectively with a cyclic and an acyclic pattern of connection between groups of nodes. Our methods are based on the computation of extremal eigenvalues of the transition matrix associated with the directed graph. The two algorithms outperform state-of-the-art methods for directed graph clustering on synthetic datasets, including methods based on blockmodels, bibliometric symmetrization and random walks. Our algorithms have the same space complexity as classical spectral clustering algorithms for undirected graphs and their time complexity is also linear in the number of edges in the graph. One of our methods is applied to a trophic network based on predator-prey relationships. It successfully extracts common categories of prey and predators encountered in food chains. The same method is also applied to highlight the hierarchical structure of a worldwide network of Autonomous Systems depicting business agreements between Internet Service Providers.

Submitted 2 May, 2018; originally announced May 2018.

Comments: This is the unrefereed Author's Original Version of the article. A peer-reviewed version has been accepted for publication in the Journal of Complex Networks published by Oxford University Press. The present version is not the Accepted Manuscript

Journal ref: Journal of Complex Networks, cny011 (2018)

arXiv:1802.10558 [pdf, ps, other]

doi 10.1109/TNNLS.2019.2909686

Exactly Robust Kernel Principal Component Analysis

Authors: Jicong Fan, Tommy W. S. Chow

Abstract: Robust principal component analysis (RPCA) can recover low-rank matrices when they are corrupted by sparse noise. In practice, however, many matrices are of high rank and hence cannot be recovered by RPCA. We propose a novel method called robust kernel principal component analysis (RKPCA) to decompose a partially corrupted matrix as a sparse matrix plus a high or full-rank matrix with low latent dimensionality. RKPCA can be applied to many problems such as noise removal and subspace clustering and is still the only unsupervised nonlinear method robust to sparse noise. Our theoretical analysis shows that, with high probability, RKPCA can provide high recovery accuracy. The optimization of RKPCA involves nonconvex and nondifferentiable problems. We propose two nonconvex optimization algorithms for RKPCA: the alternating direction method of multipliers with backtracking line search, and proximal linearized minimization with adaptive step size. Comparative studies in noise removal and robust subspace clustering corroborate the effectiveness and superiority of RKPCA.

Submitted 17 April, 2019; v1 submitted 28 February, 2018; originally announced February 2018.

Comments: The paper was accepted by IEEE Transactions on Neural Networks and Learning Systems

arXiv:1512.05417 [pdf, other]

Influence Prediction for Continuous-Time Information Propagation on Networks

Authors: Shui-Nee Chow, Xiaojing Ye, Hongyuan Zha, Haomin Zhou

Abstract: We consider the problem of predicting the time evolution of influence, the expected number of activated nodes, given a set of initially active nodes on a propagation network. To address the significant computational challenges of this problem on large-scale heterogeneous networks, we establish a system of differential equations governing the dynamics of probability mass functions on the state graph, in which each node lumps together a number of activation states of the network; this system can be considered an analogue of the Fokker-Planck equation in continuous space. We provide several methods to estimate the system parameters, which depend on the identities of the initially active nodes, the network topology, and the activation rates. The influence is then estimated by the solution of such a system of differential equations. This approach gives rise to a class of novel and scalable algorithms that work effectively for large-scale and dense networks. Numerical results are provided to show the very promising performance of this approach in terms of prediction accuracy and computational efficiency.

Submitted 7 January, 2017; v1 submitted 16 December, 2015; originally announced December 2015.

Comments: 14 pages, 17 figures

MSC Class: 65C40; 65Y10; 68U35

arXiv:1411.6400

Mutual Information-Based Unsupervised Feature Transformation for Heterogeneous Feature Subset Selection

Authors: Min Wei, Tommy W. S. Chow, Rosa H. M. Chan

Abstract: Conventional mutual information (MI) based feature selection (FS) methods are unable to handle heterogeneous feature subset selection properly because of data format differences or estimation methods of MI between feature subset and class label. A way to solve this problem is feature transformation (FT). In this study, a novel unsupervised feature transformation (UFT) which can transform non-numerical features into numerical features is developed and tested. The UFT process is MI-based and independent of class label. MI-based FS algorithms, such as Parzen window feature selector (PWFS), minimum redundancy maximum relevance feature selection (mRMR), and normalized MI feature selection (NMIFS), can all adopt UFT for pre-processing of non-numerical features. Unlike traditional FT methods, the proposed UFT is unbiased while PWFS is utilized to its full advantage. Simulations and analyses of large-scale datasets showed that the feature subset selected by the integrated method, UFT-PWFS, outperformed other FT-FS integrated methods in classification accuracy.

Submitted 29 March, 2015; v1 submitted 24 November, 2014; originally announced November 2014.

Comments: This paper has been withdrawn by the author because the number of datasets and classifiers is not sufficient to support the claim. More simulation work is needed.

arXiv:1405.4951 [pdf, ps, other]

Secure Friend Discovery via Privacy-Preserving and Decentralized Community Detection

Authors: Pili Hu, Sherman S. M. Chow, Wing Cheong Lau

Abstract: The problem of secure friend discovery on a social network has long been proposed and studied. The requirement is that a pair of nodes can make befriending decisions with minimum information exposed to the other party. In this paper, we propose to use community detection to tackle the problem of secure friend discovery. We formulate the first privacy-preserving and decentralized community detection problem as a multi-objective optimization. We design the first protocol to solve this problem, which transforms community detection to a series of Private Set Intersection (PSI) instances using Truncated Random Walk (TRW). Preliminary theoretical results show that our protocol can uncover communities with overwhelming probability and preserve privacy. We also discuss future work, potential extensions and variations.

Submitted 20 May, 2014; originally announced May 2014.

Comments: ICML 2014 Workshop on Learning, Security and Privacy

Showing 1–24 of 24 results for author: Chow, S