Showing 1–50 of 60 results for author: Phung, D

Searching in archive stat.
  1. arXiv:2502.10716  [pdf, other]

    cs.LG stat.ML

    Why Domain Generalization Fail? A View of Necessity and Sufficiency

    Authors: Long-Tung Vuong, Vy Vo, Hien Dang, Van-Anh Nguyen, Thanh-Toan Do, Mehrtash Harandi, Trung Le, Dinh Phung

    Abstract: Despite a strong theoretical foundation, empirical experiments reveal that existing domain generalization (DG) algorithms often fail to consistently outperform the ERM baseline. We argue that this issue arises because most DG studies focus on establishing theoretical guarantees for generalization under unrealistic assumptions, such as the availability of sufficient, diverse (or even infinite) doma…

    Submitted 15 February, 2025; originally announced February 2025.

  2. arXiv:2412.15645  [pdf]

    stat.AP

    A District-level Ensemble Model to Enhance Dengue Prediction and Control for the Mekong Delta Region of Vietnam

    Authors: Wala Draidi Areed, Thi Thanh Thao Nguyen, Kien Quoc Do, Thinh Nguyen, Vinh Bui, Elisabeth Nelson, Joshua L. Warren, Quang-Van Doan, Nam Vu Sinh, Nicholas Osborne, Russell Richards, Nu Quy Linh Tran, Hong Le, Tuan Pham, Trinh Manh Hung, Son Nghiem, Hai Phung, Cordia Chu, Robert Dubrow, Daniel M. Weinberger, Dung Phung

    Abstract: The Mekong Delta Region of Vietnam faces increasing dengue risks driven by urbanization, globalization, and climate change. This study introduces a probabilistic forecasting model for predicting dengue incidence and outbreaks with one to three month lead times, integrating meteorological, sociodemographic, preventive, and epidemiological data. Seventy-two models were evaluated, and an ensemble com…

    Submitted 20 December, 2024; originally announced December 2024.

    Comments: 34 pages, 6 figures

  3. arXiv:2410.04196  [pdf, ps, other]

    cs.LG stat.ML

    Improving Generalization with Flat Hilbert Bayesian Inference

    Authors: Tuan Truong, Quyen Tran, Quan Pham-Ngoc, Nhat Ho, Dinh Phung, Trung Le

    Abstract: We introduce Flat Hilbert Bayesian Inference (FHBI), an algorithm designed to enhance generalization in Bayesian inference. Our approach involves an iterative two-step procedure with an adversarial functional perturbation step and a functional descent step within a reproducing kernel Hilbert space. This methodology is supported by a theoretical analysis that extends previous findings on generaliza…

    Submitted 8 June, 2025; v1 submitted 5 October, 2024; originally announced October 2024.

    Comments: Accepted (ICML 2025)

  4. arXiv:2403.13204  [pdf, other]

    cs.LG cs.CV stat.ML

    Diversity-Aware Agnostic Ensemble of Sharpness Minimizers

    Authors: Anh Bui, Vy Vo, Tung Pham, Dinh Phung, Trung Le

    Abstract: There has long been plenty of theoretical and empirical evidence supporting the success of ensemble learning. Deep ensembles in particular take advantage of training randomness and expressivity of individual neural networks to gain prediction diversity, ultimately leading to better generalization, robustness and uncertainty estimation. In respect of generalization, it is found that pursuing wider…

    Submitted 19 March, 2024; originally announced March 2024.

  5. arXiv:2402.17834  [pdf, other]

    cs.CL stat.ML

    Stable LM 2 1.6B Technical Report

    Authors: Marco Bellagente, Jonathan Tow, Dakota Mahan, Duy Phung, Maksym Zhuravinskyi, Reshinth Adithyan, James Baicoianu, Ben Brooks, Nathan Cooper, Ashish Datta, Meng Lee, Emad Mostaque, Michael Pieler, Nikhil Pinnaparju, Paulo Rocha, Harry Saini, Hannah Teufel, Niccolo Zanichelli, Carlos Riquelme

    Abstract: We introduce StableLM 2 1.6B, the first in a new generation of our language model series. In this technical report, we present in detail the data and training procedure leading to the base and instruction-tuned versions of StableLM 2 1.6B. The weights for both models are available via Hugging Face for anyone to download and use. The report contains thorough evaluations of these models, including z…

    Submitted 27 February, 2024; originally announced February 2024.

    Comments: 23 pages, 6 figures

  6. arXiv:2206.01934  [pdf, other]

    cs.LG cs.AI stat.ML

    Stochastic Multiple Target Sampling Gradient Descent

    Authors: Hoang Phan, Ngoc Tran, Trung Le, Toan Tran, Nhat Ho, Dinh Phung

    Abstract: Sampling from an unnormalized target distribution is an essential problem with many applications in probabilistic inference. Stein Variational Gradient Descent (SVGD) has been shown to be a powerful method that iteratively updates a set of particles to approximate the distribution of interest. Furthermore, when analysing its asymptotic properties, SVGD reduces exactly to a single-objective optimiz…

    Submitted 10 February, 2023; v1 submitted 4 June, 2022; originally announced June 2022.

    Comments: Accepted to Advances in Neural Information Processing Systems (NeurIPS) 2022. 27 pages, 10 figures, 5 tables

  7. arXiv:2202.10723  [pdf, other]

    cs.LG cs.AI stat.ML

    Sobolev Transport: A Scalable Metric for Probability Measures with Graph Metrics

    Authors: Tam Le, Truyen Nguyen, Dinh Phung, Viet Anh Nguyen

    Abstract: Optimal transport (OT) is a popular measure to compare probability distributions. However, OT suffers a few drawbacks such as (i) a high complexity for computation, (ii) indefiniteness which limits its applicability to kernel machines. In this work, we consider probability measures supported on a graph metric space and propose a novel Sobolev transport metric. We show that the Sobolev transport me…

    Submitted 22 February, 2022; originally announced February 2022.

    Comments: AISTATS 2022

  8. arXiv:2111.13822  [pdf, other]

    cs.LG cs.AI stat.ML

    On Learning Domain-Invariant Representations for Transfer Learning with Multiple Sources

    Authors: Trung Phung, Trung Le, Long Vuong, Toan Tran, Anh Tran, Hung Bui, Dinh Phung

    Abstract: Domain adaptation (DA) benefits from the rigorous theoretical works that study its insightful characteristics and various aspects, e.g., learning domain-invariant representations and its trade-off. However, it seems not the case for the multiple source DA and domain generalization (DG) settings which are remarkably more complicated and sophisticated due to the involvement of multiple source domain…

    Submitted 27 November, 2021; originally announced November 2021.

    Comments: NeurIPS 2021

    Journal ref: Proceedings of Advances in Neural Information Processing Systems (2021) 27720-27733

  9. arXiv:2110.15520  [pdf, other]

    cs.LG stat.ME stat.ML

    On Label Shift in Domain Adaptation via Wasserstein Distance

    Authors: Trung Le, Dat Do, Tuan Nguyen, Huy Nguyen, Hung Bui, Nhat Ho, Dinh Phung

    Abstract: We study the label shift problem between the source and target domains in general domain adaptation (DA) settings. We consider transformations transporting the target to source domains, which enable us to align the source and target examples. Through those transformations, we define the label shift between two domains via optimal transport and develop theory to investigate the properties of DA und…

    Submitted 1 March, 2022; v1 submitted 28 October, 2021; originally announced October 2021.

    Comments: 35 pages, 7 figures, 6 tables

  10. arXiv:2102.05912  [pdf, other]

    stat.ML cs.LG

    On Transportation of Mini-batches: A Hierarchical Approach

    Authors: Khai Nguyen, Dang Nguyen, Quoc Nguyen, Tung Pham, Hung Bui, Dinh Phung, Trung Le, Nhat Ho

    Abstract: Mini-batch optimal transport (m-OT) has been successfully used in practical applications that involve probability measures with a very high number of supports. The m-OT solves several smaller optimal transport problems and then returns the average of their costs and transportation plans. Despite its scalability advantage, the m-OT does not consider the relationship between mini-batches which leads…

    Submitted 6 June, 2022; v1 submitted 11 February, 2021; originally announced February 2021.

    Comments: Accepted to ICML 2022, 34 pages, 16 figures, 9 tables

  11. arXiv:2008.13537  [pdf, other]

    cs.IR cs.CL cs.LG stat.ML

    Neural Topic Model via Optimal Transport

    Authors: He Zhao, Dinh Phung, Viet Huynh, Trung Le, Wray Buntine

    Abstract: Recently, Neural Topic Models (NTMs) inspired by variational autoencoders have obtained increasing research interest due to their promising results on text analysis. However, it is usually hard for existing NTMs to achieve good document representation and coherent/diverse topics at the same time. Moreover, they often degrade their performance severely on short documents. The requirement of repar…

    Submitted 31 May, 2022; v1 submitted 12 August, 2020; originally announced August 2020.

    Comments: Published in ICLR 2021, link: https://openreview.net/forum?id=Oos98K9Lv-k, code: https://github.com/ethanhezhao/NeuralSinkhornTopicModel

  12. arXiv:2008.05089  [pdf, other]

    cs.LG stat.ML

    Quaternion Graph Neural Networks

    Authors: Dai Quoc Nguyen, Tu Dinh Nguyen, Dinh Phung

    Abstract: Recently, graph neural networks (GNNs) have become an important and active research direction in deep learning. It is worth noting that most of the existing GNN-based methods learn graph representations within the Euclidean vector space. Beyond the Euclidean space, learning representation and embeddings in hyper-complex space have also been shown to be a promising and effective approach. To this end, w…

    Submitted 6 October, 2021; v1 submitted 11 August, 2020; originally announced August 2020.

    Comments: Camera-ready for ACML 2021. Additional implementations for Gated QGNNs, Dual QGNNs, Simplifying QGNNs

  13. arXiv:2007.05123  [pdf, other]

    cs.LG cs.CV cs.NE stat.ML

    Improving Adversarial Robustness by Enforcing Local and Global Compactness

    Authors: Anh Bui, Trung Le, He Zhao, Paul Montague, Olivier deVel, Tamas Abraham, Dinh Phung

    Abstract: The fact that deep neural networks are susceptible to crafted perturbations severely impacts the use of deep learning in certain domains of application. Among many developed defense models against such attacks, adversarial training emerges as the most successful method that consistently resists a wide range of attacks. In this work, based on an observation from a previous study that the representa…

    Submitted 9 July, 2020; originally announced July 2020.

    Comments: Proceedings of the European Conference on Computer Vision (ECCV) 2020

  14. arXiv:2006.12100  [pdf, other]

    cs.LG cs.CL cs.SI stat.ML

    A Self-Attention Network based Node Embedding Model

    Authors: Dai Quoc Nguyen, Tu Dinh Nguyen, Dinh Phung

    Abstract: Although several signs of progress have been made recently, limited research has been conducted for an inductive setting where embeddings are required for newly unseen nodes -- a setting encountered commonly in practical applications of deep learning for graph networks. This significantly affects the performances of downstream tasks such as node classification, link prediction or community extracti…

    Submitted 22 June, 2020; originally announced June 2020.

    Comments: Accepted version, ECML-PKDD 2020

  15. arXiv:2004.07534  [pdf, other]

    cs.LG stat.ML

    OptiGAN: Generative Adversarial Networks for Goal Optimized Sequence Generation

    Authors: Mahmoud Hossam, Trung Le, Viet Huynh, Michael Papasimeon, Dinh Phung

    Abstract: One of the challenging problems in sequence generation tasks is the optimized generation of sequences with specific desired goals. Current sequential generative models mainly generate sequences to closely mimic the training data, without direct optimization of desired goals or properties specific to the task. We introduce OptiGAN, a generative model that incorporates both Generative Adversarial Ne…

    Submitted 14 January, 2021; v1 submitted 16 April, 2020; originally announced April 2020.

    Comments: Preprint for accepted conference paper at International Joint Conference on Neural Networks (IJCNN) 2020

    MSC Class: 68T01; 68T50 ACM Class: I.2.0; I.2.7; I.5.0

  16. arXiv:1911.04822  [pdf, other]

    cs.LG cs.CL stat.ML

    A Capsule Network-based Model for Learning Node Embeddings

    Authors: Dai Quoc Nguyen, Tu Dinh Nguyen, Dat Quoc Nguyen, Dinh Phung

    Abstract: In this paper, we focus on learning low-dimensional embeddings for nodes in graph-structured data. To achieve this, we propose Caps2NE -- a new unsupervised embedding model leveraging a network of two capsule layers. Caps2NE induces a routing process to aggregate feature vectors of context neighbors of a given target node at the first capsule layer, then feed these features into the second capsule…

    Submitted 18 August, 2020; v1 submitted 12 November, 2019; originally announced November 2019.

    Comments: Extended version of our CIKM 2020 paper, including inductive results

  17. arXiv:1910.04483  [pdf, other]

    stat.ML cs.LG

    Tree-Wasserstein Barycenter for Large-Scale Multilevel Clustering and Scalable Bayes

    Authors: Tam Le, Viet Huynh, Nhat Ho, Dinh Phung, Makoto Yamada

    Abstract: We study in this paper a variant of Wasserstein barycenter problem, which we refer to as tree-Wasserstein barycenter, by leveraging a specific class of ground metrics, namely tree metrics, for Wasserstein distance. Drawing on the tree structure, we propose an efficient algorithmic approach to solve the tree-Wasserstein barycenter and its variants. The proposed approach is not only fast for computa…

    Submitted 26 February, 2020; v1 submitted 10 October, 2019; originally announced October 2019.

  18. arXiv:1910.01329  [pdf, other]

    cs.LG cs.CR stat.ML

    Perturbations are not Enough: Generating Adversarial Examples with Spatial Distortions

    Authors: He Zhao, Trung Le, Paul Montague, Olivier De Vel, Tamas Abraham, Dinh Phung

    Abstract: Deep neural network image classifiers are reported to be susceptible to adversarial evasion attacks, which use carefully crafted images created to mislead a classifier. Recently, various kinds of adversarial attack methods have been proposed, most of which focus on adding small perturbations to input images. Despite the success of existing approaches, the way to generate realistic adversarial imag…

    Submitted 3 October, 2019; originally announced October 2019.

  19. arXiv:1909.11855  [pdf, other]

    cs.LG cs.CV stat.ML

    Universal Graph Transformer Self-Attention Networks

    Authors: Dai Quoc Nguyen, Tu Dinh Nguyen, Dinh Phung

    Abstract: We introduce a transformer-based GNN model, named UGformer, to learn graph representations. In particular, we present two UGformer variants, wherein the first variant (publicized in September 2019) is to leverage the transformer on a set of sampled neighbors for each input node, while the second (publicized in May 2021) is to leverage the transformer on all input nodes. Experimental results demons…

    Submitted 8 March, 2022; v1 submitted 25 September, 2019; originally announced September 2019.

    Comments: Accepted to The ACM Web Conference 2022 (WWW '22) (Poster and Demo Track)

  20. arXiv:1909.08787  [pdf, other]

    stat.ML cs.LG

    On Efficient Multilevel Clustering via Wasserstein Distances

    Authors: Viet Huynh, Nhat Ho, Nhan Dam, XuanLong Nguyen, Mikhail Yurochkin, Hung Bui, Dinh Phung

    Abstract: We propose a novel approach to the problem of multilevel clustering, which aims to simultaneously partition data in each group and discover grouping patterns among groups in a potentially large hierarchically structured corpus of data. Our method involves a joint optimization formulation over several spaces of discrete probability measures, which are endowed with Wasserstein distance metrics. We p…

    Submitted 24 May, 2021; v1 submitted 18 September, 2019; originally announced September 2019.

    Comments: 32 pages, 8 figures, JMLR submission. arXiv admin note: substantial text overlap with arXiv:1706.03883

  21. arXiv:1901.08710  [pdf, ps, other]

    cs.LG stat.ML

    When Can Neural Networks Learn Connected Decision Regions?

    Authors: Trung Le, Dinh Phung

    Abstract: Previous work has questioned the conditions under which the decision regions of a neural network are connected and further showed the implications of the corresponding theory to the problem of adversarial manipulation of classifiers. It has been proven that for a class of activation functions including leaky ReLU, neural networks having a pyramidal structure, that is no layer has more hidden units…

    Submitted 24 January, 2019; originally announced January 2019.

  22. arXiv:1811.06199  [pdf, other]

    cs.LG cs.AI stat.ML

    On Deep Domain Adaptation: Some Theoretical Understandings

    Authors: Trung Le, Khanh Nguyen, Nhat Ho, Hung Bui, Dinh Phung

    Abstract: Compared with shallow domain adaptation, recent progress in deep domain adaptation has shown that it can achieve higher predictive performance and stronger capacity to tackle structural data (e.g., image and sequential data). The underlying idea of deep domain adaptation is to bridge the gap between source and target domains in a joint space so that a supervised classifier trained on labeled sourc…

    Submitted 19 June, 2019; v1 submitted 15 November, 2018; originally announced November 2018.

  23. arXiv:1810.11911  [pdf, ps, other]

    cs.LG stat.ML

    Probabilistic Multilevel Clustering via Composite Transportation Distance

    Authors: Nhat Ho, Viet Huynh, Dinh Phung, Michael I. Jordan

    Abstract: We propose a novel probabilistic approach to multilevel clustering problems based on composite transportation distance, which is a variant of transportation distance where the underlying metric is Kullback-Leibler divergence. Our method involves solving a joint optimization problem over spaces of probability measures to simultaneously discover grouping structures within groups and among groups. By…

    Submitted 28 October, 2018; originally announced October 2018.

    Comments: 25 pages, 3 figures

  24. arXiv:1711.01744  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    KGAN: How to Break The Minimax Game in GAN

    Authors: Trung Le, Tu Dinh Nguyen, Dinh Phung

    Abstract: Generative Adversarial Networks (GANs) were intuitively and attractively explained under the perspective of game theory, wherein two involving parties are a discriminator and a generator. In this game, the task of the discriminator is to discriminate the real and generated (i.e., fake) data, whilst the task of the generator is to generate the fake data that maximally confuses the discriminator. In…

    Submitted 6 November, 2017; originally announced November 2017.

  25. arXiv:1709.06390  [pdf, ps, other]

    cs.LG stat.ML

    Analogical-based Bayesian Optimization

    Authors: Trung Le, Khanh Nguyen, Tu Dinh Nguyen, Dinh Phung

    Abstract: Some real-world problems revolve around solving the optimization problem $\max_{x\in\mathcal{X}} f(x)$, where $f(\cdot)$ is a black-box function and $\mathcal{X}$ might be a set of non-vectorial objects (e.g., distributions) on which we can only define a symmetric and non-negative similarity score. This setting requires a novel view of the standard framework of Bayesian Optimization that generaliz…

    Submitted 19 September, 2017; originally announced September 2017.

  26. arXiv:1709.03831  [pdf, other]

    cs.LG stat.ML

    Dual Discriminator Generative Adversarial Nets

    Authors: Tu Dinh Nguyen, Trung Le, Hung Vu, Dinh Phung

    Abstract: We propose in this paper a novel approach to tackle the problem of mode collapse encountered in generative adversarial network (GAN). Our idea is intuitive but proven to be very effective, especially in addressing some key limitations of GAN. In essence, it combines the Kullback-Leibler (KL) and reverse KL divergences into a unified objective function, thus it exploits the complementary statistica…

    Submitted 12 September, 2017; originally announced September 2017.

  27. arXiv:1708.05594  [pdf, other]

    cs.LG stat.ML

    Statistical Latent Space Approach for Mixed Data Modelling and Applications

    Authors: Tu Dinh Nguyen, Truyen Tran, Dinh Phung, Svetha Venkatesh

    Abstract: The analysis of mixed data has been raising challenges in statistics and machine learning. One of the two most prominent challenges is to develop new statistical techniques and methodologies to effectively handle mixed data by making the data less heterogeneous with minimum loss of information. The other challenge is that such methods must be able to apply in large-scale tasks when dealing with huge a…

    Submitted 18 August, 2017; originally announced August 2017.

  28. arXiv:1708.04733  [pdf, other]

    cs.LG cs.AI stat.ML

    Geometric Enclosing Networks

    Authors: Trung Le, Hung Vu, Tu Dinh Nguyen, Dinh Phung

    Abstract: Training models to generate data has increasingly attracted research attention and become important in modern world applications. We propose in this paper a new geometry-based optimization approach to address this problem. Orthogonal to current state-of-the-art density-based approaches, most notably VAE and GAN, we present a fresh new idea that borrows the principle of minimal enclosing ball to tra…

    Submitted 17 August, 2017; v1 submitted 15 August, 2017; originally announced August 2017.

  29. arXiv:1708.02556  [pdf, other]

    cs.LG cs.AI stat.ML

    Multi-Generator Generative Adversarial Nets

    Authors: Quan Hoang, Tu Dinh Nguyen, Trung Le, Dinh Phung

    Abstract: We propose a new approach to train the Generative Adversarial Nets (GANs) with a mixture of generators to overcome the mode collapsing problem. The main intuition is to employ multiple generators, instead of using a single one as in the original GAN. The idea is simple, yet proven to be extremely effective at covering diverse data modes, easily overcoming the mode collapse and delivering state-of-…

    Submitted 27 October, 2017; v1 submitted 8 August, 2017; originally announced August 2017.

  30. arXiv:1706.03883  [pdf, other]

    stat.ML stat.CO stat.ME

    Multilevel Clustering via Wasserstein Means

    Authors: Nhat Ho, XuanLong Nguyen, Mikhail Yurochkin, Hung Hai Bui, Viet Huynh, Dinh Phung

    Abstract: We propose a novel approach to the problem of multilevel clustering, which aims to simultaneously partition data in each group and discover grouping patterns among groups in a potentially large hierarchically structured corpus of data. Our method involves a joint optimization formulation over several spaces of discrete probability measures, which are endowed with Wasserstein distance metrics. We p…

    Submitted 12 June, 2017; originally announced June 2017.

    Comments: Proceedings of the ICML, 2017

  31. arXiv:1703.04832  [pdf, ps, other]

    stat.ML

    A Random Finite Set Model for Data Clustering

    Authors: Dinh Phung, Ba-Ngu Vo

    Abstract: The goal of data clustering is to partition data points into groups to minimize a given objective function. While most existing clustering algorithms treat each data point as a vector, in many applications each datum is not a vector but a point pattern or a set of points. Moreover, many existing clustering methods require the user to specify the number of clusters, which is not available in advance.…

    Submitted 14 March, 2017; originally announced March 2017.

    Comments: In Proceedings of International Conference on Fusion (FUSION), Salamanca, Spain, July 2014

  32. arXiv:1703.02155  [pdf, other]

    stat.ML cs.LG

    Model-Based Multiple Instance Learning

    Authors: Ba-Ngu Vo, Dinh Phung, Quang N. Tran, Ba-Tuong Vo

    Abstract: While Multiple Instance (MI) data are point patterns -- sets or multi-sets of unordered points -- appropriate statistical point pattern models have not been used in MI learning. This article proposes a framework for model-based MI learning using point process theory. Likelihood functions for point pattern data derived from point process theory enable principled yet conceptually transparent extensi…

    Submitted 13 August, 2017; v1 submitted 6 March, 2017; originally announced March 2017.

    Comments: 16 pages, 15 figures

  33. arXiv:1702.02262  [pdf, other]

    cs.LG stat.ML

    Clustering For Point Pattern Data

    Authors: Quang N. Tran, Ba-Ngu Vo, Dinh Phung, Ba-Tuong Vo

    Abstract: Clustering is one of the most common unsupervised learning tasks in machine learning and data mining. Clustering algorithms have been used in a plethora of applications across several scientific fields. However, there has been limited research in the clustering of point patterns - sets or multi-sets of unordered elements - that are found in numerous applications and data sources. In this paper, we…

    Submitted 7 February, 2017; originally announced February 2017.

    Comments: Preprint: 23rd Int. Conf. Pattern Recognition (ICPR). Cancun, Mexico, December 2016

  34. arXiv:1701.08473  [pdf, other]

    cs.LG stat.ML

    Model-based Classification and Novelty Detection For Point Pattern Data

    Authors: Ba-Ngu Vo, Quang N. Tran, Dinh Phung, Ba-Tuong Vo

    Abstract: Point patterns are sets or multi-sets of unordered elements that can be found in numerous data sources. However, in data analysis tasks such as classification and novelty detection, appropriate statistical models for point pattern data have not received much attention. This paper proposes the modelling of point pattern data via random finite sets (RFS). In particular, we propose appropriate likeli…

    Submitted 7 February, 2017; v1 submitted 29 January, 2017; originally announced January 2017.

    Comments: Preprint: 23rd Int. Conf. Pattern Recognition (ICPR). Cancun, Mexico, December 2016

  35. arXiv:1609.08752  [pdf, other]

    stat.ML

    Stabilizing Linear Prediction Models using Autoencoder

    Authors: Shivapratap Gopakumar, Truyen Tran, Dinh Phung, Svetha Venkatesh

    Abstract: To date, the instability of prognostic predictors in a sparse high dimensional model, which hinders their clinical adoption, has received little attention. Stable prediction is often overlooked in favour of performance. Yet, stability prevails as key when adopting models in critical areas such as healthcare. Our study proposes a stabilization scheme by detecting higher order feature correlations. Using…

    Submitted 27 September, 2016; originally announced September 2016.

    Comments: accepted in ADMA 2016

  36. arXiv:1609.04508  [pdf, other]

    cs.LG cs.AI stat.ML

    Column Networks for Collective Classification

    Authors: Trang Pham, Truyen Tran, Dinh Phung, Svetha Venkatesh

    Abstract: Relational learning deals with data that are characterized by relational structures. An important task is collective classification, which is to jointly classify networked objects. While it holds a great promise to produce a better accuracy than non-collective classifiers, collective classification is computationally challenging and has not leveraged on the recent breakthroughs of deep learning. We…

    Submitted 28 November, 2016; v1 submitted 15 September, 2016; originally announced September 2016.

    Comments: Accepted at AAAI'17

  37. arXiv:1608.04830  [pdf, other]

    stat.ML cs.LG

    Outlier Detection on Mixed-Type Data: An Energy-based Approach

    Authors: Kien Do, Truyen Tran, Dinh Phung, Svetha Venkatesh

    Abstract: Outlier detection amounts to finding data points that differ significantly from the norm. Classic outlier detection methods are largely designed for single data type such as continuous or discrete. However, real world data is increasingly heterogeneous, where a data point can have both discrete and continuous attributes. Handling mixed-type data in a disciplined way remains a great challenge. In t…

    Submitted 16 August, 2016; originally announced August 2016.

  38. arXiv:1608.03639  [pdf, other]

    stat.ML cs.LG cs.NE

    Faster Training of Very Deep Networks Via p-Norm Gates

    Authors: Trang Pham, Truyen Tran, Dinh Phung, Svetha Venkatesh

    Abstract: A major contributing factor to the recent advances in deep neural networks is structural units that let sensory information and gradients propagate easily. Gating is one such structure that acts as a flow control. Gates are employed in many recent state-of-the-art recurrent models such as LSTM and GRU, and feedforward models such as Residual Nets and Highway Networks. This enables learning in v…

    Submitted 11 August, 2016; originally announced August 2016.

    Comments: To appear in ICPR'16

  39. arXiv:1607.08310  [pdf, other]

    stat.ML

    Preterm Birth Prediction: Deriving Stable and Interpretable Rules from High Dimensional Data

    Authors: Truyen Tran, Wei Luo, Dinh Phung, Jonathan Morris, Kristen Rickard, Svetha Venkatesh

    Abstract: Preterm births occur at an alarming rate of 10-15%. Preemies have a higher risk of infant mortality, developmental retardation and long-term disabilities. Predicting preterm birth is difficult, even for the most experienced clinicians. The most well-designed clinical study thus far reaches a modest sensitivity of 18.2-24.2% at specificity of 28.6-33.3%. We take a different approach by exploiting d…

    Submitted 28 July, 2016; originally announced July 2016.

    Comments: Presented at 2016 Machine Learning and Healthcare Conference (MLHC 2016), Los Angeles, CA

  40. arXiv:1605.01116  [pdf, other]

    stat.ML cs.LG

    An evaluation of randomized machine learning methods for redundant data: Predicting short and medium-term suicide risk from administrative records and risk assessments

    Authors: Thuong Nguyen, Truyen Tran, Shivapratap Gopakumar, Dinh Phung, Svetha Venkatesh

    Abstract: Accurate prediction of suicide risk in mental health patients remains an open problem. Existing methods including clinician judgments have acceptable sensitivity, but yield many false positives. Exploiting administrative data has a great potential, but the data has high dimensionality and redundancies in the recording processes. We investigate the efficacy of three most effective randomized machin…

    Submitted 3 May, 2016; originally announced May 2016.

  41. arXiv:1604.06518  [pdf, ps, other]

    cs.LG stat.ML

    Approximation Vector Machines for Large-scale Online Learning

    Authors: Trung Le, Tu Dinh Nguyen, Vu Nguyen, Dinh Phung

    Abstract: One of the most challenging problems in kernel online learning is to bound the model size and to promote the model sparsity. Sparse models not only improve computation and memory usage, but also enhance the generalization capacity, a principle that concurs with the law of parsimony. However, inappropriate sparsity modeling may also significantly degrade the performance. In this paper, we propose A…

    Submitted 27 May, 2017; v1 submitted 21 April, 2016; originally announced April 2016.

    Comments: 54 pages

  42. arXiv:1603.01359  [pdf, other]

    stat.ML cs.CV cs.LG

    Learning deep representation of multityped objects and tasks

    Authors: Truyen Tran, Dinh Phung, Svetha Venkatesh

    Abstract: We introduce a deep multitask architecture to integrate multityped representations of multimodal objects. This multitype exposition is less abstract than the multimodal characterization, but more machine-friendly, and thus is more precise to model. For example, an image can be described by multiple visual views, which can be in the forms of bag-of-words (counts) or color/texture histograms (real-v…

    Submitted 4 March, 2016; originally announced March 2016.

  43. arXiv:1602.05285  [pdf, other]

    stat.ML cs.IR cs.LG

    Choice by Elimination via Deep Neural Networks

    Authors: Truyen Tran, Dinh Phung, Svetha Venkatesh

    Abstract: We introduce Neural Choice by Elimination, a new framework that integrates deep neural networks into probabilistic sequential choice models for learning to rank. Given a set of items to choose from, the elimination strategy starts with the whole item set and iteratively eliminates the least worthy item in the remaining subset. We prove that the choice by elimination is equivalent to marginalizing o…

    Submitted 16 February, 2016; originally announced February 2016.

    Comments: PAKDD workshop on Biologically Inspired Techniques for Data Mining (BDM'16)

  44. arXiv:1602.02842  [pdf, other]

    stat.ML cs.IR cs.LG

    Collaborative filtering via sparse Markov random fields

    Authors: Truyen Tran, Dinh Phung, Svetha Venkatesh

    Abstract: Recommender systems play a central role in providing individualized access to information and services. This paper focuses on collaborative filtering, an approach that exploits the shared structure among like-minded users and similar items. In particular, we focus on a formal probabilistic framework known as Markov random fields (MRF). We address the open problem of structure learning and introduce…

    Submitted 8 February, 2016; originally announced February 2016.

  45. arXiv:1602.00357  [pdf, other]

    stat.ML cs.LG

    DeepCare: A Deep Dynamic Memory Model for Predictive Medicine

    Authors: Trang Pham, Truyen Tran, Dinh Phung, Svetha Venkatesh

    Abstract: Personalized predictive medicine necessitates the modeling of patient illness and care processes, which inherently have long-term temporal dependencies. Healthcare observations, recorded in electronic medical records, are episodic and irregular in time. We introduce DeepCare, an end-to-end deep dynamic neural network that reads medical records, stores previous illness history, infers current illne…

    Submitted 10 April, 2017; v1 submitted 31 January, 2016; originally announced February 2016.

    Comments: Accepted at JBI under the new name: "Predicting healthcare trajectories from medical records: A deep learning approach"

  46. arXiv:1408.1162  [pdf, other]

    stat.ML cs.LG stat.ME

    MCMC for Hierarchical Semi-Markov Conditional Random Fields

    Authors: Truyen Tran, Dinh Phung, Svetha Venkatesh, Hung H. Bui

    Abstract: Deep architectures such as hierarchical semi-Markov models are an important class of models for nested sequential data. Current exact inference schemes either cost cubic time in sequence length, or exponential time in model depth. These costs are prohibitive for large-scale problems with arbitrary length and depth. In this contribution, we propose a new approximation technique that may have the pote…

    Submitted 5 August, 2014; originally announced August 2014.

    Comments: NIPS'09 Workshop on Deep Learning for Speech Recognition and Related Applications

  47. arXiv:1408.1160  [pdf, other]

    stat.ML cs.LG stat.ME

    Mixed-Variate Restricted Boltzmann Machines

    Authors: Truyen Tran, Dinh Phung, Svetha Venkatesh

    Abstract: Modern datasets are becoming heterogeneous. To this end, we present in this paper Mixed-Variate Restricted Boltzmann Machines for simultaneously modelling variables of multiple types and modalities, including binary and continuous responses, categorical options, multicategorical choices, ordinal assessment and category-ranked preferences. Dependency among variables is modeled using latent binary v…

    Submitted 5 August, 2014; originally announced August 2014.

    Comments: Originally published in Proceedings of ACML'11

  48. arXiv:1408.0055  [pdf, other]

    stat.ML cs.LG stat.ME

    Thurstonian Boltzmann Machines: Learning from Multiple Inequalities

    Authors: Truyen Tran, Dinh Phung, Svetha Venkatesh

    Abstract: We introduce Thurstonian Boltzmann Machines (TBM), a unified architecture that can naturally incorporate a wide range of data inputs at the same time. Our motivation rests in the Thurstonian view that many discrete data types can be considered as being generated from a subset of underlying latent continuous variables, and in the observation that each realisation of a discrete type imposes certain…

    Submitted 31 July, 2014; originally announced August 2014.

    Comments: Proceedings of the 30th International Conference on Machine Learning, Atlanta, Georgia, USA, 2013. JMLR: W&CP volume 28

  49. arXiv:1408.0047  [pdf, other]

    stat.ML cs.IR cs.LG stat.AP stat.ME

    Cumulative Restricted Boltzmann Machines for Ordinal Matrix Data Analysis

    Authors: Truyen Tran, Dinh Phung, Svetha Venkatesh

    Abstract: Ordinal data is omnipresent in almost all multiuser-generated feedback - questionnaires, preferences etc. This paper investigates modelling of ordinal data with Gaussian restricted Boltzmann machines (RBMs). In particular, we present the model architecture, learning and inference procedures for both vector-variate and matrix-variate ordinal data. We show that our model is able to capture latent op…

    Submitted 31 July, 2014; originally announced August 2014.

    Comments: JMLR: Workshop and Conference Proceedings 25:1-16, 2012; Asian Conference on Machine Learning

  50. arXiv:1408.0043  [pdf, other]

    cs.LG cs.IR stat.ML

    Learning From Ordered Sets and Applications in Collaborative Ranking

    Authors: Truyen Tran, Dinh Phung, Svetha Venkatesh

    Abstract: Rankings over sets arise when users choose between groups of items. For example, a group may consist of the movies a user deems worth $5$ stars, or a customized tour package. It turns out that, to model this data type properly, we need to investigate the general combinatorics problem of partitioning a set and ordering the subsets. Here we construct a probabilistic log-linear model over a set of ordered subsets…

    Submitted 31 July, 2014; originally announced August 2014.

    Comments: JMLR: Workshop and Conference Proceedings 25:1-16, 2012, Asian Conference on Machine Learning