Showing 1–11 of 11 results for author: Lim, K W

  1. arXiv:2103.04246

    q-bio.GN cs.AI cs.LG

    RNA Alternative Splicing Prediction with Discrete Compositional Energy Network

    Authors: Alvin Chan, Anna Korsakova, Yew-Soon Ong, Fernaldo Richtia Winnerdy, Kah Wai Lim, Anh Tuan Phan

    Abstract: A single gene can encode different protein versions through a process called alternative splicing. Since proteins play major roles in cellular functions, aberrant splicing profiles can result in a variety of diseases, including cancers. Alternative splicing is determined by the gene's primary sequence and other regulatory factors such as RNA-binding protein levels. With these as input, we form…

    Submitted 6 March, 2021; originally announced March 2021.

    Comments: ACM CHIL 2021 Camera-Ready

  2. arXiv:2009.12199

    q-bio.QM cs.AI

    Explaining Chemical Toxicity using Missing Features

    Authors: Kar Wai Lim, Bhanushee Sharma, Payel Das, Vijil Chenthamarakshan, Jonathan S. Dordick

    Abstract: Chemical toxicity prediction using machine learning is important in drug development to reduce repeated animal and human testing, thus saving cost and time. It is highly recommended that the predictions of computational toxicology models are mechanistically explainable. Current state-of-the-art machine learning classifiers are based on deep neural networks, which tend to be complex and harder to i…

    Submitted 23 September, 2020; originally announced September 2020.

  3. arXiv:2004.01215

    cs.LG q-bio.QM stat.ML

    CogMol: Target-Specific and Selective Drug Design for COVID-19 Using Deep Generative Models

    Authors: Vijil Chenthamarakshan, Payel Das, Samuel C. Hoffman, Hendrik Strobelt, Inkit Padhi, Kar Wai Lim, Benjamin Hoover, Matteo Manica, Jannis Born, Teodoro Laino, Aleksandra Mojsilovic

    Abstract: The novel nature of SARS-CoV-2 calls for the development of efficient de novo drug design approaches. In this study, we propose an end-to-end framework, named CogMol (Controlled Generation of Molecules), for designing new drug-like small molecules targeting novel viral proteins with high affinity and off-target selectivity. CogMol combines adaptive pre-training of a molecular SMILES Variational Au…

    Submitted 23 June, 2020; v1 submitted 2 April, 2020; originally announced April 2020.

  4. arXiv:1903.06661

    cs.LG stat.ML

    GEE: A Gradient-based Explainable Variational Autoencoder for Network Anomaly Detection

    Authors: Quoc Phong Nguyen, Kar Wai Lim, Dinil Mon Divakaran, Kian Hsiang Low, Mun Choon Chan

    Abstract: This paper looks into the problem of detecting network anomalies by analyzing NetFlow records. While many previous works have used statistical models and machine learning techniques in a supervised way, such solutions have the limitations that they require large amounts of labeled data for training and are unlikely to detect zero-day attacks. Existing anomaly detection solutions also do not provide…

    Submitted 15 March, 2019; originally announced March 2019.

    Comments: to appear in 2019 IEEE Conference on Communications and Network Security (CNS)

  5. arXiv:1609.06831

    cs.LG stat.ML

    Hawkes Processes with Stochastic Excitations

    Authors: Young Lee, Kar Wai Lim, Cheng Soon Ong

    Abstract: We propose an extension to Hawkes processes by treating the levels of self-excitation as a stochastic differential equation. Our new point process allows better approximation in application domains where events and intensities accelerate each other with correlated levels of contagion. We generalize a recent algorithm for simulating draws from Hawkes processes whose levels of excitation are stochas…

    Submitted 22 September, 2016; originally announced September 2016.

    Comments: Copy of ICML paper

    Journal ref: Proceedings of The 33rd International Conference on Machine Learning (ICML), pp. 79-88. JMLR. 2016

  6. arXiv:1609.06826

    cs.DL cs.LG stat.ML

    Bibliographic Analysis with the Citation Network Topic Model

    Authors: Kar Wai Lim, Wray Buntine

    Abstract: Bibliographic analysis considers the author's research areas, the citation network and paper content among other things. In this paper, we combine these three in a topic model that produces a bibliographic model of authors, topics and documents using a non-parametric extension of a combination of the Poisson mixed-topic link model and the author-topic model. We propose a novel and efficient inference…

    Submitted 22 September, 2016; originally announced September 2016.

    Comments: A copy of ACML paper. arXiv admin note: substantial text overlap with arXiv:1609.06532

    Journal ref: Proceedings of the Sixth Asian Conference on Machine Learning (ACML), pp. 142-158. JMLR. 2014

  7. arXiv:1609.06791

    cs.CL cs.IR cs.SI

    Twitter-Network Topic Model: A Full Bayesian Treatment for Social Network and Text Modeling

    Authors: Kar Wai Lim, Changyou Chen, Wray Buntine

    Abstract: Twitter data is extremely noisy -- each tweet is short, unstructured and with informal language, a challenge for current topic modeling. On the other hand, tweets are accompanied by extra information such as authorship, hashtags and the user-follower network. Exploiting this additional information, we propose the Twitter-Network (TN) topic model to jointly model the text and the social network in…

    Submitted 21 September, 2016; originally announced September 2016.

    Comments: NIPS workshop paper

    Journal ref: NIPS 2013 Topic Models: Computation, Application, and Evaluation, pp. 1-5. Google Sites. 2013

  8. arXiv:1609.06783

    stat.ML cs.CL cs.LG

    Nonparametric Bayesian Topic Modelling with the Hierarchical Pitman-Yor Processes

    Authors: Kar Wai Lim, Wray Buntine, Changyou Chen, Lan Du

    Abstract: The Dirichlet process and its extension, the Pitman-Yor process, are stochastic processes that take probability distributions as a parameter. These processes can be stacked up to form a hierarchical nonparametric Bayesian model. In this article, we present efficient methods for the use of these processes in this hierarchical context, and apply them to latent variable models for text analytics. In…

    Submitted 21 September, 2016; originally announced September 2016.

    Comments: Preprint for International Journal of Approximate Reasoning

    Journal ref: International Journal of Approximate Reasoning, Volume 78, pp. 172-191. Elsevier. 2016

  9. arXiv:1609.06578

    cs.CL cs.IR cs.LG

    Twitter Opinion Topic Model: Extracting Product Opinions from Tweets by Leveraging Hashtags and Sentiment Lexicon

    Authors: Kar Wai Lim, Wray Buntine

    Abstract: Aspect-based opinion mining is widely applied to review data to aggregate or summarize opinions of a product, and the current state-of-the-art is achieved with a Latent Dirichlet Allocation (LDA)-based model. Although social media data like tweets are laden with opinions, their "dirty" nature (as natural language) has discouraged researchers from applying an LDA-based opinion model for product review m…

    Submitted 21 September, 2016; originally announced September 2016.

    Comments: CIKM paper

    Journal ref: Proceedings of the 23rd ACM International Conference on Information and Knowledge Management (CIKM), pp. 1319-1328. ACM. 2014

  10. On the Mathematical Relationship between Expected n-call@k and the Relevance vs. Diversity Trade-off

    Authors: Kar Wai Lim, Scott Sanner, Shengbo Guo

    Abstract: It has been previously noted that optimization of the n-call@k relevance objective (i.e., a set-based objective that is 1 if at least n documents in a set of k are relevant, otherwise 0) encourages more result set diversification for smaller n, but this statement has never been formally quantified. In this work, we explicitly derive the mathematical relationship between expected n-call@k and the r…

    Submitted 21 September, 2016; originally announced September 2016.

    Comments: SIGIR short paper

    Journal ref: Proceedings of the 35th Annual ACM SIG Information Retrieval Conference (SIGIR), pp. 1117-1118. ACM. 2012

  11. arXiv:1609.06532

    cs.DL cs.LG stat.ML

    Bibliographic Analysis on Research Publications using Authors, Categorical Labels and the Citation Network

    Authors: Kar Wai Lim, Wray Buntine

    Abstract: Bibliographic analysis considers the author's research areas, the citation network and the paper content among other things. In this paper, we combine these three in a topic model that produces a bibliographic model of authors, topics and documents, using a nonparametric extension of a combination of the Poisson mixed-topic link model and the author-topic model. This gives rise to the Citation Net…

    Submitted 21 September, 2016; originally announced September 2016.

    Comments: Preprint for Journal Machine Learning

    Journal ref: Machine Learning 103(2):185-213, 2016