Showing 1–24 of 24 results for author: Wei, G

Searching in archive stat.
  1. arXiv:2409.20423  [pdf, other]

    stat.ML cs.AI cs.LG

    Stream-level flow matching with Gaussian processes

    Authors: Ganchao Wei, Li Ma

    Abstract: Flow matching (FM) is a family of training algorithms for fitting continuous normalizing flows (CNFs). Conditional flow matching (CFM) exploits the fact that the marginal vector field of a CNF can be learned by fitting least-squares regression to the conditional vector field specified given one or both ends of the flow path. In this paper, we extend the CFM algorithm by defining conditional probab…

    Submitted 3 February, 2025; v1 submitted 30 September, 2024; originally announced September 2024.

  2. arXiv:2409.19717  [pdf, other]

    stat.AP q-bio.NC

    Covariance Regression for High Dimensional Neural Data via Graph

    Authors: Ganchao Wei

    Abstract: Modern recording techniques enable neuroscientists to simultaneously study neural activity across large populations of neurons, with capturing predictor-dependent correlations being a fundamental challenge in neuroscience. Moreover, the fact that input covariates often lie in restricted subdomains, according to experimental settings, makes inference even more challenging. To address these challeng…

    Submitted 3 February, 2025; v1 submitted 29 September, 2024; originally announced September 2024.

  3. arXiv:2402.08871  [pdf, other]

    cs.LG stat.ML

    Position: Topological Deep Learning is the New Frontier for Relational Learning

    Authors: Theodore Papamarkou, Tolga Birdal, Michael Bronstein, Gunnar Carlsson, Justin Curry, Yue Gao, Mustafa Hajij, Roland Kwitt, Pietro Liò, Paolo Di Lorenzo, Vasileios Maroulas, Nina Miolane, Farzana Nasrin, Karthikeyan Natesan Ramamurthy, Bastian Rieck, Simone Scardapane, Michael T. Schaub, Petar Veličković, Bei Wang, Yusu Wang, Guo-Wei Wei, Ghada Zamzmi

    Abstract: Topological deep learning (TDL) is a rapidly evolving field that uses topological features to understand and design deep learning models. This paper posits that TDL is the new frontier for relational learning. TDL may complement graph representation learning and geometric deep learning by incorporating topological concepts, and can thus provide a natural choice for various machine learning setting…

    Submitted 6 August, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

    Comments: Proceedings of the 41st International Conference on Machine Learning, Vienna, Austria. PMLR 235, 2024

  4. arXiv:2310.15744  [pdf, other]

    stat.ML cs.LG math.AT

    Analyzing Single Cell RNA Sequencing with Topological Nonnegative Matrix Factorization

    Authors: Yuta Hozumi, Guo-Wei Wei

    Abstract: Single-cell RNA sequencing (scRNA-seq) is a relatively new technology that has stimulated enormous interest in statistics, data science, and computational biology due to the high dimensionality, complexity, and large scale associated with scRNA-seq data. Nonnegative matrix factorization (NMF) offers a unique approach due to its meta-gene interpretation of resulting low-dimensional components. Howe…

    Submitted 24 October, 2023; originally announced October 2023.

  5. arXiv:2309.02213  [pdf, ps, other]

    stat.AP q-bio.NC

    Bayesian Bi-clustering of Neural Spiking Activity with Latent Structures

    Authors: Ganchao Wei

    Abstract: Modern neural recording techniques allow neuroscientists to obtain spiking activity of multiple neurons from different brain regions over long time periods, which requires new statistical methods to be developed for understanding structure of the large-scale data. In this paper, we develop a bi-clustering method to cluster the neural spiking activity spatially and temporally, according to their lo…

    Submitted 26 December, 2023; v1 submitted 5 September, 2023; originally announced September 2023.

  6. arXiv:2301.00067  [pdf, other]

    stat.ME stat.AP

    Convolutional Non-homogeneous Poisson Process with Application to Wildfire Risk Quantification for Power Delivery Networks

    Authors: Guanzhou Wei, Feng Qiu, Xiao Liu

    Abstract: The current projection shows that much of the continental U.S. will have significantly hotter and drier days in the following decades, leading to more wildfire hazards that threaten the safety of power grid. Unfortunately, the U.S. power industry is not well prepared and still predominantly relies on empirical fire indices which do not consider the full spectrum of dynamic environmental factors. T…

    Submitted 30 December, 2022; originally announced January 2023.

  7. arXiv:2208.06409  [pdf, other]

    math.NA stat.ME

    Gibbs Phenomenon Suppression in PDE-Based Statistical Spatio-Temporal Models

    Authors: Guanzhou Wei, Xiao Liu, Russell Barton

    Abstract: A class of physics-informed spatio-temporal models has recently been proposed for modeling spatio-temporal processes governed by advection-diffusion equations. The central idea is to approximate the process by a truncated Fourier series and let the governing physics determine the dynamics of the spectral coefficients. However, because many spatio-temporal processes in real applications are non-per…

    Submitted 6 August, 2022; originally announced August 2022.

  8. arXiv:2208.03989  [pdf, other]

    stat.CO math.ST stat.ME

    An integer grid bridge sampler for the Bayesian inference of incomplete birth-death records

    Authors: Lin Sun, Gang Wei

    Abstract: A one-to-one correspondence is established between the bridge path space of birth-death processes and the exclusive union of the product spaces of simplexes and integer grids. Formulae are derived for the exact counting of the integer grid bridges with fixed number of upward jumps. Then a uniform sampler over such restricted bridge path space is constructed. This leads to a Monte Carlo scheme, the…

    Submitted 8 August, 2022; originally announced August 2022.

  9. arXiv:2206.11766  [pdf, other]

    stat.AP stat.ML

    Physics-Informed Statistical Modeling for Wildfire Aerosols Process Using Multi-Source Geostationary Satellite Remote-Sensing Data Streams

    Authors: Guanzhou Wei, Venkat Krishnan, Yu Xie, Manajit Sengupta, Yingchen Zhang, Haitao Liao, Xiao Liu

    Abstract: Increasingly frequent wildfires significantly affect solar energy production as the atmospheric aerosols generated by wildfires diminish the incoming solar radiation to the earth. Atmospheric aerosols are measured by Aerosol Optical Depth (AOD), and AOD data streams can be retrieved and monitored by geostationary satellites. However, multi-source remote-sensing data streams often present heterogen…

    Submitted 23 June, 2022; originally announced June 2022.

  10. arXiv:2206.04189  [pdf, other]

    stat.ML cs.CG cs.LG

    CCP: Correlated Clustering and Projection for Dimensionality Reduction

    Authors: Yuta Hozumi, Rui Wang, Guo-Wei Wei

    Abstract: Most dimensionality reduction methods employ frequency domain representations obtained from matrix diagonalization and may not be efficient for large datasets with relatively high intrinsic dimensions. To address this challenge, Correlated Clustering and Projection (CCP) offers a novel data domain strategy that does not need to solve any matrix. CCP partitions high-dimensional features into correl…

    Submitted 8 June, 2022; originally announced June 2022.

  11. arXiv:2205.10639  [pdf, ps, other]

    q-bio.NC stat.AP

    A Flexible Bayesian Clustering of Dynamic Subpopulations in Neural Spiking Activity

    Authors: Ganchao Wei, Ian H. Stevenson, Xiaojing Wang

    Abstract: With advances in neural recording techniques, neuroscientists are now able to record the spiking activity of many hundreds of neurons simultaneously, and new statistical methods are needed to understand the structure of this large-scale neural population activity. Although previous work has tried to summarize neural activity within and between known populations by extracting low-dimensional latent…

    Submitted 2 March, 2023; v1 submitted 21 May, 2022; originally announced May 2022.

  12. arXiv:2205.00507  [pdf, other]

    q-bio.NC stat.AP

    Dynamic modeling of spike count data with Conway-Maxwell Poisson variability

    Authors: Ganchao Wei, Ian H. Stevenson

    Abstract: In many areas of the brain, neural spiking activity covaries with features of the external world, such as sensory stimuli or an animal's movement. Experimental findings suggest that the variability of neural activity changes over time and may provide information about the external world beyond the information provided by the average neural activity. To flexibly track time-varying neural response p…

    Submitted 8 October, 2022; v1 submitted 1 May, 2022; originally announced May 2022.

    Comments: 6 figures

  13. arXiv:2102.01803  [pdf]

    q-bio.NC stat.AP stat.CO

    Tracking fast and slow changes in synaptic weights from simultaneously observed pre- and postsynaptic spiking

    Authors: Ganchao Wei, Ian H. Stevenson

    Abstract: Synapses change on multiple timescales, ranging from milliseconds to minutes, due to a combination of both short- and long-term plasticity. Here we develop an extension of the common Generalized Linear Model to infer both short- and long-term changes in the coupling between a pre- and post-synaptic neuron based on observed spiking activity. We model short-term synaptic plasticity using additive ef…

    Submitted 8 April, 2021; v1 submitted 2 February, 2021; originally announced February 2021.

    Journal ref: Neural Computation (2021) 33 (10) 2682-2709

  14. arXiv:1912.00106  [pdf, other]

    cs.LG cs.ET stat.ML

    A binary-activation, multi-level weight RNN and training algorithm for ADC-/DAC-free and noise-resilient processing-in-memory inference with eNVM

    Authors: Siming Ma, David Brooks, Gu-Yeon Wei

    Abstract: We propose a new algorithm for training neural networks with binary activations and multi-level weights, which enables efficient processing-in-memory circuits with embedded nonvolatile memories (eNVM). Binary activations obviate costly DACs and ADCs. Multi-level weights leverage multi-level eNVM cells. Compared to existing algorithms, our method not only works for feed-forward networks (e.g., full…

    Submitted 12 October, 2020; v1 submitted 29 November, 2019; originally announced December 2019.

    Comments: 10 pages, 6 figures

  15. arXiv:1910.01500  [pdf, other]

    cs.LG cs.PF stat.ML

    MLPerf Training Benchmark

    Authors: Peter Mattson, Christine Cheng, Cody Coleman, Greg Diamos, Paulius Micikevicius, David Patterson, Hanlin Tang, Gu-Yeon Wei, Peter Bailis, Victor Bittorf, David Brooks, Dehao Chen, Debojyoti Dutta, Udit Gupta, Kim Hazelwood, Andrew Hock, Xinyuan Huang, Atsushi Ike, Bill Jia, Daniel Kang, David Kanter, Naveen Kumar, Jeffery Liao, Guokai Ma, Deepak Narayanan, et al. (12 additional authors not shown)

    Abstract: Machine learning (ML) needs industry-standard performance benchmarks to support design and competitive evaluation of the many emerging software and hardware solutions for ML. But ML training presents three unique benchmarking challenges absent from other domains: optimizations that improve training throughput can increase the time to solution, training is stochastic and time to solution exhibits h…

    Submitted 2 March, 2020; v1 submitted 2 October, 2019; originally announced October 2019.

    Comments: MLSys 2020

  16. arXiv:1909.13271  [pdf, other]

    cs.LG cs.AR stat.ML

    AdaptivFloat: A Floating-point based Data Type for Resilient Deep Learning Inference

    Authors: Thierry Tambe, En-Yu Yang, Zishen Wan, Yuntian Deng, Vijay Janapa Reddi, Alexander Rush, David Brooks, Gu-Yeon Wei

    Abstract: Conventional hardware-friendly quantization methods, such as fixed-point or integer, tend to perform poorly at very low word sizes as their shrinking dynamic ranges cannot adequately capture the wide data distributions commonly seen in sequence transduction models. We present AdaptivFloat, a floating-point inspired number representation format for deep learning that dynamically maximizes and optim…

    Submitted 11 February, 2020; v1 submitted 29 September, 2019; originally announced September 2019.

    Comments: 10 pages

  17. arXiv:1907.10701  [pdf, other]

    cs.LG cs.PF stat.ML

    Benchmarking TPU, GPU, and CPU Platforms for Deep Learning

    Authors: Yu Emma Wang, Gu-Yeon Wei, David Brooks

    Abstract: Training deep learning models is compute-intensive and there is an industry-wide trend towards hardware specialization to improve performance. To systematically benchmark deep learning platforms, we introduce ParaDnn, a parameterized benchmark suite for deep learning that generates end-to-end models for fully connected (FC), convolutional (CNN), and recurrent (RNN) neural networks. Along with six…

    Submitted 22 October, 2019; v1 submitted 24 July, 2019; originally announced July 2019.

  18. arXiv:1906.11196  [pdf, other]

    q-bio.BM cs.LG stat.ML

    Seq-SetNet: Exploring Sequence Sets for Inferring Structures

    Authors: Fusong Ju, Jianwei Zhu, Guozheng Wei, Qi Zhang, Shiwei Sun, Dongbo Bu

    Abstract: Sequence set is a widely-used type of data source in a large variety of fields. A typical example is protein structure prediction, which takes a multiple sequence alignment (MSA) as input and aims to infer structural information from it. Almost all of the existing approaches exploit MSAs in an indirect fashion, i.e., they transform MSAs into position-specific scoring matrices (PSSM) that represen…

    Submitted 6 June, 2019; originally announced June 2019.

  19. arXiv:1905.10145  [pdf, ps, other]

    cs.LG stat.ML

    Learning Low-Rank Approximation for CNNs

    Authors: Dongsoo Lee, Se Jung Kwon, Byeongwook Kim, Gu-Yeon Wei

    Abstract: Low-rank approximation is an effective model compression technique to not only reduce parameter storage requirements, but to also reduce computations. For convolutional neural networks (CNNs), however, well-known low-rank approximation methods, such as Tucker or CP decomposition, result in degraded model accuracy because decomposed layers hinder training convergence. In this paper, we propose a ne…

    Submitted 24 May, 2019; originally announced May 2019.

  20. arXiv:1905.10138  [pdf, ps, other]

    cs.LG stat.ML

    Structured Compression by Weight Encryption for Unstructured Pruning and Quantization

    Authors: Se Jung Kwon, Dongsoo Lee, Byeongwook Kim, Parichay Kapoor, Baeseong Park, Gu-Yeon Wei

    Abstract: Model compression techniques, such as pruning and quantization, are becoming increasingly important to reduce the memory footprints and the amount of computations. Despite model size reduction, achieving performance enhancement on devices is, however, still challenging mainly due to the irregular representations of sparse matrix formats. This paper proposes a new weight representation scheme for S…

    Submitted 5 March, 2020; v1 submitted 24 May, 2019; originally announced May 2019.

  21. arXiv:1905.05686  [pdf, ps, other]

    cs.LG stat.ML

    Network Pruning for Low-Rank Binary Indexing

    Authors: Dongsoo Lee, Se Jung Kwon, Byeongwook Kim, Parichay Kapoor, Gu-Yeon Wei

    Abstract: Pruning is an efficient model compression technique to remove redundancy in the connectivity of deep neural networks (DNNs). Computations using sparse matrices obtained by pruning parameters, however, exhibit vastly different parallelism depending on the index representation scheme. As a result, fine-grained pruning has not gained much attention due to its irregular index form leading to large mem…

    Submitted 14 May, 2019; originally announced May 2019.

  22. arXiv:1711.04686  [pdf, other]

    cs.LG stat.ML

    Weightless: Lossy Weight Encoding For Deep Neural Network Compression

    Authors: Brandon Reagen, Udit Gupta, Robert Adolf, Michael M. Mitzenmacher, Alexander M. Rush, Gu-Yeon Wei, David Brooks

    Abstract: The large memory requirements of deep neural networks limit their deployment and adoption on many devices. Model compression methods effectively reduce the memory requirements of these models, usually through applying transformations such as weight pruning or quantization. In this paper, we present a novel scheme for lossy weight encoding which complements conventional compression techniques. The…

    Submitted 13 November, 2017; originally announced November 2017.

  23. arXiv:1703.10951  [pdf, other]

    q-bio.QM cs.LG stat.ML

    Comparison of multi-task convolutional neural network (MT-CNN) and a few other methods for toxicity prediction

    Authors: Kedi Wu, Guo-Wei Wei

    Abstract: Toxicity analysis and prediction are of paramount importance to human health and environmental protection. Existing computational methods are built from a wide variety of descriptors and regressors, which makes their performance analysis difficult. For example, deep neural network (DNN), a successful approach in many occasions, acts like a black box and offers little conceptual elegance or physica…

    Submitted 31 March, 2017; originally announced March 2017.

  24. arXiv:1403.5105   

    stat.ME

    The factorization and simulation for fundamental solution of Cauchy problem

    Authors: Xinjun Gan, Gang Wei, Jie Zhang, Qi Zhang

    Abstract: In this paper, we demonstrate the simulation of fundamental solution for the parabolic equation by the relationship with Ito diffusion. The factorization and Monte Carlo methods of the fundamental solution are considered. With the fact that the fundamental solution can be written as a product of the transition function and the expectation of a bridge path integral, we give a novel and efficient a…

    Submitted 4 July, 2014; v1 submitted 20 March, 2014; originally announced March 2014.

    Comments: This paper has been withdrawn by the author due to a crucial sign error in simulation