Showing 1–12 of 12 results for author: Kon, M

Searching in archive stat.
  1. arXiv:2401.04874  [pdf, other]

    stat.ML cs.LG

    Feature Network Methods in Machine Learning and Applications

    Authors: Xinying Mu, Mark Kon

    Abstract: A machine learning (ML) feature network is a graph that connects ML features in learning tasks based on their similarity. This network representation allows us to view feature vectors as functions on the network. By leveraging function operations from Fourier analysis and from functional analysis, one can easily generate novel features, making use of the graph structure imposed on the feat…

    Submitted 9 January, 2024; originally announced January 2024.

  2. arXiv:2207.06229  [pdf, other]

    stat.ML cs.LG math.FA math.PR math.ST stat.CO

    Stochastic Functional Analysis and Multilevel Vector Field Anomaly Detection

    Authors: Julio E Castrillon-Candas, Mark Kon

    Abstract: Massive vector field datasets are common in multi-spectral optical and radar sensors, among many other emerging areas of application. In this paper we develop a novel stochastic functional (data) analysis approach for detecting anomalies based on the covariance structure of nominal stochastic behavior across a domain. An optimal vector field Karhunen-Loeve expansion is applied to such random field…

    Submitted 5 October, 2022; v1 submitted 11 July, 2022; originally announced July 2022.

  3. arXiv:2110.09680  [pdf, other]

    stat.ML cs.LG stat.AP

    Multilevel Stochastic Optimization for Imputation in Massive Medical Data Records

    Authors: Wenrui Li, Xiaoyu Wang, Yuetian Sun, Snezana Milanovic, Mark Kon, Julio Enrique Castrillon-Candas

    Abstract: It has long been a recognized problem that many datasets contain significant levels of missing numerical data. A potentially critical predicate for application of machine learning methods to datasets involves addressing this problem. However, this is a challenging task. In this paper, we apply a recently developed multi-level stochastic optimization approach to the problem of imputation in massive…

    Submitted 3 April, 2024; v1 submitted 18 October, 2021; originally announced October 2021.

    Comments: 11 pages, 4 figures

    Journal ref: in IEEE Transactions on Big Data, vol. 10, no. 02, pp. 122-131, 2024

  4. arXiv:2110.01729  [pdf, other]

    stat.ML cs.LG

    Stochastic tensor space feature theory with applications to robust machine learning

    Authors: Julio Enrique Castrillon-Candas, Dingning Liu, Sicheng Yang, Xiaoling Zhang, Mark Kon

    Abstract: In this paper we develop a Multilevel Orthogonal Subspace (MOS) Karhunen-Loeve feature theory based on stochastic tensor spaces, for the construction of robust machine learning features. Training data is treated as instances of a random field within a relevant Bochner space. Our key observation is that separate machine learning classes can reside predominantly in mostly distinct subspaces. Using t…

    Submitted 20 March, 2025; v1 submitted 4 October, 2021; originally announced October 2021.

    MSC Class: 62R10; 60G35; 62-08; 60G60; 65F25; 46B09

  5. arXiv:1906.00536  [pdf, other]

    cs.LG stat.ML

    Coupled VAE: Improved Accuracy and Robustness of a Variational Autoencoder

    Authors: Shichen Cao, Jingjing Li, Kenric P. Nelson, Mark A. Kon

    Abstract: We present a coupled Variational Auto-Encoder (VAE) method that improves the accuracy and robustness of the probabilistic inferences on represented data. The new method models the dependency between input feature vectors (images) and weighs the outliers with a higher penalty by generalizing the original loss function to the coupled entropy function, using the principles of nonlinear statistical co…

    Submitted 12 July, 2021; v1 submitted 2 June, 2019; originally announced June 2019.

    Comments: 19 pages, 11 figures

  6. arXiv:1905.09149  [pdf, other]

    math.NA stat.CO

    Analytic regularity and stochastic collocation of high dimensional Newton iterates

    Authors: Julio Enrique Castrillon-Candas, Mark Kon

    Abstract: In this paper we introduce concepts from uncertainty quantification (UQ) and numerical analysis for the efficient evaluation of stochastic high dimensional Newton iterates. In particular, we develop complex analytic regularity theory of the solution with respect to the random variables. This justifies the application of sparse grids for the computation of stochastic moments. Convergence rates are…

    Submitted 22 May, 2019; originally announced May 2019.

  7. arXiv:1804.03989  [pdf]

    stat.ME cond-mat.stat-mech math.ST

    Use of the geometric mean as a statistic for the scale of the coupled Gaussian distributions

    Authors: Kenric P. Nelson, Mark A. Kon, Sabir R. Umarov

    Abstract: The geometric mean is shown to be an appropriate statistic for the scale of a heavy-tailed coupled Gaussian distribution or equivalently the Student's t distribution. The coupled Gaussian is a member of a family of distributions parameterized by the nonlinear statistical coupling which is the reciprocal of the degree of freedom and is proportional to fluctuations in the inverse scale of the Gaussi…

    Submitted 13 August, 2018; v1 submitted 4 April, 2018; originally announced April 2018.

    Comments: 17 pages, 5 figures

  8. arXiv:1212.4569  [pdf]

    stat.ML

    Feature vector regularization in machine learning

    Authors: Yue Fan, Louise Raphael, Mark Kon

    Abstract: Problems in machine learning (ML) can involve noisy input data, and ML classification methods have reached limiting accuracies when based on standard ML data sets consisting of feature vectors and their classes. Greater accuracy will require incorporation of prior structural information on data into learning. We study methods to regularize feature vectors (unsupervised regularization methods), ana…

    Submitted 30 December, 2013; v1 submitted 18 December, 2012; originally announced December 2012.

    Comments: 31 pages, one figure

  9. arXiv:1212.4562  [pdf]

    stat.ML

    A complexity analysis of statistical learning algorithms

    Authors: Mark A. Kon

    Abstract: We apply information-based complexity analysis to support vector machine (SVM) algorithms, with the goal of a comprehensive continuous algorithmic analysis of such algorithms. This involves complexity measures in which some higher order operations (e.g., certain optimizations) are considered primitive for the purposes of measuring complexity. We consider classes of information operators and algori…

    Submitted 18 December, 2012; originally announced December 2012.

  10. arXiv:1212.1263  [pdf]

    stat.ML

    On the probabilistic continuous complexity conjecture

    Authors: Mark A. Kon

    Abstract: In this paper we prove the probabilistic continuous complexity conjecture. In continuous complexity theory, this states that the complexity of solving a continuous problem with probability approaching 1 converges (in this limit) to the complexity of solving the same problem in its worst case. We prove the conjecture holds if and only if the space of problem elements is uniformly convex. The non-unifor…

    Submitted 6 December, 2012; originally announced December 2012.

  11. arXiv:1212.1180  [pdf]

    stat.ML cs.LG

    On Some Integrated Approaches to Inference

    Authors: Mark A. Kon, Leszek Plaskota

    Abstract: We present arguments for the formulation of a unified approach to different standard continuous inference methods from partial information. It is claimed that an explicit partition of information into a priori (prior knowledge) and a posteriori information (data) is an important way of standardizing inference approaches so that they can be compared on a normative scale, and so that notions of optima…

    Submitted 5 December, 2012; originally announced December 2012.

  12. Empirical Normalization for Quadratic Discriminant Analysis and Classifying Cancer Subtypes

    Authors: Mark A. Kon, Nikolay Nikolaev

    Abstract: We introduce a new discriminant analysis method (Empirical Discriminant Analysis or EDA) for binary classification in machine learning. Given a dataset of feature vectors, this method defines an empirical feature map transforming the training and test data into new data with components having Gaussian empirical distributions. This map is an empirical version of the Gaussian copula used in probabil…

    Submitted 29 October, 2012; v1 submitted 28 March, 2012; originally announced March 2012.

    Comments: 2011 10th International Conference on Machine Learning and Applications and Workshops