Showing 1–8 of 8 results for author: Keerthi, S S

  1. arXiv:2108.05839  [pdf, ps, other]

    cs.LG cs.AI cs.CV stat.ML

    Logit Attenuating Weight Normalization

    Authors: Aman Gupta, Rohan Ramanath, Jun Shi, Anika Ramachandran, Sirou Zhou, Mingzhou Zhou, S. Sathiya Keerthi

    Abstract: Over-parameterized deep networks trained using gradient-based optimizers are a popular choice for solving classification and ranking problems. Without appropriately tuned $\ell_2$ regularization or weight decay, such networks have the tendency to make output scores (logits) and network weights large, causing training loss to become too small and the network to lose its adaptivity (ability to move…

    Submitted 12 August, 2021; originally announced August 2021.

    Comments: 23 pages

  2. arXiv:2103.05277  [pdf, ps, other]

    cs.AI cs.LG stat.ML

    Efficient Vertex-Oriented Polytopic Projection for Web-scale Applications

    Authors: Rohan Ramanath, S. Sathiya Keerthi, Yao Pan, Konstantin Salomatin, Kinjal Basu

    Abstract: We consider applications involving a large set of instances of projecting points to polytopes. We develop an intuition guided by theoretical and empirical analysis to show that when these instances follow certain structures, a large majority of the projections lie on vertices of the polytopes. To do these projections efficiently we derive a vertex-oriented incremental algorithm to project a point…

    Submitted 6 January, 2022; v1 submitted 9 March, 2021; originally announced March 2021.

    ACM Class: G.1.6; I.2.11

  3. arXiv:2003.01296  [pdf, other]

    cs.LG stat.ML

    Regression via Implicit Models and Optimal Transport Cost Minimization

    Authors: Saurav Manchanda, Khoa Doan, Pranjul Yadav, S. Sathiya Keerthi

    Abstract: This paper addresses the classic problem of regression, which involves the inductive learning of a map, $y=f(x,z)$, $z$ denoting noise, $f:\mathbb{R}^n\times \mathbb{R}^k \rightarrow \mathbb{R}^m$. Recently, Conditional GAN (CGAN) has been applied for regression and has been shown to be advantageous over the other standard approaches like Gaussian Process Regression, given its ability to implicitly mod…

    Submitted 2 March, 2020; originally announced March 2020.

  4. arXiv:2002.07971  [pdf, other]

    cs.LG stat.ML

    Gradient Boosting Neural Networks: GrowNet

    Authors: Sarkhan Badirli, Xuanqing Liu, Zhengming Xing, Avradeep Bhowmik, Khoa Doan, Sathiya S. Keerthi

    Abstract: A novel gradient boosting framework is proposed where shallow neural networks are employed as "weak learners". General loss functions are considered under this unified framework with specific examples presented for classification, regression, and learning to rank. A fully corrective step is incorporated to remedy the pitfall of greedy function approximation of classic gradient boosting decision…

    Submitted 14 June, 2020; v1 submitted 18 February, 2020; originally announced February 2020.

    Comments: Supplementary material starts after references

  5. arXiv:2002.02879  [pdf, other]

    cs.LG cs.IR cs.SI stat.ML

    Targeted display advertising: the case of preferential attachment

    Authors: Saurav Manchanda, Pranjul Yadav, Khoa Doan, S. Sathiya Keerthi

    Abstract: An average adult is exposed to hundreds of digital advertisements daily (https://www.mediadynamicsinc.com/uploads/files/PR092214-Note-only-150-Ads-2mk.pdf), making the digital advertisement industry a classic example of a big-data-driven platform. As such, the ad-tech industry relies on historical engagement logs (clicks or purchases) to identify potentially interested users for the advertisement…

    Submitted 7 February, 2020; originally announced February 2020.

    Comments: IEEE BigData 2019 paper

  6. arXiv:1905.12868  [pdf, other]

    cs.LG stat.ML

    Benchmarking Regression Methods: A comparison with CGAN

    Authors: Karan Aggarwal, Matthieu Kirchmeyer, Pranjul Yadav, S. Sathiya Keerthi, Patrick Gallinari

    Abstract: In recent years, impressive progress has been made in the design of implicit probabilistic models via Generative Adversarial Networks (GAN) and its extension, the Conditional GAN (CGAN). Excellent solutions have been demonstrated mostly in image processing applications which involve large, continuous output spaces. There is almost no application of these powerful tools to problems having small dim…

    Submitted 4 February, 2020; v1 submitted 30 May, 2019; originally announced May 2019.

  7. arXiv:1802.00130  [pdf, other]

    stat.ML cs.LG math.OC

    Distributed Newton Methods for Deep Neural Networks

    Authors: Chien-Chih Wang, Kent Loong Tan, Chun-Ting Chen, Yu-Hsiang Lin, S. Sathiya Keerthi, Dhruv Mahajan, S. Sundararajan, Chih-Jen Lin

    Abstract: Deep learning involves a difficult non-convex optimization problem with a large number of weights between any two adjacent layers of a deep structure. To handle large data sets or complicated networks, distributed training is needed, but the calculation of function, gradient, and Hessian is expensive. In particular, the communication and the synchronization cost may become a bottleneck. In this pa…

    Submitted 31 January, 2018; originally announced February 2018.

    Comments: Supplementary materials and experimental code are available at https://www.csie.ntu.edu.tw/~cjlin/papers/dnn

  8. arXiv:1711.05482  [pdf, ps, other]

    cs.LG stat.ML

    Efficient Estimation of Generalization Error and Bias-Variance Components of Ensembles

    Authors: Dhruv Mahajan, Vivek Gupta, S Sathiya Keerthi, Sellamanickam Sundararajan, Shravan Narayanamurthy, Rahul Kidambi

    Abstract: For many applications, an ensemble of base classifiers is an effective solution. The tuning of its parameters (number of classes, amount of data on which each classifier is to be trained, etc.) requires G, the generalization error of a given ensemble. The efficient estimation of G is the focus of this paper. The key idea is to approximate the variance of the class scores/probabilities of the bas…

    Submitted 15 November, 2017; originally announced November 2017.

    Comments: 12 Pages, 4 Figures, Under Review in SDM 2018