Showing 1–3 of 3 results for author: Ozturkler, B

  1. arXiv:2205.08078  [pdf, other]

    cs.LG cs.CV math.OC

    Unraveling Attention via Convex Duality: Analysis and Interpretations of Vision Transformers

    Authors: Arda Sahiner, Tolga Ergen, Batu Ozturkler, John Pauly, Morteza Mardani, Mert Pilanci

    Abstract: Vision transformers using self-attention or its proposed alternatives have demonstrated promising results in many image related tasks. However, the underpinning inductive bias of attention is not well understood. To address this issue, this paper analyzes attention through the lens of convex duality. For the non-linear dot-product self-attention, and alternative mechanisms such as MLP-mixer and Fo…

    Submitted 20 May, 2022; v1 submitted 17 May, 2022; originally announced May 2022.

    Comments: 38 pages, 2 figures. To appear in ICML 2022

  2. arXiv:2107.05680  [pdf, other]

    cs.LG cs.CV eess.IV math.OC stat.ML

    Hidden Convexity of Wasserstein GANs: Interpretable Generative Models with Closed-Form Solutions

    Authors: Arda Sahiner, Tolga Ergen, Batu Ozturkler, Burak Bartan, John Pauly, Morteza Mardani, Mert Pilanci

    Abstract: Generative Adversarial Networks (GANs) are commonly used for modeling complex distributions of data. Both the generators and discriminators of GANs are often modeled by neural networks, posing a non-transparent optimization problem which is non-convex and non-concave over the generator and discriminator, respectively. Such networks are often heuristically optimized with gradient descent-ascent (GD…

    Submitted 21 March, 2022; v1 submitted 12 July, 2021; originally announced July 2021.

    Comments: Published as paper in ICLR 2022. First two authors contributed equally to this work; 34 pages, 11 figures

  3. arXiv:2103.01499  [pdf, other]

    cs.LG math.OC stat.ML

    Demystifying Batch Normalization in ReLU Networks: Equivalent Convex Optimization Models and Implicit Regularization

    Authors: Tolga Ergen, Arda Sahiner, Batu Ozturkler, John Pauly, Morteza Mardani, Mert Pilanci

    Abstract: Batch Normalization (BN) is a commonly used technique to accelerate and stabilize training of deep neural networks. Despite its empirical success, a full theoretical understanding of BN is yet to be developed. In this work, we analyze BN through the lens of convex optimization. We introduce an analytic framework based on convex duality to obtain exact convex representations of weight-decay regular…

    Submitted 21 March, 2022; v1 submitted 2 March, 2021; originally announced March 2021.

    Comments: Accepted to ICLR 2022. First two authors contributed equally to this work; 36 pages, 13 figures