Showing 1–5 of 5 results for author: Król, K

Searching in archive cs.
  1. arXiv:2402.07871  [pdf, other]

    cs.LG cs.AI cs.CL

    Scaling Laws for Fine-Grained Mixture of Experts

    Authors: Jakub Krajewski, Jan Ludziejewski, Kamil Adamczewski, Maciej Pióro, Michał Krutul, Szymon Antoniak, Kamil Ciebiera, Krystian Król, Tomasz Odrzygóźdź, Piotr Sankowski, Marek Cygan, Sebastian Jaszczur

    Abstract: Mixture of Experts (MoE) models have emerged as a primary solution for reducing the computational cost of Large Language Models. In this work, we analyze their scaling properties, incorporating an expanded range of variables. Specifically, we introduce a new hyperparameter, granularity, whose adjustment enables precise control over the size of the experts. Building on this, we establish scaling la…

    Submitted 12 February, 2024; originally announced February 2024.

  2. arXiv:2401.04081  [pdf, other]

    cs.LG cs.AI cs.CL

    MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts

    Authors: Maciej Pióro, Kamil Ciebiera, Krystian Król, Jan Ludziejewski, Michał Krutul, Jakub Krajewski, Szymon Antoniak, Piotr Miłoś, Marek Cygan, Sebastian Jaszczur

    Abstract: State Space Models (SSMs) have become serious contenders in the field of sequential modeling, challenging the dominance of Transformers. At the same time, Mixture of Experts (MoE) has significantly improved Transformer-based Large Language Models, including recent state-of-the-art open models. We propose that to unlock the potential of SSMs for scaling, they should be combined with MoE. We showcas…

    Submitted 26 February, 2024; v1 submitted 8 January, 2024; originally announced January 2024.

  3. arXiv:2310.15961  [pdf, other]

    cs.CL cs.LG

    Mixture of Tokens: Continuous MoE through Cross-Example Aggregation

    Authors: Szymon Antoniak, Michał Krutul, Maciej Pióro, Jakub Krajewski, Jan Ludziejewski, Kamil Ciebiera, Krystian Król, Tomasz Odrzygóźdź, Marek Cygan, Sebastian Jaszczur

    Abstract: Mixture of Experts (MoE) models based on Transformer architecture are pushing the boundaries of language and vision tasks. The allure of these models lies in their ability to substantially increase the parameter count without a corresponding increase in FLOPs. Most widely adopted MoE models are discontinuous with respect to their parameters - often referred to as sparse. At the same time, existing…

    Submitted 24 September, 2024; v1 submitted 24 October, 2023; originally announced October 2023.

  4. arXiv:1811.03415  [pdf]

    cs.CY

    Credibility of Automatic Appraisal of Domain Names

    Authors: Karol Król, Artur Strzelecki, Dariusz Zdonek

    Abstract: Both domain names and entire websites are increasingly frequently treated as assets, the value of which can be appraised. The objective of the present thesis was to verify the credibility of domain name appraisals obtained using generally available web applications in an automated, algorithmic way. In conclusions section, it was mentioned that the terms domain name appraisal and website appraisal…

    Submitted 29 October, 2018; originally announced November 2018.

    Comments: 4 pages, 3 tables

  5. arXiv:1501.04434  [pdf, other]

    cs.CR cs.HC

    "They brought in the horrible key ring thing!" Analysing the Usability of Two-Factor Authentication in UK Online Banking

    Authors: Kat Krol, Eleni Philippou, Emiliano De Cristofaro, M. Angela Sasse

    Abstract: To prevent password breaches and guessing attacks, banks increasingly turn to two-factor authentication (2FA), requiring users to present at least one more factor, such as a one-time password generated by a hardware token or received via SMS, besides a password. We can expect some solutions -- especially those adding a token -- to create extra work for users, but little research has investigated u…

    Submitted 19 January, 2015; originally announced January 2015.

    Comments: To appear in NDSS Workshop on Usable Security (USEC 2015)