Showing 1–50 of 102 results for author: Markov, I

  1. arXiv:2506.16200  [pdf, ps, other]

    physics.acc-ph physics.plasm-ph

    Theory of wakefield in a transversely inhomogeneous plasma waveguide

    Authors: K. V. Galaydych, P. I. Markov, G. V. Sotnikov

    Abstract: Theoretical studies have been made into the relativistic drive bunch generation of a wakefield in a cylindrical waveguide filled with a transversely inhomogeneous plasma. According to the model used, the transversely inhomogeneous plasma is considered as a combination of tubular plasma and the plasma background of different density. Analytical expressions have been derived for the excited radial a…

    Submitted 19 June, 2025; originally announced June 2025.

  2. arXiv:2505.14371  [pdf, ps, other]

    cs.LG math.OC

    Layer-wise Quantization for Quantized Optimistic Dual Averaging

    Authors: Anh Duc Nguyen, Ilia Markov, Frank Zhengqing Wu, Ali Ramezani-Kebrya, Kimon Antonakopoulos, Dan Alistarh, Volkan Cevher

    Abstract: Modern deep neural networks exhibit heterogeneity across numerous layers of various types such as residuals, multi-head attention, etc., due to varying structures (dimensions, activation functions, etc.), distinct representation characteristics, which impact predictions. We develop a general layer-wise quantization framework with tight variance and code-length bounds, adapting to the heterogeneiti…

    Submitted 20 May, 2025; originally announced May 2025.

    Comments: Accepted at the International Conference on Machine Learning (ICML 2025)

  3. arXiv:2411.10406  [pdf, other]

    quant-ph cond-mat.dis-nn cs.AI cs.DC

    How to Build a Quantum Supercomputer: Scaling from Hundreds to Millions of Qubits

    Authors: Masoud Mohseni, Artur Scherer, K. Grace Johnson, Oded Wertheim, Matthew Otten, Navid Anjum Aadit, Yuri Alexeev, Kirk M. Bresniker, Kerem Y. Camsari, Barbara Chapman, Soumitra Chatterjee, Gebremedhin A. Dagnew, Aniello Esposito, Farah Fahim, Marco Fiorentino, Archit Gajjar, Abdullah Khalid, Xiangzhou Kong, Bohdan Kulchytskyy, Elica Kyoseva, Ruoyu Li, P. Aaron Lott, Igor L. Markov, Robert F. McDermott, Giacomo Pedretti, et al. (16 additional authors not shown)

    Abstract: In the span of four decades, quantum computation has evolved from an intellectual curiosity to a potentially realizable technology. Today, small-scale demonstrations have become possible for quantum algorithmic primitives on hundreds of physical qubits and proof-of-principle error-correction on a single logical qubit. Nevertheless, despite significant progress and excitement, the path toward a ful…

    Submitted 31 January, 2025; v1 submitted 15 November, 2024; originally announced November 2024.

    Comments: 76 pages, 46 figures. General revision, added figures, added references, added appendices

  4. arXiv:2410.24038  [pdf, other]

    physics.acc-ph

    Acceleration and Focusing Electron/Positron Bunches in Plasma-Dielectric Wakefield Accelerator

    Authors: Gennadiy V. Sotnikov, Kostyantyn V. Galaydych, Jay L. Hirshfield, Peter I. Markov, Ivan M. Onishchenko

    Abstract: To mitigate the BBU instability and improve the characteristics of accelerated bunches in a Dielectric Wakefield Accelerator, one can use isotropic plasma filling of the transport channel. Here we present the results of analytical and numerical studies of the dynamics of accelerated electron/positron and drive electron bunches under wake acceleration in a plasma DWA (PDWA) with a vacuum channel.…

    Submitted 31 October, 2024; originally announced October 2024.

    Comments: 33 pages, 13 figures, AAC2024 Workshop, will be submitted to Nuclear Instruments and Methods in Physics Research

  5. arXiv:2409.09659  [pdf, other]

    cs.CL

    Leveraging Open-Source Large Language Models for Native Language Identification

    Authors: Yee Man Ng, Ilia Markov

    Abstract: Native Language Identification (NLI) - the task of identifying the native language (L1) of a person based on their writing in the second language (L2) - has applications in forensics, marketing, and second language acquisition. Historically, conventional machine learning approaches that heavily rely on extensive feature engineering have outperformed transformer-based language models on this task.…

    Submitted 19 January, 2025; v1 submitted 15 September, 2024; originally announced September 2024.

  6. arXiv:2407.10994  [pdf, other]

    cs.CL cs.AI cs.HC cs.LG

    Panza: Design and Analysis of a Fully-Local Personalized Text Writing Assistant

    Authors: Armand Nicolicioiu, Eugenia Iofinova, Andrej Jovanovic, Eldar Kurtic, Mahdi Nikdan, Andrei Panferov, Ilia Markov, Nir Shavit, Dan Alistarh

    Abstract: The availability of powerful open-source large language models (LLMs) opens exciting use-cases, such as using personal data to fine-tune these models to imitate a user's unique writing style. Two key requirements for such assistants are personalization - in the sense that the assistant should recognizably reflect the user's own writing style - and privacy - users may justifiably be wary of uploadi…

    Submitted 10 February, 2025; v1 submitted 24 June, 2024; originally announced July 2024.

    Comments: Panza is available at https://github.com/IST-DASLab/PanzaMail

  7. arXiv:2405.15756  [pdf, other]

    cs.LG cs.AI

    Wasserstein Distances, Neuronal Entanglement, and Sparsity

    Authors: Shashata Sawmya, Linghao Kong, Ilia Markov, Dan Alistarh, Nir Shavit

    Abstract: Disentangling polysemantic neurons is at the core of many current approaches to interpretability of large language models. Here we attempt to study how disentanglement can be used to understand performance, particularly under weight sparsity, a leading post-training optimization technique. We suggest a novel measure for estimating neuronal entanglement: the Wasserstein distance of a neuron's outpu…

    Submitted 26 February, 2025; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: 10 pages, 9 figures

  8. arXiv:2405.13754  [pdf, other]

    cs.CL

    Grounding Toxicity in Real-World Events across Languages

    Authors: Wondimagegnhue Tsegaye Tufa, Ilia Markov, Piek Vossen

    Abstract: Social media conversations frequently suffer from toxicity, creating significant issues for users, moderators, and entire communities. Events in the real world, like elections or conflicts, can initiate and escalate toxic behavior online. Our study investigates how real-world events influence the origin and spread of toxicity in online discussions across various languages and regions. We gathered…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: Paper accepted at the 29th International Conference on Natural Language & Information Systems (NLDB 2024)

  9. arXiv:2404.18865  [pdf, other]

    cs.CL

    Truth-value judgment in language models: belief directions are context sensitive

    Authors: Stefan F. Schouten, Peter Bloem, Ilia Markov, Piek Vossen

    Abstract: Recent work has demonstrated that the latent spaces of large language models (LLMs) contain directions predictive of the truth of sentences. Multiple methods recover such directions and build probes that are described as getting at a model's "knowledge" or "beliefs". We investigate this phenomenon, looking closely at the impact of context on the probes. Our experiments establish where in the LLM t…

    Submitted 29 April, 2024; originally announced April 2024.

  10. arXiv:2404.18810  [pdf, other]

    cs.CL

    Unknown Script: Impact of Script on Cross-Lingual Transfer

    Authors: Wondimagegnhue Tsegaye Tufa, Ilia Markov, Piek Vossen

    Abstract: Cross-lingual transfer has become an effective way of transferring knowledge between languages. In this paper, we explore an often overlooked aspect in this domain: the influence of the source language of a language model on language transfer performance. We consider a case where the target language and its script are not part of the pre-trained model. We conduct a series of experiments on monolin…

    Submitted 7 May, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

    Comments: Paper accepted to NAACL Student Research Workshop (SRW) 2024

  11. arXiv:2404.18726  [pdf, other]

    cs.CL

    The Constant in HATE: Analyzing Toxicity in Reddit across Topics and Languages

    Authors: Wondimagegnhue Tsegaye Tufa, Ilia Markov, Piek Vossen

    Abstract: Toxic language remains an ongoing challenge on social media platforms, presenting significant issues for users and communities. This paper provides a cross-topic and cross-lingual analysis of toxicity in Reddit conversations. We collect 1.5 million comment threads from 481 communities in six languages: English, German, Spanish, Turkish, Arabic, and Dutch, covering 80 topics such as Culture, Politic…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: Accepted to TRAC 2024

  12. arXiv:2311.17614  [pdf, other]

    physics.acc-ph physics.plasm-ph

    Bunch-excited wakefield in dielectric waveguide with hollow plasma channel

    Authors: K. V. Galaydych, P. I. Markov, G. V. Sotnikov

    Abstract: Wakefield excitation by a single relativistic electron bunch in a plasma-dielectric accelerating structure has been studied both analytically and numerically. The structure represents a dielectric-loaded cylindrical metal waveguide, which has partially plasma-filled channel (the hollow plasma channel) to transport charged particles. Assuming the linear regime of excitation, analytical expressions…

    Submitted 29 November, 2023; originally announced November 2023.

  13. arXiv:2311.05787  [pdf, other]

    cs.LG

    Towards stable real-world equation discovery with assessing differentiating quality influence

    Authors: Mikhail Masliaev, Ilya Markov, Alexander Hvatov

    Abstract: This paper explores the critical role of differentiation approaches for data-driven differential equation discovery. Accurate derivatives of the input data are essential for reliable algorithmic operation, particularly in real-world scenarios where measurement quality is inevitably compromised. We propose alternatives to the commonly used finite differences-based method, notorious for its instabil…

    Submitted 9 November, 2023; originally announced November 2023.

  14. arXiv:2310.14657  [pdf, other]

    cs.CL cs.AI

    Reasoning about Ambiguous Definite Descriptions

    Authors: Stefan F. Schouten, Peter Bloem, Ilia Markov, Piek Vossen

    Abstract: Natural language reasoning plays an increasingly important role in improving language models' ability to solve complex language understanding tasks. An interesting use case for reasoning is the resolution of context-dependent ambiguity. But no resources exist to evaluate how well Large Language Models can use explicit reasoning to resolve ambiguity in language. We propose to use ambiguous definite…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023 Findings

  15. arXiv:2310.09259  [pdf, other]

    cs.LG

    QUIK: Towards End-to-End 4-Bit Inference on Generative Large Language Models

    Authors: Saleh Ashkboos, Ilia Markov, Elias Frantar, Tingxuan Zhong, Xincheng Wang, Jie Ren, Torsten Hoefler, Dan Alistarh

    Abstract: Large Language Models (LLMs) from the GPT family have become extremely popular, leading to a race towards reducing their inference costs to allow for efficient local computation. Yet, the vast majority of existing work focuses on weight-only quantization, which can reduce runtime costs in the memory-bound one-token-at-a-time generative setting, but does not address them in compute-bound scenarios,…

    Submitted 2 November, 2023; v1 submitted 13 October, 2023; originally announced October 2023.

    Comments: 16 pages

  16. arXiv:2306.09642  [pdf, ps, other]

    cs.CL cs.LG

    Cross-Domain Toxic Spans Detection

    Authors: Stefan F. Schouten, Baran Barbarestani, Wondimagegnhue Tufa, Piek Vossen, Ilia Markov

    Abstract: Given the dynamic nature of toxic language use, automated methods for detecting toxic spans are likely to encounter distributional shift. To explore this phenomenon, we evaluate three approaches for detecting toxic spans under cross-domain conditions: lexicon-based, rationale extraction, and fine-tuned language models. Our findings indicate that a simple method using off-the-shelf lexicons perform…

    Submitted 16 June, 2023; originally announced June 2023.

    Comments: NLDB 2023

  17. arXiv:2306.09633  [pdf, other]

    cs.LG cs.AI cs.AR cs.CY

    The False Dawn: Reevaluating Google's Reinforcement Learning for Chip Macro Placement

    Authors: Igor L. Markov

    Abstract: Reinforcement learning (RL) for physical design of silicon chips in a Google 2021 Nature paper stirred controversy due to poorly documented claims that raised eyebrows and drew critical media coverage. The paper withheld critical methodology steps and most inputs needed to reproduce results. Our meta-analysis shows how two separate evaluations filled in the gaps and demonstrated that Google RL lag…

    Submitted 28 September, 2024; v1 submitted 16 June, 2023; originally announced June 2023.

    Comments: 14 pages, 1 figure, 4 tables, 83 references

  18. arXiv:2303.16531  [pdf, other]

    cs.CV

    RusTitW: Russian Language Text Dataset for Visual Text in-the-Wild Recognition

    Authors: Igor Markov, Sergey Nesteruk, Andrey Kuznetsov, Denis Dimitrov

    Abstract: Information surrounds people in modern life. Text is a very efficient type of information that people have used for communication for centuries. However, automated text-in-the-wild recognition remains a challenging problem. The major limitation for a DL system is the lack of training data. For competitive performance, the training set must contain many samples that replicate real-world cases. While…

    Submitted 29 March, 2023; originally announced March 2023.

    Comments: 5 pages, 6 figures, 2 tables

  19. arXiv:2303.11580  [pdf, other]

    cs.LG

    Efficient Multi-stage Inference on Tabular Data

    Authors: Daniel S Johnson, Igor L Markov

    Abstract: Many ML applications and products train on medium amounts of input data but get bottlenecked in real-time inference. When implementing ML systems, conventional wisdom favors segregating ML code into services queried by product code via Remote Procedure Call (RPC) APIs. This approach clarifies the overall software architecture and simplifies product code by abstracting away ML internals. However, t…

    Submitted 21 July, 2023; v1 submitted 21 March, 2023; originally announced March 2023.

  20. arXiv:2303.03460  [pdf, other]

    quant-ph

    Ever more optimized simulations of fermionic systems on a quantum computer

    Authors: Qingfeng Wang, Ze-Pei Cian, Ming Li, Igor L. Markov, Yunseong Nam

    Abstract: Despite using a novel model of computation, quantum computers break down programs into elementary gates. Among such gates, entangling gates are the most expensive. In the context of fermionic simulations, we develop a suite of compilation and optimization techniques that massively reduce the entangling-gate counts. We exploit the well-studied non-quantum optimization algorithms to achieve up to 24…

    Submitted 6 March, 2023; originally announced March 2023.

  21. arXiv:2302.14139  [pdf, other]

    cs.LG cs.AI cs.SE

    Scalable End-to-End ML Platforms: from AutoML to Self-serve

    Authors: Igor L. Markov, Pavlos A. Apostolopoulos, Mia R. Garrard, Tanya Qie, Yin Huang, Tanvi Gupta, Anika Li, Cesar Cardoso, George Han, Ryan Maghsoudian, Norm Zhou

    Abstract: ML platforms help enable intelligent data-driven applications and maintain them with limited engineering effort. Upon sufficiently broad adoption, such platforms reach economies of scale that bring greater component reuse while improving efficiency of system development and maintenance. For an end-to-end ML platform with broad adoption, scaling relies on pervasive ML automation and system integrat…

    Submitted 3 March, 2023; v1 submitted 27 February, 2023; originally announced February 2023.

    Comments: 10 pages, 1 figure, 2 tables

  22. arXiv:2302.12360  [pdf, other]

    cs.LG cs.AI

    Practical Knowledge Distillation: Using DNNs to Beat DNNs

    Authors: Chung-Wei Lee, Pavlos Athanasios Apostolopulos, Igor L. Markov

    Abstract: For tabular data sets, we explore data and model distillation, as well as data denoising. These techniques improve both gradient-boosting models and a specialized DNN architecture. While gradient boosting is known to outperform DNNs on tabular data, we close the gap for datasets with 100K+ rows and give DNNs an advantage on small data sets. We extend these results with input-data distillation and…

    Submitted 1 March, 2023; v1 submitted 23 February, 2023; originally announced February 2023.

    Comments: 11 pages, 1 figure, 17 tables

  23. arXiv:2302.02390  [pdf, other]

    cs.LG

    Quantized Distributed Training of Large Models with Convergence Guarantees

    Authors: Ilia Markov, Adrian Vladu, Qi Guo, Dan Alistarh

    Abstract: Communication-reduction techniques are a popular way to improve scalability in data-parallel training of deep neural networks (DNNs). The recent emergence of large language models such as GPT has created the need for new approaches to exploit data-parallelism. Among these, fully-sharded data parallel (FSDP) training is highly popular, yet it still encounters scalability bottlenecks. One reason is…

    Submitted 5 February, 2023; originally announced February 2023.

  24. arXiv:2301.07233  [pdf, other]

    quant-ph cs.ET

    Enhancing quantum computer performance via symmetrization

    Authors: Andrii Maksymov, Jason Nguyen, Yunseong Nam, Igor Markov

    Abstract: Large quantum computers promise to solve some critical problems not solvable otherwise. However, modern quantum technologies suffer various imperfections such as control errors and qubit decoherence, inhibiting their potential utility. The overheads of quantum error correction are too great for near-term quantum computers, whereas error-mitigation strategies that address specific device imperfecti…

    Submitted 17 January, 2023; originally announced January 2023.

  25. arXiv:2210.17357  [pdf, other]

    cs.LG cs.DC

    L-GreCo: Layerwise-Adaptive Gradient Compression for Efficient and Accurate Deep Learning

    Authors: Mohammadreza Alimohammadi, Ilia Markov, Elias Frantar, Dan Alistarh

    Abstract: Data-parallel distributed training of deep neural networks (DNN) has gained very widespread adoption, but can still experience communication bottlenecks. To address this issue, entire families of compression mechanisms have been developed, including quantization, sparsification, and low-rank approximation, some of which are seeing significant practical adoption. Despite this progress, almost all k…

    Submitted 9 June, 2023; v1 submitted 31 October, 2022; originally announced October 2022.

  26. arXiv:2210.12526  [pdf, other]

    cs.CR cs.LG

    Federated Calibration and Evaluation of Binary Classifiers

    Authors: Graham Cormode, Igor Markov

    Abstract: We address two major obstacles to practical use of supervised classifiers on distributed private data. Whether a classifier was trained by a federation of cooperating clients or trained centrally out of distribution, (1) the output scores must be calibrated, and (2) performance metrics must be evaluated -- all without assembling labels in one place. In particular, we show how to perform calibratio…

    Submitted 22 October, 2022; originally announced October 2022.

    Comments: 24 pages

  27. arXiv:2202.09483  [pdf, other]

    cs.CL cs.SI

    Data-Driven Mitigation of Adversarial Text Perturbation

    Authors: Rasika Bhalerao, Mohammad Al-Rubaie, Anand Bhaskar, Igor Markov

    Abstract: Social networks have become an indispensable part of our lives, with billions of people producing ever-increasing amounts of text. At such scales, content policies and their enforcement become paramount. To automate moderation, questionable content is detected by Natural Language Processing (NLP) classifiers. However, high-performance classifiers are hampered by misspellings and adversarial text p…

    Submitted 18 February, 2022; originally announced February 2022.

  28. arXiv:2111.12795  [pdf, other]

    cs.HC cs.AI cs.LG

    Picasso: Model-free Feature Visualization

    Authors: Binh Vu, Igor Markov

    Abstract: Today, Machine Learning (ML) applications can have access to tens of thousands of features. With such feature sets, efficiently browsing and curating subsets of most relevant features is a challenge. In this paper, we present a novel approach to visualize up to several thousands of features in a single image. The image not only shows information on individual features, but also expresses feature i…

    Submitted 24 November, 2021; originally announced November 2021.

  29. CGX: Adaptive System Support for Communication-Efficient Deep Learning

    Authors: Ilia Markov, Hamidreza Ramezanikebrya, Dan Alistarh

    Abstract: The ability to scale out training workloads has been one of the key performance enablers of deep learning. The main scaling approach is data-parallel GPU-based training, which has been boosted by hardware and software support for highly efficient point-to-point communication, and in particular via hardware bandwidth overprovisioning. Overprovisioning comes at a cost: there is an order of magnitude…

    Submitted 29 December, 2022; v1 submitted 16 November, 2021; originally announced November 2021.

    Journal ref: Middleware 2022

  30. arXiv:2110.07554  [pdf, other]

    cs.LG cs.AI cs.SE

    Looper: An end-to-end ML platform for product decisions

    Authors: Igor L. Markov, Hanson Wang, Nitya Kasturi, Shaun Singh, Sze Wai Yuen, Mia Garrard, Sarah Tran, Yin Huang, Zehui Wang, Igor Glotov, Tanvi Gupta, Boshuang Huang, Peng Chen, Xiaowen Xie, Michael Belkin, Sal Uryasev, Sam Howie, Eytan Bakshy, Norm Zhou

    Abstract: Modern software systems and products increasingly rely on machine learning models to make data-driven decisions based on interactions with users, infrastructure and other systems. For broader adoption, this practice must (i) accommodate product engineers without ML backgrounds, (ii) support finegrain product-metric evaluation and (iii) optimize for product goals. To address shortcomings of prior p…

    Submitted 21 June, 2022; v1 submitted 14 October, 2021; originally announced October 2021.

    Comments: 11 pages + references, 7 figures; to appear in KDD 2022

  31. arXiv:2109.11577  [pdf, other]

    cs.LG

    Text Ranking and Classification using Data Compression

    Authors: Nitya Kasturi, Igor L. Markov

    Abstract: A well-known but rarely used approach to text categorization uses conditional entropy estimates computed using data compression tools. Text affinity scores derived from compressed sizes can be used for classification and ranking tasks, but their success depends on the compression tools used. We use the Zstandard compressor and strengthen these ideas in several ways, calling the resulting language-…

    Submitted 7 December, 2021; v1 submitted 23 September, 2021; originally announced September 2021.

    Journal ref: ICBINB workshop at NeurIPS 2021

  32. arXiv:2108.13815  [pdf, other]

    physics.acc-ph physics.plasm-ph

    Acceleration and focusing of positron bunch in a dielectric waveguide accelerator with homogeneous plasma in transport channel

    Authors: P. I. Markov, R. R. Kniaziev, G. V. Sotnikov

    Abstract: The paper presents the results of numerical PIC simulation of positron bunch focusing during acceleration in a plasma dielectric wakefield accelerator. The wakefield was excited by a drive electron bunch in a quartz dielectric tube embedded in a cylindrical metal waveguide. The internal area of the dielectric tube has been filled with radially homogeneous plasma having in general case the vacuum channel alon…

    Submitted 11 December, 2021; v1 submitted 31 August, 2021; originally announced August 2021.

    Comments: 10 pages, 9 figures

  33. Mixture-Based Correction for Position and Trust Bias in Counterfactual Learning to Rank

    Authors: Ali Vardasbi, Maarten de Rijke, Ilya Markov

    Abstract: In counterfactual learning to rank (CLTR) user interactions are used as a source of supervision. Since user interactions come with bias, an important focus of research in this field lies in developing methods to correct for the bias of interactions. Inverse propensity scoring (IPS) is a popular method suitable for correcting position bias. Affine correction (AC) is a generalization of IPS that cor…

    Submitted 19 August, 2021; originally announced August 2021.

    Comments: CIKM 2021

  34. arXiv:2108.03708  [pdf, other]

    quant-ph cs.ET

    Detecting Qubit-coupling Faults in Ion-trap Quantum Computers

    Authors: Andrii O. Maksymov, Jason Nguyen, Vandiver Chaplin, Yunseong Nam, Igor L. Markov

    Abstract: Ion-trap quantum computers offer a large number of possible qubit couplings, each of which requires individual calibration and can be misconfigured. To enhance the duty cycle of an ion trap, we develop a strategy that diagnoses individual miscalibrated couplings using only log-many tests. This strategy is validated on a commercial ion-trap quantum computer, where we illustrate the process of debug…

    Submitted 12 December, 2021; v1 submitted 8 August, 2021; originally announced August 2021.

    Journal ref: HPCA 2022

  35. arXiv:2108.01521  [pdf, other]

    cs.CR cs.DS

    Bit-efficient Numerical Aggregation and Stronger Privacy for Trust in Federated Analytics

    Authors: Graham Cormode, Igor L. Markov

    Abstract: Private data generated by edge devices -- from smart phones to automotive electronics -- are highly informative when aggregated but can be damaging when mishandled. A variety of solutions are being explored but have not yet won the public's trust and full backing of mobile platforms. In this work, we propose numerical aggregation protocols that empirically improve upon prior art, while providing c…

    Submitted 3 August, 2021; originally announced August 2021.

    Comments: 15 pages

  36. arXiv:2104.13818   

    cs.LG math.OC stat.ML

    NUQSGD: Provably Communication-efficient Data-parallel SGD via Nonuniform Quantization

    Authors: Ali Ramezani-Kebrya, Fartash Faghri, Ilya Markov, Vitalii Aksenov, Dan Alistarh, Daniel M. Roy

    Abstract: As the size and complexity of models and datasets grow, so does the need for communication-efficient variants of stochastic gradient descent that can be deployed to perform parallel model training. One popular communication-compression method for data-parallel SGD is QSGD (Alistarh et al., 2017), which quantizes and encodes gradients to reduce communication costs. The baseline variant of QSGD prov…

    Submitted 1 May, 2021; v1 submitted 28 April, 2021; originally announced April 2021.

    Comments: This entry is redundant and was created in error. See arXiv:1908.06077 for the latest version

  37. arXiv:2102.09507  [pdf, ps, other]

    cs.CL cs.LG cs.SI

    Regular Expressions for Fast-response COVID-19 Text Classification

    Authors: Igor L. Markov, Jacqueline Liu, Adam Vagner

    Abstract: Text classifiers are at the core of many NLP applications and use a variety of algorithmic approaches and software. This paper introduces infrastructure and methodologies for text classifiers based on large-scale regular expressions. In particular, we describe how Facebook determines if a given piece of text - anything from a hashtag to a post - belongs to a narrow topic such as COVID-19. To fully…

    Submitted 21 June, 2021; v1 submitted 18 February, 2021; originally announced February 2021.

    Comments: 10 pages, 7 tables

  38. arXiv:2102.08465  [pdf, other]

    cs.SI cs.DL cs.IR cs.LG

    Prioritizing Original News on Facebook

    Authors: Xiuyan Ni, Shujian Bu, Igor L. Markov

    Abstract: This work outlines how we prioritize original news, a critical indicator of news quality. By examining the landscape and life-cycle of news posts on our social media platform, we identify challenges of building and deploying an originality score. We pursue an approach based on normalized PageRank values and three-step clustering, and refresh the score on an hourly basis to capture the dynamics of…

    Submitted 14 March, 2021; v1 submitted 16 February, 2021; originally announced February 2021.

    Comments: 9 pages, 8 figures, 6 tables, 2 algorithm pseudocodes

    Journal ref: CIKM 2021

  39. arXiv:2102.05612  [pdf, other]

    cs.LG cs.HC cs.SE

    Personalization for Web-based Services using Offline Reinforcement Learning

    Authors: Pavlos Athanasios Apostolopoulos, Zehui Wang, Hanson Wang, Chad Zhou, Kittipat Virochsiri, Norm Zhou, Igor L. Markov

    Abstract: Large-scale Web-based services present opportunities for improving UI policies based on observed user interactions. We address challenges of learning such policies through model-free offline Reinforcement Learning (RL) with off-policy training. Deployed in a production system for user authentication in a major social network, it significantly improves long-term objectives. We articulate practical…

    Submitted 10 February, 2021; originally announced February 2021.

    Comments: 9 pages, 8 figures, 3 tables

    Journal ref: 2nd Offline Reinforcement Learning Workshop at NeurIPS 2021

  40. As Accurate as Needed, as Efficient as Possible: Approximations in DD-based Quantum Circuit Simulation

    Authors: Stefan Hillmich, Richard Kueng, Igor L. Markov, Robert Wille

    Abstract: Quantum computers promise to solve important problems faster than conventional computers. However, unleashing this power has been challenging. In particular, design automation runs into (1) the probabilistic nature of quantum computation and (2) exponential requirements for computational resources on non-quantum hardware. In quantum circuit simulation, Decision Diagrams (DDs) have previously shown…

    Submitted 10 December, 2020; originally announced December 2020.

    Comments: 6 pages, 2 figures, to be published at Design, Automation, and Test in Europe 2021

  41. arXiv:2010.12460  [pdf, other]

    cs.LG stat.ML

    Adaptive Gradient Quantization for Data-Parallel SGD

    Authors: Fartash Faghri, Iman Tabrizian, Ilia Markov, Dan Alistarh, Daniel Roy, Ali Ramezani-Kebrya

    Abstract: Many communication-efficient variants of SGD use gradient quantization schemes. These schemes are often heuristic and fixed over the course of training. We empirically observe that the statistics of gradients of deep models change during the training. Motivated by this observation, we introduce two adaptive quantization schemes, ALQ and AMQ. In both schemes, processors update their compression sch…

    Submitted 23 October, 2020; originally announced October 2020.

    Comments: Accepted at the conference on Neural Information Processing Systems (NeurIPS 2020)

  42. arXiv:2008.00216  [pdf, other]

    quant-ph cs.AR cs.DC cs.ET physics.comp-ph

    Faster Schrödinger-style simulation of quantum circuits

    Authors: Aneeqa Fatima, Igor L. Markov

    Abstract: Recent demonstrations of superconducting quantum computers by Google and IBM and trapped-ion computers from IonQ fueled new research in quantum algorithms, compilation into quantum circuits, and empirical algorithmics. While online access to quantum hardware remains too limited to meet the demand, simulating quantum circuits on conventional computers satisfies many needs. We advance Schrödinger-st…

    Submitted 24 November, 2020; v1 submitted 1 August, 2020; originally announced August 2020.

    Comments: 14 pages, 15 figures, 4 tables. Version 2: Additional optimizations; improved simulation runtimes; profiling data; comparisons with the latest IBM QISKit simulator; dispelled apparent limitations of techniques. Version 3: Ablation experiments and images for the code snippets

    Journal ref: HPCA 2021

  43. Just Like the Real Thing: Fast Weak Simulation of Quantum Computation

    Authors: Stefan Hillmich, Igor L. Markov, Robert Wille

    Abstract: Quantum computers promise significant speedups in solving problems intractable for conventional computers but, despite recent progress, remain limited in scaling and availability. Therefore, quantum software and hardware development heavily rely on simulation that runs on conventional computers. Most such approaches perform strong simulation in that they explicitly compute amplitudes of quantum st…

    Submitted 30 July, 2020; originally announced July 2020.

    Comments: 6 pages, 4 figures

    Journal ref: Design Automation Conference (DAC) 2020

  44. Cascade Model-based Propensity Estimation for Counterfactual Learning to Rank

    Authors: Ali Vardasbi, Maarten de Rijke, Ilya Markov

    Abstract: Unbiased CLTR requires click propensities to compensate for the difference between user clicks and true relevance of search results via IPS. Current propensity estimation methods assume that user click behavior follows the PBM and estimate click propensities based on this assumption. However, in reality, user clicks often follow the CM, where users scan search results from top to bottom and where…

    Submitted 25 May, 2020; originally announced May 2020.

    Comments: 4 pages, 2 figures, 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '20)

  45. arXiv:2005.01588  [pdf]

    cs.CY

    Workshops on Extreme Scale Design Automation (ESDA) Challenges and Opportunities for 2025 and Beyond

    Authors: R. Iris Bahar, Alex K. Jones, Srinivas Katkoori, Patrick H. Madden, Diana Marculescu, Igor L. Markov

    Abstract: Integrated circuits and electronic systems, as well as design technologies, are evolving at a great rate -- both quantitatively and qualitatively. Major developments include new interconnects and switching devices with atomic-scale uncertainty, the depth and scale of on-chip integration, electronic system-level integration, the increasing significance of software, as well as more effective means o…

    Submitted 4 May, 2020; originally announced May 2020.

    Comments: A Computing Community Consortium (CCC) workshop report, 32 pages

    Report number: ccc2014report_1

  46. Approximation of Quantum States Using Decision Diagrams

    Authors: Alwin Zulehner, Stefan Hillmich, Igor L. Markov, Robert Wille

    Abstract: The computational power of quantum computers poses major challenges to new design tools since representing pure quantum states typically requires exponentially large memory. As shown previously, decision diagrams can reduce these memory requirements by exploiting redundancies. In this work, we demonstrate further reductions by allowing for small inaccuracies in the quantum state representation. Su…

    Submitted 12 February, 2020; originally announced February 2020.

    Journal ref: Asia and South Pacific Design Automation Conference 2020

  47. arXiv:2002.00467  [pdf, other]

    cs.IR cs.LG

    Safe Exploration for Optimizing Contextual Bandits

    Authors: Rolf Jagerman, Ilya Markov, Maarten de Rijke

    Abstract: Contextual bandit problems are a natural fit for many information retrieval tasks, such as learning to rank, text classification, recommendation, etc. However, existing learning methods for contextual bandit problems have one of two drawbacks: they either do not explore the space of all possible document rankings (i.e., actions) and, thus, may miss the optimal ranking, or they present suboptimal r…

    Submitted 2 February, 2020; originally announced February 2020.

    Comments: 23 pages, 3 figures

  48. arXiv:2001.05918  [pdf, other]

    cs.LG stat.ML

    Elastic Consistency: A General Consistency Model for Distributed Stochastic Gradient Descent

    Authors: Giorgi Nadiradze, Ilia Markov, Bapi Chatterjee, Vyacheslav Kungurtsev, Dan Alistarh

    Abstract: Machine learning has made tremendous progress in recent years, with models matching or even surpassing humans on a series of specialized tasks. One key element behind the progress of machine learning in recent years has been the ability to train machine learning models in large-scale distributed shared-memory and message-passing environments. Many of these models are trained employing variants of…

    Submitted 28 June, 2020; v1 submitted 16 January, 2020; originally announced January 2020.

  49. arXiv:1912.07263  [pdf, other]

    physics.acc-ph physics.plasm-ph

    Focusing of Drive and Test Bunches in a Dielectric Waveguide Filled with Inhomogeneous Plasma

    Authors: G. V. Sotnikov, P. I. Markov, I. N. Onishchenko

    Abstract: The paper presents the results of numerical PIC-simulation of accelerated and drive bunches dynamics in a dielectric waveguide filled with radially inhomogeneous plasma. The wakefield was excited by the electron bunch in a quartz (permittivity 3.75) dielectric tube with outer and inner diameters of 1.2 mm and 1.0 mm, respectively, which was nested into a cylindrical metallic waveguide. The drive b…

    Submitted 16 December, 2019; originally announced December 2019.

    Comments: 7 pages, 8 figures

  50. arXiv:1908.06077  [pdf, other]

    cs.LG stat.ML

    NUQSGD: Provably Communication-efficient Data-parallel SGD via Nonuniform Quantization

    Authors: Ali Ramezani-Kebrya, Fartash Faghri, Ilya Markov, Vitalii Aksenov, Dan Alistarh, Daniel M. Roy

    Abstract: As the size and complexity of models and datasets grow, so does the need for communication-efficient variants of stochastic gradient descent that can be deployed to perform parallel model training. One popular communication-compression method for data-parallel SGD is QSGD (Alistarh et al., 2017), which quantizes and encodes gradients to reduce communication costs. The baseline variant of QSGD prov…

    Submitted 3 May, 2021; v1 submitted 16 August, 2019; originally announced August 2019.

    Comments: 42 pages, 21 figures. To appear in the Journal of Machine Learning Research (JMLR)