Search | arXiv e-print repository

Extreme Multi-label Completion for Semantic Document Labelling with Taxonomy-Aware Parallel Learning

Authors: Julien Audiffren, Christophe Broillet, Ljiljana Dolamic, Philippe Cudré-Mauroux

Abstract: In Extreme Multi Label Completion (XMLCo), the objective is to predict the missing labels of a collection of documents. Together with XML Classification, XMLCo is arguably one of the most challenging document classification tasks, as the very high number of labels (at least tens of thousands) is generally very large compared to the number of available labelled documents in the training dataset. Such a task is often accompanied by a taxonomy that encodes the labels' organic relationships, and many methods have been proposed to leverage this hierarchy to improve the results of XMLCo algorithms. In this paper, we propose a new approach to this problem, TAMLEC (Taxonomy-Aware Multi-task Learning for Extreme multi-label Completion). TAMLEC divides the problem into several Taxonomy-Aware Tasks, i.e. subsets of labels adapted to the hierarchical paths of the taxonomy, and trains on these tasks using a dynamic Parallel Feature sharing approach, where some parts of the model are shared between tasks while others are task-specific. Then, at inference time, TAMLEC uses the labels available in a document to infer the appropriate tasks and to predict missing labels. To achieve this result, TAMLEC uses a modified transformer architecture that predicts ordered sequences of labels on a Weak-Semilattice structure that is naturally induced by the tasks. This approach yields multiple advantages. First, our experiments on real-world datasets show that TAMLEC outperforms state-of-the-art methods for various XMLCo problems. Second, TAMLEC is by construction particularly suited for few-shot XML tasks, where new tasks or labels are introduced with only a few examples, and extensive evaluations highlight its strong performance compared to existing methods.

Submitted 18 December, 2024; originally announced December 2024.

arXiv:2411.12473 [pdf, other]

NMT-Obfuscator Attack: Ignore a sentence in translation with only one word

Authors: Sahar Sadrizadeh, César Descalzo, Ljiljana Dolamic, Pascal Frossard

Abstract: Neural Machine Translation systems are used in diverse applications due to their impressive performance. However, recent studies have shown that these systems are vulnerable to carefully crafted small perturbations to their inputs, known as adversarial attacks. In this paper, we propose a new type of adversarial attack against NMT models. In this attack, we find a word to be added between two sentences such that the second sentence is ignored and not translated by the NMT model. The word added between the two sentences is such that the whole adversarial text is natural in the source language. This type of attack can be harmful in practical scenarios since the attacker can hide malicious information in the automatic translation made by the target NMT model. Our experiments show that different NMT models and translation tasks are vulnerable to this type of attack. Our attack can successfully force the NMT models to ignore the second part of the input in the translation for more than 50% of all cases while being able to maintain low perplexity for the whole input.

Submitted 19 November, 2024; originally announced November 2024.

arXiv:2410.04147 [pdf, other]

Can the Variation of Model Weights be used as a Criterion for Self-Paced Multilingual NMT?

Authors: Àlex R. Atrio, Alexis Allemann, Ljiljana Dolamic, Andrei Popescu-Belis

Abstract: Many-to-one neural machine translation systems improve over one-to-one systems when training data is scarce. In this paper, we design and test a novel algorithm for selecting the language of minibatches when training such systems. The algorithm changes the language of the minibatch when the weights of the model do not evolve significantly, as measured by the smoothed KL divergence between all layers of the Transformer network. This algorithm outperforms the use of alternating monolingual batches, but not the use of shuffled batches, in terms of translation quality (measured with BLEU and COMET) and convergence speed.

Submitted 5 October, 2024; originally announced October 2024.

arXiv:2409.03291 [pdf, other]

LLM Detectors Still Fall Short of Real World: Case of LLM-Generated Short News-Like Posts

Authors: Henrique Da Silva Gameiro, Andrei Kucharavy, Ljiljana Dolamic

Abstract: With the emergence of widely available powerful LLMs, disinformation generated by Large Language Models (LLMs) has become a major concern. Historically, LLM detectors have been touted as a solution, but their effectiveness in the real world is still to be proven. In this paper, we focus on an important setting in information operations -- short news-like posts generated by moderately sophisticated attackers. We demonstrate that existing LLM detectors, whether zero-shot or purpose-trained, are not ready for real-world use in that setting. All tested zero-shot detectors perform inconsistently with prior benchmarks and are highly vulnerable to sampling temperature increase, a trivial attack absent from recent benchmarks. A purpose-trained detector generalizing across LLMs and unseen attacks can be developed, but it fails to generalize to new human-written texts. We argue that the former indicates domain-specific benchmarking is needed, while the latter suggests a trade-off between adversarial evasion resilience and overfitting to the reference human text, both of which need evaluation in benchmarks and are currently absent. We believe this suggests a re-consideration of current LLM detector benchmarking approaches, and we provide a dynamically extensible benchmark to allow it (https://github.com/Reliable-Information-Lab-HEVS/benchmark_llm_texts_detection).

Submitted 27 September, 2024; v1 submitted 5 September, 2024; originally announced September 2024.

Comments: 20 pages, 7 tables, 13 figures, under consideration for EMNLP

ACM Class: I.2.7; K.6.5

arXiv:2407.18251 [pdf, other]

Sparse vs Contiguous Adversarial Pixel Perturbations in Multimodal Models: An Empirical Analysis

Authors: Cristian-Alexandru Botocan, Raphael Meier, Ljiljana Dolamic

Abstract: Assessing the robustness of multimodal models against adversarial examples is an important aspect for the safety of their users. We craft L0-norm perturbation attacks on the preprocessed input images. We launch them in a black-box setup against four multimodal models and two unimodal DNNs, considering both targeted and untargeted misclassification. Our attacks target less than 0.04% of perturbed image area and integrate different spatial positioning of perturbed pixels: sparse positioning and pixels arranged in different contiguous shapes (row, column, diagonal, and patch). To the best of our knowledge, we are the first to assess the robustness of three state-of-the-art multimodal models (ALIGN, AltCLIP, GroupViT) against different sparse and contiguous pixel distribution perturbations. The obtained results indicate that unimodal DNNs are more robust than multimodal models. Furthermore, models using a CNN-based image encoder are more vulnerable than models with ViT: for untargeted attacks, we obtain a 99% success rate by perturbing less than 0.02% of the image area.

Submitted 25 July, 2024; originally announced July 2024.

ACM Class: I.2.0; I.4.0

arXiv:2406.14986 [pdf, ps, other]

Do Large Language Models Exhibit Cognitive Dissonance? Studying the Difference Between Revealed Beliefs and Stated Answers

Authors: Manuel Mondal, Ljiljana Dolamic, Gérôme Bovet, Philippe Cudré-Mauroux, Julien Audiffren

Abstract: Multiple Choice Questions (MCQ) have become a commonly used approach to assess the capabilities of Large Language Models (LLMs), due to their ease of manipulation and evaluation. The experimental appraisals of the LLMs' Stated Answer (their answer to MCQ) have pointed to their apparent ability to perform probabilistic reasoning or to grasp uncertainty. In this work, we investigate whether these aptitudes are measurable outside tailored prompting and MCQ by reformulating these issues as direct text-completion - the fundamental computational unit of LLMs. We introduce Revealed Belief, an evaluation framework that evaluates LLMs on tasks requiring reasoning under uncertainty, which complements MCQ scoring by analyzing text-completion probability distributions. Our findings suggest that while LLMs frequently state the correct answer, their Revealed Belief shows that they often allocate probability mass inconsistently, exhibit systematic biases, and often fail to update their beliefs appropriately when presented with new evidence, leading to strong potential impacts on downstream tasks. These results suggest that common evaluation methods may only provide a partial picture and that more research is needed to assess the extent and nature of their capabilities.

Submitted 17 June, 2025; v1 submitted 21 June, 2024; originally announced June 2024.

arXiv:2308.15246 [pdf, other]

A Classification-Guided Approach for Adversarial Attacks against Neural Machine Translation

Authors: Sahar Sadrizadeh, Ljiljana Dolamic, Pascal Frossard

Abstract: Neural Machine Translation (NMT) models have been shown to be vulnerable to adversarial attacks, wherein carefully crafted perturbations of the input can mislead the target model. In this paper, we introduce ACT, a novel adversarial attack framework against NMT systems guided by a classifier. In our attack, the adversary aims to craft meaning-preserving adversarial examples whose translations in the target language by the NMT model belong to a different class than the original translations. Unlike previous attacks, our new approach has a more substantial effect on the translation by altering the overall meaning, which then leads to a different class determined by an oracle classifier. To evaluate the robustness of NMT models to our attack, we propose enhancements to existing black-box word-replacement-based attacks by incorporating output translations of the target NMT model and the output logits of a classifier within the attack process. Extensive experiments, including a comparison with existing untargeted attacks, show that our attack is considerably more successful in altering the class of the output translation and has more effect on the translation. This new paradigm can reveal the vulnerabilities of NMT systems by focusing on the class of translation rather than the mere translation quality as studied traditionally.

Submitted 22 February, 2024; v1 submitted 29 August, 2023; originally announced August 2023.

arXiv:2306.09991 [pdf, other]

Evolutionary Algorithms in the Light of SGD: Limit Equivalence, Minima Flatness, and Transfer Learning

Authors: Andrei Kucharavy, Rachid Guerraoui, Ljiljana Dolamic

Abstract: Whenever applicable, the Stochastic Gradient Descent (SGD) has shown itself to be unreasonably effective. Instead of underperforming and getting trapped in local minima due to the batch noise, SGD leverages it to learn to generalize better and find minima that are good enough for the entire dataset. This led to numerous theoretical and experimental investigations, especially in the context of Artificial Neural Networks (ANNs), leading to better machine learning algorithms. However, SGD is not applicable in a non-differentiable setting, leaving all that prior research off the table. In this paper, we show that a class of evolutionary algorithms (EAs) inspired by the Gillespie-Orr Mutational Landscapes model for natural evolution is formally equivalent to SGD in certain settings and, in practice, is well adapted to large ANNs. We refer to such EAs as Gillespie-Orr EA class (GO-EAs) and empirically show how an insight transfer from SGD can work for them. We then show that for ANNs trained to near-optimality or in the transfer learning setting, the equivalence also allows transferring the insights from the Mutational Landscapes model to SGD. We then leverage this equivalence to experimentally show how SGD and GO-EAs can provide mutual insight through examples of minima flatness, transfer learning, and mixing of individuals in EAs applied to large models.

Submitted 20 May, 2023; originally announced June 2023.

Comments: To be published in ALIFE 2023; 16 pages, 10 figures, 1 listing

ACM Class: I.2.8; G.1.6

arXiv:2306.08492 [pdf, other]

A Relaxed Optimization Approach for Adversarial Attacks against Neural Machine Translation Models

Authors: Sahar Sadrizadeh, Clément Barbier, Ljiljana Dolamic, Pascal Frossard

Abstract: In this paper, we propose an optimization-based adversarial attack against Neural Machine Translation (NMT) models. First, we propose an optimization problem to generate adversarial examples that are semantically similar to the original sentences but destroy the translation generated by the target NMT model. This optimization problem is discrete, and we propose a continuous relaxation to solve it. With this relaxation, we find a probability distribution for each token in the adversarial example, and then we can generate multiple adversarial examples by sampling from these distributions. Experimental results show that our attack significantly degrades the translation quality of multiple NMT models while maintaining the semantic similarity between the original and adversarial sentences. Furthermore, our attack outperforms the baselines in terms of success rate, similarity preservation, effect on translation quality, and token error rate. Finally, we propose a black-box extension of our attack by sampling from an optimized probability distribution for a reference model whose gradients are accessible.

Submitted 14 June, 2023; originally announced June 2023.

arXiv:2306.01393 [pdf, other]

Assessing the Importance of Frequency versus Compositionality for Subword-based Tokenization in NMT

Authors: Benoist Wolleb, Romain Silvestri, Giorgos Vernikos, Ljiljana Dolamic, Andrei Popescu-Belis

Abstract: Subword tokenization is the de facto standard for tokenization in neural language models and machine translation systems. Three advantages are frequently cited in favor of subwords: shorter encoding of frequent tokens, compositionality of subwords, and ability to deal with unknown words. As their relative importance is not entirely clear yet, we propose a tokenization approach that enables us to separate frequency (the first advantage) from compositionality. The approach uses Huffman coding to tokenize words, by order of frequency, using a fixed number of symbols. Experiments with CS-DE, EN-FR and EN-DE NMT show that frequency alone accounts for 90%-95% of the scores reached by BPE, hence compositionality has less importance than previously thought.

Submitted 12 January, 2024; v1 submitted 2 June, 2023; originally announced June 2023.

Comments: Accepted at EAMT 2023

arXiv:2304.13540 [pdf, ps, other]

Byzantine-Resilient Learning Beyond Gradients: Distributing Evolutionary Search

Authors: Andrei Kucharavy, Matteo Monti, Rachid Guerraoui, Ljiljana Dolamic

Abstract: Modern machine learning (ML) models are capable of impressive performances. However, their prowess is not due only to the improvements in their architecture and training algorithms but also to a drastic increase in computational power used to train them. Such a drastic increase led to a growing interest in distributed ML, which in turn made worker failures and adversarial attacks an increasingly pressing concern. While distributed byzantine resilient algorithms have been proposed in a differentiable setting, none exist in a gradient-free setting. The goal of this work is to address this shortcoming. For that, we introduce a more general definition of byzantine-resilience in ML - the model-consensus, that extends the definition of the classical distributed consensus. We then leverage this definition to show that a general class of gradient-free ML algorithms - (1,λ)-Evolutionary Search - can be combined with classical distributed consensus algorithms to generate gradient-free byzantine-resilient distributed learning algorithms. We provide proofs and pseudo-code for two specific cases - the Total Order Broadcast and proof-of-work leader election.

Submitted 20 April, 2023; originally announced April 2023.

Comments: 10 pages, 4 listings, 2 theorems

ACM Class: I.2.11; D.1.3; F.1.2

arXiv:2303.12132 [pdf, other]

Fundamentals of Generative Large Language Models and Perspectives in Cyber-Defense

Authors: Andrei Kucharavy, Zachary Schillaci, Loïc Maréchal, Maxime Würsch, Ljiljana Dolamic, Remi Sabonnadiere, Dimitri Percia David, Alain Mermoud, Vincent Lenders

Abstract: Generative Language Models gained significant attention in late 2022 / early 2023, notably with the introduction of models refined to act consistently with users' expectations of interactions with AI (conversational models). Arguably the focal point of public attention has been such a refinement of the GPT3 model -- the ChatGPT and its subsequent integration with auxiliary capabilities, including search as part of Microsoft Bing. Despite extensive prior research invested in their development, their performance and applicability to a range of daily tasks remained unclear and niche. However, their wider utilization without a requirement for technical expertise, made in large part possible through conversational fine-tuning, revealed the extent of their true capabilities in a real-world environment. This has garnered both public excitement for their potential applications and concerns about their capabilities and potential malicious uses. This review aims to provide a brief overview of the history, state of the art, and implications of Generative Language Models in terms of their principles, abilities, limitations, and future prospects -- especially in the context of cyber-defense, with a focus on the Swiss operational environment.

Submitted 21 March, 2023; originally announced March 2023.

Comments: 41 pages (without references), 13 figures; public report of Cyber-Defence Campus

ACM Class: I.2.7; I.2.1; K.6.5; K.4.2; J.7

arXiv:2303.01068 [pdf, other]

Targeted Adversarial Attacks against Neural Machine Translation

Authors: Sahar Sadrizadeh, AmirHossein Dabiri Aghdam, Ljiljana Dolamic, Pascal Frossard

Abstract: Neural Machine Translation (NMT) systems are used in various applications. However, it has been shown that they are vulnerable to very small perturbations of their inputs, known as adversarial attacks. In this paper, we propose a new targeted adversarial attack against NMT models. In particular, our goal is to insert a predefined target keyword into the translation of the adversarial sentence while maintaining similarity between the original sentence and the perturbed one in the source domain. To this aim, we propose an optimization problem, including an adversarial loss term and a similarity term. We use gradient projection in the embedding space to craft an adversarial sentence. Experimental results show that our attack outperforms Seq2Sick, the other targeted adversarial attack against NMT models, in terms of success rate and decrease in translation quality. Our attack succeeds in inserting a keyword into the translation for more than 75% of sentences while similarity with the original sentence is preserved.

Submitted 2 March, 2023; originally announced March 2023.

Comments: ICASSP 2023, Code available at: http://github.com/sssadrizadeh/NMT-targeted-attack

arXiv:2302.00944 [pdf, other]

TransFool: An Adversarial Attack against Neural Machine Translation Models

Authors: Sahar Sadrizadeh, Ljiljana Dolamic, Pascal Frossard

Abstract: Deep neural networks have been shown to be vulnerable to small perturbations of their inputs, known as adversarial attacks. In this paper, we investigate the vulnerability of Neural Machine Translation (NMT) models to adversarial attacks and propose a new attack algorithm called TransFool. To fool NMT models, TransFool builds on a multi-term optimization problem and a gradient projection step. By integrating the embedding representation of a language model, we generate fluent adversarial examples in the source language that maintain a high level of semantic similarity with the clean samples. Experimental results demonstrate that, for different translation tasks and NMT architectures, our white-box attack can severely degrade the translation quality while the semantic similarity between the original and the adversarial sentences stays high. Moreover, we show that TransFool is transferable to unknown target models. Finally, based on automatic and human evaluations, TransFool leads to improvement in terms of success rate, semantic similarity, and fluency compared to the existing attacks both in white-box and black-box settings. Thus, TransFool permits us to better characterize the vulnerability of NMT models and outlines the necessity to design strong defense mechanisms and more robust NMT systems for real-life applications.

Submitted 16 June, 2023; v1 submitted 2 February, 2023; originally announced February 2023.

arXiv:2209.10224 [pdf, other]

Identifying Emerging Technologies and Leading Companies using Network Dynamics of Patent Clusters: a Cybersecurity Case Study

Authors: Michael Tsesmelis, Ljiljana Dolamic, Marcus Matthias Keupp, Dimitri Percia David, Alain Mermoud

Abstract: Strategic decisions rely heavily on non-scientific instrumentation to forecast emerging technologies and leading companies. Instead, we build a fast quantitative system with a small computational footprint to discover the most important technologies and companies in a given field, using generalisable methods applicable to any industry. With the help of patent data from the US Patent and Trademark Office, we first assign a value to each patent thanks to automated machine learning tools. We then apply network science to track the interaction and evolution of companies and clusters of patents (i.e. technologies) to create rankings for both sets that highlight important or emerging network nodes thanks to five network centrality indices. Finally, we illustrate our system with a case study based on the cybersecurity industry. Our results produce useful insights, for instance by highlighting (i) emerging technologies with a growing mean patent value and cluster size, (ii) the most influential companies in the field and (iii) attractive startups with few but impactful patents. Complementary analysis also provides evidence of decreasing marginal returns of research and development in larger companies in the cybersecurity industry.

Submitted 21 September, 2022; originally announced September 2022.

Comments: 24 pages, 8 figures

arXiv:2206.00282 [pdf, other]

Needle In A Haystack, Fast: Benchmarking Image Perceptual Similarity Metrics At Scale

Authors: Cyril Vallez, Andrei Kucharavy, Ljiljana Dolamic

Abstract: The advent of the internet, followed shortly by social media, made consuming and sharing information ubiquitous for anyone with access to it. The evolution in the consumption of media driven by this change led to the emergence of images as a means to express oneself, convey information and convince others efficiently. With computer vision algorithms progressing radically over the last decade, it has become easier and easier to study at scale the role of images in the flow of information online. While the research questions and overall pipelines differ radically, almost all start with a crucial first step - evaluation of global perceptual similarity between different images. That initial step is crucial for overall pipeline performance and processes most images. A number of algorithms are available and currently used to perform it, but so far no comprehensive review was available to guide researchers in choosing an algorithm best suited to their question, assumptions and computational resources. With this paper we aim to fill this gap, showing that classical computer vision methods are not necessarily the best approach, whereas a pair of relatively little used methods - Dhash perceptual hash and SimCLR v2 ResNets - achieve excellent performance, scale well and are computationally efficient.

Submitted 1 June, 2022; originally announced June 2022.

Comments: 26 pages, 10 figures

ACM Class: H.3.1; I.4.10; I.4.7; I.5.5; I.5.4; K.4

arXiv:2203.05948 [pdf, other]

Block-Sparse Adversarial Attack to Fool Transformer-Based Text Classifiers

Authors: Sahar Sadrizadeh, Ljiljana Dolamic, Pascal Frossard

Abstract: Recently, it has been shown that, in spite of the significant performance of deep neural networks in different fields, they are vulnerable to adversarial examples. In this paper, we propose a gradient-based adversarial attack against transformer-based text classifiers. The adversarial perturbation in our method is constrained to be block-sparse so that the resultant adversarial example differs from the original sentence in only a few words. Due to the discrete nature of textual data, we perform gradient projection to find the minimizer of our proposed optimization problem. Experimental results demonstrate that, while our adversarial attack maintains the semantics of the sentence, it can reduce the accuracy of GPT-2 to less than 5% on different datasets (AG News, MNLI, and Yelp Reviews). Furthermore, the block-sparsity constraint of the proposed optimization problem results in small perturbations in the adversarial example.

Submitted 11 March, 2022; originally announced March 2022.

Comments: ICASSP 2022, Code available at: https://github.com/sssadrizadeh/transformer-text-classifier-attack

arXiv:2112.04810 [pdf, ps, other]

From Scattered Sources to Comprehensive Technology Landscape: A Recommendation-based Retrieval Approach

Authors: Chi Thang Duong, Dimitri Percia David, Ljiljana Dolamic, Alain Mermoud, Vincent Lenders, Karl Aberer

Abstract: Mapping the technology landscape is crucial for market actors to make informed investment decisions. However, given the large amount of data on the Web and its subsequent information overload, manually retrieving information is a seemingly ineffective and incomplete approach. In this work, we propose an end-to-end recommendation-based retrieval approach to support automatic retrieval of technologies and their associated companies from raw Web data. This is a two-task setup involving (i) technology classification of entities extracted from a company corpus, and (ii) technology and company retrieval based on classified technologies. Our proposed framework approaches the first task by leveraging DistilBERT, which is a state-of-the-art language model. For the retrieval task, we introduce a recommendation-based retrieval technique to simultaneously support retrieving related companies, technologies related to a specific company and companies relevant to a technology. To evaluate these tasks, we also construct a data set that includes company documents and entities extracted from these documents together with company categories and technology labels. Experiments show that our approach is able to return 4 times more relevant companies while outperforming a traditional retrieval baseline in retrieving technologies.

Submitted 9 December, 2021; originally announced December 2021.

Showing 1–18 of 18 results for author: Dolamic, L