A Manually Annotated Image-Caption Dataset for Detecting Children in the Wild

Authors: Klim Kireev, Ana-Maria Creţu, Raphael Meier, Sarah Adel Bargal, Elissa Redmiles, Carmela Troncoso

Abstract: Platforms and the law regulate digital content depicting minors (defined as individuals under 18 years of age) differently from other types of content. Given the sheer amount of content that needs to be assessed, machine learning-based automation tools are commonly used to detect content depicting minors. To our knowledge, no dataset or benchmark currently exists for detecting these identification methods in a multi-modal environment. To fill this gap, we release the Image-Caption Children in the Wild Dataset (ICCWD), an image-caption dataset aimed at benchmarking tools that detect depictions of minors. Our dataset is richer than previous child image datasets, containing images of children in a variety of contexts, including fictional depictions and partially visible bodies. ICCWD contains 10,000 image-caption pairs manually labeled to indicate the presence or absence of a child in the image. To demonstrate the possible utility of our dataset, we use it to benchmark three different detectors, including a commercial age estimation system applied to images. Our results suggest that child detection is a challenging task, with the best method achieving a 75.3% true positive rate. We hope the release of our dataset will aid in the design of better minor detection methods in a wide range of scenarios.

Submitted 11 June, 2025; originally announced June 2025.

Comments: 14 pages, 6 figures

arXiv:2506.09777

Inverting Black-Box Face Recognition Systems via Zero-Order Optimization in Eigenface Space

Authors: Anton Razzhigaev, Matvey Mikhalchuk, Klim Kireev, Igor Udovichenko, Andrey Kuznetsov, Aleksandr Petiushko

Abstract: Reconstructing facial images from black-box recognition models poses a significant privacy threat. While many methods require access to embeddings, we address the more challenging scenario of model inversion using only similarity scores. This paper introduces DarkerBB, a novel approach that reconstructs color faces by performing zero-order optimization within a PCA-derived eigenface space. Despite this highly limited information, experiments on LFW, AgeDB-30, and CFP-FP benchmarks demonstrate that DarkerBB achieves state-of-the-art verification accuracies in the similarity-only setting, with competitive query efficiency.

Submitted 11 June, 2025; originally announced June 2025.

arXiv:2406.08084

Characterizing and Detecting Propaganda-Spreading Accounts on Telegram

Authors: Klim Kireev, Yevhen Mykhno, Carmela Troncoso, Rebekah Overdorf

Abstract: Information-based attacks on social media, such as disinformation campaigns and propaganda, are emerging cybersecurity threats. The security community has focused on countering these threats on social media platforms like X and Reddit. However, they also appear in instant-messaging social media platforms such as WhatsApp, Telegram, and Signal. In these platforms information-based attacks primarily happen in groups and channels, requiring manual moderation efforts by channel administrators. We collect, label, and analyze a large dataset of more than 17 million Telegram comments and messages. Our analysis uncovers two independent, coordinated networks that spread pro-Russian and pro-Ukrainian propaganda, garnering replies from real users. We propose a novel mechanism for detecting propaganda that capitalizes on the relationship between legitimate user messages and propaganda replies and is tailored to the information that Telegram makes available to moderators. Our method is faster, cheaper, and has a detection rate (97.6%) 11.6 percentage points higher than human moderators after seeing only one message from an account. It remains effective despite evolving propaganda.

Submitted 12 June, 2024; originally announced June 2024.

arXiv:2306.04064

Transferable Adversarial Robustness for Categorical Data via Universal Robust Embeddings

Authors: Klim Kireev, Maksym Andriushchenko, Carmela Troncoso, Nicolas Flammarion

Abstract: Research on adversarial robustness is primarily focused on image and text data. Yet, many scenarios in which lack of robustness can result in serious risks, such as fraud detection, medical diagnosis, or recommender systems, often do not rely on images or text but instead on tabular data. Adversarial robustness in tabular data poses two serious challenges. First, tabular datasets often contain categorical features, and therefore cannot be tackled directly with existing optimization procedures. Second, in the tabular domain, algorithms that are not based on deep networks are widely used and offer great performance, but algorithms to enhance robustness are tailored to neural networks (e.g. adversarial training). In this paper, we tackle both challenges. We present a method that allows us to train adversarially robust deep networks for tabular data and to transfer this robustness to other classifiers via universal robust embeddings tailored to categorical data. These embeddings, created using a bilevel alternating minimization framework, can be transferred to boosted trees or random forests, making them robust without the need for adversarial training while preserving their high accuracy on tabular data. We show that our methods outperform existing techniques within a practical threat model suitable for tabular data.

Submitted 13 December, 2023; v1 submitted 6 June, 2023; originally announced June 2023.

arXiv:2208.13058

Adversarial Robustness for Tabular Data through Cost and Utility Awareness

Authors: Klim Kireev, Bogdan Kulynych, Carmela Troncoso

Abstract: Many safety-critical applications of machine learning, such as fraud or abuse detection, use data in tabular domains. Adversarial examples can be particularly damaging for these applications. Yet, existing works on adversarial robustness primarily focus on machine-learning models in image and text domains. We argue that, due to the differences between tabular data and images or text, existing threat models are not suitable for tabular domains. These models do not capture that the costs of an attack could be more significant than imperceptibility, or that the adversary could assign different values to the utility obtained from deploying different adversarial examples. We demonstrate that, due to these differences, the attack and defense methods used for images and text cannot be directly applied to tabular settings. We address these issues by proposing new cost and utility-aware threat models that are tailored to the adversarial capabilities and constraints of attackers targeting tabular domains. We introduce a framework that enables us to design attack and defense mechanisms that result in models protected against cost and utility-aware adversaries, for example, adversaries constrained by a certain financial budget. We show that our approach is effective on three datasets corresponding to applications for which adversarial examples can have economic and social implications.

Submitted 24 February, 2023; v1 submitted 27 August, 2022; originally announced August 2022.

Comments: The first two authors contributed equally. To appear in the proceedings of NDSS 2023

arXiv:2106.14290

Darker than Black-Box: Face Reconstruction from Similarity Queries

Authors: Anton Razzhigaev, Klim Kireev, Igor Udovichenko, Aleksandr Petiushko

Abstract: Several methods for inversion of face recognition models were recently presented, attempting to reconstruct a face from deep templates. Although some of these approaches work in a black-box setup using only face embeddings, usually, on the end-user side, only similarity scores are provided. Therefore, these algorithms are inapplicable in such scenarios. We propose a novel approach that allows reconstructing the face by querying only the similarity scores of the black-box model. While our algorithm operates in a more general setup, experiments show that it is query efficient and outperforms the existing methods.

Submitted 2 July, 2021; v1 submitted 27 June, 2021; originally announced June 2021.

arXiv:2103.02325

On the effectiveness of adversarial training against common corruptions

Authors: Klim Kireev, Maksym Andriushchenko, Nicolas Flammarion

Abstract: The literature on robustness towards common corruptions shows no consensus on whether adversarial training can improve the performance in this setting. First, we show that, when used with an appropriately selected perturbation radius, $\ell_p$ adversarial training can serve as a strong baseline against common corruptions improving both accuracy and calibration. Then we explain why adversarial training performs better than data augmentation with simple Gaussian noise which has been observed to be a meaningful baseline on common corruptions. Related to this, we identify the $σ$-overfitting phenomenon when Gaussian augmentation overfits to a particular standard deviation used for training which has a significant detrimental effect on common corruption accuracy. We discuss how to alleviate this problem and then how to further enhance $\ell_p$ adversarial training by introducing an efficient relaxation of adversarial training with learned perceptual image patch similarity as the distance metric. Through experiments on CIFAR-10 and ImageNet-100, we show that our approach does not only improve the $\ell_p$ adversarial training baseline but also has cumulative gains with data augmentation methods such as AugMix, DeepAugment, ANT, and SIN, leading to state-of-the-art performance on common corruptions. The code of our experiments is publicly available at https://github.com/tml-epfl/adv-training-corruptions.

Submitted 4 January, 2022; v1 submitted 3 March, 2021; originally announced March 2021.

Comments: New calibration results, more comprehensive experimental evaluation (e.g., new results with AugMix+JSD and DeepAugment)

arXiv:2007.13635

doi 10.1007/978-3-030-68238-5_34

Black-Box Face Recovery from Identity Features

Authors: Anton Razzhigaev, Klim Kireev, Edgar Kaziakhmedov, Nurislam Tursynbek, Aleksandr Petiushko

Abstract: In this work, we present a novel algorithm based on iterative sampling of random Gaussian blobs for black-box face recovery, given only an output feature vector of deep face recognition systems. We attack the state-of-the-art face recognition system (ArcFace) to test our algorithm. Another network with a different architecture (FaceNet) is used as an independent critic, showing that the target person can be identified with the reconstructed image even with no access to the attacked model. Furthermore, our algorithm requires significantly fewer queries than the state-of-the-art solution.

Submitted 30 July, 2020; v1 submitted 27 July, 2020; originally announced July 2020.

Journal ref: ECCV Workshops (5) 2020: 462-475

arXiv:1910.07067

doi 10.1109/SIBIRCON48586.2019.8958134

On adversarial patches: real-world attack on ArcFace-100 face recognition system

Authors: Mikhail Pautov, Grigorii Melnikov, Edgar Kaziakhmedov, Klim Kireev, Aleksandr Petiushko

Abstract: Recent works showed the vulnerability of image classifiers to adversarial attacks in the digital domain. However, the majority of attacks involve adding a small perturbation to an image to fool the classifier. Unfortunately, such procedures cannot be used to conduct a real-world attack, where adding an adversarial attribute to the photo is a more practical approach. In this paper, we study the problem of real-world attacks on face recognition systems. We examine the security of one of the best public face recognition systems, LResNet100E-IR with ArcFace loss, and propose a simple method to attack it in the physical world. The method suggests creating an adversarial patch that can be printed, added as a face attribute and photographed; the photo of a person with such an attribute is then passed to the classifier such that the classifier's recognized class changes from the correct one to the desired one. The proposed generation procedure allows projecting adversarial patches not only on different areas of the face, such as the nose or forehead, but also on some wearable accessory, such as eyeglasses.

Submitted 1 April, 2020; v1 submitted 15 October, 2019; originally announced October 2019.

Journal ref: 2019 International Multi-Conference on Engineering, Computer and Information Sciences (SIBIRCON)

arXiv:1910.06261

doi 10.1109/SIBIRCON48586.2019.8958122

Real-world adversarial attack on MTCNN face detection system

Authors: Edgar Kaziakhmedov, Klim Kireev, Grigorii Melnikov, Mikhail Pautov, Aleksandr Petiushko

Abstract: Recent studies proved that deep learning approaches achieve remarkable results on the face detection task. On the other hand, the advances gave rise to a new problem associated with the security of deep convolutional neural network models, unveiling potential risks of DCNN-based applications. Even minor input changes in the digital domain can result in the network being fooled. It was then shown that some deep learning-based face detectors are prone to adversarial attacks not only in the digital domain but also in the real world. In the paper, we investigate the security of the well-known cascade CNN face detection system, MTCNN, and introduce an easily reproducible and robust way to attack it. We propose different face attributes printed on an ordinary black-and-white printer and attached either to a medical face mask or to the face directly. Our approach is capable of breaking the MTCNN detector in a real-world scenario.

Submitted 2 April, 2020; v1 submitted 14 October, 2019; originally announced October 2019.

Journal ref: 2019 International Multi-Conference on Engineering, Computer and Information Sciences (SIBIRCON)

Showing 1–10 of 10 results for author: Kireev, K