-
Intrinsic and Extrinsic Factor Disentanglement for Recommendation in Various Context Scenarios
Authors:
Yixin Su,
Wei Jiang,
Fangquan Lin,
Cheng Yang,
Sarah M. Erfani,
Junhao Gan,
Yunxiang Zhao,
Ruixuan Li,
Rui Zhang
Abstract:
In recommender systems, the patterns of user behaviors (e.g., purchase, click) may vary greatly in different contexts (e.g., time and location). This is because user behavior is jointly determined by two types of factors: intrinsic factors, which reflect consistent user preference, and extrinsic factors, which reflect external incentives that may vary in different contexts. Differentiating between intrinsic and extrinsic factors helps learn user behaviors better. However, existing studies have only considered differentiating them from a single, pre-defined context (e.g., time or location), ignoring the fact that a user's extrinsic factors may be influenced by the interplay of various contexts at the same time. In this paper, we propose the Intrinsic-Extrinsic Disentangled Recommendation (IEDR) model, a generic framework that differentiates intrinsic from extrinsic factors considering various contexts simultaneously, enabling more accurate differentiation of factors and hence the improvement of recommendation accuracy. IEDR contains a context-invariant contrastive learning component to capture intrinsic factors, and a disentanglement component to extract extrinsic factors under the interplay of various contexts. The two components work together to achieve effective factor learning. Extensive experiments on real-world datasets demonstrate IEDR's effectiveness in learning disentangled factors and significantly improving recommendation accuracy by up to 4% in NDCG.
Submitted 15 March, 2025; v1 submitted 5 March, 2025;
originally announced March 2025.
-
CURVALID: Geometrically-guided Adversarial Prompt Detection
Authors:
Canaan Yung,
Hanxun Huang,
Sarah Monazam Erfani,
Christopher Leckie
Abstract:
Adversarial prompts capable of jailbreaking large language models (LLMs) and inducing undesirable behaviours pose a significant obstacle to their safe deployment. Current mitigation strategies rely on activating built-in defence mechanisms or fine-tuning the LLMs, but the fundamental distinctions between adversarial and benign prompts are yet to be understood. In this work, we introduce CurvaLID, a novel defense framework that efficiently detects adversarial prompts by leveraging their geometric properties. It is agnostic to the type of LLM, offering a unified detection framework across diverse adversarial prompts and LLM architectures. CurvaLID builds on the geometric analysis of text prompts to uncover their underlying differences. We theoretically extend the concept of curvature via the Whewell equation into an $n$-dimensional word embedding space, enabling us to quantify local geometric properties, including semantic shifts and curvature in the underlying manifolds. Additionally, we employ Local Intrinsic Dimensionality (LID) to capture geometric features of text prompts within adversarial subspaces. Our findings reveal that adversarial prompts differ fundamentally from benign prompts in terms of their geometric characteristics. Our results demonstrate that CurvaLID delivers superior detection and rejection of adversarial queries, paving the way for safer LLM deployment. The source code can be found at https://github.com/Cancanxxx/CurvaLID
Submitted 5 March, 2025;
originally announced March 2025.
-
Efficient Neural Implicit Representation for 3D Human Reconstruction
Authors:
Zexu Huang,
Sarah Monazam Erfani,
Siying Lu,
Mingming Gong
Abstract:
High-fidelity digital human representations are increasingly in demand in the digital world, particularly for interactive telepresence, AR/VR, 3D graphics, and the rapidly evolving metaverse. Even though they work well in small spaces, conventional methods for reconstructing 3D human motion frequently require the use of expensive hardware and have high processing costs. This study presents HumanAvatar, an innovative approach that efficiently reconstructs precise human avatars from monocular video sources. At the core of our methodology, we integrate the pre-trained HuMoR, a model celebrated for its proficiency in human motion estimation. This is adeptly fused with the cutting-edge neural radiance field technology, Instant-NGP, and the state-of-the-art articulated model, Fast-SNARF, to enhance the reconstruction fidelity and speed. By combining these two technologies, a system is created that can render quickly and effectively while also providing estimation of human pose parameters that are unmatched in accuracy. We have enhanced our system with an advanced posture-sensitive space reduction technique, which optimally balances rendering quality with computational efficiency. In our detailed experimental analysis using both artificial and real-world monocular videos, we establish the advanced performance of our approach. HumanAvatar consistently equals or surpasses contemporary leading-edge reconstruction techniques in quality. Furthermore, it achieves these complex reconstructions in minutes, a fraction of the time typically required by existing methods. Our models achieve a training speed that is 110X faster than that of State-of-The-Art (SoTA) NeRF-based models. Our technique performs noticeably better than SoTA dynamic human NeRF methods if given an identical runtime limit. HumanAvatar can provide effective visuals after only 30 seconds of training.
Submitted 23 October, 2024;
originally announced October 2024.
-
Be Persistent: Towards a Unified Solution for Mitigating Shortcuts in Deep Learning
Authors:
Hadi M. Dolatabadi,
Sarah M. Erfani,
Christopher Leckie
Abstract:
Deep neural networks (DNNs) are vulnerable to shortcut learning: rather than learning the intended task, they tend to draw inconclusive relationships between their inputs and outputs. Shortcut learning is ubiquitous among many failure cases of neural networks, and traces of this phenomenon can be seen in their generalizability issues, domain shift, adversarial vulnerability, and even bias towards majority groups. In this paper, we argue that this commonality in the cause of various DNN issues creates a significant opportunity that should be leveraged to find a unified solution for shortcut learning. To this end, we outline the recent advances in topological data analysis (TDA), and persistent homology (PH) in particular, to sketch a unified roadmap for detecting shortcuts in deep learning. We demonstrate our arguments by investigating the topological features of computational graphs in DNNs using two cases of unlearnable examples and bias in decision-making as our test studies. Our analysis of these two failure cases of DNNs reveals that finding a unified solution for shortcut learning in DNNs is not out of reach, and TDA can play a significant role in forming such a framework.
Submitted 26 August, 2024; v1 submitted 17 February, 2024;
originally announced February 2024.
-
Unlearnable Examples For Time Series
Authors:
Yujing Jiang,
Xingjun Ma,
Sarah Monazam Erfani,
James Bailey
Abstract:
Unlearnable examples (UEs) refer to training samples modified to be unlearnable to Deep Neural Networks (DNNs). These examples are usually generated by adding error-minimizing noises that can fool a DNN model into believing that there is nothing (no error) to learn from the data. The concept of UE has been proposed as a countermeasure against unauthorized data exploitation on personal data. While UE has been extensively studied on images, it is unclear how to craft effective UEs for time series data. In this work, we introduce the first UE generation method to protect time series data from unauthorized training by deep learning models. To this end, we propose a new form of error-minimizing noise that can be \emph{selectively} applied to specific segments of time series, rendering them unlearnable to DNN models while remaining imperceptible to human observers. Through extensive experiments on a wide range of time series datasets, we demonstrate that the proposed UE generation method is effective in both classification and generation tasks. It can protect time series data against unauthorized exploitation, while preserving their utility for legitimate usage, thereby contributing to the development of secure and trustworthy machine learning systems.
Submitted 2 February, 2024;
originally announced February 2024.
-
LDReg: Local Dimensionality Regularized Self-Supervised Learning
Authors:
Hanxun Huang,
Ricardo J. G. B. Campello,
Sarah Monazam Erfani,
Xingjun Ma,
Michael E. Houle,
James Bailey
Abstract:
Representations learned via self-supervised learning (SSL) can be susceptible to dimensional collapse, where the learned representation subspace is of extremely low dimensionality and thus fails to represent the full data distribution and modalities. Dimensional collapse, also known as the "underfilling" phenomenon, is one of the major causes of degraded performance on downstream tasks. Previous work has investigated the dimensional collapse problem of SSL at a global level. In this paper, we demonstrate that representations can span over high dimensional space globally, but collapse locally. To address this, we propose a method called $\textit{local dimensionality regularization (LDReg)}$. Our formulation is based on the derivation of the Fisher-Rao metric to compare and optimize local distance distributions at an asymptotically small radius for each data point. By increasing the local intrinsic dimensionality, we demonstrate through a range of experiments that LDReg improves the representation quality of SSL. The results also show that LDReg can regularize dimensionality at both local and global levels.
Submitted 14 March, 2024; v1 submitted 18 January, 2024;
originally announced January 2024.
-
End-to-End Anti-Backdoor Learning on Images and Time Series
Authors:
Yujing Jiang,
Xingjun Ma,
Sarah Monazam Erfani,
Yige Li,
James Bailey
Abstract:
Backdoor attacks present a substantial security concern for deep learning models, especially those utilized in applications critical to safety and security. These attacks manipulate model behavior by embedding a hidden trigger during the training phase, allowing unauthorized control over the model's output during inference time. Although numerous defenses exist for image classification models, there is a conspicuous absence of defenses tailored for time series data, as well as an end-to-end solution capable of training clean models on poisoned data. To address this gap, this paper builds upon Anti-Backdoor Learning (ABL) and introduces an innovative method, End-to-End Anti-Backdoor Learning (E2ABL), for robust training against backdoor attacks. Unlike the original ABL, which employs a two-stage training procedure, E2ABL accomplishes end-to-end training through an additional classification head linked to the shallow layers of a Deep Neural Network (DNN). This secondary head actively identifies potential backdoor triggers, allowing the model to dynamically cleanse these samples and their corresponding labels during training. Our experiments reveal that E2ABL significantly improves on existing defenses and is effective against a broad range of backdoor attacks in both image and time series domains.
Submitted 6 January, 2024;
originally announced January 2024.
-
The geometry of flow: Advancing predictions of river geometry with multi-model machine learning
Authors:
Shuyu Y Chang,
Zahra Ghahremani,
Laura Manuel,
Mohammad Erfani,
Chaopeng Shen,
Sagy Cohen,
Kimberly Van Meter,
Jennifer L Pierce,
Ehab A Meselhe,
Erfan Goharian
Abstract:
Hydraulic geometry parameters describing river hydrogeomorphology are important for flood forecasting. Although well-established power-law hydraulic geometry curves have been widely used to understand riverine systems and map flood inundation worldwide for the past 70 years, we have become increasingly aware of the limitations of these approaches. In the present study, we have moved beyond these traditional power-law relationships for river geometry, testing the ability of machine-learning models to provide improved predictions of river width and depth. For this work, we have used an unprecedentedly large river measurement dataset (HYDRoSWOT) as well as a suite of watershed predictor data to develop novel data-driven approaches to better estimate river geometries over the contiguous United States (CONUS). Our Random Forest, XGBoost, and neural network models outperformed the traditional, regionalized power law-based hydraulic geometry equations for both width and depth, providing R-squared values as high as 0.75 for width and as high as 0.67 for depth, compared with R-squared values of 0.57 for width and 0.18 for depth from the regional hydraulic geometry equations. Our results also show diverse performance outcomes across stream orders and geographical regions for the different machine-learning models, demonstrating the value of using multi-model approaches to maximize the predictability of river geometry. The developed models have been used to create the newly publicly available STREAM-geo dataset, which provides river width, depth, width/depth ratio, and river and stream surface area (%RSSA) for nearly 2.7 million NHDPlus stream reaches across the contiguous US.
Submitted 27 November, 2023;
originally announced December 2023.
-
It's Simplex! Disaggregating Measures to Improve Certified Robustness
Authors:
Andrew C. Cullen,
Paul Montague,
Shijie Liu,
Sarah M. Erfani,
Benjamin I. P. Rubinstein
Abstract:
Certified robustness circumvents the fragility of defences against adversarial attacks, by endowing model predictions with guarantees of class invariance for attacks up to a calculated size. While there is value in these certifications, the techniques through which we assess their performance do not present a proper accounting of their strengths and weaknesses, as their analysis has eschewed consideration of performance over individual samples in favour of aggregated measures. By considering the potential output space of certified models, this work presents two distinct approaches to improve the analysis of certification mechanisms, that allow for both dataset-independent and dataset-dependent measures of certification performance. Embracing such a perspective uncovers new certification approaches, which have the potential to more than double the achievable radius of certification, relative to current state-of-the-art. Empirical evaluation verifies that our new approach can certify $9\%$ more samples at noise scale $\sigma = 1$, with greater relative improvements observed as the difficulty of the predictive task increases.
Submitted 19 September, 2023;
originally announced September 2023.
-
Enhancing the Antidote: Improved Pointwise Certifications against Poisoning Attacks
Authors:
Shijie Liu,
Andrew C. Cullen,
Paul Montague,
Sarah M. Erfani,
Benjamin I. P. Rubinstein
Abstract:
Poisoning attacks can disproportionately influence model behaviour by making small changes to the training corpus. While defences against specific poisoning attacks do exist, they in general do not provide any guarantees, leaving them potentially countered by novel attacks. In contrast, by examining worst-case behaviours, Certified Defences make it possible to provide guarantees of the robustness of a sample against adversarial attacks modifying a finite number of training samples, known as pointwise certification. We achieve this by exploiting both Differential Privacy and the Sampled Gaussian Mechanism to ensure the invariance of prediction for each testing instance against finite numbers of poisoned examples. In doing so, our model provides guarantees of adversarial robustness that are more than twice as large as those provided by prior certifications.
Submitted 18 March, 2024; v1 submitted 14 August, 2023;
originally announced August 2023.
-
Towards quantum enhanced adversarial robustness in machine learning
Authors:
Maxwell T. West,
Shu-Lok Tsang,
Jia S. Low,
Charles D. Hill,
Christopher Leckie,
Lloyd C. L. Hollenberg,
Sarah M. Erfani,
Muhammad Usman
Abstract:
Machine learning algorithms are powerful tools for data-driven tasks such as image classification and feature detection; however, their vulnerability to adversarial examples - input samples manipulated to fool the algorithm - remains a serious challenge. The integration of machine learning with quantum computing has the potential to yield tools offering not only better accuracy and computational efficiency, but also superior robustness against adversarial attacks. Indeed, recent work has employed quantum mechanical phenomena to defend against adversarial attacks, spurring the rapid development of the field of quantum adversarial machine learning (QAML) and potentially yielding a new source of quantum advantage. Despite promising early results, there remain challenges towards building robust real-world QAML tools. In this review we discuss recent progress in QAML and identify key challenges. We also suggest future research directions which could determine the route to practicality for QAML approaches as quantum computing hardware scales up and noise levels are reduced.
Submitted 22 June, 2023;
originally announced June 2023.
-
Et Tu Certifications: Robustness Certificates Yield Better Adversarial Examples
Authors:
Andrew C. Cullen,
Shijie Liu,
Paul Montague,
Sarah M. Erfani,
Benjamin I. P. Rubinstein
Abstract:
In guaranteeing the absence of adversarial examples in an instance's neighbourhood, certification mechanisms play an important role in demonstrating neural net robustness. In this paper, we ask whether these certifications can compromise the very models they help to protect. Our new \emph{Certification Aware Attack} exploits certifications to produce computationally efficient norm-minimising adversarial examples $74 \%$ more often than comparable attacks, while reducing the median perturbation norm by more than $10\%$. While these attacks can be used to assess the tightness of certification bounds, they also highlight that releasing certifications can paradoxically reduce security.
Submitted 11 June, 2024; v1 submitted 8 February, 2023;
originally announced February 2023.
-
Hybrid Quantum-Classical Generative Adversarial Network for High Resolution Image Generation
Authors:
Shu Lok Tsang,
Maxwell T. West,
Sarah M. Erfani,
Muhammad Usman
Abstract:
Quantum machine learning (QML) has received increasing attention due to its potential to outperform classical machine learning methods in problems pertaining to classification and identification tasks. A subclass of QML methods is quantum generative adversarial networks (QGANs), which have been studied as a quantum counterpart of classical GANs widely used in image manipulation and generation tasks. The existing work on QGANs is still limited to small-scale proof-of-concept examples based on images with significant downscaling. Here we integrate classical and quantum techniques to propose a new hybrid quantum-classical GAN framework. We demonstrate its superior learning capabilities by generating $28 \times 28$ pixel grey-scale images without dimensionality reduction or classical pre/post-processing on multiple classes of the standard MNIST and Fashion MNIST datasets, achieving comparable results to classical frameworks with three orders of magnitude fewer trainable generator parameters. To gain further insight into the working of our hybrid approach, we systematically explore the impact of its parameter space by varying the number of qubits, the size of image patches, the number of layers in the generator, the shape of the patches and the choice of prior distribution. Our results show that increasing the quantum generator size generally improves the learning capability of the network. The developed framework provides a foundation for future design of QGANs with an optimal parameter set tailored for complex image generation tasks.
Submitted 20 January, 2023; v1 submitted 22 December, 2022;
originally announced December 2022.
-
Benchmarking Adversarially Robust Quantum Machine Learning at Scale
Authors:
Maxwell T. West,
Sarah M. Erfani,
Christopher Leckie,
Martin Sevior,
Lloyd C. L. Hollenberg,
Muhammad Usman
Abstract:
Machine learning (ML) methods such as artificial neural networks are rapidly becoming ubiquitous in modern science, technology and industry. Despite their accuracy and sophistication, neural networks can be easily fooled by carefully designed malicious inputs known as adversarial attacks. While such vulnerabilities remain a serious challenge for classical neural networks, the extent of their existence is not fully understood in the quantum ML setting. In this work, we benchmark the robustness of quantum ML networks, such as quantum variational classifiers (QVC), at scale by performing rigorous training for both simple and complex image datasets and through a variety of high-end adversarial attacks. Our results show that QVCs offer a notably enhanced robustness against classical adversarial attacks by learning features which are not detected by the classical neural networks, indicating a possible quantum advantage for ML tasks. Contrarily, and remarkably, the converse is not true, with attacks on quantum networks also capable of deceiving classical neural networks. By combining quantum and classical network outcomes, we propose a novel adversarial attack detection technology. Traditionally quantum advantage in ML systems has been sought through increased accuracy or algorithmic speed-up, but our work has revealed the potential for a new kind of quantum advantage through superior robustness of ML models, whose practical realisation will address serious security concerns and reliability issues of ML algorithms employed in a myriad of applications including autonomous vehicles, cybersecurity, and surveillance robotic systems.
Submitted 22 November, 2022;
originally announced November 2022.
-
Backdoor Attacks on Time Series: A Generative Approach
Authors:
Yujing Jiang,
Xingjun Ma,
Sarah Monazam Erfani,
James Bailey
Abstract:
Backdoor attacks have emerged as one of the major security threats to deep learning models as they can easily control the model's test-time predictions by pre-injecting a backdoor trigger into the model at training time. While backdoor attacks have been extensively studied on images, few works have investigated the threat of backdoor attacks on time series data. To fill this gap, in this paper we present a novel generative approach for time series backdoor attacks against deep learning based time series classifiers. Backdoor attacks have two main goals: high stealthiness and high attack success rate. We find that, compared to images, it can be more challenging to achieve the two goals on time series. This is because time series have fewer input dimensions and lower degrees of freedom, making it hard to achieve a high attack success rate without compromising stealthiness. Our generative approach addresses this challenge by generating trigger patterns that are as realistic as real time series patterns while achieving a high attack success rate without causing a significant drop in clean accuracy. We also show that our proposed attack is resistant to potential backdoor defenses. Furthermore, we propose a novel universal generator that can poison any type of time series with a single generator that allows universal attacks without the need to fine-tune the generative model for new time series datasets.
Submitted 5 February, 2023; v1 submitted 15 November, 2022;
originally announced November 2022.
-
Double Bubble, Toil and Trouble: Enhancing Certified Robustness through Transitivity
Authors:
Andrew C. Cullen,
Paul Montague,
Shijie Liu,
Sarah M. Erfani,
Benjamin I. P. Rubinstein
Abstract:
In response to subtle adversarial examples flipping classifications of neural network models, recent research has promoted certified robustness as a solution. There, invariance of predictions to all norm-bounded attacks is achieved through randomised smoothing of network inputs. Today's state-of-the-art certifications make optimal use of the class output scores at the input instance under test: no better radius of certification (under the $L_2$ norm) is possible given only these scores. However, it is an open question as to whether such lower bounds can be improved using local information around the instance under test. In this work, we demonstrate how today's "optimal" certificates can be improved by exploiting both the transitivity of certifications, and the geometry of the input space, giving rise to what we term Geometrically-Informed Certified Robustness. By considering the smallest distance to points on the boundary of a set of certifications, this approach improves certifications for more than $80\%$ of Tiny-Imagenet instances, yielding an on average $5 \%$ increase in the associated certification. When incorporating training time processes that enhance the certified radius, our technique shows even more promising results, with a uniform $4$ percentage point increase in the achieved certified radius.
Submitted 12 October, 2022;
originally announced October 2022.
-
Performance analysis of coreset selection for quantum implementation of K-Means clustering algorithm
Authors:
Fanzhe Qu,
Sarah M. Erfani,
Muhammad Usman
Abstract:
Quantum computing is anticipated to offer immense computational capabilities which could provide efficient solutions to many data science problems. However, the current generation of quantum devices is small and noisy, which makes it difficult to process the large data sets relevant to practical problems. Coreset selection aims to circumvent this problem by reducing the size of the input data without compromising accuracy. Recent work has shown that coreset selection can help to implement quantum K-Means clustering. However, the impact of coreset selection on the performance of quantum K-Means clustering has not been explored. In this work, we compare the relative performance of two coreset techniques (BFL16 and ONESHOT), and the size of coreset construction in each case, with respect to a variety of data sets, and lay out the advantages and limitations of coreset selection in implementing quantum algorithms. We also investigate the effect of depolarisation quantum noise and bit-flip error, and implement the Quantum AutoEncoder technique to overcome the noise effect. Our work provides useful insights for future implementation of data science algorithms on near-term quantum devices where problem size has been reduced by coreset selection.
Submitted 15 June, 2022;
originally announced June 2022.
-
ATLANTIS: A Benchmark for Semantic Segmentation of Waterbody Images
Authors:
Seyed Mohammad Hassan Erfani,
Zhenyao Wu,
Xinyi Wu,
Song Wang,
Erfan Goharian
Abstract:
Vision-based semantic segmentation of waterbodies and nearby related objects provides important information for managing water resources and handling flooding emergencies. However, the lack of large-scale labeled training and testing datasets for water-related categories prevents researchers from studying water-related issues in the computer vision field. To tackle this problem, we present ATLANTIS, a new benchmark for semantic segmentation of waterbodies and related objects. ATLANTIS consists of 5,195 images of waterbodies, as well as high-quality pixel-level manual annotations of 56 classes of objects, including 17 classes of man-made objects, 18 classes of natural objects and 21 general classes. We analyze ATLANTIS in detail and evaluate several state-of-the-art semantic segmentation networks on our benchmark. In addition, a novel deep neural network, AQUANet, is developed for waterbody semantic segmentation by processing the aquatic and non-aquatic regions in two different paths. AQUANet also incorporates low-level feature modulation and cross-path modulation for enhancing feature representation. Experimental results show that the proposed AQUANet outperforms other state-of-the-art semantic segmentation networks on ATLANTIS. We claim that ATLANTIS is the largest waterbody image dataset for semantic segmentation, providing a wide range of water and water-related classes, and that it will benefit researchers of both computer vision and water resources engineering.
Submitted 22 November, 2021;
originally announced November 2021.
-
Exploring Architectural Ingredients of Adversarially Robust Deep Neural Networks
Authors:
Hanxun Huang,
Yisen Wang,
Sarah Monazam Erfani,
Quanquan Gu,
James Bailey,
Xingjun Ma
Abstract:
Deep neural networks (DNNs) are known to be vulnerable to adversarial attacks. A range of defense methods have been proposed to train adversarially robust DNNs, among which adversarial training has demonstrated promising results. However, despite preliminary understandings developed for adversarial training, it is still not clear, from the architectural perspective, what configurations can lead to more robust DNNs. In this paper, we address this gap via a comprehensive investigation on the impact of network width and depth on the robustness of adversarially trained DNNs. Specifically, we make the following key observations: 1) more parameters (higher model capacity) do not necessarily help adversarial robustness; 2) reducing capacity at the last stage (the last group of blocks) of the network can actually improve adversarial robustness; and 3) under the same parameter budget, there exists an optimal architectural configuration for adversarial robustness. We also provide a theoretical analysis explaining why such network configurations can help robustness. These architectural insights can help design adversarially robust DNNs. Code is available at https://github.com/HanxunH/RobustWRN.
Submitted 22 January, 2022; v1 submitted 7 October, 2021;
originally announced October 2021.
-
Local Intrinsic Dimensionality Signals Adversarial Perturbations
Authors:
Sandamal Weerasinghe,
Tansu Alpcan,
Sarah M. Erfani,
Christopher Leckie,
Benjamin I. P. Rubinstein
Abstract:
The vulnerability of machine learning models to adversarial perturbations has motivated a significant amount of research under the broad umbrella of adversarial machine learning. Sophisticated attacks may cause learning algorithms to learn decision functions or make decisions with poor predictive performance. In this context, there is a growing body of literature that uses local intrinsic dimensionality (LID), a local metric that describes the minimum number of latent variables required to describe each data point, for detecting adversarial samples and subsequently mitigating their effects. The research to date has tended to focus on using LID as a practical defence method, often without fully explaining why LID can detect adversarial samples. In this paper, we derive a lower bound and an upper bound for the LID value of a perturbed data point and demonstrate that the bounds, in particular the lower bound, have a positive correlation with the magnitude of the perturbation. Hence, we demonstrate that data points that are perturbed by a large amount would have large LID values compared to unperturbed samples, thus justifying the use of LID in the prior literature. Furthermore, our empirical validation demonstrates the validity of the bounds on benchmark datasets.
Submitted 24 September, 2021;
originally announced September 2021.
-
Dual Head Adversarial Training
Authors:
Yujing Jiang,
Xingjun Ma,
Sarah Monazam Erfani,
James Bailey
Abstract:
Deep neural networks (DNNs) are known to be vulnerable to adversarial examples/attacks, raising concerns about their reliability in safety-critical applications. A number of defense methods have been proposed to train robust DNNs resistant to adversarial attacks, among which adversarial training has so far demonstrated the most promising results. However, recent studies have shown that there exists an inherent tradeoff between accuracy and robustness in adversarially-trained DNNs. In this paper, we propose a novel technique Dual Head Adversarial Training (DH-AT) to further improve the robustness of existing adversarial training methods. Different from existing improved variants of adversarial training, DH-AT modifies both the architecture of the network and the training strategy to seek more robustness. Specifically, DH-AT first attaches a second network head (or branch) to one intermediate layer of the network, then uses a lightweight convolutional neural network (CNN) to aggregate the outputs of the two heads. The training strategy is also adapted to reflect the relative importance of the two heads. We empirically show, on multiple benchmark datasets, that DH-AT can bring notable robustness improvements to existing adversarial training methods. Compared with TRADES, one state-of-the-art adversarial training method, our DH-AT can improve the robustness by 3.4% against PGD40 and 2.3% against AutoAttack, and also improve the clean accuracy by 1.8%.
Submitted 22 April, 2021; v1 submitted 21 April, 2021;
originally announced April 2021.
-
A Deep Adversarial Model for Suffix and Remaining Time Prediction of Event Sequences
Authors:
Farbod Taymouri,
Marcello La Rosa,
Sarah M. Erfani
Abstract:
Event suffix and remaining time prediction are sequence to sequence learning tasks. They have wide applications in different areas such as economics, digital health, business process management and IT infrastructure monitoring. Timestamped event sequences contain ordered events which carry at least two attributes: the event's label and its timestamp. Suffix and remaining time prediction are about obtaining the most likely continuation of event labels and the remaining time until the sequence finishes, respectively. Recent deep learning-based works for such predictions are prone to potentially large prediction errors because of closed-loop training (i.e., the next event is conditioned on the ground truth of previous events) and open-loop inference (i.e., the next event is conditioned on previously predicted events). In this work, we propose an encoder-decoder architecture for open-loop training to advance the suffix and remaining time prediction of event sequences. To capture the joint temporal dynamics of events, we harness the power of adversarial learning techniques to boost prediction performance. We consider four real-life datasets and three baselines in our experiments. The results show improvements up to four times compared to the state of the art in suffix and remaining time prediction of event sequences, specifically in the realm of business process executions. We also show that the obtained improvements of adversarial training are superior compared to standard training under the same experimental setup.
Submitted 14 February, 2021;
originally announced February 2021.
-
Unlearnable Examples: Making Personal Data Unexploitable
Authors:
Hanxun Huang,
Xingjun Ma,
Sarah Monazam Erfani,
James Bailey,
Yisen Wang
Abstract:
The volume of "free" data on the internet has been key to the current success of deep learning. However, it also raises privacy concerns about the unauthorized exploitation of personal data for training commercial models. It is thus crucial to develop methods to prevent unauthorized data exploitation. This paper raises the question: can data be made unlearnable for deep learning models? We present a type of error-minimizing noise that can indeed make training examples unlearnable. Error-minimizing noise is intentionally generated to reduce the error of one or more of the training example(s) close to zero, which can trick the model into believing there is "nothing" to learn from these example(s). The noise is restricted to be imperceptible to human eyes, and thus does not affect normal data utility. We empirically verify the effectiveness of error-minimizing noise in both sample-wise and class-wise forms. We also demonstrate its flexibility under extensive experimental settings and practicability in a case study of face recognition. Our work establishes an important first step towards making personal data unexploitable to deep learning models.
Submitted 24 February, 2021; v1 submitted 13 January, 2021;
originally announced January 2021.
-
Neural Architecture Search via Combinatorial Multi-Armed Bandit
Authors:
Hanxun Huang,
Xingjun Ma,
Sarah M. Erfani,
James Bailey
Abstract:
Neural Architecture Search (NAS) has gained significant popularity as an effective tool for designing high performance deep neural networks (DNNs). NAS can be performed via policy gradient, evolutionary algorithms, differentiable architecture search or tree-search methods. While significant progress has been made for both policy gradient and differentiable architecture search, tree-search methods have so far failed to achieve comparable accuracy or search efficiency. In this paper, we formulate NAS as a Combinatorial Multi-Armed Bandit (CMAB) problem (CMAB-NAS). This allows the decomposition of a large search space into smaller blocks where tree-search methods can be applied more effectively and efficiently. We further leverage a tree-based method called Nested Monte-Carlo Search to tackle the CMAB-NAS problem. On CIFAR-10, our approach discovers a cell structure that achieves a low error rate that is comparable to the state-of-the-art, using only 0.58 GPU days, which is 20 times faster than current tree-search methods. Moreover, the discovered structure transfers well to large-scale datasets such as ImageNet.
Submitted 24 April, 2021; v1 submitted 1 January, 2021;
originally announced January 2021.
-
Improving Scalability of Contrast Pattern Mining for Network Traffic Using Closed Patterns
Authors:
Elaheh AlipourChavary,
Sarah M. Erfani,
Christopher Leckie
Abstract:
Contrast pattern mining (CPM) aims to discover patterns whose support increases significantly from a background dataset to a target dataset. CPM is particularly useful for characterising changes in evolving systems, e.g., in network traffic analysis to detect unusual activity. While most existing techniques focus on extracting either the whole set of contrast patterns (CPs) or minimal sets, the problem of efficiently finding a relevant subset of CPs, especially in high-dimensional datasets, is an open challenge. In this paper, we focus on extracting the most specific set of CPs to discover significant changes between two datasets. Our approach to this problem uses closed patterns to substantially reduce redundant patterns. Our experimental results on several real and emulated network traffic datasets demonstrate that our proposed unsupervised algorithm is up to 100 times faster than an existing approach for CPM on network traffic data [2]. In addition, as an application of CPs, we demonstrate that CPM is a highly effective method for detection of meaningful changes in network traffic.
Submitted 16 November, 2020;
originally announced November 2020.
-
Defending Distributed Classifiers Against Data Poisoning Attacks
Authors:
Sandamal Weerasinghe,
Tansu Alpcan,
Sarah M. Erfani,
Christopher Leckie
Abstract:
Support Vector Machines (SVMs) are vulnerable to targeted training data manipulations such as poisoning attacks and label flips. By carefully manipulating a subset of training samples, the attacker forces the learner to compute an incorrect decision boundary, thereby causing misclassifications. Considering the increased importance of SVMs in engineering and life-critical applications, we develop a novel defense algorithm that improves resistance against such attacks. Local Intrinsic Dimensionality (LID) is a promising metric that characterizes the outlierness of data samples. In this work, we introduce a new approximation of LID called K-LID that uses kernel distance in the LID calculation, which allows LID to be calculated in high-dimensional transformed spaces. We introduce a weighted SVM against such attacks using K-LID as a distinguishing characteristic that de-emphasizes the effect of suspicious data samples on the SVM decision boundary. Each sample is weighted on how likely its K-LID value is from the benign K-LID distribution rather than the attacked K-LID distribution. We then demonstrate how the proposed defense can be applied to a distributed SVM framework through a case study on an SDR-based surveillance system. Experiments with benchmark data sets show that the proposed defense reduces classification error rates substantially (10% on average).
Submitted 20 August, 2020;
originally announced August 2020.
-
Defending Regression Learners Against Poisoning Attacks
Authors:
Sandamal Weerasinghe,
Sarah M. Erfani,
Tansu Alpcan,
Christopher Leckie,
Justin Kopacz
Abstract:
Regression models, which are widely used from engineering applications to financial forecasting, are vulnerable to targeted malicious attacks such as training data poisoning, through which adversaries can manipulate their predictions. Previous works that attempt to address this problem rely on assumptions about the nature of the attack/attacker or overestimate the knowledge of the learner, making them impractical. We introduce a novel Local Intrinsic Dimensionality (LID) based measure called N-LID that measures the local deviation of a given data point's LID with respect to its neighbors. We then show that N-LID can distinguish poisoned samples from normal samples and propose an N-LID based defense approach that makes no assumptions about the attacker. Through extensive numerical experiments with benchmark datasets, we show that the proposed defense mechanism outperforms state-of-the-art defenses in terms of prediction accuracy (up to 76% lower MSE compared to an undefended ridge model) and running time.
Submitted 20 August, 2020;
originally announced August 2020.
-
Applying support vector data description for fraud detection
Authors:
Mohamad Khedmati,
Masoud Erfani,
Mohammad GhasemiGol
Abstract:
Fraud detection is an important topic that applies to various enterprises such as banking and financial sectors, insurance, government agencies, law enforcement, and more. Fraud attempts have risen remarkably in recent years, making fraud detection an essential topic for research. One of the main challenges in fraud detection is acquiring fraud samples, which is a complex and challenging task. To deal with this challenge, we apply one-class classification methods such as SVDD, which do not need fraud samples for training. We also present our algorithm REDBSCAN, an extension of DBSCAN that reduces the number of samples and selects those that preserve the shape of the data. The results obtained by implementing the proposed method indicate that the fraud detection process is improved in both performance and speed.
Submitted 31 May, 2020;
originally announced June 2020.
-
MMF: Attribute Interpretable Collaborative Filtering
Authors:
Yixin Su,
Sarah Monazam Erfani,
Rui Zhang
Abstract:
Collaborative filtering is one of the most popular techniques in designing recommendation systems, and its most representative model, matrix factorization, has been widely used by researchers and the industry. However, this model suffers from a lack of interpretability and the item cold-start problem, which limit its reliability and practicability. In this paper, we propose an interpretable recommendation model called Multi-Matrix Factorization (MMF), which addresses these two limitations and achieves state-of-the-art prediction accuracy by exploiting common attributes that are present in different items. In the model, predicted item ratings are regarded as weighted aggregations of attribute ratings generated by the inner product of the user latent vectors and the attribute latent vectors. MMF provides more fine-grained analyses than matrix factorization in the following ways: attribute ratings with weights allow the understanding of how much each attribute contributes to the recommendation and hence provide interpretability; the common attributes can act as a link between existing and new items, which solves the item cold-start problem when no rating exists on an item. We evaluate the interpretability of MMF comprehensively, and conduct extensive experiments on real datasets to show that MMF outperforms state-of-the-art baselines in terms of accuracy.
Submitted 2 August, 2019;
originally announced August 2019.
-
FCC-GAN: A Fully Connected and Convolutional Net Architecture for GANs
Authors:
Sukarna Barua,
Sarah Monazam Erfani,
James Bailey
Abstract:
Generative Adversarial Networks (GANs) are a powerful class of generative models. Despite their successes, the most appropriate choice of a GAN network architecture is still not well understood. GAN models for image synthesis have adopted a deep convolutional network architecture, which eliminates or minimizes the use of fully connected and pooling layers in favor of convolution layers in the generator and discriminator of GANs. In this paper, we demonstrate that a convolution network architecture utilizing deep fully connected layers and pooling layers can be more effective than the traditional convolution-only architecture, and we propose FCC-GAN, a fully connected and convolutional GAN architecture. Models based on our FCC-GAN architecture both learn faster than the conventional architecture and generate higher-quality samples. We demonstrate the effectiveness and stability of our approach across four popular image datasets.
Submitted 27 May, 2019; v1 submitted 7 May, 2019;
originally announced May 2019.
-
Quality Evaluation of GANs Using Cross Local Intrinsic Dimensionality
Authors:
Sukarna Barua,
Xingjun Ma,
Sarah Monazam Erfani,
Michael E. Houle,
James Bailey
Abstract:
Generative Adversarial Networks (GANs) are an elegant mechanism for data generation. However, a key challenge when using GANs is how to best measure their ability to generate realistic data. In this paper, we demonstrate that an intrinsic dimensional characterization of the data space learned by a GAN model leads to an effective evaluation metric for GAN quality. In particular, we propose a new evaluation measure, CrossLID, that assesses the local intrinsic dimensionality (LID) of real-world data with respect to neighborhoods found in GAN-generated samples. Intuitively, CrossLID measures the degree to which manifolds of two data distributions coincide with each other. In experiments on 4 benchmark image datasets, we compare our proposed measure to several state-of-the-art evaluation metrics. Our experiments show that CrossLID is strongly correlated with the progress of GAN training, is sensitive to mode collapse, is robust to small-scale noise and image transformations, and is robust to sample size. Furthermore, we show how CrossLID can be used within the GAN training process to improve generation quality.
Submitted 2 May, 2019;
originally announced May 2019.
-
Learning Deep Hidden Nonlinear Dynamics from Aggregate Data
Authors:
Yisen Wang,
Bo Dai,
Lingkai Kong,
Sarah Monazam Erfani,
James Bailey,
Hongyuan Zha
Abstract:
Learning nonlinear dynamics from diffusion data is a challenging problem since the individuals observed may be different at different time points, generally following an aggregate behaviour. Existing methods cannot handle the task well since they either model such dynamics directly on observations or require complete longitudinal individual-level trajectories. However, in most practical applications, these requirements are unrealistic: the evolving dynamics may be too complex to be modeled directly on observations, and individual-level trajectories may not be available due to technical limitations, experimental costs and/or privacy issues. To address these challenges, we formulate a model of diffusion dynamics as a hidden stochastic process via the introduction of hidden variables for flexibility, and learn the hidden dynamics directly on aggregate observations without any requirement for individual-level trajectories. We propose a dynamic generative model with Wasserstein distance for LEarninG dEep hidden Nonlinear Dynamics (LEGEND) and prove its theoretical guarantees as well. Experiments on a range of synthetic and real-world datasets illustrate that LEGEND has very strong performance compared to state-of-the-art baselines.
Submitted 29 July, 2018; v1 submitted 22 July, 2018;
originally announced July 2018.
-
Dimensionality-Driven Learning with Noisy Labels
Authors:
Xingjun Ma,
Yisen Wang,
Michael E. Houle,
Shuo Zhou,
Sarah M. Erfani,
Shu-Tao Xia,
Sudanthi Wijewickrema,
James Bailey
Abstract:
Datasets with significant proportions of noisy (incorrect) class labels present challenges for training accurate Deep Neural Networks (DNNs). We propose a new perspective for understanding DNN generalization for such datasets, by investigating the dimensionality of the deep representation subspace of training samples. We show that from a dimensionality perspective, DNNs exhibit quite distinctive learning styles when trained with clean labels versus when trained with a proportion of noisy labels. Based on this finding, we develop a new dimensionality-driven learning strategy, which monitors the dimensionality of subspaces during training and adapts the loss function accordingly. We empirically demonstrate that our approach is highly tolerant to significant proportions of noisy labels, and can effectively learn low-dimensional local subspaces that capture the data distribution.
Submitted 31 July, 2018; v1 submitted 7 June, 2018;
originally announced June 2018.
-
Indication of anisotropy in arrival directions of ultra-high-energy cosmic rays through comparison to the flux pattern of extragalactic gamma-ray sources
Authors:
The Pierre Auger Collaboration,
A. Aab,
P. Abreu,
M. Aglietta,
I. F. M. Albuquerque,
I. Allekotte,
A. Almela,
J. Alvarez Castillo,
J. Alvarez-Muñiz,
G. A. Anastasi,
L. Anchordoqui,
B. Andrada,
S. Andringa,
C. Aramo,
N. Arsene,
H. Asorey,
P. Assis,
G. Avila,
A. M. Badescu,
A. Balaceanu,
F. Barbato,
R. J. Barreira Luz,
J. J. Beatty,
K. H. Becker,
J. A. Bellido
, et al. (368 additional authors not shown)
Abstract:
A new analysis of the dataset from the Pierre Auger Observatory provides evidence for anisotropy in the arrival directions of ultra-high-energy cosmic rays on an intermediate angular scale, which is indicative of excess arrivals from strong, nearby sources. The data consist of 5514 events above 20 EeV with zenith angles up to 80 deg recorded before 2017 April 30. Sky models have been created for two distinct populations of extragalactic gamma-ray emitters: active galactic nuclei from the second catalog of hard Fermi-LAT sources (2FHL) and starburst galaxies from a sample that was examined with Fermi-LAT. Flux-limited samples, which include all types of galaxies from the Swift-BAT and 2MASS surveys, have been investigated for comparison. The sky model of cosmic-ray density constructed using each catalog has two free parameters, the fraction of events correlating with astrophysical objects and an angular scale characterizing the clustering of cosmic rays around extragalactic sources. A maximum-likelihood ratio test is used to evaluate the best values of these parameters and to quantify the strength of each model by contrast with isotropy. It is found that the starburst model fits the data better than the hypothesis of isotropy with a statistical significance of 4.0 sigma, the highest value of the test statistic being for energies above 39 EeV. The three alternative models are favored against isotropy with 2.7-3.2 sigma significance. The origin of the indicated deviation from isotropy is examined and prospects for more sensitive future studies are discussed.
Submitted 6 February, 2018; v1 submitted 18 January, 2018;
originally announced January 2018.
-
Online Cluster Validity Indices for Streaming Data
Authors:
Masud Moshtaghi,
James C. Bezdek,
Sarah M. Erfani,
Christopher Leckie,
James Bailey
Abstract:
Cluster analysis is used to explore structure in unlabeled data sets in a wide range of applications. An important part of cluster analysis is validating the quality of computationally obtained clusters. A large number of different internal indices have been developed for validation in the offline setting. However, this concept has not been extended to the online setting. A key challenge is to find an efficient incremental formulation of an index that can capture both cohesion and separation of the clusters over potentially infinite data streams. In this paper, we develop two online versions (with and without forgetting factors) of the Xie-Beni and Davies-Bouldin internal validity indices, and analyze their characteristics, using two streaming clustering algorithms (sk-means and online ellipsoidal clustering), and illustrate their use in monitoring evolving clusters in streaming data. We also show that incremental cluster validity indices are capable of sending a distress signal to online monitors when evolving clusters go awry. Our numerical examples indicate that the incremental Xie-Beni index with forgetting factor is superior to the other three indices tested.
Submitted 8 January, 2018;
originally announced January 2018.
-
Characterizing Adversarial Subspaces Using Local Intrinsic Dimensionality
Authors:
Xingjun Ma,
Bo Li,
Yisen Wang,
Sarah M. Erfani,
Sudanthi Wijewickrema,
Grant Schoenebeck,
Dawn Song,
Michael E. Houle,
James Bailey
Abstract:
Deep Neural Networks (DNNs) have recently been shown to be vulnerable to adversarial examples, which are carefully crafted instances that can mislead DNNs to make errors during prediction. To better understand such attacks, a characterization is needed of the properties of regions (the so-called 'adversarial subspaces') in which adversarial examples lie. We tackle this challenge by characterizing the dimensional properties of adversarial regions, via the use of Local Intrinsic Dimensionality (LID). LID assesses the space-filling capability of the region surrounding a reference example, based on the distance distribution of the example to its neighbors. We first provide explanations about how adversarial perturbation can affect the LID characteristic of adversarial regions, and then show empirically that LID characteristics can facilitate the distinction of adversarial examples generated using state-of-the-art attacks. As a proof-of-concept, we show that a potential application of LID is to distinguish adversarial examples, and the preliminary results show that it can outperform several state-of-the-art detection measures by large margins for five attack strategies considered in this paper across three benchmark datasets. Our analysis of the LID characteristic for adversarial regions not only motivates new directions of effective adversarial defense, but also opens up more challenges for developing new attacks to better understand the vulnerabilities of DNNs.
Submitted 14 March, 2018; v1 submitted 8 January, 2018;
originally announced January 2018.
-
Inferences on Mass Composition and Tests of Hadronic Interactions from 0.3 to 100 EeV using the water-Cherenkov Detectors of the Pierre Auger Observatory
Authors:
The Pierre Auger Collaboration,
A. Aab,
P. Abreu,
M. Aglietta,
I. Al Samarai,
I. F. M. Albuquerque,
I. Allekotte,
A. Almela,
J. Alvarez Castillo,
J. Alvarez-Muñiz,
G. A. Anastasi,
L. Anchordoqui,
B. Andrada,
S. Andringa,
C. Aramo,
F. Arqueros,
N. Arsene,
H. Asorey,
P. Assis,
J. Aublin,
G. Avila,
A. M. Badescu,
A. Balaceanu,
F. Barbato,
R. J. Barreira Luz
, et al. (381 additional authors not shown)
Abstract:
We present a new method for probing the hadronic interaction models at ultra-high energy and extracting details about mass composition. This is done using the time profiles of the signals recorded with the water-Cherenkov detectors of the Pierre Auger Observatory. The profiles arise from a mix of the muon and electromagnetic components of air-showers. Using the risetimes of the recorded signals we define a new parameter, which we use to compare our observations with predictions from simulations. We find, firstly, inconsistencies between our data and predictions over a greater energy range and with substantially more events than in previous studies. Secondly, by calibrating the new parameter with fluorescence measurements from observations made at the Auger Observatory, we can infer the depth of shower maximum for a sample of over 81,000 events extending from 0.3 EeV to over 100 EeV. Above 30 EeV, the sample is nearly fourteen times larger than currently available from fluorescence measurements and extends the covered energy range by half a decade. The energy dependence of the average depth of shower maximum is compared to simulations and interpreted in terms of the mean of the logarithmic mass. We find good agreement with previous work and extend the measurement of the mean depth of shower maximum to greater energies than before, reducing significantly the statistical uncertainty associated with the inferences about mass composition.
Submitted 19 October, 2017;
originally announced October 2017.
-
Search for High-energy Neutrinos from Binary Neutron Star Merger GW170817 with ANTARES, IceCube, and the Pierre Auger Observatory
Authors:
A. Albert,
M. Andre,
M. Anghinolfi,
M. Ardid,
J. -J. Aubert,
J. Aublin,
T. Avgitas,
B. Baret,
J. Barrios-Marti,
S. Basa,
B. Belhorma,
V. Bertin,
S. Biagi,
R. Bormuth,
S. Bourret,
M. C. Bouwhuis,
H. Branzacs,
R. Bruijn,
J. Brunner,
J. Busto,
A. Capone,
L. Caramete,
J. Carr,
S. Celli,
R. Cherkaoui El Moursli
, et al. (1916 additional authors not shown)
Abstract:
The Advanced LIGO and Advanced Virgo observatories recently discovered gravitational waves from a binary neutron star inspiral. A short gamma-ray burst (GRB) that followed the merger of this binary was also recorded by the Fermi Gamma-ray Burst Monitor (Fermi-GBM), and the Anticoincidence Shield for the Spectrometer for the International Gamma-Ray Astrophysics Laboratory (INTEGRAL), indicating particle acceleration by the source. The precise location of the event was determined by optical detections of emission following the merger. We searched for high-energy neutrinos from the merger in the GeV-EeV energy range using the ANTARES, IceCube, and Pierre Auger Observatories. No neutrinos directionally coincident with the source were detected within $\pm500$ s around the merger time. Additionally, no MeV neutrino burst signal was detected coincident with the merger. We further carried out an extended search in the direction of the source for high-energy neutrinos within the 14-day period following the merger, but found no evidence of emission. We used these results to probe dissipation mechanisms in relativistic outflows driven by the binary neutron star merger. The non-detection is consistent with model predictions of short GRBs observed at a large off-axis angle.
Submitted 9 November, 2017; v1 submitted 16 October, 2017;
originally announced October 2017.
-
Observation of a Large-scale Anisotropy in the Arrival Directions of Cosmic Rays above $8 \times 10^{18}$ eV
Authors:
The Pierre Auger Collaboration,
A. Aab,
P. Abreu,
M. Aglietta,
I. Al Samarai,
I. F. M. Albuquerque,
I. Allekotte,
A. Almela,
J. Alvarez Castillo,
J. Alvarez-Muñiz,
G. A. Anastasi,
L. Anchordoqui,
B. Andrada,
S. Andringa,
C. Aramo,
F. Arqueros,
N. Arsene,
H. Asorey,
P. Assis,
J. Aublin,
G. Avila,
A. M. Badescu,
A. Balaceanu,
F. Barbato,
R. J. Barreira Luz
, et al. (382 additional authors not shown)
Abstract:
Cosmic rays are atomic nuclei arriving from outer space that reach the highest energies observed in nature. Clues to their origin come from studying the distribution of their arrival directions. Using $3 \times 10^4$ cosmic rays above $8 \times 10^{18}$ electron volts, recorded with the Pierre Auger Observatory from a total exposure of 76,800 square kilometers steradian year, we report an anisotropy in the arrival directions. The anisotropy, detected at more than the 5.2$σ$ level of significance, can be described by a dipole with an amplitude of $6.5_{-0.9}^{+1.3}$% towards right ascension $α_{d} = 100 \pm 10$ degrees and declination $δ_{d} = -24_{-13}^{+12}$ degrees. That direction indicates an extragalactic origin for these ultra-high energy particles.
Submitted 21 September, 2017;
originally announced September 2017.
-
Spectral Calibration of the Fluorescence Telescopes of the Pierre Auger Observatory
Authors:
The Pierre Auger Collaboration,
A. Aab,
P. Abreu,
M. Aglietta,
I. Al Samarai,
I. F. M. Albuquerque,
I. Allekotte,
A. Almela,
J. Alvarez Castillo,
J. Alvarez-Muñiz,
G. A. Anastasi,
L. Anchordoqui,
B. Andrada,
S. Andringa,
C. Aramo,
F. Arqueros,
N. Arsene,
H. Asorey,
P. Assis,
J. Aublin,
G. Avila,
A. M. Badescu,
A. Balaceanu,
F. Barbato,
R. J. Barreira Luz
, et al. (381 additional authors not shown)
Abstract:
We present a novel method to measure precisely the relative spectral response of the fluorescence telescopes of the Pierre Auger Observatory. We used a portable light source based on a xenon flasher and a monochromator to measure the relative spectral efficiencies of eight telescopes in steps of 5 nm from 280 nm to 440 nm. Each point in a scan had approximately 2 nm FWHM out of the monochromator. Different sets of telescopes in the observatory have different optical components, and the eight telescopes measured represent two each of the four combinations of components represented in the observatory. We made an end-to-end measurement of the response from different combinations of optical components, and the monochromator setup allowed for more precise and complete measurements than our previous multi-wavelength calibrations. We find an overall uncertainty in the calibration of the spectral response of most of the telescopes of 1.5% for all wavelengths; the six oldest telescopes have larger overall uncertainties of about 2.2%. We also report changes in physics measurables due to the change in calibration, which are generally small.
Submitted 2 October, 2017; v1 submitted 5 September, 2017;
originally announced September 2017.
-
The Pierre Auger Observatory: Contributions to the 35th International Cosmic Ray Conference (ICRC 2017)
Authors:
The Pierre Auger Collaboration,
A. Aab,
P. Abreu,
M. Aglietta,
I. F. M. Albuquerque,
I. Allekotte,
A. Almela,
J. Alvarez Castillo,
J. Alvarez-Muñiz,
G. A. Anastasi,
L. Anchordoqui,
B. Andrada,
S. Andringa,
C. Aramo,
N. Arsene,
H. Asorey,
P. Assis,
J. Aublin,
G. Avila,
A. M. Badescu,
A. Balaceanu,
F. Barbato,
R. J. Barreira Luz,
K. H. Becker,
J. A. Bellido
, et al. (373 additional authors not shown)
Abstract:
Contributions of the Pierre Auger Collaboration to the 35th International Cosmic Ray Conference (ICRC 2017), 12-20 July 2017, Bexco, Busan, Korea.
Submitted 2 October, 2017; v1 submitted 22 August, 2017;
originally announced August 2017.
-
Toward the Starting Line: A Systems Engineering Approach to Strong AI
Authors:
Tansu Alpcan,
Sarah M. Erfani,
Christopher Leckie
Abstract:
Artificial General Intelligence (AGI) or Strong AI aims to create machines with human-like or human-level intelligence, which is still a very ambitious goal when compared to the existing computing and AI systems. After many hype cycles and lessons from AI history, it is clear that a big conceptual leap is needed for crossing the starting line to kick-start mainstream AGI research. This position paper aims to make a small conceptual contribution toward reaching that starting line. After a broad analysis of the AGI problem from different perspectives, a system-theoretic and engineering-based research approach is introduced, which builds upon the existing mainstream AI and systems foundations. Several promising cross-fertilization opportunities between systems disciplines and AI research are identified. Specific potential research directions are discussed.
Submitted 18 October, 2017; v1 submitted 27 July, 2017;
originally announced July 2017.
-
Muon Counting using Silicon Photomultipliers in the AMIGA detector of the Pierre Auger Observatory
Authors:
The Pierre Auger Collaboration,
A. Aab,
P. Abreu,
M. Aglietta,
E. J. Ahn,
I. Al Samarai,
I. F. M. Albuquerque,
I. Allekotte,
P. Allison,
A. Almela,
J. Alvarez Castillo,
J. Alvarez-Muñiz,
M. Ambrosio,
G. A. Anastasi,
L. Anchordoqui,
B. Andrada,
S. Andringa,
C. Aramo,
F. Arqueros,
N. Arsene,
H. Asorey,
P. Assis,
J. Aublin,
G. Avila,
A. M. Badescu
, et al. (400 additional authors not shown)
Abstract:
AMIGA (Auger Muons and Infill for the Ground Array) is an upgrade of the Pierre Auger Observatory designed to extend its energy range of detection and to directly measure the muon content of the cosmic ray primary particle showers. The array will be formed by an infill of surface water-Cherenkov detectors associated with buried scintillation counters employed for muon counting. Each counter is composed of three scintillation modules, with a 10 m$^2$ detection area per module. In this paper, a new generation of detectors, replacing the current multi-pixel photomultiplier tube (PMT) with silicon photo sensors (a.k.a. SiPMs), is proposed. The selection of the new device and its front-end electronics is explained. A method to calibrate the counting system that ensures the performance of the detector is detailed. This method has the advantage of being able to be carried out in a remote place such as the one where the detectors are deployed. High efficiency results, i.e. 98 % efficiency for the highest tested overvoltage, combined with a low probability of accidental counting ($\sim$2 %), show a promising performance for this new system.
Submitted 4 October, 2017; v1 submitted 17 March, 2017;
originally announced March 2017.
-
Calibration of the Logarithmic-Periodic Dipole Antenna (LPDA) Radio Stations at the Pierre Auger Observatory using an Octocopter
Authors:
The Pierre Auger Collaboration,
A. Aab,
P. Abreu,
M. Aglietta,
I. Al Samarai,
I. F. M. Albuquerque,
I. Allekotte,
A. Almela,
J. Alvarez Castillo,
J. Alvarez-Muñiz,
G. A. Anastasi,
L. Anchordoqui,
B. Andrada,
S. Andringa,
C. Aramo,
F. Arqueros,
N. Arsene,
H. Asorey,
P. Assis,
J. Aublin,
G. Avila,
A. M. Badescu,
A. Balaceanu,
F. Barbato,
R. J. Barreira Luz
, et al. (380 additional authors not shown)
Abstract:
An in-situ calibration of a logarithmic periodic dipole antenna with a frequency coverage of 30 MHz to 80 MHz is performed. Such antennas are part of a radio station system used for detection of cosmic ray induced air showers at the Engineering Radio Array of the Pierre Auger Observatory, the so-called Auger Engineering Radio Array (AERA). The directional and frequency characteristics of the broadband antenna are investigated using a remotely piloted aircraft (RPA) carrying a small transmitting antenna. The antenna sensitivity is described by the vector effective length relating the measured voltage with the electric-field components perpendicular to the incoming signal direction. The horizontal and meridional components are determined with an overall uncertainty of $7.4^{+0.9}_{-0.3}$ % and $10.3^{+2.8}_{-1.7}$ % respectively. The measurement is used to correct a simulated response of the frequency and directional response of the antenna. In addition, the influence of the ground conductivity and permittivity on the antenna response is simulated. Both have a negligible influence given the ground conditions measured at the detector site. The overall uncertainties of the vector effective length components result in an uncertainty of $8.8^{+2.1}_{-1.3}$ % in the square root of the energy fluence for incoming signal directions with zenith angles smaller than 60°.
Submitted 13 June, 2018; v1 submitted 5 February, 2017;
originally announced February 2017.
-
Combined fit of spectrum and composition data as measured by the Pierre Auger Observatory
Authors:
The Pierre Auger Collaboration,
A. Aab,
P. Abreu,
M. Aglietta,
I. Al Samarai,
I. F. M. Albuquerque,
I. Allekotte,
A. Almela,
J. Alvarez Castillo,
J. Alvarez-Muñiz,
G. A. Anastasi,
L. Anchordoqui,
B. Andrada,
S. Andringa,
C. Aramo,
F. Arqueros,
N. Arsene,
H. Asorey,
P. Assis,
J. Aublin,
G. Avila,
A. M. Badescu,
A. Balaceanu,
R. J. Barreira Luz,
J. J. Beatty
, et al. (375 additional authors not shown)
Abstract:
We present a combined fit of a simple astrophysical model of UHECR sources to both the energy spectrum and mass composition data measured by the Pierre Auger Observatory. The fit has been performed for energies above $5 \cdot 10^{18}$ eV, i.e. the region of the all-particle spectrum above the so-called "ankle" feature. The astrophysical model we adopted consists of identical sources uniformly distributed in a comoving volume, where nuclei are accelerated through a rigidity-dependent mechanism. The fit results suggest sources characterized by relatively low maximum injection energies, hard spectra and heavy chemical composition. We also show that uncertainties about physical quantities relevant to UHECR propagation and shower development have a non-negligible impact on the fit results.
Submitted 26 February, 2018; v1 submitted 21 December, 2016;
originally announced December 2016.
-
A targeted search for point sources of EeV photons with the Pierre Auger Observatory
Authors:
The Pierre Auger Collaboration,
A. Aab,
P. Abreu,
M. Aglietta,
I. Al Samarai,
I. F. M. Albuquerque,
I. Allekotte,
A. Almela,
J. Alvarez Castillo,
J. Alvarez-Muñiz,
G. A. Anastasi,
L. Anchordoqui,
B. Andrada,
S. Andringa,
C. Aramo,
F. Arqueros,
N. Arsene,
H. Asorey,
P. Assis,
J. Aublin,
G. Avila,
A. M. Badescu,
A. Balaceanu,
R. J. Barreira Luz,
J. J. Beatty
, et al. (375 additional authors not shown)
Abstract:
Simultaneous measurements of air showers with the fluorescence and surface detectors of the Pierre Auger Observatory allow a sensitive search for EeV photon point sources. Several Galactic and extragalactic candidate objects are grouped in classes to reduce the statistical penalty of many trials from that of a blind search and are analyzed for a significant excess above the background expectation. The presented search does not find any evidence for photon emission at candidate sources, and combined $p$-values for every class are reported. Particle and energy flux upper limits are given for selected candidate sources. These limits significantly constrain predictions of EeV proton emission models from non-transient Galactic and nearby extragalactic sources, as illustrated for the particular case of the Galactic center region.
Submitted 21 March, 2017; v1 submitted 13 December, 2016;
originally announced December 2016.
-
Search for photons with energies above 10$^{18}$ eV using the hybrid detector of the Pierre Auger Observatory
Authors:
The Pierre Auger Collaboration,
A. Aab,
P. Abreu,
M. Aglietta,
I. Al Samarai,
I. F. M. Albuquerque,
I. Allekotte,
A. Almela,
J. Alvarez Castillo,
J. Alvarez-Muñiz,
G. A. Anastasi,
L. Anchordoqui,
B. Andrada,
S. Andringa,
C. Aramo,
F. Arqueros,
N. Arsene,
H. Asorey,
P. Assis,
J. Aublin,
G. Avila,
A. M. Badescu,
A. Balaceanu,
R. J. Barreira Luz,
J. J. Beatty
, et al. (375 additional authors not shown)
Abstract:
A search for ultra-high energy photons with energies above 1 EeV is performed using nine years of data collected by the Pierre Auger Observatory in hybrid operation mode. An unprecedented separation power between photon and hadron primaries is achieved by combining measurements of the longitudinal air-shower development with the particle content at ground measured by the fluorescence and surface detectors, respectively. Only three photon candidates at energies 1 - 2 EeV are found, which is compatible with the expected hadron-induced background. Upper limits on the integral flux of ultra-high energy photons of 0.038, 0.010, 0.009, 0.008 and 0.007 km$^{-2}$ sr$^{-1}$ yr$^{-1}$ are derived at 95% C.L. for energy thresholds of 1, 2, 3, 5 and 10 EeV. These limits bound the fractions of photons in the all-particle integral flux below 0.14%, 0.17%, 0.42%, 0.86% and 2.9%. For the first time the photon fraction at EeV energies is constrained at the sub-percent level. The improved limits are below the flux of diffuse photons predicted by some astrophysical scenarios for cosmogenic photon production. The new results rule out the early top-down models $-$ in which ultra-high energy cosmic rays are produced by, e.g., the decay of super-massive particles $-$ and challenge the most recent super-heavy dark matter models.
Submitted 28 September, 2020; v1 submitted 5 December, 2016;
originally announced December 2016.
-
Multi-resolution anisotropy studies of ultrahigh-energy cosmic rays detected at the Pierre Auger Observatory
Authors:
The Pierre Auger Collaboration,
A. Aab,
P. Abreu,
M. Aglietta,
I. Al Samarai,
I. F. M. Albuquerque,
I. Allekotte,
A. Almela,
J. Alvarez Castillo,
J. Alvarez-Muñiz,
G. A. Anastasi,
L. Anchordoqui,
B. Andrada,
S. Andringa,
C. Aramo,
F. Arqueros,
N. Arsene,
H. Asorey,
P. Assis,
J. Aublin,
G. Avila,
A. M. Badescu,
A. Balaceanu,
R. J. Barreira Luz,
C. Baus
, et al. (378 additional authors not shown)
Abstract:
We report a multi-resolution search for anisotropies in the arrival directions of cosmic rays detected at the Pierre Auger Observatory with local zenith angles up to $80^\circ$ and energies in excess of 4 EeV ($4 \times 10^{18}$ eV). This search is conducted by measuring the angular power spectrum and performing a needlet wavelet analysis in two independent energy ranges. Both analyses are complementary since the angular power spectrum achieves a better performance in identifying large-scale patterns while the needlet wavelet analysis, considering the parameters used in this work, presents a higher efficiency in detecting smaller-scale anisotropies, potentially providing directional information on any observed anisotropies. No deviation from isotropy is observed on any angular scale in the energy range between 4 and 8 EeV. Above 8 EeV, an indication for a dipole moment is captured, while no other deviation from isotropy is observed for moments beyond the dipole one. The corresponding $p$-values, obtained after accounting for searches blindly performed at several angular scales, are $1.3 \times 10^{-5}$ in the case of the angular power spectrum, and $2.5 \times 10^{-3}$ in the case of the needlet analysis. While these results are consistent with previous reports making use of the same data set, they provide extensions of the previous works through the thorough scans of the angular scales.
Submitted 20 June, 2017; v1 submitted 21 November, 2016;
originally announced November 2016.
-
Testing Hadronic Interactions at Ultrahigh Energies with Air Showers Measured by the Pierre Auger Observatory
Authors:
The Pierre Auger Collaboration,
A. Aab,
P. Abreu,
M. Aglietta,
E. J. Ahn,
I. Al Samarai,
I. F. M. Albuquerque,
I. Allekotte,
J. Allen,
P. Allison,
A. Almela,
J. Alvarez Castillo,
J. Alvarez-Muñiz,
M. Ambrosio,
G. A. Anastasi,
L. Anchordoqui,
B. Andrada,
S. Andringa,
C. Aramo,
F. Arqueros,
N. Arsene,
H. Asorey,
P. Assis,
J. Aublin,
G. Avila
, et al. (413 additional authors not shown)
Abstract:
Ultrahigh energy cosmic ray air showers probe particle physics at energies beyond the reach of accelerators. Here we introduce a new method to test hadronic interaction models without relying on the absolute energy calibration, and apply it to events with primary energy 6-16 EeV ($E_{\rm CM}$ = 110-170 TeV), whose longitudinal development and lateral distribution were simultaneously measured by the Pierre Auger Observatory. The average hadronic shower is $1.33 \pm 0.16$ ($1.61 \pm 0.21$) times larger than predicted using the leading LHC-tuned models EPOS-LHC (QGSJetII-04), with a corresponding excess of muons.
Submitted 31 October, 2016; v1 submitted 26 October, 2016;
originally announced October 2016.
-
Evidence for a mixed mass composition at the `ankle' in the cosmic-ray spectrum
Authors:
The Pierre Auger Collaboration,
A. Aab,
P. Abreu,
M. Aglietta,
E. J. Ahn,
I. Al Samarai,
I. F. M. Albuquerque,
I. Allekotte,
P. Allison,
A. Almela,
J. Alvarez Castillo,
J. Alvarez-Muñiz,
M. Ambrosio,
G. A. Anastasi,
L. Anchordoqui,
B. Andrada,
S. Andringa,
C. Aramo,
F. Arqueros,
N. Arsene,
H. Asorey,
P. Assis,
J. Aublin,
G. Avila,
A. M. Badescu,
et al. (401 additional authors not shown)
Abstract:
We report a first measurement for ultra-high energy cosmic rays of the correlation between the depth of shower maximum and the signal in the water Cherenkov stations of air-showers registered simultaneously by the fluorescence and the surface detectors of the Pierre Auger Observatory. Such a correlation measurement is a unique feature of a hybrid air-shower observatory with sensitivity to both the electromagnetic and muonic components. It allows an accurate determination of the spread of primary masses in the cosmic-ray flux. Until now, constraints on the spread of primary masses have been dominated by systematic uncertainties. The present correlation measurement is not affected by systematics in the measurement of the depth of shower maximum or the signal in the water Cherenkov stations. The analysis relies on general characteristics of air showers and is thus also robust with respect to uncertainties in hadronic event generators. The observed correlation in the energy range around the `ankle' at $\lg(E/{\rm eV})=18.5-19.0$ differs significantly from expectations for pure primary cosmic-ray compositions. A light composition made up of proton and helium only is equally inconsistent with observations. The data are explained well by a mixed composition including nuclei with mass $A > 4$. Scenarios such as the proton dip model, with almost pure compositions, are thus disfavoured as the sole explanation of the ultrahigh-energy cosmic-ray flux at Earth.
Submitted 22 November, 2016; v1 submitted 27 September, 2016;
originally announced September 2016.