-
Detecting Anomalous Cryptocurrency Transactions: an AML/CFT Application of Machine Learning-based Forensics
Authors:
Nadia Pocher,
Mirko Zichichi,
Fabio Merizzi,
Muhammad Zohaib Shafiq,
Stefano Ferretti
Abstract:
In shaping the Internet of Money, the application of blockchain and distributed ledger technologies (DLTs) to the financial sector triggered regulatory concerns. Notably, while the user anonymity enabled in this field may safeguard privacy and data protection, the lack of identifiability hinders accountability and challenges the fight against money laundering and the financing of terrorism and proliferation (AML/CFT). As law enforcement agencies and the private sector apply forensics to track crypto transfers across ecosystems that are socio-technical in nature, this paper focuses on the growing relevance of these techniques in a domain where their deployment impacts the traits and evolution of the sphere. In particular, this work offers contextualized insights into the application of methods of machine learning and transaction graph analysis. Namely, it analyzes a real-world dataset of Bitcoin transactions represented as a directed graph network through various techniques. The modeling of blockchain transactions as a complex network suggests that the use of graph-based data analysis methods can help classify transactions and identify illicit ones. Indeed, this work shows that the neural network types known as Graph Convolutional Networks (GCN) and Graph Attention Networks (GAT) are a promising AML/CFT solution. Notably, in this scenario GCN outperform other classic approaches and GAT are applied for the first time to detect anomalies in Bitcoin. Ultimately, the paper upholds the value of public-private synergies to devise forensic strategies conscious of the spirit of explainability and data openness.
Submitted 18 March, 2023; v1 submitted 7 June, 2022;
originally announced June 2022.
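The abstract above names Graph Convolutional Networks without describing the architecture, so the following is only a minimal sketch of one standard GCN propagation step, ReLU(D^-1/2 (A + I) D^-1/2 H W), and not the authors' model. The toy graph, the feature values, and the names `matmul` and `gcn_layer` are hypothetical. The sketch also assumes a symmetric adjacency matrix, whereas the paper models transactions as a directed graph.

```python
import math


def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def gcn_layer(adj, features, weights):
    """One GCN propagation step: ReLU(D^-1/2 (A + I) D^-1/2 H W).

    adj      : n x n symmetric 0/1 adjacency matrix (assumption: undirected)
    features : n x f node feature matrix H
    weights  : f x h weight matrix W (learned in a real model, fixed here)
    """
    n = len(adj)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    # Node degrees of the self-looped graph.
    deg = [sum(row) for row in a_hat]
    # Symmetric normalization of the adjacency matrix.
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    # Aggregate neighbor features, apply the linear map, then ReLU.
    out = matmul(matmul(norm, features), weights)
    return [[max(0.0, v) for v in row] for row in out]


# Hypothetical toy graph: three transactions in a chain 0 - 1 - 2,
# with a single feature that is non-zero only for transaction 0.
toy_adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
toy_features = [[1.0], [0.0], [0.0]]
toy_weights = [[1.0]]
toy_out = gcn_layer(toy_adj, toy_features, toy_weights)
```

In this toy graph the feature of transaction 0 reaches its direct neighbor but not transaction 2, which is two hops away; stacking a second layer would extend the reach by one more hop.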
-
A novel attention model for salient structure detection in seismic volumes
Authors:
Muhammad Amir Shafiq,
Zhiling Long,
Haibin Di,
Ghassan AlRegib
Abstract:
A new approach to seismic interpretation is proposed to leverage visual perception and human visual system modeling. Specifically, a saliency detection algorithm based on a novel attention model is proposed for identifying subsurface structures within seismic data volumes. The algorithm employs 3D-FFT and a multi-dimensional spectral projection, which decomposes local spectra into three distinct components, each depicting variations along different dimensions of the data. Subsequently, a novel directional center-surround attention model is proposed to incorporate directional comparisons around each voxel for saliency detection within each projected dimension. Next, the resulting saliency maps along each dimension are combined adaptively to yield a consolidated saliency map, which highlights various structures characterized by subtle variations and relative motion with respect to their neighboring sections. A priori information about the seismic data can be either embedded into the proposed attention model in the directional comparisons, or incorporated into the algorithm by specifying a template when combining saliency maps adaptively. Experimental results on two real seismic datasets from the North Sea, Netherlands and Great South Basin, New Zealand demonstrate the effectiveness of the proposed algorithm for detecting salient seismic structures of different natures and appearances in one shot, which differs significantly from traditional seismic interpretation algorithms. The results further demonstrate that the proposed method outperforms comparable state-of-the-art saliency detection algorithms for natural images and videos, which are inadequate for seismic imaging data.
Submitted 16 January, 2022;
originally announced January 2022.
-
DiPSeN: Differentially Private Self-normalizing Neural Networks For Adversarial Robustness in Federated Learning
Authors:
Olakunle Ibitoye,
M. Omair Shafiq,
Ashraf Matrawy
Abstract:
The need for robust, secure and private machine learning is an important goal for realizing the full potential of the Internet of Things (IoT). Federated learning has proven to help protect against privacy violations and information leakage. However, it introduces new risk vectors which make machine learning models more difficult to defend against adversarial samples. In this study, we examine the role of differential privacy and self-normalization in mitigating the risk of adversarial samples specifically in a federated learning environment. We introduce DiPSeN, a Differentially Private Self-normalizing Neural Network which combines elements of differential privacy noise with self-normalizing techniques. Our empirical results on three publicly available datasets show that DiPSeN successfully improves the adversarial robustness of a deep learning classifier in a federated learning environment based on several evaluation metrics.
Submitted 8 January, 2021;
originally announced January 2021.
-
A GAN-based Approach for Mitigating Inference Attacks in Smart Home Environment
Authors:
Olakunle Ibitoye,
Ashraf Matrawy,
M. Omair Shafiq
Abstract:
The proliferation of smart, connected, always-listening devices has introduced significant privacy risks to users in a smart home environment. Beyond the notable risk of eavesdropping, intruders can adopt machine learning techniques to infer sensitive information from audio recordings on these devices, resulting in a new dimension of privacy concerns and attack variables for smart home users. Techniques such as sound masking and microphone jamming have been effectively used to prevent eavesdroppers from listening in to private conversations. In this study, we explore the problem of adversaries spying on smart home users to infer sensitive information with the aid of machine learning techniques. We then analyze the role of randomness in the effectiveness of sound masking for mitigating sensitive information leakage. We propose a Generative Adversarial Network (GAN) based approach for privacy preservation in smart homes which generates random noise to distort the unwanted machine learning-based inference. Our experimental results demonstrate that GANs can be used to generate more effective sound masking noise signals which exhibit more randomness and effectively mitigate deep learning-based inference attacks while preserving the semantics of the audio samples.
Submitted 12 November, 2020;
originally announced November 2020.
-
The Threat of Adversarial Attacks on Machine Learning in Network Security -- A Survey
Authors:
Olakunle Ibitoye,
Rana Abou-Khamis,
Mohamed el Shehaby,
Ashraf Matrawy,
M. Omair Shafiq
Abstract:
Machine learning models have made many decision support systems faster, more accurate, and more efficient. However, applications of machine learning in network security face a disproportionate threat of active adversarial attacks compared to other domains. This is because machine learning applications in network security such as malware detection, intrusion detection, and spam filtering are by themselves adversarial in nature. In what could be considered an arms race between attackers and defenders, adversaries constantly probe machine learning systems with inputs that are explicitly designed to bypass the system and induce a wrong prediction. In this survey, we first provide a taxonomy of machine learning techniques, tasks, and depth. We then introduce a classification of machine learning in network security applications. Next, we examine various adversarial attacks against machine learning in network security and introduce two classification approaches for adversarial attacks in network security. First, we classify adversarial attacks in network security based on a taxonomy of network security applications. Second, we categorize adversarial attacks in network security into a problem space vs feature space dimensional classification model. We then analyze the various defenses against adversarial attacks on machine learning-based network security applications. We conclude by introducing an adversarial risk grid map and evaluating several existing adversarial attacks against machine learning in network security using the risk grid map. We also identify where each attack classification resides within the adversarial risk grid map.
Submitted 21 March, 2023; v1 submitted 6 November, 2019;
originally announced November 2019.
-
Saliency detection for seismic applications using multi-dimensional spectral projections and directional comparisons
Authors:
Muhammad Amir Shafiq,
Zhiling Long,
Tariq Alshawi,
Ghassan AlRegib
Abstract:
In this paper, we propose a novel approach for saliency detection for seismic applications using 3D-FFT local spectra and multi-dimensional plane projections. We develop a projection scheme by dividing a 3D-FFT local spectrum of a data volume into three distinct components, each depicting changes along a different dimension of the data. The saliency detection results obtained using each projected component are then combined to yield a saliency map. To accommodate the directional nature of seismic data, in this work, we modify the center-surround model, proven to be biologically plausible for visual attention, to incorporate directional comparisons around each voxel in a 3D volume. Experimental results on a real seismic dataset from the F3 block, offshore the Netherlands in the North Sea, prove that the proposed algorithm is effective, efficient, and scalable. Furthermore, a subjective comparison of the results shows that it outperforms the state-of-the-art methods for saliency detection.
Submitted 30 January, 2019;
originally announced January 2019.
-
SalSi: A new seismic attribute for salt dome detection
Authors:
Muhammad Amir Shafiq,
Tariq Alshawi,
Zhiling Long,
Ghassan AlRegib
Abstract:
In this paper, we propose a saliency-based attribute, SalSi, to detect salt dome bodies within seismic volumes. SalSi is based on saliency theory and modeling of the human vision system (HVS). In this work, we aim to highlight the parts of the seismic volume that receive the highest attention from the human interpreter, and based on the salient features of a seismic image, we detect the salt domes. Experimental results show the effectiveness of SalSi on a real seismic dataset acquired from the North Sea, F3 block. Subjectively, we have used the ground truth and the output of different salt dome delineation algorithms to validate the results of SalSi. For the objective evaluation of results, we have used receiver operating characteristic (ROC) curves and the area under the curve (AUC) to demonstrate that SalSi is a promising and effective attribute for seismic interpretation.
Submitted 9 January, 2019;
originally announced January 2019.
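The SalSi abstract above reports an objective evaluation with ROC curves and AUC. As a rough illustration of what that metric measures, and not the authors' evaluation code, the sketch below computes AUC with the pairwise-ranking formulation, which is equivalent to the area under the ROC curve: the fraction of (positive, negative) pairs in which the positive sample gets the higher score, with ties counted as half. The function name `roc_auc` and the sample scores are hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise ranking.

    scores : detector outputs, higher means "more likely positive"
             (for example, a saliency value per pixel)
    labels : ground-truth labels, 1 for positive and 0 for negative
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    # Count correctly ordered pairs; a tie contributes half a pair.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical scores for four pixels, two inside a salt dome (label 1).
example_auc = roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

This pairwise version is quadratic in the number of samples, which is acceptable for a small illustration; evaluation over full seismic sections would normally use a sort-based implementation.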
-
The role of visual saliency in the automation of seismic interpretation
Authors:
Muhammad Amir Shafiq,
Tariq Alshawi,
Zhiling Long,
Ghassan AlRegib
Abstract:
In this paper, we propose a workflow based on SalSi for the detection and delineation of geological structures such as salt domes. SalSi is a seismic attribute designed based on the modeling of the human visual system that detects the salient features and captures the spatial correlation within seismic volumes for delineating seismic structures. Using SalSi, we can not only highlight the neighboring regions of salt domes to assist a seismic interpreter but also delineate such structures using a region growing method and post-processing. The proposed delineation workflow detects the salt-dome boundary with very good precision and accuracy. Experimental results show the effectiveness of the proposed workflow on a real seismic dataset acquired from the North Sea, F3 block. For the subjective evaluation of the results of different salt-dome delineation algorithms, we have used a reference salt-dome boundary interpreted by a geophysicist. For the objective evaluation of results, we have used five different metrics based on pixels, shape, and curvedness to establish the effectiveness of the proposed workflow. The proposed workflow is not only fast but also yields better results than other salt-dome delineation algorithms and shows promising potential in seismic interpretation.
Submitted 31 December, 2018;
originally announced December 2018.
-
Subsurface structure analysis using computational interpretation and learning: A visual signal processing perspective
Authors:
G. AlRegib,
M. Deriche,
Z. Long,
H. Di,
Z. Wang,
Y. Alaudah,
M. Shafiq,
M. Alfarraj
Abstract:
Understanding Earth's subsurface structures has been and continues to be an essential component of various applications such as environmental monitoring, carbon sequestration, and oil and gas exploration. By viewing the seismic volumes that are generated through the processing of recorded seismic traces, researchers were able to learn from applying advanced image processing and computer vision algorithms to effectively analyze and understand Earth's subsurface structures. In this paper, first, we summarize the recent advances in this direction that relied heavily on the fields of image processing and computer vision. Second, we discuss the challenges in seismic interpretation and provide insights and some directions to address such challenges using emerging machine learning algorithms.
Submitted 20 December, 2018;
originally announced December 2018.
-
Measuring, Characterizing, and Detecting Facebook Like Farms
Authors:
Muhammad Ikram,
Lucky Onwuzurike,
Shehroze Farooqi,
Emiliano De Cristofaro,
Arik Friedman,
Guillaume Jourjon,
Dali Kaafar,
M. Zubair Shafiq
Abstract:
Social networks offer convenient ways to seamlessly reach out to large audiences. In particular, Facebook pages are increasingly used by businesses, brands, and organizations to connect with multitudes of users worldwide. As the number of likes of a page has become a de-facto measure of its popularity and profitability, an underground market of services artificially inflating page likes, aka like farms, has emerged alongside Facebook's official targeted advertising platform. Nonetheless, there is little work that systematically analyzes Facebook pages' promotion methods. Aiming to fill this gap, we present a honeypot-based comparative measurement study of page likes garnered via Facebook advertising and from popular like farms. First, we analyze likes based on demographic, temporal, and social characteristics, and find that some farms seem to be operated by bots and do not really try to hide the nature of their operations, while others follow a stealthier approach, mimicking regular users' behavior. Next, we look at fraud detection algorithms currently deployed by Facebook and show that they do not work well to detect stealthy farms which spread likes over longer timespans and like popular pages to mimic regular users. To overcome their limitations, we investigate the feasibility of timeline-based detection of like farm accounts, focusing on characterizing content generated by Facebook accounts on their timelines as an indicator of genuine versus fake social activity. We analyze a range of features, grouped into two main categories: lexical and non-lexical. We find that like farm accounts tend to re-share content, use fewer words and poorer vocabulary, and more often generate duplicate comments and likes compared to normal users. Using relevant lexical and non-lexical features, we build a classifier to detect like farm accounts that achieves precision higher than 99% and 93% recall.
Submitted 4 July, 2017; v1 submitted 1 July, 2017;
originally announced July 2017.
-
AZ Model for Software Development
Authors:
Ahmed Mateen,
Muhammad Azeem,
Mohammad Shafiq
Abstract:
Nowadays computer systems have become essential and are commonly used in every field of life. Computers save time and are used to solve complex and extensive problems quickly and efficiently. For this purpose, software programs are developed to facilitate the work of administrators, offices, banks, etc. Quality is the most important factor, as it largely defines customer satisfaction, which is directly related to the success of a project, and many approaches (methodologies) have been developed for this purpose. The main aim of this paper is to propose a new methodology for software development that focuses on quality improvement for all kinds of products. This study also discusses the features and limitations of traditional methodologies such as waterfall, iterative, spiral, RUP, and Agile, and shows how the new methodology improves on them.
Submitted 28 December, 2016;
originally announced December 2016.
-
Combating Fraud in Online Social Networks: Detecting Stealthy Facebook Like Farms
Authors:
Muhammad Ikram,
Lucky Onwuzurike,
Shehroze Farooqi,
Emiliano De Cristofaro,
Arik Friedman,
Guillaume Jourjon,
Mohammad Ali Kaafar,
M. Zubair Shafiq
Abstract:
As businesses increasingly rely on social networking sites to engage with their customers, it is crucial to understand and counter reputation manipulation activities, including fraudulently boosting the number of Facebook page likes using like farms. To this end, several fraud detection algorithms have been proposed and some deployed by Facebook that use graph co-clustering to distinguish between genuine likes and those generated by farm-controlled profiles. However, as we show in this paper, these tools do not work well with stealthy farms whose users spread likes over longer timespans and like popular pages, aiming to mimic regular users. We present an empirical analysis of the graph-based detection tools used by Facebook and highlight their shortcomings against more sophisticated farms. Next, we focus on characterizing content generated by social network accounts on their timelines, as an indicator of genuine versus fake social activity. We analyze a wide range of features extracted from timeline posts, which we group into two main classes: lexical and non-lexical. We postulate and verify that like farm accounts tend to often re-share content, use fewer words and poorer vocabulary, and more often generate duplicate comments and likes compared to normal users. We extract relevant lexical and non-lexical features and use them to build a classifier to detect like farm accounts, achieving significantly higher accuracy, namely, at least 99% precision and 93% recall.
Submitted 9 May, 2016; v1 submitted 1 June, 2015;
originally announced June 2015.
-
Characterizing Key Stakeholders in an Online Black-Hat Marketplace
Authors:
Shehroze Farooqi,
Muhammad Ikram,
Emiliano De Cristofaro,
Arik Friedman,
Guillaume Jourjon,
Mohamed Ali Kaafar,
M. Zubair Shafiq,
Fareed Zaffar
Abstract:
Over the past few years, many black-hat marketplaces have emerged that facilitate access to reputation manipulation services such as fake Facebook likes, fraudulent search engine optimization (SEO), or bogus Amazon reviews. In order to deploy effective technical and legal countermeasures, it is important to understand how these black-hat marketplaces operate, shedding light on the services they offer, who is selling, who is buying, what they are buying, who is more successful, why they are successful, etc. Toward this goal, in this paper, we present a detailed micro-economic analysis of a popular online black-hat marketplace, namely, SEOClerks.com. As the site provides non-anonymized transaction information, we set out to analyze the selling and buying behavior of individual users, propose a strategy to identify key users, and study their tactics as compared to other (non-key) users. We find that key users: (1) are mostly located in Asian countries, (2) are focused more on selling black-hat SEO services, (3) tend to list more lower-priced services, and (4) sometimes buy services from other sellers and then sell at higher prices. Finally, we discuss the implications of our analysis with respect to devising effective economic and legal intervention strategies against marketplace operators and key users.
Submitted 4 April, 2017; v1 submitted 7 May, 2015;
originally announced May 2015.
-
Paying for Likes? Understanding Facebook Like Fraud Using Honeypots
Authors:
Emiliano De Cristofaro,
Arik Friedman,
Guillaume Jourjon,
Mohamed Ali Kaafar,
M. Zubair Shafiq
Abstract:
Facebook pages offer an easy way to reach out to a very large audience as they can easily be promoted using Facebook's advertising platform. Recently, the number of likes of a Facebook page has become a measure of its popularity and profitability, and an underground market of services boosting page likes, aka like farms, has emerged. Some reports have suggested that like farms use a network of profiles that also like other pages to elude fraud protection algorithms; however, to the best of our knowledge, there has been no systematic analysis of Facebook pages' promotion methods.
This paper presents a comparative measurement study of page likes garnered via Facebook ads and by a few like farms. We deploy a set of honeypot pages, promote them using both methods, and analyze garnered likes based on likers' demographic, temporal, and social characteristics. We highlight a few interesting findings, including that some farms seem to be operated by bots and do not really try to hide the nature of their operations, while others follow a stealthier approach, mimicking regular users' behavior.
Submitted 4 October, 2014; v1 submitted 7 September, 2014;
originally announced September 2014.
-
Modeling Morphology of Social Network Cascades
Authors:
M. Zubair Shafiq,
Alex X. Liu
Abstract:
Cascades represent an important phenomenon across various disciplines such as sociology, economy, psychology, political science, marketing, and epidemiology. An important property of cascades is their morphology, which encompasses the structure, shape, and size. However, cascade morphology has not been rigorously characterized and modeled in prior literature. In this paper, we propose a Multi-order Markov Model for the Morphology of Cascades ($M^4C$) that can represent and quantitatively characterize the morphology of cascades with arbitrary structures, shapes, and sizes. $M^4C$ can be used in a variety of applications to classify different types of cascades. To demonstrate this, we apply it to an unexplored but important problem in online social networks -- cascade size prediction. Our evaluations using real-world Twitter data show that the $M^4C$-based cascade size prediction scheme outperforms the baseline scheme based on cascade graph features such as edge growth rate, degree distribution, clustering, and diameter. The $M^4C$-based cascade size prediction scheme consistently achieves more than 90% classification accuracy under different experimental scenarios.
Submitted 10 February, 2013;
originally announced February 2013.