-
WavePulse: Real-time Content Analytics of Radio Livestreams
Authors:
Govind Mittal,
Sarthak Gupta,
Shruti Wagle,
Chirag Chopra,
Anthony J DeMattee,
Nasir Memon,
Mustaque Ahamad,
Chinmay Hegde
Abstract:
Radio remains a pervasive medium for mass information dissemination, with AM/FM stations reaching more Americans than either smartphone-based social networking or live television. Increasingly, radio broadcasts are also streamed online and accessed over the Internet. We present WavePulse, a framework that records, documents, and analyzes radio content in real-time. While our framework is generally applicable, we showcase the efficacy of WavePulse in a collaborative project with a team of political scientists focusing on the 2024 Presidential Elections. We use WavePulse to monitor livestreams of 396 news radio stations over a period of three months, processing close to 500,000 hours of audio streams. These streams were converted into time-stamped, diarized transcripts and analyzed to answer key political science questions at both the national and state levels. Our analysis revealed how local issues interacted with national trends, providing insights into information flow. Our results demonstrate WavePulse's efficacy in capturing and analyzing content from radio livestreams sourced from the Web. Code and dataset can be accessed at https://wave-pulse.io.
Submitted 29 January, 2025; v1 submitted 23 December, 2024;
originally announced December 2024.
-
Evaluating Synthetic Command Attacks on Smart Voice Assistants
Authors:
Zhengxian He,
Ashish Kundu,
Mustaque Ahamad
Abstract:
Recent advances in voice synthesis, coupled with the ease with which speech can be harvested for millions of people, introduce new threats to applications that are enabled by devices such as voice assistants (e.g., Amazon Alexa, Google Home, etc.). We explore whether an unrelated and limited amount of speech from a target can be used to synthesize commands for a voice assistant like Amazon Alexa. More specifically, we investigate attacks on voice assistants with synthetic commands when they match command sources to authorized users, and applications (e.g., Alexa Skills) process commands only when their source is an authorized user with a chosen confidence level. We demonstrate that even simple concatenative speech synthesis can be used by an attacker to command voice assistants to perform sensitive operations. We also show that such attacks, when launched by exploiting compromised devices in the vicinity of voice assistants, can have a relatively small host and network footprint. Our results demonstrate the need for better defenses against synthetic malicious commands that could target voice assistants.
Submitted 14 November, 2024; v1 submitted 12 November, 2024;
originally announced November 2024.
-
SoK: An Essential Guide For Using Malware Sandboxes In Security Applications: Challenges, Pitfalls, and Lessons Learned
Authors:
Omar Alrawi,
Miuyin Yong Wong,
Athanasios Avgetidis,
Kevin Valakuzhy,
Boladji Vinny Adjibi,
Konstantinos Karakatsanis,
Mustaque Ahamad,
Doug Blough,
Fabian Monrose,
Manos Antonakakis
Abstract:
Malware sandboxes provide many benefits for security applications, but they are complex. These complexities can overwhelm new users in different research areas and make it difficult to select, configure, and use sandboxes. Even worse, incorrectly using sandboxes can have a negative impact on security applications. In this paper, we address this knowledge gap by systematizing 84 representative papers for using x86/64 malware sandboxes in the academic literature. We propose a novel framework to simplify sandbox components and organize the literature to derive practical guidelines for using sandboxes. We evaluate the proposed guidelines systematically using three common security applications and demonstrate that the choice of different sandboxes can significantly impact the results. Specifically, our results show that the proposed guidelines improve the sandbox observable activities by at least 1.6x and up to 11.3x. Furthermore, we observe a roughly 25% improvement in accuracy, precision, and recall when using the guidelines to help with a malware family classification task. We conclude by affirming that there is no "silver bullet" sandbox deployment that generalizes, and we recommend that users apply our framework to define a scope for their analysis and a threat model, and to derive context about how the sandbox artifacts will influence their intended use case. Finally, it is important that users document their experiment, limitations, and potential solutions for reproducibility.
Submitted 24 March, 2024;
originally announced March 2024.
-
Corrective or Backfire: Characterizing and Predicting User Response to Social Correction
Authors:
Bing He,
Yingchen Ma,
Mustaque Ahamad,
Srijan Kumar
Abstract:
Online misinformation poses a global risk with harmful implications for society. Ordinary social media users are known to actively reply to misinformation posts with counter-misinformation messages, which is shown to be effective in containing the spread of misinformation. Such a practice is defined as "social correction". Nevertheless, it remains unknown how users respond to social correction in real-world scenarios, in particular whether it has a corrective or backfire effect on users. Investigating this research question is pivotal for developing and refining strategies that maximize the efficacy of social correction initiatives. To fill this gap, we conduct an in-depth study to characterize and predict the user response to social correction in a data-driven manner through the lens of X (formerly Twitter), where the user response is instantiated as the reply that is written toward a counter-misinformation message. In particular, we first create a novel dataset with 55,549 triples of misinformation tweets, counter-misinformation replies, and responses to counter-misinformation replies, and then curate a taxonomy to illustrate different kinds of user responses. Next, fine-grained statistical analysis of reply linguistic and engagement features as well as repliers' user attributes is conducted to illustrate the characteristics that are significant in determining whether a reply will have a corrective or backfire effect. Finally, we build a user response prediction model to identify whether a social correction will be corrective, neutral, or have a backfire effect, which achieves a promising F1 score of 0.816. Our work enables stakeholders to monitor and predict user responses effectively, thus guiding the use of social correction to maximize its corrective impact and minimize backfire effects. The code and data are accessible at https://github.com/claws-lab/response-to-social-correction.
Submitted 7 March, 2024;
originally announced March 2024.
-
Mie-scattering controlled all-dielectric resonator-antenna for bright and directional point dipole emission
Authors:
Mohammed Ashahar Ahamad,
Sughra Shaikh,
Faraz Ahmed Inam
Abstract:
Designing a deterministic, bright, robust, room-temperature stable, on-demand solid-state single photon source has been a major demand in the field of quantum photonics. For this, various single-photon resonator and antenna schemes are being actively explored. Here, using the Cartesian multi-polar decomposition of the excited Mie-scattering moments, we present the design of an all-dielectric coupled-dipolar antenna comprising two dielectric (titanium dioxide, TiO$_2$) cylinders sandwiching a nanodiamond-based nitrogen-vacancy (NV$^-$) center trapped in a poly-vinyl alcohol (PVA) matrix. The Mie-scattering resonant cavity formed in the middle PVA layer provides a decay rate (Purcell) enhancement of more than an order of magnitude. The balancing of the electric and magnetic dipolar moments (a phenomenon commonly known as the Kerker condition) of the coupled TiO$_2$ cylinders under NV$^-$ dipole excitation provides significant directionality to the radiation pattern. Using a collection lens with a numerical aperture (NA) of 0.9, the vertical collection efficiency (VCE) was observed to be around 80% at the NV$^-$ center's zero-phonon line wavelength.
Submitted 10 November, 2023; v1 submitted 8 November, 2023;
originally announced November 2023.
-
Reinforcement Learning-based Counter-Misinformation Response Generation: A Case Study of COVID-19 Vaccine Misinformation
Authors:
Bing He,
Mustaque Ahamad,
Srijan Kumar
Abstract:
The spread of online misinformation threatens public health, democracy, and the broader society. While professional fact-checkers form the first line of defense by fact-checking popular false claims, they do not engage directly in conversations with misinformation spreaders. On the other hand, non-expert ordinary users act as eyes-on-the-ground who proactively counter misinformation -- recent research has shown that 96% of counter-misinformation responses are made by ordinary users. However, research also found that two-thirds of the time, these responses are rude and lack evidence. This work seeks to create a counter-misinformation response generation model to empower users to effectively correct misinformation. This objective is challenging due to the absence of datasets containing ground-truth of ideal counter-misinformation responses, and the lack of models that can generate responses backed by communication theories. In this work, we create two novel datasets of misinformation and counter-misinformation response pairs from in-the-wild social media and crowdsourcing from college-educated students. We annotate the collected data to distinguish poor from ideal responses that are factual, polite, and refute misinformation. We propose MisinfoCorrect, a reinforcement learning-based framework that learns to generate counter-misinformation responses for an input misinformation post. The model rewards the generator to increase the politeness, factuality, and refutation attitude while retaining text fluency and relevancy. Quantitative and qualitative evaluation shows that our model outperforms several baselines by generating high-quality counter-responses. This work illustrates the promise of generative text models for social good -- here, to help create a safe and reliable information ecosystem. The code and data are accessible at https://github.com/claws-lab/MisinfoCorrect.
Submitted 11 March, 2023;
originally announced March 2023.
-
Silicon Carbide Metasurfaces for Controlling the Spontaneous Emission of Embedded Color Centers
Authors:
Mohammed Ashahar Ahamad,
Faraz Ahmed Inam,
Stefania Castelletto
Abstract:
While electric and magnetic dipolar resonances in SiC have been studied in the far-infrared, they have not been studied in the near infrared. Here we show for the first time that electromagnetic Mie-scattering moments within SiC metasurfaces can control the spontaneous emission process of point defects in the near infrared. Using SiC nanopillar-based metasurfaces, we theoretically demonstrate control over the spontaneous emission rate of embedded color-centers by using the coherent superposition of the electric dipolar and magnetic quadrupolar electromagnetic Mie-scattering moments of the structure. More than an order of magnitude emission/decay rate enhancement is obtained, with the maximum enhancement close to 30. We also demonstrate that the relative phase of the Mie-scattering moments helps in controlling the emission directionality. SiC metasurfaces in the spectral range of color centres, from the visible to the near infrared, can be used to control the confinement and directionality of their spontaneous emission, increasing the opportunities to study light-matter interaction and to advance quantum photonic and quantum sensing device integration.
Submitted 12 January, 2023;
originally announced January 2023.
-
A survey, review, and future trends of skin lesion segmentation and classification
Authors:
Md. Kamrul Hasan,
Md. Asif Ahamad,
Choon Hwai Yap,
Guang Yang
Abstract:
The Computer-aided Diagnosis or Detection (CAD) approach for skin lesion analysis is an emerging field of research that has the potential to alleviate the burden and cost of skin cancer screening. Researchers have recently indicated increasing interest in developing such CAD systems, with the intention of providing a user-friendly tool to dermatologists to reduce the challenges encountered or associated with manual inspection. This article aims to provide a comprehensive literature survey and review of a total of 594 publications (356 for skin lesion segmentation and 238 for skin lesion classification) published between 2011 and 2022. These articles are analyzed and summarized in a number of different ways to contribute vital information regarding the methods for the development of CAD systems. These ways include relevant and essential definitions and theories, input data (dataset utilization, preprocessing, augmentations, and fixing imbalance problems), method configuration (techniques, architectures, module frameworks, and losses), training tactics (hyperparameter settings), and evaluation criteria. We intend to investigate a variety of performance-enhancing approaches, including ensemble and post-processing. We also discuss these dimensions to reveal their current trends based on utilization frequencies. In addition, we highlight the primary difficulties associated with evaluating skin lesion segmentation and classification systems using minimal datasets, as well as the potential solutions to these difficulties. Findings, recommendations, and trends are disclosed to inform future research on developing an automated and robust CAD system for skin lesion analysis.
Submitted 2 February, 2023; v1 submitted 25 August, 2022;
originally announced August 2022.
-
PETGEN: Personalized Text Generation Attack on Deep Sequence Embedding-based Classification Models
Authors:
Bing He,
Mustaque Ahamad,
Srijan Kumar
Abstract:
What should a malicious user write next to fool a detection model? Identifying malicious users is critical to ensure the safety and integrity of internet platforms. Several deep learning-based detection models have been created. However, malicious users can evade deep detection models by manipulating their behavior, rendering these models of little use. The vulnerability of such deep detection models against adversarial attacks is unknown. Here we create a novel adversarial attack model against deep user sequence embedding based classification models, which use the sequence of user posts to generate user embeddings and detect malicious users. In the attack, the adversary generates a new post to fool the classifier. We propose a novel end-to-end Personalized Text Generation Attack model, called PETGEN, that simultaneously reduces the efficacy of the detection model and generates posts that have several key desirable properties. Specifically, PETGEN generates posts that are personalized to the user's writing style, have knowledge about a given target context, are aware of the user's historical posts on the target context, and encapsulate the user's recent topical interests. We conduct extensive experiments on two real-world datasets (Yelp and Wikipedia, both with ground-truth of malicious users) to show that PETGEN significantly reduces the performance of popular deep user sequence embedding-based classification models. PETGEN outperforms five attack baselines in terms of text quality and attack efficacy in both white-box and black-box classifier settings. Overall, this work paves the path towards the next generation of adversary-aware sequence classification models.
Submitted 19 October, 2021; v1 submitted 14 September, 2021;
originally announced September 2021.
-
Predicting Patient COVID-19 Disease Severity by means of Statistical and Machine Learning Analysis of Blood Cell Transcriptome Data
Authors:
Sakifa Aktar,
Md. Martuza Ahamad,
Md. Rashed-Al-Mahfuz,
AKM Azad,
Shahadat Uddin,
A H M Kamal,
Salem A. Alyami,
Ping-I Lin,
Sheikh Mohammed Shariful Islam,
Julian M. W. Quinn,
Valsamma Eapen,
Mohammad Ali Moni
Abstract:
Introduction: For COVID-19 patients accurate prediction of disease severity and mortality risk would greatly improve care delivery and resource allocation. There are many patient-related factors, such as pre-existing comorbidities that affect disease severity. Since rapid automated profiling of peripheral blood samples is widely available, we investigated how such data from the peripheral blood of COVID-19 patients might be used to predict clinical outcomes.
Methods: We thus investigated such clinical datasets from COVID-19 patients with known outcomes by combining statistical comparison and correlation methods with machine learning algorithms; the latter included decision tree, random forest, variants of gradient boosting machine, support vector machine, K-nearest neighbour and deep learning methods.
Results: Our work revealed several clinical parameters measurable in blood samples, which discriminated between healthy people and COVID-19 positive patients and showed predictive value for later severity of COVID-19 symptoms. We thus developed a number of analytic methods that showed accuracy and precision for disease severity and mortality outcome predictions that were above 90%.
Conclusions: In sum, we developed methodologies to analyse patient routine clinical data that enable more accurate prediction of COVID-19 patient outcomes. This type of approach could, by employing standard hospital laboratory analyses of patient blood, be utilised to identify COVID-19 patients at high risk of mortality and so enable their treatment to be optimised.
Submitted 19 November, 2020;
originally announced November 2020.
-
The Role of the Crowd in Countering Misinformation: A Case Study of the COVID-19 Infodemic
Authors:
Nicholas Micallef,
Bing He,
Srijan Kumar,
Mustaque Ahamad,
Nasir Memon
Abstract:
Fact checking by professionals is viewed as a vital defense in the fight against misinformation. While fact checking is important and its impact has been significant, fact checks could have limited visibility and may not reach the intended audience, such as those deeply embedded in polarized communities. Concerned citizens (i.e., the crowd), who are users of the platforms where misinformation appears, can play a crucial role in disseminating fact-checking information and in countering the spread of misinformation. To explore if this is the case, we conduct a data-driven study of misinformation on the Twitter platform, focusing on tweets related to the COVID-19 pandemic, analyzing the spread of misinformation, professional fact checks, and the crowd response to popular misleading claims about COVID-19. In this work, we curate a dataset of false claims and statements that seek to challenge or refute them. We train a classifier to create a novel dataset of 155,468 COVID-19-related tweets, containing 33,237 false claims and 33,413 refuting arguments. Our findings show that professional fact-checking tweets have limited volume and reach. In contrast, we observe that the surge in misinformation tweets results in a quick response and a corresponding increase in tweets that refute such misinformation. More importantly, we find contrasting differences in the way the crowd refutes tweets: some tweets appear to be opinions, while others contain concrete evidence, such as a link to a reputed source. Our work provides insights into how misinformation is organically countered in social platforms by some of their users and the role they play in amplifying professional fact checks. These insights could lead to the development of tools and mechanisms that can empower concerned citizens in combating misinformation. The code and data can be found at http://claws.cc.gatech.edu/covid_counter_misinformation.html.
Submitted 11 November, 2020; v1 submitted 11 November, 2020;
originally announced November 2020.
-
Using Inaudible Audio and Voice Assistants to Transmit Sensitive Data over Telephony
Authors:
Zhengxian He,
Mohit Narayan Rajput,
Mustaque Ahamad
Abstract:
New security and privacy concerns arise due to the growing popularity of voice assistant (VA) deployments in home and enterprise networks. A number of past research results have demonstrated how malicious actors can use hidden commands to get VAs to perform certain operations even when a person may be in their vicinity. However, such work has not explored how compromised computers that are close to VAs can leverage the phone channel to exfiltrate data with the help of VAs. After characterizing the communication channel that is set up by commanding a VA to make a call to a phone number, we demonstrate how malware can encode data into audio and send it via the phone channel. Such an attack, which can be crafted remotely, at scale and at low cost, can be used to bypass network defenses that may be deployed against leakage of sensitive data. We use Dual-Tone Multi-Frequency tones to encode arbitrary binary data into audio that can be played over computer speakers and sent through a VA-mediated phone channel to a remote system. We show that modest amounts of data can be transmitted with high accuracy with a short phone call lasting a few minutes. This can be done while making the audio nearly inaudible for most people by modulating it with a carrier with frequencies that are near the higher end of the human hearing range. Several factors influence the data transfer rate, including the distance between the computer and the VA, the ambient noise that may be present, and the frequency of the modulating carrier. With the help of a prototype built by us, we experimentally assess the impact of these factors on data transfer rates and transmission accuracy. Our results show that voice assistants in the vicinity of computers can pose new threats to data stored on such computers. These threats are not addressed by traditional host and network defenses. We briefly discuss possible mitigations.
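As background for the Dual-Tone Multi-Frequency encoding mentioned in this abstract, the sketch below shows how bytes can be mapped onto the 16 standard DTMF symbols and their tone pairs. The nibble-per-symbol mapping, the function names, and the tone duration and sample rate are illustrative assumptions; the abstract does not describe the authors' actual scheme, and the carrier modulation step is not shown.

```python
import math

# Standard DTMF grid: each symbol is one low-group tone plus one high-group tone.
LOW_HZ = (697, 770, 852, 941)
HIGH_HZ = (1209, 1336, 1477, 1633)
SYMBOLS = "123A456B789C*0#D"  # row-major over the 4x4 grid

def bytes_to_symbols(data: bytes) -> str:
    """Map each byte to two symbols, one per 4-bit nibble (assumed scheme)."""
    return "".join(SYMBOLS[b >> 4] + SYMBOLS[b & 0x0F] for b in data)

def symbols_to_bytes(symbols: str) -> bytes:
    """Invert bytes_to_symbols; expects an even number of valid symbols."""
    if len(symbols) % 2:
        raise ValueError("expected an even number of symbols")
    nibbles = [SYMBOLS.index(s) for s in symbols]
    return bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))

def symbol_frequencies(symbol: str) -> tuple:
    """Return the (low, high) tone pair in Hz for one DTMF symbol."""
    idx = SYMBOLS.index(symbol)
    return LOW_HZ[idx // 4], HIGH_HZ[idx % 4]

def tone_samples(symbol: str, duration_s: float = 0.05, sample_rate: int = 8000) -> list:
    """Synthesize one symbol as the sum of its two sinusoids (assumed timing)."""
    low, high = symbol_frequencies(symbol)
    count = round(duration_s * sample_rate)
    return [
        0.5 * (math.sin(2 * math.pi * low * n / sample_rate)
               + math.sin(2 * math.pi * high * n / sample_rate))
        for n in range(count)
    ]
```

With four bits per symbol, the achievable data rate is bounded by the symbol duration plus any inter-symbol gap, which is consistent with the abstract's description of "modest amounts of data" per call.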
Submitted 21 September, 2020;
originally announced September 2020.
-
Machine Learning and Meta-Analysis Approach to Identify Patient Comorbidities and Symptoms that Increased Risk of Mortality in COVID-19
Authors:
Sakifa Aktar,
Ashis Talukder,
Md. Martuza Ahamad,
A. H. M. Kamal,
Jahidur Rahman Khan,
Md. Protikuzzaman,
Nasif Hossain,
Julian M. W. Quinn,
Mathew A. Summers,
Teng Liaw,
Valsamma Eapen,
Mohammad Ali Moni
Abstract:
Background: Providing appropriate care for people suffering from COVID-19, the disease caused by the pandemic SARS-CoV-2 virus, is a significant global challenge. Many individuals who become infected have pre-existing conditions that may interact with COVID-19 to increase symptom severity and mortality risk. COVID-19 patient comorbidities are likely to be informative about individual risk of severe illness and mortality. Accurately determining how comorbidities are associated with severe symptoms and mortality would thus greatly assist in COVID-19 care planning and provision.
Methods: To assess the interaction of patient comorbidities with COVID-19 severity and mortality we performed a meta-analysis of the published global literature, and machine learning predictive analysis using an aggregated COVID-19 global dataset.
Results: Our meta-analysis identified chronic obstructive pulmonary disease (COPD), cerebrovascular disease (CEVD), cardiovascular disease (CVD), type 2 diabetes, malignancy, and hypertension as most significantly associated with COVID-19 severity in the current published literature. Machine learning classification using novel aggregated cohort data similarly found COPD, CVD, CKD, type 2 diabetes, malignancy and hypertension, as well as asthma, as the most significant features for classifying those deceased versus those who survived COVID-19. While age and gender were the most significant predictors of mortality, in terms of symptom-comorbidity combinations, it was observed that Pneumonia-Hypertension, Pneumonia-Diabetes and Acute Respiratory Distress Syndrome (ARDS)-Hypertension showed the most significant effects on COVID-19 mortality.
Conclusions: These results highlight patient cohorts most at risk of COVID-19 related severe morbidity and mortality which have implications for prioritization of hospital resources.
Submitted 21 August, 2020;
originally announced August 2020.
-
Fighting Voice Spam with a Virtual Assistant Prototype
Authors:
Sharbani Pandit,
Jienan Liu,
Roberto Perdisci,
Mustaque Ahamad
Abstract:
Mass robocalls affect millions of people on a daily basis. Unfortunately, most current defenses against robocalls rely on phone blocklists and are ineffective against caller ID spoofing. To enable the detection of spoofed robocalls, we propose a virtual assistant application that could be integrated on smartphones to automatically vet incoming calls. Similar to a human assistant, the virtual assistant can pick up an incoming call and screen it without user interruption to determine if the call is unwanted. Via a user study, we show that our virtual assistant is able to preserve the user experience of a typical phone call. At the same time, we show that our system can detect mass robocalls without negatively impacting legitimate callers.
Submitted 8 August, 2020;
originally announced August 2020.
-
Building a Collaborative Phone Blacklisting System with Local Differential Privacy
Authors:
Daniele Ucci,
Roberto Perdisci,
Jaewoo Lee,
Mustaque Ahamad
Abstract:
Spam phone calls have been rapidly growing from nuisance to an increasingly effective scam delivery tool. To counter this increasingly successful attack vector, a number of commercial smartphone apps that promise to block spam phone calls have appeared on app stores, and are now used by hundreds of thousands or even millions of users. However, following a business model similar to some online social network services, these apps often collect call records or other potentially sensitive information from users' phones with little or no formal privacy guarantees.
In this paper, we study whether it is possible to build a practical collaborative phone blacklisting system that makes use of local differential privacy (LDP) mechanisms to provide clear privacy guarantees. We analyze the challenges and trade-offs related to using LDP, evaluate our LDP-based system on real-world user-reported call records collected by the FTC, and show that it is possible to learn a phone blacklist using a reasonable overall privacy budget and at the same time preserve users' privacy while maintaining utility for the learned blacklist.
Submitted 16 June, 2020;
originally announced June 2020.
-
Under the Shadow of Sunshine: Characterizing Spam Campaigns Abusing Phone Numbers Across Online Social Networks
Authors:
Srishti Gupta,
Dhruv Kuchhal,
Payas Gupta,
Mustaque Ahamad,
Manish Gupta,
Ponnurangam Kumaraguru
Abstract:
Cybercriminals abuse Online Social Networks (OSNs) to lure victims into a variety of spam. Among different spam types, a less explored area is OSN abuse that leverages the telephony channel to defraud users. Phone numbers are advertised via OSNs, and users are tricked into calling these numbers. To expand the reach of such scam / spam campaigns, phone numbers are advertised across multiple platforms like Facebook, Twitter, GooglePlus, Flickr, and YouTube. In this paper, we present the first data-driven characterization of cross-platform campaigns that use multiple OSN platforms to reach their victims and use phone numbers for monetization.
We collect 23M posts containing 1.8M unique phone numbers from Twitter, Facebook, GooglePlus, YouTube, and Flickr over a period of six months. Clustering these posts helps us identify 202 campaigns operating across the globe, with Indonesia, the United States, India, and the United Arab Emirates being the most prominent originators. We find that even though Indonesian campaigns generate the highest volume (3.2M posts), only 1.6% of the accounts propagating Indonesian campaigns have been suspended so far. By examining campaigns running across multiple OSNs, we discover that Twitter detects and suspends 93% more accounts than Facebook. Therefore, sharing intelligence about abuse-related user accounts across OSNs can aid in spam detection. According to our dataset, around 35K victims and 8.8M USD could have been saved if intelligence had been shared across the OSNs. By analyzing phone-number-based spam campaigns running on OSNs, we highlight the unexplored variety of phone-based attacks surfacing on OSNs.
Submitted 2 April, 2018;
originally announced April 2018.
-
By Hook or by Crook: Exposing the Diverse Abuse Tactics of Technical Support Scammers
Authors:
Bharat Srinivasan,
Athanasios Kountouras,
Najmeh Miramirkhani,
Monjur Alam,
Nick Nikiforakis,
Manos Antonakakis,
Mustaque Ahamad
Abstract:
Technical Support Scams (TSS), which combine online abuse with social engineering over the phone channel, have persisted despite several law enforcement actions. The tactics used by these scammers have evolved over time and they have targeted an ever-increasing number of technology brands. Although recent research has provided insights into TSS, these scams have now evolved to exploit ubiquitously used online services such as search and sponsored advertisements served in response to search queries. We use a data-driven approach to understand search-and-ad abuse by TSS to gain visibility into the online infrastructure that facilitates it. By carefully formulating tech support queries with multiple search engines, we collect data about both the support infrastructure and the websites to which TSS victims are directed when they search online for tech support resources. We augment this with a DNS-based amplification technique to further enhance visibility into this abuse infrastructure. By analyzing the collected data, we demonstrate that tech support scammers are (1) successful in getting major as well as custom search engines to return links to websites controlled by them, and (2) able to get ad networks to serve malicious advertisements that lead to scam pages. Our study period of 8 months uncovered over 9,000 TSS domains, of both passive and aggressive types, with minimal overlap between sets that are reached via organic search results and sponsored ads. Also, we found over 2,400 support domains which aid the TSS domains in manipulating organic search results. Moreover, we found little overlap with domains that are reached via abuse of domain parking and URL-shortening services, which were investigated previously. Thus, investigation of search-and-ad abuse provides new insights into TSS tactics and helps detect previously unknown abuse infrastructure that facilitates these scams.
Submitted 25 September, 2017;
originally announced September 2017.
-
Abusing Phone Numbers and Cross-Application Features for Crafting Targeted Attacks
Authors:
Srishti Gupta,
Payas Gupta,
Mustaque Ahamad,
Ponnurangam Kumaraguru
Abstract:
With the convergence of Internet and telephony, new applications (e.g., WhatsApp) have emerged as an important means of communication for billions of users. These applications are becoming an attractive medium for attackers to deliver spam and carry out more targeted attacks. Since such applications rely on phone numbers, we explore the feasibility, automation, and scalability of phishing attacks that can be carried out by abusing a phone number. We demonstrate a novel system that takes a potential victim's phone number as an input, leverages information from applications like Truecaller and Facebook about the victim and his / her social network, checks the presence of the phone number's owner (victim) on the attack channels (over-the-top or OTT messaging applications, voice, e-mail, or SMS), and finally targets the victim on the chosen channel. As a proof of concept, we enumerate through a random pool of 1.16 million phone numbers. By using information provided by popular applications, we show that social and spear phishing attacks can be launched against 51,409 and 180,000 users respectively. Furthermore, voice phishing or vishing attacks can be launched against 722,696 users. We also found 91,487 highly attractive targets who can be attacked by crafting whaling attacks. We show the effectiveness of one of these attacks, phishing, by conducting an online roleplay user study. We found that social (69.2%) and spear (54.3%) phishing attacks are more successful than non-targeted phishing attacks (35.5%) on OTT messaging applications. Although similar results were found for other media like e-mail, we demonstrate that due to the significantly increased user engagement via new communication applications and the ease with which phone numbers allow collection of information necessary for these attacks, there is a clear need for better protection of OTT messaging applications.
Submitted 22 December, 2015;
originally announced December 2015.