Showing 1–48 of 48 results for author: Zhao, B Y

  1. Somesite I Used To Crawl: Awareness, Agency and Efficacy in Protecting Content Creators From AI Crawlers

    Authors: Enze Liu, Elisa Luo, Shawn Shan, Geoffrey M. Voelker, Ben Y. Zhao, Stefan Savage

    Abstract: The success of generative AI relies heavily on training on data scraped through extensive crawling of the Internet, a practice that has raised significant copyright, privacy, and ethical concerns. While few measures are designed to resist a resource-rich adversary determined to scrape a site, crawlers can be impacted by a range of existing tools such as robots.txt, NoAI meta tags, and active crawl…

    Submitted 7 May, 2025; v1 submitted 22 November, 2024; originally announced November 2024.

    Comments: Accepted to IMC 25. Please cite the conference version

  2. arXiv:2410.08432  [pdf, other]

    cs.LG

    MYCROFT: Towards Effective and Efficient External Data Augmentation

    Authors: Zain Sarwar, Van Tran, Arjun Nitin Bhagoji, Nick Feamster, Ben Y. Zhao, Supriyo Chakraborty

    Abstract: Machine learning (ML) models often require large amounts of data to perform well. When the available data is limited, model trainers may need to acquire more data from external sources. Often, useful data is held by private entities who are hesitant to share their data due to proprietary and privacy concerns. This makes it challenging and expensive for model trainers to acquire the data they need to…

    Submitted 10 October, 2024; originally announced October 2024.

    Comments: 10 pages, 3 figures, 3 tables

  3. arXiv:2409.12314  [pdf, other]

    cs.CR cs.AI cs.CV cs.LG

    Understanding Implosion in Text-to-Image Generative Models

    Authors: Wenxin Ding, Cathy Y. Li, Shawn Shan, Ben Y. Zhao, Haitao Zheng

    Abstract: Recent works show that text-to-image generative models are surprisingly vulnerable to a variety of poisoning attacks. Empirical results find that these models can be corrupted by altering associations between individual text prompts and associated visual features. Furthermore, a number of concurrent poisoning attacks can induce "model implosion," where the model becomes unable to produce meaningfu…

    Submitted 18 September, 2024; originally announced September 2024.

    Comments: ACM CCS 2024

  4. arXiv:2405.06865  [pdf, other]

    cs.CV cs.CR

    Disrupting Style Mimicry Attacks on Video Imagery

    Authors: Josephine Passananti, Stanley Wu, Shawn Shan, Haitao Zheng, Ben Y. Zhao

    Abstract: Generative AI models are often used to perform mimicry attacks, where a pretrained model is fine-tuned on a small sample of images to learn to mimic a specific artist of interest. While researchers have introduced multiple anti-mimicry protection tools (Mist, Glaze, Anti-Dreambooth), recent evidence points to a growing trend of mimicry models using videos as sources of training data. This paper pr…

    Submitted 10 May, 2024; originally announced May 2024.

  5. arXiv:2403.05721  [pdf, other]

    cs.CR

    Inception Attacks: Immersive Hijacking in Virtual Reality Systems

    Authors: Zhuolin Yang, Cathy Yuanchen Li, Arman Bhalla, Ben Y. Zhao, Haitao Zheng

    Abstract: Today's virtual reality (VR) systems provide immersive interactions that seamlessly connect users with online services and one another. However, these immersive interfaces also introduce new vulnerabilities, making it easier for users to fall prey to new attacks. In this work, we introduce the immersive hijacking attack, where a remote attacker takes control of a user's interaction with their VR s…

    Submitted 9 September, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

    Comments: 13 pages

  6. arXiv:2402.03214  [pdf, other]

    cs.CV cs.AI cs.LG

    Organic or Diffused: Can We Distinguish Human Art from AI-generated Images?

    Authors: Anna Yoo Jeong Ha, Josephine Passananti, Ronik Bhaskar, Shawn Shan, Reid Southen, Haitao Zheng, Ben Y. Zhao

    Abstract: The advent of generative AI images has completely disrupted the art world. Distinguishing AI-generated images from human art is a challenging problem whose impact is growing over time. A failure to address this problem allows bad actors to defraud individuals paying a premium for human art and companies whose stated policies forbid AI imagery. It is also critical for content owners to establish co…

    Submitted 2 July, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

  7. arXiv:2401.09574  [pdf, ps, other]

    cs.LG cs.CR

    Towards Scalable and Robust Model Versioning

    Authors: Wenxin Ding, Arjun Nitin Bhagoji, Ben Y. Zhao, Haitao Zheng

    Abstract: As the deployment of deep learning models continues to expand across industries, the threat of malicious incursions aimed at gaining access to these deployed models is on the rise. Should an attacker gain access to a deployed model, whether through server breaches, insider attacks, or model inversion techniques, they can then construct white-box adversarial attacks to manipulate the model's classi…

    Submitted 10 March, 2024; v1 submitted 17 January, 2024; originally announced January 2024.

    Comments: Published in IEEE SaTML 2024

  8. arXiv:2312.07731  [pdf, other]

    cs.CR

    A Response to Glaze Purification via IMPRESS

    Authors: Shawn Shan, Stanley Wu, Haitao Zheng, Ben Y. Zhao

    Abstract: Recent work proposed a new mechanism to remove protective perturbation added by Glaze in order to again enable mimicry of art styles from images protected by Glaze. Despite promising results shown in the original paper, our own tests with the authors' code demonstrated several limitations of the proposed purification approach. The main limitations are 1) purification has a limited effect when test…

    Submitted 12 December, 2023; originally announced December 2023.

  9. arXiv:2310.16191  [pdf, other]

    cs.CR

    Can Virtual Reality Protect Users from Keystroke Inference Attacks?

    Authors: Zhuolin Yang, Zain Sarwar, Iris Hwang, Ronik Bhaskar, Ben Y. Zhao, Haitao Zheng

    Abstract: Virtual Reality (VR) has gained popularity by providing immersive and interactive experiences without geographical limitations. It also provides a sense of personal privacy through physical separation. In this paper, we show that despite assumptions of enhanced privacy, VR is unable to shield its users from side-channel attacks that steal private information. Ironically, this vulnerability arises…

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: Accepted by USENIX 2024

  10. arXiv:2310.13828  [pdf, other]

    cs.CR cs.AI

    Nightshade: Prompt-Specific Poisoning Attacks on Text-to-Image Generative Models

    Authors: Shawn Shan, Wenxin Ding, Josephine Passananti, Stanley Wu, Haitao Zheng, Ben Y. Zhao

    Abstract: Data poisoning attacks manipulate training data to introduce unexpected behaviors into machine learning models at training time. For text-to-image generative models with massive training datasets, current understanding of poisoning attacks suggests that a successful attack would require injecting millions of poison samples into their training pipeline. In this paper, we show that poisoning attacks…

    Submitted 29 April, 2024; v1 submitted 20 October, 2023; originally announced October 2023.

    Comments: IEEE Security and Privacy 2024

  11. arXiv:2302.10722  [pdf, other]

    cs.LG cs.CR

    Characterizing the Optimal 0-1 Loss for Multi-class Classification with a Test-time Attacker

    Authors: Sihui Dai, Wenxin Ding, Arjun Nitin Bhagoji, Daniel Cullina, Ben Y. Zhao, Haitao Zheng, Prateek Mittal

    Abstract: Finding classifiers robust to adversarial examples is critical for their safe deployment. Determining the robustness of the best possible classifier under a given threat model for a given data distribution and comparing it to that achieved by state-of-the-art training methods is thus an important diagnostic tool. In this paper, we find achievable information-theoretic lower bounds on loss in the p…

    Submitted 6 December, 2023; v1 submitted 21 February, 2023; originally announced February 2023.

    Comments: NeurIPS 2023 Spotlight

  12. arXiv:2302.04222  [pdf, other]

    cs.CR

    Glaze: Protecting Artists from Style Mimicry by Text-to-Image Models

    Authors: Shawn Shan, Jenna Cryan, Emily Wenger, Haitao Zheng, Rana Hanocka, Ben Y. Zhao

    Abstract: Recent text-to-image diffusion models such as MidJourney and Stable Diffusion threaten to displace many in the professional artist community. In particular, models can learn to mimic the artistic style of specific artists after "fine-tuning" on samples of their art. In this paper, we describe the design, implementation and evaluation of Glaze, a tool that enables artists to apply "style cloaks" to…

    Submitted 5 April, 2025; v1 submitted 8 February, 2023; originally announced February 2023.

    Comments: USENIX Security 2023

  13. arXiv:2208.13893  [pdf, ps, other]

    cs.CR cs.LG

    Data Isotopes for Data Provenance in DNNs

    Authors: Emily Wenger, Xiuyu Li, Ben Y. Zhao, Vitaly Shmatikov

    Abstract: Today, creators of data-hungry deep neural networks (DNNs) scour the Internet for training fodder, leaving users with little control over or knowledge of when their data is appropriated for model training. To empower users to counteract unwanted data use, we design, implement and evaluate a practical system that enables users to detect if their data was used to train a DNN model. We show how user…

    Submitted 27 February, 2023; v1 submitted 29 August, 2022; originally announced August 2022.

    Comments: 17 pages

  14. arXiv:2206.10673  [pdf, ps, other]

    cs.CV cs.CR

    Natural Backdoor Datasets

    Authors: Emily Wenger, Roma Bhattacharjee, Arjun Nitin Bhagoji, Josephine Passananti, Emilio Andere, Haitao Zheng, Ben Y. Zhao

    Abstract: Extensive literature on backdoor poison attacks has studied attacks and defenses for backdoors using "digital trigger patterns." In contrast, "physical backdoors" use physical objects as triggers, have only recently been identified, and are qualitatively different enough to resist all defenses targeting digital trigger backdoors. Research on physical backdoors is limited by access to large dataset…

    Submitted 21 June, 2022; originally announced June 2022.

    Comments: 18 pages

  15. arXiv:2206.09868  [pdf, other]

    cs.LG cs.CR cs.CV

    Understanding Robust Learning through the Lens of Representation Similarities

    Authors: Christian Cianfarani, Arjun Nitin Bhagoji, Vikash Sehwag, Ben Y. Zhao, Prateek Mittal, Haitao Zheng

    Abstract: Representation learning, i.e. the generation of representations useful for downstream applications, is a task of fundamental importance that underlies much of the success of deep neural networks (DNNs). Recently, robustness to adversarial examples has emerged as a desirable property for DNNs, spurring the development of robust training methods that account for adversarial examples. In this paper,…

    Submitted 15 September, 2022; v1 submitted 20 June, 2022; originally announced June 2022.

    Comments: 35 pages, 29 figures; Accepted to Neurips 2022

  16. arXiv:2206.04677  [pdf, other]

    cs.CR cs.CV cs.LG

    On the Permanence of Backdoors in Evolving Models

    Authors: Huiying Li, Arjun Nitin Bhagoji, Yuxin Chen, Haitao Zheng, Ben Y. Zhao

    Abstract: Existing research on training-time attacks for deep neural networks (DNNs), such as backdoors, largely assumes that models are static once trained, and hidden backdoors trained into models remain active indefinitely. In practice, models are rarely static but evolve continuously to address distribution drifts in the underlying data. This paper explores the behavior of backdoor attacks in time-varyin…

    Submitted 8 February, 2023; v1 submitted 7 June, 2022; originally announced June 2022.

  17. Post-breach Recovery: Protection against White-box Adversarial Examples for Leaked DNN Models

    Authors: Shawn Shan, Wenxin Ding, Emily Wenger, Haitao Zheng, Ben Y. Zhao

    Abstract: Server breaches are an unfortunate reality on today's Internet. In the context of deep neural network (DNN) models, they are particularly harmful, because a leaked model gives an attacker "white-box" access to generate adversarial examples, a threat model that has no practical robust defenses. For practitioners who have invested years and millions into proprietary DNNs, e.g. medical imaging, this…

    Submitted 16 October, 2022; v1 submitted 21 May, 2022; originally announced May 2022.

    Journal ref: 2022 ACM Conference on Computer and Communications Security (CCS)

  18. arXiv:2202.05760  [pdf, ps, other]

    cs.CR cs.CV

    Assessing Privacy Risks from Feature Vector Reconstruction Attacks

    Authors: Emily Wenger, Francesca Falzon, Josephine Passananti, Haitao Zheng, Ben Y. Zhao

    Abstract: In deep neural networks for facial recognition, feature vectors are numerical representations that capture the unique features of a given face. While it is known that a version of the original face can be recovered via "feature reconstruction," we lack an understanding of the end-to-end privacy risks produced by these attacks. In this work, we address this shortcoming by developing metrics that me…

    Submitted 11 February, 2022; originally announced February 2022.

    Comments: 7 pages

  19. arXiv:2112.04558  [pdf, ps, other]

    cs.CR cs.CV cs.LG

    SoK: Anti-Facial Recognition Technology

    Authors: Emily Wenger, Shawn Shan, Haitao Zheng, Ben Y. Zhao

    Abstract: The rapid adoption of facial recognition (FR) technology by both government and commercial entities in recent years has raised concerns about civil liberties and privacy. In response, a broad suite of so-called "anti-facial recognition" (AFR) tools has been developed to help users avoid unwanted facial recognition. The set of AFR tools proposed in the last few years is wide-ranging and rapidly evo…

    Submitted 15 February, 2023; v1 submitted 8 December, 2021; originally announced December 2021.

    Comments: Camera-ready version for Oakland S&P 2023

  20. arXiv:2110.06904  [pdf, ps, other]

    cs.CR cs.AI

    Poison Forensics: Traceback of Data Poisoning Attacks in Neural Networks

    Authors: Shawn Shan, Arjun Nitin Bhagoji, Haitao Zheng, Ben Y. Zhao

    Abstract: In adversarial machine learning, new defenses against attacks on deep learning systems are routinely broken soon after their release by more powerful attacks. In this context, forensic tools can offer a valuable complement to existing defenses, by tracing back a successful attack to its root cause, and offering a path forward for mitigation to prevent similar attacks in the future. In this paper…

    Submitted 15 June, 2022; v1 submitted 13 October, 2021; originally announced October 2021.

    Comments: 18 pages

    Journal ref: USENIX Security Symposium 2022

  21. arXiv:2109.09598  [pdf, ps, other]

    cs.CR cs.AI cs.SD eess.AS

    "Hello, It's Me": Deep Learning-based Speech Synthesis Attacks in the Real World

    Authors: Emily Wenger, Max Bronckers, Christian Cianfarani, Jenna Cryan, Angela Sha, Haitao Zheng, Ben Y. Zhao

    Abstract: Advances in deep learning have introduced a new wave of voice synthesis tools, capable of producing audio that sounds as if spoken by a target speaker. If successful, such tools in the wrong hands will enable a range of powerful attacks against both humans and software systems (aka machines). This paper documents efforts and findings from a comprehensive experimental study on the impact of deep-le…

    Submitted 20 September, 2021; originally announced September 2021.

    Comments: 13 pages

  22. arXiv:2105.08694  [pdf, other]

    cs.PF

    Towards Performance Clarity of Edge Video Analytics

    Authors: Zhujun Xiao, Zhengxu Xia, Haitao Zheng, Ben Y. Zhao, Junchen Jiang

    Abstract: Edge video analytics is becoming the solution to many safety and management tasks. Its wide deployment, however, must first address the tension between inference accuracy and resource (compute/network) cost. This has led to the development of video analytics pipelines (VAPs), which reduce resource cost by combining DNN compression/speedup techniques with video processing heuristics. Our measuremen…

    Submitted 18 May, 2021; originally announced May 2021.

  23. arXiv:2102.04291  [pdf, ps, other]

    cs.CR

    A Real-time Defense against Website Fingerprinting Attacks

    Authors: Shawn Shan, Arjun Nitin Bhagoji, Haitao Zheng, Ben Y. Zhao

    Abstract: Anonymity systems like Tor are vulnerable to Website Fingerprinting (WF) attacks, where a local passive eavesdropper infers the victim's activity. Current WF attacks based on deep learning classifiers have successfully overcome numerous proposed defenses. While recent defenses leveraging adversarial examples offer promise, these adversarial examples can only be computed after the network session h…

    Submitted 8 February, 2021; originally announced February 2021.

    Comments: 18 pages

  24. arXiv:2006.14580  [pdf, ps, other]

    cs.CV cs.CR cs.LG

    Backdoor Attacks Against Deep Learning Systems in the Physical World

    Authors: Emily Wenger, Josephine Passananti, Arjun Bhagoji, Yuanshun Yao, Haitao Zheng, Ben Y. Zhao

    Abstract: Backdoor attacks embed hidden malicious behaviors into deep learning models, which only activate and cause misclassifications on model inputs containing a specific trigger. Existing works on backdoor attacks and defenses, however, mostly focus on digital attacks that use digitally generated patterns as triggers. A critical question remains unanswered: can backdoor attacks succeed using physical ob…

    Submitted 7 September, 2021; v1 submitted 25 June, 2020; originally announced June 2020.

    Comments: Accepted to the 2021 Conference on Computer Vision and Pattern Recognition (CVPR 2021); 14 pages

  25. arXiv:2006.14042  [pdf, ps, other]

    cs.CR cs.CV cs.LG

    Blacklight: Scalable Defense for Neural Networks against Query-Based Black-Box Attacks

    Authors: Huiying Li, Shawn Shan, Emily Wenger, Jiayun Zhang, Haitao Zheng, Ben Y. Zhao

    Abstract: Deep learning systems are known to be vulnerable to adversarial examples. In particular, query-based black-box attacks do not require knowledge of the deep learning model, but can compute adversarial examples over the network by submitting queries and inspecting returns. Recent work largely improves the efficiency of those attacks, demonstrating their practicality on today's ML-as-a-service platfo…

    Submitted 9 June, 2022; v1 submitted 24 June, 2020; originally announced June 2020.

  26. arXiv:2002.08327  [pdf, ps, other]

    cs.CR cs.CV cs.LG stat.ML

    Fawkes: Protecting Privacy against Unauthorized Deep Learning Models

    Authors: Shawn Shan, Emily Wenger, Jiayun Zhang, Huiying Li, Haitao Zheng, Ben Y. Zhao

    Abstract: Today's proliferation of powerful facial recognition systems poses a real threat to personal privacy. As Clearview.ai demonstrated, anyone can canvass the Internet for data and train highly accurate facial recognition models of individuals without their knowledge. We need tools to protect ourselves from potential misuses of unauthorized facial recognition systems. Unfortunately, no practical or eff…

    Submitted 22 June, 2020; v1 submitted 19 February, 2020; originally announced February 2020.

    Journal ref: USENIX Security Symposium 2020

  27. arXiv:1912.01328  [pdf, other]

    cs.SE eess.SY

    Trimming Mobile Applications for Bandwidth-Challenged Networks in Developing Regions

    Authors: Qinge Xie, Qingyuan Gong, Xinlei He, Yang Chen, Xin Wang, Haitao Zheng, Ben Y. Zhao

    Abstract: Despite continuous efforts to build and update network infrastructure, mobile devices in developing regions continue to be constrained by limited bandwidth. Unfortunately, this coincides with a period of unprecedented growth in the size of mobile applications. Thus it is becoming prohibitively expensive for users in developing regions to download and update mobile apps critical to their economic a…

    Submitted 8 December, 2019; v1 submitted 3 December, 2019; originally announced December 2019.

    Comments: 12 pages, 8 figures

  28. arXiv:1912.01242  [pdf, other]

    cs.LG eess.SP stat.ML

    "How do urban incidents affect traffic speed?" A Deep Graph Convolutional Network for Incident-driven Traffic Speed Prediction

    Authors: Qinge Xie, Tiancheng Guo, Yang Chen, Yu Xiao, Xin Wang, Ben Y. Zhao

    Abstract: Accurate traffic speed prediction is an important and challenging topic for transportation planning. Previous studies on traffic speed prediction predominately used spatio-temporal and context features for prediction. However, they have not made good use of the impact of urban traffic incidents. In this work, we aim to make use of the information of urban incidents to achieve a better prediction o…

    Submitted 3 December, 2019; originally announced December 2019.

    Comments: 18 pages, 8 figures

  29. arXiv:1910.01226  [pdf, ps, other]

    cs.CR cs.LG stat.ML

    Piracy Resistant Watermarks for Deep Neural Networks

    Authors: Huiying Li, Emily Wenger, Shawn Shan, Ben Y. Zhao, Haitao Zheng

    Abstract: As companies continue to invest heavily in larger, more accurate and more robust deep learning models, they are exploring approaches to monetize their models while protecting their intellectual property. Model licensing is promising, but requires a robust tool for owners to claim ownership of models, i.e. a watermark. Unfortunately, current designs have not been able to address piracy attacks, whe…

    Submitted 2 December, 2020; v1 submitted 2 October, 2019; originally announced October 2019.

    Comments: 18 pages

  30. arXiv:1905.10447  [pdf, other]

    cs.LG cs.CR

    Regula Sub-rosa: Latent Backdoor Attacks on Deep Neural Networks

    Authors: Yuanshun Yao, Huiying Li, Haitao Zheng, Ben Y. Zhao

    Abstract: Recent work has proposed the concept of backdoor attacks on deep neural networks (DNNs), where misbehaviors are hidden inside "normal" models, only to be triggered by very specific inputs. In practice, however, these attacks are difficult to perform and highly constrained by sharing of models through transfer learning. Adversaries have a small window during which they must compromise the student m…

    Submitted 24 May, 2019; originally announced May 2019.

  31. arXiv:1904.08554  [pdf, ps, other]

    cs.LG cs.CR stat.ML

    Gotta Catch 'Em All: Using Honeypots to Catch Adversarial Attacks on Neural Networks

    Authors: Shawn Shan, Emily Wenger, Bolun Wang, Bo Li, Haitao Zheng, Ben Y. Zhao

    Abstract: Deep neural networks (DNN) are known to be vulnerable to adversarial attacks. Numerous efforts either try to patch weaknesses in trained models, or try to make it difficult or costly to compute adversarial examples that exploit them. In our work, we explore a new "honeypot" approach to protect DNN models. We intentionally inject trapdoors, honeypot weaknesses in the classification manifold that at…

    Submitted 28 September, 2020; v1 submitted 17 April, 2019; originally announced April 2019.

    Journal ref: Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security

  32. arXiv:1904.08490  [pdf, other]

    cs.CR cs.HC eess.SP

    Understanding the Effectiveness of Ultrasonic Microphone Jammer

    Authors: Yuxin Chen, Huiying Li, Steven Nagels, Zhijing Li, Pedro Lopes, Ben Y. Zhao, Haitao Zheng

    Abstract: Recent works have explained the principle of using ultrasonic transmissions to jam nearby microphones. These signals are inaudible to nearby users, but leverage "hardware nonlinearity" to induce a jamming signal inside microphones that disrupts voice recordings. This has great implications on audio privacy protection. In this work, we gain a deeper understanding on the effectiveness of ultrasonic…

    Submitted 17 April, 2019; originally announced April 2019.

  33. arXiv:1810.10157  [pdf, ps, other]

    cs.CR cs.NI

    Wireless Side-Lobe Eavesdropping Attacks

    Authors: Yanzi Zhu, Ying Ju, Bolun Wang, Jenna Cryan, Ben Y. Zhao, Haitao Zheng

    Abstract: Millimeter-wave wireless networks offer high throughput and can (ideally) prevent eavesdropping attacks using narrow, directional beams. Unfortunately, imperfections in physical hardware mean today's antenna arrays all exhibit side lobes, signals that carry the same sensitive data as the main lobe. Our work presents results of the first experimental study of the security properties of mmWave trans…

    Submitted 23 October, 2018; originally announced October 2018.

  34. Et Tu Alexa? When Commodity WiFi Devices Turn into Adversarial Motion Sensors

    Authors: Yanzi Zhu, Zhujun Xiao, Yuxin Chen, Zhijing Li, Max Liu, Ben Y. Zhao, Haitao Zheng

    Abstract: Our work demonstrates a new set of silent reconnaissance attacks, which leverage the presence of commodity WiFi devices to track users inside private homes and offices, without compromising any WiFi network, data packets, or devices. We show that just by sniffing existing WiFi signals, an adversary can accurately detect and track movements of users inside a building. This is made possible by our…

    Submitted 11 January, 2020; v1 submitted 23 October, 2018; originally announced October 2018.

    Comments: NDSS'20

  35. arXiv:1809.10242  [pdf, ps, other]

    cs.CV cs.LG stat.ML

    Addressing Training Bias via Automated Image Annotation

    Authors: Zhujun Xiao, Yanzi Zhu, Yuxin Chen, Ben Y. Zhao, Junchen Jiang, Haitao Zheng

    Abstract: Building accurate DNN models requires training on large, labeled, context-specific datasets, especially those matching the target scenario. We believe advances in wireless localization, working in unison with cameras, can produce automated annotation of targets on images and videos captured in the wild. Using pedestrian and vehicle detection as examples, we demonstrate the feasibility, benefits, and c…

    Submitted 10 October, 2018; v1 submitted 22 September, 2018; originally announced September 2018.

  36. arXiv:1708.08151  [pdf, other]

    cs.CR cs.SI

    Automated Crowdturfing Attacks and Defenses in Online Review Systems

    Authors: Yuanshun Yao, Bimal Viswanath, Jenna Cryan, Haitao Zheng, Ben Y. Zhao

    Abstract: Malicious crowdsourcing forums are gaining traction as sources of spreading misinformation online, but are limited by the costs of hiring and managing human workers. In this paper, we identify a new class of attacks that leverage deep learning language models (Recurrent Neural Networks or RNNs) to automate the generation of fake online reviews for products and services. Not only are these attacks…

    Submitted 7 September, 2017; v1 submitted 27 August, 2017; originally announced August 2017.

  37. arXiv:1508.00837  [pdf, ps, other]

    cs.SI cs.CR

    Defending against Sybil Devices in Crowdsourced Mapping Services

    Authors: Gang Wang, Bolun Wang, Tianyi Wang, Ana Nika, Haitao Zheng, Ben Y. Zhao

    Abstract: Real-time crowdsourced maps such as Waze provide timely updates on traffic, congestion, accidents and points of interest. In this paper, we demonstrate how lack of strong location authentication allows creation of software-based Sybil devices that expose crowdsourced map systems to a variety of security and privacy attacks. Our experiments show that a single Sybil device with limited resourc…

    Submitted 27 April, 2016; v1 submitted 4 August, 2015; originally announced August 2015.

    Comments: Measure and integration

    ACM Class: H.3; K.6

  38. arXiv:1506.00022  [pdf, ps, other]

    cs.CR cs.SI

    Graph Watermarks

    Authors: Xiaohan Zhao, Qingyun Liu, Lin Zhou, Haitao Zheng, Ben Y. Zhao

    Abstract: From network topologies to online social networks, many of today's most sensitive datasets are captured in large graphs. A significant challenge facing owners of these datasets is how to share sensitive graphs with collaborators and authorized users, e.g. network topologies with network equipment vendors or Facebook's social graphs with academic collaborators. Current tools can provide limited nod…

    Submitted 29 May, 2015; originally announced June 2015.

    Comments: 16 pages, 14 figures, full version

  39. arXiv:1406.1137  [pdf, ps, other]

    cs.SI physics.soc-ph

    Crowds on Wall Street: Extracting Value from Social Investing Platforms

    Authors: Gang Wang, Tianyi Wang, Bolun Wang, Divya Sambasivan, Zengbin Zhang, Haitao Zheng, Ben Y. Zhao

    Abstract: For decades, the world of financial advisors has been dominated by large investment banks such as Goldman Sachs. In recent years, user-contributed investment services such as SeekingAlpha and StockTwits have grown to millions of users. In this paper, we seek to understand the quality and impact of content on social investment platforms, by empirically analyzing complete datasets of SeekingAlpha ar…

    Submitted 4 June, 2014; originally announced June 2014.

  40. arXiv:1309.0874  [pdf]

    cs.DC cs.DS cs.SI physics.soc-ph

    Shortest Paths in Microseconds

    Authors: Rachit Agarwal, Matthew Caesar, P. Brighten Godfrey, Ben Y. Zhao

    Abstract: Computing shortest paths is a fundamental primitive for several social network applications including socially-sensitive ranking, location-aware search, social auctions and social network privacy. Since these applications compute paths in response to a user query, the goal is to minimize latency while maintaining feasible memory requirements. We present ASAP, a system that achieves this goal by ex…

    Submitted 3 September, 2013; originally announced September 2013.

    Comments: Extended version of WOSN'12 paper: new techniques (reduced memory, faster computations), distributed (MapReduce) algorithm, multiple paths between a source-destination pair

  41. arXiv:1206.1134  [pdf]

    cs.SI cs.DB physics.soc-ph

    Shortest Paths in Less Than a Millisecond

    Authors: Rachit Agarwal, Matthew Caesar, P. Brighten Godfrey, Ben Y. Zhao

    Abstract: We consider the problem of answering point-to-point shortest path queries on massive social networks. The goal is to answer queries within tens of milliseconds while minimizing the memory requirements. We present a technique that achieves this goal for an extremely large fraction of path queries by exploiting the structure of the social networks. Using evaluations on real-world datasets, we argu…

    Submitted 6 June, 2012; originally announced June 2012.

    Comments: 6 pages; to appear in SIGCOMM WOSN 2012

  42. arXiv:1205.4013  [pdf, ps, other]

    cs.SI physics.soc-ph

    Multi-scale Dynamics in a Massive Online Social Network

    Authors: Xiaohan Zhao, Alessandra Sala, Christo Wilson, Xiao Wang, Sabrina Gaito, Haitao Zheng, Ben Y. Zhao

    Abstract: Data confidentiality policies at major social network providers have severely limited researchers' access to large-scale datasets. The biggest impact has been on the study of network dynamics, where researchers have studied citation graphs and content-sharing networks, but few have analyzed detailed dynamics in the massive social networks that dominate the web today. In this paper, we present resu…

    Submitted 17 May, 2012; originally announced May 2012.

  43. arXiv:1205.3856  [pdf, ps, other]

    cs.SI physics.soc-ph

    Social Turing Tests: Crowdsourcing Sybil Detection

    Authors: Gang Wang, Manish Mohanlal, Christo Wilson, Xiao Wang, Miriam Metzger, Haitao Zheng, Ben Y. Zhao

    Abstract: As popular tools for spreading spam and malware, Sybils (or fake accounts) pose a serious threat to online communities such as Online Social Networks (OSNs). Today, sophisticated attackers are creating realistic Sybils that effectively befriend legitimate users, rendering most automated Sybil detection techniques ineffective. In this paper, we explore the feasibility of a crowdsourced Sybil detect…

    Submitted 7 December, 2012; v1 submitted 17 May, 2012; originally announced May 2012.

  44. arXiv:1203.6744  [pdf, ps, other]

    cs.SI physics.soc-ph

    On the Bursty Evolution of Online Social Networks

    Authors: Sabrina Gaito, Matteo Zignani, Gian Paolo Rossi, Alessandra Sala, Xiao Wang, Haitao Zheng, Ben Y. Zhao

    Abstract: The high level of dynamics in today's online social networks (OSNs) creates new challenges for their infrastructures and providers. In particular, dynamics involving edge creation has direct implications on strategies for resource allocation, data partitioning and replication. Understanding network dynamics in the context of physical time is a critical first step towards a predictive approach towa…

    Submitted 25 May, 2012; v1 submitted 30 March, 2012; originally announced March 2012.

    Comments: 13 pages, 7 figures

  45. arXiv:1111.5654  [pdf]

    cs.SI cs.CR

    Serf and Turf: Crowdturfing for Fun and Profit

    Authors: Gang Wang, Christo Wilson, Xiaohan Zhao, Yibo Zhu, Manish Mohanlal, Haitao Zheng, Ben Y. Zhao

    Abstract: Popular Internet services in recent years have shown that remarkable things can be achieved by harnessing the power of the masses using crowd-sourcing systems. However, crowd-sourcing systems can also pose a real challenge to existing security mechanisms deployed to protect Internet services. Many of these techniques make the assumption that malicious activity is generated automatically by machine…

    Submitted 18 May, 2012; v1 submitted 23 November, 2011; originally announced November 2011.

    Comments: Proceedings of WWW 2012 Conference, 10 pages, 23 figures, 4 tables

    ACM Class: H.3.5; J.4

  46. arXiv:1108.0027  [pdf, ps, other]

    cs.SI physics.soc-ph

    Revisiting Degree Distribution Models for Social Graph Analysis

    Authors: Alessandra Sala, Sabrina Gaito, Gian Paolo Rossi, Haitao Zheng, Ben Y. Zhao

    Abstract: Degree distribution models are incredibly important tools for analyzing and understanding the structure and formation of social networks, and can help guide the design of efficient graph algorithms. In particular, the Power-law degree distribution has long been used to model the structure of online social networks, and is the basis for algorithms and heuristics in graph applications such as influe…

    Submitted 29 July, 2011; originally announced August 2011.

  47. arXiv:1107.5114  [pdf, ps, other]

    cs.SI physics.soc-ph

    Fast and Scalable Analysis of Massive Social Graphs

    Authors: Xiaohan Zhao, Alessandra Sala, Haitao Zheng, Ben Y. Zhao

    Abstract: Graph analysis is a critical component of applications such as online social networks, protein interactions in biological networks, and Internet traffic analysis. The arrival of massive graphs with hundreds of millions of nodes, e.g. social graphs, presents a unique challenge to graph analysis applications. Most of these applications rely on computing distances between node pairs, which for large…

    Submitted 29 July, 2011; v1 submitted 26 July, 2011; originally announced July 2011.

  48. arXiv:1106.5321  [pdf, ps, other]

    cs.SI physics.soc-ph

    Uncovering Social Network Sybils in the Wild

    Authors: Zhi Yang, Christo Wilson, Xiao Wang, Tingting Gao, Ben Y. Zhao, Yafei Dai

    Abstract: Sybil accounts are fake identities created to unfairly increase the power or resources of a single malicious user. Researchers have long known about the existence of Sybil accounts in online communities such as file-sharing systems, but have not been able to perform large scale measurements to detect them or measure their activities. In this paper, we describe our efforts to detect, characterize a…

    Submitted 27 June, 2011; originally announced June 2011.

    Comments: 7 pages