Showing 1–2 of 2 results for author: Umair, H

  1. arXiv:2506.09956  [pdf, ps, other]

    cs.CR cs.AI

    LLMail-Inject: A Dataset from a Realistic Adaptive Prompt Injection Challenge

    Authors: Sahar Abdelnabi, Aideen Fay, Ahmed Salem, Egor Zverev, Kai-Chieh Liao, Chi-Huang Liu, Chun-Chih Kuo, Jannis Weigend, Danyael Manlangit, Alex Apostolov, Haris Umair, João Donato, Masayuki Kawakita, Athar Mahboob, Tran Huu Bach, Tsun-Han Chiang, Myeongjin Cho, Hajin Choi, Byeonghyeon Kim, Hyeonjin Lee, Benjamin Pannell, Conor McCauley, Mark Russinovich, Andrew Paverd, Giovanni Cherubin

    Abstract: Indirect Prompt Injection attacks exploit the inherent limitation of Large Language Models (LLMs) to distinguish between instructions and data in their inputs. Despite numerous defense proposals, the systematic evaluation against adaptive adversaries remains limited, even when successful attacks can have wide security and privacy implications, and many real-world LLM-based applications remain vuln…

    Submitted 11 June, 2025; originally announced June 2025.

    Comments: Dataset at: https://huggingface.co/datasets/microsoft/llmail-inject-challenge

  2. arXiv:2108.03305  [pdf, other]

    cs.CL cs.AI

    Offensive Language and Hate Speech Detection with Deep Learning and Transfer Learning

    Authors: Bencheng Wei, Jason Li, Ajay Gupta, Hafiza Umair, Atsu Vovor, Natalie Durzynski

    Abstract: Toxic online speech has become a crucial problem nowadays due to an exponential increase in the use of internet by people from different cultures and educational backgrounds. Differentiating if a text message belongs to hate speech and offensive language is a key challenge in automatic detection of toxic text content. In this paper, we propose an approach to automatically classify tweets into thre…

    Submitted 22 August, 2021; v1 submitted 6 August, 2021; originally announced August 2021.