Showing 1–2 of 2 results for author: Pentapalli, L S

  1. arXiv:2505.00010  [pdf]

    cs.CL cs.AI

    Jailbreak Detection in Clinical Training LLMs Using Feature-Based Predictive Models

    Authors: Tri Nguyen, Lohith Srikanth Pentapalli, Magnus Sieverding, Laurah Turner, Seth Overla, Weibing Zheng, Chris Zhou, David Furniss, Danielle Weber, Michael Gharib, Matt Kelleher, Michael Shukis, Cameron Pawlik, Kelly Cohen

    Abstract: Jailbreaking in Large Language Models (LLMs) threatens their safe use in sensitive domains like education by allowing users to bypass ethical safeguards. This study focuses on detecting jailbreaks in 2-Sigma, a clinical education platform that simulates patient interactions using LLMs. We annotated over 2,300 prompts across 158 conversations using four linguistic variables shown to correlate stron…

    Submitted 21 April, 2025; originally announced May 2025.

  2. arXiv:2504.18636  [pdf, ps, other]

    cs.CR cs.AI cs.LO

    A Gradient-Optimized TSK Fuzzy Framework for Explainable Phishing Detection

    Authors: Lohith Srikanth Pentapalli, Jon Salisbury, Josette Riep, Kelly Cohen

    Abstract: Phishing attacks represent an increasingly sophisticated and pervasive threat to individuals and organizations, causing significant financial losses, identity theft, and severe damage to institutional reputations. Existing phishing detection methods often struggle to simultaneously achieve high accuracy and explainability, either failing to detect novel attacks or operating as opaque black-box mod…

    Submitted 25 April, 2025; originally announced April 2025.

    Comments: 14 pages, 5 figures