Showing 1–5 of 5 results for author: Blandfort, P

  1. arXiv:2506.10805 [pdf, ps, other]

    cs.LG

    Detecting High-Stakes Interactions with Activation Probes

    Authors: Alex McKenzie, Urja Pawar, Phil Blandfort, William Bankes, David Krueger, Ekdeep Singh Lubana, Dmitrii Krasheninnikov

    Abstract: Monitoring is an important aspect of safely deploying Large Language Models (LLMs). This paper examines activation probes for detecting "high-stakes" interactions -- where the text indicates that the interaction might lead to significant harm -- as a critical, yet underexplored, target for such monitoring. We evaluate several probe architectures trained on synthetic data, and find them to exhibit…

    Submitted 13 June, 2025; v1 submitted 12 June, 2025; originally announced June 2025.

    Comments: 33 pages

  2. arXiv:1901.02322 [pdf, other]

    cs.LG cs.AI stat.ML

    Fusion Strategies for Learning User Embeddings with Neural Networks

    Authors: Philipp Blandfort, Tushar Karayil, Federico Raue, Jörn Hees, Andreas Dengel

    Abstract: Growing amounts of online user data motivate the need for automated processing techniques. In case of user ratings, one interesting option is to use neural networks for learning to predict ratings given an item and a user. While training for prediction, such an approach at the same time learns to map each user to a vector, a so-called user embedding. Such embeddings can for example be valuable for…

    Submitted 8 January, 2019; originally announced January 2019.

    Comments: submitted to IJCNN 2019

  3. arXiv:1811.04028 [pdf, other]

    cs.AI cs.LG stat.AP

    An Overview of Computational Approaches for Interpretation Analysis

    Authors: Philipp Blandfort, Jörn Hees, Desmond U. Patton

    Abstract: It is said that beauty is in the eye of the beholder. But how exactly can we characterize such discrepancies in interpretation? For example, are there any specific features of an image that makes person A regard an image as beautiful while person B finds the same image displeasing? Such questions ultimately aim at explaining our individual ways of interpretation, an intention that has been of fund…

    Submitted 22 May, 2019; v1 submitted 9 November, 2018; originally announced November 2018.

    Comments: Preprint submitted to Digital Signal Processing

  4. arXiv:1810.06219 [pdf, other]

    cs.CV cs.MM

    The Focus-Aspect-Polarity Model for Predicting Subjective Noun Attributes in Images

    Authors: Tushar Karayil, Philipp Blandfort, Jörn Hees, Andreas Dengel

    Abstract: Subjective visual interpretation is a challenging yet important topic in computer vision. Many approaches reduce this problem to the prediction of adjective- or attribute-labels from images. However, most of these do not take attribute semantics into account, or only process the image in a holistic manner. Furthermore, there is a lack of relevant datasets with fine-grained subjective labels. In th…

    Submitted 15 October, 2018; originally announced October 2018.

  5. arXiv:1807.08465 [pdf, other]

    cs.LG cs.CL stat.ML

    Multimodal Social Media Analysis for Gang Violence Prevention

    Authors: Philipp Blandfort, Desmond Patton, William R. Frey, Svebor Karaman, Surabhi Bhargava, Fei-Tzin Lee, Siddharth Varia, Chris Kedzie, Michael B. Gaskell, Rossano Schifanella, Kathleen McKeown, Shih-Fu Chang

    Abstract: Gang violence is a severe issue in major cities across the U.S. and recent studies [Patton et al. 2017] have found evidence of social media communications that can be linked to such violence in communities with high rates of exposure to gang activity. In this paper we partnered computer scientists with social work researchers, who have domain expertise in gang violence, to analyze how public tweet…

    Submitted 23 July, 2018; originally announced July 2018.