
Showing 1–12 of 12 results for author: Foulds, J R

  1. arXiv:2503.07937 [pdf, other]

    cs.AI

    LLM-based Corroborating and Refuting Evidence Retrieval for Scientific Claim Verification

    Authors: Siyuan Wang, James R. Foulds, Md Osman Gani, Shimei Pan

    Abstract: In this paper, we introduce CIBER (Claim Investigation Based on Evidence Retrieval), an extension of the Retrieval-Augmented Generation (RAG) framework designed to identify corroborating and refuting documents as evidence for scientific claim verification. CIBER addresses the inherent uncertainty in Large Language Models (LLMs) by evaluating response consistency across diverse interrogation probes…

    Submitted 10 March, 2025; originally announced March 2025.

  2. arXiv:2502.07790 [pdf, other]

    cs.CY cs.AI

    Can Generative AI be Egalitarian?

    Authors: Philip Feldman, James R. Foulds, Shimei Pan

    Abstract: The recent explosion of "foundation" generative AI models has been built upon the extensive extraction of value from online sources, often without corresponding reciprocation. This pattern mirrors and intensifies the extractive practices of surveillance capitalism, while the potential for enormous profit has challenged technology organizations' commitments to responsible AI practices, raising sign…

    Submitted 20 January, 2025; originally announced February 2025.

    Comments: 14 pages, 5 figures

    ACM Class: K.4.1

    Journal ref: October 2024 IEEE Consumer Technology Society (CTSoc) News on Consumer Technology (https://ctsoc.ieee.org/images/CTSOC-NCT-2024-10-FA.pdf)

  3. arXiv:2403.01193 [pdf, other]

    cs.CL cs.AI

    RAGged Edges: The Double-Edged Sword of Retrieval-Augmented Chatbots

    Authors: Philip Feldman, James R. Foulds, Shimei Pan

    Abstract: Large language models (LLMs) like ChatGPT demonstrate the remarkable progress of artificial intelligence. However, their tendency to hallucinate -- generate plausible but false information -- poses a significant challenge. This issue is critical, as seen in recent court cases where ChatGPT's use led to citations of non-existent legal rulings. This paper explores how Retrieval-Augmented Generation…

    Submitted 12 June, 2024; v1 submitted 2 March, 2024; originally announced March 2024.

    Comments: 7 Pages, 1 Figure, 1 Table

    ACM Class: H.3.3; I.2.7

  4. arXiv:2402.01663 [pdf, other]

    cs.CY cs.CR cs.LG

    Killer Apps: Low-Speed, Large-Scale AI Weapons

    Authors: Philip Feldman, Aaron Dant, James R. Foulds

    Abstract: The accelerating advancements in Artificial Intelligence (AI) and Machine Learning (ML), highlighted by the development of cutting-edge Generative Pre-trained Transformer (GPT) models by organizations such as OpenAI, Meta, and Anthropic, present new challenges and opportunities in warfare and security. Much of the current focus is on AI's integration within weapons systems and its role in rapid de…

    Submitted 17 June, 2024; v1 submitted 14 January, 2024; originally announced February 2024.

    Comments: 10 pages with 10 pages of appendices. 3 Figures, 2 code listings

    ACM Class: I.2.7; H.4.3; J.4

    Journal ref: Workshops at the International Conference on Intelligent User Interfaces (IUI) 2024

  5. arXiv:2306.06085 [pdf, other]

    cs.CL cs.AI

    Trapping LLM Hallucinations Using Tagged Context Prompts

    Authors: Philip Feldman, James R. Foulds, Shimei Pan

    Abstract: Recent advances in large language models (LLMs), such as ChatGPT, have led to highly sophisticated conversation agents. However, these models suffer from "hallucinations," where the model generates false or fabricated information. Addressing this challenge is crucial, particularly with AI-driven platforms being adopted across various sectors. In this paper, we propose a novel method to recognize a…

    Submitted 9 June, 2023; originally announced June 2023.

    Comments: 13 pages, 3 Figures, 2 Tables

    ACM Class: I.2.7; K.4.2

  6. arXiv:2301.05198 [pdf, other]

    cs.HC

    The Keyword Explorer Suite: A Toolkit for Understanding Online Populations

    Authors: Philip Feldman, Shimei Pan, James R. Foulds

    Abstract: We have developed a set of Python applications that use large language models to identify and analyze data from social media platforms relevant to a population of interest. Our pipeline begins with using OpenAI's GPT-3 to generate potential keywords for identifying relevant text content from the target population. The keywords are then validated, and the content downloaded and analyzed using GPT-3…

    Submitted 13 January, 2023; v1 submitted 12 January, 2023; originally announced January 2023.

    Comments: 6 pages, 4 figures

    ACM Class: H.5.2; H.1.2; I.2.7

  7. arXiv:2209.07044 [pdf, other]

    cs.LG cs.CY

    Fair Inference for Discrete Latent Variable Models

    Authors: Rashidul Islam, Shimei Pan, James R. Foulds

    Abstract: It is now well understood that machine learning models, trained on data without due care, often exhibit unfair and discriminatory behavior against certain populations. Traditional algorithmic fairness research has mainly focused on supervised learning tasks, particularly classification. While fairness in unsupervised learning has received some attention, the literature has primarily addressed fair…

    Submitted 15 September, 2022; originally announced September 2022.

  8. arXiv:2204.07483 [pdf, other]

    cs.CL cs.CY

    Polling Latent Opinions: A Method for Computational Sociolinguistics Using Transformer Language Models

    Authors: Philip Feldman, Aaron Dant, James R. Foulds, Shimei Pan

    Abstract: Text analysis of social media for sentiment, topic analysis, and other analysis depends initially on the selection of keywords and phrases that will be used to create the research corpora. However, keywords that researchers choose may occur infrequently, leading to errors that arise from using small samples. In this paper, we use the capacity for memorization, interpolation, and extrapolation of T…

    Submitted 19 April, 2022; v1 submitted 15 April, 2022; originally announced April 2022.

    Comments: 10 pages, 9 figures, 7 tables

    ACM Class: K.4.m

  9. arXiv:2105.07996 [pdf, ps, other]

    cs.AI

    Learning User Embeddings from Temporal Social Media Data: A Survey

    Authors: Fatema Hasan, Kevin S. Xu, James R. Foulds, Shimei Pan

    Abstract: User-generated data on social media contain rich information about who we are, what we like and how we make decisions. In this paper, we survey representative work on learning a concise latent user representation (a.k.a. user embedding) that can capture the main characteristics of a social media user. The learned user embeddings can later be used to support different downstream user analysis tasks…

    Submitted 17 May, 2021; originally announced May 2021.

  10. arXiv:2104.10259 [pdf, other]

    cs.CL cs.CY

    Analyzing COVID-19 Tweets with Transformer-based Language Models

    Authors: Philip Feldman, Sim Tiwari, Charissa S. L. Cheah, James R. Foulds, Shimei Pan

    Abstract: This paper describes a method for using Transformer-based Language Models (TLMs) to understand public opinion from social media posts. In this approach, we train a set of GPT models on several COVID-19 tweet corpora that reflect populations of users with distinctive views. We then use prompt-based queries to probe these models to reveal insights into the biases and opinions of the users. We demons…

    Submitted 5 May, 2021; v1 submitted 20 April, 2021; originally announced April 2021.

    Comments: Six pages, six tables, four figures

    ACM Class: J.4; I.2.7

  11. arXiv:2010.06820 [pdf, other]

    cs.LG cs.AI cs.CY

    Equitable Allocation of Healthcare Resources with Fair Cox Models

    Authors: Kamrun Naher Keya, Rashidul Islam, Shimei Pan, Ian Stockwell, James R. Foulds

    Abstract: Healthcare programs such as Medicaid provide crucial services to vulnerable populations, but due to limited resources, many of the individuals who need these services the most languish on waiting lists. Survival models, e.g. the Cox proportional hazards model, can potentially improve this situation by predicting individuals' levels of need, which can then be used to prioritize the waiting lists. P…

    Submitted 14 October, 2020; originally announced October 2020.

    Comments: AAAI Fall Symposium on AI in Government and Public Sector (AAAI FSS-20), 2020

  12. arXiv:1909.04702 [pdf, other]

    cs.CL cs.IR cs.LG

    Neural Embedding Allocation: Distributed Representations of Topic Models

    Authors: Kamrun Naher Keya, Yannis Papanikolaou, James R. Foulds

    Abstract: Word embedding models such as the skip-gram learn vector representations of words' semantic relationships, and document embedding models learn similar representations for documents. On the other hand, topic models provide latent representations of the documents' topical themes. To get the benefits of these representations simultaneously, we propose a unifying algorithm, called neural embedding all…

    Submitted 10 September, 2019; originally announced September 2019.