
Showing 1–18 of 18 results for author: Chae, P

Searching in archive cs.
  1. arXiv:2505.12768  [pdf, other]

    cs.CL

    ReEx-SQL: Reasoning with Execution-Aware Reinforcement Learning for Text-to-SQL

    Authors: Yaxun Dai, Wenxuan Xie, Xialie Zhuang, Tianyu Yang, Yiying Yang, Haiqin Yang, Yuhang Zhao, Pingfu Chao, Wenhao Jiang

    Abstract: In Text-to-SQL, execution feedback is essential for guiding large language models (LLMs) to reason accurately and generate reliable SQL queries. However, existing methods treat execution feedback solely as a post-hoc signal for correction or selection, failing to integrate it into the generation process. This limitation hinders their ability to address reasoning errors as they occur, ultimately re…

    Submitted 19 May, 2025; v1 submitted 19 May, 2025; originally announced May 2025.

  2. arXiv:2505.12454  [pdf, other]

    cs.CL cs.LG

    Towards DS-NER: Unveiling and Addressing Latent Noise in Distant Annotations

    Authors: Yuyang Ding, Dan Qiao, Juntao Li, Jiajie Xu, Pingfu Chao, Xiaofang Zhou, Min Zhang

    Abstract: Distantly supervised named entity recognition (DS-NER) has emerged as a cheap and convenient alternative to traditional human annotation methods, enabling the automatic generation of training data by aligning text with external resources. Despite the many efforts in noise measurement methods, few works focus on the latent noise distribution between different distant annotation methods. In this wor…

    Submitted 18 May, 2025; originally announced May 2025.

  3. arXiv:2412.16720  [pdf, other]

    cs.AI

    OpenAI o1 System Card

    Authors: OpenAI, Aaron Jaech, Adam Kalai, Adam Lerer, Adam Richardson, Ahmed El-Kishky, Aiden Low, Alec Helyar, Aleksander Madry, Alex Beutel, Alex Carney, Alex Iftimie, Alex Karpenko, Alex Tachard Passos, Alexander Neitz, Alexander Prokofiev, Alexander Wei, Allison Tam, Ally Bennett, Ananya Kumar, Andre Saraiva, Andrea Vallone, Andrew Duberstein, Andrew Kondrich, et al. (238 additional authors not shown)

    Abstract: The o1 model series is trained with large-scale reinforcement learning to reason using chain of thought. These advanced reasoning capabilities provide new avenues for improving the safety and robustness of our models. In particular, our models can reason about our safety policies in context when responding to potentially unsafe prompts, through deliberative alignment. This leads to state-of-the-ar…

    Submitted 21 December, 2024; originally announced December 2024.

  4. arXiv:2410.21276  [pdf, other]

    cs.CL cs.AI cs.CV cs.CY cs.LG cs.SD eess.AS

    GPT-4o System Card

    Authors: OpenAI, Aaron Hurst, Adam Lerer, Adam P. Goucher, Adam Perelman, Aditya Ramesh, Aidan Clark, AJ Ostrow, Akila Welihinda, Alan Hayes, Alec Radford, Aleksander Mądry, Alex Baker-Whitcomb, Alex Beutel, Alex Borzunov, Alex Carney, Alex Chow, Alex Kirillov, Alex Nichol, Alex Paino, Alex Renzin, Alex Tachard Passos, Alexander Kirillov, Alexi Christakis, et al. (395 additional authors not shown)

    Abstract: GPT-4o is an autoregressive omni model that accepts as input any combination of text, audio, image, and video, and generates any combination of text, audio, and image outputs. It's trained end-to-end across text, vision, and audio, meaning all inputs and outputs are processed by the same neural network. GPT-4o can respond to audio inputs in as little as 232 milliseconds, with an average of 320 mil…

    Submitted 25 October, 2024; originally announced October 2024.

  5. arXiv:2406.10281  [pdf, ps, other]

    cs.CR cs.CL cs.LG

    Watermarking Language Models with Error Correcting Codes

    Authors: Patrick Chao, Yan Sun, Edgar Dobriban, Hamed Hassani

    Abstract: Recent progress in large language models enables the creation of realistic machine-generated content. Watermarking is a promising approach to distinguish machine-generated text from human text, embedding statistical signals in the output that are ideally undetectable to humans. We propose a watermarking framework that encodes such signals through an error correcting code. Our method, termed robust…

    Submitted 8 June, 2025; v1 submitted 12 June, 2024; originally announced June 2024.

  6. arXiv:2405.05957  [pdf, other]

    cs.CL

    OpenBA-V2: Reaching 77.3% High Compression Ratio with Fast Multi-Stage Pruning

    Authors: Dan Qiao, Yi Su, Pinzheng Wang, Jing Ye, Wenjing Xie, Yuechi Zhou, Yuyang Ding, Zecheng Tang, Jikai Wang, Yixin Ji, Yue Wang, Pei Guo, Zechen Sun, Zikang Zhang, Juntao Li, Pingfu Chao, Wenliang Chen, Guohong Fu, Guodong Zhou, Qiaoming Zhu, Min Zhang

    Abstract: Large Language Models (LLMs) have played an important role in many fields due to their powerful capabilities. However, their massive number of parameters leads to high deployment requirements and incurs significant inference costs, which impedes their practical applications. Training smaller models is an effective way to address this problem. Therefore, we introduce OpenBA-V2, a 3.4B model derived…

    Submitted 9 May, 2024; originally announced May 2024.

  7. arXiv:2404.01318  [pdf, other]

    cs.CR cs.LG

    JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models

    Authors: Patrick Chao, Edoardo Debenedetti, Alexander Robey, Maksym Andriushchenko, Francesco Croce, Vikash Sehwag, Edgar Dobriban, Nicolas Flammarion, George J. Pappas, Florian Tramer, Hamed Hassani, Eric Wong

    Abstract: Jailbreak attacks cause large language models (LLMs) to generate harmful, unethical, or otherwise objectionable content. Evaluating these attacks presents a number of challenges, which the current collection of benchmarks and evaluation techniques do not adequately address. First, there is no clear standard of practice regarding jailbreaking evaluation. Second, existing works compute costs and suc…

    Submitted 31 October, 2024; v1 submitted 27 March, 2024; originally announced April 2024.

    Comments: The camera-ready version of JailbreakBench v1.0 (accepted at NeurIPS 2024 Datasets and Benchmarks Track): more attack artifacts, more test-time defenses, a more accurate jailbreak judge (Llama-3-70B with a custom prompt), a larger dataset of human preferences for selecting a jailbreak judge (300 examples), an over-refusal evaluation dataset, a semantic refusal judge based on Llama-3-8B

  8. arXiv:2403.04893  [pdf, other]

    cs.AI

    A Safe Harbor for AI Evaluation and Red Teaming

    Authors: Shayne Longpre, Sayash Kapoor, Kevin Klyman, Ashwin Ramaswami, Rishi Bommasani, Borhane Blili-Hamelin, Yangsibo Huang, Aviya Skowron, Zheng-Xin Yong, Suhas Kotha, Yi Zeng, Weiyan Shi, Xianjun Yang, Reid Southen, Alexander Robey, Patrick Chao, Diyi Yang, Ruoxi Jia, Daniel Kang, Sandy Pentland, Arvind Narayanan, Percy Liang, Peter Henderson

    Abstract: Independent evaluation and red teaming are critical for identifying the risks posed by generative AI systems. However, the terms of service and enforcement strategies used by prominent AI companies to deter model misuse have disincentives on good faith safety evaluations. This causes some researchers to fear that conducting such research or releasing their findings will result in account suspensio…

    Submitted 7 March, 2024; originally announced March 2024.

  9. arXiv:2310.08419  [pdf, other]

    cs.LG cs.AI

    Jailbreaking Black Box Large Language Models in Twenty Queries

    Authors: Patrick Chao, Alexander Robey, Edgar Dobriban, Hamed Hassani, George J. Pappas, Eric Wong

    Abstract: There is growing interest in ensuring that large language models (LLMs) align with human values. However, the alignment of such models is vulnerable to adversarial jailbreaks, which coax LLMs into overriding their safety guardrails. The identification of these vulnerabilities is therefore instrumental in understanding inherent weaknesses and preventing future misuse. To this end, we propose Prompt…

    Submitted 18 July, 2024; v1 submitted 12 October, 2023; originally announced October 2023.

  10. arXiv:2308.01853  [pdf, other]

    stat.ML cs.LG math.ST

    Statistical Estimation Under Distribution Shift: Wasserstein Perturbations and Minimax Theory

    Authors: Patrick Chao, Edgar Dobriban

    Abstract: Distribution shifts are a serious concern in modern statistical learning as they can systematically change the properties of the data away from the truth. We focus on Wasserstein distribution shifts, where every data point may undergo a slight perturbation, as opposed to the Huber contamination model where a fraction of observations are outliers. We consider perturbations that are either independe…

    Submitted 9 October, 2023; v1 submitted 3 August, 2023; originally announced August 2023.

    Comments: 60 pages, 7 figures

  11. arXiv:2304.14662  [pdf, other]

    cs.CL

    CED: Catalog Extraction from Documents

    Authors: Tong Zhu, Guoliang Zhang, Zechang Li, Zijian Yu, Junfei Ren, Mengsong Wu, Zhefeng Wang, Baoxing Huai, Pingfu Chao, Wenliang Chen

    Abstract: Sentence-by-sentence information extraction from long documents is an exhausting and error-prone task. As the indicator of document skeleton, catalogs naturally chunk documents into segments and provide informative cascade semantics, which can help to reduce the search space. Despite their usefulness, catalogs are hard to be extracted without the assist from external knowledge. For documents that…

    Submitted 28 April, 2023; originally announced April 2023.

  12. arXiv:2302.04237  [pdf, other]

    cs.LG

    Black Box Adversarial Prompting for Foundation Models

    Authors: Natalie Maus, Patrick Chao, Eric Wong, Jacob Gardner

    Abstract: Prompting interfaces allow users to quickly adjust the output of generative models in both vision and language. However, small changes and design choices in the prompt can lead to significant differences in the output. In this work, we develop a black-box framework for generating adversarial prompts for unstructured image and text generation. These prompts, which can be standalone or prepended to…

    Submitted 29 May, 2023; v1 submitted 8 February, 2023; originally announced February 2023.

  13. arXiv:2302.00860  [pdf, other]

    stat.ML cs.LG stat.ME

    Modeling Causal Mechanisms with Diffusion Models for Interventional and Counterfactual Queries

    Authors: Patrick Chao, Patrick Blöbaum, Sapan Patel, Shiva Prasad Kasiviswanathan

    Abstract: We consider the problem of answering observational, interventional, and counterfactual queries in a causally sufficient setting where only observational data and the causal graph are available. Utilizing the recent developments in diffusion models, we introduce diffusion-based causal models (DCM) to learn causal mechanisms, that generate unique latent encodings. These encodings enable us to direct…

    Submitted 9 October, 2024; v1 submitted 1 February, 2023; originally announced February 2023.

    Comments: 30 pages. In this new revision, the title has been changed from the previous one, "Interventional and Counterfactual Inference with Diffusion Models"

  14. Attention-based Quantum Tomography

    Authors: Peter Cha, Paul Ginsparg, Felix Wu, Juan Carrasquilla, Peter L. McMahon, Eun-Ah Kim

    Abstract: With rapid progress across platforms for quantum systems, the problem of many-body quantum state reconstruction for noisy quantum states becomes an important challenge. Recent works found promise in recasting the problem of quantum state reconstruction to learning the probability distribution of quantum state measurement vectors using generative neural network models. Here we propose the "Attentio…

    Submitted 3 November, 2021; v1 submitted 22 June, 2020; originally announced June 2020.

    Journal ref: Mach. Learn.: Sci. Technol. 3 01LT01

  15. arXiv:2002.11624  [pdf, other]

    cs.LG cs.AI cs.CY

    Deep Attentive Study Session Dropout Prediction in Mobile Learning Environment

    Authors: Youngnam Lee, Dongmin Shin, HyunBin Loh, Jaemin Lee, Piljae Chae, Junghyun Cho, Seoyon Park, Jinhwan Lee, Jineon Baek, Byungsoo Kim, Youngduck Choi

    Abstract: Student dropout prediction provides an opportunity to improve student engagement, which maximizes the overall effectiveness of learning experiences. However, researches on student dropout were mainly conducted on school dropout or course dropout, and study session dropout in a mobile learning environment has not been considered thoroughly. In this paper, we investigate the study session dropout pr…

    Submitted 1 February, 2021; v1 submitted 14 February, 2020; originally announced February 2020.

    Comments: CSEDU 2020

  16. arXiv:1910.13065  [pdf, other]

    cs.DB

    A Survey on Map-Matching Algorithms

    Authors: Pingfu Chao, Yehong Xu, Wen Hua, Xiaofang Zhou

    Abstract: The map-matching is an essential preprocessing step for most of the trajectory-based applications. Although it has been an active topic for more than two decades and, driven by the emerging applications, is still under development. There is a lack of categorisation of existing solutions recently and analysis for future research directions. In this paper, we review the current status of the map-mat…

    Submitted 28 October, 2019; originally announced October 2019.

    Comments: 12 pages, 5 figures, submitted to ADC 2020

  17. arXiv:1909.00948  [pdf, other]

    cs.CV

    HarDNet: A Low Memory Traffic Network

    Authors: Ping Chao, Chao-Yang Kao, Yu-Shan Ruan, Chien-Hsiang Huang, Youn-Long Lin

    Abstract: State-of-the-art neural network architectures such as ResNet, MobileNet, and DenseNet have achieved outstanding accuracy over low MACs and small model size counterparts. However, these metrics might not be accurate for predicting the inference time. We suggest that memory traffic for accessing intermediate feature maps can be a factor dominating the inference latency, especially in such tasks as r…

    Submitted 3 September, 2019; originally announced September 2019.

    Comments: ICCV 2019

  18. arXiv:1806.09070  [pdf, other]

    cs.GR cs.LG stat.ML

    Generative Models for Pose Transfer

    Authors: Patrick Chao, Alexander Li, Gokul Swamy

    Abstract: We investigate nearest neighbor and generative models for transferring pose between persons. We take in a video of one person performing a sequence of actions and attempt to generate a video of another person performing the same actions. Our generative model (pix2pix) outperforms k-NN at both generating corresponding frames and generalizing outside the demonstrated action set. Our most salient con…

    Submitted 23 June, 2018; originally announced June 2018.